= Coupling MPI codes using MUSCLE =

== Example Application ==

As an example "Hello World" application that shows how to couple MPI codes via MUSCLE, we use an extremely simplistic and naive simulation of the Large Hadron Collider (LHC) experiment. The application models only two accelerator rings:
* Proton Synchrotron Booster (PSB) - the "small one",
* Large Hadron Collider (LHC) - the "big one".

The aforementioned accelerators are modeled as separate submodels (MUSCLE kernels) and are implemented using the "MPI Ring" code. In our quasi-simulation:
* a single proton is inserted (at an energy of `PSB:InitialEnergy`) into the PSB,
* where it is accelerated (by `PSB:DeltaEnergy`) whenever it passes a ring node,
* until it reaches an energy of `PSB:MaxEnergy`,
* then the proton is transmitted from the PSB into the LHC,
* where it is accelerated further until its energy reaches `LHC:MaxEnergy` (the simulation then stops).

The example codes can be found in the `src/cpp/examples/mpiring/` directory of the MUSCLE source distribution. The next two sections describe two different approaches to running such codes via MUSCLE.

== MPI Kernels as dynamic libraries ==

This approach follows the original MUSCLE philosophy, which relies on the Java Native !Interface/Access mechanism to integrate C/C++ codes as MUSCLE kernels.

In order to support MPI applications, a new method, `public void executeDirectly()`, was introduced in the `CaController` class. On the processes with non-zero rank, only this method is called instead of the normal MUSCLE routines. The process with rank 0 is started in the usual way. Portals cannot be attached to slave processes (i.e. to the processes with non-zero rank). The default implementation of `executeDirectly` simply calls `execute()`.

=== Source files ===
* LHC.cxa
* LHC.java
* PSB.java
* mpiringlib.{c,h}

=== Running ===
* plumber
{{{
$ muscle --cxa_file share/muscle/cxa/LHC.cxa.rb --main plumber --autoquit
...
INFO: Listening for intra-platform commands on address: jicp://150.254.149.108:1099
}}}
* LHC (note the `--mpi` switch)
{{{
$ mpiexec -np 2 muscle --mpi --cxa_file share/muscle/cxa/LHC.cxa.rb --mainhost 150.254.149.108 --mainport 1099 LHC
...
Initialized 0 node in ring LHC.
Initialized 1 node in ring LHC.
LHC: Received proton from PSB: 4.000000000000002
LHC: Proton energy callback: 4.200000000000002
LHC: Proton energy callback: 4.400000000000002
LHC: Proton energy callback: 4.600000000000002
LHC: Proton energy callback: 4.8000000000000025
LHC: Proton energy callback: 5.000000000000003
LHC: Proton energy callback: 5.200000000000003
LHC: Proton energy callback: 5.400000000000003
LHC: Proton energy callback: 5.600000000000003
LHC: Proton energy callback: 5.800000000000003
LHC: Proton energy callback: 6.0000000000000036
LHC: Proton energy callback: 6.200000000000004
LHC: Proton energy callback: 6.400000000000004
LHC: Proton energy callback: 6.600000000000004
LHC: Proton energy callback: 6.800000000000004
LHC: Proton energy callback: 7.000000000000004
LHC: Proton energy callback: 7.200000000000005
LHC: Proton energy callback: 7.400000000000005
LHC: Proton energy callback: 7.600000000000005
LHC: Proton energy callback: 7.800000000000005
LHC: Proton energy callback: 8.000000000000005
LHC: Proton energy callback: 8.200000000000005
LHC: Proton energy callback: 8.400000000000004
LHC: Proton energy callback: 8.600000000000003
LHC: Proton energy callback: 8.800000000000002
LHC: Proton energy callback: 9.000000000000002
LHC: Proton energy callback: 9.200000000000001
LHC: Proton energy callback: 9.4
LHC: Proton energy callback: 9.6
LHC: Proton energy callback: 9.799999999999999
LHC: Proton energy callback: 9.999999999999998
LHC: Proton energy callback: 10.199999999999998
LHC: Proton energy callback: 10.399999999999997
LHC: Proton energy callback: 10.599999999999996
LHC: Proton energy callback: 10.799999999999995
LHC: Proton energy callback: 10.999999999999995
LHC: Proton energy callback: 11.199999999999994
LHC: Proton energy callback: 11.399999999999993
LHC: Proton energy callback: 11.599999999999993
LHC: Proton energy callback: 11.799999999999992
LHC: Proton energy callback: 11.999999999999991
LHC: Proton energy callback: 12.19999999999999
LHC: Final energy: 12.19999999999999
}}}
* PSB (note the `--mpi` switch)
{{{
$ mpiexec -np 2 muscle --jvmflags -d32 --mpi --cxa_file share/muscle/cxa/LHC.cxa.rb --mainhost 150.254.149.108 --mainport 1099 PSB
Initialized 0 node in ring PSB.
Initialized 1 node in ring PSB.
Inserting proton into Proton Synchrotron Booster (PSB). Initial energy: 1.2
PSB: Proton energy callback: 1.3
PSB: Proton energy callback: 1.4000000000000001
PSB: Proton energy callback: 1.5000000000000002
PSB: Proton energy callback: 1.6000000000000003
PSB: Proton energy callback: 1.7000000000000004
PSB: Proton energy callback: 1.8000000000000005
PSB: Proton energy callback: 1.9000000000000006
PSB: Proton energy callback: 2.0000000000000004
PSB: Proton energy callback: 2.1000000000000005
PSB: Proton energy callback: 2.2000000000000006
PSB: Proton energy callback: 2.3000000000000007
PSB: Proton energy callback: 2.400000000000001
PSB: Proton energy callback: 2.500000000000001
PSB: Proton energy callback: 2.600000000000001
PSB: Proton energy callback: 2.700000000000001
PSB: Proton energy callback: 2.800000000000001
PSB: Proton energy callback: 2.9000000000000012
PSB: Proton energy callback: 3.0000000000000013
PSB: Proton energy callback: 3.1000000000000014
PSB: Proton energy callback: 3.2000000000000015
PSB: Proton energy callback: 3.3000000000000016
PSB: Proton energy callback: 3.4000000000000017
PSB: Proton energy callback: 3.5000000000000018
PSB: Proton energy callback: 3.600000000000002
PSB: Proton energy callback: 3.700000000000002
PSB: Proton energy callback: 3.800000000000002
PSB: Proton energy callback: 3.900000000000002
PSB: Proton energy callback: 4.000000000000002
Proton energy after PSB: 4.000000000000002. Injecting into LHC.
...
}}}

=== Limitations ===
* Any MUSCLE API routine may **ONLY** be called by the rank 0 process. If you need parameters to be available to all MPI processes, use the `MPI_Bcast` function, e.g. (as in the provided example):
{{{
void Ring_Broadcast_Params(double *deltaE, double *maxE) {
    /* Rank 0 is the broadcast root; every rank must call this function. */
    assert( MPI_Bcast(deltaE, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD) == MPI_SUCCESS);
    assert( MPI_Bcast(maxE, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD) == MPI_SUCCESS);
}
}}}
* A separate Java Virtual Machine is started for every MPI process, which significantly increases the memory footprint of the whole application.
* An MPI kernel must be the sole kernel of its MUSCLE instance (for this reason, in our example we had to start the `plumber` kernel in a separate MUSCLE instance).
* Many MPI implementations exploit low-level optimization techniques (like Direct Memory Access) that may crash the Java Virtual Machine.
* Using MPI to start many Java Virtual Machines, each of which loads a native dynamic-link library that later calls MPI routines, is something that most people rarely do. In case of problems you might not find any help (you have been warned! ;-).

== MPI Kernels as standalone executables ==

The other approach is to run the kernels as separate processes. MUSCLE provides two base kernel classes, `NativeKernel` and `MPIKernel`, that can be used to run application code as a separate process. The process can communicate with the library (and other kernels) via a new C/C++ MUSCLE API:
{{{
...
muscle::env::init();
cout << "c++: begin "<< argv[0] <