= Coupling MPI codes using MUSCLE =

== Example Application ==

As an example "Hello World" application that shows how to couple MPI codes via MUSCLE, we use an extremely simplistic and naive simulation of the Large Hadron Collider (LHC) experiment ;-). The application models only two accelerator rings:
 * Proton Synchrotron Booster (PSB) - the "small one",
 * Large Hadron Collider (LHC) - the "big one".

Both accelerators are modeled as separate submodels (MUSCLE kernels) and are implemented using the "MPI Ring" code. In our quasi-simulation:
 * a single proton is inserted (at an energy of `PSB:InitialEnergy`) into the PSB,
 * where it is accelerated (by `PSB:DeltaEnergy`) every time it passes a ring node,
 * until it reaches an energy of `PSB:MaxEnergy`,
 * then the proton is transferred from the PSB into the LHC,
 * where it is accelerated further until its energy reaches `LHC:MaxEnergy`,
 * at which point the simulation stops.

The example code can be found in the `src/cpp/examples/mpiring/` directory of the MUSCLE source distribution. The next two sections describe two different approaches to running such a coupled simulation via MUSCLE.

== MPI Kernels as dynamic libraries ==

This approach follows the original MUSCLE philosophy, which relies on the [http://en.wikipedia.org/wiki/Java_Native_Interface Java Native Interface] (or the more user-friendly [http://en.wikipedia.org/wiki/Java_Native_Access Java Native Access]) mechanism to integrate C/C++ codes as MUSCLE kernels.

In order to support MPI applications, a new method, `public void executeDirectly()`, was introduced in the `CaController` class. On processes with a non-zero rank, only this method is called instead of the normal MUSCLE routines. The process with rank 0 is started in the usual way. Portals cannot be attached to slave processes (i.e. to the processes with non-zero rank). The default implementation of `executeDirectly` calls `execute()`.
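The quasi-simulation steps listed under Example Application can be sketched serially, without MPI or MUSCLE. The block below is only an illustration under stated assumptions: the function names `ring_accelerate` and `accelerate_psb_then_lhc` are hypothetical and do not come from `mpiringlib`, and the stop condition (keep accelerating while the energy is below the maximum) is a guess based on the run logs in the Running section.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Serial stand-in for one accelerator ring (hypothetical helper, not part
 * of mpiringlib). The proton visits the ring nodes in turn and gains
 * deltaE at every node while its energy is still below maxE.
 * Returns the final energy; if passes is not NULL, the number of node
 * passes is stored there.
 */
double ring_accelerate(double energy, double deltaE, double maxE,
                       int nodes, long *passes)
{
    long count = 0;
    int node = 0;                      /* node the proton is currently at */

    /* A non-positive step or an empty ring would never terminate. */
    assert(deltaE > 0.0 && nodes > 0);

    while (energy < maxE) {            /* assumed stop condition */
        energy += deltaE;              /* this node accelerates the proton */
        node = (node + 1) % nodes;     /* move on to the next ring node */
        count++;
    }
    if (passes != NULL)
        *passes = count;
    return energy;
}

/*
 * Chain the two rings: the energy reached in the PSB becomes the
 * injection energy of the LHC (hypothetical helper).
 */
double accelerate_psb_then_lhc(double initialE,
                               double psbDelta, double psbMax,
                               double lhcDelta, double lhcMax,
                               int nodes)
{
    double afterPsb = ring_accelerate(initialE, psbDelta, psbMax, nodes, NULL);
    return ring_accelerate(afterPsb, lhcDelta, lhcMax, nodes, NULL);
}
```

In the coupled version each ring runs as its own MUSCLE kernel, and the hand-over that `accelerate_psb_then_lhc` does with a plain function call is carried out by MUSCLE between the PSB and LHC kernels. A call such as `accelerate_psb_then_lhc(1.2, 0.1, 4.0, 0.2, 12.0, 2)` uses the initial energy and step sizes that appear in the run logs; the two maxima are assumed values.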
=== Source files ===
 * LHC.cxa
 * LHC.java
 * PSB.java
 * mpiringlib.{c,h}

=== Running ===
 * plumber
{{{
$ muscle --cxa_file share/muscle/cxa/LHC.cxa.rb --main plumber --autoquit
...
INFO: Listening for intra-platform commands on address: jicp://150.254.149.108:1099
}}}
 * LHC (note the `--mpi` switch)
{{{
$ mpiexec -np 2 muscle --mpi --cxa_file share/muscle/cxa/LHC.cxa.rb --mainhost 150.254.149.108 --mainport 1099 LHC
...
Initialized 0 node in ring LHC.
Initialized 1 node in ring LHC.
LHC: Received proton from PSB: 4.000000000000002
LHC: Proton energy callback: 4.200000000000002
LHC: Proton energy callback: 4.400000000000002
LHC: Proton energy callback: 4.600000000000002
LHC: Proton energy callback: 4.8000000000000025
LHC: Proton energy callback: 5.000000000000003
LHC: Proton energy callback: 5.200000000000003
LHC: Proton energy callback: 5.400000000000003
LHC: Proton energy callback: 5.600000000000003
LHC: Proton energy callback: 5.800000000000003
LHC: Proton energy callback: 6.0000000000000036
LHC: Proton energy callback: 6.200000000000004
LHC: Proton energy callback: 6.400000000000004
LHC: Proton energy callback: 6.600000000000004
LHC: Proton energy callback: 6.800000000000004
LHC: Proton energy callback: 7.000000000000004
LHC: Proton energy callback: 7.200000000000005
LHC: Proton energy callback: 7.400000000000005
LHC: Proton energy callback: 7.600000000000005
LHC: Proton energy callback: 7.800000000000005
LHC: Proton energy callback: 8.000000000000005
LHC: Proton energy callback: 8.200000000000005
LHC: Proton energy callback: 8.400000000000004
LHC: Proton energy callback: 8.600000000000003
LHC: Proton energy callback: 8.800000000000002
LHC: Proton energy callback: 9.000000000000002
LHC: Proton energy callback: 9.200000000000001
LHC: Proton energy callback: 9.4
LHC: Proton energy callback: 9.6
LHC: Proton energy callback: 9.799999999999999
LHC: Proton energy callback: 9.999999999999998
LHC: Proton energy callback: 10.199999999999998
LHC: Proton energy callback: 10.399999999999997
LHC: Proton energy callback: 10.599999999999996
LHC: Proton energy callback: 10.799999999999995
LHC: Proton energy callback: 10.999999999999995
LHC: Proton energy callback: 11.199999999999994
LHC: Proton energy callback: 11.399999999999993
LHC: Proton energy callback: 11.599999999999993
LHC: Proton energy callback: 11.799999999999992
LHC: Proton energy callback: 11.999999999999991
LHC: Proton energy callback: 12.19999999999999
LHC: Final energy: 12.19999999999999
}}}
 * PSB (note the `--mpi` switch)
{{{
$ mpiexec -np 2 muscle --jvmflags -d32 --mpi --cxa_file share/muscle/cxa/LHC.cxa.rb --mainhost 150.254.149.108 --mainport 1099 PSB
Initialized 0 node in ring PSB.
Initialized 1 node in ring PSB.
Inserting proton into Proton Synchrotron Booster (PSB). Initial energy: 1.2
PSB: Proton energy callback: 1.3
PSB: Proton energy callback: 1.4000000000000001
PSB: Proton energy callback: 1.5000000000000002
PSB: Proton energy callback: 1.6000000000000003
PSB: Proton energy callback: 1.7000000000000004
PSB: Proton energy callback: 1.8000000000000005
PSB: Proton energy callback: 1.9000000000000006
PSB: Proton energy callback: 2.0000000000000004
PSB: Proton energy callback: 2.1000000000000005
PSB: Proton energy callback: 2.2000000000000006
PSB: Proton energy callback: 2.3000000000000007
PSB: Proton energy callback: 2.400000000000001
PSB: Proton energy callback: 2.500000000000001
PSB: Proton energy callback: 2.600000000000001
PSB: Proton energy callback: 2.700000000000001
PSB: Proton energy callback: 2.800000000000001
PSB: Proton energy callback: 2.9000000000000012
PSB: Proton energy callback: 3.0000000000000013
PSB: Proton energy callback: 3.1000000000000014
PSB: Proton energy callback: 3.2000000000000015
PSB: Proton energy callback: 3.3000000000000016
PSB: Proton energy callback: 3.4000000000000017
PSB: Proton energy callback: 3.5000000000000018
PSB: Proton energy callback: 3.600000000000002
PSB: Proton energy callback: 3.700000000000002
PSB: Proton energy callback: 3.800000000000002
PSB: Proton energy callback: 3.900000000000002
PSB: Proton energy callback: 4.000000000000002
Proton energy after PSB: 4.000000000000002. Injecting into LHC.
...
}}}

=== Limitations ===
 * Any MUSCLE API routine may be called **ONLY** by the rank 0 process. If you need any parameters to be available to all MPI processes, use the `MPI_Bcast` function, e.g. (adapted from the provided example):
{{{
/* Broadcast the ring parameters from rank 0 to all other ranks. */
void Ring_Broadcast_Params(double *deltaE, double *maxE) {
  int rc = MPI_Bcast(deltaE, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);
  assert(rc == MPI_SUCCESS); /* call kept outside assert(): NDEBUG would drop it */
  rc = MPI_Bcast(maxE, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);
  assert(rc == MPI_SUCCESS);
}
}}}
 * A separate Java Virtual Machine is started for every MPI process, which significantly increases the memory footprint of the whole application.
 * An MPI kernel must be the sole kernel of its MUSCLE instance (for this reason, in our example we had to start the `plumber` kernel in a separate MUSCLE instance).
 * Many MPI implementations exploit low-level optimization techniques (like Direct Memory Access) that may crash the Java Virtual Machine.
 * Using MPI to start many Java Virtual Machines, each of which loads a native dynamic-link library that later calls MPI routines, is something that most people rarely do. In case of problems you might not find any help (you have been warned! ;-).

== MPI Kernels as standalone executables ==

The other approach is to run the kernels as separate processes. MUSCLE provides two base kernel classes, `NativeKernel` and `MPIKernel`, that can be used to run application code as a separate process. The process can communicate with the library (and other kernels) via a new C/C++ MUSCLE API:
{{{
...
muscle::env::init();
cout << "c++: begin "<< argv[0] <