Version 45 (modified by mmamonski, 13 years ago)
Coupling MPI codes using MUSCLE
Example Application
As an example "Hello World" application showing how to couple MPI codes via MUSCLE, we will use an extremely simplistic and naive simulation of the Large Hadron Collider (LHC) experiment ;-). The application models only two accelerator rings:
- Proton Synchrotron Booster (PSB) - the "small one",
- Large Hadron Collider (LHC) - the "big one".
The aforementioned accelerators are modeled as separate submodels (MUSCLE kernels) and are implemented using the "MPI Ring" code. In our quasi-simulation:
- a single proton is inserted into the PSB at an energy of PSB:InitialEnergy,
- it is accelerated by PSB:DeltaEnergy every time it passes a ring node,
- until it reaches an energy of PSB:MaxEnergy,
- then the proton is transferred from the PSB to the LHC,
- where it is accelerated further until its energy reaches LHC:MaxEnergy,
- at which point the simulation stops.
The example code can be found in the src/cpp/examples/mpiring/ directory of the MUSCLE source distribution. The next two sections describe two different approaches to running such a coupled simulation via MUSCLE.
MPI Kernels as dynamic libraries
This approach follows the original MUSCLE philosophy, which relies on the Java Native Interface (or the more user-friendly Java Native Access) mechanism to integrate C/C++ codes as MUSCLE kernels.
In order to support MPI applications, a new method, public void executeDirectly(), was introduced in the CaController class. On processes with non-zero rank, only this method is called instead of the normal MUSCLE routines; the process with rank 0 is started in the usual way. Portals cannot be attached to slave processes (i.e. processes with non-zero rank). The default implementation of executeDirectly() simply calls execute().
Source files
- LHC.cxa - the Complex Automata simulation file
```
cxa.env["PSB:InitialEnergy"] = 1.2
cxa.env["PSB:DeltaEnergy"] = 0.1
cxa.env["PSB:MaxEnergy"] = 4.0
cxa.env["LHC:DeltaEnergy"] = 0.2
cxa.env["LHC:MaxEnergy"] = 12.0

# declare kernels
cxa.add_kernel('LHC', 'examples.mpiring.LHC')
cxa.add_kernel('PSB', 'examples.mpiring.PSB')

# configure connection scheme
cs = cxa.cs
cs.attach('PSB' => 'LHC') { tie('pipe', 'pipe') }
```
- LHC.java - a Java wrapper kernel for LHC submodel
- PSB.java - a Java wrapper kernel for PSB submodel
- mpiringlib.c - compiled into libmpiring dynamic loadable library
Running
- plumber
```
$ muscle --cxa_file share/muscle/cxa/LHC.cxa.rb --main plumber --autoquit
...
INFO: Listening for intra-platform commands on address: jicp://150.254.149.108:1099
```
- LHC (note the --mpi switch)
```
$ mpiexec -np 2 muscle --mpi --cxa_file share/muscle/cxa/LHC.cxa.rb --mainhost 150.254.149.108 --mainport 1099 LHC
...
Initialized 0 node in ring LHC.
Initialized 1 node in ring LHC.
LHC: Received proton from PSB: 4.000000000000002
LHC: Proton energy callback: 4.200000000000002
LHC: Proton energy callback: 4.400000000000002
LHC: Proton energy callback: 4.600000000000002
LHC: Proton energy callback: 4.8000000000000025
LHC: Proton energy callback: 5.000000000000003
LHC: Proton energy callback: 5.200000000000003
LHC: Proton energy callback: 5.400000000000003
LHC: Proton energy callback: 5.600000000000003
LHC: Proton energy callback: 5.800000000000003
LHC: Proton energy callback: 6.0000000000000036
LHC: Proton energy callback: 6.200000000000004
LHC: Proton energy callback: 6.400000000000004
LHC: Proton energy callback: 6.600000000000004
LHC: Proton energy callback: 6.800000000000004
LHC: Proton energy callback: 7.000000000000004
LHC: Proton energy callback: 7.200000000000005
LHC: Proton energy callback: 7.400000000000005
LHC: Proton energy callback: 7.600000000000005
LHC: Proton energy callback: 7.800000000000005
LHC: Proton energy callback: 8.000000000000005
LHC: Proton energy callback: 8.200000000000005
LHC: Proton energy callback: 8.400000000000004
LHC: Proton energy callback: 8.600000000000003
LHC: Proton energy callback: 8.800000000000002
LHC: Proton energy callback: 9.000000000000002
LHC: Proton energy callback: 9.200000000000001
LHC: Proton energy callback: 9.4
LHC: Proton energy callback: 9.6
LHC: Proton energy callback: 9.799999999999999
LHC: Proton energy callback: 9.999999999999998
LHC: Proton energy callback: 10.199999999999998
LHC: Proton energy callback: 10.399999999999997
LHC: Proton energy callback: 10.599999999999996
LHC: Proton energy callback: 10.799999999999995
LHC: Proton energy callback: 10.999999999999995
LHC: Proton energy callback: 11.199999999999994
LHC: Proton energy callback: 11.399999999999993
LHC: Proton energy callback: 11.599999999999993
LHC: Proton energy callback: 11.799999999999992
LHC: Proton energy callback: 11.999999999999991
LHC: Proton energy callback: 12.19999999999999
LHC: Final energy: 12.19999999999999
```
- PSB (note the --mpi switch)
```
$ mpiexec -np 2 muscle --jvmflags -d32 --mpi --cxa_file share/muscle/cxa/LHC.cxa.rb --mainhost 150.254.149.108 --mainport 1099 PSB
Initialized 0 node in ring PSB.
Initialized 1 node in ring PSB.
Inserting proton into Proton Synchrotron Booster (PSB). Initial energy: 1.2
PSB: Proton energy callback: 1.3
PSB: Proton energy callback: 1.4000000000000001
PSB: Proton energy callback: 1.5000000000000002
PSB: Proton energy callback: 1.6000000000000003
PSB: Proton energy callback: 1.7000000000000004
PSB: Proton energy callback: 1.8000000000000005
PSB: Proton energy callback: 1.9000000000000006
PSB: Proton energy callback: 2.0000000000000004
PSB: Proton energy callback: 2.1000000000000005
PSB: Proton energy callback: 2.2000000000000006
PSB: Proton energy callback: 2.3000000000000007
PSB: Proton energy callback: 2.400000000000001
PSB: Proton energy callback: 2.500000000000001
PSB: Proton energy callback: 2.600000000000001
PSB: Proton energy callback: 2.700000000000001
PSB: Proton energy callback: 2.800000000000001
PSB: Proton energy callback: 2.9000000000000012
PSB: Proton energy callback: 3.0000000000000013
PSB: Proton energy callback: 3.1000000000000014
PSB: Proton energy callback: 3.2000000000000015
PSB: Proton energy callback: 3.3000000000000016
PSB: Proton energy callback: 3.4000000000000017
PSB: Proton energy callback: 3.5000000000000018
PSB: Proton energy callback: 3.600000000000002
PSB: Proton energy callback: 3.700000000000002
PSB: Proton energy callback: 3.800000000000002
PSB: Proton energy callback: 3.900000000000002
PSB: Proton energy callback: 4.000000000000002
Proton energy after PSB: 4.000000000000002. Injecting into LHC.
...
```
Limitations
- Any MUSCLE API routine may ONLY be called by the rank-0 process. If you need any parameters to be available to all MPI processes, use the MPI_Bcast function, e.g. (as in the provided example):
```
void Ring_Broadcast_Params(double *deltaE, double *maxE)
{
    /* Note: do not wrap the MPI calls directly in assert() -- compiling
       with -DNDEBUG would remove them entirely. Check the return code
       in a separate statement instead. */
    int rc;
    rc = MPI_Bcast(deltaE, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);
    assert(rc == MPI_SUCCESS);
    rc = MPI_Bcast(maxE, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);
    assert(rc == MPI_SUCCESS);
}
```
- A separate Java Virtual Machine is started for every MPI process, which significantly increases the memory footprint of the whole application.
- An MPI kernel must be the sole kernel of its MUSCLE instance (for this reason, in our example we had to start the plumber kernel in a separate MUSCLE instance).
- Many MPI implementations exploit low-level optimization techniques (like Direct Memory Access) that may crash the Java Virtual Machine.
- Using MPI to start many Java Virtual Machines, each of which loads a native dynamic library that later calls MPI routines, is an unusual setup. If you run into problems, you may find little help available (you have been warned! ;-).
MPI Kernels as standalone executables
The other approach is to run the native code as a separate process. MUSCLE provides two base kernel classes, NativeKernel and MPIKernel, that can be used to run application code as a separate process. The process communicates with the library (and other kernels) via a new C/C++ MUSCLE API:
```
...
muscle::env::init();
cout << "c++: begin " << argv[0] << endl;
cout << "Kernel Name: " << muscle::cxa::kernel_name() << endl;
for (int time = 0; !muscle::env::will_stop(); time++) {
    // process data
    for (int i = 0; i < 5; i++) {
        dataA[i] = i;
    }
    // dump to our portals
    muscle::env::send("data", dataA, 5, MUSCLE_DOUBLE);
}
muscle::env::finalize();
...
```
This approach has the advantage of separating the Java and C/C++ processes.
Sources
- LHC2.cxa - the Complex Automata simulation file
```
cxa.env["PSB:InitialEnergy"] = 1.2
cxa.env["PSB:DeltaEnergy"] = 0.1
cxa.env["PSB:MaxEnergy"] = 4.0
cxa.env["PSB:command"] = "PSB"
cxa.env["PSB:mpiexec_args"] = "-np 2"
cxa.env["LHC:DeltaEnergy"] = 0.2
cxa.env["LHC:MaxEnergy"] = 12.0
cxa.env["LHC:command"] = "LHC"
cxa.env["LHC:mpiexec_args"] = "-np 2"

# declare kernels
cxa.add_kernel('LHC', 'examples.mpiring.LHC2')
cxa.add_kernel('PSB', 'examples.mpiring.PSB2')

# configure connection scheme
cs = cxa.cs
cs.attach('PSB' => 'LHC') { tie('pipe', 'pipe') }
```
- LHC2.java - a Java wrapper kernel for LHC submodel (extends MPIKernel class)
- PSB2.java - a Java wrapper kernel for PSB submodel (extends MPIKernel class)
- mpiringlib.c, LHC.c - compiled into the LHC executable
- mpiringlib.c, PSB.c - compiled into the PSB executable
Running
- LHC, plumber:
```
$ muscle --cxa_file share/muscle/cxa/LHC2.cxa.rb --main plumber LHC --autoquit
- jicp://150.254.149.108:1099
Initialized 0 node in ring LHC.
Initialized 1 node in ring LHC.
LHC:Received proton energy: 4.000000
LHC:Energy in loop 1: 4.200000
LHC:Energy in loop 2: 4.400000
LHC:Energy in loop 3: 4.600000
LHC:Energy in loop 4: 4.800000
LHC:Energy in loop 5: 5.000000
LHC:Energy in loop 6: 5.200000
LHC:Energy in loop 7: 5.400000
LHC:Energy in loop 8: 5.600000
LHC:Energy in loop 9: 5.800000
LHC:Energy in loop 10: 6.000000
LHC:Energy in loop 11: 6.200000
LHC:Energy in loop 12: 6.400000
LHC:Energy in loop 13: 6.600000
LHC:Energy in loop 14: 6.800000
LHC:Energy in loop 15: 7.000000
LHC:Energy in loop 16: 7.200000
LHC:Energy in loop 17: 7.400000
LHC:Energy in loop 18: 7.600000
LHC:Energy in loop 19: 7.800000
LHC:Energy in loop 20: 8.000000
LHC:Energy in loop 21: 8.200000
LHC:Energy in loop 22: 8.400000
LHC:Energy in loop 23: 8.600000
LHC:Energy in loop 24: 8.800000
LHC:Energy in loop 25: 9.000000
LHC:Energy in loop 26: 9.200000
LHC:Energy in loop 27: 9.400000
LHC:Energy in loop 28: 9.600000
LHC:Energy in loop 29: 9.800000
LHC:Energy in loop 30: 10.000000
LHC:Energy in loop 31: 10.200000
LHC:Energy in loop 32: 10.400000
LHC:Energy in loop 33: 10.600000
LHC:Energy in loop 34: 10.800000
LHC:Energy in loop 35: 11.000000
LHC:Energy in loop 36: 11.200000
LHC:Energy in loop 37: 11.400000
LHC:Energy in loop 38: 11.600000
LHC:Energy in loop 39: 11.800000
LHC:Energy in loop 40: 12.000000
LHC:Energy in loop 41: 12.200000
LHC:Final proton energy: 12.200000
```
- PSB
```
$ muscle --cxa_file share/muscle/cxa/LHC2.cxa.rb --mainhost 150.254.149.108 --mainport 1099 PSB
...
Initialized 0 node in ring PSB.
Initialized 1 node in ring PSB.
PSB: Inserting proton into Proton Synchrotron Booster (PSB). Initial energy: 1.200000
PSB:Energy in loop 1: 1.300000
PSB:Energy in loop 2: 1.400000
PSB:Energy in loop 3: 1.500000
PSB:Energy in loop 4: 1.600000
PSB:Energy in loop 5: 1.700000
PSB:Energy in loop 6: 1.800000
PSB:Energy in loop 7: 1.900000
PSB:Energy in loop 8: 2.000000
PSB:Energy in loop 9: 2.100000
PSB:Energy in loop 10: 2.200000
PSB:Energy in loop 11: 2.300000
PSB:Energy in loop 12: 2.400000
PSB:Energy in loop 13: 2.500000
PSB:Energy in loop 14: 2.600000
PSB:Energy in loop 15: 2.700000
PSB:Energy in loop 16: 2.800000
PSB:Energy in loop 17: 2.900000
PSB:Energy in loop 18: 3.000000
PSB:Energy in loop 19: 3.100000
PSB:Energy in loop 20: 3.200000
PSB:Energy in loop 21: 3.300000
PSB:Energy in loop 22: 3.400000
PSB:Energy in loop 23: 3.500000
PSB:Energy in loop 24: 3.600000
PSB:Energy in loop 25: 3.700000
PSB:Energy in loop 26: 3.800000
PSB:Energy in loop 27: 3.900000
PSB:Energy in loop 28: 4.000000
Proton energy after PSB: 4.000000. Injecting into LHC.
```
Limitations
- Any MUSCLE API routine may ONLY be called by the rank-0 process. If you need any parameters to be available to all MPI processes, use the MPI_Bcast function, e.g. (as in the provided example):
```
void Ring_Broadcast_Params(double *deltaE, double *maxE)
{
    /* Note: do not wrap the MPI calls directly in assert() -- compiling
       with -DNDEBUG would remove them entirely. Check the return code
       in a separate statement instead. */
    int rc;
    rc = MPI_Bcast(deltaE, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);
    assert(rc == MPI_SUCCESS);
    rc = MPI_Bcast(maxE, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);
    assert(rc == MPI_SUCCESS);
}
```