Coupling MPI codes using MUSCLE
MPI Kernels as dynamic libraries
A new method, public void executeDirectly(), is available in the CaController class. On processes with non-zero rank, only this method is called instead of the normal MUSCLE routines; the process with rank 0 is started in the usual way. Portals cannot be attached to slave processes (i.e. processes with non-zero rank). By default, executeDirectly() calls execute().
Compilation
Running
Limitations
- Any MUSCLE API routine may ONLY be called by the rank 0 process. If you need any parameters to be available to all MPI processes, broadcast them with MPI_Bcast, e.g. (as in the provided example; see also the sketch after this list):
void Ring_Broadcast_Params(double *deltaE, double *maxE)
{
   assert( MPI_Bcast(deltaE, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD) == MPI_SUCCESS);
   assert( MPI_Bcast(maxE,   1, MPI_DOUBLE, 0, MPI_COMM_WORLD) == MPI_SUCCESS);
}
- A separate Java Virtual Machine is started for every MPI process, which significantly increases the memory footprint of the whole application.
- Many MPI implementations exploit low-level optimization techniques (such as Direct Memory Access) that may crash the Java Virtual Machine.
- Using MPI to start many Java Virtual Machines, each of which loads a native dynamic-link library that later calls MPI routines, is something few people do. In case of problems you may not find any help (you have been warned! ;-).
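The following is a minimal sketch of how the broadcast helper above might be used. The hard-coded parameter values on rank 0 are placeholders standing in for values that would in practice be obtained from MUSCLE (which only rank 0 may call); the program itself is not part of the MUSCLE distribution.
#include <mpi.h>
#include <assert.h>
#include <stdio.h>

/* Broadcast helper from the example above. */
void Ring_Broadcast_Params(double *deltaE, double *maxE)
{
   assert( MPI_Bcast(deltaE, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD) == MPI_SUCCESS);
   assert( MPI_Bcast(maxE,   1, MPI_DOUBLE, 0, MPI_COMM_WORLD) == MPI_SUCCESS);
}

int main(int argc, char **argv)
{
   int rank;
   MPI_Init(&argc, &argv);
   MPI_Comm_rank(MPI_COMM_WORLD, &rank);

   double deltaE = 0.0, maxE = 0.0;
   if (rank == 0) {
      /* In a real kernel these values would come from MUSCLE,
         which only rank 0 may query; hard-coded here as placeholders. */
      deltaE = 0.1;
      maxE   = 100.0;
   }

   /* Every rank takes part in the collective broadcast. */
   Ring_Broadcast_Params(&deltaE, &maxE);
   printf("rank %d: deltaE=%f maxE=%f\n", rank, deltaE, maxE);

   MPI_Finalize();
   return 0;
}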
MPI Kernels as standalone executables
Compilation
Running
Limitations
- Any MUSCLE API routine may ONLY be called by the rank 0 process. If you need any parameters to be available to all MPI processes, broadcast them with MPI_Bcast, e.g. (as in the provided example):
void Ring_Broadcast_Params(double *deltaE, double *maxE)
{
   assert( MPI_Bcast(deltaE, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD) == MPI_SUCCESS);
   assert( MPI_Bcast(maxE,   1, MPI_DOUBLE, 0, MPI_COMM_WORLD) == MPI_SUCCESS);
}
MPI implementation
Currently OpenMPI does not support applications that dynamically load libraries (and thus it does not work with JNI) - see this bug. You need to use another MPI implementation.
The following implementations have been tested so far:
Running practices
When a kernel uses MPI, it must be the only kernel in its MUSCLE instance. Other kernels, as well as main, must be run from other MUSCLE instances.
Preparing source code
Java
Users may override executeDirectly() to better fit their needs.
C/C++
As in Java, only the process with rank 0 may use MUSCLE routines, such as portals or the kernel.willStop() method. The user is responsible for stopping all processes once the kernel should stop.
MPI should be used in the standard way, i.e. the program should start with MPI_Init() (MPI::Init() in the C++ bindings) and finish with MPI_Finalize() (MPI::Finalize()).
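The overall shape of such a kernel might look like the sketch below. The MUSCLE-specific part on rank 0 is deliberately left as a comment (no MUSCLE C/C++ API calls are shown), and the broadcast of a stop flag is just one possible way of meeting the requirement that the user stops all ranks; the compute_step function and the stopping condition are illustrative placeholders, not part of MUSCLE.
#include <mpi.h>
#include <stdio.h>

/* Illustrative compute step executed by every rank. */
static void compute_step(int rank, int step)
{
   printf("rank %d: step %d\n", rank, step);
}

int main(int argc, char **argv)
{
   int rank;
   MPI_Init(&argc, &argv);
   MPI_Comm_rank(MPI_COMM_WORLD, &rank);

   int stop = 0;
   int step = 0;
   while (!stop) {
      compute_step(rank, step);

      if (rank == 0) {
         /* Only rank 0 talks to MUSCLE: send/receive via portals,
            check kernel.willStop(), etc. (calls omitted here).
            It decides whether the whole MPI job should stop. */
         stop = (step >= 9);   /* placeholder stopping condition */
      }

      /* Rank 0 propagates the decision so that every rank leaves
         the loop together - the user must stop all processes. */
      MPI_Bcast(&stop, 1, MPI_INT, 0, MPI_COMM_WORLD);
      ++step;
   }

   MPI_Finalize();
   return 0;
}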
Running MPI kernels
To run an MPI kernel, the processes must be started with the mpirun/mpiexec utility. Only one kernel per MUSCLE instance is allowed. It is crucial to add the --mpi parameter to the muscle command - otherwise the simulation will fail, as multiple identical MUSCLE instances will be initialized.
mpirun -n 5 muscle --mpi --cxa_file src/cxa/Test.cxa.rb mpi
Of course, in another process the --main parameter must be given, together with any other required kernels, for example:
muscle --main plumber --cxa_file src/cxa/Test.cxa.rb --autoquit
An example (which can be started with the commands shown above) has been added to the SVN in revision 68.