= Coupling MPI codes using MUSCLE =

== MPI Kernels as dynamic libraries ==

This approach follows the original MUSCLE philosophy, which relies on the Java Native Interface/Access mechanism to integrate C/C++ codes into the kernels. A new method, `public void executeDirectly()`, is available in the `CaController` class. On processes with non-zero rank, only this method is called instead of the normal MUSCLE routines; the process with rank 0 is started in the usual way. Portals cannot be attached to slave processes (i.e. to processes with non-zero rank). The default implementation of `executeDirectly` simply calls `execute()`.

=== Compilation ===

=== Running ===

=== Limitations ===

* Any MUSCLE API routine may be called **ONLY** by the rank 0 process. If you need any parameters to be available to all MPI processes, use the `MPI_Bcast` function, e.g. (as in the provided example):
{{{
void Ring_Broadcast_Params(double *deltaE, double *maxE)
{
    assert( MPI_Bcast(deltaE, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD) == MPI_SUCCESS);
    assert( MPI_Bcast(maxE, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD) == MPI_SUCCESS);
}
}}}
* A separate Java Virtual Machine is started for every MPI process, which significantly increases the memory footprint of the whole application.
* Many MPI implementations exploit low-level optimization techniques (such as Direct Memory Access) that may crash the Java Virtual Machine.
* Using MPI to start many Java Virtual Machines, each of which loads a native dynamic-link library that in turn calls MPI routines, is something very few people do. In case of problems you might not find any help (you have been warned! ;-).

== MPI Kernels as standalone executables ==

=== Compilation ===

=== Running ===

=== Limitations ===

* Any MUSCLE API routine may be called **ONLY** by the rank 0 process. If you need any parameters to be available to all MPI processes, use the `MPI_Bcast` function, e.g. (as in the provided example):
{{{
void Ring_Broadcast_Params(double *deltaE, double *maxE)
{
    assert( MPI_Bcast(deltaE, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD) == MPI_SUCCESS);
    assert( MPI_Bcast(maxE, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD) == MPI_SUCCESS);
}
}}}

=== MPI implementation ===

Currently OpenMPI //does not support// applications that dynamically load libraries (thus it does not work with JNI) - see [http://www.open-mpi.org/community/lists/devel/2005/09/0359.php this bug]. You need to use another MPI implementation. The following implementations have been tested so far:
* [http://software.intel.com/en-us/articles/intel-mpi-library/ Intel MPI]
* [http://www.mcs.anl.gov/research/projects/mpich2/ MPICH2]

=== Running practices ===

When using MPI in a kernel, that kernel must be the only one run in its MUSCLE instance. Other kernels, as well as main, must be run from other MUSCLE instances.

== Preparing source code ==

=== Java ===

Users may override `executeDirectly` to better fit their needs.

=== C/C++ ===

As in Java, only the process with rank 0 may use MUSCLE routines, such as portals or the `kernel.willStop()` method. The user is responsible for stopping all the other processes once the kernel should stop (one possible pattern is sketched at the end of this page). MPI should be used in the standard way, i.e. the program should start with `MPI_Init()` (`MPI::Init()` in C++) and finish with `MPI_Finalize()` (`MPI::Finalize()`).
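For the standalone-executable approach, the overall structure of such a kernel might look as follows. This is only a minimal sketch: `Kernel_Get_Params` is a hypothetical placeholder (filled with dummy values here) for whatever MUSCLE call rank 0 actually uses to obtain the parameters, and the broadcast helper is the same `Ring_Broadcast_Params` shown in the Limitations sections above.
{{{
#include <assert.h>
#include <stdio.h>
#include <mpi.h>

/* Placeholder for the actual MUSCLE portal call; in a real kernel this would
 * obtain the parameters from MUSCLE. Only rank 0 is allowed to call it. */
static void Kernel_Get_Params(double *deltaE, double *maxE)
{
    *deltaE = 0.1;   /* dummy values, just for the sketch */
    *maxE   = 10.0;
}

/* Same helper as in the Limitations sections above. */
static void Ring_Broadcast_Params(double *deltaE, double *maxE)
{
    assert( MPI_Bcast(deltaE, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD) == MPI_SUCCESS);
    assert( MPI_Bcast(maxE, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD) == MPI_SUCCESS);
}

int main(int argc, char **argv)
{
    int rank, size;
    double deltaE = 0.0, maxE = 0.0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Only rank 0 talks to MUSCLE ... */
    if (rank == 0)
        Kernel_Get_Params(&deltaE, &maxE);

    /* ... and then distributes the parameters to every other rank. */
    Ring_Broadcast_Params(&deltaE, &maxE);

    /* All ranks take part in the actual computation. */
    printf("rank %d of %d: deltaE=%g maxE=%g\n", rank, size, deltaE, maxE);

    MPI_Finalize();
    return 0;
}
}}}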
== Running MPI kernels ==

To run an MPI kernel, the processes must be distributed with the `mpirun`/`mpiexec` utility. Only one kernel per MUSCLE instance is allowed. It is crucial to add the `--mpi` parameter to the `muscle` command - otherwise the simulation will fail, as multiple identical MUSCLE instances will be initialized:

`mpirun -n 5 muscle --mpi --cxa_file src/cxa/Test.cxa.rb mpi`

Of course, in another process the `--main` parameter must be given, together with the other required kernels, for example:

`muscle --main plumber --cxa_file src/cxa/Test.cxa.rb --autoquit`

An example (which can be started with the commands shown above) has been added to the SVN in revision 68.
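=== Stopping the MPI processes ===

As noted in the //Preparing source code// section, only rank 0 may ask MUSCLE whether the kernel should stop (e.g. via `kernel.willStop()`), and the user is responsible for stopping all the other MPI processes. The fragment below sketches one simple way to do this; it is not a complete program, and `Kernel_Will_Stop` / `Kernel_Do_Step` are hypothetical placeholders, not part of the MUSCLE API.
{{{
#include <mpi.h>

/* Hypothetical placeholders: Kernel_Will_Stop() stands for the rank-0-only
 * MUSCLE check (the willStop() equivalent) and Kernel_Do_Step() for one
 * iteration of the kernel's own computation. */
int Kernel_Will_Stop(void);
void Kernel_Do_Step(int rank);

/* Time loop executed by every rank: rank 0 decides when to stop and
 * broadcasts that decision so that all processes leave the loop together. */
void Kernel_Run(int rank)
{
    int stop = 0;

    while (!stop) {
        Kernel_Do_Step(rank);

        if (rank == 0)
            stop = Kernel_Will_Stop();   /* MUSCLE call, rank 0 only */

        MPI_Bcast(&stop, 1, MPI_INT, 0, MPI_COMM_WORLD);
    }
}
}}}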