| 86 | === MUSCLE and MPI === |
| 87 | |
| 88 | When using MPI, MUSCLE_Send and MUSCLE_Receive only do their operations in rank 0, calls from other ranks are ignored. This means that the data should be gathered with MPI before a MUSCLE_Send and broadcasted after a MUSCLE_Receive. |
| 89 | |
| 90 | The `MUSCLE_Barrier` set of functions ease the integration of MUSCLE with MPI. Most MPI functions (including barrier, broadcast and gather) use a polling mechanism when they wait for communication to happen. This will use all the available CPU power but will somewhat reduce the latency of the operation. However, with MUSCLE, often other submodels than the MPI submodel should do some computing, and while the MPI operation waits this will slow down other submodels immensely. Therefore, MUSCLE has its own barrier operation, which has a higher latency than MPI_Barrier, but will not use any CPU resources. Since only rank 0 of the process ever receives data from MUSCLE, and a receive must wait for another submodel to send the message, that is a good point for calling a barrier. If multiple receives follow each other, barrier only needs to be called after the last one. |
| 91 | |
| 92 | The MUSCLE_Barrier API has three parts: `MUSCLE_Barrier_Init(char **barrier, size_t *len, int num_mpi_procs)`, |
| 93 | `MUSCLE_Barrier(const char *barrier)` and `MUSCLE_Barrier_Destroy(char *barrier)`. In Init the barrier is created. It needs the number of MPI processes since MUSCLE itself does not do any MPI calls and has no other way to find out. The barrier data structure should then be broadcasted with MPI by the user so that each process uses the same barrier. Each time MUSCLE_Barrier is called, it waits until all ranks have called it. `MUSCLE_Barrier_Init` returns -1 in rank 0 if it fails, and 0 otherwise. `MUSCLE_Barrier` returns -1 in any rank that it fails in, but that does not guarantee that it failed in other ranks. In Destroy the resources of the barrier are cleaned up. |
| 94 | |
| 95 | A typical piece of code could look as follows: |
| 96 | {{{ |
| 97 | int mpi_size; |
| 98 | MPI_Comm_size(&mpi_size, MPI_COMM_WORLD); |
| 99 | |
| 100 | char *barrier; |
| 101 | size_t barrier_len; |
| 102 | MUSCLE_Barrier_Init(&barrier, &barrier_len, mpi_size); |
| 103 | MPI_Bcast(barrier, barrier_len, MPI_CHAR, 0, MPI_COMM_WORLD); |
| 104 | |
| 105 | while (!EC) { |
| 106 | // Only receives in rank 0, it is ignored for other ranks |
| 107 | MUSCLE_Receive(data, ...); |
| 108 | MUSCLE_Barrier(barrier); |
| 109 | MPI_Bcast(data, ...) |
| 110 | |
| 111 | // do something |
| 112 | |
| 113 | // Only sends in rank 0, it is ignored in other ranks |
| 114 | MUSCLE_Send(data, ...) |
| 115 | } |
| 116 | MUSCLE_Barrier_Destroy(barrier); |
| 117 | }}} |
| 118 | This paradigm is used in `src/cpp/examples/simplempi/sender.c`. |
| 119 | |