The C API of MUSCLE is a subset of the C++ API, which in turn is a subset of the Java API. Because the core of MUSCLE is written in Java, any C or C++ executable communicates with Java code, but the API hides this. For both the C and the C++ API, the include path is $MUSCLE_HOME/include.


The header file for the C API is muscle2/cmuscle.h. The functions there are limited, but they should be sufficient for most needs:

muscle_error_t MUSCLE_Init(int* argc, char*** argv);
void MUSCLE_Finalize(void);

const char* MUSCLE_Kernel_Name();
const char* MUSCLE_Get_Property(const char* name);
int MUSCLE_Will_Stop();

muscle_error_t MUSCLE_Send(const char *entrance_name, void *array, size_t size, muscle_datatype_t type);
void* MUSCLE_Receive(const char *exit_name, void *array, size_t *size, muscle_datatype_t type);

int MUSCLE_Barrier_Init(char **barrier, size_t *len, int num_mpi_ranks);
int MUSCLE_Barrier(const char *barrier);
void MUSCLE_Barrier_Destroy(char *barrier);

The MUSCLE_Init function must be called before any of the other MUSCLE functions is used. It initializes MUSCLE given the current arguments of main. After all MUSCLE calls have been made, MUSCLE_Finalize should be called. Usually:

#include <muscle2/cmuscle.h>

int main(int argc, char *argv[])
{
    MUSCLE_Init(&argc, &argv);
    // ... submodel code that uses the other MUSCLE functions ...
    MUSCLE_Finalize();
    return 0;
}

After this, any code can be written. The MUSCLE_Kernel_Name function gives the current kernel name. This must NOT be freed after use. Similarly, MUSCLE_Get_Property gives a property as a string, which must NOT be freed afterwards.
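
Because MUSCLE_Get_Property returns every property as a string, numeric settings have to be converted before use. The following is a minimal sketch of such a conversion. parse_double_property is a hypothetical helper, not part of MUSCLE, and it takes the string as an argument so that it compiles without the MUSCLE headers.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>

// Hypothetical helper (not part of MUSCLE): converts a property string,
// such as one returned by MUSCLE_Get_Property, to a double.
// Returns true and stores the value in *out on success. Returns false if
// the string is null, empty, not a number, out of range, or followed by
// trailing characters.
bool parse_double_property(const char *value, double *out)
{
    assert(out != nullptr);
    if (value == nullptr || *value == '\0')
        return false;

    errno = 0;
    char *end = nullptr;
    double parsed = std::strtod(value, &end);
    if (end == value || *end != '\0' || errno == ERANGE)
        return false;

    *out = parsed;
    return true;
}
```

A call might look like parse_double_property(MUSCLE_Get_Property("dt"), &dt), where "dt" is only an illustrative property name. The helper never frees the string, in line with the rule above.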

The send and receive functions can transport several datatypes, which are enumerated in muscle2/muscle_types.h. For reference, the following datatypes are available:

muscle_datatype_t   C/C++ data type         Java data type     maximum size
MUSCLE_DOUBLE       double *                double[]           134e6 values
MUSCLE_FLOAT        float *                 float[]            268e6 values
MUSCLE_INT32        int *                   int[]              268e6 values
MUSCLE_INT64        long *                  long[]             134e6 values
MUSCLE_STRING       const char *            String             64e3 characters
MUSCLE_RAW          unsigned char *         byte[]             268e6 values
MUSCLE_COMPLEX      muscle::ComplexData *   any other object   1 GiB
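
The maximum sizes in the table can be checked before sending. The sketch below is one possible way to do that. It uses a local enum, sketch_datatype_t, as a stand-in for muscle_datatype_t so that it compiles without the MUSCLE headers, and it treats the rounded figures from the table as exact limits, which is an assumption. Neither function is part of MUSCLE.

```cpp
#include <cassert>
#include <cstddef>

// Local stand-in for muscle_datatype_t; real code would use the enum from
// muscle2/muscle_types.h instead.
enum sketch_datatype_t {
    SKETCH_DOUBLE, SKETCH_FLOAT, SKETCH_INT32,
    SKETCH_INT64, SKETCH_STRING, SKETCH_RAW
};

// Maximum element counts, copied from the table above (rounded figures).
std::size_t max_elements(sketch_datatype_t type)
{
    switch (type) {
    case SKETCH_DOUBLE: return 134000000;  // 134e6 values
    case SKETCH_FLOAT:  return 268000000;  // 268e6 values
    case SKETCH_INT32:  return 268000000;  // 268e6 values
    case SKETCH_INT64:  return 134000000;  // 134e6 values
    case SKETCH_STRING: return 64000;      // 64e3 characters
    case SKETCH_RAW:    return 268000000;  // 268e6 values
    }
    assert(false);  // not reached for the enumerators listed above
    return 0;
}

// True if an array of `count` elements stays within the limit for `type`.
bool fits_in_one_message(sketch_datatype_t type, std::size_t count)
{
    return count <= max_elements(type);
}
```

A caller could evaluate fits_in_one_message before MUSCLE_Send and split a larger array over several messages.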

The type MUSCLE_COMPLEX is only available for C++ code. MUSCLE_STRING will send data up to the first nul-terminator or given length, whichever is first.
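
The rule for MUSCLE_STRING, up to the first nul-terminator or the given length, can be illustrated with a small length calculation. string_send_length is a hypothetical helper, not a MUSCLE function. It only shows how many characters the rule selects and does not model whether the terminator itself is transmitted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper (not part of MUSCLE): number of characters selected
// by the rule "up to the first '\0' or max_len, whichever comes first".
std::size_t string_send_length(const char *text, std::size_t max_len)
{
    assert(text != nullptr);
    const void *nul = std::memchr(text, '\0', max_len);
    if (nul == nullptr)
        return max_len;  // no terminator within the first max_len characters
    return static_cast<std::size_t>(static_cast<const char *>(nul) - text);
}
```

For a buffer holding "abc" followed by a terminator and more data, the helper returns 3 regardless of a larger max_len. For an unterminated prefix it returns max_len.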

Receiving a message can be done in two ways: either the memory is allocated beforehand and the number of elements in the array is given as the third argument, or a null pointer is passed, in which case MUSCLE will allocate the memory. In both cases, the memory must be freed by the user. For example:

size_t sz = 100;
double *data = (double *)malloc(sz*sizeof(double));
if (data) {
    for (int i = 0; !MUSCLE_Will_Stop(); i++) {
        MUSCLE_Receive("exitName", data, &sz, MUSCLE_DOUBLE);
        // sz will be set to the actual size of the received message
        // do something with the data
    }
    free(data);
}


size_t sz = 0;
for (int i = 0; !MUSCLE_Will_Stop(); i++) {
    double *data = (double *)MUSCLE_Receive("exitName", (void *)0, &sz, MUSCLE_DOUBLE);
    // do something with the data
    // data will have length sz
    free(data);
}

The first example only works if an upper bound on the size of the received messages is known in advance. The second example is safer, since MUSCLE allocates the necessary memory based on the message size. However, it also means that new memory is allocated for each message.
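
The ownership rules of the two receiving strategies can be tried out without MUSCLE by substituting a stand-in for MUSCLE_Receive. In the sketch below, fake_receive is such a stand-in and always delivers the same three doubles. Its behaviour when the caller's buffer is too small (returning a null pointer) is an assumption made for the sketch, not documented MUSCLE behaviour. Because the received size overwrites sz, the first strategy resets sz to the buffer capacity before every call. That reset is a defensive choice rather than a stated requirement.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Stand-in for MUSCLE_Receive: "receives" a fixed message of three doubles.
// ASSUMPTION: a message larger than a caller-supplied buffer is reported by
// returning a null pointer.
void *fake_receive(void *array, std::size_t *size)
{
    static const double message[] = {1.0, 2.0, 3.0};
    const std::size_t count = sizeof(message) / sizeof(message[0]);
    assert(size != nullptr);

    if (array == nullptr) {
        array = std::malloc(sizeof(message));  // buffer allocated for the caller
        if (array == nullptr)
            return nullptr;
    } else if (*size < count) {
        return nullptr;                        // caller's buffer is too small
    }
    std::memcpy(array, message, sizeof(message));
    *size = count;                             // actual number of elements
    return array;
}

// Strategy 1: one caller-owned buffer, reused for every message.
double sum_reusing_buffer(int messages, std::size_t capacity)
{
    double *data = static_cast<double *>(std::malloc(capacity * sizeof(double)));
    if (data == nullptr)
        return 0.0;
    double total = 0.0;
    for (int i = 0; i < messages; i++) {
        std::size_t sz = capacity;             // reset: each receive overwrites sz
        if (fake_receive(data, &sz) == nullptr)
            break;
        for (std::size_t j = 0; j < sz; j++)
            total += data[j];
    }
    std::free(data);                           // freed once, after the loop
    return total;
}

// Strategy 2: a fresh buffer per message, allocated by the receive call.
double sum_with_fresh_buffers(int messages)
{
    double total = 0.0;
    for (int i = 0; i < messages; i++) {
        std::size_t sz = 0;
        double *data = static_cast<double *>(fake_receive(nullptr, &sz));
        if (data == nullptr)
            break;
        for (std::size_t j = 0; j < sz; j++)
            total += data[j];
        std::free(data);                       // freed after every message
    }
    return total;
}
```

The first function performs one allocation in total, the second one per message. That is the trade-off described above.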

MUSCLE_Send can only be used in one way. Its first argument is the conduit entrance name, followed by the data, the number of elements in the array, and finally the datatype.

double data[100];
size_t sz = 100;
for (int i = 0; !MUSCLE_Will_Stop(); i++) {
    // do something with the data...
    // send the data
    MUSCLE_Send("entranceName", data, sz, MUSCLE_DOUBLE);
}
// data is on the stack; it will be freed automatically


Depending on the MPI implementation, MUSCLE_Init should be called before MPI_Init. This may generate a warning about forking, but as long as the submodel produces correct results this can be ignored. When using MPI, MUSCLE calls only perform their operations in rank 0; calls from other ranks are ignored. This means that data should be gathered with MPI before a MUSCLE_Send and broadcast with MPI after a MUSCLE_Receive. The functions MUSCLE_Kernel_Name, MUSCLE_Get_Property, and MUSCLE_Will_Stop cannot give a meaningful result in ranks other than rank 0, so calling them from other ranks results in undefined behavior and should be avoided. If needed, their results can be propagated with MPI in your code.

The MUSCLE_Barrier set of functions eases the integration of MUSCLE with MPI. Most MPI functions (including barrier, broadcast and gather) use a polling mechanism when they wait for communication to happen. This uses all the available CPU power but somewhat reduces the latency of the operation. With MUSCLE, however, submodels other than the MPI submodel often need to compute at the same time, and polling while the MPI operation waits slows those submodels down immensely. Therefore, MUSCLE has its own barrier operation, which has a higher latency than MPI_Barrier but does not use any CPU resources. Since only rank 0 of the process ever receives data from MUSCLE, and a receive must wait for another submodel to send the message, that is a good point for calling a barrier. If multiple receives follow each other, the barrier only needs to be called after the last one. Note that only ranks other than 0 are stopped by the barrier; rank 0 will not stop.

The MUSCLE_Barrier API has three parts: MUSCLE_Barrier_Init(char **barrier, size_t *len, int num_mpi_procs), MUSCLE_Barrier(const char *barrier) and MUSCLE_Barrier_Destroy(char *barrier). MUSCLE_Barrier_Init creates the barrier. It needs the number of MPI processes since MUSCLE itself does not make any MPI calls and has no other way to find out. The user should then broadcast the barrier data structure with MPI so that each process uses the same barrier. Each call to MUSCLE_Barrier waits until all ranks have called it. MUSCLE_Barrier_Init returns -1 in rank 0 if it fails, and 0 otherwise. MUSCLE_Barrier returns -1 in any rank in which it fails, but that does not guarantee that it failed in other ranks. MUSCLE_Barrier_Destroy cleans up the resources of the barrier.

A typical piece of code could look as follows:

int mpi_size;
MPI_Comm_size(MPI_COMM_WORLD, &mpi_size);

char *barrier;
size_t barrier_len;
MUSCLE_Barrier_Init(&barrier, &barrier_len, mpi_size);
MPI_Bcast(barrier, barrier_len, MPI_CHAR, 0, MPI_COMM_WORLD);

while (!EC) {
    // Only receives in rank 0, it is ignored for other ranks
    MUSCLE_Receive(data, ...);
    // Other ranks wait here, without polling, until rank 0 has received
    MUSCLE_Barrier(barrier);
    MPI_Bcast(data, ...);

    // do something

    // Only sends in rank 0, it is ignored in other ranks
    MUSCLE_Send(data, ...);
}
MUSCLE_Barrier_Destroy(barrier);

This paradigm is used in src/cpp/examples/simplempi/sender.c.


The C++ API resembles the C API but is more extensive. The relevant header files are muscle2/cppmuscle.hpp and muscle2/complex_data.hpp. All functions included in cppmuscle.hpp are static, so the API can only be used for one submodel per executable. The following functions are available:

muscle_error_t muscle::env::init(int* argc, char ***argv);
void muscle::env::finalize(void);

bool muscle::env::will_stop(void);

void muscle::env::send(std::string entrance_name, const void *data, size_t count, muscle_datatype_t type);
void muscle::env::sendDoubleVector(std::string entrance_name, const std::vector<double>& data);

void* muscle::env::receive(std::string exit_name, void *data, size_t &count, muscle_datatype_t type);
std::vector<double> muscle::env::receiveDoubleVector(std::string exit_name);

void muscle::env::free_data(void *ptr, muscle_datatype_t type);

std::string muscle::env::get_tmp_path(void);

std::string muscle::cxa::get_property(std::string name);
std::string muscle::cxa::get_properties(void);
std::string muscle::cxa::kernel_name(void);

int muscle::util::barrier_init(char **barrier, size_t *len, int num_mpi_procs);
int muscle::util::barrier(const char *barrier);
void muscle::util::barrier_destroy(char *barrier);

init, finalize, will_stop, send, receive, get_property, kernel_name, barrier_init, barrier, and barrier_destroy behave exactly as their C counterparts. Since scientific data is often vectors of doubles, there are two new convenience functions to send or receive only double vectors. Vectors are safer than arrays memory-wise and they do range-checking. If other vector methods are of interest to you, send us an email. Except for vectors, the data that is received must be freed by MUSCLE by calling free_data with the received pointer and the datatype of the received pointer.
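
Since every received pointer has to be handed back through free_data together with its datatype, a small scope guard can help avoid leaks on early returns or exceptions. The sketch below is one possible shape for such a guard. ReceivedData is a hypothetical class, and sketch_free_data and sketch_datatype_t are stand-ins for muscle::env::free_data and muscle_datatype_t so that the sketch compiles without MUSCLE.

```cpp
#include <cassert>
#include <cstdlib>

// Stand-ins for muscle_datatype_t and muscle::env::free_data.
enum sketch_datatype_t { SKETCH_DOUBLE };
static int sketch_free_calls = 0;  // counts releases, for illustration only

void sketch_free_data(void *ptr, sketch_datatype_t /*type*/)
{
    assert(ptr != nullptr);
    std::free(ptr);
    ++sketch_free_calls;
}

// Hypothetical scope guard: releases the received buffer when the guard goes
// out of scope, including when an exception propagates.
class ReceivedData {
public:
    ReceivedData(void *ptr, sketch_datatype_t type) : ptr_(ptr), type_(type) {}
    ~ReceivedData()
    {
        if (ptr_ != nullptr)
            sketch_free_data(ptr_, type_);
    }
    // Non-copyable, so the buffer is released exactly once.
    ReceivedData(const ReceivedData &) = delete;
    ReceivedData &operator=(const ReceivedData &) = delete;

    void *get() const { return ptr_; }

private:
    void *ptr_;
    sketch_datatype_t type_;
};
```

In real code the constructor would receive the pointer returned by muscle::env::receive, and the destructor would call muscle::env::free_data with the same datatype.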

Complex data

In addition to the datatypes that are available in the C API, the C++ API can also receive a MUSCLE_COMPLEX type. This will receive a muscle::ComplexData object, defined in the muscle2/complex_data.hpp header file. ComplexData is used when the sender of a message sends data other than a plain array. A Java sender might send a 2-D array of doubles, for instance. All possible types are listed in the enum muscle_complex_t. For the moment only arrays and matrices are supported. Working with ComplexData goes as follows, for example with a two-dimensional double array:

#include <vector>
#include <stdexcept>
#include <muscle2/cppmuscle.hpp>
#include <muscle2/complex_data.hpp>

using namespace std;
using namespace muscle;

double processMessage()
{
    size_t sz;
    // receive takes the count by reference, so sz is passed directly
    ComplexData *cdata = (ComplexData *) muscle::env::receive("matrixIn", (void *)0, sz, MUSCLE_COMPLEX);
    if (cdata->getType() != COMPLEX_DOUBLE_MATRIX_2D) {
        // release the message before reporting the error
        muscle::env::free_data(cdata, MUSCLE_COMPLEX);
        throw runtime_error("Expecting double 2D matrix; received something else.");
    }
    vector<int> dims = cdata->getDimensions();

    // get the matrix data as a flat array of doubles
    double *data = (double *) cdata->getData();

    // Do something with the data
    double result = 0;
    for (int x = 0; x < dims[0]; x++) {
        for (int y = 0; y < dims[1]; y++) {
           // the matrix is indexed according to the different dimensions
           result += data[cdata->fidx(x,y)];
        }
    }
    muscle::env::free_data(cdata, MUSCLE_COMPLEX);
    return result;
}

The indices of the 2-D array are computed using the index or fidx function, where the latter is faster but does not do range or dimensionality checking. To send a ComplexData object, it must first be constructed. The easiest way is to let ComplexData do the memory allocation and destruction, and to obtain the allocated memory from ComplexData to work on.
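
The difference between a checked and an unchecked index function can be sketched with a small stand-in class. Matrix2D below is hypothetical and is not ComplexData. In particular, the row-major layout (x * ny + y) is an assumption made for the sketch, and the layout used by ComplexData may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a 2-D ComplexData.
// ASSUMPTION: row-major storage, offset = x * ny + y.
class Matrix2D {
public:
    Matrix2D(std::size_t nx, std::size_t ny)
        : nx_(nx), ny_(ny), data_(nx * ny, 0.0)
    {
        assert(nx > 0 && ny > 0);
    }

    // Unchecked, in the spirit of fidx: fast, but an out-of-range x or y
    // silently yields a wrong or out-of-bounds offset.
    std::size_t fidx(std::size_t x, std::size_t y) const { return x * ny_ + y; }

    // Checked, in the spirit of index: validates both coordinates first.
    std::size_t index(std::size_t x, std::size_t y) const
    {
        if (x >= nx_ || y >= ny_)
            throw std::out_of_range("matrix index out of range");
        return fidx(x, y);
    }

    // Element access through the checked index.
    double &at(std::size_t x, std::size_t y) { return data_[index(x, y)]; }

private:
    std::size_t nx_, ny_;
    std::vector<double> data_;
};
```

With a 10 by 14 matrix, fidx(0, 14) and fidx(1, 0) give the same offset without any error, whereas index(0, 14) throws. This is the kind of mistake the checked variant is meant to catch.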

vector<int> dims(2);
// Set dimensions x and y
dims[0] = 10; dims[1] = 14;
ComplexData cdata(COMPLEX_DOUBLE_MATRIX_2D, &dims);
double *data = (double *)cdata.getData();

// Do something with the data
double result = 0;
for (int x = 0; x < dims[0]; x++) {
    for (int y = 0; y < dims[1]; y++) {
       // the matrix is indexed according to the different dimensions, using fidx
       result += data[cdata.fidx(x,y)];
    }
}
muscle::env::send("matrixOut", &cdata, cdata.length(), MUSCLE_COMPLEX);


MUSCLE provides logging facilities for C++. The functions are all static.

void muscle::logger::severe(char *message, ...);
void muscle::logger::warning(char *message, ...);
void muscle::logger::info(char *message, ...);
void muscle::logger::config(char *message, ...);
void muscle::logger::fine(char *message, ...);
void muscle::logger::finer(char *message, ...);
void muscle::logger::finest(char *message, ...);

Each takes a message, and optionally more arguments provided in printf syntax. For instance:

muscle::logger::info("the temporary path is %s", muscle::env::get_tmp_path().c_str());

This will print the message to stdout and prepend the current kernel name and the log level. The log is written to TMP_DIR/[instance]/[instance].native.log.
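
How a printf-style message is combined with a kernel name and a log level can be sketched as follows. format_log_line is a hypothetical function and not MUSCLE's implementation. The "[kernel] LEVEL: " layout is an illustrative assumption, since the exact prefix format is not specified here.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>

// Hypothetical sketch: formats a log line from a printf-style message and
// prefixes the kernel name and the log level.
// ASSUMPTION: the "[kernel] LEVEL: " layout is illustrative only.
std::string format_log_line(const char *kernel, const char *level,
                            const char *message, ...)
{
    assert(kernel != nullptr && level != nullptr && message != nullptr);

    char body[512];
    va_list args;
    va_start(args, message);
    // vsnprintf truncates messages that do not fit in the buffer
    std::vsnprintf(body, sizeof(body), message, args);
    va_end(args);

    return std::string("[") + kernel + "] " + level + ": " + body;
}
```

As with the muscle::logger functions, the extra arguments must match the conversion specifiers in the message. A std::string therefore has to be passed through c_str() for a %s specifier.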
