Version 13 (modified by mmamonski, 12 years ago)
Currently, a co-allocated MUSCLE application can only be submitted using the XML JobProfile format (compare QCG-SimpleClient). Besides using a different job description format, you have to append the QCG keyword to the qcg-sub command:
$ qcg-sub muscle.xml QCG
Example (Fusion - Transport Turbulence Equilibrium)
- Install your application on every cluster you wish to use
- Register it on every cluster using the QCG Community Modules (QCE) mechanism:
qcg-module-create -g plggmuscle Fusion/Turbulence
The module must have the same name on every cluster. Inside the module you can set or prepend any environment variable and add dependencies on other modules, e.g.:
#%Module 1.0
proc ModulesHelp { } {
    puts stderr "\tName: Fusion/Turbulence"
    puts stderr "\tVersion: 0.1"
    puts stderr "\tMaintainer: plgmamonski"
}
module-whatis "Fusion/Turbulence, 0.1"
#load all needed modules
module add muscle2
#sets TCL variable
set FUSION_KERNELS "/home/plgrid-groups/plggmuscle/fusionkernels"
#sets environment variable
setenv FUSION_KERNELS $FUSION_KERNELS
#add to the PATH native kernels
prepend-path PATH ${FUSION_KERNELS}/bin/
set curMod [module-info name]
if { [ module-info mode load ] } {
    puts stderr "$curMod load complete."
}
if { [ module-info mode remove ] } {
    puts stderr "$curMod unload complete."
}
In the module you can also set two environment variables interpreted by the MUSCLE framework: MUSCLE_CLASSPATH and MUSCLE_LIBPATH, which set the Java classpath and the search path for dynamically loadable libraries, respectively. Thanks to this mechanism you can use a single abstract CxA file that contains no site-specific paths. You can also load the module in an interactive QCG job, e.g.:
bash-4.1$ module load Fusion/Turbulence
openmpi/openmpi-open64_4.5.2-1.4.5-2 load complete.
Fusion/Turbulence load complete.
bash-4.1$ muscle2 -ma -c $FUSION_KERNELS/cxa/testSimpleModelsB_shared.cxa.rb
Running both MUSCLE2 Simulation Manager and the Simulation
=== Running MUSCLE2 Simulation Manager ===
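After loading the module in an interactive job, it can be useful to confirm which MUSCLE-related variables the module actually set. The following is a minimal sketch in POSIX shell; the function name show_muscle_env is hypothetical and is not part of QCG or MUSCLE. It only inspects the current environment.

```shell
#!/bin/sh
# show_muscle_env is a hypothetical helper, not a QCG or MUSCLE command.
# It prints the two variables that MUSCLE interprets, or notes that they
# are missing from the current environment.
show_muscle_env() {
    for name in MUSCLE_CLASSPATH MUSCLE_LIBPATH; do
        # Indirect lookup: read the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -n "$value" ]; then
            printf '%s=%s\n' "$name" "$value"
        else
            printf '%s is empty or not set\n' "$name"
        fi
    done
}

show_muscle_env
```

Run it right after `module load Fusion/Turbulence` to see whether the module file defines both variables on the cluster you are logged in to.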
- Prepare the XML job description:
<qcgJob appId="MAPPER" xmlns:jxb="http://java.sun.com/xml/ns/jaxb" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <task persistent="true" taskId="task">
    <requirements>
      <topology>
        <processes masterGroup="true" processesId="init:transp:dupCorep:turb">
          <processesCount>
            <value>1</value>
          </processesCount>
          <candidateHosts>
            <hostName>inula.man.poznan.pl</hostName>
          </candidateHosts>
        </processes>
        <processes processesId="equil:dupEquil">
          <processesCount>
            <value>1</value>
          </processesCount>
          <candidateHosts>
            <hostName>zeus.cyfronet.pl</hostName>
          </candidateHosts>
        </processes>
      </topology>
    </requirements>
    <execution type="mapper">
      <executable>
        <application name="muscle2"/>
      </executable>
      <arguments>
        <value>FusionSimpleModels.cxa.rb</value>
        <value>--verbose</value>
      </arguments>
      <stdout>
        <directory>
          <location type="URL">gsiftp://qcg.man.poznan.pl/~/MAPPER/${JOB_ID}.output</location>
        </directory>
      </stdout>
      <stderr>
        <directory>
          <location type="URL">gsiftp://qcg.man.poznan.pl/~/MAPPER/${JOB_ID}.error</location>
        </directory>
      </stderr>
      <stageInOut>
        <file name="FusionSimpleModels.cxa.rb" type="in">
          <location type="URL">gsiftp://qcg.man.poznan.pl/~/MAPPER/FusionSimpleModels.cxa.rb</location>
        </file>
        <file name="fusion-preprocess.sh" type="in">
          <location type="URL">gsiftp://qcg.man.poznan.pl/~/MAPPER/fusion-preprocess.sh</location>
        </file>
        <file name="fusion-postprocess.sh" type="in">
          <location type="URL">gsiftp://qcg.man.poznan.pl/~/MAPPER/fusion-postprocess.sh</location>
        </file>
        <directory name="data" type="in">
          <location type="URL">gsiftp://qcg.man.poznan.pl/~/MAPPER/data</location>
        </directory>
        <directory name="out" type="out">
          <location type="URL">gsiftp://qcg.man.poznan.pl/~/MAPPER/${JOB_ID}.out</location>
        </directory>
      </stageInOut>
      <environment>
        <variable name="QCG_MODULES_LIST">Fusion/Turbulence</variable>
        <variable name="QCG_PREPROCESS">fusion-preprocess.sh</variable>
        <variable name="QCG_POSTPROCESS">fusion-postprocess.sh</variable>
      </environment>
    </execution>
    <executionTime>
      <executionDuration>P0Y0M0DT0H30M</executionDuration>
    </executionTime>
  </task>
</qcgJob>
- In the above example we:
- run the simulation on two clusters, inula and zeus (<candidateHosts>), using advance reservations that the QCG-Broker creates automatically during the co-allocation process,
- request a maximum job walltime of 30 minutes (<executionDuration>),
- specify the kernels to run in the processesId attribute of each <processes> element (multiple kernels must be separated with a colon ":"),
- specify the number of processes to allocate for each group (<processesCount>).
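The two values that most often change between jobs are the colon-separated kernel list and the walltime. The sketch below shows one way to build them in POSIX shell. The helper names join_kernels and format_walltime are hypothetical, and the duration pattern is simply copied from the example above (P0Y0M0DT0H30M); that QCG accepts non-zero hours in the same pattern is an assumption.

```shell
#!/bin/sh
# Hypothetical helpers, not part of QCG: they only build attribute values.

# Join kernel names with ":" to form a processesId attribute value.
join_kernels() {
    result=""
    for kernel in "$@"; do
        # ${result:+:} adds the separator only once result is non-empty.
        result="${result}${result:+:}${kernel}"
    done
    printf '%s\n' "$result"
}

# Format a walltime given in minutes using the pattern from the example
# job description (assumption: hours above zero follow the same pattern).
format_walltime() {
    minutes=$1
    printf 'P0Y0M0DT%dH%dM\n' "$((minutes / 60))" "$((minutes % 60))"
}

join_kernels init transp dupCorep turb
format_walltime 30
```

The first call reproduces the processesId of the master group in the example, and the second reproduces its executionDuration value.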