On Scientific Linux, DRMAA packages can be installed from the YUM repository:

yum install pbs-drmaa    # Torque
yum install pbspro-drmaa # PBS Professional

Alternatively, compile DRMAA from the source package downloaded from SourceForge.

After installation you need either:

  • configure the DRMAA library to use Torque logs (RECOMMENDED). An example configuration of the DRMAA library (must be put in the /opt/qcg/dependencies/etc/pbs_drmaa.conf file):
    # pbs_drmaa.conf - Sample pbs_drmaa configuration file.
      
    wait_thread: 0,
      
    pbs_home: "/var/spool/pbs",
        
    cache_job_state: 600,
    
    Note: Remember to mount the server log directory as described in the earlier note.

or

  • configure Torque to keep information about completed jobs (e.g. by setting: qmgr -c 'set server keep_completed = 300'). If running in this configuration, consider providing more resources (e.g. two cores instead of one) to the VM that hosts the service. Moreover, tune the DRMAA configuration to throttle the polling rate:
      
    wait_thread: 0,
    cache_job_state: 60,
    pool_delay: 60,
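Whichever variant you choose, the configuration file can be generated in one step. The sketch below writes the recommended log-based settings; it targets the current directory so it can be inspected safely, and the values are the ones from the example above (adjust them to your installation):

```shell
# Sketch: generate the recommended log-based pbs_drmaa.conf.
# Written to the current directory for inspection; on a real host the
# target is /opt/qcg/dependencies/etc/pbs_drmaa.conf and pbs_home must
# point at your Torque spool directory.
CONF=./pbs_drmaa.conf
cat > "$CONF" <<'EOF'
# pbs_drmaa.conf - sample pbs_drmaa configuration file
wait_thread: 0,
pbs_home: "/var/spool/pbs",
cache_job_state: 600,
EOF
```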
      
    

It is possible to set the default queue by defining a default job category (in the /etc/qcg/pbs_drmaa.conf file):

job_categories: {
      default: "-q plgrid",
},

Maui/Moab configuration

Add appropriate rights for the qcg-comp and qcg-broker users in the Maui scheduler configuration file:

vim /var/spool/maui/maui.cfg
# primary admin must be first in list
ADMIN1                root
ADMIN2                qcg-broker
ADMIN3                qcg-comp

The qcg-comp user needs ADMIN3 privileges in order to get detailed information about all users' jobs (e.g. estimated start time). The qcg-broker user needs ADMIN2 privileges in order to create advance reservations (optional).

Search PATH

The service assumes that the following commands are in the standard search path:

  • pbsnodes
  • showres
  • setres
  • releaseres
  • checknode

If any of the above commands is not installed in a standard location (e.g. /usr/bin), you may need to edit the /etc/sysconfig/qcg-compd file and set the PATH variable appropriately, e.g.:

# INIT_WAIT=5
#
# DRM specific options
 
PATH=$PATH:/opt/maui/bin
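A quick way to verify the search path before starting the service is to loop over the required commands (a sketch; the command list is the one above):

```shell
# Report which of the scheduler commands required by the service
# are not resolvable via the current PATH.
for cmd in pbsnodes showres setres releaseres checknode; do
  command -v "$cmd" >/dev/null 2>&1 || echo "missing: $cmd"
done
```

Any command reported as missing should be made reachable by extending PATH in /etc/sysconfig/qcg-compd as shown above.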

Logging Level

If you compiled DRMAA with logging switched on (--enable-debug), you can set its logging level in /etc/sysconfig/qcg-compd:

# INIT_WAIT=5
#
# DRM specific options

DRMAA_LOG_LEVEL=INFO

Restricted node access

Read this section only if the system is configured in such a way that not all nodes are accessible to every queue/user. In that case you should provide a node filter expression in the sysconfig file (/etc/sysconfig/qcg-compd). Examples:

  • Provide information about nodes tagged with the qcg property:
    QCG_NODE_FILTER=properties:qcg
    
  • Provide information about all nodes except those tagged as gpgpu:
    QCG_NODE_FILTER=properties:~gpgpu
    
  • Provide information only about resources that have hp as the epoch value:
    QCG_NODE_FILTER=resources_available.epoch:hp
    

In general the QCG_NODE_FILTER must adhere to the following syntax:

pbsnodes-attr:regular-expression 

or, if you want to reverse the semantics (i.e. match all nodes except those matching the expression):

pbsnodes-attr:~regular-expression 
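To preview which nodes a given filter would select, the matching can be emulated against pbsnodes-style output with awk (a sketch; the service's exact matching may differ — it applies the regular expression to the named pbsnodes attribute). Sample output is inlined here; on a real system pipe `pbsnodes -a` instead:

```shell
# Sample pbsnodes -a output, inlined for demonstration.
pbsnodes_output() {
cat <<'EOF'
node01
     state = free
     properties = qcg,infiniband
node02
     state = free
     properties = gpgpu
EOF
}

ATTR=properties
REGEX=qcg

# Print the names of nodes whose given attribute matches the regex.
pbsnodes_output | awk -v attr="$ATTR" -v re="$REGEX" '
/^[^ ]/ { node = $1 }                       # unindented lines are node names
$1 == attr && $2 == "=" && $3 ~ re { print node }'
```

For the reversed (`~`) form, negate the match (`!($3 ~ re)`) instead.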

Configuring the PBS DRMAA submit filter

In order to enforce the PL-Grid grant policy you must configure a PBS DRMAA submit filter by editing /etc/sysconfig/qcg-compd and adding a variable pointing to the DRMAA submit filter, e.g.:

PBSDRMAA_SUBMIT_FILTER="/opt/exp_soft/plgrid/qcg-app-scripts/app-scripts/tools/plgrid-grants/pbsdrmaa_submit_filter.py"

An example submit filter can be found in the QosCosGrid SVN repository:

svn co https://apps.man.poznan.pl/svn/qcg-computing/trunk/app-scripts/tools/plgrid-grants

More about PBS DRMAA submit filters can be found here.
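For illustration only, a minimal filter might look like the sketch below. The stdin-to-stdout contract is an assumption modeled on Torque submit filters; consult the pbs-drmaa documentation for the exact interface, and note that the grant name is purely hypothetical:

```shell
# Hypothetical submit filter sketch: read the job script, inject an
# accounting directive after the first line, and emit the result.
# The stdin -> stdout interface is an ASSUMPTION (Torque-style);
# the grant name below is purely illustrative.
GRANT="plgrid-example-grant"

filter_job_script() {
  awk -v grant="$GRANT" '
    NR == 1 { print; print "#PBS -A " grant; next }
    { print }'
}

# Demo: run a two-line job script through the filter.
printf '#!/bin/sh\nhostname\n' | filter_job_script
```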

Restricting advance reservation

By default the QCG-Computing service can reserve any number of hosts. One can limit this by configuring the Maui/Moab scheduler and the QCG-Computing service appropriately:

  • In Maui/Moab mark some subset of nodes, using the partition mechanism, as reservable for QCG-Computing:
    # all users can use both the DEFAULT and RENABLED partition
    SYSCFG           PLIST=DEFAULT,RENABLED
    # in Moab you should use 0 instead of DEFAULT
    #SYSCFG           PLIST=0,RENABLED
      
    # mark some set of the machines (e.g. 64 nodes) as reservable
    NODECFG[node01] PARTITION=RENABLED
    NODECFG[node02] PARTITION=RENABLED
    NODECFG[node03] PARTITION=RENABLED
    ...
    NODECFG[node64] PARTITION=RENABLED
    
    
  • Tell QCG-Computing to limit reservations to the aforementioned partition by editing the /etc/sysconfig/qcg-compd configuration file:
    QCG_AR_MAUI_PARTITION="RENABLED"
  • Moreover, QCG-Computing (since version 2.4) can enforce limits on the maximum reservation duration (default: one week) and size (measured in the number of reserved slots):
    ...
                            <ReservationsPolicy>
                                    <MaxDuration>24</MaxDuration> <!-- 24 hours -->
                                    <MaxSlots>100</MaxSlots>
                            </ReservationsPolicy>
    ...
    

PBS JSDL Filter modules

There is no standard way of expressing the requested number of cores (on any number of nodes) across Torque/PBS Pro installations. It also depends on the version, the scheduler used, and even the scheduler configuration. It might be:

  • -l procs=N
  • -l nodes=N
  • -l select=N

If in your system the number of requested cores is specified in a way other than -l procs, you have to modify the configuration of the pbs_jsdl_filter module in the qcg-compd.xml file:

			<sm:Module xsi:type="pbs_jsdl_filter">
				<smc:TCCExpression>-l nodes=%d </smc:TCCExpression>
			</sm:Module>

You will learn about qcg-compd.xml later in the manual. For now, just remember that you may need to come back here later.
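For reference, the analogous module configurations for the other two variants listed above would look like this (sketches; keep only whichever expression your scheduler actually accepts):

```
			<sm:Module xsi:type="pbs_jsdl_filter">
				<smc:TCCExpression>-l procs=%d </smc:TCCExpression>
			</sm:Module>

			<sm:Module xsi:type="pbs_jsdl_filter">
				<smc:TCCExpression>-l select=%d </smc:TCCExpression>
			</sm:Module>
```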