Version 23 (modified by piontek, 9 years ago)

QCG Broker Client Installation

Requirements

  • Java (>= 1.5)

    For SL5.x

    yum install java-1.6.0-sun-compat.x86_64
    
  • Apache Ant (>= 1.6) ("Optional tasks for ant" are required. For SL5.x the ant-nodeps.x86_64 package must be installed)

    For SL5.x

    yum install ant.x86_64 ant-nodeps.x86_64
    
  • xml-commons-apis - installing this package is optional, but recommended because it avoids confusing messages about unimportant errors.

    For SL5.x

    yum install xml-commons.x86_64 xml-commons-apis.x86_64
    
  • trusted CA certificates - to enable secure communication between the client and the QCG-Broker service, a set of trusted CA certificates must either be copied into the /etc/grid-security/certificates directory or configured individually for every user.

For the PL-Grid project, install the Polish Grid and PL-Grid Simple-CA certificates:

wget https://dist.eugridpma.info/distribution/igtf/current/accredited/RPMS/ca_PolishGrid-1.38-1.noarch.rpm
wget http://software.plgrid.pl/packages/general/ca_PLGRID-SimpleCA-1.0-2.noarch.rpm
wget https://dist.eugridpma.info/distribution/util/fetch-crl/fetch-crl-2.8.5-1.noarch.rpm

rpm -i ca_PolishGrid-1.38-1.noarch.rpm 
rpm -i ca_PLGRID-SimpleCA-1.0-2.noarch.rpm 

#install certificate revocation list fetching utility 
rpm -i fetch-crl-2.8.5-1.noarch.rpm

#get fresh CRLs now
/usr/sbin/fetch-crl 

#install cron job for it
cat > /etc/cron.daily/fetch-crl.cron << EOF
#!/bin/sh 
/usr/sbin/fetch-crl
EOF

chmod a+x /etc/cron.daily/fetch-crl.cron
  • User's credential - to secure the communication and authenticate the user to the service, an X509 proxy certificate is needed. The client expects access either to a proxy certificate file or to a pair of user certificate and private key files in "pem" format.

If the user has the certificate in p12 format, it first has to be converted to pem format files:

  openssl pkcs12 -nocerts -in cert.p12 -out userkey.pem
  openssl pkcs12 -clcerts -nokeys -in cert.p12 -out usercert.pem

Installation

The installation of the QCG-Broker client can be done in two ways:

  1. Using the QCG-Broker distribution, which contains the sources and a precompiled client. This kind of installation is designed for any Linux distribution that meets the requirements described above. It can be performed either by a system administrator (to deploy one common instance of the client accessible to all users) or by a regular user (who wants their own instance).
  2. Using the provided RPM package. The package is designed for Scientific Linux 5.x (the recommended version is 5.5) and the installation process requires root privileges.

Installation using the QCG-Broker distribution

  • download the QCG-Broker archive (qcg-broker.tgz)
    wget http://node2.qoscosgrid.man.poznan.pl/~piontek/qcg-broker/qcg-broker.tgz
    
  • unpack the archive
    tar xzf qcg-broker.tgz
    

Compilation

The distribution contains a precompiled version of the QCG-Broker command-line client that can be deployed as is. The compilation step is optional and can be skipped unless specific compiler options need to be added or changed.

  • compile sources
    cd qcg-broker-<VERSION>
    ant client-stubs client
    

Setup

  • set up the deployment configuration - all configuration variables are placed in the client-deploy.prop file
    • client.deploy.dir - directory where QCG-Broker client will be deployed
    • client.service.host - QCG-Broker service hostname
    • client.service.port - QCG-Broker service port
    • client.service.dn - QCG-Broker credential DN
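
Before running the deployment step, it can help to confirm that all four variables listed above are actually set. The following is a minimal sketch, assuming that client-deploy.prop uses the common "key=value" properties syntax (the file format is not specified here); the function name check_deploy_props is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: verify that the four deployment variables are set.
# Assumption: client-deploy.prop uses the usual "key=value" properties syntax.
check_deploy_props() {
    prop_file=$1
    missing=0
    for key in client.deploy.dir client.service.host \
               client.service.port client.service.dn; do
        # Escape the dots so that they match literally in the regular expression.
        pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
        # Accept "key=value" or "key = value" with a non-empty value.
        if ! grep -Eq "^[[:space:]]*${pattern}[[:space:]]*=[[:space:]]*[^[:space:]]" "$prop_file"; then
            echo "missing or empty: $key" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

A possible use is to guard the deployment, for example: check_deploy_props client-deploy.prop && ant deploy-client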

Deployment

  • deploy QCG-Broker command-line client
    ant deploy-client
    

Installation using provided RPM

Perform the whole installation procedure as the root user.

  • Install the PL-Grid and/or QCG repositories:
    • Official PL-Grid repository
      rpm -Uvh http://software.plgrid.pl/packages/repos/plgrid-repos-2010-2.noarch.rpm
      
    • QosCosGrid testing repository
      cat > /etc/yum.repos.d/qcg.repo << EOF
      [qcg]
      name=QosCosGrid YUM repository
      baseurl=http://fury.man.poznan.pl/qcg-packages/sl/x86_64/
      enabled=1
      gpgcheck=0
      EOF
      
  • install QCG-Broker client using YUM Package Manager:
    yum install qcg-broker-client
    
  • configure the client specifying the QCG-Broker URL and DN:
    vim /opt/plgrid/qcg/qcg-broker/client/etc/qcg-broker-client.conf
    

Configuration

To work properly and to authenticate the user to the service, the client has to be able to load the user's proxy certificate and validate the service credential during the handshake procedure. To do this, it needs to know the location of the file containing the proxy certificate and of the directory containing the public keys of the Certificate Authorities it should trust.

  • The client looks for the proxy according to the following rules:
    It first checks the X509_USER_PROXY system property. If the property
    is not set, it next checks the 'proxy' property in the current
    configuration. If that property is not set, it defaults to a
    value based on the following rules:
    If a UID system property is set and the client runs on a Unix machine,
    it returns /tmp/x509up_u${UID}. On any machine other than Unix, it returns
    ${tempdir}/x509up_u${UID}, where tempdir is a platform-specific
    temporary directory as indicated by the java.io.tmpdir system property.
    If a UID system property is not set, the username is used instead
    of the UID. That is, it returns ${tempdir}/x509up_u_${username}.
    
  • The client looks for the CA directory according to the following rules:
    It first checks the X509_CERT_DIR system property. If the property
    is not set, it checks next the 'cacert' property in the current
    configuration. If that property is not set, it tries to find
    the certificates using the following rules:
    First the ${user.home}/.globus/certificates directory is checked.
    If the directory does not exist, and on a Unix machine, the
    /etc/grid-security/certificates directory is checked next.
    If that directory does not exist and GLOBUS_LOCATION
    system property is set then the ${GLOBUS_LOCATION}/share/certificates
    directory is checked. 
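
To check which files the client is likely to pick up, the two lookup orders above can be approximated in a shell script. This is only a sketch for the Unix case: it uses environment variables in place of Java system properties, reads the 'proxy' and 'cacert' keys from a properties file, and the names cog_property, resolve_proxy, resolve_cert_dir and the COG_PROPERTIES override are hypothetical, not part of the client.

```shell
#!/bin/sh
# Sketch of the lookup order described above (Unix case only).
# Assumption: environment variables stand in for Java system properties.

# Print the value of a key from a "key=value" properties file, if present.
# COG_PROPERTIES is a hypothetical override used to point at another file.
cog_property() {
    key=$1
    file=${COG_PROPERTIES:-$HOME/.globus/cog.properties}
    [ -f "$file" ] || return 0
    sed -n "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*//p" "$file" | head -n 1
}

resolve_proxy() {
    # 1. Explicit setting wins.
    if [ -n "${X509_USER_PROXY:-}" ]; then
        echo "$X509_USER_PROXY"
        return 0
    fi
    # 2. 'proxy' property from the configuration file.
    configured=$(cog_property proxy)
    if [ -n "$configured" ]; then
        echo "$configured"
        return 0
    fi
    # 3. Default on Unix: /tmp/x509up_u<UID>.
    echo "/tmp/x509up_u$(id -u)"
}

resolve_cert_dir() {
    if [ -n "${X509_CERT_DIR:-}" ]; then
        echo "$X509_CERT_DIR"
        return 0
    fi
    configured=$(cog_property cacert)
    if [ -n "$configured" ]; then
        echo "$configured"
        return 0
    fi
    # Fall back to the well-known directories, in the documented order.
    for dir in "$HOME/.globus/certificates" /etc/grid-security/certificates; do
        if [ -d "$dir" ]; then
            echo "$dir"
            return 0
        fi
    done
    if [ -n "${GLOBUS_LOCATION:-}" ] && [ -d "$GLOBUS_LOCATION/share/certificates" ]; then
        echo "$GLOBUS_LOCATION/share/certificates"
        return 0
    fi
    return 1
}
```

Calling resolve_proxy and resolve_cert_dir from a script prints the paths that this approximation selects; the client's own resolution remains authoritative.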
    

The CoG library configuration can be modified using the CoG properties file ~/.globus/cog.properties:

#Java CoG Kit Configuration File
proxy=/tmp/x509up_u501
cacert=/etc/grid-security/certificates/

Additionally, the location of the user's certificate and private key must be specified:

usercert=/home/piontek/.globus/usercert.pem
userkey=/home/piontek/.globus/userkey.pem

If they are specified and the user's proxy certificate doesn't exist, it will be created automatically by the client. Otherwise, the proxy certificate has to be created with the grid-proxy-init tool.

Job Profile

Every experiment submitted to QCG-Broker has to be described by an XML-based document called a Job Profile. The structure of a Job Profile is formalized by the Job Profile schema.

Examples of Job Profiles describing basic use cases are distributed with QCG-Broker and can be found in the <CLIENT_DIR>/examples directory.

Usage

The QCG-Broker Java-based command-line client can operate in two modes:

  • batch mode – executes a single operation with arguments passed directly to the client at invocation. Batch mode allows the client to be used in any kind of script, mostly in cases where the output has to be processed to steer the experiment,
  • console mode – works similarly to a shell console in which the user types lines with operations and arguments to be executed by the service. Console mode offers additional useful features such as aliases, history accessible with the arrow keys, creation and management of the user proxy, and help.

The usage of the client depends on the mode:

  • for batch mode: "qcg-broker OPERATION [ARG1 .. ARGn]"
  • for console mode: "qcg-broker -console"; the user is then prompted to type lines in the format "OPERATION [ARG1 .. ARGn]" to be processed by the client.

IMPORTANT: To secure the communication between the client and the service and to delegate the user's privileges to the service, the client needs access to the user's proxy certificate.

Operations

Regardless of the mode, the QCG-Broker Java-based command-line client supports the following operations:

| Operation | Arguments | Description |
|---|---|---|
| submit_job | <desc_file> (GRMS or JSDL) | submits a job to be executed. The job description can be expressed either in the native QCG-Broker language or, where possible, in JSDL. If the description is valid, the client returns a globally unique job identifier that unambiguously identifies the job in the system. QCG defines a job as a set of dependent tasks that constitute a logical whole (workflow). Each task is executed by the system only if all tasks it depends on are in the states specified by the user. |
| list_jobs | [<limit>] [<status>] | lists jobs belonging to the user. It is possible either to limit the number of jobs or to display only those in a given state. All possible states are listed below the table. |
| list_user_jobs | [<limit>] [<status>] <user> | lists jobs belonging to the given user. This operation is intended for administrative purposes. |
| test_description | <desc_file> (GRMS or JSDL) | validates the job description |
| translate_description | <desc_file> (JSDL) | translates the job description to the native QCG-Broker one |
| job_info | <jobId> [<showJobDesc>] | returns comprehensive information about the given job. If showJobDesc is false, the job description is not shown |
| cancel_job | <jobId> | cancels execution of the given job |
| commit_job | <jobId> | approves a job submitted with the two-phase commit mechanism so that it can be processed by the system. The two-phase commit mechanism can be used to register notifications before the broker starts processing the job. |
| list_tasks | <jobId> [<status>] | lists tasks belonging to the given job. Optionally, a task status can be specified. Possible task statuses are listed below the table. |
| tasks_statuses | <jobId> [<summary>] | lists the tasks constituting the given job with their statuses. If the summary argument is true, some additional statistics are displayed. |
| register_job_notification | <jobId> <url> | registers a notification consumer for the given job |
| list_job_notifications | <jobId> | lists notifications registered for the given job |
| register_tasks_notification | <jobId> <url> | registers a notification for all tasks of the given job |
| monitor_job | <jobId> [<interval>] | monitors status changes of tasks belonging to the given job. The interval argument sets the delay in seconds between consecutive status checks. |
| monitor_task | <jobId> <taskId> [<interval>] | monitors status changes of allocations belonging to the given task. The interval argument sets the delay in seconds between consecutive status checks. |
| task_info | <jobId> <taskId> [<showDesc> [<limit>]] | displays information about the given task. If showDesc is false, the task description is not shown. If the limit argument is specified, the history of the task is limited to the given value. |
| register_task_notification | <jobId> <taskId> <url> | registers a notification consumer for the given task |
| list_task_notifications | <jobId> <taskId> | lists the task's notifications |
| cancel_task | <jobId> <taskId> | cancels execution of the given task |
| commit_task | <jobId> <taskId> | commits the given task to be processed by the system |
| reserve_resources | [<taskId>] <job_desc> (GRMS or JSDL) | reserves resources that meet the requirements of either the whole job or the given task. The reservation identifier is returned. This functionality is not implemented yet! |
| reservation_info | <reservationId> | returns comprehensive information about the given reservation: list of reserved resources, local identifiers of reservations, reservation time slot. This functionality is not implemented yet! |
| cancel_reservation | <reservationId> | releases reserved resources. This functionality is not implemented yet! |

List of Job statuses:

  • UNCOMMITTED – the job was submitted with the two-phase commit option and waits to be committed,
  • SUBMITTED – the job was submitted to the system and is being executed by the system,
  • SUSPENDED – the job was suspended,
  • ACTIVE – the job is active; at least one task is being processed,
  • FINISHED – the job was completed,
  • FAILED – the job (at least one crucial task belonging to the job) failed,
  • CANCELED – the job was canceled by the user,
  • BROKEN – one or more crucial tasks failed; the system waits until the active tasks finish and then changes the status of the job to FAILED.

List of Task statuses:

  • UNSUBMITTED – the task cannot be started because of dependencies,
  • UNCOMMITED - the task waits to be committed,
  • QUEUED – the task was put into the queue and waits for execution,
  • PREPROCESSING – the system performs the actions needed to start the task (looks for a resource, stages in files),
  • PENDING – the task is pending in the queueing system,
  • RUNNING – the task is active,
  • STOPPED – the task has finished or was checkpointed, but the system has not started staging out files yet,
  • POSTPROCESSING – the system performs the actions needed to complete the task, for example stages out files, cleans up the working environment, etc.,
  • FINISHED – the task was completed,
  • SUSPENDED – the task was suspended,
  • FAILED – the task failed,
  • CANCELED – the task was canceled by the user.

Usage examples

The example below is a QCG-Broker Job Profile describing a parameter-sweep experiment that executes a set of UNIX calendar (cal) tasks over a predefined range of the "month" parameter.

<grmsJob appId="calendar_example">
        <task persistent="true" taskId="calendar">
                <execution type="single">
                        <executable>
                                <execFile>
                                        <file>
                                                <location type="URL">file:////usr/bin/cal</location>
                                        </file>
                                </execFile>
                        </executable>
                        <arguments>
                                <value>${PS_month}</value>
                                <value>2010</value>
                        </arguments>
                        <stdout>
                                <file>
                                        <location type="URL">${TASK_DIR}/stdout.txt</location>
                                </file>
                        </stdout>
                </execution>
                <parametersSweep>
                        <parameter>
                                <name>month</name>
                                <value>
                                        <loop>
                                                <start>1</start>
                                                <end>12</end>
                                                <step>1</step>
                                                <except>
                                                        <value>3</value>
                                                        <value>6</value>
                                                </except>
                                        </loop>
                                </value>
                        </parameter>
                </parametersSweep>
        </task>
</grmsJob>
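
To see which tasks such a loop yields, the expansion can be imitated in a few lines of shell. This is only an illustrative sketch, not the broker's implementation: it handles integer values only, the function name expand_sweep is hypothetical, and the "<taskId>_PSit<index>" naming is taken from the job_info output shown later on this page.

```shell
#!/bin/sh
# Hypothetical helper: list the tasks produced by a <loop> parameter sweep.
# Usage: expand_sweep TASK_ID START END STEP [EXCLUDED_VALUE...]
# Assumption: integer values and a positive step.
expand_sweep() {
    task_id=$1 start=$2 end=$3 step=$4
    shift 4
    [ "$step" -gt 0 ] || return 1
    index=0
    value=$start
    while [ "$value" -le "$end" ]; do
        skip=0
        # Values listed in <except> do not generate a task.
        for excluded in "$@"; do
            if [ "$value" -eq "$excluded" ]; then
                skip=1
            fi
        done
        if [ "$skip" -eq 0 ]; then
            echo "${task_id}_PSit${index} ${value}"
            index=$((index + 1))
        fi
        value=$((value + step))
    done
}
```

For the profile above, expand_sweep calendar 1 12 1 3 6 prints ten lines, from "calendar_PSit0 1" to "calendar_PSit9 12", with months 3 and 6 left out.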

submit_job

  • submit_job <job_profile> - submits a job. <job_profile> must be the path to the file containing the Job Profile.
    qcg-client submit_job ../examples/usecase8.xml
    Your identity: C=PL,O=GRID,O=PSNC,CN=Tomasz Piontek
    Creating proxy, please wait...
    Proxy verify OK
    Your proxy is valid until Tue May 17 02:55:47 CEST 2011
    UserDN = /C=PL/O=GRID/O=PSNC/CN=Tomasz Piontek
    ProxyLifetime = 0 Days 11 Hours 59 Minutes 57 Seconds
    
    jobId = 1305550554579_calendar_example_5366
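
When the client is used in batch mode from a script, the job identifier usually has to be captured for later calls. The sketch below assumes the "jobId = ..." line format shown in the output above; the function name extract_job_id is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read submit_job output on stdin and print the job identifier.
# Assumption: the identifier appears on a line of the form "jobId = <id>".
extract_job_id() {
    sed -n 's/^[[:space:]]*jobId[[:space:]]*=[[:space:]]*\([^[:space:]]*\).*$/\1/p' | head -n 1
}
```

A script could then run, for example: job_id=$(qcg-broker submit_job profile.xml | extract_job_id), where profile.xml is a placeholder file name.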
    

list_jobs

  • list_jobs [<limit>] [<status>] - lists jobs. Optionally, the output can be limited to a number of recent jobs or restricted to jobs with a given status.
    qcg-client list_jobs 5
    UserDN = /C=PL/O=GRID/O=PSNC/CN=Tomasz Piontek
    ProxyLifetime = 0 Days 11 Hours 58 Minutes 47 Seconds
    
    Number of jobs: 5
    List of jobs: 
    1301904727887_calendar_example_1403
    1304020897352_calendar_example_4252
    1304065465905_calendar_example_9526
    1305287977790_calendar_example_4779
    1305550554579_calendar_example_5366
    

job_info

  • job_info <jobId> [<showJobProfile>] - displays information about the given job. The boolean <showJobProfile> argument specifies whether the Job Profile should be displayed.
    qcg-client job_info 1305550554579_calendar_example_5366 false
    UserDN = /C=PL/O=GRID/O=PSNC/CN=Tomasz Piontek
    ProxyLifetime = 0 Days 11 Hours 58 Minutes 18 Seconds
    
    UserDN: /C=PL/O=GRID/O=PSNC/CN=Tomasz Piontek
    Project: 
    Status: FINISHED
    StatusDesc: 
    SubmissionTime: Mon May 16 14:55:54 CEST 2011
    FinishTime: Mon May 16 14:56:42 CEST 2011
    Number of tasks: 10
    Tasks: calendar_PSit0 calendar_PSit1 calendar_PSit2 calendar_PSit3 calendar_PSit4 calendar_PSit5 calendar_PSit6 calendar_PSit7 calendar_PSit8 calendar_PSit9 
    
    $ qcg-client job_info 1305550554579_calendar_example_5366
    UserDN = /C=PL/O=GRID/O=PSNC/CN=Tomasz Piontek
    ProxyLifetime = 0 Days 11 Hours 58 Minutes 6 Seconds
    
    UserDN: /C=PL/O=GRID/O=PSNC/CN=Tomasz Piontek
    Project: 
    Status: FINISHED
    StatusDesc: 
    SubmissionTime: Mon May 16 14:55:54 CEST 2011
    FinishTime: Mon May 16 14:56:42 CEST 2011
    Number of tasks: 10
    Tasks: calendar_PSit0 calendar_PSit1 calendar_PSit2 calendar_PSit3 calendar_PSit4 calendar_PSit5 calendar_PSit6 calendar_PSit7 calendar_PSit8 calendar_PSit9 
    DescriptionType: GRMS
    UserDescription: 
    <grmsJob appId="calendar_example">
      <task persistent="true" taskId="calendar">
        <execution type="single">
          <executable>
            <execFile>
              <file>
                <location type="URL">file:////usr/bin/cal</location>
              </file>
            </execFile>
          </executable>
          <arguments>
            <value>${PS_month}</value>
            <value>2010</value>
          </arguments>
          <stdout>
            <file>
              <location type="URL">${TASK_DIR}/stdout.txt</location>
            </file>
          </stdout>
        </execution>
        <parametersSweep>
          <parameter>
            <name>month</name>
            <value>
              <loop>
                <start>1</start>
                <end>12</end>
                <step>1</step>
                <except>
                  <value>3</value>
                  <value>6</value>
                </except>
              </loop>
            </value>
          </parameter>
        </parametersSweep>
      </task>
    </grmsJob>
    
    GrmsDescription: 
    <grmsJob appId="calendar_example">
      <task persistent="true" taskId="calendar_PSit0">
        <execution type="single">
          <executable>
            <execFile>
              <file>
                <location type="URL">file:////usr/bin/cal</location>
              </file>
            </execFile>
          </executable>
          <arguments>
            <value>1.0</value>
            <value>2010</value>
          </arguments>
          <stdout>
            <file>
              <location type="URL">${TASK_DIR}/stdout.txt</location>
            </file>
          </stdout>
        </execution>
      </task>
      <task persistent="true" taskId="calendar_PSit1">
        <execution type="single">
          <executable>
            <execFile>
              <file>
                <location type="URL">file:////usr/bin/cal</location>
              </file>
            </execFile>
          </executable>
          <arguments>
            <value>2.0</value>
            <value>2010</value>
          </arguments>
          <stdout>
            <file>
              <location type="URL">${TASK_DIR}/stdout.txt</location>
            </file>
          </stdout>
        </execution>
      </task>
      <task persistent="true" taskId="calendar_PSit2">
        <execution type="single">
          <executable>
            <execFile>
              <file>
                <location type="URL">file:////usr/bin/cal</location>
              </file>
            </execFile>
          </executable>
          <arguments>
            <value>4.0</value>
            <value>2010</value>
          </arguments>
          <stdout>
            <file>
              <location type="URL">${TASK_DIR}/stdout.txt</location>
            </file>
          </stdout>
        </execution>
      </task>
      <task persistent="true" taskId="calendar_PSit3">
        <execution type="single">
          <executable>
            <execFile>
              <file>
                <location type="URL">file:////usr/bin/cal</location>
              </file>
            </execFile>
          </executable>
          <arguments>
            <value>5.0</value>
            <value>2010</value>
          </arguments>
          <stdout>
            <file>
              <location type="URL">${TASK_DIR}/stdout.txt</location>
            </file>
          </stdout>
        </execution>
      </task>
      <task persistent="true" taskId="calendar_PSit4">
        <execution type="single">
          <executable>
            <execFile>
              <file>
                <location type="URL">file:////usr/bin/cal</location>
              </file>
            </execFile>
          </executable>
          <arguments>
            <value>7.0</value>
            <value>2010</value>
          </arguments>
          <stdout>
            <file>
              <location type="URL">${TASK_DIR}/stdout.txt</location>
            </file>
          </stdout>
        </execution>
      </task>
      <task persistent="true" taskId="calendar_PSit5">
        <execution type="single">
          <executable>
            <execFile>
              <file>
                <location type="URL">file:////usr/bin/cal</location>
              </file>
            </execFile>
          </executable>
          <arguments>
            <value>8.0</value>
            <value>2010</value>
          </arguments>
          <stdout>
            <file>
              <location type="URL">${TASK_DIR}/stdout.txt</location>
            </file>
          </stdout>
        </execution>
      </task>
      <task persistent="true" taskId="calendar_PSit6">
        <execution type="single">
          <executable>
            <execFile>
              <file>
                <location type="URL">file:////usr/bin/cal</location>
              </file>
            </execFile>
          </executable>
          <arguments>
            <value>9.0</value>
            <value>2010</value>
          </arguments>
          <stdout>
            <file>
              <location type="URL">${TASK_DIR}/stdout.txt</location>
            </file>
          </stdout>
        </execution>
      </task>
      <task persistent="true" taskId="calendar_PSit7">
        <execution type="single">
          <executable>
            <execFile>
              <file>
                <location type="URL">file:////usr/bin/cal</location>
              </file>
            </execFile>
          </executable>
          <arguments>
            <value>10.0</value>
            <value>2010</value>
          </arguments>
          <stdout>
            <file>
              <location type="URL">${TASK_DIR}/stdout.txt</location>
            </file>
          </stdout>
        </execution>
      </task>
      <task persistent="true" taskId="calendar_PSit8">
        <execution type="single">
          <executable>
            <execFile>
              <file>
                <location type="URL">file:////usr/bin/cal</location>
              </file>
            </execFile>
          </executable>
          <arguments>
            <value>11.0</value>
            <value>2010</value>
          </arguments>
          <stdout>
            <file>
              <location type="URL">${TASK_DIR}/stdout.txt</location>
            </file>
          </stdout>
        </execution>
      </task>
      <task persistent="true" taskId="calendar_PSit9">
        <execution type="single">
          <executable>
            <execFile>
              <file>
                <location type="URL">file:////usr/bin/cal</location>
              </file>
            </execFile>
          </executable>
          <arguments>
            <value>12.0</value>
            <value>2010</value>
          </arguments>
          <stdout>
            <file>
              <location type="URL">${TASK_DIR}/stdout.txt</location>
            </file>
          </stdout>
        </execution>
      </task>
    </grmsJob>
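
In a script, the "Status:" line of this output can be used to wait for a job to end. The following is a rough sketch under two assumptions: the status appears on a line of the form "Status: <STATE>" as shown above, and FINISHED, FAILED and CANCELED are the states a job does not leave again (an interpretation of the job status list, not a documented guarantee). All function names are hypothetical.

```shell
#!/bin/sh
# Read job_info output on stdin and print the value of the first "Status:" line.
job_status() {
    sed -n 's/^[[:space:]]*Status:[[:space:]]*\([A-Z]*\).*$/\1/p' | head -n 1
}

# Succeed if the given status is treated as final.
# Assumption: only FINISHED, FAILED and CANCELED are final states.
is_terminal_status() {
    case $1 in
        FINISHED|FAILED|CANCELED) return 0 ;;
        *) return 1 ;;
    esac
}

# Poll the broker until the job reaches a final state, then print that state.
# Usage: wait_for_job JOB_ID [INTERVAL_SECONDS]
wait_for_job() {
    job_id=$1
    interval=${2:-30}
    while :; do
        status=$(qcg-broker job_info "$job_id" false | job_status)
        # Give up if no status could be read (for example on a client error).
        [ -n "$status" ] || return 1
        if is_terminal_status "$status"; then
            echo "$status"
            return 0
        fi
        sleep "$interval"
    done
}
```

The monitor_job operation covers interactive use; a loop like this is only of interest when a script has to branch on the final state.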
    

tasks_statuses

  • tasks_statuses <jobId> - lists tasks constituting the given job with their statuses.
    qcg-broker tasks_statuses 1305550554579_calendar_example_5366
    UserDN = /C=PL/O=GRID/O=PSNC/CN=Tomasz Piontek
    ProxyLifetime = 0 Days 11 Hours 41 Minutes 30 Seconds
    
    Number of tasks: 10
    Tasks statuses: 
    calendar_PSit7  : FINISHED
    calendar_PSit6  : FINISHED
    calendar_PSit5  : FINISHED
    calendar_PSit4  : FINISHED
    calendar_PSit3  : FINISHED
    calendar_PSit2  : FINISHED
    calendar_PSit1  : FINISHED
    calendar_PSit0  : FINISHED
    calendar_PSit9  : FINISHED
    calendar_PSit8  : FINISHED
    ------ SUMMARY --------
    Number of tasks: 10
    FINISHED        : 10
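
The per-task lines of this output are easy to aggregate in a script, for example to decide whether any task is still running. The sketch below assumes the "<taskId> : <STATUS>" line format shown above; the function name count_task_statuses is hypothetical.

```shell
#!/bin/sh
# Read tasks_statuses output on stdin and print "STATUS count" for each status.
# Only lines of the form "<taskId> : <STATUS>" are counted, so the SUMMARY
# section (whose right-hand side is a number) is ignored.
count_task_statuses() {
    awk -F: '
        NF == 2 {
            status = $2
            gsub(/[ \t\r]/, "", status)
            if (status ~ /^[A-Z]+$/) count[status]++
        }
        END { for (s in count) print s, count[s] }
    ' | sort
}
```

Piping the output of qcg-broker tasks_statuses <jobId> into count_task_statuses yields one line per status.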
    

task_info

  • task_info <jobId> <taskId> - displays information about the given task
    qcg-broker task_info 1305550554579_calendar_example_5366 calendar_PSit0
    UserDN = /C=PL/O=GRID/O=PSNC/CN=Tomasz Piontek
    ProxyLifetime = 0 Days 11 Hours 40 Minutes 11 Seconds
    
    TaskType: SINGLE
    SubmissionTime: Mon May 16 14:55:57 CEST 2011
    FinishTime: Mon May 16 14:56:27 CEST 2011
    ProxyLifetime: PT0S
    Status: FINISHED
    StatusDesc: 
    StartTime: Mon May 16 14:56:04 CEST 2011
    DescriptionType: <task persistent="true" taskId="calendar_PSit0">
      <execution type="single">
        <executable>
          <execFile>
            <file>
              <location type="URL">file:////usr/bin/cal</location>
            </file>
          </execFile>
        </executable>
        <arguments>
          <value>1.0</value>
          <value>2010</value>
        </arguments>
        <stdout>
          <file>
            <location type="URL">${TASK_DIR}/stdout.txt</location>
          </file>
        </stdout>
      </execution>
    </task>
    
    
    Coallocation: 
    UserDN: /C=PL/O=GRID/O=PSNC/CN=Tomasz Piontek
    HostName: grass1.man.poznan.pl
    ProcessesCount: 1
    ProcessesGroupId: 
    Status: FINISHED
    StatusDescription: 
    SubmissionTime: Mon May 16 14:56:04 CEST 2011
    FinishTime: Mon May 16 14:56:21 CEST 2011
    LocalSubmissionTime: Mon May 16 14:56:06 CEST 2011
    LocalStartTime: Mon May 16 14:56:10 CEST 2011
    LocalFinishTime: Mon May 16 14:56:10 CEST 2011