| 1 | ========================== |
---|
| 2 | PSNC DRMAA for LoadLeveler |
---|
| 3 | ========================== |
---|
| 4 | |
---|
| 5 | :Author: Michal Matloka <michal.matloka@student.put.poznan.pl>, Mariusz Mamonski <mamonski@man.poznan.pl> |
---|
| 6 | :Organization: Poznan Supercomputing and Networking Center |
---|
| 7 | :Contact: Mariusz Mamonski <mamonski@man.poznan.pl>, Michal Matloka <michal.matloka@student.put.poznan.pl> |
---|
| 8 | :Date: $Date: 2009-05-05 16:09:52 +0200 (Tue, 05 May 2009) $ |
---|
| 9 | :Version: 1.0.1 |
---|
| 10 | :Revision: $Revision: 220 $ |
---|
| 11 | :Copyright: Copyright (C) 2010 Poznan Supercomputing and Networking Center |
---|
| 12 | |
---|
| 13 | :Abstract: This document describes installation, configuration and usage |
---|
| 14 | of PSNC DRMAA for IBM Tivoli Workload Scheduler LoadLeveler. |
---|
| 15 | |
---|
| 16 | .. meta:: |
---|
| 17 | :http-equiv=Content-Language: en |
---|
| 18 | :http-equiv=Content-Type: application/xhtml+xml; charset=UTF-8 |
---|
| 19 | :description lang=en: Distributed Resource Management Application API 1.0 implementation for IBM LoadLeveler |
---|
| 20 | :keywords: DRMAA, Distributed Resource Management Application API, IBM Tivoli LoadLeveler, Poznan Supercomputing and Networking Center |
---|
| 21 | |
---|
| 22 | .. contents:: |
---|
| 23 | |
---|
| 24 | .. default-role:: literal |
---|
| 25 | |
---|
| 26 | |
---|
| 27 | Introduction |
---|
| 28 | ============ |
---|
| 29 | |
---|
| 30 | PSNC DRMAA for LoadLeveler is an implementation of `Open Grid Forum`_ DRMAA_ |
---|
| 31 | 1.0 (Distributed Resource Management Application API) specification_ |
---|
| 32 | for submission and control of jobs to `IBM Tivoli LoadLeveler`_. Using DRMAA, |
---|
| 33 | grid application builders, portal developers and ISVs can use the same |
---|
| 34 | high-level API to link their software with different cluster/resource |
---|
| 35 | management systems. |
---|
| 36 | |
---|
| 37 | This software also enables the integration of `SMOA Computing`_ with the |
---|
| 38 | underlying LoadLeveler system for remote multi-user job submission and control |
---|
| 39 | over Web Services. |
---|
| 40 | |
---|
| 41 | |
---|
| 42 | Installation |
---|
| 43 | ============ |
---|
| 44 | |
---|
| 45 | To compile and install the library, go to the main source directory |
---|
| 46 | and type:: |
---|
| 47 | |
---|
| 48 | $ ./configure [options] && make |
---|
| 49 | $ sudo make install |
---|
| 50 | |
---|
| 51 | The library was tested with LoadLeveler version 3.5 on AIX. |
---|
| 52 | If you encounter problems using the library on other systems, please |
---|
| 53 | report them to the contact e-mail addresses listed above. |
---|
| 54 | |
---|
| 55 | Notable `./configure` script options: |
---|
| 56 | |
---|
| 57 | `--with-ll-inc` LL_INCLUDE_PATH |
---|
| 58 | Path to the LL header files (i.e. the directory containing `llapi.h`). By default the |
---|
| 59 | `./configure` script tries to guess `LL_INCLUDE_PATH` and `LL_LIBRARY_PATH` from the location |
---|
| 60 | of the `llsubmit` executable. |
---|
| 61 | |
---|
| 62 | `--with-ll-lib` LL_LIBRARY_PATH |
---|
| 63 | Path to the LL libraries (i.e. the directory containing `libllapi.a`). |
---|
| 64 | |
---|
| 65 | `--prefix` INSTALLATION_DIRECTORY |
---|
| 66 | Root directory where PSNC DRMAA for LoadLeveler shall be installed. |
---|
| 67 | When not given, the library is installed alongside LoadLeveler. |
---|
| 68 | |
---|
| 69 | `--enable-debug` |
---|
| 70 | Compiles the library with debugging enabled (with debugging symbols not |
---|
| 71 | stripped, without optimizations, and with many log messages enabled). |
---|
| 72 | Useful when debugging a DRMAA-enabled application |
---|
| 73 | or investigating problems with the DRMAA library itself. |
---|
| 74 | |
---|
| 75 | There are no unusual requirements for basic usage of the library: an ANSI C |
---|
| 76 | compiler and a standard make program should suffice. If you have taken the |
---|
| 77 | sources directly from the SVN repository or wish to run the test-suite, you will |
---|
| 78 | need additional `developer tools`_. For further information regarding |
---|
| 79 | the GNU build system, see the INSTALL file. |
---|
| 80 | |
---|
| 81 | |
---|
| 82 | Configuration |
---|
| 83 | ============= |
---|
| 84 | |
---|
| 85 | During DRMAA session initialization (`drmaa_init`) the library tries to |
---|
| 86 | read its configuration parameters from the following locations: `/etc/ll_drmaa.conf`, |
---|
| 87 | `~/.ll_drmaa.conf` and the file given in the `LL_DRMAA_CONF` environment |
---|
| 88 | variable (if set to a non-empty string). If multiple configuration |
---|
| 89 | sources are present, all configurations are merged, with values |
---|
| 90 | from user-defined files taking precedence (highest precedence first: |
---|
| 91 | `$LL_DRMAA_CONF`, `~/.ll_drmaa.conf`, `/etc/ll_drmaa.conf`). |
---|
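
The lookup order described above can be sketched in C as follows. This is an illustrative sketch only, not the library's actual code: the function name `ll_drmaa_conf_candidates` and the constants `MAX_SOURCES` and `PATH_LEN` are hypothetical, and the home directory and the value of `LL_DRMAA_CONF` are passed in as arguments to keep the function easy to test. Files are listed from lowest to highest precedence, so values read from later files would override earlier ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MAX_SOURCES 3    /* hypothetical: system, user and environment files */
#define PATH_LEN    1024 /* hypothetical upper bound on a path length */

/* Fills `paths` with candidate configuration files, lowest precedence first.
 * `home` and `env_conf` stand for the values of HOME and LL_DRMAA_CONF;
 * either may be NULL or empty, in which case that source is skipped.
 * Returns the number of entries written (between 1 and MAX_SOURCES). */
int ll_drmaa_conf_candidates(const char *home, const char *env_conf,
                             char paths[MAX_SOURCES][PATH_LEN])
{
    int n = 0;

    assert(paths != NULL);

    /* System-wide file: lowest precedence. */
    snprintf(paths[n++], PATH_LEN, "%s", "/etc/ll_drmaa.conf");

    /* Per-user file: overrides the system-wide file. */
    if (home != NULL && home[0] != '\0')
        snprintf(paths[n++], PATH_LEN, "%s/.ll_drmaa.conf", home);

    /* File named by LL_DRMAA_CONF: highest precedence, non-empty values only. */
    if (env_conf != NULL && env_conf[0] != '\0')
        snprintf(paths[n++], PATH_LEN, "%s", env_conf);

    return n;
}
```

A caller would typically pass `getenv("HOME")` and `getenv("LL_DRMAA_CONF")` as the first two arguments and then read the returned files in order.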
| 92 | |
---|
| 93 | Currently recognized configuration parameters are: |
---|
| 94 | |
---|
| 95 | cache_job_state |
---|
| 96 | According to the DRMAA specification, every `drmaa_job_ps()` call should |
---|
| 97 | query the DRM system for the job state. This option reduces |
---|
| 98 | communication with the DRM. If set to a positive integer, `drmaa_job_ps()` |
---|
| 99 | returns the remembered job state without communicating with the DRM for |
---|
| 100 | `cache_job_state` seconds since the last update. By default the library |
---|
| 101 | conforms to the specification (no caching is performed). |
---|
| 102 | |
---|
| 103 | Type: integer, default: 0 |
---|
| 104 | |
---|
| 105 | job_categories |
---|
| 106 | Dictionary of job categories. Its keys are job category names |
---|
| 107 | mapped to `native specification`_ strings. Attributes set by a job |
---|
| 108 | category can be overridden by the corresponding DRMAA attributes or |
---|
| 109 | native specification. The special category name `default` is used when |
---|
| 110 | the `drmaa_job_category` job attribute is not set. |
---|
| 111 | |
---|
| 112 | Type: dictionary with string values, default: empty dictionary |
---|
| 113 | |
---|
| 114 | terminate_job_on_vacated |
---|
| 115 | Determines whether a job should be terminated as soon as it enters the `vacated` state. |
---|
| 116 | |
---|
| 117 | Type: integer, default: 1 |
---|
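
The effect of `cache_job_state` described above can be illustrated with a small C sketch. It is not taken from the library sources: the function name `job_state_is_fresh` is hypothetical, and treating a state that is exactly `cache_job_state` seconds old as stale is an assumption, since the text does not define the boundary case.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Returns true when a remembered job state may be returned without
 * querying the DRM, i.e. when caching is enabled and fewer than
 * `cache_job_state` seconds have passed since `last_update`.
 * With the default value 0 the result is always false, so every
 * call queries the DRM, as the DRMAA specification requires. */
bool job_state_is_fresh(time_t now, time_t last_update, int cache_job_state)
{
    assert(cache_job_state >= 0);

    if (cache_job_state == 0)
        return false; /* caching disabled */

    /* Assumption: an entry aged exactly cache_job_state seconds is stale. */
    return difftime(now, last_update) < (double)cache_job_state;
}
```

With `cache_job_state` set to 5, a state updated 3 seconds ago would be served from memory, while one updated 10 seconds ago would trigger a new query.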
| 118 | |
---|
| 119 | Configuration file syntax |
---|
| 120 | ------------------------- |
---|
| 121 | |
---|
| 122 | The configuration file has the form of a dictionary. |
---|
| 123 | A dictionary is a set of zero or more key-value pairs. |
---|
| 124 | A key is a string, while a value can be a string, an integer |
---|
| 125 | or another dictionary. |
---|
| 126 | :: |
---|
| 127 | |
---|
| 128 | configuration: dictionary | dictionary_body |
---|
| 129 | dictionary: '{' dictionary_body '}' |
---|
| 130 | dictionary_body: (string ':' value ',')* |
---|
| 131 | value: integer | string | dictionary |
---|
| 132 | string: unquoted-string | single-quoted-string | double-quoted-string |
---|
| 133 | unquoted-string: [^ \t\n\r:,0-9][^ \t\n\r:,]* |
---|
| 134 | single-quoted-string: '[^']*' |
---|
| 135 | double-quoted-string: "[^"]*" |
---|
| 136 | integer: [0-9]+ |
---|
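
As an illustration of the grammar above, a configuration file might look like the fragment below. The class names `short` and `long` and all chosen values are hypothetical; `class` and `wall_clock_limit` are LoadLeveler job command file keywords. The fragment assumes that whitespace and line breaks between tokens are ignored. Following the grammar, every key-value pair, including the last one in a dictionary, is followed by a comma.

```text
cache_job_state: 5,
terminate_job_on_vacated: 0,
job_categories: {
  default: "@class = short",
  long: "@class = long @wall_clock_limit = 24:00:00",
},
```

Here a job submitted without `drmaa_job_category` would pick up the `default` entry, while a job with the category `long` would get the second native specification string.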
| 137 | |
---|
| 138 | |
---|
| 139 | Native specification |
---|
| 140 | ==================== |
---|
| 141 | |
---|
| 142 | The DRMAA interface allows passing DRM-dependent job submission options. |
---|
| 143 | These options may be specified directly by setting the `drmaa_native_specification` job |
---|
| 144 | template attribute or indirectly through the `drmaa_job_category` job template attribute. |
---|
| 145 | Native options have the following format:: |
---|
| 146 | |
---|
| 147 | @ll_cmd_keyword1 = value @ll_cmd_keyword2 = value1 value2 value3 |
---|
| 148 | |
---|
| 149 | where `ll_cmd_keyword` can be any `keyword`_ accepted by LoadLeveler. It is the user's |
---|
| 150 | responsibility to provide a legal combination of `drmaa_native_specification`/`drmaa_job_category` |
---|
| 151 | and the other DRMAA attributes. |
---|
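
A native specification string in the format shown above can be assembled with a small helper such as the following C sketch. The names `struct ll_option` and `ll_native_spec` are hypothetical and not part of the library; the helper only formats the string and does not check that the keywords are valid LoadLeveler keywords.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* One LoadLeveler keyword with its value(s); several values are
 * given as a single space-separated string. */
struct ll_option {
    const char *keyword;
    const char *value;
};

/* Writes "@kw1 = v1 @kw2 = v2 ..." into `buf` (capacity `size`).
 * Returns 0 on success or -1 if the result does not fit. */
int ll_native_spec(const struct ll_option *opts, size_t count,
                   char *buf, size_t size)
{
    size_t used = 0;
    size_t i;

    assert(buf != NULL && size > 0);
    buf[0] = '\0';

    for (i = 0; i < count; i++) {
        /* Separate consecutive options with a single space. */
        int n = snprintf(buf + used, size - used, "%s@%s = %s",
                         i == 0 ? "" : " ",
                         opts[i].keyword, opts[i].value);
        if (n < 0 || (size_t)n >= size - used)
            return -1; /* output error or truncated output */
        used += (size_t)n;
    }
    return 0;
}
```

The resulting string could then be passed as the value of the `drmaa_native_specification` job template attribute, for example through `drmaa_set_attribute()`.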
| 152 | |
---|
| 153 | Release notes |
---|
| 154 | ============= |
---|
| 155 | * 1.0.1 - first public release |
---|
| 156 | |
---|
| 157 | Known bugs and limitations |
---|
| 158 | -------------------------- |
---|
| 159 | |
---|
| 160 | The library covers the whole `DRMAA 1.0 specification`_ with the exceptions listed |
---|
| 161 | below. It was successfully tested with `IBM Tivoli LoadLeveler`_ 3.5.0.5 on AIX |
---|
| 162 | and passes 43 of the 44 tests of the `official DRMAA test-suite`_. All mandatory and |
---|
| 163 | nearly all optional job attributes (except the job run duration soft limit, the job run duration hard limit, |
---|
| 164 | `drmaa_transfer_files` and `drmaa_deadline_time`) are implemented. |
---|
| 165 | |
---|
| 166 | Known limitations: |
---|
| 167 | |
---|
| 168 | * `drmaa_control()` - `DRMAA_CONTROL_RESUME` and `DRMAA_CONTROL_SUSPEND` are not implemented, because suspending jobs in LoadLeveler requires administrator privileges. |
---|
| 169 | |
---|
| 170 | .. note:: There are plans, depending on end-user feedback, to implement the RESUME/SUSPEND functionality by leveraging the checkpointing mechanism. |
---|
| 171 | |
---|
| 172 | Authors |
---|
| 173 | ------- |
---|
| 174 | |
---|
| 175 | The library was developed by: |
---|
| 176 | |
---|
| 177 | * Michal Matloka <michal.matloka@student.put.poznan.pl> - core implementation |
---|
| 178 | * Mariusz Mamonski <mamonski@man.poznan.pl> - AIX portability issues |
---|
| 179 | |
---|
| 180 | This library relies heavily on the *Fedstage DRMAA utils* code developed by: |
---|
| 181 | |
---|
| 182 | * Lukasz Ciesnik <lukasz.ciesnik@gmail.com>. |
---|
| 183 | |
---|
| 184 | Acknowledgments |
---|
| 185 | --------------- |
---|
| 186 | Portions of this research were conducted with high performance computational resources provided by the Louisiana Optical Network Initiative (http://www.loni.org). |
---|
| 187 | |
---|
| 188 | Developer tools |
---|
| 189 | --------------- |
---|
| 190 | Although not needed by library users, the following tools may be required |
---|
| 191 | if you intend to develop PSNC DRMAA for LoadLeveler: |
---|
| 192 | |
---|
| 193 | * GNU autotools (autoconf, automake, libtool), |
---|
| 194 | * Bison_ parser generator, |
---|
| 195 | * Docutils_ for processing this `README`, |
---|
| 196 | * LaTeX_ for creating documentation in PDF format, |
---|
| 197 | * Doxygen_ for generating source code documentation. |
---|
| 198 | |
---|
| 199 | .. _Bison: http://www.gnu.org/software/bison/ |
---|
| 200 | .. _Docutils: http://docutils.sourceforge.net/ |
---|
| 201 | .. _LaTeX: http://www.latex-project.org/ |
---|
| 202 | .. _Doxygen: http://www.stack.nl/~dimitri/doxygen/ |
---|
| 203 | .. _DRMAA: http://www.drmaa.org/ |
---|
| 204 | .. _Open Grid Forum: http://www.gridforum.org/ |
---|
| 205 | .. _DRMAA 1.0 specification: http://www.ogf.org/documents/GFD.133.pdf |
---|
| 206 | .. _specification: http://www.ogf.org/documents/GFD.133.pdf |
---|
| 207 | .. _Official DRMAA test-suite: http://drmaa.org/testsuite.php |
---|
| 208 | .. _SMOA Computing: http://larix.man.poznan.pl/wiki/SMOA_Computing |
---|
| 209 | .. _IBM Tivoli LoadLeveler: http://www-03.ibm.com/systems/software/loadleveler/index.html |
---|
| 210 | .. _keyword: http://publib.boulder.ibm.com/infocenter/clresctr/vxrx/topic/com.ibm.cluster.loadl.doc/loadl33/am2ug30223.html#jobkey |
---|
| 211 | .. _Louisiana Optical Network Initiative: http://loni.org/ |
---|
| 212 | |
---|
| 213 | |
---|
| 214 | License |
---|
| 215 | ======= |
---|
| 216 | |
---|
| 217 | Copyright (C) 2009-2010 Poznan Supercomputing and Networking Center |
---|
| 218 | |
---|
| 219 | Licensed under the Apache License, Version 2.0 (the "License"); |
---|
| 220 | you may not use this file except in compliance with the License. |
---|
| 221 | You may obtain a copy of the License at |
---|
| 222 | |
---|
| 223 | http://www.apache.org/licenses/LICENSE-2.0 |
---|
| 224 | |
---|
| 225 | Unless required by applicable law or agreed to in writing, software |
---|
| 226 | distributed under the License is distributed on an "AS IS" BASIS, |
---|
| 227 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
---|
| 228 | See the License for the specific language governing permissions and |
---|
| 229 | limitations under the License. |
---|
| 230 | |
---|
| 231 | .. vim700: spell spelllang=en |
---|
| 232 | .. vim: ft=rst |
---|
| 233 | .. vim: ts=2 sw=2 et |
---|