===================================================================
PSNC DRMAA for Simple Linux Utility for Resource Management (SLURM)
===================================================================

:Author: Michal Matloka <michal.matloka@student.put.poznan.pl>, Mariusz Mamonski <mamonski@man.poznan.pl>
:Organization: Poznan Supercomputing and Networking Center
:Contact: Mariusz Mamonski <mamonski@man.poznan.pl>, Michal Matloka <michal.matloka@student.put.poznan.pl>
:Date: $Date: 2010-10-01 15:14:55 +0200 (Pt, 01 paź 2010) $
:Version: 1.0.0
:Revision: $Revision: 351 $
:Copyright: Copyright (C) 2010 Poznan Supercomputing and Networking Center

:Abstract: This document describes installation, configuration and usage
  of PSNC DRMAA for Simple Linux Utility for Resource Management (SLURM).

.. meta::
  :http-equiv=Content-Language: en
  :http-equiv=Content-Type: application/xhtml+xml; charset=UTF-8
  :description lang=en: Distributed Resource Management Application API 1.0 implementation for Simple Linux Utility for Resource Management (SLURM)
  :keywords: DRMAA, Distributed Resource Management Application API, Simple Linux Utility for Resource Management, SLURM, Poznan Supercomputing and Networking Center

.. contents::

.. default-role:: literal


Introduction
============

PSNC DRMAA for `Simple Linux Utility for Resource Management (SLURM)`_ is an
implementation of the `Open Grid Forum`_ DRMAA_ 1.0 (Distributed Resource
Management Application API) specification_ for submitting and controlling
jobs in `Simple Linux Utility for Resource Management (SLURM)`_. Using DRMAA,
grid application builders, portal developers and ISVs can use the same
high-level API to link their software with different cluster/resource
management systems.

This software also enables the integration of `SMOA Computing`_ with the
underlying SLURM system for remote multi-user job submission and control
over Web Services.


Installation
============

To compile and install the library, go to the main source directory
and type::

  $ ./configure [options] && make
  $ sudo make install

The library was tested with Simple Linux Utility for Resource Management
version 2.1.13. If you encounter any problems using the library on other
systems, please report them to the contact e-mail addresses given above.

Notable `./configure` script options:

`--with-slurm-inc` SLURM_INCLUDE_PATH
  Path to the SLURM header files (i.e. the directory containing
  `slurm/slurm.h`). By default, `SLURM_INCLUDE_PATH` and
  `SLURM_LIBRARY_PATH` are guessed from the location of the `srun`
  executable.

`--with-slurm-lib` SLURM_LIBRARY_PATH
  Path to the SLURM libraries (i.e. the directory containing `libslurm.a`).

`--prefix` INSTALLATION_DIRECTORY
  Root directory where PSNC DRMAA for SLURM shall be installed.
  When not given, the library is installed in `/usr/local`.

`--enable-debug`
  Compiles the library with debugging enabled (debugging symbols not
  stripped, no optimizations, and many log messages enabled).
  Useful when you need to debug a DRMAA-enabled application
  or investigate problems with the DRMAA library itself.

There are no unusual requirements for basic usage of the library: an ANSI C
compiler and a standard make program should suffice. If you have taken the
sources directly from the SVN repository or wish to run the test suite, you
will need additional `developer tools`_. For further information regarding
the GNU build system see the INSTALL file.


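
As an illustration of the options above, a debug build installed under a
user's home directory could be configured as sketched below. The SLURM
location `/opt/slurm` and the installation prefix are assumptions made for
this example, not defaults of the library; adjust them to your site::

  # Hypothetical paths; replace them with the locations used on your system.
  SLURM_ROOT=/opt/slurm
  PREFIX="$HOME/opt/slurm-drmaa"

  if [ -x ./configure ]; then
      # No sudo is needed here because the prefix is inside the home directory.
      ./configure --prefix="$PREFIX" \
          --with-slurm-inc="$SLURM_ROOT/include" \
          --with-slurm-lib="$SLURM_ROOT/lib" \
          --enable-debug && make && make install
  else
      echo "configure script not found; run this from the main source directory" >&2
  fi

Passing both `--with-slurm-inc` and `--with-slurm-lib` explicitly avoids
relying on the automatic guess based on the `srun` location.
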
Configuration
=============

During DRMAA session initialization (`drmaa_init`) the library tries to
read its configuration parameters from the following locations:
`/etc/slurm_drmaa.conf`, `~/.slurm_drmaa.conf` and the file given in the
`SLURM_DRMAA_CONF` environment variable (if set to a non-empty string).
If multiple configuration sources are present, all configurations are
merged, with values from user-defined files taking precedence (in the
following order, from highest to lowest: `$SLURM_DRMAA_CONF`,
`~/.slurm_drmaa.conf`, `/etc/slurm_drmaa.conf`).

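
To see which of these files are present on a given machine, a small shell
function can list the readable candidates. This is only a diagnostic sketch:
the function name `list_drmaa_confs` is made up for this example, it mirrors
the precedence order stated above, and it does not call into the library::

  # Print the readable configuration files, highest precedence first.
  list_drmaa_confs() {
      for conf in "${SLURM_DRMAA_CONF:-}" "$HOME/.slurm_drmaa.conf" /etc/slurm_drmaa.conf; do
          # Skip an unset or empty SLURM_DRMAA_CONF and files that cannot be read.
          if [ -n "$conf" ] && [ -r "$conf" ]; then
              echo "$conf"
          fi
      done
  }

  list_drmaa_confs

An empty output means that none of the three locations holds a readable
file, in which case only the built-in defaults apply.
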
Currently recognized configuration parameters are:

cache_job_state
  According to the DRMAA specification every `drmaa_job_ps()` call should
  query the DRM system for the job state. This option can be used to reduce
  communication with the DRM. If set to a positive integer, `drmaa_job_ps()`
  returns the remembered job state without communicating with the DRM for
  `cache_job_state` seconds after the last update. By default the library
  conforms to the specification (no caching is performed).

  Type: integer, default: 0

job_categories
  Dictionary of job categories. Its keys are job category names
  mapped to `native specification`_ strings. Attributes set by a job
  category can be overridden by the corresponding DRMAA attributes or
  native specification. The special category name `default` is used when
  the `drmaa_job_category` job attribute is not set.

  Type: dictionary with string values, default: empty dictionary


Configuration file syntax
-------------------------

The configuration file has the form of a dictionary.
A dictionary is a set of zero or more key-value pairs.
A key is a string, while a value can be a string, an integer
or another dictionary.
::

  configuration: dictionary | dictionary_body
  dictionary: '{' dictionary_body '}'
  dictionary_body: (string ':' value ',')*
  value: integer | string | dictionary
  string: unquoted-string | single-quoted-string | double-quoted-string
  unquoted-string: [^ \t\n\r:,0-9][^ \t\n\r:,]*
  single-quoted-string: '[^']*'
  double-quoted-string: "[^"]*"
  integer: [0-9]+


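
A configuration file following this grammar might look as sketched below.
The category name `bigmem`, the partition name `debug` and the chosen values
are hypothetical; only `cache_job_state`, `job_categories` and the special
`default` category come from this document. The file is written to a
temporary location and selected through `SLURM_DRMAA_CONF`, so nothing under
`/etc` or the home directory is touched::

  # Create a private configuration file (its location is arbitrary).
  conf_file=$(mktemp)

  # Every key-value pair ends with a comma, as required by the grammar.
  printf '%s\n' \
      'cache_job_state: 5,' \
      'job_categories: {' \
      '  default: "-p debug",' \
      '  bigmem: "-p debug --mem=4096 -N 1",' \
      '},' > "$conf_file"

  # A non-empty SLURM_DRMAA_CONF takes precedence over the other locations.
  export SLURM_DRMAA_CONF="$conf_file"

With such a file, job states would be cached for 5 seconds, and a job
template that does not set `drmaa_job_category` would use the `default`
entry.
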
Native specification
====================

The DRMAA interface allows passing DRM-dependent job submission options.
These options may be specified directly by setting the
`drmaa_native_specification` job template attribute or indirectly through
the `drmaa_job_category` job template attribute.
A legal native specification looks like::

  -A My_account -s -N 1-10

.. table::

  ============================= =======================================================================================================================
  Native specification          Description
  ============================= =======================================================================================================================
  `-A, --account=`\name         Charge job to the specified account
  `--acctg-freq`                Define the job accounting sampling interval
  `--comment`                   An arbitrary comment
  `-C, --constraint=`\list      Specify a list of constraints
  `--contiguous`                If set, then the allocated nodes must form a contiguous set
  `--exclusive`                 Allocate nodes in exclusive mode when cpu consumable resource is enabled
  `--mem=`\MB                   Minimum amount of real memory
  `--mem-per-cpu=`\MB           Maximum amount of real memory per allocated cpu required by a job
  `--mincpus=`\n                Minimum number of logical processors (threads) per node
  `-N, --nodes=`\N              Number of nodes on which to run (N = min[-max])
  `--ntasks-per-node=`\n        Number of tasks to invoke on each node
  `-p, --partition=`\partition  Partition requested
  `--qos=`\qos                  Quality of Service
  `--requeue`                   If set, permit the job to be requeued
  `--reservation=`\name         Allocate resources from named reservation
  `-s, --share`                 Job allocation can share nodes with other running jobs
  `-w, --nodelist=`\hosts       Request a specific list of hosts
  ============================= =======================================================================================================================

A description of each parameter can be found in `man sbatch`.

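
Since the options are documented in `man sbatch`, a native specification
string can be tried on the command line before it is placed in a job
template or a job category. The sketch below assumes that `sbatch` accepts
the options in the same form as the native specification; the account
`my_account`, the partition `debug` and the wrapped command are
hypothetical. Note that it submits a real job when `sbatch` is available::

  # Hypothetical native specification; adjust account and partition.
  NATIVE_SPEC="-A my_account -p debug -N 1-2 --mem=512"

  if command -v sbatch >/dev/null 2>&1; then
      # $NATIVE_SPEC is left unquoted on purpose so that it splits into options.
      sbatch $NATIVE_SPEC --wrap="hostname"
  else
      echo "sbatch not found; is SLURM installed on this host?" >&2
  fi

If `sbatch` rejects an option, the same string is unlikely to work as a
native specification either.
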
Release notes
=============

* 1.0.0 - first public release

Known bugs and limitations
--------------------------

The library covers the whole `DRMAA 1.0 specification`_ with the exceptions
listed below. It was successfully tested with
`Simple Linux Utility for Resource Management (SLURM)`_ 2.1.13 on Linux and
passes 44/44 tests of the `official DRMAA test-suite`_ when running as a
SLURM administrator. Using a regular user account, it passes 40/44 tests of
the official DRMAA test-suite.

Known limitations:

* `drmaa_control` options `DRMAA_CONTROL_HOLD`, `DRMAA_CONTROL_SUSPEND`, `DRMAA_CONTROL_RELEASE`, `DRMAA_CONTROL_RESUME` are only available to users who are SLURM administrators
* `drmaa_wct_slimit` is not implemented
* optional attributes `drmaa_deadline_time`, `drmaa_duration_hlimit`, `drmaa_duration_slimit`, `drmaa_transfer_files` are not implemented


Authors
-------

The library was developed by:

* Michal Matloka <michal.matloka@student.put.poznan.pl> - core implementation

This library relies heavily on the *Fedstage DRMAA utils* code developed by:

* Lukasz Ciesnik <lukasz.ciesnik@gmail.com>.

Developer tools
---------------

Although not needed by users of the library, the following tools may be
required if you intend to develop PSNC DRMAA for SLURM:

* GNU autotools (autoconf, automake, libtool),
* Bison_ parser generator,
* Docutils_ for processing this `README`,
* LaTeX_ for creating documentation in PDF format,
* Doxygen_ for generating source code documentation.

.. _Bison: http://www.gnu.org/software/bison/
.. _Docutils: http://docutils.sourceforge.net/
.. _LaTeX: http://www.latex-project.org/
.. _Doxygen: http://www.stack.nl/~dimitri/doxygen/
.. _DRMAA: http://www.drmaa.org/
.. _Open Grid Forum: http://www.gridforum.org/
.. _DRMAA 1.0 specification: http://www.ogf.org/documents/GFD.133.pdf
.. _specification: http://www.ogf.org/documents/GFD.133.pdf
.. _Official DRMAA test-suite: http://drmaa.org/testsuite.php
.. _SMOA Computing: http://larix.man.poznan.pl/wiki/SMOA_Computing
.. _Simple Linux Utility for Resource Management (SLURM): https://computing.llnl.gov/linux/slurm/


License
=======

Copyright (C) 2009-2010 Poznan Supercomputing and Networking Center

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.

.. vim700: spell spelllang=en
.. vim: ft=rst
.. vim: ts=2 sw=2 et