| 1 | [[PageOutline]] |
| 2 | |
| 3 | = QCG BES/AR Installation in PL-Grid= |
| 4 | QCG BES/AR service (the successor of the OpenDSP project) is an open source service acting as a computing provider exposing on demand access to computing resources and jobs over the HPC Basic Profile compliant Web Services interface. In addition the QCG BES/AR offers remote interface for Advance Reservations management. |
| 5 | |
| 6 | This document describes installation of the QCG BES/AR service in the PL-Grid environment. The service should be deployed on the machine (or virtual machine) that: |
| 7 | * has at least 1GB of memory (recommended value: 2 GB) |
| 8 | * has 10 GB of free disk space (most of the space will be used by the log files) |
| 9 | * has any modern CPU (if you plan to use virtual machine you should dedicated to it one or two cores from the host machine) |
| 10 | * is running under Scientific Linux 5.5 (in most cases the provided RPMs should work with any operating system based on Redhat Enterpise Linux 5.x, e.g. CentOS 5) |
| 11 | |
| 12 | IMPORTANT: :: |
| 13 | The implementation name of the QCG BES/AR service is '''Smoa Computing''' and this name is used as a common in this guide. |
| 14 | |
| 15 | == Prerequisites == |
| 16 | We assume that you have the Torque local resource manager and the Maui scheduler already installed. This would be typically a frontend machine (i.e. machine where the pbs_server and maui daemons are running). If you want to install the Smoa Computing service on a separate submit host you should read this [[Smoa_Computing_on_separate_machine| notes]]. Moreover the following packages must be installed before you proceed with the Smoa Computing installation. |
| 17 | |
| 18 | * Install database backend (PostgresSQL): |
| 19 | {{{ |
| 20 | #!div style="font-size: 90%" |
| 21 | {{{#!sh |
| 22 | yum install postgresql postgresql-server |
| 23 | }}} |
| 24 | }}} |
| 25 | * UnixODBC and the PostgresSQL odbc driver: |
| 26 | {{{ |
| 27 | #!div style="font-size: 90%" |
| 28 | {{{#!sh |
| 29 | yum install unixODBC postgresql-odbc |
| 30 | }}} |
| 31 | }}} |
| 32 | * Expat (needed by the BAT updater - a PL-Grid accounting module): |
| 33 | {{{ |
| 34 | #!div style="font-size: 90%" |
| 35 | {{{#!sh |
| 36 | yum install expat-devel |
| 37 | }}} |
| 38 | }}} |
| 39 | * Torque devel package and the rpmbuild package (needed to build DRMAA): |
| 40 | {{{ |
| 41 | #!div style="font-size: 90%" |
| 42 | {{{#!sh |
| 43 | rpm -i torque-devel-your-version.rpm |
| 44 | yum install rpm-build |
| 45 | }}} |
| 46 | }}} |
| 47 | The X.509 host certificate (signed by the Polish Grid CA) and key is already installed in the following locations: |
| 48 | * `/etc/grid-security/hostcert.pem` |
| 49 | * `/etc/grid-security/hostkey.pem` |
| 50 | |
| 51 | Most of the grid services and security infrastructures are sensitive to time skews. Thus we recommended to install a Network Time Protocol daemon or use any other solution that provides accurate clock synchronization. |
| 52 | |
| 53 | == Configuring WP4 queue == |
| 54 | Sample Maui configuration that gives 8 machines to exclusive use of the Work Package 4: |
| 55 | {{{ |
| 56 | #!div style="font-size: 90%" |
| 57 | {{{#!default |
| 58 | # WP4 |
| 59 | # all users by default can use only DEFAULT partition (i.e. ALL minus WP4) |
| 60 | SYSCFG PLIST=DEFAULT |
| 61 | |
| 62 | |
| 63 | # increase priority of the plgrid-wp4-produkcja queue |
| 64 | CLASSCFG[plgrid-wp4-produkcja] PRIORITY=90000 |
| 65 | # jobs submitted to the plgrid-wp4 queue CAN use and CAN ONLY (note the &) use the wp4 partition |
| 66 | CLASSCFG[plgrid-wp4] PLIST=wp4& |
| 67 | |
| 68 | # devote some machines to the Work Package 4 |
| 69 | NODECFG[r512] PARTITION=wp4 |
| 70 | NODECFG[r513] PARTITION=wp4 |
| 71 | NODECFG[r514] PARTITION=wp4 |
| 72 | NODECFG[r515] PARTITION=wp4 |
| 73 | NODECFG[r516] PARTITION=wp4 |
| 74 | NODECFG[r517] PARTITION=wp4 |
| 75 | NODECFG[r518] PARTITION=wp4 |
| 76 | NODECFG[r519] PARTITION=wp4 |
| 77 | }}} |
| 78 | }}} |
| 79 | |
| 80 | Now you need also to add the two queues in the Torque resource manager: |
| 81 | {{{ |
| 82 | #!div style="font-size: 90%" |
| 83 | {{{#!sh |
| 84 | # |
| 85 | # Create and define queue plgrid-wp4 |
| 86 | # |
| 87 | create queue plgrid-wp4 |
| 88 | set queue plgrid-wp4 queue_type = Execution |
| 89 | set queue plgrid-wp4 resources_max.walltime = 72:00:00 |
| 90 | set queue plgrid-wp4 resources_default.ncpus = 1 |
| 91 | set queue plgrid-wp4 resources_default.walltime = 72:00:00 |
| 92 | set queue plgrid-wp4 acl_group_enable = True |
| 93 | set queue plgrid-wp4 acl_groups = plgrid-wp4 |
| 94 | set queue plgrid-wp4 acl_group_sloppy = True |
| 95 | set queue plgrid-wp4 enabled = True |
| 96 | set queue plgrid-wp4 started = True |
| 97 | |
| 98 | # |
| 99 | # Create and define queue plgrid-wp4-produkcja |
| 100 | # |
| 101 | create queue plgrid-wp4-produkcja |
| 102 | set queue plgrid-wp4-produkcja queue_type = Execution |
| 103 | set queue plgrid-wp4-produkcja resources_max.walltime = 72:00:00 |
| 104 | set queue plgrid-wp4-produkcja resources_max.ncpus = 256 |
| 105 | set queue plgrid-wp4-produkcja resources_default.ncpus = 1 |
| 106 | set queue plgrid-wp4-produkcja resources_default.walltime = 72:00:00 |
| 107 | set queue plgrid-wp4-produkcja acl_group_enable = True |
| 108 | set queue plgrid-wp4-produkcja acl_groups = plgrid-wp4 |
| 109 | set queue plgrid-wp4-produkcja acl_group_sloppy = True |
| 110 | set queue plgrid-wp4-produkcja enabled = True |
| 111 | set queue plgrid-wp4-produkcja started = True |
| 112 | }}} |
| 113 | }}} |
| 114 | |
| 115 | == Installation using provided RPMS == |
| 116 | * Create the following users: |
| 117 | * `smoa_comp` - needed by the Smoa Computing service |
| 118 | * `grms` - the user that the GRMS (i.e. the !QosCosGrid Broker service) would be mapped to |
| 119 | {{{ |
| 120 | #!div style="font-size: 90%" |
| 121 | {{{#!sh |
| 122 | useradd -d /opt/plgrid/var/log/smoa-comp/ -m smoa_comp |
| 123 | useradd -d /opt/plgrid/var/log/grms/ -m grms |
| 124 | }}} |
| 125 | }}} |
| 126 | * and the following group: |
| 127 | * `smoa_dev` - this group is allowed to read the configuration and log files. Please add the Smoa services' developers to this group. |
| 128 | {{{ |
| 129 | #!div style="font-size: 90%" |
| 130 | {{{#!sh |
| 131 | groupadd smoa_dev |
| 132 | }}} |
| 133 | }}} |
| 134 | * install PL-Grid (official) and QCG (testing) repositories: |
| 135 | * !QosCosGrid testing repository |
| 136 | {{{ |
| 137 | #!div style="font-size: 90%" |
| 138 | {{{#!sh |
| 139 | cat > /etc/yum.repos.d/qcg.repo << EOF |
| 140 | [qcg] |
| 141 | name=QosCosGrid YUM repository |
| 142 | baseurl=http://fury.man.poznan.pl/qcg-packages/sl/x86_64/ |
| 143 | enabled=1 |
| 144 | gpgcheck=0 |
| 145 | EOF |
| 146 | }}} |
| 147 | }}} |
| 148 | * Official PL-Grid repository |
| 149 | {{{ |
| 150 | #!div style="font-size: 90%" |
| 151 | {{{#!sh |
| 152 | rpm -Uvh http://software.plgrid.pl/packages/repos/plgrid-repos-2010-2.noarch.rpm |
| 153 | }}} |
| 154 | }}} |
| 155 | |
| 156 | * install Smoa Computing using YUM Package Manager: |
| 157 | {{{ |
| 158 | #!div style="font-size: 90%" |
| 159 | {{{#!sh |
| 160 | yum install smoa-comp |
| 161 | }}} |
| 162 | }}} |
| 163 | |
| 164 | * setup Smoa Computing database using provided script: |
| 165 | {{{ |
| 166 | #!div style="font-size: 90%" |
| 167 | {{{#!sh |
| 168 | /opt/plgrid/qcg/smoa/share/smoa-comp/tools/smoa-comp-install.sh |
| 169 | Welcome to smoa-comp installation script! |
| 170 | |
| 171 | This script will guide you through process of configuring proper environment |
| 172 | for running the Smoa Computing service. You have to answer few questions regarding |
| 173 | parameters of your database. If you are not sure just press Enter and use the |
| 174 | default values. |
| 175 | |
| 176 | Use local PostgreSQL server? (y/n) [y]: y |
| 177 | Database [smoa_comp]: |
| 178 | User [smoa_comp]: |
| 179 | Password [smoa_comp]: MojeTajneHaslo |
| 180 | Create database? (y/n) [y]: y |
| 181 | Create user? (y/n) [y]: y |
| 182 | |
| 183 | Checking for system user smoa_comp...OK |
| 184 | Checking whether PostgreSQL server is installed...OK |
| 185 | Checking whether PostgreSQL server is running...OK |
| 186 | |
| 187 | Performing installation |
| 188 | * Creating user smoa_comp...OK |
| 189 | * Creating database smoa_comp...OK |
| 190 | * Creating database schema...OK |
| 191 | * Checking for ODBC data source smoa_comp... |
| 192 | * Installing ODBC data source...OK |
| 193 | |
| 194 | Remember to add appropriate entry to /var/lib/pgsql/data/pg_hba.conf (as the first rule!) to allow user smoa_comp to |
| 195 | access database smoa_comp. For instance: |
| 196 | |
| 197 | host smoa_comp smoa_comp 127.0.0.1/32 md5 |
| 198 | |
| 199 | and reload Postgres server. |
| 200 | }}} |
| 201 | }}} |
| 202 | |
| 203 | Add a new rule to the pg_hba.conf as requested: |
| 204 | {{{ |
| 205 | #!div style="font-size: 90%" |
| 206 | {{{#!sh |
| 207 | vim /var/lib/pgsql/data/pg_hba.conf |
| 208 | /etc/init.d/postgresql reload |
| 209 | }}} |
| 210 | }}} |
| 211 | |
| 212 | Install Polish Grid and PL-Grid Simpla-CA certificates: |
| 213 | {{{ |
| 214 | #!div style="font-size: 90%" |
| 215 | {{{#!sh |
| 216 | wget https://dist.eugridpma.info/distribution/igtf/current/accredited/RPMS/ca_PolishGrid-1.38-1.noarch.rpm |
| 217 | rpm -i ca_PolishGrid-1.38-1.noarch.rpm |
| 218 | wget http://software.plgrid.pl/packages/general/ca_PLGRID-SimpleCA-1.0-2.noarch.rpm |
| 219 | rpm -i ca_PLGRID-SimpleCA-1.0-2.noarch.rpm |
| 220 | #install certificate revocation list fetching utility |
| 221 | wget https://dist.eugridpma.info/distribution/util/fetch-crl/fetch-crl-2.8.5-1.noarch.rpm |
| 222 | rpm -i fetch-crl-2.8.5-1.noarch.rpm |
| 223 | #get fresh CRLs now |
| 224 | /usr/sbin/fetch-crl |
| 225 | #install cron job for it |
| 226 | cat > /etc/cron.daily/fetch-crl.cron << EOF |
| 227 | #!/bin/sh |
| 228 | /usr/sbin/fetch-crl |
| 229 | EOF |
| 230 | chmod a+x /etc/cron.daily/fetch-crl.cron |
| 231 | }}} |
| 232 | }}} |
| 233 | |
| 234 | === The Grid Mapfile === |
| 235 | ==== Manually created grid mapfile (for testing purpose only) ==== |
| 236 | {{{ |
| 237 | #!div style="font-size: 90%" |
| 238 | {{{#!default |
| 239 | #for test purpose only add mapping for your account |
| 240 | echo '"MyCertDN" myaccount' >> /etc/grid-security/grid-mapfile |
| 241 | }}} |
| 242 | }}} |
| 243 | ==== LDAP based grid mapfile ==== |
| 244 | {{{ |
| 245 | #!div style="font-size: 90%" |
| 246 | {{{#!default |
| 247 | #install grid-mapfile generator from PL-Grid repository |
| 248 | yum install plggridmapfilegenerator |
| 249 | #configure gridmapfilegenerator - remember to change url property to your local ldap replica |
| 250 | cat > /opt/plgrid/plggridmapfilegenerator/etc/plggridmapfilegenerator.conf << EOF |
| 251 | [ldap] |
| 252 | url=ldaps://10.4.1.39 |
| 253 | #search base |
| 254 | #base=dc=osrodek,dc=plgrid,dc=pl |
| 255 | base=ou=People,dc=cyfronet,dc=plgrid,dc=pl |
| 256 | #filter, specifies which users should be processed |
| 257 | filter=plgridX509CertificateDN=* |
| 258 | #timeout for execution of ldap queries |
| 259 | timeout=10 |
| 260 | |
| 261 | [output] |
| 262 | format=^plgridX509CertificateDN, uid |
| 263 | EOF |
| 264 | #add the gridmapfile generator as the cron.job |
| 265 | cat > /etc/cron.hourly/gridmapfile.cron << EOF |
| 266 | #!/bin/sh |
| 267 | /opt/plgrid/plggridmapfilegenerator/bin/plggridmapfilegenerator.py -o /etc/grid-security/grid-mapfile |
| 268 | EOF |
| 269 | #set executable bit |
| 270 | chmod a+x /etc/cron.hourly/gridmapfile.cron |
| 271 | #try it! |
| 272 | /etc/cron.hourly/gridmapfile.cron |
| 273 | }}} |
| 274 | }}} |
| 275 | |
| 276 | Add appropriate rights for the `smoa_comp` and `grms` users in the Maui scheduler configuaration file: |
| 277 | {{{ |
| 278 | #!div style="font-size: 90%" |
| 279 | {{{#!default |
| 280 | vim /var/spool/maui/maui.cfg |
| 281 | # primary admin must be first in list |
| 282 | ADMIN1 root |
| 283 | ADMIN2 grms |
| 284 | ADMIN3 smoa_comp |
| 285 | }}} |
| 286 | }}} |
| 287 | Copy the service certificate and key into the `/opt/plgrid/qcg/smoa/etc/certs/`. Remember to set appropriate rights to the key file. |
| 288 | {{{ |
| 289 | #!div style="font-size: 90%" |
| 290 | {{{#!default |
| 291 | cp /etc/grid-security/hostcert.pem /opt/plgrid/qcg/smoa/etc/certs/smoacert.pem |
| 292 | cp /etc/grid-security/hostkey.pem /opt/plgrid/qcg/smoa/etc/certs/smoakey.pem |
| 293 | chown smoa_comp /opt/plgrid/qcg/smoa/etc/certs/smoacert.pem |
| 294 | chown smoa_comp /opt/plgrid/qcg/smoa/etc/certs/smoakey.pem |
| 295 | chmod 0600 /opt/plgrid/qcg/smoa/etc/certs/smoakey.pem |
| 296 | }}} |
| 297 | }}} |
| 298 | == DRMAA library == |
| 299 | DRMAA library must be compiled from SRC RPM: |
| 300 | {{{ |
| 301 | #!div style="font-size: 90%" |
| 302 | {{{#!default |
| 303 | wget http://fury.man.poznan.pl/qcg-packages/sl/SRPMS/pbs-drmaa-1.0.6-2.src.rpm |
| 304 | rpmbuild --rebuild pbs-drmaa-1.0.6-2.src.rpm |
| 305 | cd /usr/src/redhat/RPMS/x86_64/ |
| 306 | rpm -i pbs-drmaa-1.0.6-2.x86_64.rpm |
| 307 | }}} |
| 308 | }}} |
| 309 | however if you are using it for the first time then you should try to compile it with enabled logging: |
| 310 | {{{ |
| 311 | #!div style="font-size: 90%" |
| 312 | {{{#!default |
| 313 | wget http://fury.man.poznan.pl/qcg-packages/sl/SRPMS/pbs-drmaa-1.0.6-2.src.rpm |
| 314 | rpmbuild --define 'configure_options --enable-debug' --rebuild pbs-drmaa-1.0.6-2.src.rpm |
| 315 | cd /usr/src/redhat/RPMS/x86_64/ |
| 316 | rpm -i pbs-drmaa-1.0.6-2.x86_64.rpm |
| 317 | }}} |
| 318 | }}} |
| 319 | After installation you need '''either''': |
| 320 | * configure the DRMAA library to use Torque logs ('''RECOMMENDED'''). Sample configuration file of the DRMAA library (`/opt/plgrid/qcg/smoa/etc/pbs_drmaa.conf`): |
| 321 | {{{ |
| 322 | #!div style="font-size: 90%" |
| 323 | {{{#!default |
| 324 | # pbs_drmaa.conf - Sample pbs_drmaa configuration file. |
| 325 | |
| 326 | wait_thread: 1, |
| 327 | |
| 328 | pbs_home: "/var/spool/pbs", |
| 329 | |
| 330 | cache_job_state: 600, |
| 331 | }}} |
| 332 | }}} |
| 333 | '''Note:''' Remember to mount server log directory as described in the eariler [[QCG_BES_AR_on_separate_machine|note]]. |
| 334 | |
| 335 | '''or''' |
| 336 | * configure Torque to keep information about completed jobs (e.g.: by setting: `qmgr -c 'set server keep_completed = 300'`). |
| 337 | |
| 338 | It is possible to limit users to submit job to predefined queue by setting default job category (in the `/opt/plgrid/qcg/smoa/etc/pbs_drmaa.conf` file): |
| 339 | {{{ |
| 340 | #!div style="font-size: 90%" |
| 341 | {{{#!default |
| 342 | job_categories: { |
| 343 | default: "-q plgrid", |
| 344 | }, |
| 345 | }}} |
| 346 | }}} |
| 347 | |
| 348 | == Restricting advance reservation == |
| 349 | In some deployments enabling advance reservation for the whole cluster is not desirable. In such cases one can limit advance reservation to particular partition by editing `/opt/plgrid/qcg/smoa/lib/smoa-comp/modules/python/reservation_maui.py` file and changing the following line: |
| 350 | {{{ |
| 351 | #!div style="font-size: 90%" |
| 352 | {{{#!default |
| 353 | cmd = "setres -x BYNAME -r PROCS=1" |
| 354 | }}} |
| 355 | }}} |
| 356 | to |
| 357 | {{{ |
| 358 | #!div style="font-size: 90%" |
| 359 | {{{#!default |
| 360 | cmd = "setres -x BYNAME -r PROCS=1 -p wp4" |
| 361 | }}} |
| 362 | }}} |
| 363 | |
| 364 | = Service configuration = |
| 365 | Edit the preinstalled service configuration file (`/opt/plgrid/qcg/smoa/etc/smoa-compd.xml`): |
| 366 | {{{ |
| 367 | #!div style="font-size: 90%" |
| 368 | {{{#!xml |
| 369 | <?xml version="1.0" encoding="UTF-8"?> |
| 370 | <sm:SMOACore |
| 371 | xmlns:sm="http://schemas.smoa-project.com/core/2009/01/config" |
| 372 | xmlns="http://schemas.smoa-project.com/comp/2009/01/config" |
| 373 | xmlns:smc="http://schemas.smoa-project.com/comp/2009/01/config" |
| 374 | xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> |
| 375 | |
| 376 | <Configuration> |
| 377 | <sm:ModuleManager> |
| 378 | <sm:Directory>/opt/plgrid/qcg/smoa/lib/smoa-core/modules/</sm:Directory> |
| 379 | <sm:Directory>/opt/plgrid/qcg/smoa/lib/smoa-comp/modules/</sm:Directory> |
| 380 | </sm:ModuleManager> |
| 381 | |
| 382 | <sm:Service xsi:type="smoa-compd" description="SMOA Computing"> |
| 383 | <sm:Logger> |
| 384 | <sm:Filename>/opt/plgrid/var/log/smoa-comp/smoa-comp.log</sm:Filename> |
| 385 | <sm:Level>INFO</sm:Level> |
| 386 | </sm:Logger> |
| 387 | |
| 388 | <sm:Transport> |
| 389 | <sm:Module xsi:type="sm:ecm_gsoap.service"> |
| 390 | <sm:Host>frontend.example.com</sm:Host> |
| 391 | <sm:Port>19000</sm:Port> |
| 392 | <sm:KeepAlive>false</sm:KeepAlive> |
| 393 | <sm:Authentication> |
| 394 | <sm:Module xsi:type="sm:atc_transport_gsi.service"> |
| 395 | <sm:X509CertFile>/opt/plgrid/qcg/smoa/etc/certs/smoacert.pem</sm:X509CertFile> |
| 396 | <sm:X509KeyFile>/opt/plgrid/qcg/smoa/etc/certs/smoakey.pem</sm:X509KeyFile> |
| 397 | </sm:Module> |
| 398 | </sm:Authentication> |
| 399 | <sm:Authorization> |
| 400 | <sm:Module xsi:type="sm:atz_mapfile"> |
| 401 | <sm:Mapfile>/etc/grid-security/grid-mapfile</sm:Mapfile> |
| 402 | </sm:Module> |
| 403 | </sm:Authorization> |
| 404 | </sm:Module> |
| 405 | <sm:Module xsi:type="smc:smoa-comp-service"/> |
| 406 | </sm:Transport> |
| 407 | |
| 408 | <sm:Module xsi:type="pbs_jsdl_filter"/> |
| 409 | <sm:Module xsi:type="atz_ardl_filter"/> |
| 410 | <sm:Module xsi:type="sm:general_python" path="/opt/plgrid/qcg/smoa/lib/smoa-comp/modules/python/monitoring.py"/> |
| 411 | |
| 412 | <sm:Module xsi:type="submission_drmaa" path="/opt/plgrid/qcg/smoa/lib/libdrmaa.so"/> |
| 413 | <sm:Module xsi:type="reservation_python" path="/opt/plgrid/qcg/smoa/lib/smoa-comp/modules/python/reservation_maui.py"/> |
| 414 | |
| 415 | <sm:Module xsi:type="notification_wsn"> |
| 416 | <sm:Module xsi:type="sm:ecm_gsoap.client"> |
| 417 | <sm:ServiceURL>http://localhost:19001/</sm:ServiceURL> |
| 418 | <sm:Authentication> |
| 419 | <sm:Module xsi:type="sm:atc_transport_http.client"/> |
| 420 | </sm:Authentication> |
| 421 | <sm:Module xsi:type="sm:ntf_client"/> |
| 422 | </sm:Module> |
| 423 | </sm:Module> |
| 424 | |
| 425 | <sm:Module xsi:type="application_mapper"> |
| 426 | <ApplicationMapFile>/opt/plgrid/qcg/smoa/etc/application_mapfile</ApplicationMapFile> |
| 427 | </sm:Module> |
| 428 | |
| 429 | <Database> |
| 430 | <DSN>smoa_comp</DSN> |
| 431 | <User>smoa_comp</User> |
| 432 | <Password>smoa_comp</Password> |
| 433 | </Database> |
| 434 | |
| 435 | <UnprivilegedUser>smoa_comp</UnprivilegedUser> |
| 436 | |
| 437 | <FactoryAttributes> |
| 438 | <CommonName>klaster.plgrid.pl</CommonName> |
| 439 | <LongDescription>PL Grid cluster</LongDescription> |
| 440 | </FactoryAttributes> |
| 441 | </sm:Service> |
| 442 | |
| 443 | </Configuration> |
| 444 | </sm:SMOACore> |
| 445 | }}} |
| 446 | }}} |
| 447 | |
| 448 | In most cases it should be enough to change only following elements: |
| 449 | `Transport/Module/Host` :: |
| 450 | the hostname of the machine where the service is deployed |
| 451 | `Transport/Module/Authentication/Module/X509CertFile` and `Transport/Module/Authentication/Module/X509KeyFile` :: |
| 452 | the service private key and X.509 certificate (consult the [[http://www.globus.org/toolkit/docs/4.0/security/prewsaa/rn01re02.html|Globus User Gide]] on how to generate service certificate request or use the host certificate/key pair). Make sure that the key and certificate is owned by the `smoa_comp` user and the private key is not password protected (generating certificate with the `-service` option implies this). If you installed cert and key file in the recommended location you do not need to edit these fields. |
| 453 | `Module[type="smc:notification_wsn"]/Module/ServiceURL` :: |
| 454 | the URL of the Smoa Notification (QCG Notification) service (You can do it later, i.e. after [[installation_QCG_Notification_in_PLGrid|installing the Smoa Notification service]]) |
| 455 | `Module[type="submission_drmaa"]/@path` :: |
| 456 | path to the DRMAA library (the `libdrmaa.so`). Also, if you installed the DRMAA library using provided SRC RPM you do not need to change this path. |
| 457 | `Database/Password` :: |
| 458 | the `smoa_comp` database password |
| 459 | `FactoryAttributes/CommonName` :: |
| 460 | a common name of the cluster (e.g. reef.man.poznan.pl). You can use any name that is unique among all systems (e.g. cluster name + domain name of your institution) |
| 461 | `FactoryAttributes/LongDescription` :: |
| 462 | a human readable description of the cluster |
| 463 | |
| 464 | == Configuring BAT accounting module == |
| 465 | In order to report resource usage to the central PL-Grid accounting service you must enable the `bat_updater` module. You can do this by including the following snippet in the aforementioned configuration file (`/opt/plgrid/qcg/smoa/etc/smoa-comp.xml`). Please put the following snippet just before the `Database` section: |
| 466 | {{{ |
| 467 | #!div style="font-size: 90%" |
| 468 | {{{#!xml |
| 469 | <sm:Module xsi:type="bat_updater"> |
| 470 | <BATServiceURL>tcp://acct.grid.cyf-kr.edu.pl:61616</BATServiceURL> |
| 471 | <SiteName>psnc-smoa-plgrid</SiteName> |
| 472 | <QueueName>test-jobs</QueueName> |
| 473 | </sm:Module> |
| 474 | }}} |
| 475 | }}} |
| 476 | where: |
| 477 | * BATServiceURL : URL of the BAT accounting service |
| 478 | * !SiteName : local site name as reported to the BAT service |
| 479 | * !QueueName : queue name to which report usage data |
| 480 | |
| 481 | = Note on the security model = |
| 482 | The Smoa Computing can be configured with various authentication and authorization modules. However in the typical deployment we assume that the Smoa Computing is configured as in the above example, i.e.: |
| 483 | * authentication is provided on basics of ''httpg'' protocol |
| 484 | * authorization is based on the local `grid-mapfile` mapfile (see [[installation_GridFTP#Usersconfiguration|Users configuration]]). |
| 485 | |
| 486 | = Starting the service = |
| 487 | As root type: |
| 488 | {{{ |
| 489 | #!div style="font-size: 90%" |
| 490 | {{{#!sh |
| 491 | /etc/init.d/smoa-compd start |
| 492 | }}} |
| 493 | }}} |
| 494 | |
| 495 | The service logs can be found in: |
| 496 | {{{ |
| 497 | #!div style="font-size: 90%" |
| 498 | {{{#!sh |
| 499 | /opt/plgrid/var/log/smoa-comp/smoa-comp.log |
| 500 | }}} |
| 501 | }}} |
| 502 | |
| 503 | The service assumes that the following commands are in the standard search path: |
| 504 | * `pbsnodes` |
| 505 | * `showres` |
| 506 | * `setres` |
| 507 | * `releaseres` |
| 508 | * `checknode` |
| 509 | If any of the above commands is not installed in a standard location (e.g. `/usr/bin`) you may need to edit the `/opt/plgrid/qcg/smoa/etc/sysconfig/smoa-compd` file and set the `PATH` variable appropriately, e.g.: |
| 510 | {{{ |
| 511 | #!div style="font-size: 90%" |
| 512 | {{{#!sh |
| 513 | # INIT_WAIT=5 |
| 514 | # |
| 515 | # DRM specific options |
| 516 | |
| 517 | export PATH=$PATH:/opt/maui/bin |
| 518 | }}} |
| 519 | }}} |
| 520 | |
| 521 | If you compiled DRMAA with logging switched on you can set there also DRMAA logging level: |
| 522 | {{{ |
| 523 | #!div style="font-size: 90%" |
| 524 | {{{#!sh |
| 525 | # INIT_WAIT=5 |
| 526 | # |
| 527 | # DRM specific options |
| 528 | |
| 529 | export DRMAA_LOG_LEVEL=INFO |
| 530 | }}} |
| 531 | }}} |
| 532 | |
| 533 | = Stopping the service = |
| 534 | The service can be stopped using the following command: |
| 535 | {{{ |
| 536 | #!div style="font-size: 90%" |
| 537 | {{{#!sh |
| 538 | /etc/init.d/smoa-compd stop |
| 539 | }}} |
| 540 | }}} |
| 541 | |
| 542 | = Verifying the installation = |
| 543 | |
| 544 | * For convenience you can add the `/opt/plgrid/qcg/smoa/bin` and `/opt/plgrid/qcg/smoa-dep/globus/bin/` to your `PATH` variable. |
| 545 | * Edit the Smoa Computing client configuration file (`/opt/plgrid/qcg/smoa/etc/smoa-comp.xml`): |
| 546 | * set the `Host` and `Port` to reflects the changes in the service configuration file (`smoa-compd.xml`). |
| 547 | {{{ |
| 548 | #!div style="font-size: 90%" |
| 549 | {{{#!sh |
| 550 | <?xml version="1.0" encoding="UTF-8"?> |
| 551 | <sm:SMOACore |
| 552 | xmlns:sm="http://schemas.smoa-project.com/core/2009/01/config" |
| 553 | xmlns="http://schemas.smoa-project.com/comp/2009/01/config" |
| 554 | xmlns:smc="http://schemas.smoa-project.com/comp/2009/01/config" |
| 555 | xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> |
| 556 | |
| 557 | <Configuration> |
| 558 | <sm:ModuleManager> |
| 559 | <sm:Directory>/opt/QCG/smoa/lib/smoa-core/modules/</sm:Directory> |
| 560 | <sm:Directory>/opt/QCG/smoa//lib/smoa-comp/modules/</sm:Directory> |
| 561 | </sm:ModuleManager> |
| 562 | |
| 563 | <sm:Client xsi:type="smoa-comp" description="SMOA Computing client"> |
| 564 | <sm:Transport> |
| 565 | <sm:Module xsi:type="sm:ecm_gsoap.client"> |
| 566 | <sm:ServiceURL>httpg://frontend.example.com:19000/</sm:ServiceURL> |
| 567 | <sm:Authentication> |
| 568 | <sm:Module xsi:type="sm:atc_transport_gsi.client"/> |
| 569 | </sm:Authentication> |
| 570 | <sm:Module xsi:type="smc:smoa-comp-client"/> |
| 571 | </sm:Module> |
| 572 | </sm:Transport> |
| 573 | </sm:Client> |
| 574 | </Configuration> |
| 575 | </sm:SMOACore> |
| 576 | }}} |
| 577 | }}} |
| 578 | * Initialize your credentials: |
| 579 | {{{ |
| 580 | #!div style="font-size: 90%" |
| 581 | {{{#!sh |
| 582 | grid-proxy-init |
| 583 | Your identity: /O=Grid/OU=QosCosGrid/OU=PSNC/CN=Mariusz Mamonski |
| 584 | Enter GRID pass phrase for this identity: |
| 585 | Creating proxy .................................................................. Done |
| 586 | Your proxy is valid until: Wed Sep 16 05:01:02 2009 |
| 587 | }}} |
| 588 | }}} |
| 589 | * Query the SMOA Computing service: |
| 590 | {{{ |
| 591 | #!div style="font-size: 90%" |
| 592 | {{{#!sh |
| 593 | smoa-comp -G | xmllint --format - # the xmllint is used only to present the result in more pleasant way |
| 594 | |
| 595 | <bes-factory:FactoryResourceAttributesDocument xmlns:bes-factory="http://schemas.ggf.org/bes/2006/08/bes-factory"> |
| 596 | <bes-factory:IsAcceptingNewActivities>true</bes-factory:IsAcceptingNewActivities> |
| 597 | <bes-factory:CommonName>IT cluster</bes-factory:CommonName> |
| 598 | <bes-factory:LongDescription>IT department cluster for public use</bes-factory:LongDescription> |
| 599 | <bes-factory:TotalNumberOfActivities>0</bes-factory:TotalNumberOfActivities> |
| 600 | <bes-factory:TotalNumberOfContainedResources>1</bes-factory:TotalNumberOfContainedResources> |
| 601 | <bes-factory:ContainedResource xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="bes-factory:BasicResourceAttributesDocumentType"> |
| 602 | <bes-factory:ResourceName>worker.example.com</bes-factory:ResourceName> |
| 603 | <bes-factory:CPUArchitecture> |
| 604 | <jsdl:CPUArchitectureName xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl">x86_32</jsdl:CPUArchitectureName> |
| 605 | </bes-factory:CPUArchitecture> |
| 606 | <bes-factory:CPUCount>4</bes-factory:CPUCount><bes-factory:PhysicalMemory>1073741824</bes-factory:PhysicalMemory> |
| 607 | </bes-factory:ContainedResource> |
| 608 | <bes-factory:NamingProfile>http://schemas.ggf.org/bes/2006/08/bes/naming/BasicWSAddressing</bes-factory:NamingProfile> |
| 609 | <bes-factory:BESExtension>http://schemas.ogf.org/hpcp/2007/01/bp/BasicFilter</bes- factory:BESExtension> |
| 610 | <bes-factory:BESExtension>http://schemas.smoa-project.com/comp/2009/01</bes-factory:BESExtension> |
| 611 | <bes-factory:LocalResourceManagerType>http://example.com/SunGridEngine</bes-factory:LocalResourceManagerType> |
| 612 | <smcf:NotificationProviderURL xmlns:smcf="http://schemas.smoa-project.com/comp/2009/01/factory">http://localhost:2211/</smcf:NotificationProviderURL> |
| 613 | </bes-factory:FactoryResourceAttributesDocument> |
| 614 | }}} |
| 615 | }}} |
| 616 | * Submit a sample job: |
| 617 | {{{ |
| 618 | #!div style="font-size: 90%" |
| 619 | {{{#!sh |
| 620 | smoa-comp -c -J /opt/plgrid/qcg/smoa/share/smoa-comp/doc/examples/jsdl/sleep.xml |
| 621 | Activity Id: ccb6b04a-887b-4027-633f-412375559d73 |
| 622 | }}} |
| 623 | }}} |
| 624 | * Query it status: |
| 625 | {{{ |
| 626 | #!div style="font-size: 90%" |
| 627 | {{{#!sh |
| 628 | smoa-comp -s -a ccb6b04a-887b-4027-633f-412375559d73 |
| 629 | status = Executing |
| 630 | smoa-comp -s -a ccb6b04a-887b-4027-633f-412375559d73 |
| 631 | status = Executing |
| 632 | smoa-comp -s -a ccb6b04a-887b-4027-633f-412375559d73 |
| 633 | status = Finished |
| 634 | exit status = 0 |
| 635 | }}} |
| 636 | }}} |
| 637 | * Create an advance reservation: |
| 638 | * copy the provided sample reservation description file (expressed in ARDL - Advance Reservation Description Language) |
| 639 | {{{ |
| 640 | #!div style="font-size: 90%" |
| 641 | {{{#!sh |
| 642 | cp /opt/plgrid/qcg/smoa/share/smoa-comp/doc/examples/ardl/oneslot.xml oneslot.xml |
| 643 | }}} |
| 644 | }}} |
| 645 | * Edit the `oneslot.xml` and modify the `StartTime` and `EndTime` to dates that are in the near future, |
| 646 | * Create a new reservation: |
| 647 | {{{ |
| 648 | #!div style="font-size: 90%" |
| 649 | {{{#!sh |
| 650 | smoa-comp -c -D oneslot.xml |
| 651 | Reservation Id: aab6b04a-887b-4027-633f-412375559d7d |
| 652 | }}} |
| 653 | }}} |
| 654 | * List all reservations: |
| 655 | {{{ |
| 656 | #!div style="font-size: 90%" |
| 657 | {{{#!sh |
| 658 | smoa-comp -l |
| 659 | Reservation Id: aab6b04a-887b-4027-633f-412375559d7d |
| 660 | Total number of reservations: 1 |
| 661 | }}} |
| 662 | }}} |
| 663 | * Check which hosts where reserved: |
| 664 | {{{ |
| 665 | #!div style="font-size: 90%" |
| 666 | {{{#!sh |
| 667 | smoa-comp -s -r aab6b04a-887b-4027-633f-412375559d7d |
| 668 | Reserved hosts: |
| 669 | worker.example.com[used=0,reserved=1,total=4] |
| 670 | }}} |
| 671 | }}} |
| 672 | * Delete the reservation: |
| 673 | {{{ |
| 674 | #!div style="font-size: 90%" |
| 675 | {{{#!sh |
| 676 | smoa-comp -t -r aab6b04a-887b-4027-633f-412375559d7d |
| 677 | Reservation terminated. |
| 678 | }}} |
| 679 | }}} |
| 680 | * Check if the grid-ftp is working correctly: |
| 681 | {{{ |
| 682 | #!div style="font-size: 90%" |
| 683 | {{{#!sh |
| 684 | globus-url-copy gsiftp://your.local.host.name/etc/profile profile |
| 685 | diff /etc/profile profile |
| 686 | }}} |
| 687 | }}} |
| 688 | |
| 689 | = Configuring firewall = |
| 690 | In order to expose the !QosCosGrid services externally you need to open the following ports in the firewall: |
| 691 | * 19000 (TCP) - Smoa Computing |
| 692 | * 19001 (TCP) - Smoa Notification |
| 693 | * 2811 (TCP) - GridFTP server |
| 694 | * 9000-9500 (TCP) - GridFTP port-range (if you want to use different port-range adjust the `GLOBUS_TCP_PORT_RANGE` variable in the `/etc/xinetd.d/gsiftp` file) |
| 695 | |
| 696 | = Maintenance = |
| 697 | The historic usage information is stored in two relations of the Smoa Computing database: `jobs_acc` and `reservations_acc`. You can always archive old usage data to a file and delete it from the database using the psql client: |
| 698 | {{{ |
| 699 | #!div style="font-size: 90%" |
| 700 | {{{#!sh |
| 701 | psql -h localhost smoa_comp smoa_comp |
| 702 | Password for user smoa_comp: |
| 703 | Welcome to psql 8.1.23, the PostgreSQL interactive terminal. |
| 704 | |
| 705 | Type: \copyright for distribution terms |
| 706 | \h for help with SQL commands |
| 707 | \? for help with psql commands |
| 708 | \g or terminate with semicolon to execute query |
| 709 | \q to quit |
| 710 | |
| 711 | smoa_comp=> \o jobs.acc |
| 712 | smoa_comp=> SELECT * FROM jobs_acc where end_time < date '2010-01-10'; |
| 713 | smoa_comp=> \o reservations.acc |
| 714 | smoa_comp=> SELECT * FROM reservations_acc where end_time < date '2010-01-10'; |
| 715 | smoa_comp=> \o |
| 716 | smoa_comp=> DELETE FROM jobs_acc where end_time < date '2010-01-10'; |
| 717 | smoa_comp=> DELETE FROM reservation_acc where end_time < date '2010-01-10'; |
| 718 | }}} |
| 719 | }}} |