Changes between Version 56 and Version 57 of InstallingUsingDEBS

Timestamp:
07/30/13 11:25:00 (11 years ago)
Author:
mmamonski
Comment:

--

  • InstallingUsingDEBS

    v56 v57  
    1 [[PageOutline]]  
    2  
    3 *** !!! Warning !!! *** 
    4 1 
    5 2 This page has been obsoleted by the general [[InstallationGuide|Installation Guide]]. 
    6 3 
    7  
    8 = Introduction = 
    9 The QCG-Computing service (the successor of the OpenDSP project) is an open-source service that acts as a computing provider, exposing on-demand access to computing resources and jobs over an HPC Basic Profile compliant Web Services interface. In addition, QCG-Computing offers a remote interface for Advance Reservation management. 
    10  
    11 This document describes the installation of the QCG-Computing service on Debian machines using binary packages. The service should be deployed on a machine (or virtual machine) that: 
    12 * has at least 1 GB of memory (recommended value: 2 GB) 
    13 * has 10 GB of free disk space (most of the space will be used by the log files) 
    14 * has any modern CPU (if you plan to use a virtual machine, dedicate one or two cores of the host machine to it) 
    15 * runs Debian 6.x 
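
The sizing guidelines above can be checked with a few lines of shell. This is only a sketch: the function names are illustrative and not part of QCG-Computing, and checking `/var` is an assumption based on the service log location used later in this guide.

```shell
# Succeeds when the first integer is greater than or equal to the second.
at_least() {
    [ "$1" -ge "$2" ]
}

# Total physical memory in kB, read from /proc/meminfo (Linux only).
# Note: MemTotal is usually slightly below the installed RAM size.
total_memory_kb() {
    awk '/^MemTotal:/ { print $2 }' /proc/meminfo
}

# Free space in kB on the filesystem that holds the given directory.
free_disk_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Print a warning for every guideline that is not met.
check_host_requirements() {
    # Assumption: logs go to /var/log/qcg-comp, so /var is the filesystem to check.
    at_least "$(total_memory_kb)" 1048576 || echo "warning: less than 1 GB of memory"
    at_least "$(free_disk_kb /var)" 10485760 || echo "warning: less than 10 GB free under /var"
}
```

Running `check_host_requirements` prints nothing when both guidelines are met.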
    16 = Prerequisites = 
    17 We assume that the local batch system is already installed. 
    18  
    19 The !QosCosGrid services do not require you to install any QCG component on the worker nodes. However, the application wrapper scripts need the following software to be available on the worker nodes: 
    20  * bash,  
    21  * rsync, 
    22  * zip/unzip, 
    23  * dos2unix, 
    24  * nc, 
    25  * python. 
    26 These tools are usually available out of the box on most HPC systems. 
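
A quick way to confirm that a worker node has these tools is to look each one up on the `PATH`. The sketch below assumes a POSIX shell; the function names are illustrative only.

```shell
# Print every given command that cannot be found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# The tool list from this section; zip and unzip are separate commands.
check_worker_node_tools() {
    missing_tools bash rsync zip unzip dos2unix nc python
}
```

Run `check_worker_node_tools` on a worker node; empty output means nothing is missing.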
    27  
    28 == GridFTP server == 
    29 To be fully operational, the !QosCosGrid stack requires a GridFTP server. Most PRACE sites already satisfy this requirement. If yours does not, the server can be installed with the following commands: 
    30 {{{ 
    31 # apt-get install xinetd globus-gridftp-server-progs  
    32 # cat > /etc/xinetd.d/gsiftp << EOF 
    33 service gsiftp 
    34 { 
    35  instances               = 100 
    36  socket_type             = stream 
    37  wait                    = no 
    38  user                    = root 
    39  env                     += GLOBUS_TCP_PORT_RANGE=20000,25000 
    40  server = /usr/sbin/globus-gridftp-server 
    41  server_args = -i -aa -l /var/log/globus-gridftp.log 
    42  server_args += -d WARN 
    43  log_on_success          += DURATION 
    44  nice                    = 10 
    45  disable                 = no 
    46 } 
    47 EOF 
    48 # /etc/init.d/xinetd reload 
    49 Reloading internet superserver configuration: xinetd. 
    50 }}} 
    51 = Firewall configuration = 
    52 In order to expose the !QosCosGrid services externally you need to open the following incoming ports in the firewall: 
    53 * 19000 (TCP) - QCG-Computing 
    54 * 19001 (TCP) - QCG-Notification 
    55 * 2811 (TCP) - GridFTP server 
    56 * 20000-25000 (TCP) - GridFTP port range 
    57  
    58 The following outgoing traffic should be allowed in general: 
    59 * NTP, DNS, HTTP, HTTPS services 
    60 * GridFTP (TCP port 2811 and port ranges 9000-9500 and 20000-25000) 
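
If the firewall is managed with plain `iptables` (an assumption; your site may use a different tool or chain layout), the incoming rules could look like the ones printed by this sketch. The function only prints the commands so that they can be reviewed before anything is applied.

```shell
# Print (do not apply) iptables rules that accept the incoming TCP ports
# listed above. "20000:25000" is the iptables syntax for a port range.
print_qcg_input_rules() {
    for port in 19000 19001 2811 20000:25000; do
        echo "iptables -A INPUT -p tcp --dport $port -j ACCEPT"
    done
}
```

Review the output of `print_qcg_input_rules` and adapt the chain name and rule order to your existing policy before running any of it as root.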
    61  
    62 = Related software = 
    63 * Install the database backend (PostgreSQL). This is optional and needed only if you want to host the QCG-Computing database on the same machine. 
    64 {{{#!sh 
    65 apt-get install postgresql 
    66 }}} 
    67  
    68 * Install UnixODBC and the PostgreSQL ODBC driver: 
    69 {{{#!sh 
    70 apt-get install unixodbc odbc-postgresql 
    71 }}} 
    72  
    73 We further assume that the X.509 host certificate and key are already installed in the following locations: 
    74 * `/etc/grid-security/hostcert.pem` 
    75 * `/etc/grid-security/hostkey.pem` 
    76  
    77 Most grid services and security infrastructures are sensitive to time skew. We therefore recommend installing a Network Time Protocol daemon or using another solution that provides accurate clock synchronization. 
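
Before continuing, it can help to confirm that the host credentials are where this guide expects them. A minimal sketch, with an illustrative function name and the directory defaulting to `/etc/grid-security`:

```shell
# Print the path of each expected host credential file that is missing.
# $1 (optional) = directory to check, default /etc/grid-security.
missing_host_credentials() {
    dir="${1:-/etc/grid-security}"
    for f in hostcert.pem hostkey.pem; do
        [ -f "$dir/$f" ] || echo "$dir/$f"
    done
}
```

Empty output from `missing_host_credentials` means both files are present.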
    78  
    79 = Installation  = 
    80 To install QCG-Computing on Debian, follow these steps: 
    81  
    82 * ensure that the `qcg-comp` user exists on the system; otherwise create it: 
    83 {{{#!sh 
    84 useradd -r -d /var/log/qcg-comp/ qcg-comp 
    85 }}} 
    86  
    87 * ensure that the `qcg-dev` group exists on the system; otherwise create it: 
    88 {{{#!sh 
    89 groupadd -r qcg-dev 
    90 }}} 
    91  
    92 * install the !QosCosGrid Debian repository: 
    93 {{{#!sh 
    94 cat > /etc/apt/sources.list.d/qcg.unstable.list << EOF 
    95 deb http://fury.man.poznan.pl/qcg-packages/debian/ unstable main 
    96 EOF 
    97 }}} 
    98  
    99 * add the public key of the QCG repository to your trusted keys in the apt configuration: 
    100 {{{#!sh 
    101 wget https://apps.man.poznan.pl/trac/qcg-notification/raw-attachment/wiki/InstallingUsingDeb/qcg.pub 
    102 apt-key add qcg.pub 
    103 }}} 
    104  
    105 * refresh the packages list: 
    106 {{{#!sh 
    107 apt-get update 
    108 }}} 
    109  
    110 * install QCG-Computing: 
    111 {{{ 
    112 #!div style="font-size: 90%" 
    113 {{{#!sh 
    114 apt-get install qcg-comp qcg-comp-client qcg-comp-doc 
    115 }}} 
    116 }}} 
    117  
    118 * set up the QCG-Computing database as described [http://apps.man.poznan.pl/trac/qcg-computing/wiki/InstallingFromSources#Databasesetup here]. 
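
The "ensure ... otherwise create it" steps for the user and group can be made safe to re-run. The sketch below relies on `getent`, which Debian provides; the helper names are illustrative, while the `useradd` and `groupadd` invocations are the ones from the steps above.

```shell
# Succeed when the named user or group is known to the system.
user_exists()  { getent passwd "$1" >/dev/null 2>&1; }
group_exists() { getent group  "$1" >/dev/null 2>&1; }

# Create the service account and the developers group only when missing.
# Must be run as root.
ensure_qcg_accounts() {
    user_exists qcg-comp || useradd -r -d /var/log/qcg-comp/ qcg-comp
    group_exists qcg-dev || groupadd -r qcg-dev
}
```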
    119  
    120  
    121  
    122  
    123 = Service certificates = 
    124 Copy the service certificate and key into `/etc/qcg-comp/certs/`. Remember to restrict the permissions of the key file. 
    125 {{{ 
    126 #!div style="font-size: 90%" 
    127 {{{#!default 
    128 cp /etc/grid-security/hostcert.pem /etc/qcg-comp/certs/qcgcert.pem 
    129 cp /etc/grid-security/hostkey.pem /etc/qcg-comp/certs/qcgkey.pem 
    130 chown qcg-comp /etc/qcg-comp/certs/qcgcert.pem 
    131 chown qcg-comp /etc/qcg-comp/certs/qcgkey.pem  
    132 chmod 0600 /etc/qcg-comp/certs/qcgkey.pem 
    133 }}} 
    134 }}} 
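
After copying the files, the key permissions can be double-checked. This sketch uses GNU `stat -c`, as shipped with Debian; the function names are illustrative.

```shell
# Succeeds when the file's permission bits equal the given octal mode.
# $1 = file, $2 = expected mode without a leading zero (e.g. 600).
has_mode() {
    [ "$(stat -c '%a' "$1")" = "$2" ]
}

# Warn when the service key is readable by anyone other than its owner.
# $1 (optional) = key file, default /etc/qcg-comp/certs/qcgkey.pem.
check_service_key() {
    key="${1:-/etc/qcg-comp/certs/qcgkey.pem}"
    has_mode "$key" 600 || echo "warning: $key should have mode 0600"
}
```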
    135 = DRMAA library = 
    136 == Torque/PBS Professional == 
    137 Install DRMAA for Torque/PBS Pro using the source package available at the [http://apps.man.poznan.pl/trac/pbs-drmaa PBS DRMAA home page]. 
    138  
    139  
    140 == SLURM == 
    141 Install DRMAA for SLURM using the source package available at the [http://apps.man.poznan.pl/trac/slurm-drmaa SLURM DRMAA home page]. 
    142 {{{ 
    143 # install the SLURM header files 
    144 apt-get install libslurm21-dev 
    145 # get SLURM DRMAA 
    146 wget http://apps.man.poznan.pl/trac/slurm-drmaa/downloads/slurm-drmaa-1.0.5.tar.gz 
    147 tar -xzf slurm-drmaa-1.0.5.tar.gz 
    148 cd slurm-drmaa-1.0.5 
    149 # configure, make and install (by default DRMAA is installed into /usr/local/) 
    150 ./configure 
    151 make  
    152 make install 
    153 # test it! 
    154 /usr/local/bin/drmaa-run /bin/hostname 
    155  
    156 }}} 
    157 = Service configuration  = 
    158 Edit the preinstalled service configuration file (`/etc/qcg-comp/qcg-comp`**d**`.xml`): 
    159 {{{ 
    160 <?xml version="1.0" encoding="UTF-8"?> 
    161 <sm:QCGCore 
    162         xmlns:sm="http://schemas.qoscosgrid.org/core/2011/04/config" 
    163         xmlns="http://schemas.qoscosgrid.org/comp/2011/04/config" 
    164         xmlns:smc="http://schemas.qoscosgrid.org/comp/2011/04/config" 
    165         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> 
    166          
    167         <Configuration> 
    168                 <sm:ModuleManager> 
    169                         <sm:Directory>/usr/lib/qcg-core/modules/</sm:Directory> 
    170                         <sm:Directory>/usr/lib/qcg-comp/modules/</sm:Directory> 
    171                 </sm:ModuleManager> 
    172  
    173                 <sm:Service xsi:type="qcg-compd" description="QCG Computing"> 
    174                         <sm:Logger> 
    175                                 <sm:Filename>/var/log/qcg-comp/qcg-compd.log</sm:Filename> 
    176                                 <sm:Level>INFO</sm:Level> 
    177                         </sm:Logger> 
    178  
    179                         <sm:Transport> 
    180                                 <sm:Module xsi:type="sm:ecm_gsoap.service"> 
    181                                         <sm:Host>localhost</sm:Host> 
    182                                         <sm:Port>19000</sm:Port> 
    183                                 </sm:Module> 
    184                                 <sm:Module xsi:type="smc:qcg-comp-service"/> 
    185                         </sm:Transport> 
    186                          
    187                         <sm:Authentication> 
    188         <sm:Module xsi:type="sm:atc_transport_gsi.service"> 
    189                          <sm:X509CertFile>/etc/qcg-comp/certs/qcgcert.pem</sm:X509CertFile> 
    190                          <sm:X509KeyFile>/etc/qcg-comp/certs/qcgkey.pem</sm:X509KeyFile> 
    191         </sm:Module> 
    192                         </sm:Authentication> 
    193  
    194       <sm:Authorization> 
    195         <sm:Module xsi:type="sm:atz_mapfile"> 
    196           <sm:Mapfile>/etc/grid-security/grid-mapfile</sm:Mapfile> 
    197         </sm:Module> 
    198       </sm:Authorization> 
    199  
    200  
    201                         <sm:Module xsi:type="submission_drmaa" path="/usr/local/lib/libdrmaa.so"/> 
    202  
    203       <!-- The jsdl filter module - uncomment module appropriate for your batch system --> 
    204       <!-- sm:Module xsi:type="pbs_jsdl_filter"/--> 
    205       <!-- sm:Module xsi:type="sge_jsdl_filter"/--> 
    206       <!-- sm:Module xsi:type="slurm_jsdl_filter"/--> 
    207       <!-- sm:Module xsi:type="lsf_jsdl_filter"/--> 
    208  
    209       <!-- The reservation module - uncomment module appropriate for your batch/scheduler system --> 
    210                         <!--sm:Module xsi:type="reservation_python" path="/usr/lib/qcg-comp/modules/python/reservation_sge.py"/--> 
    211                         <!--sm:Module xsi:type="reservation_python" path="/usr/lib/qcg-comp/modules/python/reservation_maui.py"/--> 
    212                         <!--sm:Module xsi:type="reservation_python" path="/usr/lib/qcg-comp/modules/python/reservation_moab.py"/--> 
    213                         <!--sm:Module xsi:type="reservation_python" path="/usr/lib/qcg-comp/modules/python/reservation_pbs.py"/--> 
    214                         <!--sm:Module xsi:type="reservation_python" path="/usr/lib/qcg-comp/modules/python/reservation_slurm.py"/--> 
    215       <sm:Module xsi:type="atz_ardl_filter"/> 
    216  
    217       <sm:Module xsi:type="sm:general_python" path="/usr/lib/qcg-comp/modules/python/monitoring.py"/> 
    218                          
    219                         <sm:Module xsi:type="notification_wsn"> 
    220                                 <sm:Module xsi:type="sm:ecm_gsoap.client" > 
    221                                                 <sm:ServiceURL>http://localhost:19001/</sm:ServiceURL> 
    222                                                         <sm:Authentication> 
    223                                                                 <sm:Module xsi:type="sm:atc_transport_http.client"/> 
    224                                                         </sm:Authentication> 
    225                                                 <sm:Module xsi:type="sm:ntf_client"/> 
    226                                 </sm:Module> 
    227                         </sm:Module> 
    228                                  
    229                         <sm:Module xsi:type="application_mapper"> 
    230                                 <ApplicationMapFile>/etc/qcg-comp/application_mapfile</ApplicationMapFile> 
    231                         </sm:Module> 
    232  
    233                         <Database> 
    234                                 <DSN>qcg-comp</DSN> 
    235                                 <User>qcg-comp</User> 
    236                                 <Password>qcg-comp</Password> 
    237                         </Database> 
    238  
    239                         <UnprivilegedUser>qcg-comp</UnprivilegedUser> 
    240  
    241                         <FactoryAttributes> 
    242                                 <CommonName>IT cluster</CommonName> 
    243                                 <LongDescription>IT department cluster for public use</LongDescription> 
    244                         </FactoryAttributes> 
    245                 </sm:Service> 
    246  
    247         </Configuration> 
    248 </sm:QCGCore> 
    249 }}} 
    250 In most cases it should be enough to change only the following elements: 
    251  `Transport/Module/Host` :: 
    252    the hostname of the machine where the service is deployed  
    253  `Transport/Module/Authentication/Module/X509CertFile` and `Transport/Module/Authentication/Module/X509KeyFile` :: 
    254   the service X.509 certificate and private key. Make sure that the certificate and key are owned by the `qcg-comp` user. If you installed the certificate and key files in the recommended location, you do not need to edit these fields. 
    255  `Module[type="smc:notification_wsn"]/PublishedBrokerURL` ::  
    256   the external URL of the QCG-Notification service (You can do it later, i.e. after [http://apps.man.poznan.pl/trac/qcg-notification/wiki/InstallingUsingDeb installing the QCG-Notification service]) 
    257  `Module[type="smc:notification_wsn"]/Module/ServiceURL` ::  
    258   the localhost URL of the QCG-Notification service (You can do it later, i.e. after [http://apps.man.poznan.pl/trac/qcg-notification/wiki/InstallingUsingDeb installing the QCG-Notification service]) 
    259  `Module[type="submission_drmaa"]/@path` :: 
    260   path to the DRMAA library (`libdrmaa.so`). If you installed the DRMAA library into the default location (`/usr/local/`), you do not need to change this path. 
    261  `Database/Password` ::  
    262   the `qcg-comp` database password   
    263  `UseScratch` :: 
    264   set this to `true` if you set `QCG_SCRATCH_DIR_ROOT` in `sysconfig`, so that every job is started from the scratch directory (instead of the default home directory) 
    265  `FactoryAttributes/CommonName` ::  
    266   a common name of the cluster (e.g. reef.man.poznan.pl). You can use any name that is unique among all systems (e.g. cluster name + domain name of your institution) 
    267  `FactoryAttributes/LongDescription` ::  
    268   a human readable description of the cluster 
    269  
    270 Also remember to uncomment the `jsdl_filter` and `reservation_python` modules that match your batch system. 
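
These edits can also be scripted. The following is a rough sketch using GNU `sed -i`, not an official tool: the function names are made up for illustration, and the patterns assume that the file still has the layout of the preinstalled example (one element per line). Back up the file before trying it.

```shell
# Replace the content of every <sm:Host> element in the given file.
# $1 = configuration file, $2 = new hostname.
set_service_host() {
    sed -i "s|<sm:Host>[^<]*</sm:Host>|<sm:Host>$2</sm:Host>|" "$1"
}

# Turn a commented-out module line such as
#   <!-- sm:Module xsi:type="slurm_jsdl_filter"/-->
# into an active element.
# $1 = configuration file, $2 = text that identifies the line
#      (no slashes), e.g. slurm_jsdl_filter or reservation_slurm.
enable_module() {
    sed -i "/$2/ s|<!-- *\(sm:Module [^>]*/\)-->|<\1>|" "$1"
}
```

For example, `set_service_host /etc/qcg-comp/qcg-compd.xml frontend.example.com` followed by `enable_module /etc/qcg-comp/qcg-compd.xml slurm_jsdl_filter` would cover a SLURM site; the hostname here is a placeholder.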
    271  
    272  
    273 = Creating applications' script space = 
    274 A common case for the QCG-Computing service is that an application is accessed by an abstract application name rather than by the absolute path of its executable. The mappings from application name and version to executable path are stored in the file `/etc/qcg-comp/application_mapfile`: 
    275  
    276 {{{#!default 
    277 cat /etc/qcg-comp/application_mapfile 
    278 # ApplicationName ApplicationVersion Executable 
    279  
    280 date * /bin/date 
    281 LPSolve 5.5 /usr/local/bin/lp_solve 
    282 }}} 
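
To illustrate how such a file can be read, here is a small lookup sketch. It is not the service's own resolver: it assumes that `*` in the version column matches any requested version and that executable paths contain no spaces. The function name is illustrative.

```shell
# Print the executable mapped to an application name and version.
# $1 = mapfile, $2 = application name, $3 = requested version.
# Returns non-zero when no entry matches.
lookup_application() {
    awk -v name="$2" -v ver="$3" '
        /^[ \t]*#/ || NF < 3 { next }    # skip comments, blank and short lines
        $1 == name && ($2 == "*" || ($2 "") == (ver "")) {
            print $3; found = 1; exit    # first matching entry wins
        }
        END { exit !found }
    ' "$1"
}
```

With the example file above, `lookup_application /etc/qcg-comp/application_mapfile LPSolve 5.5` prints `/usr/local/bin/lp_solve`.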
    283  
    284  
    285 It is also common to list wrapper scripts here rather than the target executables. A wrapper script can handle aspects of the application lifetime such as environment initialization, copying files from/to scratch storage, and application monitoring. We recommend creating a separate directory for these wrapper scripts (e.g. on the application partition) and granting write permission on it to the QCG Developers group (`qcg-dev`). This directory must be readable by all users and from every worker node (the application partition usually fulfills those requirements). 
    286  
    287 {{{ 
    288 #!div style="font-size: 90%" 
    289 {{{#!default 
    290 mkdir /opt/exp_soft/qcg-app-scripts 
    291 chown :qcg-dev /opt/exp_soft/qcg-app-scripts 
    292 chmod g+rwx /opt/exp_soft/qcg-app-scripts 
    293 }}} 
    294 }}} 
    295  
    296 More on [ApplicationScripts Application Scripts]. 
    297 = Note on the security model = 
    298 QCG-Computing can be configured with various authentication and authorization modules. However, in a typical deployment we assume that QCG-Computing is configured as in the example above, i.e.: 
    299 * authentication is based on the ''httpg'' protocol, 
    300 * authorization is based on the local `grid-mapfile`. 
    301  
    302 = Starting the service = 
    303 As root, type: 
    304 {{{ 
    305 #!div style="font-size: 90%" 
    306 {{{#!sh 
    307 /etc/init.d/qcg-comp start 
    308 }}} 
    309 }}} 
    310  
    311 The service logs can be found in: 
    312 {{{#!sh 
    313 /var/log/qcg-comp/qcg-compd.log 
    314 }}} 
    315  
    316  
    317  
    318  
    319 = Stopping the service = 
    320 The service can be stopped using the following command: 
    321 {{{ 
    322 #!div style="font-size: 90%" 
    323 {{{#!sh 
    324 /etc/init.d/qcg-comp stop 
    325 }}} 
    326 }}} 
    327  
    328 = Verifying the installation = 
    329  
    330 *  Edit the QCG-Computing client configuration file (`/etc/qcg-comp/qcg-comp.xml`): 
    331  *  set the `Host` and `Port` to reflect the changes made in the service configuration file (`qcg-compd.xml`). 
    332 {{{ 
    333 #!div style="font-size: 90%" 
    334 {{{#!sh 
    335 <?xml version="1.0" encoding="UTF-8"?> 
    336 <sm:QCGCore 
    337        xmlns:sm="http://schemas.qoscosgrid.org/core/2011/04/config" 
    338        xmlns="http://schemas.qoscosgrid.org/comp/2011/04/config" 
    339        xmlns:smc="http://schemas.qoscosgrid.org/comp/2011/04/config" 
    340        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> 
    341    
    342        <Configuration> 
    343                <sm:ModuleManager> 
    344                        <sm:Directory>/opt/qcg/lib/qcg-core/modules/</sm:Directory> 
    345                        <sm:Directory>/opt/qcg/lib/qcg-comp/modules/</sm:Directory> 
    346                </sm:ModuleManager> 
    347   
    348                <sm:Client xsi:type="qcg-comp" description="QCG-Computing client"> 
    349                        <sm:Transport> 
    350                                <sm:Module xsi:type="sm:ecm_gsoap.client"> 
    351                                        <sm:ServiceURL>httpg://frontend.example.com:19000/</sm:ServiceURL> 
    352                                        <sm:Authentication> 
    353                                                <sm:Module xsi:type="sm:atc_transport_gsi.client"/> 
    354                                        </sm:Authentication> 
    355                                        <sm:Module xsi:type="smc:qcg-comp-client"/> 
    356                                </sm:Module> 
    357                        </sm:Transport> 
    358                </sm:Client> 
    359        </Configuration> 
    360 </sm:QCGCore> 
    361 }}} 
    362 }}} 
    363 * Initialize your credentials: 
    364 {{{ 
    365 #!div style="font-size: 90%" 
    366 {{{#!sh 
    367 grid-proxy-init -rfc 
    368 Your identity: /O=Grid/OU=QosCosGrid/OU=PSNC/CN=Mariusz Mamonski 
    369 Enter GRID pass phrase for this identity: 
    370 Creating proxy .................................................................. Done 
    371 Your proxy is valid until: Wed Apr  6 05:01:02 2012 
    372 }}} 
    373 }}} 
    374 * Query the QCG-Computing service: 
    375 {{{ 
    376 #!div style="font-size: 90%" 
    377 {{{#!sh 
    378 qcg-comp -G | xmllint --format - # xmllint is used only to pretty-print the result 
    379    
    380 <bes-factory:FactoryResourceAttributesDocument xmlns:bes-factory="http://schemas.ggf.org/bes/2006/08/bes-factory"> 
    381     <bes-factory:IsAcceptingNewActivities>true</bes-factory:IsAcceptingNewActivities> 
    382     <bes-factory:CommonName>IT cluster</bes-factory:CommonName> 
    383     <bes-factory:LongDescription>IT department cluster for public use</bes-factory:LongDescription> 
    384     <bes-factory:TotalNumberOfActivities>0</bes-factory:TotalNumberOfActivities> 
    385     <bes-factory:TotalNumberOfContainedResources>1</bes-factory:TotalNumberOfContainedResources> 
    386     <bes-factory:ContainedResource xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="bes-factory:BasicResourceAttributesDocumentType"> 
    387         <bes-factory:ResourceName>worker.example.com</bes-factory:ResourceName> 
    388         <bes-factory:CPUArchitecture> 
    389             <jsdl:CPUArchitectureName xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl">x86_32</jsdl:CPUArchitectureName> 
    390         </bes-factory:CPUArchitecture> 
    391         <bes-factory:CPUCount>4</bes-factory:CPUCount><bes-factory:PhysicalMemory>1073741824</bes-factory:PhysicalMemory> 
    392     </bes-factory:ContainedResource> 
    393     <bes-factory:NamingProfile>http://schemas.ggf.org/bes/2006/08/bes/naming/BasicWSAddressing</bes-factory:NamingProfile>  
    394     <bes-factory:BESExtension>http://schemas.ogf.org/hpcp/2007/01/bp/BasicFilter</bes-factory:BESExtension> 
    395     <bes-factory:BESExtension>http://schemas.qoscosgrid.org/comp/2011/04</bes-factory:BESExtension> 
    396     <bes-factory:LocalResourceManagerType>http://example.com/SunGridEngine</bes-factory:LocalResourceManagerType> 
    397     <smcf:NotificationProviderURL xmlns:smcf="http://schemas.qoscosgrid.org/comp/2011/04/factory">http://localhost:2211/</smcf:NotificationProviderURL> 
    398 </bes-factory:FactoryResourceAttributesDocument> 
    399 }}} 
    400 }}} 
    401 * Submit a sample job: 
    402 {{{ 
    403 #!div style="font-size: 90%" 
    404 {{{#!sh 
    405 qcg-comp -c -J /usr/share/doc/qcg-comp-doc/examples/date.xml 
    406 Activity Id: ccb6b04a-887b-4027-633f-412375559d73 
    407 }}} 
    408 }}} 
    409 * Query its status: 
    410 {{{ 
    411 #!div style="font-size: 90%" 
    412 {{{#!sh 
    413 qcg-comp -s -a ccb6b04a-887b-4027-633f-412375559d73 
    414 status = Executing 
    415 qcg-comp -s -a ccb6b04a-887b-4027-633f-412375559d73 
    416 status = Executing 
    417 qcg-comp -s -a ccb6b04a-887b-4027-633f-412375559d73 
    418 status = Finished 
    419 exit status = 0 
    420 }}} 
    421 }}}