Version 13 (modified by mmamonski, 11 years ago)


Download

The newest version of the plugin can be checked out from SVN:

svn co https://apps.man.poznan.pl/svn/qcg-tools/qcg-nagios/nagios-plugins-qcg-comp

Installation

After checkout, simply run the ./install.sh script, passing the directory where the probe should be installed as the first argument. For example:

./install.sh /opt/qcg-comp-nagios

Usage

Usage:

./qcg-comp-nagios.sh -H hostname -p port -x proxy -t timeout [-v 0-3 -j test-jsdl.xml]

-H hostname - QCG-Computing host

-p port - QCG-Computing port

-x proxy - path to the file containing valid user X509 proxy

-t timeout - test timeout given in seconds

-v 0-3 - verbosity (default: 0)

-j test-jsdl.xml - JSDL document describing the job to be tested

Example:

./qcg-comp-nagios.sh -H grass1.man.poznan.pl -p 19002 -x /tmp/proxy -j qcg-test-job.xml -t 60
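The invocation above can be wrapped in a small helper that checks the proxy file before calling the probe. This is only a sketch: the function name run_qcg_probe and the QCG_PROBE variable are hypothetical and not part of the plugin, the default timeout of 60 seconds is simply taken from the example, and returning 3 for an unreadable proxy assumes this counts as a configuration error.

```shell
#!/bin/sh
# Hypothetical wrapper around the probe (not shipped with the plugin).
# Arguments: hostname, port, proxy file, optional timeout in seconds.
run_qcg_probe() {
    host=$1
    port=$2
    proxy=$3
    timeout=${4:-60}   # assumption: 60 s default, as in the example above

    # Assumption: an unreadable proxy file is reported as UNKNOWN (3).
    if [ ! -r "$proxy" ]; then
        echo "UNKNOWN: proxy file not readable: $proxy"
        return 3
    fi

    # QCG_PROBE may point at the installed probe; defaults to the current directory.
    "${QCG_PROBE:-./qcg-comp-nagios.sh}" -H "$host" -p "$port" -x "$proxy" -t "$timeout"
}
```

For example, run_qcg_probe grass1.man.poznan.pl 19002 /tmp/proxy 60 would run the probe with the same host, port, proxy and timeout as the example above, and the function's return value is the probe's exit code.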

Exit Codes

  • STATUS_OK (0) - Job finished successfully
  • STATUS_WARNING (1) - Job finished with a non-zero exit code, or the job did not finish within the given timeout.
  • STATUS_CRITICAL (2) - Job submission failed, or the job ended with status Failed or Cancelled.
  • STATUS_UNKNOWN (3) - Internal probe error or configuration error.
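When calling the probe from another script, the exit codes listed above can be translated back into their status names. A minimal sketch follows; the function name nagios_status_name is hypothetical, and mapping any code outside 0-3 to STATUS_UNKNOWN is an assumption rather than documented probe behaviour.

```shell
#!/bin/sh
# Hypothetical helper: map a probe exit code to its status name.
nagios_status_name() {
    case "$1" in
        0) echo "STATUS_OK" ;;
        1) echo "STATUS_WARNING" ;;
        2) echo "STATUS_CRITICAL" ;;
        3) echo "STATUS_UNKNOWN" ;;
        *) echo "STATUS_UNKNOWN" ;;   # assumption: unexpected codes are treated as unknown
    esac
}

# Usage (commented out, requires an installed probe):
# ./qcg-comp-nagios.sh -H grass1.man.poznan.pl -p 19002 -x /tmp/proxy -j qcg-test-job.xml -t 60
# nagios_status_name $?
```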

Interpreting error messages

  • "No CA certs provided" - check that the QCG-Comp service is registered in GOCDB with a URL starting with httpg (not https)
  • "Failed to submit a job: com.sun.xml.ws.client.ClientTransportException: HTTP transport error: java.net.SocketTimeoutException: connect timed out" - the machine is either down or a firewall is blocking traffic between the Nagios machine and the QCG-Computing machine (the QCG-Computing service listens on port 19000 by default)
  • "Failed to submit a job: com.sun.xml.ws.client.ClientTransportException: HTTP transport error: java.net.ConnectException: Connection refused" - the host is up but the service is down (check that the qcg-comp service is running)
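The message-to-cause mapping above could be automated with a simple substring match on the probe output. The sketch below is illustrative only: the function name explain_probe_error and the wording of the hints are hypothetical, and it assumes the probe prints the messages exactly as quoted in the list.

```shell
#!/bin/sh
# Hypothetical helper: print a troubleshooting hint for a probe error message.
explain_probe_error() {
    case "$1" in
        *"No CA certs provided"*)
            echo "check GOCDB registration: the service URL should start with httpg" ;;
        *"connect timed out"*)
            echo "host is down or a firewall blocks the connection" ;;
        *"Connection refused"*)
            echo "host is up but the qcg-comp service is not running" ;;
        *)
            echo "no hint available" ;;
    esac
}

# Usage (commented out): pass the probe output as a single argument.
# explain_probe_error "$(./qcg-comp-nagios.sh -H grass1.man.poznan.pl -p 19002 -x /tmp/proxy -t 60)"
```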