ARC stands for Advanced Resource Connector, a grid computing middleware developed and maintained by NorduGrid. Durham has an ARC Computing Element (CE) grid front-end on top of the cluster. This page explains how to submit jobs to any available ARC CE using the ARC client tools, and how to upload your output to the grid storage.
A 24-hour proxy with a VOMS extension is created as follows:
arcproxy -S pheno -c validityPeriod=24h -c vomsACvalidityPeriod=24h
To create a longer-lasting proxy that renews automatically, please see the Grid Proxy page.
The user needs to write instructions for ARC so that it knows what to do on the node where the job runs and which files to send with the submission. ARC reads job options and definitions written in the extended Resource Specification Language (xRSL). Here is a simple example (submit.xrsl) that will do the job for our C++ example.
&("JobName" = "TestJob")
 ("executable" = "job.sh")
 ("walltime" = "20")
 ("stdout" = "stdout")
 ("stderr" = "stderr")
 ("inputfiles" = ("simple" "")("input.txt" ""))
More about the syntax and rules can be found in the xRSL documentation at http://www.nordugrid.org/documents/xrsl.pdf.
Note that we have set job.sh as our executable. In our example this is a script that runs our sample C++ executable and uploads the output to the grid storage system. We suggest using a job script such as this because it allows multiple commands to be run within a single job. The contents of job.sh are shown below; please replace the word USERNAME with your own username.
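If you prefer to create the job description from the command line, it can be written with a here-document. The following is only a sketch: JOB_NAME and WALLTIME are illustrative shell variables introduced here, not ARC settings, and the attribute values are the ones from the example above.

```shell
#!/bin/sh
# Sketch: write the example job description to submit.xrsl.
# JOB_NAME and WALLTIME are illustrative shell variables (assumptions),
# filled with the values used in the example above.
JOB_NAME="TestJob"
WALLTIME="20"

cat > submit.xrsl <<EOF
&("JobName" = "${JOB_NAME}")
 ("executable" = "job.sh")
 ("walltime" = "${WALLTIME}")
 ("stdout" = "stdout")
 ("stderr" = "stderr")
 ("inputfiles" = ("simple" "")("input.txt" ""))
EOF
```

Keeping the name and walltime in variables makes it easier to produce several similar job descriptions from one script.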
#!/bin/sh
# Make our C++ example executable
chmod +x simple
# Run the C++ example against the input
./simple input.txt
# Upload the output to grid storage
gfal-copy file://$PWD/output.txt gsiftp://se01.dur.scotgrid.ac.uk/dpm/dur.scotgrid.ac.uk/home/pheno/USERNAME/output.txt
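To avoid editing USERNAME by hand in several places, the storage URL can be assembled by a small helper. This is a minimal sketch: storage_url and SE_BASE are hypothetical names introduced for illustration, jbloggs is a placeholder username, and the host and path prefix are copied from the job script above.

```shell
#!/bin/sh
# Hypothetical helper (not part of ARC or gfal): build the grid storage URL
# for a file in a user's area. The prefix is taken from the job script above.
SE_BASE="gsiftp://se01.dur.scotgrid.ac.uk/dpm/dur.scotgrid.ac.uk/home/pheno"

storage_url() {
    # $1 = grid username, $2 = file name
    printf '%s/%s/%s\n' "$SE_BASE" "$1" "$2"
}

# Example: print the upload destination for output.txt
# ("jbloggs" is a placeholder username).
storage_url jbloggs output.txt
```

The printed URL can then be used as the destination argument of gfal-copy in the job script.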
The next step is to submit the job with the arcsub command. In addition to the xRSL file, the user needs to specify the computing element to submit to, using the -c option.
arcsub -c ce1.dur.scotgrid.ac.uk submit.xrsl
During submission, all the information about the job is stored in a job database. The default job database is located at ~/.arc/jobs.dat. With the -j option one can specify a separate job database.
arcsub -j jobs.xml -c ce1.dur.scotgrid.ac.uk submit.xrsl
This can be useful for keeping track of different sets of jobs. If needed, the job database can be reconstructed later using the arcsync command:
arcsync -j jobs.xml -c ce1.dur.scotgrid.ac.uk
This creates a job database containing all of the user's jobs on the specified CE.
After submission, the arcsub command prints a job ID for the submitted job. As mentioned earlier, this ID is also stored in the job database. To query the status of the job, use the arcstat command:
arcstat -j jobs.xml
This prints the status of all the jobs in the database. To find the status of a specific job, pass its ID:
arcstat -j jobs.xml <jobid>
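To query a single job from a script, it helps to capture the job ID when the job is submitted. The sketch below assumes that arcsub reports the ID on a line ending in "jobid: <id>"; that output format is an assumption, so check it against what your client prints. extract_jobid is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read arcsub output on stdin and print whatever
# follows "jobid: ". ASSUMPTION: arcsub reports the ID on a line such as
#   Job submitted with jobid: <id>
extract_jobid() {
    sed -n 's/^.*jobid: *//p'
}

# Intended use (not run here, since it needs the ARC client and a valid proxy):
#   jobid=$(arcsub -j jobs.xml -c ce1.dur.scotgrid.ac.uk submit.xrsl | extract_jobid)
#   arcstat -j jobs.xml "$jobid"
```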
After submission, it can take some time before this job reaches the information system.
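Because of this delay, a script that waits for a job usually has to poll. The following is a minimal sketch, not an official ARC feature: wait_for_job is a hypothetical helper, and it assumes that arcstat prints a line beginning with "State:" and that Finished, Failed, Killed and Deleted are the states at which polling should stop. Verify both assumptions against the output of your own client.

```shell
#!/bin/sh
# Hypothetical helper: poll arcstat until the job reaches a final state.
# ASSUMPTIONS: arcstat prints a line of the form " State: <state>", and
# Finished/Failed/Killed/Deleted are the states worth stopping on.
wait_for_job() {
    # $1 = job database, $2 = job ID, $3 = seconds between polls (default 60)
    db=$1
    id=$2
    interval=${3:-60}
    while :; do
        # An empty result means the job is not yet visible in the
        # information system, so we simply keep polling.
        state=$(arcstat -j "$db" "$id" 2>/dev/null | sed -n 's/^ *State: *//p' | head -n 1)
        case "$state" in
            Finished*|Failed*|Killed*|Deleted*)
                echo "$state"
                return 0
                ;;
        esac
        sleep "$interval"
    done
}

# Intended use (not run here):
#   wait_for_job jobs.xml "$jobid" 120
```

A generous polling interval is kinder to the CE than querying every few seconds.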
Possible job status (depending on the middleware)
When a job has finished, your output will be available on the grid storage and can be collected using the following command. As above, please replace USERNAME with your own username.
gfal-copy gsiftp://se01.dur.scotgrid.ac.uk/dpm/dur.scotgrid.ac.uk/home/pheno/USERNAME/output.txt file://$PWD/output.txt
You can also download the stdout/stderr files with arcget. This also cleans up the information system and the job database. Retrieving job output with arcget should be avoided wherever possible, as it is a very slow process.
arcget -j jobs.xml (<jobid>)
If you do not require the stdout/stderr files, you should remove them from the CE with the following command:
arcclean -j jobs.xml (<jobid>)
If you want to terminate a running job, use arckill:
arckill -j jobs.xml (<jobid>)
More about the command-line client tools can be found in the documentation at http://www.nordugrid.org/documents/arc-ui.pdf.