In addition to computing elements, the grid has a storage system consisting of several storage elements (SEs). Usually every site with a CE also has an SE. If your job needs to transfer files whose total size is more than 10MB, you need to use something other than the sandboxes (= InputFiles in Ganga). One possibility is to use the grid storage system for this purpose.
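The 10MB rule above can be checked on the UI before a job is submitted. The following is a minimal sketch and not part of the original tutorial: the function names are made up for illustration, and reading "10MB" as 10 * 1024 * 1024 bytes is an assumption.

```python
import os

# Assumption: "10MB" is interpreted as 10 * 1024 * 1024 bytes.
SANDBOX_LIMIT_BYTES = 10 * 1024 * 1024


def total_size(paths):
    """Return the combined size in bytes of the given local files."""
    return sum(os.path.getsize(p) for p in paths)


def needs_grid_storage(paths, limit=SANDBOX_LIMIT_BYTES):
    """Return True when the files together exceed the sandbox limit."""
    return total_size(paths) > limit
```

For example, needs_grid_storage(["inputs.txt"]) returns False for a small text file, in which case the input sandbox is enough.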
You need to set some environment variables for the command-line tools to work.
export LFC_HOST=lfc01.dur.scotgrid.ac.uk
export LCG_CATALOG_TYPE=lfc
export LFC_HOME=/grid/pheno/username
export LCG_GFAL_INFOSYS=lcgbdii.gridpp.rl.ac.uk:2170
By default, there is no directory with your username, "/grid/pheno/username", in the grid storage.
To create this directory, or any other directory, execute
lfc-mkdir /grid/pheno/username
To view a directory, use "lfc-ls", which works like the normal "ls" shell command. If you execute it without arguments
lfc-ls
it will list the contents of your home directory. The location is set with the "LFC_HOME" variable above.
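The commands in this tutorial use relative names such as "inputs" or "lfn:inputs/inputs.txt", which are taken relative to "LFC_HOME". The sketch below only illustrates that convention in Python; it is not the tools' own implementation, and the function name resolve_lfn is hypothetical.

```python
import os
import posixpath


def resolve_lfn(path, lfc_home=None):
    """Return the absolute catalogue path for a possibly relative name."""
    if lfc_home is None:
        # Fall back to the variable exported on the UI.
        lfc_home = os.environ.get("LFC_HOME", "")
    if path.startswith("lfn:"):
        path = path[len("lfn:"):]
    if posixpath.isabs(path):
        return path
    return posixpath.join(lfc_home, path)
```

With LFC_HOME set to /grid/pheno/username, resolve_lfn("inputs") gives /grid/pheno/username/inputs.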
Here we list some useful commands used to create, move, and remove files and directories. Let's first create a directory called inputs
lfc-mkdir inputs
and then copy a file (inputs.txt) from the grid UI into this directory
lcg-cr --vo pheno -l inputs/inputs.txt -d se01.dur.scotgrid.ac.uk file:${PWD}/inputs.txt
Mind the order of the arguments. To copy a file from the grid storage, use
lcg-cp lfn:inputs/inputs.txt inputs.txt
A file can be deleted using
lcg-del -a lfn:inputs/inputs.txt
and an empty directory is deleted with
lfc-rm -r inputs
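Since the order of the arguments to "lcg-cr" matters, it can help to assemble the command in one place when it is called from a script. This is a minimal sketch that reproduces the upload command shown above as an argument list; the helper name build_upload_command is an assumption made for illustration.

```python
import os


def build_upload_command(local_path, lfn, se, vo="pheno"):
    """Assemble the lcg-cr argument list in the order used in this tutorial."""
    return [
        "lcg-cr",
        "--vo", vo,
        "-l", lfn,   # logical file name in the catalogue
        "-d", se,    # destination storage element
        # The local source file comes last, as an absolute file: path.
        "file:" + os.path.abspath(local_path),
    ]
```

On a machine where the tools are installed, the returned list can be passed to subprocess.call.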
If we want either the input or the output to be on the grid storage at some stage, we need a driver script which handles the communication with the storage. It is up to you which language you use to write the script, but here we are going to use Python. Let's create two directories under your grid storage home directory.
lfc-mkdir inputs
lfc-mkdir outputs
and copy the input file for the C++ example program there
lcg-cr --vo pheno -l inputs/inputs.txt -d se01.dur.scotgrid.ac.uk file:${PWD}/inputs.txt
Then we need the driver script, which will download the input file from the storage, execute the program and upload the output to the storage element. We will call this file "driver.py".
#!/usr/bin/env python
__author__ = "Tuomas Hapola"
__date__ = "23.06.2015"
"""
Driver script to be executed on the worker node.
"""
import sys
import os
import time

class TestJob:
    def SetEnv(self):
        os.environ["LFC_HOST"]="lfc01.dur.scotgrid.ac.uk"
        os.environ["LCG_CATALOG_TYPE"]="lfc"
        os.environ["LFC_HOME"]="/grid/pheno/hapola"
        os.environ["LCG_GFAL_INFOSYS"]="lcgbdii.gridpp.rl.ac.uk:2170"

    def CopyIn(self):
        os.system("lcg-cp lfn:inputs/inputs.txt inputs.txt")

    def CopyOut(self):
        os.system("lcg-cr --vo pheno -l outputs/output.txt -d se01.dur.scotgrid.ac.uk file:${PWD}/output.txt")

    def RunJob(self):
        cmd = './simple inputs.txt'
        os.system(cmd)

def main():
    # Start...
    t0 = time.time()
    job = TestJob()
    job.SetEnv()
    job.CopyIn()
    # End of setup
    t1 = time.time()
    job.RunJob()
    job.CopyOut()
    # End of processing....
    t2 = time.time()
    print "Setup time (s)",t1-t0
    print "Execution time (s)",t2-t1
    print "Total time (s)",t2-t0

if __name__ == "__main__":
    main()
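The driver above calls os.system and discards the exit status, so a failed download would go unnoticed and the program would still be started. The following is a minimal sketch of a stricter variant, not the author's script: the names StepFailed, run_step and run_all are hypothetical, and the commands are the same three used in driver.py, written as argument lists.

```python
import os
import subprocess


class StepFailed(Exception):
    """Raised when one step of the driver returns a non-zero exit status."""


def run_step(name, argv, runner=subprocess.call):
    """Run one command and stop the driver if it fails."""
    status = runner(argv)
    if status != 0:
        raise StepFailed("%s failed with exit status %d" % (name, status))
    return status


def run_all(steps, runner=subprocess.call):
    """Run (name, argv) pairs in order; return the names that completed."""
    done = []
    for name, argv in steps:
        run_step(name, argv, runner)
        done.append(name)
    return done


# No shell is involved here, so ${PWD} is replaced by os.getcwd().
STEPS = [
    ("copy in", ["lcg-cp", "lfn:inputs/inputs.txt", "inputs.txt"]),
    ("run", ["./simple", "inputs.txt"]),
    ("copy out", ["lcg-cr", "--vo", "pheno", "-l", "outputs/output.txt",
                  "-d", "se01.dur.scotgrid.ac.uk",
                  "file:" + os.path.join(os.getcwd(), "output.txt")]),
]
```

Calling run_all(STEPS) on the worker node, after the environment variables have been set, runs the three steps and raises StepFailed at the first one that fails. The runner argument exists only so that the logic can be tried out without the grid tools installed.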
We also need to modify the submission script slightly, since the executable is now "driver.py". The program will be shipped using the input sandbox (=InputFiles).
#!/usr/bin/env ganga
import sys
import os

# Create the job
j0 = Job()
j0.name = 'TestJob'
j0.application = Executable(exe=File("driver.py"))
j0.backend="ARC"
j0.backend.CE='ce2.dur.scotgrid.ac.uk'

## Add input files.
j0.inputfiles = [LocalFile("simple","/mt/home/username/GridTutorial/"),]

# Finally submit...
print "Submitting... be patient"
j0.submit()
print 'Job',j0.fqid,'submitted'
The submission proceeds as before
/mt/home/username/Ganga/install/6.1.0-hotfix2/bin/ganga ganga-driver-submit.py
Once the job finishes, you should see a file "output.txt" when you execute
lfc-ls outputs
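The same check can be scripted. The sketch below is an illustration only: it assumes that "lfc-ls" without options prints one entry per line, as "ls" does when its output is not a terminal, and the function names are hypothetical.

```python
import subprocess


def listing_contains(listing, name):
    """Return True if `name` appears as an entry in lfc-ls style output."""
    entries = [line.strip() for line in listing.splitlines()]
    return name in entries


def output_exists(directory="outputs", name="output.txt",
                  lister=subprocess.check_output):
    """Run lfc-ls on `directory` and look for `name` in the result."""
    raw = lister(["lfc-ls", directory])
    if isinstance(raw, bytes):
        # check_output returns bytes under Python 3.
        raw = raw.decode("utf-8", "replace")
    return listing_contains(raw, name)
```

On the UI, output_exists() should return True once the job has finished and uploaded its result.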