Grid storage

In addition to computing elements (CEs), the grid has a storage system consisting of several storage elements (SEs). Usually every site with a CE also has an SE. If your job needs to transfer files whose total size is more than 10MB, you need to use something other than the sandboxes (= InputFiles in Ganga). One possibility is to use the grid storage system for this purpose.
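To decide whether the sandbox is enough, you can add up the sizes of the files before submitting. The following is only a minimal sketch: the helper names "total_size" and "needs_grid_storage" are made up for this illustration and are not part of Ganga, and reading 10MB as 10 * 1024 * 1024 bytes is an assumption.

```python
import os

# Assumed interpretation of the 10MB sandbox limit quoted above.
SANDBOX_LIMIT_BYTES = 10 * 1024 * 1024


def total_size(paths):
    """Return the combined size in bytes of the given local files."""
    return sum(os.path.getsize(p) for p in paths)


def needs_grid_storage(paths, limit=SANDBOX_LIMIT_BYTES):
    """Return True if the files together are larger than the sandbox limit."""
    return total_size(paths) > limit
```

If "needs_grid_storage" returns True for your input files, copy them to the grid storage instead of listing them in InputFiles.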

1. Preliminaries

You need to set some environment variables for the command-line tools to work.

export LFC_HOST=lfc01.dur.scotgrid.ac.uk
export LCG_CATALOG_TYPE=lfc
export LFC_HOME=/grid/pheno/username
export LCG_GFAL_INFOSYS=lcgbdii.gridpp.rl.ac.uk:2170

By default, the grid storage has no directory with your username, "/grid/pheno/username".

To create this directory, or any other directory, execute

lfc-mkdir /grid/pheno/username

To view a directory, use "lfc-ls", which works like the normal "ls" shell command. If you execute it without arguments

lfc-ls

it lists the contents of your home directory. The location is set with the "LFC_HOME" variable above.
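If you need the same settings from Python, for example in a script that runs on a worker node, it can help to keep them in one place. A minimal sketch, assuming two hypothetical helpers "grid_env" and "apply_grid_env"; the host names are the ones listed above and only the username changes.

```python
import os


def grid_env(username):
    """Return the storage-related environment variables for one user."""
    return {
        "LFC_HOST": "lfc01.dur.scotgrid.ac.uk",
        "LCG_CATALOG_TYPE": "lfc",
        "LFC_HOME": "/grid/pheno/" + username,
        "LCG_GFAL_INFOSYS": "lcgbdii.gridpp.rl.ac.uk:2170",
    }


def apply_grid_env(username, environ=os.environ):
    """Copy the settings into the given mapping (os.environ by default)."""
    environ.update(grid_env(username))
    return environ
```

Calling apply_grid_env("username") before running any of the lfc-* or lcg-* tools from the same process has the same effect as the export lines above.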

2. File handling

Here we list some useful commands for creating, copying, and removing files and directories. Let's first create a directory called "inputs"

lfc-mkdir inputs

and then copy a file (inputs.txt) from the grid UI into this directory

lcg-cr --vo pheno -l inputs/inputs.txt -d se01.dur.scotgrid.ac.uk file:${PWD}/inputs.txt

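Mind the order of the arguments: the name in the grid storage (-l) and the destination storage element (-d) come first, and the local source file comes last. To copy a file from the grid storage, use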

lcg-cp lfn:inputs/inputs.txt inputs.txt

A file can be deleted using

lcg-del -a lfn:inputs/inputs.txt

and an empty directory is deleted with

lfc-rm -r inputs
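Since the order of the arguments is easy to get wrong, one option is to build the command lines in one place. The sketch below only assembles the commands shown above as argument lists; the function names are hypothetical, and the default storage element and VO are the ones used in this tutorial.

```python
import os


def upload_cmd(local_path, lfn, se="se01.dur.scotgrid.ac.uk", vo="pheno"):
    """Build the lcg-cr command: grid name and SE first, local source last."""
    source = "file:" + os.path.abspath(local_path)
    return ["lcg-cr", "--vo", vo, "-l", lfn, "-d", se, source]


def download_cmd(lfn, local_path):
    """Build the lcg-cp command: grid source first, local destination second."""
    return ["lcg-cp", "lfn:" + lfn, local_path]


def delete_cmd(lfn):
    """Build the lcg-del command for a file in the grid storage."""
    return ["lcg-del", "-a", "lfn:" + lfn]
```

Each list can be passed directly to subprocess.call, which avoids going through the shell.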

3. Example

If we want either the input or the output to be on the grid storage at some stage, we need a driver script which handles the communication with the storage. It is up to you which language you use to write the script, but here we are going to use Python. Let's create two directories under your grid storage home directory.

lfc-mkdir inputs

lfc-mkdir outputs

and copy the input file for the C++ example program there

lcg-cr --vo pheno -l inputs/inputs.txt -d se01.dur.scotgrid.ac.uk file:${PWD}/inputs.txt

Then we need the driver script, which will download the input file from the storage, execute the program, and upload the output to the storage element. We will call this file "driver.py".

#!/usr/bin/env python
"""
Driver script to be executed on the worker node.
"""

__author__ = "Tuomas Hapola"
__date__ = "23.06.2015"

import sys
import os
import time


class TestJob:

    
    def SetEnv(self):
        os.environ["LFC_HOST"]="lfc01.dur.scotgrid.ac.uk"
        os.environ["LCG_CATALOG_TYPE"]="lfc"
        os.environ["LFC_HOME"]="/grid/pheno/hapola"  # use your own username here
        os.environ["LCG_GFAL_INFOSYS"]="lcgbdii.gridpp.rl.ac.uk:2170"
        
    def CopyIn(self):
        os.system("lcg-cp lfn:inputs/inputs.txt inputs.txt")

    def CopyOut(self):
        os.system("lcg-cr --vo pheno -l outputs/output.txt -d se01.dur.scotgrid.ac.uk file:${PWD}/output.txt")
        
    def RunJob(self):
        cmd = './simple inputs.txt'
        os.system(cmd)

def main():

    # Start...
    t0 = time.time()

    job = TestJob()
    job.SetEnv()
    job.CopyIn()

    # End of setup
    t1 = time.time()

    job.RunJob()
    job.CopyOut()
    
    # End of processing....
    t2 = time.time()

    print "Setup time (s)",t1-t0
    print "Execution time (s)",t2-t1
    print "Total time (s)",t2-t0

if __name__ == "__main__":
    main()
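The driver above ignores the return value of os.system, so the program is still started if the download fails, and the upload is still attempted if the program fails. One possible variation is to stop at the first failing step. This is a sketch only: the names "StepFailed", "run_step" and "run_all" are hypothetical, and it assumes that each tool reports failure with a non-zero exit status.

```python
import subprocess


class StepFailed(Exception):
    """Raised when a command exits with a non-zero status."""


def run_step(description, argv):
    """Run one command and raise StepFailed unless it exits with status 0."""
    status = subprocess.call(argv)
    if status != 0:
        raise StepFailed("%s failed with exit status %d" % (description, status))
    return status


def run_all(steps):
    """Run (description, argv) pairs in order and stop at the first failure."""
    for description, argv in steps:
        run_step(description, argv)


# Example (not executed here): the first two stages of the driver as steps.
# steps = [
#     ("download", ["lcg-cp", "lfn:inputs/inputs.txt", "inputs.txt"]),
#     ("run", ["./simple", "inputs.txt"]),
# ]
# run_all(steps)
```

With this approach a failed download shows up as an error in the job output instead of a missing or empty "output.txt".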

We also need to slightly modify the submission script, since the executable is now "driver.py". The program itself will be shipped using the input sandbox (= InputFiles).

#!/usr/bin/env ganga

import sys
import os

# Create the job
j0 = Job()
j0.name = 'TestJob'
j0.application = Executable(exe=File("driver.py"))
j0.backend="ARC"
j0.backend.CE='ce2.dur.scotgrid.ac.uk'

## Add input files.
j0.inputfiles = [LocalFile("simple","/mt/home/username/GridTutorial/"),]

# Finally submit...
print "Submitting... be patient"
j0.submit()
print 'Job',j0.fqid,'submitted'

The submission proceeds as before

/mt/home/username/Ganga/install/6.1.0-hotfix2/bin/ganga ganga-driver-submit.py

Once the job finishes, you should see a file "output.txt" when you execute

lfc-ls outputs
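If you want to do this check from a script rather than by eye, you can look for the file name in the listing. A minimal sketch, assuming that "lfc-ls" without options prints one entry per line; the helper name "listing_contains" is hypothetical.

```python
def listing_contains(listing_text, name):
    """Return True if `name` is one of the entries in the listing text."""
    entries = [line.strip() for line in listing_text.splitlines()]
    return name in entries


# Example (not executed here): fetch the listing and check for the output file.
# import subprocess
# listing = subprocess.check_output(["lfc-ls", "outputs"]).decode()
# print(listing_contains(listing, "output.txt"))
```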