Grid storage

In addition to computing elements (CEs), the grid has a storage system consisting of several storage elements (SEs). Usually every site with a CE also has an SE. If your job needs to transfer files whose total size is more than 5MB, you need to use something other than sandboxes (InputFiles in XRSL). One possibility is to use the grid storage system for this purpose. All the examples below use the Durham SE.

Remember you will need a proxy before running any of the following commands!

1. File handling

Here we list some useful commands used to create, move, and remove files and directories. As you will notice, the commands are very similar to standard Linux file management commands. Let's first create a directory that we will use for all our own files.

gfal-mkdir gsiftp://se01.dur.scotgrid.ac.uk/dpm/dur.scotgrid.ac.uk/home/pheno/osmith

and then create a subdirectory for our input files

gfal-mkdir gsiftp://se01.dur.scotgrid.ac.uk/dpm/dur.scotgrid.ac.uk/home/pheno/osmith/input

and copy a file (input.txt) from the grid UI into this directory

gfal-copy file://$PWD/input.txt gsiftp://se01.dur.scotgrid.ac.uk/dpm/dur.scotgrid.ac.uk/home/pheno/osmith/input/input.txt

Be careful with the order of the arguments: the source comes first and the destination second. To copy a file from the grid storage back to the UI, use

gfal-copy gsiftp://se01.dur.scotgrid.ac.uk/dpm/dur.scotgrid.ac.uk/home/pheno/osmith/input/input.txt file://$PWD/input.txt

A file can be deleted using

gfal-rm gsiftp://se01.dur.scotgrid.ac.uk/dpm/dur.scotgrid.ac.uk/home/pheno/osmith/input/input.txt

and a directory is deleted with

gfal-rm -r gsiftp://se01.dur.scotgrid.ac.uk/dpm/dur.scotgrid.ac.uk/home/pheno/osmith/input
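Since every command above repeats the same long prefix, it can help to keep that prefix in a variable. The following is only a minimal sketch, assuming a bash shell; SE_BASE, GRID_USER and se_url are hypothetical names introduced here for illustration and are not part of the gfal tools.

```shell
#!/bin/bash
# Hypothetical helper: build SE URLs so the long prefix is typed only once.
# SE_BASE and GRID_USER are illustrative names; adjust them to your own area.
SE_BASE="gsiftp://se01.dur.scotgrid.ac.uk/dpm/dur.scotgrid.ac.uk/home/pheno"
GRID_USER="osmith"

# Print the full URL for a path relative to the user's storage directory.
se_url() {
    printf '%s/%s/%s\n' "$SE_BASE" "$GRID_USER" "$1"
}

# Example usage (needs a valid proxy and the gfal tools, so left commented out):
#   gfal-mkdir "$(se_url input)"
#   gfal-copy "file://$PWD/input.txt" "$(se_url input/input.txt)"
```

With these definitions, se_url input/input.txt prints the same URL that is written out in full in the commands above.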

2. Example

If we want either the input or output to be on the grid storage at some stage, we need to have a job script which handles the communication with the storage. It is up to you which language you use to write the script, but here we are going to use bash. Let's create two directories under your grid storage home directory.

gfal-mkdir gsiftp://se01.dur.scotgrid.ac.uk/dpm/dur.scotgrid.ac.uk/home/pheno/osmith/inputs

gfal-mkdir gsiftp://se01.dur.scotgrid.ac.uk/dpm/dur.scotgrid.ac.uk/home/pheno/osmith/outputs

and copy the input file for the C++ example program there (note that it is stored under the name inputs.txt)

gfal-copy file://$PWD/input.txt gsiftp://se01.dur.scotgrid.ac.uk/dpm/dur.scotgrid.ac.uk/home/pheno/osmith/inputs/inputs.txt

Then we need a job script which will download the input file from the storage, execute the program and upload the output into the storage element. We will call this file "job.sh".

#!/bin/bash

# Get our inputs file from grid storage
gfal-copy gsiftp://se01.dur.scotgrid.ac.uk/dpm/dur.scotgrid.ac.uk/home/pheno/osmith/inputs/inputs.txt file://$PWD/inputs.txt

# Make our C++ example executable
chmod +x simple

# Run C++ example against input
./simple inputs.txt

# Upload output to grid storage
gfal-copy file://$PWD/output.txt gsiftp://se01.dur.scotgrid.ac.uk/dpm/dur.scotgrid.ac.uk/home/pheno/osmith/outputs/output.txt
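Each step in job.sh depends on the one before it, so it may be worth stopping the job as soon as a step fails instead of carrying on without the input file. The following is a hedged sketch of one way to do that in bash; run_step is a hypothetical helper introduced here, not something provided by the grid tools.

```shell
#!/bin/bash
# Hypothetical helper: run one step of the job and stop the job if it fails,
# so that a failed download does not lead to running without input.
run_step() {
    local description="$1"
    shift
    echo "Starting: $description"
    if ! "$@"; then
        echo "Failed: $description" >&2
        exit 1
    fi
}

# Example usage inside job.sh (commented out; needs a proxy and the gfal tools):
#   SE=gsiftp://se01.dur.scotgrid.ac.uk/dpm/dur.scotgrid.ac.uk/home/pheno/osmith
#   run_step "download input" gfal-copy "$SE/inputs/inputs.txt" "file://$PWD/inputs.txt"
#   run_step "run program"    ./simple inputs.txt
#   run_step "upload output"  gfal-copy "file://$PWD/output.txt" "$SE/outputs/output.txt"
```

The messages end up in the job's stdout and stderr files, which can make it easier to see which step went wrong. The plain job.sh above is sufficient for the rest of this example.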

This can now be submitted as with our previous example but we need to make a slight adjustment to the submit.xrsl as follows:

&("JobName" = "TestJob")
("executable" = "job.sh")
("walltime" = "20")
("stdout" = "stdout")
("stderr" = "stderr")
("inputfiles" = ("simple" ""))

and then submit...

arcsub -c ce1.dur.scotgrid.ac.uk submit.xrsl

Once the job finishes, you should see a file "output.txt" when you execute

gfal-ls gsiftp://se01.dur.scotgrid.ac.uk/dpm/dur.scotgrid.ac.uk/home/pheno/osmith/outputs
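If you would rather check for the file from a script than by eye, the listing can be filtered. This is a minimal sketch, assuming gfal-ls prints one entry per line; listing_has is a hypothetical helper name introduced here.

```shell
#!/bin/bash
# Hypothetical helper: succeed if a listing read from standard input
# contains a line that is exactly the given file name.
listing_has() {
    # -q: no output, -x: match the whole line, -F: treat the name literally
    grep -qxF -- "$1"
}

# Example usage (commented out; needs a proxy and the gfal tools):
#   gfal-ls gsiftp://se01.dur.scotgrid.ac.uk/dpm/dur.scotgrid.ac.uk/home/pheno/osmith/outputs \
#       | listing_has output.txt && echo "output.txt is present"
```

Matching the whole line avoids a false positive from a similarly named entry such as output.txt.bak.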

For easier access/viewing of files, the SE can be accessed using WebDAV at the following address:

WebDAV

3. LFC

As if using the grid storage wasn't complicated enough, there is a further system that can be used to manage your files. The LFC (Logical File Catalog) allows you to register files stored on multiple SEs (storage elements) at different sites, so that you can view, read and replicate your files from one location. It should only be used if you are planning to use multiple storage elements; otherwise it is unnecessary and only adds complexity.

There is no directory with your username, "/grid/pheno/USERNAME", in the grid storage by default.

To create this directory, or any other directory, execute

gfal-mkdir lfc://se01.dur.scotgrid.ac.uk/grid/pheno/USERNAME

You will note that the same commands as above are used; only the protocol changes from gsiftp to lfc.

The following will display the contents of your new folder on the LFC

gfal-ls lfc://se01.dur.scotgrid.ac.uk/grid/pheno/USERNAME

Rather than copying a file to the LFC, a file is 'registered' with it. This is done using the same command as a copy; however, the source file must already be on an SE (storage element), like the file we uploaded above.

gfal-copy gsiftp://se01.dur.scotgrid.ac.uk/dpm/dur.scotgrid.ac.uk/home/pheno/osmith/input/input.txt lfc://se01.dur.scotgrid.ac.uk/grid/pheno/osmith/input01.txt

Copying from the LFC can be done directly as with the SE.

gfal-copy lfc://se01.dur.scotgrid.ac.uk/grid/pheno/osmith/input01.txt file://$PWD/input01.txt

For easier access/viewing of files, the LFC can be accessed using WebDAV at the following address:

WebDAV