Certificates

A valid certificate is needed to access the grid. Certificates are used to identify users. In addition to holding a valid certificate, every user must also be a member of a virtual organisation (VO). VO membership determines which resources the user is authorised to use.

1. How to apply for a certificate and VO membership

See the instructions here.

2. How to install your certificate

Once you have your certificate, the next step is to log in to a grid UI. This is a computer pre-configured with the grid middleware client tools. The grid middleware is what allows the different resources to talk to each other and makes it possible for users to submit jobs. To log in from IPPP desktop machines, use:

ssh -Y username@gridui1.dur.scotgrid.ac.uk

Your username and password for the grid UI are not necessarily identical to your credentials on the main IPPP system.

Once your certificate application has been approved, export the certificate as a PKCS12 file and copy this .p12 file to the grid UI (these steps are covered in the tutorial on obtaining certificates). To use the certificate for grid submissions, it needs to be converted to a public-private key pair. First, create a directory under your home directory on the grid UI

mkdir .globus

and run the following commands (replace gridcert.p12 with the name of your PKCS12 file):

openssl pkcs12 -nocerts -in gridcert.p12 -out .globus/userkey.pem

openssl pkcs12 -clcerts -nokeys -in gridcert.p12 -out .globus/usercert.pem

The last step is to set the correct permissions:

chmod 444 .globus/usercert.pem

chmod 400 .globus/userkey.pem
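
The permissions set above can be double-checked with a small script. The following is only a sketch: check_globus_perms is a hypothetical helper name (not part of the grid middleware), and it assumes GNU stat as found on Linux machines such as the grid UI.

```shell
#!/bin/bash
# Hypothetical helper (not part of the grid tools): report whether the
# credential files in a .globus directory have the expected permissions.
# Usage: check_globus_perms [directory]   (defaults to ~/.globus)
check_globus_perms() {
    local dir="${1:-$HOME/.globus}"
    local rc=0 entry file want have
    for entry in usercert.pem:444 userkey.pem:400; do
        file="$dir/${entry%%:*}"    # part before the colon: file name
        want="${entry##*:}"         # part after the colon: expected mode
        if [ ! -f "$file" ]; then
            echo "missing: $file"
            rc=1
            continue
        fi
        have=$(stat -c '%a' "$file")    # octal mode, GNU stat syntax
        if [ "$have" != "$want" ]; then
            echo "wrong mode on $file: $have (expected $want)"
            rc=1
        fi
    done
    return "$rc"
}
```

Running check_globus_perms without arguments inspects ~/.globus and returns a non-zero status if a file is missing or has a different mode. Separately, openssl x509 -noout -subject -enddate -in ~/.globus/usercert.pem prints the subject and expiry date of the extracted certificate.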

3. Proxy certificates

Your private key is encrypted and can only be accessed with a passphrase. Once you submit a job, the different grid resources need to communicate with each other, and each communication requires authentication. To reduce the number of times you need to enter your passphrase, the grid security infrastructure supports proxy certificates (delegation). A proxy contains a new short-lived certificate, signed with your own credentials, together with its own private key, and this private key is not encrypted. The slightly lower level of security is acceptable here because proxies have a finite lifetime.

The next step is to create a proxy, which also tests that your certificate is valid and properly installed. The proxy generation is done using the VOMS (Virtual Organisation Management Service) client tools. To create a standard 12h proxy, execute

arcproxy -S pheno -N

where pheno is the name of your virtual organisation. It will prompt you to enter your passphrase and after that it will tell you if the proxy generation succeeded or not. If successful, you should see output similar to this:

Enter pass phrase for private key:
Your identity: /C=UK/O=eScience/OU=Durham/L=eScience/CN=jeppe andersen
Contacting VOMS server (named pheno): voms03.gridpp.ac.uk on port: 15011
Proxy generation succeeded
Your proxy is valid until: 2015-12-09 04:16:25

You can test if you have a valid proxy, and how much time is left, with

arcproxy --info

which should give an output like:
Subject: /C=UK/O=eScience/OU=Durham/L=eScience/CN=jeppe andersen/CN=852084312
Issuer: /C=UK/O=eScience/OU=Durham/L=eScience/CN=jeppe andersen
Identity: /C=UK/O=eScience/OU=Durham/L=eScience/CN=jeppe andersen
Time left for proxy: 11 hours 51 minutes 50 seconds
Proxy path: /tmp/x509up_u1101
Proxy type: X.509 Proxy Certificate Profile RFC compliant impersonation proxy - RFC inheritAll proxy
Proxy key length: 1024
Proxy signature: sha256
====== AC extension information for VO pheno ======
VO        : pheno
subject   : /C=UK/O=eScience/OU=Durham/L=eScience/CN=jeppe andersen
issuer    : /C=UK/O=eScience/OU=Imperial/L=Physics/CN=voms03.gridpp.ac.uk
uri       : voms03.gridpp.ac.uk:15011
attribute : /pheno/Role=NULL/Capability=NULL
Time left for AC: 11 hours 51 minutes 52 seconds

This contains information both on the validity of the certificate and on its association with the VO (pheno in this case).
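
For use in scripts, the remaining lifetime can be extracted from this output. The sketch below assumes the "Time left for proxy:" line has the format shown above (numbers followed by the words days, hours, minutes or seconds); proxy_seconds_left is a hypothetical helper name, and the format may differ between ARC client versions.

```shell
#!/bin/bash
# Hypothetical helper: read `arcproxy --info` output on stdin and print the
# remaining proxy lifetime in seconds. Exits with status 1 if no
# "Time left for proxy:" line is found.
proxy_seconds_left() {
    awk -F': ' '
        /^Time left for proxy:/ {
            # $2 looks like "11 hours 51 minutes 50 seconds"
            n = split($2, w, " ")
            total = 0
            for (i = 1; i < n; i += 2) {
                if (w[i+1] ~ /^day/)    total += w[i] * 86400
                if (w[i+1] ~ /^hour/)   total += w[i] * 3600
                if (w[i+1] ~ /^minute/) total += w[i] * 60
                if (w[i+1] ~ /^second/) total += w[i]
            }
            print total
            found = 1
            exit
        }
        END { if (!found) exit 1 }
    '
}
```

A typical use would be arcproxy --info | proxy_seconds_left, for example to decide in a submission script whether a new proxy should be created before submitting.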

Different grid sites sometimes use different middleware, and not all client tools can communicate with all sites. To see which resources are available to your VO, use

lcg-infosites --vo pheno all

This command will print information about all the resources available to you. The list of compute elements available to pheno users as of 24/10/2018, obtained with the ce option, is

lcg-infosites --vo pheno ce

#   CPU    Free Total Jobs      Running Waiting ComputingElement
----------------------------------------------------------------
  24984     357      25209        24627     582 arc-ce01.gridpp.rl.ac.uk:2811/nordugrid-Condor-EL7
  24984     364      25772        24620    1152 arc-ce01.gridpp.rl.ac.uk:2811/nordugrid-Condor-grid3000M
  24984     346      25526        24638     888 arc-ce02.gridpp.rl.ac.uk:2811/nordugrid-Condor-EL7
  24984     346      26336        24638    1698 arc-ce02.gridpp.rl.ac.uk:2811/nordugrid-Condor-grid3000M
  24984     337      25570        24647     923 arc-ce03.gridpp.rl.ac.uk:2811/nordugrid-Condor-EL7
  24984     308      26456        24676    1780 arc-ce03.gridpp.rl.ac.uk:2811/nordugrid-Condor-grid3000M
  24984     300      25204        24684     520 arc-ce04.gridpp.rl.ac.uk:2811/nordugrid-Condor-EL7
  24984     300      25188        24684     504 arc-ce04.gridpp.rl.ac.uk:2811/nordugrid-Condor-grid3000M
    654       0          0            0       0 ce-test.dur.scotgrid.ac.uk:2811/nordugrid-SLURM-testing
      1       0         25           24       1 ce01.tier2.hep.manchester.ac.uk:2811/nordugrid-Condor-medium
   2280       0         16           16       0 ce02.tier2.hep.manchester.ac.uk:2811/nordugrid-Condor-medium
   1920       1        833          127     706 ce03.tier2.hep.manchester.ac.uk:8443/cream-pbs-long
     43     250        302            0     302 ce04.esc.qmul.ac.uk:8443/cream-slurm-sl6_lcg_1G_long
     43     250       1274          433     841 ce04.esc.qmul.ac.uk:8443/cream-slurm-sl6_lcg_2G_long
    232      81       1291          455     836 ce04.esc.qmul.ac.uk:8443/cream-slurm-sl6_lcg_4G_long
     43     280        302            0     302 ce05.esc.qmul.ac.uk:8443/cream-slurm-sl6_lcg_1G_long
     43     280       1275          432     843 ce05.esc.qmul.ac.uk:8443/cream-slurm-sl6_lcg_2G_long
     23     109       1288          452     836 ce05.esc.qmul.ac.uk:8443/cream-slurm-sl6_lcg_4G_long
     43     131        325            0     325 ce06.esc.qmul.ac.uk:8443/cream-slurm-sl6_lcg_1G_long
     43     131       1298          416     882 ce06.esc.qmul.ac.uk:8443/cream-slurm-sl6_lcg_2G_long
    232      11       1296          311     985 ce06.esc.qmul.ac.uk:8443/cream-slurm-sl6_lcg_4G_long
     43     411        325           23     302 ce07.esc.qmul.ac.uk:8443/cream-slurm-sl6_lcg_1G_long
     43     411       1298          458     840 ce07.esc.qmul.ac.uk:8443/cream-slurm-sl6_lcg_2G_long
     23     228       1296          460     836 ce07.esc.qmul.ac.uk:8443/cream-slurm-sl6_lcg_4G_long
    752       0        726          726       0 ce1.dur.scotgrid.ac.uk:2811/nordugrid-SLURM-ce1
    654       0        874          873       1 ce2.dur.scotgrid.ac.uk:2811/nordugrid-SLURM-ce2
    948       0        525          271     254 ce3.dur.scotgrid.ac.uk:2811/nordugrid-SLURM-ce3
   2604     311          0            0       0 ce3.ppgrid1.rhul.ac.uk:8443/cream-pbs-pheno
   1046       0        530          261     269 ce4.dur.scotgrid.ac.uk:2811/nordugrid-SLURM-ce4
   5109       0       4600         2987    1613 ceprod05.grid.hep.ph.ic.ac.uk:8443/cream-sge-grid.q
   5109       0       4634         2914    1720 ceprod06.grid.hep.ph.ic.ac.uk:8443/cream-sge-grid.q
   5109       0       4610         2962    1648 ceprod07.grid.hep.ph.ic.ac.uk:8443/cream-sge-grid.q
   5109       0       4617         2968    1649 ceprod08.grid.hep.ph.ic.ac.uk:8443/cream-sge-grid.q
   2204     102         56           30      26 cream2.ppgrid1.rhul.ac.uk:8443/cream-pbs-pheno
   1215       0       1257         1215      42 dc2-grid-21.brunel.ac.uk:2811/nordugrid-Condor-default
   2336       0       2275          557     527 dc2-grid-22.brunel.ac.uk:2811/nordugrid-Condor-default
     64       0          2            1       1 dc2-grid-25.brunel.ac.uk:2811/nordugrid-Condor-default
   1440       0       1031          197     260 dc2-grid-26.brunel.ac.uk:2811/nordugrid-Condor-default
    812       0       1224          735     489 dc2-grid-28.brunel.ac.uk:2811/nordugrid-Condor-default
    792      13       1150          779     371 hepgrid2.ph.liv.ac.uk:2811/nordugrid-Condor-grid
   1706    1013        715          693      22 hepgrid5.ph.liv.ac.uk:2811/nordugrid-Condor-grid
   1900       0       1835         1737      98 heplnx206.pp.rl.ac.uk:2811/nordugrid-Condor-grid
   1900       0       2499         1737     762 heplnx207.pp.rl.ac.uk:2811/nordugrid-Condor-grid
   2752       0       3555         2684     871 heplnx208.pp.rl.ac.uk:2811/nordugrid-Condor-grid
      0       0         32           32       0 lcgce1.shef.ac.uk:8443/cream-pbs-pheno
   1040     126         32           32       0 lcgce2.shef.ac.uk:8443/cream-pbs-pheno
   5032       0       4440         3881     559 svr009.gla.scotgrid.ac.uk:2811/nordugrid-Condor-condor_q2d
   5032       0       4683         3882     801 svr010.gla.scotgrid.ac.uk:2811/nordugrid-Condor-condor_q2d
   5032       0       4519         3883     636 svr011.gla.scotgrid.ac.uk:2811/nordugrid-Condor-condor_q2d
   5032       0       4645         3884     761 svr019.gla.scotgrid.ac.uk:2811/nordugrid-Condor-condor_q2d
     96       0         25           24       1 vm3.tier2.hep.manchester.ac.uk:2811/nordugrid-Condor-medium
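
A listing like the one above can be filtered to find compute elements that currently report free slots. This is a minimal sketch that assumes the six-column layout shown (CPU, Free, Total Jobs, Running, Waiting, ComputingElement); ces_with_free_slots is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: read `lcg-infosites --vo pheno ce` output on stdin and
# print the compute elements that report free slots, fewest waiting jobs first.
ces_with_free_slots() {
    # Keep only data rows (numeric first column, six fields) with Free > 0,
    # emit "waiting CE", sort numerically on waiting, then print the CE name.
    awk '$1 ~ /^[0-9]+$/ && NF == 6 && $2 > 0 { print $5, $6 }' \
        | sort -n \
        | awk '{ print $2 }'
}
```

For example, lcg-infosites --vo pheno ce | ces_with_free_slots would print one compute element per line. The numbers published by the information system are only a snapshot, so a free slot reported here is no guarantee that a job starts immediately.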
	  

4. Automated Proxy Renewal

If a job is not going to make it through the queue and finish within 24 hours, its associated proxy needs to be renewed. Proxy certificates can be produced not only with arcproxy etc.; it is also possible to delegate the authority to issue them to a central server. This delegation is valid for a week (i.e. it has to be renewed at least every 168 hours). If the environment variable MYPROXY_SERVER is set to myproxy.cern.ch, you can delegate this right to the server by running

myproxy-init -d -a

You will be prompted for the grid certificate passphrase, and you will have to choose a new passphrase, which every request for a certificate from the myproxy server will have to supply. After running this command, you should be able to get information on the validity of the central proxy delegation with

1:39pm gridui1 183 > myproxy-info -d
username: /C=UK/O=eScience/OU=Durham/L=eScience/CN=jeppe andersen
owner: /C=UK/O=eScience/OU=Durham/L=eScience/CN=jeppe andersen
retrieval policy: *
timeleft: 167:59:37 (7.0 days)
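
Since the delegation has to be refreshed weekly, it can be handy to extract the remaining time from this output. The sketch below assumes the "timeleft: HH:MM:SS (N days)" format shown above; myproxy_hours_left is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: read `myproxy-info -d` output on stdin and print the
# whole hours left on the delegation. Exits with status 1 if no "timeleft:"
# line is found.
myproxy_hours_left() {
    awk '
        $1 == "timeleft:" {
            split($2, t, ":")   # $2 looks like "167:59:37"
            print t[1] + 0      # hours part, as a number
            found = 1
            exit
        }
        END { if (!found) exit 1 }
    '
}
```

For example, myproxy-info -d | myproxy_hours_left prints 167 for the output above, and a reminder could be triggered once the value drops below a threshold of your choice.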

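The next thing to do is to set up a script that automatically retrieves certificates every so often and propagates them to any jobs in queues. First, create a file called longcert.sh with the content

#!/bin/bash
# Retrieve a fresh proxy from the myproxy server (replace 'passphrase' with your myproxy passphrase)
echo passphrase | myproxy-get-delegation -d --stdin_pass -s myproxy.cern.ch:7512
# Add the VOMS information for pheno, reusing the retrieved proxy
voms-proxy-init -voms pheno -noregen
# Collect the jobs submitted to RAL, Glasgow and Durham into ~/jobs.xml
for i in {arc-ce0{1..4}.gridpp.rl.ac.uk,svr0{09,10,11,19}.gla.scotgrid.ac.uk,ce{1..4}.dur.scotgrid.ac.uk}; do arcsync -f -c "$i" -j ~/jobs.xml; done
# Durham-only alternative:
#for i in ce{1..4}.dur.scotgrid.ac.uk; do arcsync -f -c "$i" -j ~/jobs.xml; done
# Propagate the new proxy to all these jobs, in the background
(nohup arcrenew -a -j ~/jobs.xml >& ~/renew.res &)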

'passphrase' here refers to the passphrase chosen for the myproxy delegation. This script first gets a new certificate from the myproxy server. It then adds the VOMS information for pheno, asks for a list of all jobs submitted to RAL, Glasgow and Durham, and finally propagates the new proxy to all these jobs. We now just have to make this script run at regular intervals. cron is your friend. First, make the script executable with chmod 700 longcert.sh. This also ensures that other users cannot read the passphrase written in the file. Next, run crontab -e and add the following lines to make it run every 4 hours ('user' is your username):

MAILTO=""
0 */4 * * * /mt/home/user/longcert.sh

(Note that this uses the editor vi, so you first have to press 'i' for insert mode, type the lines, then hit ESC followed by shift-z, shift-z to save and quit.) This will execute the renewal every 4 hours (at midnight, 4am, 8am, 12 noon, 4pm and 8pm) every day. The reason for choosing 4 hours is that the certificate at job submission is then at most 4 hours old, and therefore there will be at least 2 attempts at certificate renewal before it expires (it is of course possible to run it more frequently if 2 attempts prove insufficient). The only thing one still needs to remember is to upload a new grid proxy to myproxy at least every week. The certificate on gridui1 (if that is where the cron job is installed) will automatically be renewed in this process too, so certificate management is reduced to a task once or twice per week. Enjoy!
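
The reasoning behind the 4-hour interval can be checked with a little arithmetic. The sketch below assumes that each cron run fetches a proxy with a 12-hour lifetime (the standard proxy lifetime mentioned in section 3), that cron is the only source of new proxies, and that all times are whole hours; guaranteed_renewals is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: number of cron-driven renewal attempts that are
# guaranteed to fall strictly before a job's proxy expires.
#   $1 = proxy lifetime in hours (assumed to be 12 here)
#   $2 = interval between cron runs in hours
# The proxy is fetched at one cron run; later runs happen 1x, 2x, ... the
# interval after that, and only runs strictly before the lifetime count.
guaranteed_renewals() {
    local lifetime=$1 interval=$2
    echo $(( (lifetime - 1) / interval ))
}

guaranteed_renewals 12 4   # 4-hourly cron, as in the crontab line above
guaranteed_renewals 12 8   # 8-hourly cron, for comparison
```

Under these assumptions the 4-hourly schedule gives 2 attempts (4 and 8 hours after the proxy was fetched), whereas an 8-hourly schedule would leave only 1.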