Software Installation Management for VIVE Application



This Wiki page is part of the SEE-GRID Gridification Guide. It was contributed by Belgrade University Computer Centre.


Introduction

The following scripts are used to install, validate and remove the VIVE application. Treat them as examples that you can modify to suit your needs and the requirements of your application. The submission scripts below take a full CE name as their argument:

site_VIVE_install grid01.rcub.bg.ac.yu:2119/jobmanager-pbs-seegrid
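
Since every submission script depends on this argument, it can help to reject a malformed CE name before any job is created. The following is a minimal sketch, not part of the original scripts: the function name is_full_ce_name and the exact pattern (host, port, jobmanager queue) are assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original scripts): accept only
# arguments shaped like <host>:<port>/<jobmanager-queue>.
is_full_ce_name() {
    [[ "$1" =~ ^[A-Za-z0-9.-]+:[0-9]+/[A-Za-z0-9_.-]+$ ]]
}

# Example with the CE name used on this page
is_full_ce_name "grid01.rcub.bg.ac.yu:2119/jobmanager-pbs-seegrid" \
    && echo "CE name looks valid"
```

A submission script could call this on $1 and stop with a usage message when the check fails.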

Installation is performed using the ESM (Experimental Software Manager) user class. Usage instructions, the benefits of the ESM user class, and general software management problems and issues are covered in the Software Installation Management Guide.

Proxy Certificate Creation

First, create a proxy certificate as an ESM user. Example for the SEEGRID VO:

$ voms-proxy-init -voms seegrid:/seegrid/Role=sgmadmin
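
Before submitting an installation job it may be worth confirming that the proxy really carries the software manager role. The sketch below is an illustration only: has_sgmadmin_role is a hypothetical helper, and it assumes the FQANs are supplied on standard input, one per line, for example from voms-proxy-info -fqan.

```shell
#!/bin/bash
# Hypothetical helper: succeeds if any FQAN read from standard input
# carries the sgmadmin role of the seegrid VO.
has_sgmadmin_role() {
    grep -q '^/seegrid/Role=sgmadmin'
}

# Intended use (assumes voms-proxy-info is installed and a proxy exists):
#   voms-proxy-info -fqan | has_sgmadmin_role || echo "no ESM proxy" >&2
```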

Installation

The script site_VIVE_install dynamically creates the JDL and submits the installation job to the CE given as a parameter on invocation, thus avoiding the .BrokerInfo problem of "edg-job-submit -r":

#!/bin/bash

#args: CE
echo VIVE install Startup $*

CE=$1
VER=`cat /storage/VIVE/programs/version_VIVE`

# do not use "edg-job-submit -r" because of ./BrokerInfo

cat > site_VIVE_install_tmp.jdl <<EOF
 Executable = "/opt/lcg/bin/lcg-ManageSoftware";
 InputSandbox = {"/storage/VIVE/programs/install_sw"};
 OutputSandbox={"out", "err"};
 StdOutput="out";
 StdError="err";
 Requirements = other.GlueCEUniqueID == "$CE" \
   && !Member("VO-seegrid-vive-$VER",other.GlueHostApplicationSoftwareRunTimeEnvironment) \
   && !Member("VO-seegrid-vive-$VER-to-be-validated",other.GlueHostApplicationSoftwareRunTimeEnvironment) \
   && !Member("VO-seegrid-vive-$VER-processing-validate",other.GlueHostApplicationSoftwareRunTimeEnvironment) \
   && !Member("VO-seegrid-vive-$VER-processing-install",other.GlueHostApplicationSoftwareRunTimeEnvironment) \
   && !Member("VO-seegrid-vive-$VER-aborted-validate",other.GlueHostApplicationSoftwareRunTimeEnvironment) \
   && !Member("VO-seegrid-vive-$VER-aborted-install",other.GlueHostApplicationSoftwareRunTimeEnvironment);
 Arguments = "--install --vo seegrid --tag vive-$VER --notify Branko.Marovic@rcub.bg.ac.yu";
 VirtualOrganisation="seegrid";
 RetryCount=0;
EOF

edg-job-list-match site_VIVE_install_tmp.jdl
edg-job-submit -o /storage/VIVE/programs/INST-`echo "$CE" | sed 's/\//_/'` site_VIVE_install_tmp.jdl

#cat site_VIVE_install_tmp.jdl
rm -f site_VIVE_install_tmp.jdl
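
The Requirements expression in this JDL excludes every CE that already publishes a tag for the version in any of its states. The repeated !Member(...) clauses can be sketched as a loop. This is only an illustration: build_install_requirements is a hypothetical helper, and the list of suffixes is copied from the JDL above.

```shell
#!/bin/bash
# Hypothetical helper: print the Requirements expression for an install job.
# $1 = full CE name, $2 = VIVE version
build_install_requirements() {
    local ce=$1 ver=$2 suffix
    local attr=other.GlueHostApplicationSoftwareRunTimeEnvironment
    printf 'other.GlueCEUniqueID == "%s"' "$ce"
    # the empty suffix stands for the final, validated tag
    for suffix in "" -to-be-validated -processing-validate -processing-install \
                  -aborted-validate -aborted-install; do
        printf ' && !Member("VO-seegrid-vive-%s%s",%s)' "$ver" "$suffix" "$attr"
    done
    printf ';\n'
}

build_install_requirements "grid01.rcub.bg.ac.yu:2119/jobmanager-pbs-seegrid" 0.4.3
```

The output is a single line, so no backslash continuations are needed inside the here-document.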

The install_sw script checks VO_SEEGRID_SW_DIR to verify that the site provides a shared file system, and verifies that the job runs under the software manager account. It also removes junk left behind by the uninstallation process and by previous installation attempts. Note that, as written, it also destroys valid old software installations:

#!/bin/bash
#required
#  voms-proxy-init -voms seegrid:/seegrid/Role=sgmadmin
#check tags with
#  lcg-infosites --vo seegrid tag
#  lcg-info --vo seegrid --list-ce --attrs Tag --query 'Tag=*VO-seegrid-vive*'
#  lcg-ManageVOTag -host <Target-CE> -vo seegrid --list
#manage with
#  yes yes | lcg-ManageVOTag -vo seegrid --remove -tag VO-seegrid-vive-0.4.3-aborted-validate -host Target-CE

export VER=0.4.3

echo "############ starting install_sw $VER ############"
echo "Running on: $HOSTNAME(`hostname -i`) as `whoami`"
echo Belonging to CE: `$EDG_LOCATION/bin/edg-brokerinfo getCE`

if [ x$VO_SEEGRID_SW_DIR = 'x' ]; then #failure?
   echo "VO_SEEGRID_SW_DIR is not set in the environment"
   echo "############ ending install_sw $VER returning 1 ############"
   exit 1
fi

if [ $VO_SEEGRID_SW_DIR = '.' ]; then #failure?
   echo "The site does not provide a shared file system: VO_SEEGRID_SW_DIR=$VO_SEEGRID_SW_DIR"
   echo "############ ending install_sw $VER returning 1 ############"
   exit 1
fi

if [ ! X`whoami` = 'Xseegridsgm'  -a ! X`whoami` = 'Xsegrisgm' ]; then #failure?
   echo "invalid username" `whoami`
   echo "############ ending install_sw $VER returning 1 ############"
   exit 1
fi

export TAR_LOC=`pwd`  #the temporary job directory that is supposed to
                      #hold the steering script and the tarballs

#this also tests outbound access to VIVE portal
wget http://grid02.rcub.bg.ac.yu/VIVE/seegrid-vive-$VER.tar.gz
                      #in this case the script doesn't need tarball from the grid but
                      #it fetches from the WEB requiring OUTBOUND connectivity
                      #it is needed anyway for VIVE software to run

cd $VO_SEEGRID_SW_DIR #go to software installation root directory
pwd
ls -al                #and check its content

#remove previously created junk caused by uninstall
#this is related to undocumented lcg-ManageSoftware behavior
rm -r -f seegrid-vive-0.1*_tmp 
rm -r -f seegrid-vive-0.2*_tmp 
rm -r -f seegrid-vive-0.3*_tmp
rm -r -f seegrid-vive-0.4.1_tmp
rm -r -f seegrid-vive-0.4.2_tmp
rm -f seegrid.*

rm -r -f -v vive-$VER #remove only the current version of the software

#for initial development and testing only:
#remove previously created installation attempts
#this also destroys valid old software versions which we may want to preserve
echo "#cleaning $VO_SEEGRID_SW_DIR"
rm -r -f vive-_tmp
rm -r -f vive-0.1*
rm -r -f vive-0.2*
#cannot rm -r -f *vive-0.* because of seegrid-vive-$VER_tmp
rm -r -f vive-0.3*
rm -r -f vive-0.4.1*
rm -r -f vive-0.4.2*
rm -r -f vive-$VER
ls -al                #and check its content

mkdir vive-$VER
if [ ! $? = 0 ]; then #failure?
   echo "Directory vive-$VER already exists in" `pwd`
   echo "############ ending install_sw $VER returning 1 ############"
   exit 1
fi

cd vive-$VER          #create installation directory
echo "Installing into" `pwd`

echo "running the command: tar xzvf $TAR_LOC/seegrid-vive-$VER.tar.gz"
tar xzvf $TAR_LOC/seegrid-vive-$VER.tar.gz
if [ ! $? = 0 ]; then #failure?
   echo "Failure unpacking $TAR_LOC/seegrid-vive-$VER.tar.gz"
   echo "############ ending install_sw $VER returning 1 ############"
   exit 1
fi

echo "Installation into " `pwd` " finished"
ls -al                #list what has been installed
echo "############ ending install_sw $VER returning 0 ############"
exit 0 #This is the relevant return code; all failures exit earlier
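
Scripts like install_sw report their result through the exit status, and in bash $? always refers to the most recently executed command, so any echo placed before "exit $?" resets it to the status of that echo. A minimal sketch of saving the status first follows. The helper end_banner is hypothetical, and false merely stands in for a failing step.

```shell
#!/bin/bash
# Hypothetical helper: print the closing banner and hand back the given
# status, so no intervening command can overwrite it.
end_banner() {
    local rc=$1
    echo "############ ending install_sw returning $rc ############"
    return "$rc"
}

rc=0
false || rc=$?                  # "false" stands in for a failing step; save its status at once
echo "some diagnostic output"   # this echo resets $? to 0
end_banner "$rc" || echo "saved status was $rc"
```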

Validation

The script site_VIVE_validate dynamically creates the JDL and submits the validation job to the CE given as a parameter on invocation, thus avoiding the .BrokerInfo problem of "edg-job-submit -r":

#!/bin/bash

#args: CE
echo VIVE validate Startup $*

CE=$1
VER=`cat /storage/VIVE/programs/version_VIVE`

# do not use "edg-job-submit -r" because of ./BrokerInfo

cat > site_VIVE_validate_tmp.jdl <<EOF
 Executable = "/opt/lcg/bin/lcg-ManageSoftware";
 InputSandbox = {"/storage/VIVE/programs/validate_sw"};
 OutputSandbox = {"out", "err"};
 StdOutput="out";
 StdError="err";
 #InputData = {"lfn:/grid/seegrid/VIVE/PACSimages/seegrid/SoftwareValidation.CSFE"};
 #DataAccessProtocol={"rfio","gsidcap"};
 Requirements = other.GlueCEUniqueID == "$CE" && (Member("VO-seegrid-vive-$VER-to-be-validated",other.GlueHostApplicationSoftwareRunTimeEnvironment));
 Arguments = "--validate --vo seegrid --tag vive-$VER --notify Branko.Marovic@rcub.bg.ac.yu --validate_script validate_sw";
 RetryCount=0;
 VirtualOrganisation="seegrid";
EOF

edg-job-list-match  site_VIVE_validate_tmp.jdl
edg-job-submit  -o /storage/VIVE/programs/VAL-`echo "$CE" | sed 's/\//_/'` site_VIVE_validate_tmp.jdl
rm -f site_VIVE_validate_tmp.jdl
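
Both submission scripts store the job identifier in a file named after the CE, with the slash replaced by an underscore. The same name can be built with bash parameter expansion instead of sed. This is a sketch only: jobid_file is a hypothetical helper, and unlike the sed expression above it replaces every slash, not just the first.

```shell
#!/bin/bash
# Hypothetical helper: build the job-ID file name for a given prefix and CE.
# ${ce//\//_} replaces every "/" in the CE name with "_".
jobid_file() {
    local prefix=$1 ce=$2
    printf '/storage/VIVE/programs/%s-%s\n' "$prefix" "${ce//\//_}"
}

jobid_file VAL "grid01.rcub.bg.ac.yu:2119/jobmanager-pbs-seegrid"
```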

The validate_sw script checks VO_SEEGRID_SW_DIR to verify that the site provides a shared file system. It also checks the lcg-cp command, possible routing problems, and FieldServer operation:

#!/bin/bash
#see the comments at the beginning of install_sw

export VER=0.4.3

#set up LFC/GFAL environment
export LFC_HOST=grid02.rcub.bg.ac.yu
export LCG_CATALOG_TYPE=lfc
export LCG_GFAL_VO=seegrid

export CE=`$EDG_LOCATION/bin/edg-brokerinfo getCE`

echo "############ starting validate_sw $VER ############"
echo "# VO_SEEGRID_SW_DIR (e.g. /opt/exp_soft/seegrid) should be writable only by seegridsgm user"
echo "# and only seegridsgm should be able to manage published application tags"
echo "Running job $EDG_WL_JOBID on: $HOSTNAME(`hostname -i`) as `whoami`"
echo "VO_SEEGRID_SW_DIR (e.g. /opt/exp_soft/seegrid) should be writable only by seegridsgm user"
echo " and only seegridsgm should be able to manage published application tags"
echo "VO_SEEGRID_SW_DIR is $VO_SEEGRID_SW_DIR"
echo Belonging to CE: $CE
echo Close SE is $VO_SEEGRID_DEFAULT_SE


if [ x$VO_SEEGRID_SW_DIR = 'x' ]; then #failure?
   echo "VO_SEEGRID_SW_DIR is not set in the environment"
   echo "############ ending validate_sw $VER returning 1 ############"
   exit 1
fi

if [ $VO_SEEGRID_SW_DIR = '.' ]; then #failure?
   echo "The site does not provide a shared file system: VO_SEEGRID_SW_DIR=$VO_SEEGRID_SW_DIR"
   echo "############ ending validate_sw $VER returning 1 ############"
   exit 1
fi

#useful in diagnosing sites that do not pass the validation
echo "### Checking lcg-cp"
lcg-cp --vo seegrid -v lfn:/grid/seegrid/VIVE/PACSimages/seegrid/SoftwareValidation.CSFE  file:`pwd`/SoftwareValidation.CSFE
if [ ! $? = 0 ]; then #failure?
   echo "Command lcg-cp does not work, suspicious but not required"
fi
ls -ld `pwd`
ls -al
rm -f `pwd`/SoftwareValidation.CSFE

cd $VO_SEEGRID_SW_DIR #software installation root directory

echo "### Listing `pwd`"
ls -alt
cd vive-$VER
if [ ! $? = 0 ]; then #failure?
   echo "Directory vive-$VER does not exist in" `pwd`
   echo "############ ending validate_sw $VER returning 1 ############"
   exit 1
fi

echo "### checking possible routing problem"
traceroute grid02.rcub.bg.ac.yu 

echo "### Checking FieldServer operation"
echo "# This starts a server process that loads a file from the closeSE using RFIO"
echo "# based on the information from SEEGRID LFC, then connects to TCP port 443 on"
echo "# grid02.rcub.bg.ac.yu and, if OK, exits returning 113"
./FieldServer $CE "$HOSTNAME(`hostname -i`)" \
 p grid02.rcub.bg.ac.yu 443 grid02.rcub.bg.ac.yu ANY \
 . /VIVE-cache3D . . 300000000 0 SoftwareValidation@seegrid

RC=$?                 #save FieldServer's exit status before it is overwritten
if [ $RC = 113 ]; then
   echo "Dataset load and binder connectivity test passed, installation validated"
   echo "############ ending validate_sw $VER returning 0 ############"
   exit 0
fi

echo "FieldServer returned $RC"
echo "Validation failed"
echo "############ ending validate_sw $VER returning 255 ############"
exit 255
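
The last step of validate_sw translates FieldServer's exit status into the result of the validation job: 113 means success, anything else means failure. That mapping can be isolated in a small function, sketched below. The name validation_result is hypothetical, while the messages and status values are taken from the script above.

```shell
#!/bin/bash
# Hypothetical helper: translate FieldServer's exit status into the
# validation job's exit status (113 means the test passed).
validation_result() {
    local rc=$1
    if [ "$rc" -eq 113 ]; then
        echo "Dataset load and binder connectivity test passed, installation validated"
        return 0
    fi
    echo "FieldServer returned $rc"
    echo "Validation failed"
    return 255
}

# Intended use inside validate_sw, directly after the FieldServer call:
#   validation_result $? ; exit $?
```

Passing the status as an argument keeps it from being overwritten by the test or by the diagnostic messages.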

Removal

The script uninstall_VIVE dynamically creates the JDL and submits the removal job to the CE given as a parameter on invocation, thus avoiding the .BrokerInfo problem of "edg-job-submit -r". The JDL has the following form:

Executable = "/opt/lcg/bin/lcg-ManageSoftware";
InputSandbox = {"/storage/programs/VIVE/uninstall_sw"};
OutputSandbox = {"out", "err"};
stdoutput = "out";
stderror = "err";
Arguments = "--uninstall --vo seegrid --tag seegrid-vive-0.2 --uninstall_script uninstall_sw";
Requirements = (Member("VO-seegrid-vive-0.2",other.GlueHostApplicationSoftwareRunTimeEnvironment)
 && (other.GlueCEUniqueID == "<Target-CE>:2119/jobmanager-lcgpbs-seegrid"));
VirtualOrganisation="seegrid";
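
The page shows only the JDL, not the uninstall_VIVE wrapper itself. A rough sketch of how such a JDL could be generated for a given CE and version, modeled on site_VIVE_install, is given below. Everything in it is an assumption made for illustration: the function name make_uninstall_jdl, the use of the vive-$ver tag convention from the install and validate jobs (the JDL above uses seegrid-vive-0.2 instead), and the added RetryCount line.

```shell
#!/bin/bash
# Hypothetical sketch of a JDL generator for removal jobs.
# $1 = full CE name, $2 = VIVE version
make_uninstall_jdl() {
    local ce=$1 ver=$2
    cat <<EOF
Executable = "/opt/lcg/bin/lcg-ManageSoftware";
InputSandbox = {"/storage/programs/VIVE/uninstall_sw"};
OutputSandbox = {"out", "err"};
StdOutput = "out";
StdError = "err";
Arguments = "--uninstall --vo seegrid --tag vive-$ver --uninstall_script uninstall_sw";
Requirements = other.GlueCEUniqueID == "$ce" && Member("VO-seegrid-vive-$ver",other.GlueHostApplicationSoftwareRunTimeEnvironment);
VirtualOrganisation = "seegrid";
RetryCount = 0;
EOF
}

make_uninstall_jdl "grid01.rcub.bg.ac.yu:2119/jobmanager-pbs-seegrid" 0.4.3
```

The output could be redirected to a temporary file and passed to edg-job-submit in the same way as in the installation script.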

The script uninstall_sw checks VO_SEEGRID_SW_DIR to verify that the site provides a shared file system, then changes to the software installation root directory and removes the installed version:

#!/bin/bash
#see the comments at the beginning of install_sw

export VER=0.3

echo "############ starting uninstall_sw $VER ############"
echo "Running on: $HOSTNAME(`hostname -i`) as `whoami`"
echo Belonging to CE: `$EDG_LOCATION/bin/edg-brokerinfo getCE`

if [ x$VO_SEEGRID_SW_DIR = 'x' ]; then #failure?
   echo "VO_SEEGRID_SW_DIR is not set in the environment"
   echo "############ ending uninstall_sw $VER returning 1 ############"
   exit 1
fi

if [ $VO_SEEGRID_SW_DIR = '.' ]; then #failure?
   echo "The site does not provide a shared file system: VO_SEEGRID_SW_DIR=$VO_SEEGRID_SW_DIR"
   echo "############ ending uninstall_sw $VER returning 1 ############"
   exit 1
fi

cd $VO_SEEGRID_SW_DIR #go to software installation root directory
ls -al                #and check its content

rm -f -v -R vive-$VER #remove only the current version of the software
if [ ! $? = 0 ]; then #failure?
   echo "Removal of vive-$VER from" `pwd` "failed"
   echo "############ ending uninstall_sw $VER returning 1 ############"
   exit 1
fi
#optional further steps

echo "Removal of vive-$VER from" `pwd` "finished"
echo "############ ending uninstall_sw $VER returning 0 ############"
exit 0 # This is the relevant return code
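
The install, validate and uninstall scripts all begin with the same two checks of VO_SEEGRID_SW_DIR. They could be factored into one function, as in the following sketch. The name check_sw_dir is hypothetical, and quoting the variable is a small hardening that the original tests do not have.

```shell
#!/bin/bash
# Hypothetical helper: verify that the site advertises a shared software area.
check_sw_dir() {
    if [ -z "$VO_SEEGRID_SW_DIR" ]; then
        echo "VO_SEEGRID_SW_DIR is not set in the environment"
        return 1
    fi
    if [ "$VO_SEEGRID_SW_DIR" = "." ]; then
        echo "The site does not provide a shared file system: VO_SEEGRID_SW_DIR=$VO_SEEGRID_SW_DIR"
        return 1
    fi
    return 0
}

# Example: a site without a shared file system (run in a subshell so the
# caller's environment is left untouched)
( VO_SEEGRID_SW_DIR=.; check_sw_dir ) || echo "check failed, a real script would exit 1 here"
```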

Running Installed Application

After an application is installed, its jobs should be run using the appropriate VOMS group, for example:

voms-proxy-init -voms seegrid:/seegrid/RS/App/VIVE 

In the JDL, users should specify the tag corresponding to the software and version they need. The job JDL should require the presence of the tag using the "Requirements" attribute:

Requirements = Member("VO-seegrid-vive-0.4.3",other.GlueHostApplicationSoftwareRunTimeEnvironment);

Such jobs will run only at sites where the software has been properly installed and validated. The requirement also limits the sites listed by the edg-job-list-match and glite-job-list-match commands.
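
Whether a site qualifies depends on the final tag being published, not one of the intermediate tags such as the -to-be-validated variant. A minimal sketch of that distinction follows. The helper has_validated_tag is hypothetical, and it assumes the site's tags arrive on standard input one per line; the actual output format of the tag listing tools may differ.

```shell
#!/bin/bash
# Hypothetical helper: succeeds only if the exact, final tag for the given
# version appears among the tags read from standard input (one per line).
# $1 = VIVE version
has_validated_tag() {
    local ver=$1
    grep -Fqx "VO-seegrid-vive-$ver"
}

# Example: a site still waiting for validation does not qualify
printf 'VO-seegrid-vive-0.4.3-to-be-validated\n' | has_validated_tag 0.4.3 \
    || echo "0.4.3 is not validated on this site"
```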

The installed software should not be referenced directly from the "Executable" JDL attribute, since the value of the VO_SEEGRID_SW_DIR environment variable may differ between sites. It is better to submit and run a script that references the installed program(s) or data files, for example:

#!/bin/bash
VER=0.4.3
...
$VO_SEEGRID_SW_DIR/vive-$VER/FieldServer ...

An alternative is to do

cd $VO_SEEGRID_SW_DIR/vive-$VER
./FieldServer ...

In the second case, the directory of the installed software becomes the current one, which could cause problems with InputSandbox and OutputSandbox usage. Also, the job will not be able to write to the software directory unless the install script explicitly allows it.
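
The first approach, which keeps the job in its own working directory, can be sketched with a small helper that resolves the program path and fails early when the program is not there. The name vive_program is hypothetical, and the check for an executable file is an added precaution rather than something the original scripts do.

```shell
#!/bin/bash
# Hypothetical helper: print the full path of an installed VIVE program,
# or fail if it is missing or not executable.
# $1 = VIVE version, $2 = program name
vive_program() {
    local path="$VO_SEEGRID_SW_DIR/vive-$1/$2"
    if [ ! -x "$path" ]; then
        echo "$path is missing or not executable" >&2
        return 1
    fi
    printf '%s\n' "$path"
}

# Intended use in a job script, staying in the job's working directory:
#   prog=$(vive_program 0.4.3 FieldServer) || exit 1
#   "$prog" ...
```

Because the job never changes directory, the files named in InputSandbox and OutputSandbox are found where the middleware expects them.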
