Software Installation Management for PROPEL Application
This wiki page is part of the SEE-GRID Gridification Guide. It was contributed by the Belgrade University Computer Centre.
Introduction
This page explains how to install the PROPEL application in the grid environment. PROPEL (Asteroid Proper Elements) is an application designed to calculate proper elements for asteroids in the asteroid belt of the Solar System. It is developed by the Astronomical Observatory of Belgrade, Serbia. Gridification of the application, which includes the scripts in this article, was done in collaboration between the Astronomical Observatory of Belgrade and the Belgrade University Computer Centre.
This document can also serve as a tutorial on how to install an arbitrary application on the grid. All the scripts in this document have been written as generically as possible so that they are simple to modify for other applications. The installation procedures use the ESM (Experimental Software Manager) user class. Usage instructions, the benefits of the ESM user class, and general software management problems and issues are covered in the Software Installation Management Guide.
Proxy Certificate Creation
The first thing you need to do is create a proxy certificate as an ESM user. Example for the SEEGRID VO:
$ voms-proxy-init -voms seegrid:/seegrid/Role=sgmadmin
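Before submitting management jobs it can help to confirm that the proxy still has enough lifetime left. The following is a minimal sketch and not part of the original scripts: the function name proxy_has_time and the one-hour default threshold are assumptions, and the commented usage presumes that voms-proxy-info -timeleft prints the remaining lifetime in seconds.

```shell
#!/bin/bash
# Hypothetical helper: succeeds if SECONDS_LEFT is at least MIN_SECONDS.
proxy_has_time() {
    local seconds_left=${1:-}
    local min_seconds=${2:-3600}   # assumed threshold: one hour
    # Reject empty or non-numeric input (for example when no proxy exists).
    case $seconds_left in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$seconds_left" -ge "$min_seconds" ]
}

# Intended use on a UI machine (commented out: requires the VOMS clients):
# if ! proxy_has_time "$(voms-proxy-info -timeleft 2>/dev/null)"; then
#     voms-proxy-init -voms seegrid:/seegrid/Role=sgmadmin
# fi
```

The numeric check is kept separate from the call to the VOMS client so that the decision logic can be tried out on any machine.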
Software Management
General Notes
Each script takes one optional argument, the name of the target CE (Computing Element). If it is not specified, the WMS (Workload Management System) will try to find a CE that meets the criteria. Running the scripts without a target CE is not recommended and should be avoided. One reason is that the IS (Information Service) usually takes a few minutes to update the tags for installed software, so the WMS may match against stale tags. Also, you will almost always want to specify the exact CE where a removal is performed. To check the status of software tags in the IS, use the lcg-infosites utility. Example for the SEEGRID VO:
$ lcg-infosites --vo seegrid tag
This lists the CEs and the tags for installed experimental software.
To list all available CEs, use the lcg-infosites utility again. Example for the SEEGRID VO:
$ lcg-infosites --vo seegrid CE
The CEs listed are the values to pass to the scripts as the target CE argument. All scripts automatically generate the JDL files that handle the software management procedures.
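The scripts on this page match on several tag variants (the plain tag plus suffixes such as to-be-validated or aborted-install). As a rough illustration of how those variants relate, the sketch below classifies a single tag string. The function name tag_state and the labels it prints are assumptions made for this example; it does not parse real lcg-infosites output.

```shell
#!/bin/bash
# Hypothetical helper: print the state encoded in a published software tag.
# BASE is the plain tag, e.g. VO-seegrid-propel-1.1.
tag_state() {
    local tag=$1 base=$2
    case $tag in
        "$base")                      echo "installed" ;;
        "$base-to-be-validated")      echo "to-be-validated" ;;
        "$base-processing-install")   echo "processing-install" ;;
        "$base-processing-validate")  echo "processing-validate" ;;
        "$base-aborted-install")      echo "aborted-install" ;;
        "$base-aborted-validate")     echo "aborted-validate" ;;
        *)                            echo "unrelated"; return 1 ;;
    esac
}

tag_state "VO-seegrid-propel-1.1-to-be-validated" "VO-seegrid-propel-1.1"
```

A tag in the to-be-validated state is what the validation job looks for, while the plain tag is what removal and regular user jobs require.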
Installation
The script install-propel creates a temporary JDL file and submits it to the grid:
#!/bin/bash
#args: CE
CE=$1
VER='1.1'
APP='propel'
VO='seegrid'
NOTIFY_EMAIL="milan (d) potocnik (a) rcub (d) bg (d) ac (d) yu" # obfuscated here; use a real address
# braces are required: $APP_install_tmp would be read as a different (unset) variable
JDL="site_${APP}_install_tmp.jdl"
echo "$APP-$VER install Startup $*"
QUERY="other.GlueCEUniqueID == \"$CE\" &&"
if [ -z "$CE" ]; then
echo "Computing element not specified. Trying without it..."
QUERY=""
fi
cat > "$JDL" <<EOF
Executable = "/opt/lcg/bin/lcg-ManageSoftware";
InputSandbox = {"install_sw"};
OutputSandbox = {"out", "err"};
StdOutput = "out";
StdError = "err";
Requirements = $QUERY !Member("VO-$VO-$APP-$VER", other.GlueHostApplicationSoftwareRunTimeEnvironment) \
&& !Member("VO-$VO-$APP-$VER-to-be-validated", other.GlueHostApplicationSoftwareRunTimeEnvironment) \
&& !Member("VO-$VO-$APP-$VER-processing-validate", other.GlueHostApplicationSoftwareRunTimeEnvironment) \
&& !Member("VO-$VO-$APP-$VER-processing-install", other.GlueHostApplicationSoftwareRunTimeEnvironment) \
&& !Member("VO-$VO-$APP-$VER-aborted-validate", other.GlueHostApplicationSoftwareRunTimeEnvironment) \
&& !Member("VO-$VO-$APP-$VER-aborted-install", other.GlueHostApplicationSoftwareRunTimeEnvironment);
Arguments = "--install --vo $VO --tag $APP-$VER --notify $NOTIFY_EMAIL --install_script install_sw";
VirtualOrganisation = "$VO";
EOF
edg-job-list-match "$JDL"
mkdir -p jids # make sure the directory for the job ID file exists
edg-job-submit -o jids/jid "$JDL"
#echo "Debug mode, listing created jdl file..."
#cat "$JDL"
echo "Deleting created jdl file..."
rm -f "$JDL"
echo "Installation request sent."
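Two shell details matter when adapting a submission script like this one to another application: the optional CE clause, and variable names that are directly followed by an underscore in a file name. The sketch below illustrates both. The helper name ce_query and the variables unbraced and braced are assumptions introduced only for this example.

```shell
#!/bin/bash
# Hypothetical helper: print the CE part of the Requirements expression,
# or nothing when no CE was given.
ce_query() {
    local ce=${1:-}
    if [ -n "$ce" ]; then
        printf 'other.GlueCEUniqueID == "%s" &&' "$ce"
    fi
}

APP='propel'
# Without braces bash looks up a variable named APP_install_tmp (unset here),
# so the application name silently disappears from the file name.
unbraced="site_$APP_install_tmp.jdl"
braced="site_${APP}_install_tmp.jdl"
echo "$unbraced"   # site_.jdl
echo "$braced"     # site_propel_install_tmp.jdl

ce_query "cluster1.csk.kg.ac.yu:2119/jobmanager-pbs-seegrid"; echo
```

Because every application would end up with the same site_.jdl name in the unbraced form, two submission scripts run in the same directory could overwrite each other's temporary files.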
The script install_sw performs the actual installation on the target CE and is called by lcg-ManageSoftware:
#!/bin/bash
# required
# voms-proxy-init -voms seegrid:/seegrid/Role=sgmadmin
# check tags with
# lcg-infosites --vo seegrid tag
VER='1.1'
VO='seegrid'
VODIR=$VO_SEEGRID_SW_DIR
APP='propel'
INSTALL_URL="http://galeb.etf.bg.ac.yu/~potocnik/grid/install/$APP-$VER/$APP-$VER.tar.gz"
echo "############ starting install_sw $APP-$VER ############"
echo "Running on: $HOSTNAME(`hostname -i`) as `whoami`"
echo Belonging to CE: `$EDG_LOCATION/bin/edg-brokerinfo getCE`
if [ -z "$VODIR" ]; then #failure?
echo "VO_`echo $VO|tr '[a-z]' '[A-Z]'`_SW_DIR is not set in the environment"
echo "############ ending install_sw $APP-$VER returning 1 ############"
exit 1
fi
if [ "$VODIR" = '.' ]; then #failure?
echo "The site does not provide a shared file system: VO_`echo $VO|tr '[a-z]' '[A-Z]'`_SW_DIR=$VODIR"
echo "############ ending install_sw $APP-$VER returning 1 ############"
exit 1
fi
export TAR_LOC=`pwd` # temporary job directory that holds the steering
# script and the tarballs
wget $INSTALL_URL # fetch the tarball from the web; this requires outbound
# connectivity, which this software needs anyway.
# This way the installation files are not needed in the input sandbox.
# Where there is no outbound connectivity, another option is to
# include the tarballs in the input sandbox.
cd $VODIR # go to software installation root directory
pwd
ls -al # and check its content
# remove previously created junk caused by uninstall
# this is related to undocumented lcg-ManageSoftware behavior
rm -r -f $VO-$APP-$VER*_tmp
rm -f $VO.*
rm -r -f -v $APP-$VER # remove only the current version of the software
# for initial development and testing only:
# remove previously created installation attempts
# this also destroys valid old software versions which we may want to preserve
rm -r -f $APP-_tmp
ls -al # and check its content
mkdir $APP-$VER # create installation directory
if [ ! $? = 0 ]; then # failure?
echo "Directory $APP-$VER already exists in" `pwd`
echo "############ ending install_sw $VER returning 1 ############"
exit 1
fi
chmod 775 $APP-$VER # allow all users in sgmadmin group full access to the dir,
# should be the default behavior
cd $APP-$VER
echo "Installing into" `pwd`
echo "running the command: tar xzvf $TAR_LOC/$APP-$VER.tar.gz"
tar xzvf $TAR_LOC/$APP-$VER.tar.gz
if [ ! $? = 0 ]; then #failure?
echo "Failure unpacking $TAR_LOC/$APP-$VER.tar.gz"
echo "############ ending install_sw $APP-$VER returning 1 ############"
exit 1
fi
echo "Installation into " `pwd` " finished"
ls -al #list what has been installed
echo "############ ending install_sw $APP-$VER returning 0 ############"
exit 0 # every failure path above exits with 1, so reaching this line means success
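When a script reports a return code and then exits with it, the status has to be saved first, because $? always reflects the most recent command, including an echo or ls. The following minimal sketch shows one way to do that. The wrapper name run_step is an assumption for this example and is not used by lcg-ManageSoftware.

```shell
#!/bin/bash
# Hypothetical wrapper: run a command, report its exit status, and return it.
run_step() {
    local label=$1
    shift
    "$@"
    local rc=$?          # save immediately; any later command overwrites $?
    echo "step '$label' returned $rc"
    return "$rc"
}

run_step "always-ok" true
run_step "always-fails" false || echo "caller saw a failure"
```

In an installation script the same pattern could wrap the tar or wget call, so that the code printed in the final log line is the one actually passed to exit.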
Validation
The script validate-propel creates a temporary JDL file and submits it to the grid:
#!/bin/bash
#args: CE
CE=$1
VER='1.1'
APP='propel'
VO='seegrid'
NOTIFY_EMAIL="milan (d) potocnik (a) rcub (d) bg (d) ac (d) yu" # obfuscated here; use a real address
# braces are required: $APP_validate_tmp would be read as a different (unset) variable
JDL="site_${APP}_validate_tmp.jdl"
echo "$APP-$VER validate Startup $*"
QUERY="other.GlueCEUniqueID == \"$CE\" &&"
if [ -z "$CE" ]; then
echo "Computing element not specified. Trying without it..."
QUERY=""
fi
cat > "$JDL" <<EOF
Executable = "/opt/lcg/bin/lcg-ManageSoftware";
InputSandbox = {"validate_sw"};
OutputSandbox = {"out", "err"};
StdOutput = "out";
StdError = "err";
Requirements = $QUERY (Member("VO-$VO-$APP-$VER-to-be-validated", other.GlueHostApplicationSoftwareRunTimeEnvironment));
Arguments = "--validate --vo $VO --tag $APP-$VER --notify $NOTIFY_EMAIL --validate_script validate_sw";
VirtualOrganisation = "$VO";
EOF
edg-job-list-match "$JDL"
mkdir -p jids # make sure the directory for the job ID file exists
edg-job-submit -o jids/jid "$JDL"
#echo "Debug mode, listing created jdl file..."
#cat "$JDL"
echo "Deleting created jdl file..."
rm -f "$JDL"
echo "Validation request sent."
The script validate_sw performs the actual validation on the target CE and is called by lcg-ManageSoftware:
#!/bin/bash
# see the comments at the beginning of install_sw
VER='1.1'
APP='propel'
VO='seegrid'
VODIR=$VO_SEEGRID_SW_DIR
SE=$VO_SEEGRID_DEFAULT_SE
INSTALL_HOST="galeb.etf.bg.ac.yu"
#set up LFC/GFAL environment
export LFC_HOST=grid02.rcub.bg.ac.yu
export LCG_CATALOG_TYPE=lfc
export LCG_GFAL_VO=seegrid
export CE=`$EDG_LOCATION/bin/edg-brokerinfo getCE`
echo "############ starting validate_sw $APP-$VER ############"
echo "# VO_`echo $VO|tr [a-z] [A-Z]`_SW_DIR (e.g. /opt/exp_soft/$VO) should be writable only by ${VO}sgm user"
echo "# and only ${VO}sgm should be able to manage published application tags"
echo "Running job $EDG_WL_JOBID on: $HOSTNAME(`hostname -i`) as `whoami`"
echo "VO_`echo $VO|tr [a-z] [A-Z]`_SW_DIR is $VODIR"
echo Belonging to CE: $CE
echo Close SE is $SE
if [ -z "$VODIR" ]; then #failure?
echo "VO_`echo $VO|tr '[a-z]' '[A-Z]'`_SW_DIR is not set in the environment"
echo "############ ending validate_sw $APP-$VER returning 1 ############"
exit 1
fi
if [ "$VODIR" = '.' ]; then #failure?
echo "The site does not provide a shared file system: VO_`echo $VO|tr '[a-z]' '[A-Z]'`_SW_DIR=$VODIR"
echo "############ ending validate_sw $APP-$VER returning 1 ############"
exit 1
fi
cd $VODIR # software installation root directory
echo "### Listing `pwd`"
ls -al
cd $APP-$VER
if [ ! $? = 0 ]; then #failure?
echo "Directory $APP-$VER does not exist in" `pwd`
echo "############ ending validate_sw $APP-$VER returning 1 ############"
exit 1
fi
ls -al
# any additional testing of installation can be done here, checking for connectivity
# to necessary resources, running application with some test data, etc.
echo "### checking possible routing problem"
traceroute $INSTALL_HOST
echo "Installation validated."
echo "############ ending validate_sw $APP-$VER returning 0 ############"
exit 0
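The validation script leaves room for additional tests of the installation. One simple test is to confirm that the expected executables are present and executable. The sketch below assumes a hypothetical helper check_executables and uses propel.x (the binary started by the run script on this page) in a scratch directory purely for demonstration.

```shell
#!/bin/bash
# Hypothetical check: succeed only if every listed file exists under DIR
# and is executable.
check_executables() {
    local dir=$1
    shift
    local f
    for f in "$@"; do
        if [ ! -x "$dir/$f" ]; then
            echo "Missing or not executable: $dir/$f"
            return 1
        fi
    done
    echo "All expected executables found in $dir"
}

# Self-contained demonstration in a scratch directory.
demo_dir=$(mktemp -d)
touch "$demo_dir/propel.x"
chmod 755 "$demo_dir/propel.x"
check_executables "$demo_dir" propel.x
rm -rf "$demo_dir"
```

In a validation script the directory argument would be the installation directory under the VO software area, and a non-zero return would lead to exit 1 so that the tag is not marked as validated.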
Removal
The script uninstall-propel creates a temporary JDL file and submits it to the grid:
#!/bin/bash
#args: CE
CE=$1
VER='1.1'
APP='propel'
VO='seegrid'
# braces are required: $APP_uninstall_tmp would be read as a different (unset) variable
JDL="site_${APP}_uninstall_tmp.jdl"
echo "$APP-$VER uninstall Startup $*"
QUERY="other.GlueCEUniqueID == \"$CE\" &&"
if [ -z "$CE" ]; then
echo "Computing element not specified. Trying without it..."
QUERY=""
fi
cat > "$JDL" <<EOF
Executable = "/opt/lcg/bin/lcg-ManageSoftware";
InputSandbox = {"uninstall_sw"};
OutputSandbox = {"out", "err"};
StdOutput = "out";
StdError = "err";
Requirements = $QUERY (Member("VO-$VO-$APP-$VER", other.GlueHostApplicationSoftwareRunTimeEnvironment));
Arguments = "--uninstall --vo $VO --tag $APP-$VER --uninstall_script uninstall_sw";
VirtualOrganisation = "$VO";
EOF
edg-job-list-match "$JDL"
mkdir -p jids # make sure the directory for the job ID file exists
edg-job-submit -o jids/jid "$JDL"
#echo "Debug mode, listing created jdl file..."
#cat "$JDL"
echo "Deleting created jdl file..."
rm -f "$JDL"
echo "Uninstallation request sent."
The script uninstall_sw performs the actual removal on the target CE and is called by lcg-ManageSoftware:
#!/bin/bash
# see the comments at the beginning of install_sw
VER='1.1'
APP='propel'
VO='seegrid'
VODIR=$VO_SEEGRID_SW_DIR
echo "############ starting uninstall_sw $APP-$VER ############"
echo "Running on: $HOSTNAME(`hostname -i`) as `whoami`"
echo Belonging to CE: `$EDG_LOCATION/bin/edg-brokerinfo getCE`
if [ -z "$VODIR" ]; then #failure?
echo "VO_`echo $VO|tr '[a-z]' '[A-Z]'`_SW_DIR is not set in the environment"
echo "############ ending uninstall_sw $APP-$VER returning 1 ############"
exit 1
fi
if [ "$VODIR" = '.' ]; then #failure?
echo "The site does not provide a shared file system: VO_`echo $VO|tr '[a-z]' '[A-Z]'`_SW_DIR=$VODIR"
echo "############ ending uninstall_sw $APP-$VER returning 1 ############"
exit 1
fi
cd $VODIR # go to software installation root directory
ls -al # and check its content
rm -f -v -R $APP-$VER # remove only the current version of the software
if [ ! $? = 0 ]; then # failure?
echo "Removal of $APP-$VER from" `pwd` "failed"
echo "############ ending uninstall_sw $APP-$VER returning 1 ############"
exit 1
fi
#optional further steps
echo "Removal of $APP-$VER from" `pwd` "finished"
echo "############ ending uninstall_sw $APP-$VER returning 0 ############"
exit 0 # This is the relevant return code
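The scripts above hard-code VO_SEEGRID_SW_DIR even though they also define a VO variable. To make them easier to reuse for another VO, the variable name can be derived from the VO name. The sketch below is one possible approach using bash indirect expansion. The function name sw_dir_for_vo is an assumption, and the sketch only handles VO names made of letters and digits.

```shell
#!/bin/bash
# Hypothetical helper: print the value of VO_<VO>_SW_DIR for a given VO name.
sw_dir_for_vo() {
    local vo=$1
    local upper
    upper=$(printf '%s' "$vo" | tr '[:lower:]' '[:upper:]')
    local var="VO_${upper}_SW_DIR"
    # ${!var} is bash indirect expansion: the value of the variable named by $var.
    local dir=${!var:-}
    if [ -z "$dir" ] || [ "$dir" = "." ]; then
        echo "$var is unset or points to '.' (no shared file system)" >&2
        return 1
    fi
    printf '%s\n' "$dir"
}

# Demonstration with an assumed value; on a worker node the site sets this.
VO_SEEGRID_SW_DIR=/opt/exp_soft/seegrid
sw_dir_for_vo seegrid
```

A script could then use VODIR=$(sw_dir_for_vo "$VO") || exit 1 in place of the two separate checks for an empty value and for '.'.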
Running Installed Software
After the application has been successfully installed and validated, you can run it as a regular user with a proxy certificate like the one in the following example:
$ voms-proxy-init -voms seegrid
An example of specifying the application and the CE in the "Requirements" attribute of the JDL file for PROPEL:
Requirements = Member("VO-seegrid-propel-1.1", other.GlueHostApplicationSoftwareRunTimeEnvironment) && other.GlueCEUniqueID == "cluster1.csk.kg.ac.yu:2119/jobmanager-pbs-seegrid";
The PROPEL application requires an environment variable ($PROPEL_LIB_DIR) to be present at runtime; the run script sets it before starting the application. An example of a JDL file for the PROPEL application:
Executable = "run-example";
StdOutput = "std.out";
StdError = "std.err";
InputSandbox = {"run-example", "input.tar.gz"};
OutputSandbox = {"std.out", "std.err", "vast.fil.gz","vpla.fil.gz"};
ShallowRetryCount = 10;
Requirements = Member("VO-seegrid-propel-1.1", other.GlueHostApplicationSoftwareRunTimeEnvironment) && other.GlueCEUniqueID == "cluster1.csk.kg.ac.yu:2119/jobmanager-pbs-seegrid";
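When the same job is submitted to several CEs, or for several application versions, the Requirements line can be generated instead of edited by hand. The following is a minimal sketch under that assumption. The function name job_requirements is hypothetical, and the CE argument is optional.

```shell
#!/bin/bash
# Hypothetical helper: print a Requirements line for a validated application.
# Arguments: VO, application name, version, and optionally a target CE.
job_requirements() {
    local vo=$1 app=$2 ver=$3 ce=${4:-}
    local req="Member(\"VO-$vo-$app-$ver\", other.GlueHostApplicationSoftwareRunTimeEnvironment)"
    if [ -n "$ce" ]; then
        req="$req && other.GlueCEUniqueID == \"$ce\""
    fi
    printf 'Requirements = %s;\n' "$req"
}

job_requirements seegrid propel 1.1 "cluster1.csk.kg.ac.yu:2119/jobmanager-pbs-seegrid"
```

Called with the values used on this page, the function prints the same Requirements line as in the JDL example; without a CE argument it emits only the software tag condition and leaves CE selection to the WMS.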
The run script run-example:
#!/bin/bash
APP_NAME=propel
APP_VERSION=1.1
tar -xzf input.tar.gz # extract input files
# This is needed for application to see the required libraries
export PROPEL_LIB_DIR=$VO_SEEGRID_SW_DIR/$APP_NAME-$APP_VERSION/
# Execute application
$VO_SEEGRID_SW_DIR/$APP_NAME-$APP_VERSION/propel.x
# Compress output files
gzip vast.fil
gzip vpla.fil
