SL4 WN glite-3.1

From EGEE-see Wiki


AEGIS01-PHY-SCL notes on installing and configuring natively compiled glite-WN and TORQUE_client on SL4.5 (32 bit)

A new glite-3.1 release of WN packages natively compiled on SL4 was recently announced. To verify the installation and configuration process for the new glite-WN and TORQUE_client on SL4 (32 bit), we at AEGIS01-PHY-SCL performed several test installations, which turned out to be successful: the new WN has been passing SAM tests for several days, and jobs of SEEGRID and AEGIS VO users run on it without any problems.

Below are the minimal steps necessary to install and configure a fully operational glite-WN and TORQUE_client using YAIM-3.1 and the glite-3.1 repository. These instructions are based on several documents:


All additional RPMs mentioned in this guide can be found in the AEGIS01-PHY-SCL SL4 repository.


1) OS Installation

We have chosen to install ALL PACKAGES of SL4.5 in this experiment. You can use your own kickstart file, of course.



2) Update and consolidation of SL4.5 installation

The default package management tool used by SL4.x and by YAIM-3.1 or YAIM-4.0 is yum. The following files need to be added to /etc/yum.repos.d/:

glite.repo

   [glite-WN]
   name=gLite 3.1 Worker Node
   baseurl=http://linuxsoft.cern.ch/EGEE/gLite/R3.1/glite-WN/sl4/i386/
   enabled=1
   
   [glite-TORQUE_client]
   name=Torque clients
   baseurl=http://linuxsoft.cern.ch/EGEE/gLite/R3.1/glite-TORQUE_client/sl4/i386/
   enabled=1

lcg-ca.repo

   [CA]
   name=CAs
   baseurl=http://linuxsoft.cern.ch/LCG-CAs/current

jpackage5.0.repo

   [main]
   [jpackage17-generic]
   name=JPackage 1.7, generic
   baseurl=http://mirrors.dotsrc.org/jpackage/1.7/generic/free/
   enabled=1
   protect=1
    
   [jpackage17-generic-nonfree]
   name=JPackage 1.7, generic non-free
   baseurl=http://mirrors.dotsrc.org/jpackage/1.7/generic/non-free/
   enabled=1
   protect=1
   
   [main]
   [jpackage5-generic]
   name=JPackage 5, generic
   baseurl=http://mirrors.dotsrc.org/jpackage/5.0/generic/free/
   enabled=1
   protect=1
   
   [jpackage5-generic-nonfree]
   name=JPackage 5, generic non-free
   baseurl=http://mirrors.dotsrc.org/jpackage/5.0/generic/non-free/
   enabled=1
   protect=1
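
Before running yum, it may help to confirm that all three repository files listed above are actually in place. The following is a minimal sketch; the helper name check_repo_files and its optional directory argument are illustrative and not part of yum or YAIM.

```shell
# Sketch: report which of the three repo files from this step are missing.
# check_repo_files is a hypothetical helper, not a yum command.
check_repo_files() {
    repo_dir="${1:-/etc/yum.repos.d}"
    missing=0
    for f in glite.repo lcg-ca.repo jpackage5.0.repo; do
        if [ ! -f "$repo_dir/$f" ]; then
            echo "missing: $repo_dir/$f"
            missing=1
        fi
    done
    # Return status 0 only when every file was found.
    return $missing
}

# Usage:
#   check_repo_files && echo "all repo files present"
```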


To update the node, run:

   yum update

We got the following error:

   Error: Missing Dependency: j2sdk = 2000:1.4.2_13-fcs is needed by package java-1.4.2-sun-compat

To fix this, remove the java-1.4.2-sun-compat package:

   rpm -e java-1.4.2-sun-compat
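
When the same procedure is repeated on many WNs, the package may already be absent on some of them. A guarded variant of the removal could look like the sketch below; remove_if_installed is an illustrative helper name, and only the standard rpm -q and rpm -e options are used.

```shell
# Sketch: remove a package only if rpm reports it as installed.
# remove_if_installed is a hypothetical helper, not an rpm feature.
remove_if_installed() {
    pkg="$1"
    if rpm -q "$pkg" >/dev/null 2>&1; then
        rpm -e "$pkg"
    else
        echo "$pkg is not installed, nothing to do"
    fi
}

# Usage (as root):
#   remove_if_installed java-1.4.2-sun-compat
```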



3) Java 1.5 installation

After that, Java 1.5 should be installed. To do so, go to Sun's Java web page and download JDK 5.0 Update 14. We used the "Linux self-extracting file" jdk-1_5_0_14-linux-i586.bin to build the java-1.5.0-sun-1.5.0.14-1jpp.i586.rpm and java-1.5.0-sun-devel-1.5.0.14-1jpp.i586.rpm packages, as suggested in Steve Traylen's guide.

To build and install those two packages, do the following:

rpm --import http://www.jpackage.org/jpackage.asc
mkdir -p ~/redhat/BUILD ~/redhat/SOURCES ~/redhat/SPECS ~/redhat/RPMS/i586 ~/redhat/SRPMS
cat <<EOF > ~/.rpmmacros
%_topdir    $HOME/redhat
%packager       Firstname Lastname <firstname.lastname@example.org>
EOF
rpm -Uvh http://mirrors.dotsrc.org/jpackage/1.7/generic/non-free/SRPMS/java-1.5.0-sun-1.5.0.14-1jpp.nosrc.rpm
cp jdk-1_5_0_14-linux-i586.bin ~/redhat/SOURCES

If, for some reason, the version of the source RPM from jpackage.org does not match the JDK version available from Sun's Java web page, you must adjust the spec file by manually editing ~/redhat/SPECS/java-1.5.0-sun.spec. For example, if only the 1.5.0.13 source RPM were available, it would be sufficient to change the line

%define buildver        13

to be

%define buildver        14
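
The same one-line change can be scripted instead of edited by hand. This is a minimal sketch: set_buildver is an illustrative helper name, and it assumes the spec file contains a single "%define buildver" line in the form shown above.

```shell
# Sketch: set the number on the "%define buildver" line of a spec file.
# set_buildver is a hypothetical helper, not part of rpm or rpmbuild.
set_buildver() {
    spec="$1"
    ver="$2"
    tmp="$spec.new"
    # Replace only the number and keep the original spacing after "buildver".
    sed "s/^\(%define[[:space:]]\{1,\}buildver[[:space:]]\{1,\}\)[0-9]\{1,\}/\1$ver/" "$spec" > "$tmp" &&
        cat "$tmp" > "$spec" &&
        rm -f "$tmp"
}

# Usage:
#   set_buildver ~/redhat/SPECS/java-1.5.0-sun.spec 14
#   grep '^%define buildver' ~/redhat/SPECS/java-1.5.0-sun.spec
```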

After that, make sure the downloaded file jdk-1_5_0_14-linux-i586.bin is in ~/redhat/SOURCES/ (it was copied there by the cp command above), and proceed with

rpmbuild -ba ~/redhat/SPECS/java-1.5.0-sun.spec

The new Java RPMs are now in ~/redhat/RPMS/i586/; you can distribute them to all your WNs and install them:

rpm -Uvh ~/redhat/RPMS/i586/java-1.5.0-sun-1.5.0.14-1jpp.i586.rpm
rpm -Uvh ~/redhat/RPMS/i586/java-1.5.0-sun-devel-1.5.0.14-1jpp.i586.rpm

Alternatively, you can use java RPMs built at AEGIS01-PHY-SCL, available at http://glite.phy.bg.ac.yu/GLITE-3/java/

After this step, Java is installed, and you can run:

   yum update

All dependencies should now be satisfied. This step also updates the kernel.



4) Adjust default kernel

After upgrading the kernel, adjust /boot/grub/grub.conf so that the kernel version appropriate for your hardware is booted by default, then reboot the system.
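
To see which entry grub.conf currently boots by default, the entries can be listed with their index. The sketch below assumes the usual GRUB (legacy) layout with a default=N line and one title line per entry; list_grub_entries is an illustrative helper name.

```shell
# Sketch: list the "title" entries of grub.conf with their index,
# marking the one selected by "default=" with an asterisk.
# list_grub_entries is a hypothetical helper, not a GRUB tool.
list_grub_entries() {
    conf="${1:-/boot/grub/grub.conf}"
    awk '
        BEGIN       { n = 0; def = 0 }
        /^default=/ { sub(/^default=/, ""); def = $0 + 0 }
        /^title/    { mark = (n == def) ? "*" : " "; print mark, n, $0; n++ }
    ' "$conf"
}

# Usage (as root):
#   list_grub_entries
# Then set default=N in grub.conf to the index of the desired kernel.
```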



5) Adjust services/daemons started at the boot time

The default installation sets an excessive number of services/daemons to start at boot; check them and disable all unnecessary ones. It is also recommended to change the default runlevel to 3 in /etc/inittab. We especially suggest disabling yum auto-update, since it may cause trouble when new updates that require reconfiguration of WNs appear and are installed automatically.
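
As an illustration of the runlevel change, the sketch below rewrites the initdefault line of an inittab file. set_default_runlevel is an illustrative helper name; the chkconfig commands in the usage comments are the standard way to inspect and disable services, but which services are unnecessary depends on your site, so no specific service names are given here.

```shell
# Sketch: change the default runlevel in an inittab-style file.
# set_default_runlevel is a hypothetical helper name.
set_default_runlevel() {
    inittab="$1"
    level="$2"
    tmp="$inittab.new"
    # Write through a temporary copy, then overwrite in place so the
    # original file keeps its owner and permissions.
    sed "s/^id:[0-9]:initdefault:/id:$level:initdefault:/" "$inittab" > "$tmp" &&
        cat "$tmp" > "$inittab" &&
        rm -f "$tmp"
}

# Usage (as root):
#   chkconfig --list | grep '3:on'     # services started in runlevel 3
#   chkconfig <service> off            # disable one you do not need
#   set_default_runlevel /etc/inittab 3
```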

We do not recommend installing SELinux, because it slows down the execution of MPI jobs when mpiexec is used. If it is installed, it should be disabled by replacing the line SELINUX=enforcing with SELINUX=disabled in the file /etc/selinux/config.
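
The edit of /etc/selinux/config can be sketched as follows. disable_selinux is an illustrative helper name; the pattern matches only the SELINUX= line, so a SELINUXTYPE= line is left untouched. The change takes effect after a reboot.

```shell
# Sketch: set SELINUX=disabled in an SELinux config file.
# disable_selinux is a hypothetical helper name.
disable_selinux() {
    conf="${1:-/etc/selinux/config}"
    tmp="$conf.new"
    sed 's/^SELINUX=.*/SELINUX=disabled/' "$conf" > "$tmp" &&
        cat "$tmp" > "$conf" &&
        rm -f "$tmp"
}

# Usage (as root):
#   disable_selinux
#   grep '^SELINUX=' /etc/selinux/config
```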



6) Adjust file systems

If you use a shared file system, the new WN must be configured to mount it automatically and with proper permissions. Also, if you use scratch space for jobs on WNs, configure it before jobs start arriving at the new WN.



7) NTP configuration

As usual, NTP needs to be configured and verified.
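
One possible way to verify NTP is to check whether ntpd has selected a synchronisation peer. In the output of ntpq -p, the selected peer is marked with a leading asterisk. The sketch below relies only on that convention; has_sync_peer is an illustrative helper name.

```shell
# Sketch: succeed if "ntpq -p" style output on stdin contains a selected
# peer, i.e. a line starting with "*".
# has_sync_peer is a hypothetical helper name.
has_sync_peer() {
    grep -q '^\*'
}

# Usage:
#   ntpq -p | has_sync_peer && echo "NTP is synchronised" || echo "no peer selected yet"
```

Shortly after ntpd starts, it is normal that no peer is selected yet, so the check may need to be repeated after a few minutes.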



8) Certification Authorities

   yum install lcg-CA



9) Additional RPMs installation

In order to successfully install glite-WN, you need to add perl-SOAP-Lite, as well as some additional packages needed to pass the SAM rgmasc test:

wget http://glite.phy.bg.ac.yu/GLITE-3/SL4/perl-SOAP-Lite-0.65.6-1.noarch.rpm
rpm -Uvh perl-SOAP-Lite-0.65.6-1.noarch.rpm
wget http://glite.phy.bg.ac.yu/GLITE-3/SL4/bouncycastle-jdk14_1.19-2_noarch.rpm
rpm -Uvh bouncycastle-jdk14_1.19-2_noarch.rpm
wget http://glite.phy.bg.ac.yu/GLITE-3/SL4/edg-java-security_1.5.11-1%255fsl3_noarch.rpm
rpm -Uvh edg-java-security_1.5.11-1%255fsl3_noarch.rpm
wget http://glite.phy.bg.ac.yu/GLITE-3/SL4/edg-java-security-client_1.5.11-1%255fsl3_noarch.rpm
rpm -Uvh edg-java-security-client_1.5.11-1%255fsl3_noarch.rpm
wget http://glite.phy.bg.ac.yu/GLITE-3/SL4/edg-java-security-test_1.5.11-1%255fsl3_noarch.rpm
rpm -Uvh edg-java-security-test_1.5.11-1%255fsl3_noarch.rpm

We use the latest torque version, 2.1.9-5. Be careful here: the torque version must be the same on the CE and all WNs! As always, RPMs of maui and torque can be found in the ETICS repository maintained by Steve Traylen:

http://eticssoft.web.cern.ch/eticssoft/repository/torquemaui/
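
Since the torque version must match on the CE and all WNs, a quick comparison across nodes can catch a mismatch early. The sketch below only compares "hostname version" pairs read from standard input; check_same_version is an illustrative helper name, and the hostnames and the package name torque in the usage comments are assumptions that may differ at your site.

```shell
# Sketch: compare version strings across nodes.
# Input: one "hostname version" pair per line; the first line is the
# reference (for example the CE). Prints every node that differs and
# returns a non-zero status if any mismatch was found.
# check_same_version is a hypothetical helper name.
check_same_version() {
    awk '
        NR == 1   { ref = $2; next }
        $2 != ref { print $1 ": " $2 " (expected " ref ")"; bad = 1 }
        END       { exit bad + 0 }
    '
}

# Usage (hostnames and package name are assumptions):
#   for h in ce.example.org wn01.example.org wn02.example.org; do
#       echo "$h $(ssh "$h" rpm -q --qf '%{VERSION}-%{RELEASE}' torque)"
#   done | check_same_version
```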



10) Install glite-WN and glite-TORQUE_client

   yum install glite-WN glite-TORQUE_client

This will install all packages needed for configuring glite-WN and TORQUE_client.



11) Preparation of conf files for new WN

The necessary conf files are site-info.def, wn-list.conf, users.conf, and groups.conf, as well as the files in the vo.d directory. If you have conf files for a recent enough version of YAIM, little needs to be changed. According to the YAIM-4.0 Guide, several new site-info.def variables have been added, so adapt your config files accordingly. You will probably just reuse the conf files from other WNs, since they practically do not change with the migration from SL3.0.x to SL4.5. The only important exception is JAVA_LOCATION, which should point to Java 1.5 on SL4.x WNs.
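
Because JAVA_LOCATION is the one setting that must change, it may be worth checking it before running YAIM. The sketch below assumes that site-info.def sets the variable with a plain JAVA_LOCATION=... line, optionally quoted; check_java_location is an illustrative helper name.

```shell
# Sketch: check that JAVA_LOCATION from site-info.def points to a
# directory containing an executable bin/java.
# check_java_location is a hypothetical helper, not part of YAIM.
check_java_location() {
    site_info="$1"
    # Take the last JAVA_LOCATION assignment and strip optional quotes.
    java_home=$(sed -n 's/^JAVA_LOCATION=//p' "$site_info" | tail -n 1 | tr -d "\"'")
    if [ -z "$java_home" ]; then
        echo "JAVA_LOCATION is not set in $site_info"
        return 1
    fi
    if [ ! -x "$java_home/bin/java" ]; then
        echo "no executable $java_home/bin/java"
        return 1
    fi
    echo "JAVA_LOCATION=$java_home"
}

# Usage:
#   check_java_location <path to site-info.def>
# Running bin/java -version from that directory then shows which Java it is.
```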



12) Configuring node

To configure the node, run:

   /opt/glite/yaim/bin/yaim -c -s <path to site-info.def> -n WN -n TORQUE_client

Final tweaking includes adjusting the /var/spool/pbs/mom_priv/config file, creating /etc/ssh/shosts.equiv if it does not already exist on the new WN and updating this file on the CE and other WNs, and, of course, updating ssh_known_hosts on all other nodes to include the new one (standard procedure for ssh). Once this is done, the new WN can be added to the pbs server. From then on, cron jobs will take care of the shosts.equiv and ssh_known_hosts files.
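
Updating shosts.equiv by hand on several nodes easily produces duplicate entries. A small idempotent helper, sketched below, avoids that; add_host_once is an illustrative name and the hostname in the usage comments is hypothetical.

```shell
# Sketch: append a hostname to a file only if it is not already listed.
# add_host_once is a hypothetical helper name.
add_host_once() {
    file="$1"
    host="$2"
    # Create the file if it does not exist yet.
    touch "$file"
    # -x matches whole lines and -F treats the name as a fixed string,
    # so "wn1" does not match an existing "wn10".
    grep -qxF "$host" "$file" || echo "$host" >> "$file"
}

# Usage (as root; hostname is hypothetical):
#   add_host_once /etc/ssh/shosts.equiv wn05.example.org
```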

In order to pass the rgmasc SAM test, it is also necessary to add /opt/edg/sbin to the PATH shell variable, e.g. in a local startup file for bash and csh.
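
For bash, one way to do this is a small script sourced at login. The sketch below could be saved under /etc/profile.d/ (the file name edg-sbin.sh is hypothetical); path_append_once is an illustrative helper that avoids adding the directory twice. A csh equivalent would need its own file and syntax and is not shown here.

```shell
# Sketch: append /opt/edg/sbin to PATH unless it is already there.
# Intended for a bash/sh startup file such as /etc/profile.d/edg-sbin.sh
# (hypothetical file name). path_append_once is a hypothetical helper.
path_append_once() {
    case ":$PATH:" in
        *":$1:"*) ;;               # already present, do nothing
        *) PATH="$PATH:$1" ;;
    esac
}

path_append_once /opt/edg/sbin
export PATH
```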

If you have previously installed the plugin on your CE to dynamically publish the OS version, now is probably a good time to disable it (remove the plugin from /opt/lcg/var/gip/plugin) and to adjust the ldif files so that the correct OS version and release are published.

