SL4 WN

From EGEE-see WIki

AEGIS01-PHY-SCL notes on installing and configuring WN_torque on SL 4.4 (32 bit)

To verify the installation and configuration of glite-WN on SL4 (32 bit), we at AEGIS01-PHY-SCL recently ran several test installations, all of which were successful. The minimal steps needed to install and configure a fully operational WN_torque are given below.


1) Installation of SL 4.4

For simplicity, we chose to install ALL PACKAGES in this experiment. You can, of course, use your own kickstart file.



2) Update and consolidation of SL 4.4 installation

If you try

   apt-get update
   apt-get dist-upgrade

you will see that there are some unresolved dependencies. To resolve them, run the following commands:

   apt-get update
   apt-get -f install

Some packages will be removed by the above commands. Make sure that all dependencies are resolved at this point. After that, you can proceed with the kernel upgrade:

   apt-get dist-upgrade
   apt-get upgrade-kernel



3) Adjust default kernel

After upgrading the kernel, you need to adjust /boot/grub/grub.conf so that the kernel appropriate for your hardware is booted by default.
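
As an illustration of this step, here is a minimal sketch of two helper functions for a GRUB legacy grub.conf. It assumes the file contains a "default=N" line and that "title" entries are counted from 0; the function names are hypothetical and not part of any gLite or SL tool.

```shell
# Hypothetical helpers for GRUB legacy; the path to grub.conf is passed explicitly.

# Print each boot entry with its zero-based index.
list_grub_entries() {
    awk '/^[ \t]*title[ \t]/ {
        sub(/^[ \t]*title[ \t]+/, "")
        printf "%d: %s\n", n++, $0
    }' "$1"
}

# Point the "default=" line at the entry with the given index.
# The previous version of the file is kept as <file>.bak.
set_grub_default() {
    conf=$1
    index=$2
    cp -p "$conf" "$conf.bak" || return 1
    sed "s/^default=.*/default=$index/" "$conf.bak" > "$conf"
}

# Typical use (as root):
#   list_grub_entries /boot/grub/grub.conf
#   set_grub_default /boot/grub/grub.conf 0
```

Review the resulting file before rebooting; the backup copy can be restored if the wrong entry was selected.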



4) Adjust services/daemons started at boot time

The default installation enables an excessive number of services/daemons at boot; check them and disable all unnecessary ones. It is also recommended to change the default runlevel to 3 in /etc/inittab.
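
The following sketch shows one possible way to review enabled services and to change the default runlevel. The function names are hypothetical, and "cups" in the usage comment is only an example of a service you might not need on a WN.

```shell
# Hypothetical helper: set the default runlevel in an inittab-style file.
# The previous version of the file is kept as <file>.bak.
set_default_runlevel() {
    inittab=$1
    level=$2
    cp -p "$inittab" "$inittab.bak" || return 1
    sed "s/^id:[0-9]:initdefault:/id:$level:initdefault:/" "$inittab.bak" > "$inittab"
}

# Print the names of services enabled in the given runlevel, reading
# "chkconfig --list" style output from standard input.
services_on_in_runlevel() {
    awk -v key="$1:on" '{ for (i = 2; i <= NF; i++) if ($i == key) print $1 }'
}

# Typical use on the WN (as root):
#   chkconfig --list | services_on_in_runlevel 3
#   chkconfig cups off
#   set_default_runlevel /etc/inittab 3
```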



5) Adjust file systems

If you use a shared file system, the new WN must be configured to mount it automatically and with proper permissions. Also, if you use scratch space for jobs on WNs, you need to configure it before jobs start arriving at the new WN.
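
If the shared file system is exported over NFS (an assumption; your site may use something else), the mount can be made automatic with an /etc/fstab entry. The sketch below adds such an entry only if the mount point is not already listed. The function name, the server nfs.example.org, the mount options and the /scratch path are placeholders for illustration.

```shell
# Hypothetical helper: add an NFS mount to an fstab-style file unless an
# entry for the same mount point is already present.
add_nfs_mount() {
    fstab=$1       # e.g. /etc/fstab
    remote=$2      # e.g. nfs.example.org:/home (placeholder server)
    mountpoint=$3  # e.g. /home
    options=${4:-rw,hard,intr}
    if awk -v mp="$mountpoint" \
        '$1 !~ /^#/ && $2 == mp { found = 1 } END { exit !found }' "$fstab"
    then
        echo "mount point $mountpoint is already listed in $fstab" >&2
        return 0
    fi
    printf '%s\t%s\tnfs\t%s\t0 0\n' "$remote" "$mountpoint" "$options" >> "$fstab"
}

# Typical use (as root):
#   add_nfs_mount /etc/fstab nfs.example.org:/home /home
#   mount /home
# Scratch space, if used, could be a world-writable directory with the sticky bit:
#   mkdir -p /scratch && chmod 1777 /scratch
```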



6) NTP configuration

As usual, NTP needs to be configured and verified.
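
One way to verify NTP is to look for a selected system peer in the output of "ntpq -p", which marks that peer with a leading "*". The helper below is a small sketch of that check; its name is hypothetical.

```shell
# Succeed if "ntpq -p" style output on standard input shows a selected
# system peer (a line whose first character is "*").
ntp_is_synchronised() {
    grep -q '^\*'
}

# Typical use on the WN (as root), after adding your NTP servers to /etc/ntp.conf:
#   chkconfig ntpd on
#   service ntpd start
#   ntpq -p | ntp_is_synchronised && echo "NTP peer selected"
```

Note that ntpd may need several minutes after start before it selects a peer, so a failed check right after starting the daemon is not necessarily an error.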



7) Install YAIM

   wget http://grid-deployment.web.cern.ch/grid-deployment/gis/yaim/glite-yaim-latest.rpm
   rpm -Uvh glite-yaim-latest.rpm

Since glite-yaim-latest.rpm may not actually be the latest version, create /etc/apt/sources.list.d/glite.list with the following content:

   rpm http://glitesoft.cern.ch/EGEE/gLite/APT/R3.0/ rhel30 externals Release3.0 updates

and perform

   apt-get update; apt-get install glite-yaim

just to be sure.

After you install YAIM, you may apply some customizations (if necessary).



8) Prepare conf files for new WN

The necessary conf files are: site-info.def, wn-list.conf, users.conf, and groups.conf. You will probably reuse the conf files from your other WNs, since they do not change with the SL3 -> SL4 migration.



9) First try of install_node

If you try to execute

   /opt/glite/yaim/scripts/install_node site-info.def glite-WN glite-torque-client-config

(note that glite-torque-client-config must be included, as above, if you plan to install WN_torque), you will probably see a lot of unresolved dependencies at this point. This is what we got:

   The following packages have unmet dependencies:
 glite-WN: Depends: jakarta-commons-logging (>= 1.0.2-lcg1_sl3) but it is not going to be installed
           Depends: jakarta-axis (>= 1.1rc2-3) but it is not going to be installed
           Depends: edg-replica-optimization-client (>= 2.2.2-1_sl3) but it is not going to be installed
           Depends: edg-replica-optimization-test (>= 2.2.2-1_sl3) but it is not going to be installed
           Depends: edg-local-replica-catalog-client (>= 2.2.9-1_sl3) but it is not going to be installed
           Depends: edg-local-replica-catalog-test (>= 2.2.9-1_sl3) but it is not going to be installed
           Depends: edg-replica-metadata-catalog-client (>= 2.2.9-1_sl3) but it is not going to be installed
           Depends: edg-replica-metadata-catalog-test (>= 2.2.9-1_sl3) but it is not going to be installed
           Depends: edg-replica-manager (>= 1.8.1-1_sl3) but it is not going to be installed
           Depends: edg-replica-manager-test (>= 1.8.1-1_sl3) but it is not going to be installed
           Depends: edg-replica-manager-gridftp-client_gcc3_2_2 (>= 1.8.1-1_sl3) but it is not going to be installed
           Depends: edg-wl-logging-api-c_gcc3_2_2 (>= lcg2.1.74-3_sl3) but it is not going to be installed
           Depends: edg-wl-logging-api-cpp_gcc3_2_2 (>= lcg2.1.74-3_sl3) but it is not going to be installed
           Depends: edg-wl-logging-api-sh_gcc3_2_2 (>= lcg2.1.74-3_sl3) but it is not going to be installed
           Depends: edg-wl-chkpt-api_gcc3_2_2 (>= lcg2.1.74-1_sl3) but it is not going to be installed
           Depends: edg-wl-common-api_gcc3_2_2 (>= lcg2.1.74-1_sl3) but it is not going to be installed
           Depends: edg-wl-ui-api-cpp_gcc3_2_2 (>= lcg2.1.74-1_sl3) but it is not going to be installed
           Depends: classads-g3 (>= 0.9.4-vh7_sl3) but it is not going to be installed
 glite-torque-client-config: Depends: torque-client but it is not going to be installed
                             Depends: torque-mom but it is not going to be installed

These dependencies can be easily resolved.
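
To get a compact list of what is missing, the "Depends:" lines of output like the above can be reduced to package names. This is only a convenience sketch; the function name is hypothetical.

```shell
# Print the unique package names found in "Depends:" lines of apt output
# read on standard input.
unmet_dependencies() {
    sed -n 's/^.*Depends: \([^ ]*\).*$/\1/p' | sort -u
}

# Typical use:
#   /opt/glite/yaim/scripts/install_node site-info.def glite-WN glite-torque-client-config 2>&1 | unmet_dependencies
```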



10) Resolving dependencies generated by install_node

To resolve the above dependencies, you need to install the following packages from the SL 3.0.x repository:

  commons-logging-1.0.2-12.i386.rpm
  junit-3.8.1-1.i386.rpm
  libgcj-ssa-3.5ssa-0.20030801.48.i386.rpm

In addition to this, the package

  classads-g3-0.9.4-vh8.i486.rpm

needs to be installed from an external RH/RHEL repository, since the version available from the gLite repository doesn't work with SL 4.4.

You will also need to remove some packages:

  rpm -e lam perl-XML-SAX perl-LDAP perl-XML-LibXML

and manually download and install the following package from the gLite repository:

  wget http://glitesoft.cern.ch/EGEE/gLite/APT/R3.0/rhel30/RPMS.externals/perl-Net-LDAP-0.2701-1.dag.rhel3.noarch.rpm
  rpm -Uvh perl-Net-LDAP-0.2701-1.dag.rhel3.noarch.rpm

For some reason, the automatic procedure doesn't work for the above package: apt confuses it with other perl-LDAP packages.

If you are planning to install WN_torque, then you need to install the torque, torque-mom and torque-client RPMs built for SL4. We suggest that you use the latest builds provided by Steve Traylen. Please bear in mind that the torque server and all torque clients must run the same version. We use the latest version of torque (2.1.8-1cri_sl4_1st) without any problems.
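
A quick consistency check of the installed torque packages can be sketched as follows. The helper name is hypothetical; it only compares version-release strings read from standard input, so the same check can be fed with the output collected from the torque server as well.

```shell
# Succeed only if every non-empty line on standard input is identical,
# i.e. all listed packages report the same version-release string.
all_versions_match() {
    [ "$(grep -v '^$' | sort -u | wc -l | tr -d ' ')" = "1" ]
}

# Typical use on the WN:
#   rpm -q --qf '%{VERSION}-%{RELEASE}\n' torque torque-mom torque-client | all_versions_match \
#       && echo "torque versions are consistent"
```

If one of the packages is not installed, rpm prints a "not installed" message instead of a version, so the check fails in that case as well.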

For your convenience, we have put all packages mentioned here in a local repository, so feel free to use them:

http://glite.phy.bg.ac.yu/GLITE-3_0_2/SL4/


11) install_node and configure_node should be working now

After the above steps, install_node should not report any problems, and you can proceed with

  /opt/glite/yaim/scripts/install_node site-info.def glite-WN glite-torque-client-config
  /opt/glite/yaim/scripts/configure_node site-info.def WN_torque

Final tweaking includes adjusting the /var/spool/pbs/mom_priv/config file, creating /etc/ssh/shosts.equiv if it does not already exist on the new WN, and, of course, updating known_hosts on all other nodes to include data about the new one (standard procedure for ssh).
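
The final tweaks could be scripted roughly as below. This is a sketch under assumptions: the function names are hypothetical, ce.example.org and wn01.example.org are placeholder host names, and your site may need additional directives in the pbs_mom config or a different known_hosts location.

```shell
# Set the Torque server in a pbs_mom config file: any existing
# "$pbsserver" lines are replaced by a single new one.
set_mom_server() {
    conf=$1
    server=$2
    touch "$conf" || return 1
    grep -v '^\$pbsserver' "$conf" > "$conf.new" || true
    printf '%s %s\n' '$pbsserver' "$server" >> "$conf.new"
    cat "$conf.new" > "$conf" && rm -f "$conf.new"
}

# Append a line to a file unless that exact line is already there.
append_line_once() {
    file=$1
    line=$2
    touch "$file" || return 1
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Typical use on the new WN (as root):
#   set_mom_server /var/spool/pbs/mom_priv/config ce.example.org
#   append_line_once /etc/ssh/shosts.equiv ce.example.org
# On the other nodes, the new WN's host key can be collected with ssh-keyscan:
#   ssh-keyscan -t rsa wn01.example.org >> /etc/ssh/ssh_known_hosts
```

Restart pbs_mom after changing its config so that the new server setting takes effect.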


