NWCHEM VO
From EGEE-see Wiki
Introduction
NWChem is a computational chemistry package that is designed to run on high-performance parallel supercomputers as well as conventional workstation clusters. It aims to be scalable both in its ability to treat large problems efficiently, and in its usage of available parallel computing resources. NWChem has been developed by the Molecular Sciences Software group of the Environmental Molecular Sciences Laboratory (EMSL) at the Pacific Northwest National Laboratory (PNNL). Most of the implementation has been funded by the EMSL Construction Project. For more information about this software visit http://www.emsl.pnl.gov/docs/nwchem/nwchem.html.
Licensing Issues
NWChem is open source software distributed under the EMSL Software User Agreement (file: Nwchem ecce ua.pdf). The agreement has been signed by the Application Support Team of NCSR "Demokritos", which is responsible for installing the package in HellasGrid. Users who wish to use this software should join the regional NWCHEM VO, which was established to control access to the software and to satisfy the requirements imposed by the above license. To join the VO, follow the instructions at https://access.hellasgrid.gr/account/nwchem_vo_form. A prerequisite for acceptance into the VO is to read the NWChem license and agree to adhere to it.
How to use
An MPICH-enabled build of NWChem 5.0 has been deployed on all sites supporting the NWCHEM VO, so once you are a member of the VO no special flags or requirements need to be set in the JDL file. What is needed, though, is to initialize your proxy certificate with NWCHEM explicitly defined as your working VO. Therefore, the first step in using the package is to issue the following command on your UI:
voms-proxy-init --voms nwchem.vo.hellasgrid.gr
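Before submitting, it can be worth confirming that the proxy really carries the NWCHEM VO. The sketch below assumes that the VOMS client tool voms-proxy-info is available on the UI; the helper name check_proxy_vo is hypothetical and is not part of any grid middleware.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the current proxy belongs to the given VO.
# Assumes the VOMS client tool voms-proxy-info is on the PATH of the UI.
check_proxy_vo() {
    expected_vo=$1
    # voms-proxy-info -vo prints the VO name(s) carried by the proxy
    current_vo=`voms-proxy-info -vo 2>/dev/null`
    if [ "x$current_vo" = "x" ]; then
        echo "No valid VOMS proxy found"
        return 1
    fi
    for vo in $current_vo
    do
        if [ "$vo" = "$expected_vo" ]; then
            echo "Proxy is valid for VO $expected_vo"
            return 0
        fi
    done
    echo "Proxy belongs to VO $current_vo, not $expected_vo"
    return 1
}
```

One way to use it is `check_proxy_vo nwchem.vo.hellasgrid.gr || voms-proxy-init --voms nwchem.vo.hellasgrid.gr`, which creates a new proxy only when the current one is missing or belongs to a different VO.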
Like most non-trivial MPI jobs in EGEE, an NWChem job should use a shell script that prepares the environment on the target host and invokes the mpiexec command. For example, the script below can be used to run NWChem with MPI, taking an input .nw file as its parameter.
#!/bin/sh -x
# the binary to execute
EXE=$VO_NWCHEM_SW_DIR/nwchem/bin/nwchem
INPUTFILE=$1
echo "***********************************************************************"
echo "Running on: $HOSTNAME"
echo "As: " `whoami`
echo "***********************************************************************"
if [ "x$PBS_NODEFILE" != "x" ] ; then
echo "PBS Nodefile: $PBS_NODEFILE"
HOST_NODEFILE=$PBS_NODEFILE
fi
if [ "x$LSB_HOSTS" != "x" ] ; then
echo "LSF Hosts: $LSB_HOSTS"
HOST_NODEFILE=`pwd`/lsf_nodefile.$$
for host in ${LSB_HOSTS}
do
echo $host >> ${HOST_NODEFILE}
done
fi
# Abort with a non-zero status if neither PBS nor LSF provided a host list
if [ "x$HOST_NODEFILE" = "x" ]; then
echo "No hosts file defined. Exiting..."
exit 1
fi
echo "***********************************************************************"
CPU_NEEDED=`cat $HOST_NODEFILE | wc -l`
echo "Node count: $CPU_NEEDED"
echo "Nodes in $HOST_NODEFILE: "
cat $HOST_NODEFILE
echo "***********************************************************************"
echo "Checking ssh for each node:"
NODES=`cat $HOST_NODEFILE`
for host in ${NODES}
do
echo "Checking $host..."
ssh $host hostname
done
echo "***********************************************************************"
echo Preparing the environment to run $EXE
export NWCHEM_NWPW_LIBRARY=$VO_NWCHEM_SW_DIR/nwchem/data/
# Install the default NWChem configuration as the user's .nwchemrc,
# overwriting any copy left behind by a previous job
cp -f $VO_NWCHEM_SW_DIR/nwchem/data/default.nwchemrc $HOME/.nwchemrc
cat $HOME/.nwchemrc
echo "***********************************************************************"
echo "Executing $EXE with mpiexec"
mpiexec $EXE $INPUTFILE > mpiexec.out 2>&1
echo "***********************************************************************"
The above script is generic and needs no modification as long as the input .nw file is the only dependency of the computation. More complex jobs, for example those with multiple input files or with files fetched from the LFC, require the script to be adapted accordingly.
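As one possible adaptation for jobs with several input files, the sketch below checks that every file named on the command line has arrived in the working directory before the computation starts. The function name check_inputs is hypothetical, and fetching files from the LFC is not covered here.

```shell
#!/bin/sh
# Hypothetical helper: verify that every input file passed as an argument
# is present and readable in the current working directory.
check_inputs() {
    missing=0
    for f in "$@"
    do
        if [ ! -r "$f" ]; then
            echo "Missing input file: $f"
            missing=1
        fi
    done
    return $missing
}
```

In the wrapper script it could be called as `check_inputs "$@" || exit 1` just before the mpiexec line, with INPUTFILE still taken from $1 and any additional files listed in both the Arguments and the InputSandbox of the JDL.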
For example, to run NWChem on 4 CPUs with the S2-cpmd.nw file (included in the examples directory of the package) as input, the following JDL can be used:
[
Type = "Job";
JobType = "MPICH";
NodeNumber = 4;
Executable = "mpi-nwchem.sh";
Arguments = "S2-cpmd.nw";
StdOutput = "stdout";
StdError = "stderr";
InputSandbox = {"mpi-nwchem.sh", "S2-cpmd.nw"};
OutputSandbox = {"stderr","stdout","mpiexec.out"};
]
The number of CPUs allocated to the job is defined by the NodeNumber attribute, which you can modify according to your needs. The input file for the computation is named in the Arguments attribute and listed in the InputSandbox of the job. The input file must be in the local directory from which you submit the job so that gLite can transfer it to the target host.
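Because the input file name has to appear in both Arguments and InputSandbox, and NodeNumber has to match the CPU count you want, it can be convenient to generate the JDL rather than edit it by hand. The helper below is only a sketch: the function name make_nwchem_jdl and the output file name nwchem.jdl are hypothetical, and it reproduces just the attributes shown above.

```shell
#!/bin/sh
# Hypothetical helper: write a JDL for an NWChem MPI job to standard output.
# $1 = input .nw file, $2 = number of CPUs (NodeNumber)
make_nwchem_jdl() {
    input=$1
    cpus=$2
    # The input file must be in the submission directory to be shipped
    # in the InputSandbox
    if [ ! -f "$input" ]; then
        echo "Input file $input not found in `pwd`" >&2
        return 1
    fi
    cat <<EOF
[
Type = "Job";
JobType = "MPICH";
NodeNumber = $cpus;
Executable = "mpi-nwchem.sh";
Arguments = "$input";
StdOutput = "stdout";
StdError = "stderr";
InputSandbox = {"mpi-nwchem.sh", "$input"};
OutputSandbox = {"stderr","stdout","mpiexec.out"};
]
EOF
}
```

For the example above, `make_nwchem_jdl S2-cpmd.nw 4 > nwchem.jdl` would produce the same JDL, and the function refuses to write one if the input file is not in the current directory.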
