NWCHEM VO



Introduction

NWChem is a computational chemistry package that is designed to run on high-performance parallel supercomputers as well as conventional workstation clusters. It aims to be scalable both in its ability to treat large problems efficiently, and in its usage of available parallel computing resources. NWChem has been developed by the Molecular Sciences Software group of the Environmental Molecular Sciences Laboratory (EMSL) at the Pacific Northwest National Laboratory (PNNL). Most of the implementation has been funded by the EMSL Construction Project. For more information about this software visit http://www.emsl.pnl.gov/docs/nwchem/nwchem.html.

Licensing Issues

NWChem is open source software distributed under the EMSL Software User Agreement (Image:Nwchem ecce ua.pdf). The agreement has been signed by the Application Support Team of NCSR "Demokritos", which is responsible for installing the package in HellasGrid. Users who wish to use the software must join the regional NWCHEM VO, which was established to control access to the software and satisfy the requirements of the above license. To join the VO, follow the instructions at https://access.hellasgrid.gr/account/nwchem_vo_form. A prerequisite for acceptance into the VO is that you read the NWChem license and agree to adhere to it.

How to use

An MPICH-enabled build of NWChem 5.0 has been deployed on all sites supporting the NWCHEM VO, so once you are a member of the VO no special flags or requirements need to be set in the JDL file. What is needed, though, is to create your proxy certificate with the NWCHEM VO explicitly set as your working VO. Therefore, the first step in using the package is to issue the following command on your UI:


voms-proxy-init --voms nwchem.vo.hellasgrid.gr
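
Before submitting, it can be worth confirming that the proxy really carries the NWCHEM VO and will not expire during the job. The helper below is only a sketch: the function name proxy_ok and the one-hour threshold are illustrative assumptions, and it assumes the VOMS client tools on the UI provide voms-proxy-info with its --vo and --timeleft options.

```shell
#!/bin/sh
# proxy_ok EXPECTED_VO ACTUAL_VO SECONDS_LEFT MIN_SECONDS
# Hypothetical helper (not part of gLite or VOMS): succeeds only if the
# proxy belongs to the expected VO and has at least MIN_SECONDS left.
proxy_ok() {
  expected_vo=$1
  actual_vo=$2
  seconds_left=$3
  min_seconds=$4

  if [ "$actual_vo" != "$expected_vo" ]; then
    echo "Proxy VO is '$actual_vo', expected '$expected_vo'" >&2
    return 1
  fi

  # A missing or non-numeric lifetime makes the comparison fail,
  # so it is treated the same way as an expired proxy.
  if ! [ "$seconds_left" -ge "$min_seconds" ] 2>/dev/null; then
    echo "Proxy lifetime '$seconds_left' s is below $min_seconds s" >&2
    return 1
  fi
  return 0
}

# Typical use on the UI (assumes the VOMS client tools are installed):
#   proxy_ok nwchem.vo.hellasgrid.gr \
#     "`voms-proxy-info --vo`" "`voms-proxy-info --timeleft`" 3600
```

If the check fails, re-running the voms-proxy-init command above creates a fresh proxy for the VO.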

Like most non-trivial MPI jobs in EGEE, an NWChem job should use a shell script that prepares the environment on the target host and invokes the mpiexec command. For example, the script below can be used to run NWChem with MPI; it takes the input .nw file as its only parameter.

#!/bin/sh -x

# the binary to execute
EXE=$VO_NWCHEM_SW_DIR/nwchem/bin/nwchem 
INPUTFILE=$1

echo "***********************************************************************" 
echo "Running on: $HOSTNAME" 
echo "As:       " `whoami` 
echo "***********************************************************************" 


if [ "x$PBS_NODEFILE" != "x" ] ; then 
  echo "PBS Nodefile: $PBS_NODEFILE" 
  HOST_NODEFILE=$PBS_NODEFILE 
fi

if [ "x$LSB_HOSTS" != "x" ] ; then 
  echo "LSF Hosts: $LSB_HOSTS" 
  HOST_NODEFILE=`pwd`/lsf_nodefile.$$ 
  for host in ${LSB_HOSTS} 
  do 
    echo $host >> ${HOST_NODEFILE} 
  done 
fi

if [ "x$HOST_NODEFILE" = "x" ]; then
  echo "No hosts file defined.  Exiting..."
  # exit with a non-zero status so the failure is visible to the caller
  exit 1
fi

echo "***********************************************************************" 
CPU_NEEDED=`cat $HOST_NODEFILE | wc -l` 
echo "Node count: $CPU_NEEDED"
echo "Nodes in $HOST_NODEFILE: "
cat $HOST_NODEFILE
echo "***********************************************************************" 

echo "***********************************************************************" 
echo "Checking ssh for each node:"
NODES=`cat $HOST_NODEFILE`
for host in ${NODES}
do
  echo "Checking $host..." 
  ssh $host hostname
done
echo "***********************************************************************" 

echo Preparing the environment to run $EXE
export NWCHEM_NWPW_LIBRARY=$VO_NWCHEM_SW_DIR/nwchem/data/
# install the default NWChem configuration as the user's .nwchemrc
cp -f $VO_NWCHEM_SW_DIR/nwchem/data/default.nwchemrc $HOME/.nwchemrc
cat $HOME/.nwchemrc

echo "***********************************************************************" 
echo "Executing $EXE with mpiexec" 
mpiexec $EXE $INPUTFILE > mpiexec.out 2>&1 
echo "***********************************************************************" 

The above script is generic and needs no modification if the input .nw file is the only file the computation depends on. For more complex jobs, for example with multiple input files or with files fetched from the LFC, the script should be modified accordingly.
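
As a rough illustration of such a modification, the function below makes a list of inputs available in the job's working directory before mpiexec is called. It is a sketch under stated assumptions: the name stage_inputs and the VO_NAME variable are hypothetical, plain names are taken to be files shipped in the InputSandbox, and names starting with lfn: are taken to be LFC entries that can be copied with lcg-cp on the worker node.

```shell
#!/bin/sh
# Hypothetical helper: make every input available in the working directory.
# Arguments are plain file names (already delivered via the InputSandbox)
# or logical file names starting with "lfn:" (copied from the grid).
VO_NAME=${VO_NAME:-nwchem.vo.hellasgrid.gr}

stage_inputs() {
  for item in "$@"; do
    case "$item" in
      lfn:*)
        # keep only the last path component as the local file name
        target=$(basename "$item")
        lcg-cp --vo "$VO_NAME" "$item" "file:$(pwd)/$target" || return 1
        ;;
      *)
        target=$item
        ;;
    esac
    # whatever the source, the file must now exist locally
    if [ ! -f "$target" ]; then
      echo "Missing input file: $target" >&2
      return 1
    fi
  done
  return 0
}
```

In the script above, one option is to call stage_inputs "$@" || exit 1 just before the mpiexec line, keeping the main .nw file as the first, local argument and passing any additional files after it in the Arguments attribute of the JDL.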

For example, to run NWChem on 4 CPUs with the S2-cpmd.nw file (included in the examples directory of the package) as input, the following JDL can be used:

[
  Type = "Job";
  JobType = "MPICH";
  NodeNumber = 4;
  Executable = "mpi-nwchem.sh";
  Arguments = "S2-cpmd.nw";
  StdOutput = "stdout";
  StdError = "stderr";
  InputSandbox = {"mpi-nwchem.sh", "S2-cpmd.nw"};
  OutputSandbox = {"stderr","stdout","mpiexec.out"};
]

The number of CPUs allocated to the job is defined by the NodeNumber attribute; modify it according to your needs. The input file for the computation is named in the Arguments attribute and listed in the InputSandbox of the job. The input file must be in the local directory from which you submit the job so that gLite can transfer it to the target host.
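
Since only NodeNumber and the input file name change from job to job, the JDL can also be generated. The function below is a minimal sketch, not an official tool: make_jdl is a hypothetical name, and it assumes the wrapper script is saved as mpi-nwchem.sh next to the input file in the submission directory.

```shell
#!/bin/sh
# make_jdl NODES INPUT
# Hypothetical helper: print a JDL for an NWChem MPI job on NODES CPUs,
# refusing to do so if the sandbox files are not in the current directory.
make_jdl() {
  nodes=$1
  input=$2

  if [ ! -f mpi-nwchem.sh ] || [ ! -f "$input" ]; then
    echo "mpi-nwchem.sh and $input must be in the current directory" >&2
    return 1
  fi

  # same attributes as the example JDL; only the two values are substituted
  cat <<EOF
[
  Type = "Job";
  JobType = "MPICH";
  NodeNumber = $nodes;
  Executable = "mpi-nwchem.sh";
  Arguments = "$input";
  StdOutput = "stdout";
  StdError = "stderr";
  InputSandbox = {"mpi-nwchem.sh", "$input"};
  OutputSandbox = {"stderr","stdout","mpiexec.out"};
]
EOF
}
```

For instance, make_jdl 8 S2-cpmd.nw > nwchem.jdl writes a description for an 8-CPU run, which can then be submitted with the job submission command available on your UI (for example glite-wms-job-submit -a nwchem.jdl, assuming the gLite WMS command-line tools are installed there).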
