SG Using file replicas and RFIO: UI configuration, rfiod, usage in apps, limitations and workarounds
From EGEE-see WIki
Configuring UI
GIMs who want their users to be able to use LFC from UI nodes need to run the following on the UI machines as root:
echo "export LFC_HOST=grid02.rcub.bg.ac.yu" > /etc/profile.d/seegrid.sh
echo "export LCG_CATALOG_TYPE=lfc" >> /etc/profile.d/seegrid.sh
chown root:root /etc/profile.d/seegrid.sh
chmod 755 /etc/profile.d/seegrid.sh
echo "setenv LFC_HOST grid02.rcub.bg.ac.yu" > /etc/profile.d/seegrid.csh
echo "setenv LCG_CATALOG_TYPE lfc" >> /etc/profile.d/seegrid.csh
chown root:root /etc/profile.d/seegrid.csh
chmod 755 /etc/profile.d/seegrid.csh
Otherwise, the users would need to set
export LFC_HOST=grid02.rcub.bg.ac.yu
export LCG_CATALOG_TYPE=lfc
in their own environments. The files /etc/profile.d/seegrid.sh and /etc/profile.d/seegrid.csh are a good location for environment settings related to the SEEGRID VO, since they are not affected by YAIM reconfiguration.
The setup described above is probably the best approach for the moment. A better, but currently unfeasible, alternative is to rely on the BDII instead of LFC_HOST. Since the information about our LFC site is correctly published on both the ui.ulakbim.gov.tr:2170 and grid.phy.bg.ac.yu:2170 BDIIs, lcg-* commands will work without LFC_HOST if your LCG_GFAL_INFOSYS points to one of these BDIIs. You may check this with "lcg-infosites --vo seegrid lfc", using the new lcg-infosites provided by Min yesterday. However, LFC_HOST must still be set for lfc-* commands, which is another bug we have just noticed.
I suggest that administrators leave WNs as they are and let application writers provide LFC_HOST and LCG_CATALOG_TYPE in the JDL or in job wrapper scripts. In addition, for GFAL to work from applications, LCG_GFAL_VO should also be set to "seegrid". This approach does not interfere with the WNs being used by other VOs.
With all this in place, users and applications will have access to lcg-* and lfc-* commands, and applications will be able to use RFIO.
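The job wrapper approach suggested above for WNs could look roughly like the following sketch. The function name setup_seegrid_env and the commented-out application call are hypothetical. Keeping any value that is already set in the environment is a design assumption, not something the middleware requires.

```shell
#!/bin/sh
# Sketch of a job wrapper preamble: provide the catalog settings the job
# needs without touching the WN configuration shared with other VOs.
setup_seegrid_env() {
    # Keep a value that is already set; otherwise use the SEEGRID defaults.
    : "${LFC_HOST:=grid02.rcub.bg.ac.yu}"
    : "${LCG_CATALOG_TYPE:=lfc}"
    : "${LCG_GFAL_VO:=seegrid}"
    export LFC_HOST LCG_CATALOG_TYPE LCG_GFAL_VO
}

setup_seegrid_env
# The real application would be started here, for example:
# exec ./my_application "$@"    # hypothetical executable name
```

Since the variables are exported only inside the job's own shell, jobs of other VOs running on the same WN are not affected.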
Configuring SE
On a classic SE, some sites have a problem with rfiod. It has been fixed in recent CASTOR RPMs, so we hope it will become less frequent. Site administrators should also check whether the RFIO daemon is running on their SE nodes with
ps -d | grep rfiod
If it is not running, they need to modify line 45 of "/etc/init.d/rfiod", replacing:
RFIOD=/usr/local/bin/rfiod
with:
RFIOD=/usr/bin/rfiod
Another simple solution is to create a symbolic link:
ln -s /usr/bin/rfiod /usr/local/bin/rfiod
After the intervention, start the service with:
service rfiod start
This is described at
- https://savannah.cern.ch/bugs/?func=detailitem&item_id=7999
- https://savannah.cern.ch/bugs/?func=detailitem&item_id=6396
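Because the line number in the init script may differ between package versions, it can be easier to look the setting up by name. The following is only a sketch: the function name rfiod_configured_path is hypothetical, and it assumes the assignment appears as RFIOD=... at the start of a line, as in the lines quoted above.

```shell
# Print the path assigned to RFIOD in an init script.
# $1: init script to inspect (defaults to /etc/init.d/rfiod).
rfiod_configured_path() {
    sed -n 's/^[[:space:]]*RFIOD=//p' "${1:-/etc/init.d/rfiod}" | head -1
}

# Report a mismatch only on machines that actually have the init script.
if [ -f /etc/init.d/rfiod ]; then
    path=$(rfiod_configured_path)
    if [ -x "$path" ]; then
        echo "rfiod binary found at $path"
    else
        echo "no executable at '$path'; fix RFIOD= or add a symlink"
    fi
fi
```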
Testing your site for GFAL
If an application requires GFAL/RFIO it is crucial that your site has rfiod up and running. To test RFIO/GFAL access from your WNs, follow this recipe:
$ cat > gfal-test.jdl
Executable="/opt/lcg/bin/gfal_testread";
StdOutput="out";
StdError="err";
Arguments="<sfn-SURL-within-close-SE>";
OutputSandbox = {"out", "err"};
VirtualOrganisation = "seegrid";
RetryCount = 0;
<Ctrl-D>
$ edg-job-submit -r <Target-CE> gfal-test.jdl
$ edg-job-get-output ...
$ cat /tmp/jobOutput/.../out
opening sfn:...
open successful, fd = 3
read successful
close successful
$ cat /tmp/jobOutput/.../err
compare failed at offset 0
Otherwise, the output may look like
$ cat /tmp/jobOutput/.../out
opening sfn:...
$ cat /tmp/jobOutput/.../err
gfal_open: Connection reset by peer
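When testing several CEs, the retrieved "out" files can be checked automatically instead of by eye. The sketch below assumes that a working site produces the three "successful" lines shown in the first listing. The function name gfal_test_ok is hypothetical.

```shell
# Succeed only if the gfal_testread output reports a successful
# open, read and close.
# $1: path to the retrieved "out" file.
gfal_test_ok() {
    grep -q '^open successful' "$1" &&
        grep -q '^read successful' "$1" &&
        grep -q '^close successful' "$1"
}

# Usage (the path is whatever edg-job-get-output reported):
#   gfal_test_ok /tmp/jobOutput/.../out && echo "RFIO/GFAL access works"
```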
Configuring RB
LCG 2.4 does not support the InputData JDL parameter with LFC due to a bug in the workload manager code. With LCG 2.5 and later, /opt/edg/etc/edg_wl.conf on SEEGRID RBs needs to be updated by adding DLICatalog to the NetworkServer section:
NetworkServer = [
...
DLICatalog = {"seegrid"};
...
]
LFC_HOST and LCG_CATALOG_TYPE do not have to be set on RBs, since the LFC host is retrieved from the IS. Its status may be checked by
$ lcg-infosites --vo seegrid lfc
grid02.rcub.bg.ac.yu
After editing the file, the NS and WM services must be restarted:
$ /etc/init.d/edg-wl-ns restart
$ /etc/init.d/edg-wl-wm restart
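Before restarting the services, it may help to confirm that the edit is really in place. The following sketch only looks for a DLICatalog line that mentions the VO. It does not verify that the line sits inside the NetworkServer section. The function name has_dli_catalog is hypothetical.

```shell
# Succeed if the configuration file has a DLICatalog entry listing the VO.
# $1: VO name; $2: config file (defaults to /opt/edg/etc/edg_wl.conf).
has_dli_catalog() {
    grep -E "^[[:space:]]*DLICatalog[[:space:]]*=.*\"$1\"" \
        "${2:-/opt/edg/etc/edg_wl.conf}" >/dev/null
}

# Usage on an RB:
#   has_dli_catalog seegrid || echo "DLICatalog for seegrid is missing"
```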
Dealing with double-slash problem in jobs and GFAL
GFAL not dealing with RFIO double-slash in classic SE's SURLs
Users and developers who want to use logical file names in their applications need to use the following GFAL workarounds. If the application takes its file names only at the start of execution, a storage URL can be computed in the job script with
SURL=`lcg-lr --vo seegrid $LFN | \
grep "^sfn://$VO_SEEGRID_DEFAULT_SE/" | head -1 | \
sed \
"s/sfn:[/][/]$VO_SEEGRID_DEFAULT_SE[/]/sfn:\/\/$VO_SEEGRID_DEFAULT_SE\/\//"`
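To see what this pipeline does without access to a catalog, the selection and rewrite steps can be tried on a fixed replica list. This is only an illustration: the function name close_se_surl, the host names and the paths are made up, and the printf call stands in for the output of lcg-lr. As in the original command, dots in the host name are treated as regular expression wildcards.

```shell
# Pick the first replica on the given SE and double the slash after the host.
# $1: SE host name; replica SURLs are read from stdin, one per line.
close_se_surl() {
    se=$1
    grep "^sfn://$se/" | head -1 | sed "s|^sfn://$se/|sfn://$se//|"
}

# Hypothetical replica list standing in for `lcg-lr --vo seegrid $LFN`.
printf '%s\n' \
    'sfn://se.other.example/storage/seegrid/file1' \
    'sfn://se.example.org/storage/seegrid/file1' |
    close_se_surl se.example.org
# prints: sfn://se.example.org//storage/seegrid/file1
```

Using "|" as the sed delimiter avoids the escaped slashes of the one-line version. The result is the same.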
For applications that dynamically access files, the C++ snippet looks like:
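#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>   // getenv, free
#include <string.h>   // strlen, strcpy, strcat, strncmp
extern "C"{
#include "lcg_util.h"
}
// Replaces the LFN in filename (allocated with new[]) with the SURL of its
// replica on the close SE, doubling the slash after the host name.
// Returns true if such a replica was found.
bool lfn2surl(char* &filename) {
  bool success = false;
  char ** pfns = NULL;
  char vo[] = "seegrid";
  /* The lcg_lr call itself */
  if(lcg_lr(filename, vo, NULL, 0, &pfns)!=0){
    perror("Error with lcg_lr!");
    return false;
  }
  char * closeSE=getenv("VO_SEEGRID_DEFAULT_SE");
  char prefix[200] = "";
  // "sfn://" + host + "/" + terminating NUL must fit into prefix
  bool havePrefix = closeSE!=NULL && strlen(closeSE)+8 <= sizeof(prefix);
  if (havePrefix) {
    strcpy(prefix, "sfn://");
    strcat(prefix, closeSE);
    strcat(prefix, "/");
  }
  size_t len = strlen(prefix);
  for (char **pfnsIter=pfns; pfnsIter!=NULL && *pfnsIter!=NULL; pfnsIter++) {
    // Use the first replica on the close SE; later matches are only freed
    if (havePrefix && !success && strncmp(prefix, *pfnsIter, len) == 0) {
      delete []filename;
      filename = new char[strlen(*pfnsIter)+2];
      strcpy(filename,prefix);
      // -1 duplicates existing slash character
      strcat(filename,(*pfnsIter)+(len-1));
      success = true;
    }
    free(*pfnsIter);
  }
  free(pfns);
  return success;
}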
In both cases, the resulting $SURL and "char* filename" can be opened with gfal_open.
A shorter version of this text was posted to the Savannah portal.
Java access to LFC and LCG UTILS
Our Java JNI wrapper around LFC and LCG UTILS C APIs is described at
SEE-GRID File Management Java API
It is available at
http://grid02.rcub.bg.ac.yu/LFCJavaAPI/index.html
Savannah follow-ups of the above issues:
Using LFC and RFIO on UI, SE, RB, WN; LCG 2.4 limitations and workarounds
LFC_HOST required for lfc-*commands, even when LFC is published in BDII
