SG Job submission using Java WMProxy

From EGEE-see WIki


This guide is a part of SEE-GRID Gridification Guide



How-to for submission of jobs through the WMProxy service from Java

This short tutorial describes how to submit jobs to a WMProxy server through calls to the Java API. It is by no means an exhaustive tutorial on the WMProxy service, nor is it detailed documentation of the WMProxy Java API or of the versions of this API for other languages. Its aim is to give a short and comprehensible overview of what is needed to submit a job to the WMProxy service with the Java API. It takes a practical approach, in the form of short explanations and examples covering the whole process. Where appropriate, it refers to more detailed documents. An additional example on job submission and output retrieval can be seen here.

Requirements

First and foremost, a WMProxy server has to be properly installed and operational. This is beyond the scope of this guide, however, and for the purposes of this tutorial we assume that the server is up and running and that the service is available at a known URL.

Also, this guide assumes that there is an already installed and configured UI machine with all the appropriate rpms.

In order to use the Java API for submitting jobs through WMProxy, the WMProxy client API has to be installed. The WMProxy client API consists of one jar file, but it has several dependencies. The jar files needed for the API to work, including the jar file with the API itself, are listed below.

Install the needed RPMs for the following packages:

    Axis (http://ws.apache.org/axis/) 

from which the following files are needed:

    axis-ant.jar
    axis.jar
    jaxrpc.jar
    commons-discovery-0.2.jar
    commons-logging-1.0.4.jar
    log4j-1.2.8.jar
    log4j.properties
    saaj.jar
    wsdl4j-1.5.1.jar

and:

    BouncyCastle (http://www.bouncycastle.org/)

where the needed jar is:

    bcprov-jdk14-122.jar


Apache Axis libraries:

  • axis-1.4-2006-04-22.jar
  • axis2-1.0-2006-05-05.jar
  • axis-jaxrpc-1.2-2005-05-03.jar
  • axis-wsdl4j-1.2-2005-04-12.jar

Geronimo J2EE libraries:

  • geronimo-activation_1.0.2_spec-1.1-2006-06-14.jar
  • geronimo-j2ee_1.4_spec-1.1-2006-06-14.jar

Crypto libraries:

  • cryptix-asn1.jar
  • cryptix32.jar
  • bcprov.jar
  • puretls.jar

GLITE libraries:

  • glite-jdl-api-java.jar
  • glite-wms-wmproxy-api-java.jar
  • glite-security-delegation-java.jar
  • glite-security-trustmanager.jar
  • glite-security-util-java.jar

Globus and other libraries:

  • cog-jglobus.jar
  • classad.jar
  • commons-discovery-0.2-2004-03-23.jar
  • commons-logging-api-1.1.jar
  • log4j-1.2.14.jar

Most of these libraries are standard, but we had problems with some of them when trying different versions and distributions. For that reason we provide a link to a [zip] archive containing all of them, in versions which are guaranteed to work.

So make sure you have all of these jars in your classpath on your UI machine.
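A quick way to confirm that the classpath is set up is to try loading one class from each of the key libraries before running any real code. The following is a minimal sketch of such a check. It only uses the class names mentioned in this guide (WMProxyAPI, UrlCopy and GlobusURL); the class name ClasspathCheck and the method missing are our own illustrative names, and the list can be extended with classes from the other jars.

```java
import java.util.ArrayList;
import java.util.List;

public class ClasspathCheck {
    // Class names taken from this guide; extend the list as needed.
    static final String[] REQUIRED = {
        "org.glite.wms.wmproxy.WMProxyAPI",
        "org.globus.io.urlcopy.UrlCopy",
        "org.globus.util.GlobusURL"
    };

    // Returns the names of the classes that could not be loaded.
    static List<String> missing(String[] classNames) {
        List<String> result = new ArrayList<>();
        for (String name : classNames) {
            try {
                // 'false' means: locate and load the class, but do not initialize it.
                Class.forName(name, false, ClasspathCheck.class.getClassLoader());
            } catch (ClassNotFoundException | LinkageError e) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> missing = missing(REQUIRED);
        if (missing.isEmpty()) {
            System.out.println("All checked classes were found.");
        } else {
            for (String name : missing) {
                System.out.println("Missing from classpath: " + name);
            }
        }
    }
}
```

Run it with the same classpath you intend to use for job submission. A class reported as missing usually means that the corresponding jar is absent or that one of its own dependencies could not be resolved.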

Next, you should acquire a proxy certificate. This can be done either from the command line interface of the UI machine (with voms-proxy-init) or through API calls. Acquiring it from Java will be described in another guide.

The whole process of submitting a job with WMProxy can be divided into several stages:

  • making a connection with the WMProxy server
  • delegating credentials
  • submitting the job
  • retrieving the output of the job
  • miscellaneous information functions

Connecting to WMProxy server

The most important class is org.glite.wms.wmproxy.WMProxyAPI. All calls to the WMProxy service are made through an instance of this class. All classes involved in the process of submission and retrieval of jobs are in the package org.glite.wms.wmproxy, so from now on the short class names will be used instead of the fully qualified names.

First an instance of the class WMProxyAPI must be made. The constructor for this class is:

 WMProxyAPI(String url, String proxyFile, String certsPath)

The first argument is compulsory and designates the URL at which the WMProxy service is available. It is typically something like this: https://wms.ipb.ac.rs:7443/glite_wms_wmproxy_server (this is the WMProxy server for the SEE-GRID VO).

The second argument specifies the location of the proxy certificate file. This should be an absolute path. If this argument is null, then the proxy certificate is looked for in the standard location, /tmp/x509up_u<uid>.

The third argument specifies the location (directory) in which to look for the CA (Certificate Authority) certificates. Again, this argument can be null, in which case the default location, /etc/grid-security/certificates/, is searched. When the WMProxyAPI object is created, it automatically establishes a connection with the WMProxy service. All other requests to the WMProxy service are made through this object.
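Before passing null for the second and third arguments, it can be useful to check that the default locations actually exist on the UI machine. The sketch below builds the default proxy path from a numeric user id and tests whether the file is readable. The class and method names are illustrative only, and the user id is taken as a command line argument (for example the output of id -u) because standard Java offers no portable call for it.

```java
import java.io.File;

public class ProxyDefaults {
    // Default CA certificate directory, as stated in this guide.
    static final String DEFAULT_CERTS_DIR = "/etc/grid-security/certificates/";

    // Default proxy location for a given numeric user id: /tmp/x509up_u<uid>
    static String defaultProxyPath(long uid) {
        return "/tmp/x509up_u" + uid;
    }

    // True if the path names an existing regular file that we can read.
    static boolean isReadableFile(String path) {
        File f = new File(path);
        return f.isFile() && f.canRead();
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java ProxyDefaults <numeric uid>");
            return;
        }
        String proxy = defaultProxyPath(Long.parseLong(args[0]));
        System.out.println(proxy
                + (isReadableFile(proxy) ? " is readable" : " is missing or unreadable"));
        System.out.println(DEFAULT_CERTS_DIR
                + (new File(DEFAULT_CERTS_DIR).isDirectory() ? " exists" : " does not exist"));
    }
}
```

If the proxy file is reported as missing, create it first (for example with voms-proxy-init) or pass its real location explicitly as the second constructor argument.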

Delegating credentials

This is done with the following two calls on a WMProxyAPI object, after choosing a delegation identifier:

 String delegationId = <any string>;
 String proxy = api.grstGetProxyReq(delegationId);
 api.grstPutProxy(delegationId, proxy);

where <any string> can be any string, and api is a WMProxyAPI object.
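Since the delegation identifier can be any string, one simple option is to generate a fresh one for every run. The following sketch does this with a random UUID. The prefix and the class name are our own choices for illustration; nothing in the API requires this format.

```java
import java.util.UUID;

public class DelegationIds {
    // Builds an identifier such as "seegrid-demo-<random UUID>".
    static String newDelegationId(String prefix) {
        return prefix + "-" + UUID.randomUUID();
    }

    public static void main(String[] args) {
        // The printed value can be used as delegationId in the calls above.
        System.out.println(newDelegationId("seegrid-demo"));
    }
}
```

A readable prefix makes it easier to recognise your own delegations, while the random part keeps the identifiers of separate runs distinct.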

Submitting the job

The next step would be, logically, to submit the job. There are two cases here.

1. The job does not have an InputSandbox, or all of the InputSandbox files are available through GridFTP. In this case no files need to be copied from the UI to the WMS machine, and the job is submitted with the call to

 api.jobSubmit(String jdl, String delegationId)

where api is a WMProxyAPI object, jdl is a string containing the jdl definition of the job (the content of the jdl file), and delegationId is the string acquired as described in the previous section.
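Because the jdl argument is simply the content of the JDL file, the file has to be read into a string first. A minimal sketch of this, using only the standard library, is given below. The class name JdlLoader and the method readJdl are illustrative, and UTF-8 is assumed as the file encoding.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class JdlLoader {
    // Reads the whole JDL file into one string (UTF-8 assumed).
    static String readJdl(Path jdlFile) {
        try {
            return new String(Files.readAllBytes(jdlFile), StandardCharsets.UTF_8);
        } catch (IOException e) {
            // Rethrow unchecked so that callers are not forced to declare IOException.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java JdlLoader <file.jdl>");
            return;
        }
        // The resulting string is what would be passed as the jdl argument.
        System.out.println(readJdl(Paths.get(args[0])));
    }
}
```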

2. The job has at least one file in the InputSandbox which is not a Grid file, i.e. a file which exists locally on the UI machine and has to be transferred to the WMS. In this case the submission of the job is further split into these phases:

  • Register the job
  • Transfer the files manually
  • Start the job.

First, the job is registered with a call to the jobRegister method of WMProxyAPI.

 JobIdStructType jobIdStruct = api.jobRegister(jdlString, delegationId);

where api is a WMProxyAPI object, jdlString is the string with the jdl definition of the job, and delegationId is as described in the previous section.

The result of the method call is an object of type JobIdStructType.

Next, the local files have to be transferred to the WMS. To do this, we need two things: first, we have to know the destination URL of each file on the WMS machine; second, we have to copy the file from the local computer to that destination URL. To get the destination URLs of the files, the following call is made:

 StringList stringList = api.getSandboxDestURI(jobId, protocol);

where api is the WMProxyAPI object and jobId is a string with the job identifier. The identifier is acquired from the jobIdStruct object with the getId() method. The protocol argument is a string naming the protocol with which you want to transfer the files. It can be gsiftp or https. If the WMProxy server is version 2.2.0 or greater, then the URLs returned use the gsiftp or the https protocol, depending on the value of this argument. If the version of the WMProxy server is less than 2.2.0, then the protocol argument has no effect, and all possible URLs (with all supported protocols) of all the files are returned in the StringList return object.

After executing

 String[] stringArray = stringList.getItem();

the stringArray object holds an array of the URLs of the files that need to be transferred to the WMS, where the URLs are represented as strings. If the WMProxy version is greater than or equal to 2.2.0, then for every file in the input sandbox there will be one URL with the desired protocol, and the URLs of the files will be in the same order as the files in the JDL description. If the WMProxy version is less than 2.2.0, then for every file there will be one or more possible URLs, one for each supported protocol. All URLs for the same file are adjacent in the array.
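With an older server that returns the URLs for all protocols, the array has to be reduced to the URLs of the protocol you actually want to use. The sketch below keeps only the entries whose scheme matches, preserving their order. The class name, the method name and the sample URLs (host wms.example.org and the paths) are made up for illustration and do not come from a real server.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class SandboxUrlFilter {
    // Keeps only the URLs that start with "<protocol>://", in their original order.
    static List<String> withProtocol(String[] urls, String protocol) {
        String prefix = protocol.toLowerCase(Locale.ROOT) + "://";
        List<String> selected = new ArrayList<>();
        for (String url : urls) {
            if (url.toLowerCase(Locale.ROOT).startsWith(prefix)) {
                selected.add(url);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Hypothetical content of stringArray as an old server might return it.
        String[] stringArray = {
            "gsiftp://wms.example.org:2811/sandbox/job1/input/a.txt",
            "https://wms.example.org:7443/sandbox/job1/input/a.txt",
            "gsiftp://wms.example.org:2811/sandbox/job1/input/b.txt",
            "https://wms.example.org:7443/sandbox/job1/input/b.txt"
        };
        for (String url : withProtocol(stringArray, "gsiftp")) {
            System.out.println(url);
        }
    }
}
```

With a server of version 2.2.0 or greater the same filter is harmless, since all returned URLs already use the requested protocol.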

Now all the files have to be copied one by one. One way to do it is with the org.globus.io.urlcopy.UrlCopy class. Let’s assume that the local path of the file is in the string variable fromURL (the same string as in the JDL file), and that the remote path, i.e. the destination on the WMS, is in the string variable toURL. The copying is performed with the following calls:

 GlobusURL from = new GlobusURL(fromURL);
 GlobusURL to = new GlobusURL(toURL);
 UrlCopy uCopy = new UrlCopy();
 uCopy.setDestinationUrl(to);
 uCopy.setSourceUrl(from);
 uCopy.setUseThirdPartyCopy(true);
 uCopy.copy();

where the GlobusURL class is from the org.globus.util package.

This segment of code should be repeated for every local file in the InputSandbox.

Now the job can be started with a call to the jobStart method of the WMProxyAPI class:

 api.jobStart(jobId)

where api is the WMProxyAPI object and jobId is the job identifier.

Retrieving the output

Once the job has finished, the output can be retrieved. This is a two-stage process: first, all the output files have to be copied to the client machine; then the purge method is called.

The process of copying the output files from the WMS to the client machine is similar to the process of copying the input files, only in the opposite direction. First you acquire the URLs of the files on the WMS machine, and then you copy them to the local machine using UrlCopy. The following code segment shows how to acquire the URLs:

 StringAndLongList result = api.getOutputFileList(jobId, protocol);
 StringAndLongType[] list = result.getFile();

where api is the WMProxyAPI instance, jobId is the job identifier, and protocol is a string which can have the value gsiftp, https or all (the value is ignored in versions less than 2.2.0). With this code the variable list holds a reference to an array of type StringAndLongType, where every element of the array corresponds to one URL of a file. There is one URL per file if a specific protocol is used and the version is greater than or equal to 2.2.0. There can be several URLs per file (one for each supported protocol) if the all protocol is used or the version of the WMProxy server is less than 2.2.0.

 StringAndLongType file = list[0];  // any one element of the previous array
 String name = file.getName();      // the name (URL) of the file

Now that we know the file's URL, we can copy it from the WMS to the local machine in the same way as we did with the input files:

 GlobusURL from = new GlobusURL(fromURL);
 GlobusURL to = new GlobusURL(toURL);
 UrlCopy uCopy = new UrlCopy();
 uCopy.setDestinationUrl(to);
 uCopy.setSourceUrl(from);
 uCopy.setUseThirdPartyCopy(true);
 uCopy.copy(); 

assuming that the fromURL is a string variable containing the URL acquired with the previously described procedure, and toURL is a local path on the UI machine. This process should be repeated for every output file.
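When there are many output files, the local target of each copy can be derived from the remote URL instead of being typed by hand. The following sketch takes the last path segment of the remote URL as the file name and places it in a local directory. The class and method names and the sample URL are illustrative. The sketch returns the target as a file:// URL, on the assumption that the copy classes expect a URL rather than a bare path; this should be checked against the Globus library version you use.

```java
import java.net.URI;

public class OutputTargets {
    // Last path segment of the remote URL, e.g. "std.out".
    // Returns an empty string if the URL path ends with '/'.
    static String fileName(String remoteUrl) {
        String path = URI.create(remoteUrl).getPath();
        return path.substring(path.lastIndexOf('/') + 1);
    }

    // Local target as a file:// URL; localDir is assumed to be an absolute path.
    static String localFileUrl(String localDir, String remoteUrl) {
        String dir = localDir.endsWith("/") ? localDir : localDir + "/";
        return "file://" + dir + fileName(remoteUrl);
    }

    public static void main(String[] args) {
        // Hypothetical URL of one output file on the WMS.
        String fromURL = "gsiftp://wms.example.org:2811/sandbox/job1/output/std.out";
        String toURL = localFileUrl("/tmp/job1-output", fromURL);
        System.out.println(fromURL + " -> " + toURL);
    }
}
```

The two strings can then be used as fromURL and toURL in the copy calls above, once for every element of the list array.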

After all the output files are copied to the local machine, the call to purge the job is made:

 api.jobPurge(jobId)

where api is the WMProxyAPI instance and jobId is the string identifier of the job.

Miscellaneous information functions

The previous sections described the most important and most often used methods of the WMProxyAPI class. There are many other methods, most of them for querying information. The most often used of them are:

 // Gets the version number of the WMProxy service
 public String getVersion()
 // Returns the transfer protocols that the server makes available for file transfer operations
 public StringList getTransferProtocols()
 // Returns information about the proxy used to submit the job identified by jobId
 public ProxyInfoStructType getJobProxyInfo(String jobId)
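The result of getVersion() can be used to decide whether the protocol argument described earlier is honoured by the server, i.e. whether the version is at least 2.2.0. The sketch below compares dotted version strings numerically. It assumes that the server reports a purely numeric dotted version such as 2.2.0; a string in another format would make it throw a NumberFormatException. The class and method names are illustrative.

```java
public class VersionCheck {
    // Compares two dotted numeric versions; missing components count as 0.
    // Returns a negative, zero or positive value, like Comparable.compareTo.
    static int compare(String a, String b) {
        String[] pa = a.trim().split("\\.");
        String[] pb = b.trim().split("\\.");
        int n = Math.max(pa.length, pb.length);
        for (int i = 0; i < n; i++) {
            int x = i < pa.length ? Integer.parseInt(pa[i]) : 0;
            int y = i < pb.length ? Integer.parseInt(pb[i]) : 0;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return 0;
    }

    // True if the server is new enough for the protocol argument to take effect.
    static boolean honoursProtocolArgument(String serverVersion) {
        return compare(serverVersion, "2.2.0") >= 0;
    }

    public static void main(String[] args) {
        // In real code the string would come from api.getVersion().
        String version = args.length > 0 ? args[0] : "2.2.0";
        System.out.println(version + " honours protocol argument: "
                + honoursProtocolArgument(version));
    }
}
```

Comparing the components as numbers rather than as text matters for versions such as 2.10.0, which a plain string comparison would wrongly place before 2.2.0.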