Work Binder Application Service
This Wiki page is a part of SEE-GRID Gridification Guide. It is contributed by Belgrade University Computer Centre.
Introduction
Work Binder is a generic service developed in Java whose main purpose is to quickly allocate new jobs for Grid users. It is based on the TCP binder software developed as part of the VIVE (Volumetric Image Visualization Environment) application, which was gridified during the SEE-GRID project. An older discussion of pilot job infrastructures, from which the Work Binder project evolved, can be found on the Grid Interactivity, Pilot Jobs, and Job Pooling page. Its main targets are three types of Grid jobs:
- Interactive jobs - jobs executing on a remote WN (Worker Node) that require communication (interaction) with the user that started them (in gLite support for these types of jobs is very poor).
- Jobs with relatively short execution time, that have a good chance of being resubmitted.
- Jobs with critical demand for startup time.
One thing all of these job types have in common is the need for a short startup time. Work Binder is designed to enable almost instant allocation of jobs to users. In effect, Work Binder acts as a mediator between the client and server components of an application in the Grid environment.
Its key features include:
- Java API that provides a quick and simple way to integrate new applications into the Binder and Grid environment.
- Almost instant allocation of new jobs for users.
- Support for any number of different applications.
- Dynamic allocation of new jobs.
- Work Binder hides complex Grid infrastructure from the user.
System Design
The Work Binder consists of software components distributed in three tiers:
- Various application-specific client programs that interact with end user.
- Application-specific server programs running on the Grid (Worker Nodes). Each execution of a server program is called a worker and runs within a worker job.
- Generic Work Binder providing job management and mediation between the above two.
The Work Binder submits worker jobs to the Grid, maintains a pool of ready jobs, and mediates between clients and workers. It pairs clients and worker jobs that have connected to it. A small dispatcher program started by the worker job then starts an application-dependent Java or C++ class or executable file installed on the targeted Grid site. Further communication between the client and the worker uses one of three possible methods, chosen by the client:
- Via binder - All communication will go through the binder with the option to have a specific code on the binder that will process the data transferred between the client and the worker.
- Built-in direct communication - The worker connects directly to the client after the client has been matched to the worker and they continue their communication without going through the binder (network performance will probably be better, however not all clients are able to accept incoming connections).
- Custom communication - Described with arbitrary, application specific options, it is up to the client and the worker to implement it.
The pool of ready worker jobs is maintained using algorithms that aim to always keep enough worker jobs available for incoming clients while minimizing the stress on the Grid infrastructure (the pool changes its size according to user demand). Having one pool of jobs for many different applications is an efficient way of conserving precious Grid resources. When a client connects to the Work Binder and asks for a worker job, it acquires one almost instantly if one is available. If all jobs in the pool are in use, the client has the option to try again later.
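The pooling behaviour described above can be illustrated with a minimal sketch. The class name ReadyJobPool, its methods, and the grow/shrink rule are assumptions made for illustration only; the actual Work Binder pool algorithms are not described in this guide.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Optional;

// Illustrative sketch only: one pool of ready worker jobs shared by all clients.
// ReadyJobPool and its sizing rule are hypothetical, not the Work Binder code.
class ReadyJobPool {
    private final Deque<String> ready = new ArrayDeque<>();
    private final int minSize;
    private final int maxSize;
    private int targetSize;

    ReadyJobPool(int minSize, int maxSize) {
        this.minSize = minSize;
        this.maxSize = maxSize;
        this.targetSize = minSize;
    }

    // Called when a submitted worker job has started and connected to the binder.
    synchronized void workerReady(String workerId) {
        ready.addLast(workerId);
    }

    // A client asks for a worker: it gets one at once if available,
    // otherwise an empty result, and may try again later.
    synchronized Optional<String> acquire() {
        String worker = ready.pollFirst();
        if (worker == null) {
            // Demand exceeded supply: raise the target size, up to the cap.
            targetSize = Math.min(maxSize, targetSize + 1);
            return Optional.empty();
        }
        return Optional.of(worker);
    }

    // Called periodically while there is no demand: shrink towards the minimum.
    synchronized void idleTick() {
        targetSize = Math.max(minSize, targetSize - 1);
    }

    // Number of new worker jobs that should be submitted to the Grid now.
    synchronized int jobsToSubmit() {
        return Math.max(0, targetSize - ready.size());
    }
}
```

In this sketch a failed request is the only signal of rising demand; a real pool would likely also account for jobs that are submitted but not yet running.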
More details about Work Binder can be found at:
Binder Java API
The Work Binder Java API allows application users to easily integrate Work Binder with their existing applications. This integration can be achieved by using the appropriate API wrapper interfaces. Wrapper interfaces exist in three levels, and their purpose is to hide complex internal Binder protocols from application users and provide them with simple interfaces and methods so that they can focus on their own application specific problems and not bother with Grid and Binder infrastructure related issues. The API is designed in such a way, that the applications that already have some sort of client-server design can be easily adapted to use it (even if they were not gridified previously).
Security
In order to connect to the Work Binder, application users are required to authenticate themselves using their own grid credentials. Binder supports two kinds of credentials:
- PKCS#12 certificate bundle containing the user's password-protected grid certificate and private key, similar to the one used to import credentials into a web browser or an e-mail client. This way of authentication is provided in order to simplify usage of the Work Binder and allow clients to exist on non-grid machines. However, since the user's private key is not transmitted over the network, workers cannot allocate new jobs on the user's behalf (a worker also needs to authenticate to the binder in order to allocate a new job). Users can, of course, create as many jobs from their clients as the infrastructure allows. If they want to create new jobs from their workers, they need to implement their own way of delegating credentials.
- Standard grid proxy certificate, as commonly used on the grid and created by issuing the voms-proxy-init command. This proxy certificate contains the user's certificate and a short-lived, passwordless proxy key. Since the entire proxy certificate is transferred to the binder, credentials can be delegated to the workers, which allows the workers to allocate new jobs. The workers then identify themselves to the binder using the same proxy credentials as the user. Users who do not wish to run their clients from the UI can manually copy their proxy certificate from the grid UI machine to the personal computer where the client will run. The proxy certificate is usually located in /tmp/x509up_u<User_ID> on the UI, and its path is available in the X509_USER_PROXY environment variable.
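A client that needs to find the proxy file can follow the rule stated above: use X509_USER_PROXY when it is set, otherwise fall back to /tmp/x509up_u<User_ID>. The helper below is a hypothetical sketch of that rule, not part of the Binder API.

```java
// Sketch: locate the grid proxy file as described in the text.
// ProxyLocator is a hypothetical helper, not part of the Work Binder API.
class ProxyLocator {
    // envValue: value of X509_USER_PROXY (may be null or blank)
    // uid: numeric user ID of the user on the UI machine
    static String resolve(String envValue, long uid) {
        if (envValue != null && !envValue.trim().isEmpty()) {
            return envValue.trim();
        }
        // Usual default location of the proxy on the UI.
        return "/tmp/x509up_u" + uid;
    }
}
```

A caller would typically pass System.getenv("X509_USER_PROXY") as the first argument; the numeric user ID has to be supplied by the caller (for example from the output of id -u).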
Creating a PKCS#12 bundle is fairly easy. On the grid UI machine, the user needs to run the following command:
openssl pkcs12 -export -in ~/.globus/usercert.pem -inkey ~/.globus/userkey.pem -name "My Certificate" -out mycertificate.p12
And then copy it from the grid UI machine to the personal machine.
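Before handing the bundle to a client, it can be useful to check that it opens with the chosen password. The sketch below uses the standard java.security.KeyStore API for this; the Pkcs12Bundle helper class is an illustration and not part of the Binder API.

```java
import java.io.InputStream;
import java.security.KeyStore;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch: open a PKCS#12 bundle with the standard Java KeyStore API.
// Pkcs12Bundle is a hypothetical helper name.
class Pkcs12Bundle {
    // Loads the bundle; fails if the data is corrupt or the password is wrong.
    static KeyStore load(InputStream in, char[] password) {
        try {
            KeyStore ks = KeyStore.getInstance("PKCS12");
            ks.load(in, password);
            return ks;
        } catch (Exception e) {
            throw new IllegalStateException("Cannot open PKCS#12 bundle", e);
        }
    }

    // Lists the entry names (aliases) stored in the bundle.
    static List<String> aliases(KeyStore ks) {
        try {
            return new ArrayList<>(Collections.list(ks.aliases()));
        } catch (Exception e) {
            throw new IllegalStateException("Cannot list bundle entries", e);
        }
    }
}
```

A caller would pass a FileInputStream for mycertificate.p12 together with the export password, ideally read from a prompt rather than from a file.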
Authentication to the binder consists of two steps. The first is verification of the user's certificate; the second is verification that the user belongs to the appropriate application group on the VOMS server. This means that each user needs to be a member of certain VO (Virtual Organization) groups in order to use the applications that the binder supports.
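The two-step check can be sketched as follows. This is only a model of the decision logic: the class name, the one-group-per-application mapping, and the group name used in the usage note are assumptions, and real certificate validation and VOMS lookups are outside its scope.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Sketch of the two-step authorization decision (hypothetical class).
class BinderAuthorizer {
    // Application ID -> VOMS group required to use it (assumed one group per application).
    private final Map<String, String> requiredGroup = new HashMap<>();

    void allow(String applicationId, String vomsGroup) {
        requiredGroup.put(applicationId, vomsGroup);
    }

    boolean authorize(boolean certificateValid, Set<String> userGroups, String applicationId) {
        // Step 1: the user's certificate must have been verified.
        if (!certificateValid) {
            return false;
        }
        // Step 2: the user must belong to the group mapped to the application.
        String group = requiredGroup.get(applicationId);
        return group != null && userGroups.contains(group);
    }
}
```

For example, after allow("rrs-example", "/seegrid/rrs") (a made-up group name), a user with a valid certificate is accepted for rrs-example only if "/seegrid/rrs" is among the groups reported for that user.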
More on the binder authentication & security can be found here.
Client Level
On the client level, ClientConnector is the wrapper interface through which users connect to the Binder and obtain a Worker Job. By initializing the wrapper with a properties object, users can specify many options, including:
- BinderAddress - address of the Work Binder
- BinderPort - port on which the Work Binder is listening
- ApplicationID - unique name identifying which application supported by the binder the client will use
- CandidateCE - a list of candidate CEs where the job should be executed; if empty, any supporting and available CE will be used
- RequiredWallClockTime - how long the client will use the Worker Job
- AccessString - string describing the connection type:
  - "" (empty string) - via binder
  - "direct:<address>:<port>" - built-in direct communication
  - "custom:<arguments list>" - custom communication
- UseGridProxy - whether grid proxy or PKCS#12 bundle will be used
- X509_USER_PROXY - path to the grid proxy, read only if UseGridProxy == YES (if omitted, default path is used).
- Certificate - path to the PKCS#12 bundle, read only if UseGridProxy == NO
- CertificatePass - password to open the PKCS#12 bundle, read only if UseGridProxy == NO (NOTE: for security reasons it is not recommended to keep this password in a plain text file, it is best to prompt the user from the client each time)
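The options above might be assembled and interpreted as in the following sketch. The property keys are the ones listed above; the ClientOptions helper, its Mode enum, and the parsing rules are assumptions for illustration and are not the Binder's own code.

```java
import java.util.Properties;

// Sketch: build a client properties object and classify the AccessString.
// ClientOptions is a hypothetical helper; only the property keys come from this guide.
class ClientOptions {
    enum Mode { VIA_BINDER, DIRECT, CUSTOM }

    // Options for a client that authenticates with a grid proxy.
    static Properties proxyClient(String address, int port, String appId, String accessString) {
        Properties p = new Properties();
        p.setProperty("BinderAddress", address);
        p.setProperty("BinderPort", Integer.toString(port));
        p.setProperty("ApplicationID", appId);
        p.setProperty("AccessString", accessString);
        p.setProperty("UseGridProxy", "YES");
        return p;
    }

    // "" -> via binder, "direct:..." -> built-in direct, "custom:..." -> custom.
    static Mode mode(String accessString) {
        if (accessString == null || accessString.isEmpty()) return Mode.VIA_BINDER;
        if (accessString.startsWith("direct:")) return Mode.DIRECT;
        if (accessString.startsWith("custom:")) return Mode.CUSTOM;
        throw new IllegalArgumentException("Unknown AccessString: " + accessString);
    }

    // Address part of "direct:<address>:<port>".
    static String directAddress(String accessString) {
        return accessString.substring("direct:".length(), accessString.lastIndexOf(':'));
    }

    // Port part of "direct:<address>:<port>".
    static int directPort(String accessString) {
        return Integer.parseInt(accessString.substring(accessString.lastIndexOf(':') + 1));
    }
}
```

The resulting Properties object is what would be handed to the ClientConnector when it is initialized; how exactly that is done depends on the API version in use.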
Once a Worker Job has been obtained, the user can communicate with it through the ClientConnector interface methods or obtain a reference to the actual socket used to communicate with the Worker Job (this is not the preferred way of communication, but it is provided to make life easier for applications that already have some sort of client-server communication implemented).
Binder Level
The binder level is the intermediate level of communication between the client and the worker, and it takes place on the Work Binder itself. One purpose of this level is to let clients and workers reach each other in environments where firewalls or other network issues prevent direct communication (the Binder acts as a proxy server in this case). Another possible use is to add application logic at the binder level, so that some processing is performed on the Binder instead of on the remote Worker Job. The client chooses whether it wants to connect to the Worker Job via the Binder or directly (a direct connection can be described in detail). If direct communication is chosen, this entire level of communication is skipped.
If communication via the Binder is used, users have the option to implement application-specific handlers on the binder level. If there is no application-specific implementation on the Work Binder, a simple proxy implementation is used, in which the Work Binder acts as a proxy between the client and the Worker Job. Otherwise, the application-specific Binder handler is used. To provide such a handler, users have to implement the following simple interface:
public interface BinderHandler {
    public void run(BinderConnector binderConnector);
}
The actual implementation consists of implementing just the run method of the interface. Once the client and the worker have been matched by the Work Binder, this method is executed. Within this method, application users are given access to an instance of the BinderConnector interface, which is the wrapper interface at the binder level. Like the wrapper interface on the client level, it gives access to communication primitives towards the client on one side and the worker on the other, as well as direct access to the actual sockets used to communicate with the client and the worker (as with the ClientConnector, this is not a recommended way of communication).
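As an illustration of binder-level logic, the sketch below forwards data from the client to the worker and counts the bytes that pass through. BinderHandler is the interface shown above; the BinderConnector declared here is a reduced, hypothetical stand-in (the real interface's methods are not listed in this guide), and only one direction is relayed for brevity.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Interface as given in this guide (declared here so the sketch is self-contained).
interface BinderHandler {
    void run(BinderConnector binderConnector);
}

// Hypothetical stand-in: method names are assumptions, not the real API.
interface BinderConnector {
    InputStream fromClient();
    OutputStream toWorker();
}

// Relays client data to the worker and records how many bytes were forwarded.
class CountingRelayHandler implements BinderHandler {
    long forwarded;

    @Override
    public void run(BinderConnector connector) {
        forwarded = copy(connector.fromClient(), connector.toWorker());
    }

    // Copies everything from in to out until end of stream; returns the byte count.
    static long copy(InputStream in, OutputStream out) {
        byte[] buffer = new byte[4096];
        long total = 0;
        try {
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
                total += n;
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }
}
```

A full proxy would run a second copy loop in the opposite direction, typically on its own thread.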
Worker Level
Worker level interface provides communication from within a worker job. Application users must implement their custom handlers by implementing the following simple interface (similar to the binder level):
public interface WorkerHandler {
    public void run(WorkerConnector workerConnector);
}
By implementing the run method, users get access to an instance of the WorkerConnector interface, which is the wrapper interface on the worker level. As with the other two wrapper interfaces, it gives users access to communication primitives as well as low-level access to the actual socket used for communication.
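A worker-side handler in the spirit of an echo test might look like the sketch below. WorkerHandler is the interface shown above; the WorkerConnector declared here is a reduced, hypothetical stand-in that exposes only two streams, since the real interface's methods are not listed in this guide.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Interface as given in this guide (declared here so the sketch is self-contained).
interface WorkerHandler {
    void run(WorkerConnector workerConnector);
}

// Hypothetical stand-in: method names are assumptions, not the real API.
interface WorkerConnector {
    InputStream input();
    OutputStream output();
}

// Sends every received line back to the sender until the stream ends.
class EchoWorkerHandler implements WorkerHandler {
    @Override
    public void run(WorkerConnector connector) {
        try {
            BufferedReader reader = new BufferedReader(
                new InputStreamReader(connector.input(), StandardCharsets.UTF_8));
            Writer writer = new OutputStreamWriter(connector.output(), StandardCharsets.UTF_8);
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(line);
                writer.write('\n');
                writer.flush(); // reply immediately, the peer may be waiting
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```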
Application specific WorkerHandler implementations are executed on remote Grid sites, so they must be preinstalled and available to the Work Binder in order to be executed successfully. In order to specify handlers for each application, the configuration must contain option values in the following form:
- ApplicationName - name of the application that will be reported to the binder (<ApplicationID>)
- <ApplicationID>_Class - fully qualified class name of the WorkerHandler implementation class
- <ApplicationID>_Parameters - list of arbitrary parameters that will be used by the application
- <ApplicationID>_Libs - list of library files needed for the execution of the application specific WorkerHandler implementation
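Reading these options could be sketched with java.util.Properties as follows. The PluginConfig class is hypothetical, and splitting the library list on whitespace is an assumption; the option names follow the list above.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Properties;

// Sketch: resolve the per-application options from a properties source.
// PluginConfig is a hypothetical holder class, not part of the Binder API.
class PluginConfig {
    final String applicationId;
    final String handlerClass;
    final String parameters;
    final List<String> libs;

    private PluginConfig(String id, String cls, String params, List<String> libs) {
        this.applicationId = id;
        this.handlerClass = cls;
        this.parameters = params;
        this.libs = libs;
    }

    static PluginConfig read(Reader source) {
        Properties p = new Properties();
        try {
            p.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String id = p.getProperty("ApplicationName", "").trim();
        if (id.isEmpty()) {
            throw new IllegalArgumentException("ApplicationName is mandatory");
        }
        // The remaining keys are prefixed with the application ID.
        String cls = p.getProperty(id + "_Class", "").trim();
        if (cls.isEmpty()) {
            throw new IllegalArgumentException(id + "_Class is mandatory");
        }
        String params = p.getProperty(id + "_Parameters", "").trim();
        String rawLibs = p.getProperty(id + "_Libs", "").trim();
        // Assumption: library entries are separated by whitespace.
        List<String> libs = rawLibs.isEmpty()
            ? Collections.<String>emptyList()
            : Arrays.asList(rawLibs.split("\\s+"));
        return new PluginConfig(id, cls, params, libs);
    }
}
```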
Some of the features related to modular installation and management of applications on supporting grid sites are still under development.
If application users just want to execute some external program located on the WN, they can use built-in implementations instead of writing their own WorkerHandler. This makes it easy to integrate programs written in various programming languages, by executing application-defined programs or scripts and assisting in establishing communication with the client. Two built-in implementations for two distinct use cases are available:
- ExternalExecutor is a WorkerHandler implementation intended for applications that want to establish direct communication between the worker and the client, where the worker is an external program (specified by the "<ApplicationID>_Parameters" field in the WN configuration) located on the remote Worker Node. This implementation simply closes the connection to the binder and towards the client, and executes a command line string on the worker with arguments provided by the client. The client provides these arguments by choosing custom communication and describing it in its "AccessString" option value. It is up to the external program and the client to reestablish communication, if needed. For example, the client may inform the worker (using the program invocation parameters) about the address the worker should connect to.
- ExternalListener is another WorkerHandler implementation, intended for applications that want to use communication via the binder where the worker is an external program (specified by the "<ApplicationID>_Parameters" field in the WN configuration) located on the remote WN. This implementation preserves the connection to the binder and towards the client, starts listening on a socket, and executes the external program, passing it the address and port on which it is listening as command line arguments. The worker program is expected to connect to this socket (instead of connecting to the client) and communicate with the client normally using its own application-defined protocol, as if there were no mediators between the worker and the client. The client therefore does not notice any interruption of the communication and can start using the application-defined protocol after successful execution of ClientConnector.connect(). Another consequence of this approach is that all communication goes through the binder. This scenario is useful when the worker and the client cannot establish direct communication, or when specific code is needed on the binder itself (an implementation of the BinderHandler) through which all data transferred between the client and the worker will pass.
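The difference between the two built-in handlers comes down to how the external program's command line is put together. The sketch below models that step only; the ExternalCommands class is hypothetical, and the whitespace splitting of the configured parameters and of the custom argument list is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch: command lines for the two built-in handlers (hypothetical helper).
class ExternalCommands {
    // ExternalExecutor style: configured program plus the arguments that the
    // client supplied in "custom:<arguments list>".
    static List<String> executorCommand(String parameters, String accessString) {
        List<String> command = new ArrayList<>(Arrays.asList(parameters.trim().split("\\s+")));
        String prefix = "custom:";
        if (accessString != null && accessString.startsWith(prefix)) {
            String args = accessString.substring(prefix.length()).trim();
            if (!args.isEmpty()) {
                command.addAll(Arrays.asList(args.split("\\s+")));
            }
        }
        return command;
    }

    // ExternalListener style: configured program followed by the address and
    // port of the local socket on which the handler is listening.
    static List<String> listenerCommand(String parameters, String address, int port) {
        List<String> command = new ArrayList<>(Arrays.asList(parameters.trim().split("\\s+")));
        command.add(address);
        command.add(Integer.toString(port));
        return command;
    }
}
```

Either list can be passed to new ProcessBuilder(command) to start the program. For the listener case, new ServerSocket(0) is the usual way in Java to obtain a socket bound to a free port.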
An interesting feature is the ability of users to allocate more Worker Jobs by using the ClientConnector on the worker level to connect to the Work Binder. This way, a complex dynamic graph of jobs can be implemented. To support this, the WorkerConnector interface provides the method ClientConnector createClientConnector(Properties prop, boolean delegateCredentials). If the delegateCredentials parameter is true, the worker will use the user's credentials to create a new ClientConnector that can be used to connect to the binder and ask for a new job.
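A worker that fans out to several sub-workers could be sketched as below. The createClientConnector signature and ClientConnector.connect() are taken from this guide; both interfaces are declared here only as reduced stand-ins (return types and exceptions of the real methods may differ), and the FanOut helper is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Reduced stand-ins so the sketch is self-contained; the real interfaces have more methods.
interface ClientConnector {
    void connect();
}

interface WorkerConnector {
    ClientConnector createClientConnector(Properties prop, boolean delegateCredentials);
}

// Hypothetical helper: a worker allocates `count` further Worker Jobs.
class FanOut {
    static List<ClientConnector> allocate(WorkerConnector parent, Properties options, int count) {
        List<ClientConnector> children = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            // true: reuse the user's delegated credentials (requires a grid proxy).
            ClientConnector child = parent.createClientConnector(options, true);
            child.connect();
            children.add(child);
        }
        return children;
    }
}
```

Because each child worker can do the same, repeated use of this pattern yields a tree (or, more generally, a graph) of cooperating jobs.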
Adding Application Support to the Binder
Configuring the CEs to notify the binder that an application is supported is fairly simple. The binder searches for information about installed applications in the following root directory: $VO_<VO_NAME>_SW_DIR. This is the standard directory where the applications that can be used on the grid are installed. For example, for the SEEGRID VO it is usually /opt/exp_soft/seegrid, and each subdirectory represents an installed application. More on this subject can be found at Software_Installation_Management_Guide.
The binder searches within each installed application's root directory (a direct subdirectory of $VO_<VO_NAME>_SW_DIR) for a file named binder-plugin.properties and reads it to obtain the information it needs in order to execute the appropriate WorkerHandler implementation. This file contains the information described in the previous section. For example, for the Reverse Remote Shell example it looks something like this:
binder-plugin.properties
# Application name - mandatory
ApplicationName = rrs-example
# The class that will be used as the WorkerHandler implementation - mandatory
# Note: if custom implementation is used, specific libs (jar files) will probably
# be needed.
rrs-example_Class = yu.ac.bg.rcub.binder.handler.worker.ExternalExecutor
# List of parameters for external programs (always used for built-in implementations)
#
# Note: if ExternalListener is used the parameters are read in the following
# convention 'appname address ...', where address is the information where
# the external program will connect. External listener will find a free port.
rrs-example_Parameters = /opt/exp_soft/seegrid/rrs-1.70/rrs
# List of needed libs for each custom WorkerHandler implementation, specified in the
# proper format.
#
# ExampleApp_Libs = file:lib1.jar http://somesite.org/lib2.jar
rrs-example_Libs =
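The discovery step described above, scanning each application directory under the VO software directory for a binder-plugin.properties file, might be sketched like this. The PluginScanner class is hypothetical, and upper-casing the VO name to form the variable name is an assumption based on the SEEGRID example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Sketch: find binder-plugin.properties files of installed applications.
// PluginScanner is a hypothetical helper, not part of the Work Binder.
class PluginScanner {
    // Name of the environment variable holding the VO software directory.
    static String swDirVariable(String voName) {
        return "VO_" + voName.toUpperCase(Locale.ROOT) + "_SW_DIR";
    }

    // Looks only at direct subdirectories of swDir, one per installed application.
    static List<Path> findPluginFiles(Path swDir) {
        List<Path> found = new ArrayList<>();
        try (DirectoryStream<Path> apps = Files.newDirectoryStream(swDir)) {
            for (Path app : apps) {
                Path candidate = app.resolve("binder-plugin.properties");
                if (Files.isDirectory(app) && Files.isRegularFile(candidate)) {
                    found.add(candidate);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Collections.sort(found); // stable order for logging and testing
        return found;
    }
}
```

On a Worker Node the scan would start from the path stored in the variable, for example Paths.get(System.getenv(PluginScanner.swDirVariable("seegrid"))).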
This way, application users can add support for their application transparently. Developers are required to provide the Work Binder Service administrator with information about the application, namely the Application Name (or AppID) and the mappings for supported VOMS groups needed for user authorization. However, depending on the binder's policies for each CE, a particular CE may restrict the list of supported applications.
Also, in order to deploy custom BinderHandler implementations on the binder level, application users need to contact the Work Binder Service administrator, because these handlers are executed directly within the binder.
API Examples
Two examples are provided:
- Echo Test application example - simple application that uses communication via Binder or supported built-in direct communication (provided with the wrappers) between the client and the worker
- Reverse Remote Shell example - another simple application showing an example of custom direct communication between the client and the server, that allows the client to use the shell of the remote worker.
Contact
Milan Potocnik [milan (d) potocnik (a) rcub (d) bg (d) ac (d) yu]
