Workflows
Workflows in P-GRADE Portal
Definition
Workflow applications are represented by graphs where the nodes of the graphs are jobs or Grid services and the directed arcs represent the job execution dependencies and/or the necessary file transfers among the component jobs. Workflow applications are ideal for the Grid since the different component jobs can be executed on different Grid sites. Workflow applications enable the exploitation of parallelism at two levels:
- Inside a workflow node: if a node of the workflow is an MPI job and this job is assigned to a Grid site where the necessary number of processors is available, then the job can be executed in parallel inside that Grid site. This is often called intra-node parallelism.
- Among several workflow nodes: if there are parallel branches inside the workflow, then the nodes of these branches can be executed in parallel at different Grid sites. This is often called inter-node parallelism. If the parallel branches contain MPI jobs, then both intra- and inter-node parallelism can be exploited at the same time.
P-GRADE Portal supports the graphical creation of DAG-like workflows whose nodes can be sequential jobs, parallel jobs, or services. The workflow manager of P-GRADE Portal supports the exploitation of both levels of parallelism (intra-node and inter-node) in DAG workflows.
The following picture represents a simple workflow exploiting both levels of parallelism: jobs Budapest and Paris can be started at the same time (this is the workflow-level, i.e. inter-node, parallelism). In addition, Paris is an MPI application (this is the job-level, i.e. intra-node, parallelism). The job London depends on the successful execution of jobs Budapest and Paris. Both Budapest and Paris produce one output file. These output files are passed to job London as its input files.
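The dependency structure of this example can be sketched in a few lines of Python. This is only an illustration: the dictionary encoding and the function name `parallel_waves` are assumptions made here, not part of P-GRADE Portal. The function groups the jobs into "waves", where all jobs of one wave may run at the same time (inter-node parallelism).

```python
# Hypothetical encoding of the example workflow:
# each job maps to the list of jobs it depends on.
WORKFLOW = {
    "Budapest": [],
    "Paris": [],
    "London": ["Budapest", "Paris"],
}


def parallel_waves(deps):
    """Group jobs into waves; the jobs of one wave do not depend on each other."""
    remaining = {job: set(parents) for job, parents in deps.items()}
    waves = []
    while remaining:
        # A job is ready when all of its prerequisites have been scheduled.
        ready = sorted(job for job, parents in remaining.items() if not parents)
        if not ready:
            raise ValueError("cycle detected: the workflow is not a DAG")
        waves.append(ready)
        for job in ready:
            del remaining[job]
        for parents in remaining.values():
            parents.difference_update(ready)
    return waves


waves = parallel_waves(WORKFLOW)
print(waves)  # [['Budapest', 'Paris'], ['London']]
```

Budapest and Paris land in the first wave, and London can only be scheduled after both of them.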
Execution
Organizing workflow execution using EGEE command-line tools is a very difficult task. The user has to take care of the following:
- submitting jobs whose prerequisites are met
- checking the progress of submitted jobs
- downloading the results of each job once it has finished
- handling the file movements when the outputs of a finished job are used as input by other jobs.
Clearly, users cannot be expected to do all these tasks by hand. P-GRADE Portal offers a convenient way of creating and running workflows by hiding all these tasks from the user. The previous figure represents a workflow created using P-GRADE Portal. The Portal uses Condor DAGMan for scheduling the workflow execution after submission. Different control scripts take care of job submission, job progress checking, output download and file transfers. The user's only task is to create the workflow and watch its execution.
In the example above, Condor DAGMan starts Budapest and Paris simultaneously. Depending on the type of Grid the jobs are assigned to, different control scripts submit the jobs to the selected Grids. Next, a status polling mechanism starts that periodically checks the status of the submitted jobs. When a job has finished, its output is downloaded from the Grid and stored on the P-GRADE Portal in a well-defined location. When both Budapest and Paris have finished, Condor schedules London for execution. The control scripts started by Condor DAGMan fetch the output files of Budapest and Paris and use them as the input of London. Next, just like the two other jobs, London is executed.
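To make the role of Condor DAGMan more concrete, the sketch below produces a DAGMan input description for the example, using DAGMan's `JOB` and `PARENT ... CHILD` keywords. The files that P-GRADE Portal actually generates are not shown on this page, so the submit file names (`budapest.submit` and so on) and the function name `dagman_description` are purely hypothetical.

```python
def dagman_description(deps, submit_files):
    """Build the text of a DAGMan input file from a dependency mapping."""
    # One JOB line per node: JOB <name> <submit description file>
    lines = [f"JOB {job} {submit_files[job]}" for job in deps]
    # One PARENT/CHILD line per node that has prerequisites.
    for child, parents in deps.items():
        if parents:
            lines.append(f"PARENT {' '.join(parents)} CHILD {child}")
    return "\n".join(lines) + "\n"


# Hypothetical submit file names, chosen only for this illustration.
deps = {"Budapest": [], "Paris": [], "London": ["Budapest", "Paris"]}
submit_files = {
    "Budapest": "budapest.submit",
    "Paris": "paris.submit",
    "London": "london.submit",
}

dag_text = dagman_description(deps, submit_files)
print(dag_text)
```

The single `PARENT Budapest Paris CHILD London` line is what tells DAGMan to hold London back until both other jobs have completed.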
The following figure shows a workflow in action:
Parameter Study workflows - the highest level of parallelism
One of the most frequent ways in which users exploit the services of a computational grid is to solve problems where sets of inputs must be applied to a single algorithm.
This scenario is called a Parameter Study when
- the algorithm is independent of the input, i.e. the same code can be applied to any member of the input set, and
- the outputs, equal in number to the members of the input set, are evaluated or processed further in a later phase (possibly by a different algorithm).
The inputs of such exploratory or search tasks need not come from a single set describing the possible values of one feature; they may come from several sets, each describing a different feature. In this case all combinations of the values of the different features must be studied.
For example, suppose we have two independent features, set1 and set2, where the members of set1 are {c11, c12, c13} and the members of set2 are {c21, c22}. The combinations of the possible values form a new set {{c11,c21}, {c11,c22}, {c12,c21}, {c12,c22}, {c13,c21}, {c13,c22}}, whose cardinality is the product of the cardinalities of the base sets (Cartesian product), in our case 3*2 = 6.
The members of this combined set must be applied one by one to the algorithm, which in our case yields 6 independent runs, each with two parametrized input values.
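The combination step can be illustrated with Python's standard `itertools.product`. The variable names below simply mirror the set names used in the example; nothing here is specific to P-GRADE Portal.

```python
from itertools import product

set1 = ["c11", "c12", "c13"]
set2 = ["c21", "c22"]

# Cartesian product: every member of set1 paired with every member of set2.
combinations = list(product(set1, set2))

for pair in combinations:
    print(pair)

print(len(combinations))  # 6, i.e. len(set1) * len(set2)
```

Each of the six pairs corresponds to one independent run of the algorithm.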
We use the term PS Set (or Parameter Study Set) for each of the independent feature sets.
Workflows created with the P-GRADE Portal are well suited to represent the algorithm mentioned above, because the load of the executions can be distributed in the Grid. In the simplest approach we regard a tested P-GRADE workflow as a black box and "pump" the members of the combined inputs into it. To do that efficiently, the user should submit the jobs belonging to the parametric workflows with the assistance of the Broker whenever possible. Workflows defined together with their PS Set(s) are called Parameter Study Workflows or PS Workflows.
The following figure shows how to activate the Parameter Study extension by switching a job input to parametric input port:
In the example above, the input port of the first job is switched to PS. This means that the input is processed multiple times, using the files located in the defined directory. For each file processed, a new workflow (e-Workflow) is created that uses this file as input. If there were more PS input ports, the cross product of the input files located in the different PS directories would be used as the input sequence for creating e-Workflows.
The next image shows the execution of a PS-Workflow. 5 e-Workflows have been created and submitted:
In addition, a tool for automatically generating workflow input files can be desirable. P-GRADE Portal therefore introduced the Generator and Autogenerator jobs, whose purpose is to produce the input files for running Parameter Study workflows. These jobs are run once, after the workflow has been submitted. When all Generator jobs have finished, e-Workflow generation starts. Generator jobs start user-specified executables, whereas Autogenerator jobs produce files following user-defined rules.
Collecting the output of the different e-Workflows can be a difficult task. To ease the processing of the large number of generated results, P-GRADE Portal introduced the Collector job type: these jobs are started after all of the e-Workflows have finished, and they receive the results created by the different e-Workflows. Users thus get a convenient tool for processing the different outputs (for example, for creating statistics).
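The three phases (generation, e-Workflow execution, collection) can be mimicked in plain Python to show how they fit together. This is a toy model under stated assumptions: the functions `autogenerate`, `run_e_workflow` and `collect` are hypothetical stand-ins, the inputs are numbers instead of files, and everything runs locally rather than on the Grid.

```python
from itertools import product
from statistics import mean


def autogenerate(start, stop, step):
    """Stand-in for an Autogenerator job: produce inputs from a simple rule."""
    return list(range(start, stop, step))


def run_e_workflow(x, y):
    """Stand-in for one e-Workflow: the tested workflow used as a black box."""
    return x * y


def collect(results):
    """Stand-in for a Collector job: runs once, after all e-Workflows finished."""
    return {"runs": len(results), "mean": mean(results)}


# Phase 1: generate the members of two PS Sets.
ps_set_a = autogenerate(1, 4, 1)     # [1, 2, 3]
ps_set_b = autogenerate(10, 30, 10)  # [10, 20]

# Phase 2: one e-Workflow per member of the cross product.
results = [run_e_workflow(x, y) for x, y in product(ps_set_a, ps_set_b)]

# Phase 3: collect the outputs, here as simple statistics.
summary = collect(results)
print(summary)
```

In the Portal the second phase is where the Grid is exploited, since the e-Workflows are independent of each other and can be distributed.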
The next figure shows a workflow containing a generator, an autogenerator, and two collector jobs:
Workflows in gLite WMS
Besides using P-GRADE Portal, a workflow can also be defined as a DAG job in a JDL file and submitted with the command-line tools of gLite WMS.
For the JDL syntax of a DAG job, please refer to the gLite documentation.
After the JDL file has been created, the DAG job can be submitted using the glite-wms-job-submit command and monitored with the glite-wms-job-status command. Finally, after successful completion, the output can be retrieved with the glite-wms-job-output command.
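The order of these three steps can be written down as a small Python helper that only builds the command lines and does not run them. The helper name `wms_command_sequence`, the file name `workflow.jdl` and the `<job-id>` placeholder are assumptions for illustration. Any options a real installation requires (for example for proxy delegation) are deliberately omitted and should be taken from the gLite documentation.

```python
def wms_command_sequence(jdl_path, job_id="<job-id>"):
    """Return the submit, status and output commands in the order they are used.

    job_id stands for the identifier printed by the submit command;
    required command-line options are intentionally left out.
    """
    return [
        ["glite-wms-job-submit", jdl_path],
        ["glite-wms-job-status", job_id],
        ["glite-wms-job-output", job_id],
    ]


commands = wms_command_sequence("workflow.jdl")
for command in commands:
    print(" ".join(command))
```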





