P-GRADE White paper
EXECUTIVE SUMMARY
P-GRADE provides a unique and complete solution for the development and execution of parallel applications on supercomputers, clusters, and Grid systems. P-GRADE's high-level graphical environment was designed so that users need not learn programming methodologies for different parallel and distributed platforms; the same environment is applicable to traditional supercomputers, PC clusters, or new Grid solutions (based on either Globus or Condor), and it utilises all the available computational resources transparently.
P-GRADE significantly accelerates the re-engineering of sequential and legacy programs for parallel and Grid systems, providing easy-to-use solutions even for non-professional programmers. Sequential code can be easily inherited and re-used thanks to the hierarchical design approach, and legacy sequential or parallel programs can interoperate with other jobs through the built-in workflow layer in order to execute large-scale complex programs in the Grid. Each stage of the development life-cycle (design, debugging, testing, monitoring and performance analysis) is covered by the P-GRADE system. Its highly portable run-time environment provides dynamic load balancing for long-running parallel applications, fault-tolerant execution based on fully automatic checkpointing and migration mechanisms for cluster and Grid applications, on-line monitoring facilities, and remote performance visualisation.
Introduction to P-GRADE
The major goal of P-GRADE is to provide an easy-to-use, integrated set of programming tools for development of general message-passing applications to be run in heterogeneous computing environments or supercomputers. P-GRADE's main benefits are as follows:
- Visual interface to define all parallel activities in the application.
- Programmers are not required to know the syntax of the underlying message-passing system. P-GRADE generates all message-passing library calls automatically on the basis of graphics.
- Compilation and distribution of the executables are performed automatically in the heterogeneous environment.
- Debugging and monitoring information is related directly back to the user's graphical code during on-line debugging and visualisation of the trace file.
- Support for systematic debugging of the designed application.
- Programmers can use predefined process communication templates.
- Dynamic load balancing for long-running applications.
- Fault-tolerant execution on clusters without any changes to the source code written by the user.
P-GRADE currently consists of the following tools as main components:
- GRAPNEL: A graphical parallel programming language
- GRED: A graphical editor to write parallel applications. The editor supports the syntax of the graphical language GRAPNEL
- GRP2C: A precompiler to produce the C code with PVM or MPI function calls from the graphical program
- GRM: A monitoring tool to generate a trace file during execution of a PVM/MPI application
- DIWIDE: A distributed debugger with systematic debugging capabilities
- PROVE: A semi-on-line visualisation tool to analyse and interpret the trace file information and present it to the programmer graphically during the execution
- CHKPT: A distributed checkpointer tool
- LB: A dynamic load balancer tool
- FT: A tool providing fault tolerance facilities
The program development cycle in P-GRADE can be summarized as follows. As a first step, the user applies the GRED graphical editor to design and construct the parallel program, written in a special visual programming language called GRAPNEL. The GRED editor creates the so-called GRP file from the GRAPNEL program. The GRP file contains all the information necessary to restore the program graph for further editing and to compile the GRAPNEL program into C+PVM/MPI code. The latter is the task of the GRP2C precompiler, which additionally creates other auxiliary files, including makefiles used by the UNIX make utility for building the executables. C code generation and compilation are fully automated on every host of the heterogeneous cluster of workstations or supercomputers. Once the executables have been obtained, the parallel program can be executed either in debugging mode or in trace mode. In debugging mode, the DIWIDE distributed debugger controls the execution of the program by providing commands for creating breakpoints, step-by-step execution, animation, etc. DIWIDE supports systematic debugging as well. In trace mode, a trace file is generated containing all the trace events defined by the user. These events are visualised by the PROVE graphical visualisation tool, which assists the user in spotting performance bottlenecks in GRAPNEL programs.
GRAPNEL and GRED
The main features of GRAPNEL can be summarized as follows:
- GRAPNEL is a hybrid language, i.e. only the parallel processing activities are expressed graphically; all other parts of the program are written in a textual language. (Currently C is supported.) As a consequence, arbitrarily large sequential code can be included in a GRAPNEL program from existing libraries.
- GRAPNEL programs are based on the message passing parallel programming paradigm. GRAPNEL programs can be compiled to existing message passing systems. (Currently PVM or MPI calls are inserted into the C code.)
- GRAPNEL supports top-down parallel program design based on three hierarchical design levels:
- At the top level (visualized in the Application Window) the outline of the whole application is described graphically with respect to communication connections among the processes. Processes, process groups, templates, communication ports and connections among the processes must be defined here graphically but the functionalities (i.e. the code) of the processes are hidden at this level as shown in Fig. 1.
- At the middle level the send and receive operations and their surrounding program structures are defined graphically inside the code of a process. A Process Window is associated with each process to describe graphically its internal structure as depicted in Fig. 1.
- At the lowest level the textual code fragments can be defined with the built-in editor or any standard UNIX editor like vi or EMACS. (See the built-in text window in Fig. 1.)
Figure 1. Parallel program design levels with GRAPNEL
The GRED graphical editor is used in the Application and Process Windows to construct the necessary graphs. GRED supports the usual editing functions like cut, copy and paste for arbitrary subgraphs at both levels. For example, copying a process in the Application Window results in copying its complete internal structure, represented by its associated Process Window. In this way large graphs can be quickly constructed.
Predefined process topology templates like pipe, mesh, etc. are available to help the fast creation of large, regularly connected process graphs, which are typical in many data parallel algorithms (see Fig. 2).
Figure 2. Scalable process communication template (Pipe)
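To make the idea of a regular, scalable topology concrete, the following minimal sketch generates the connection lists for a pipe and a mesh of arbitrary size. It is only an illustration: the function names, the row-by-row process numbering and the tuple representation are assumptions made here and are not part of GRAPNEL or GRED.

```python
def pipe_connections(n):
    """Return directed channels (src, dst) for a pipe of n processes."""
    return [(i, i + 1) for i in range(n - 1)]


def mesh_connections(rows, cols):
    """Return neighbour links for a rows x cols mesh.

    Processes are numbered row by row (hypothetical convention);
    each link is listed once as (lower id, higher id).
    """
    links = []
    for r in range(rows):
        for c in range(cols):
            pid = r * cols + c
            if c + 1 < cols:      # link to the right-hand neighbour
                links.append((pid, pid + 1))
            if r + 1 < rows:      # link to the neighbour below
                links.append((pid, pid + cols))
    return links
```

Scaling the template then amounts to changing a single size parameter, which is why regular topologies are convenient for data parallel algorithms.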
Debugging by DIWIDE
A distributed debugger called DIWIDE is integrated into the P-GRADE environment. DIWIDE is a general purpose debugging tool for PVM/MPI programs, and it defines a set of C functions that can be embedded into other systems, as has been done in the case of P-GRADE. As debugging information is related directly back to the user's graphical code, GRAPNEL programs can be debugged in exactly the same graphical environment in which they were constructed. Moreover, the design of the GRAPNEL language particularly focused on debugging aspects (i.e. the graphical outline of the program is extremely useful for locating communication-related errors).
P-GRADE provides the usual commands for the programmer to control the execution of processes under debugging (e.g. 'run', 'step', 'continue', etc.). A breakpoint can belong to either a graphical symbol in the visual code or a specific line in a textual code segment. A breakpoint on a graphical symbol is denoted by a '!' sign at the right side of the icon, which is highlighted when the process actually stops at that point of the code.
Every process icon is coloured dynamically according to the actual state of that process in the Application Window. If a process is blocked by the debugger, the graphical program item, containing the C code in which it is stopped, is highlighted both in the Application and Process Window. A sample debugging session can be seen in Figure 3.
Figure 3. Debugging of GRAPNEL programs by DIWIDE
Visualization by PROVE
PROVE is a performance visualization tool which is part of the P-GRADE integrated program development environment for message passing programs. PROVE can analyse event traces generated by the GRM monitoring system and visualise the run-time behaviour of the application. The instrumentation of the parallel program is done automatically by the programming environment, but it can be customized by the programmer as well.
PROVE has a relatively small number of displays, which include a space-time diagram, a Gantt chart and some communication displays that show the volume and distribution of the communication among processes or hosts (i.e. processors). However, zoom and scroll mechanisms are provided that allow the user to examine any part of the graphs even if they are rather crowded because of the large number of events.
The user can also concentrate on different parts of the parallel program by interactively selecting particular processes to be presented in the different displays.
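As a rough illustration of what a communication-volume display aggregates, the sketch below sums the bytes exchanged per sender/receiver pair and optionally restricts the result to a selected set of processes. The event tuple layout and the function name are hypothetical; they do not reflect the actual GRM trace format or the PROVE implementation.

```python
from collections import defaultdict


def communication_volume(events, selected=None):
    """Sum bytes sent per (sender, receiver) pair.

    events:   iterable of (timestamp, sender, receiver, nbytes) tuples
              (hypothetical layout, not the GRM trace format).
    selected: optional set of process ids; only messages whose sender
              and receiver are both selected are counted.
    """
    volume = defaultdict(int)
    for _timestamp, sender, receiver, nbytes in events:
        if selected is not None and (
            sender not in selected or receiver not in selected
        ):
            continue
        volume[(sender, receiver)] += nbytes
    return dict(volume)
```

The resulting dictionary could feed a matrix-style display in which each cell shows the traffic between two processes.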
A strength of PROVE is its tight integration with the graphical program editor of P-GRADE that allows the user to relate communication events with source code. For example, the user can click on a line representing a communication in the space-time diagram of PROVE to identify the corresponding communication operation in the GRAPNEL source code. Moreover, the reverse direction is supported as well, e.g. the user can click on a communication statement in the GRAPNEL code to highlight the next (or previous) event generated by that particular statement in the space time diagram. A sample visualisation session can be seen in Figure 4.
Figure 4. Visualisation of GRAPNEL programs by PROVE
Execution facilities for clusters based on distributed checkpointing
The current release of the P-GRADE environment is able to generate a distributed checkpoint of the running application without any modification of the user's source code. Basically, distributed checkpointing means automatically storing the current program state on the hard disk so that the GRAPNEL application can be restarted from a saved, consistent global program state (i.e. from checkpoint files). Relying on this checkpointing subsystem, the P-GRADE environment can either migrate any running GRAPNEL process from its current target host to another one (e.g. for load balancing) or restart the entire application from the last checkpoint when a critical failure occurs (such as the crash of any target host).
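The save-and-restart principle can be sketched for a single process as follows. Note that this is a deliberately simplified, application-level illustration: P-GRADE's checkpointing is described above as transparent to the user's code and as covering a consistent global state across processes, neither of which this sketch attempts. All names (save_checkpoint, load_checkpoint, run, crash_at) are assumptions made for the example.

```python
import os
import pickle
import tempfile


def save_checkpoint(path, state):
    """Write state to path atomically so a crash never leaves a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as tmp:
        pickle.dump(state, tmp)
    os.replace(tmp_path, path)  # atomic rename within the same directory


def load_checkpoint(path, default):
    """Return the saved state, or default if no checkpoint exists yet."""
    if not os.path.exists(path):
        return default
    with open(path, "rb") as f:
        return pickle.load(f)


def run(path, total_steps, crash_at=None):
    """Sum 0..total_steps-1, checkpointing after every step.

    crash_at simulates a host failure by raising before that step runs.
    """
    state = load_checkpoint(path, {"step": 0, "acc": 0})
    while state["step"] < total_steps:
        if crash_at is not None and state["step"] == crash_at:
            raise RuntimeError("simulated host crash")
        state["acc"] += state["step"]
        state["step"] += 1
        save_checkpoint(path, state)
    return state["acc"]
```

Calling run again with the same checkpoint path after a simulated crash resumes from the last saved step instead of starting over, which is the behaviour the restart facility provides for whole applications.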
Dynamic load balancing support
The dynamic load balancer module integrated into the P-GRADE environment tries to equalise the workload on the target hosts. The load balancer module requires the GRM monitor to collect run-time information about the activities of the GRAPNEL application. Based on this information, the load balancer module calculates the best mapping for the application processes and initiates the necessary migrations of GRAPNEL processes, which are carried out by the built-in checkpointing and migration subsystem. The process migration can be observed on-line or off-line with the help of the PROVE visualisation tool.
In detail, the decision unit of the integrated load balancer performs the following steps:
- At regular time intervals it fetches information from the GRM monitor.
- Based on the provided information, the estimated computation and communication demands are calculated.
- A nearly optimal mapping is determined using the simulated annealing or the diffusion algorithm.
- Finally, migration requests are calculated (based on the difference between the original and the optimised mapping of processes) and sent to the migration module.
Figure 5. Load balancing of a GRAPNEL application (using 3x3 mesh topology)
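The decision steps listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the P-GRADE load balancer: it uses a basic simulated annealing loop with a linear cooling schedule, it minimises only the maximum computational load per host and ignores communication demands, and every name and parameter (host_loads, anneal_mapping, migration_requests, steps, t0, seed) is hypothetical.

```python
import math
import random


def host_loads(mapping, demand, n_hosts):
    """Total estimated computation demand placed on each host."""
    loads = [0.0] * n_hosts
    for proc, host in mapping.items():
        loads[host] += demand[proc]
    return loads


def anneal_mapping(mapping, demand, n_hosts, steps=2000, t0=1.0, seed=0):
    """Search for a mapping with a lower maximum host load."""
    rng = random.Random(seed)
    procs = sorted(mapping)
    current = dict(mapping)
    cost = max(host_loads(current, demand, n_hosts))
    best, best_cost = dict(current), cost
    for k in range(steps):
        temp = t0 * (1.0 - k / steps) + 1e-9  # linear cooling schedule
        proc = rng.choice(procs)
        old_host = current[proc]
        current[proc] = rng.randrange(n_hosts)  # propose moving one process
        new_cost = max(host_loads(current, demand, n_hosts))
        delta = new_cost - cost
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            cost = new_cost  # accept the move
            if cost < best_cost:
                best, best_cost = dict(current), cost
        else:
            current[proc] = old_host  # reject the move
    return best


def migration_requests(original, optimised):
    """List (process, from_host, to_host) for every process that must move."""
    return [
        (p, original[p], optimised[p])
        for p in sorted(original)
        if original[p] != optimised[p]
    ]
```

For example, starting from nine equally demanding processes all mapped to host 0 of three hosts, anneal_mapping returns a more even mapping and migration_requests lists only the processes whose host differs from the original one.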
More information: [1]
