Introduction to the parallel computing paradigms

From EGEE-see Wiki

The page is part of SEE-GRID Gridification Guide
This topic was contributed by the Center for Scientific Research of SASA and the University of Kragujevac, Serbia


Different approaches to the parallel computing

The philosophy of parallel computing is that a computing task can be divided into smaller subtasks in order to obtain better performance. The separate processes are usually executed on separate processors, and in general, parallel codes run on shared-memory multiprocessors, distributed-memory multicomputers, clusters of workstations, or heterogeneous combinations of the above.
The programming approach depends on the type of hardware architecture in use. Parallelism on shared-memory systems (ranging from SMP PCs to large supercomputers) is generally based on a compiler-directive approach, such as the OpenMP standard. On the other hand, MPI (Message Passing Interface) can be used with the distributed-memory as well as the shared-memory model. With recent innovations making dual-core and quad-core CPUs more affordable, today's HPC (High Performance Computing) systems can employ both of the above programming approaches, making multi-level parallel programming (MPI+OpenMP) a hot topic for developers.

Message passing and MPI standard

This manual will consider the usage of MPI in the SEE-GRID environment, where it is well supported by both the operating system and the middleware. The first question to be answered is: "What is MPI?", and the second: "What isn't MPI?". MPI is not a programming language, but a specification of a parallel programming interface which each concrete implementation has to fulfill. A concrete implementation of the interface comes as a library of functions invokable from Fortran/C codes. There are a number of MPI implementations, such as MPICH, LAM and OpenMPI, with almost full support for the MPI-2 standard. SEE-GRID currently supports MPICH version 1.2.6, one of the oldest and most mature implementations; this version fully covers the MPI-1 standard.
The second question concerns the general benefits of the parallel computation model, especially the practical applications of the message passing model. Why MPI?

  • to provide efficient communication (message passing) among computational resources
  • to enable more analyses in the same amount of time
  • to reduce time required for one single analysis
  • to increase level of reality of physical modeling
  • to provide access to more memory on distributed systems
  • for highly parallelisable problems, such as many Monte Carlo applications, even trivial use of MPI gives near-linear speedup.

MPI's pre-defined constants and function prototypes are declared in a header file. This file must be included wherever MPI function calls appear in the code (for C codes, #include "mpi.h"). MPI_Init must be the first MPI function called, and MPI_Finalize terminates the MPI session. Each of these two functions must be called exactly once in the user code.

Message structure and MPI data types

Which components make up a normal MPI message? First, there is the useful data: an array of elements of some MPI data type (basic types like int or float, or derived data types like structures). Second, there is the message "envelope", containing the source and destination process numbers, the tag, and the communicator. The MPI specification describes two different types of communication among active processes:

  • Point to point communication
    • blocking – the call returns when the task is complete
    • nonblocking – the call returns without waiting for the task to complete
  • Collective communication

In order for a send/receive operation to succeed, the data type, tag and communicator of the sender and receiver must match. Since MPI messages are sent without the user explicitly packing the data into a message buffer, the interface requires a rigorous definition of data types. MPI data types have corresponding C/Fortran data types:

  • MPI_INT – signed int
  • MPI_UNSIGNED – unsigned int
  • MPI_FLOAT – float
  • MPI_DOUBLE – double
  • MPI_CHAR – char
  • ...
  • User-derived types composed of basic types

