Monday, December 7, 2020

Overview of MPI Reduction Operations in an HPC Cluster

The Message Passing Interface (MPI) is the de facto standard framework for distributed computing in many HPC applications. MPI collective operations involve a group of processes communicating by message passing in an isolated context, known as a communicator. Each process is identified by its rank, an integer ranging from 0 to P − 1, where P is the size of the communicator. All processes make the same call in SPMD fashion (Single Program, Multiple Data), although the arguments, such as the data each process contributes, may differ from process to process.
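
To make the SPMD model concrete, here is a minimal sketch in C (using MPI_COMM_WORLD as the communicator) in which every process runs the same program and simply queries its own rank and the communicator size P:

#include <mpi.h>
#include <stdio.h>

/* Minimal SPMD sketch: every process executes this same program and
 * queries its own rank and the size P of MPI_COMM_WORLD. */
int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* this process's rank: 0..P-1 */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* P, the communicator size    */

    printf("rank %d of %d\n", rank, size);

    MPI_Finalize();
    return 0;
}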

MPI reductions are among the most useful MPI operations and form an important class of computational operations. The combining operation can be either user-defined or chosen from a list of predefined operations. In practice, the predefined operations are sufficient for most applications.
 
Consider a system with N processes. The goal is to compute the dot product of two N-vectors in parallel. The dot product of two vectors u and v is u⋅v = u1v1 + u2v2 + ... + uNvN. As you can imagine, this is highly parallelizable: with N processes, each process i can compute the intermediate value ui × vi. The program then needs a way to sum all of these values, and this is where the reduction comes into play. We can ask MPI to sum all those values and store the result either on a single process (for instance, process 0) or redistribute it to every process.
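
As a sketch of how this could look in C (illustrative only: each rank holds one dummy element of u and v, and the result is collected on rank 0 with MPI_Reduce; MPI_Allreduce would instead leave it on every rank):

#include <mpi.h>
#include <stdio.h>

/* Minimal sketch: each of the P processes holds one element of u and v
 * (filled here with rank-based dummy values), computes its partial
 * product, and MPI_Reduce sums the partials onto rank 0. */
int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Dummy local elements u_i and v_i; a real code would read them in. */
    double u_i = rank + 1.0;
    double v_i = 2.0 * (rank + 1.0);

    double partial = u_i * v_i;   /* local contribution u_i * v_i        */
    double dot = 0.0;             /* combined result (valid on rank 0)   */

    MPI_Reduce(&partial, &dot, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("dot(u, v) = %f\n", dot);

    MPI_Finalize();
    return 0;
}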
 
MPI reduction operations fall into three categories: 
 
1) Global Reduction Operations: 
  • MPI_REDUCE, 
  • MPI_IREDUCE, 
  • MPI_ALLREDUCE, and 
  • MPI_IALLREDUCE. 
2) Combined Reduction and Scatter Operations: 
  • MPI_REDUCE_SCATTER, 
  • MPI_IREDUCE_SCATTER, 
  • MPI_REDUCE_SCATTER_BLOCK, and 
  • MPI_IREDUCE_SCATTER_BLOCK. 
 
3) Scan Operations (see the sketch after this list): 
  • MPI_SCAN, 
  • MPI_ISCAN, 
  • MPI_EXSCAN, and 
  • MPI_IEXSCAN. 
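
Of these, the scan variants may be the least familiar: they compute prefix reductions, so rank i ends up with the combination of the inputs from ranks 0 through i. A minimal sketch using MPI_SCAN with MPI_SUM (the per-rank input values here are arbitrary dummy data):

#include <mpi.h>
#include <stdio.h>

/* Sketch of a scan (prefix reduction): rank i receives the sum of the
 * values contributed by ranks 0..i (inclusive scan). MPI_Exscan would
 * instead give the sum over ranks 0..i-1. */
int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int value = rank + 1;   /* each rank contributes rank + 1    */
    int prefix = 0;         /* running total over ranks 0..rank  */

    MPI_Scan(&value, &prefix, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);

    printf("rank %d: prefix sum = %d\n", rank, prefix);

    MPI_Finalize();
    return 0;
}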
 
The primary idea of these operations is to collectively compute on a set of input data elements to generate a combined output. MPI_REDUCE is a collective function where each process provides some input data (e.g., an array of double-precision floating-point numbers). This input data is combined through an MPI operation, as specified by the "op" parameter. Most applications use MPI predefined operations, such as summation or maximum-value identification, although some applications also use reductions based on user-defined function handles. The MPI operator "op" is always assumed to be associative. All predefined operations are also assumed to be commutative. Applications, however, may define their own operations that are associative but not commutative. The "canonical" evaluation order of a reduction is determined by the ranks of the processes in the group. However, an MPI implementation can take advantage of associativity, or of associativity and commutativity, of the operation in order to change the order of evaluation. Doing so may change the result of the reduction for operations that are not strictly associative and commutative, such as floating-point addition.
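
As a rough illustration of a user-defined operation, the sketch below registers a hypothetical "maximum absolute value" combiner with MPI_Op_create and uses it in MPI_Allreduce; the operation and the dummy input values are inventions for this example, and the operation is declared commutative so the implementation is free to reorder the evaluation:

#include <mpi.h>
#include <stdio.h>
#include <math.h>

/* Hypothetical user-defined reduction: element-wise "maximum absolute
 * value". The combining function must follow the MPI_User_function
 * prototype and is applied as inout[i] = op(in[i], inout[i]). */
static void max_abs(void *in, void *inout, int *len, MPI_Datatype *dtype)
{
    (void)dtype;  /* only one datatype is used in this sketch */
    double *a = (double *)in;
    double *b = (double *)inout;
    for (int i = 0; i < *len; i++) {
        double x = fabs(a[i]);
        double y = fabs(b[i]);
        b[i] = (x > y) ? x : y;
    }
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Op op;
    MPI_Op_create(max_abs, 1 /* commutative */, &op);

    /* Dummy per-rank value, alternating in sign. */
    double local = (rank % 2) ? -(double)rank : (double)rank;
    double result = 0.0;

    MPI_Allreduce(&local, &result, 1, MPI_DOUBLE, op, MPI_COMM_WORLD);

    if (rank == 0)
        printf("max |value| across ranks = %f\n", result);

    MPI_Op_free(&op);
    MPI_Finalize();
    return 0;
}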
 
The following predefined operations are supplied for MPI_REDUCE and related functions MPI_ALLREDUCE, MPI_REDUCE_SCATTER, and MPI_SCAN. 
 
These operations are invoked by passing one of the following in the "op" argument: 
  • [Name] Meaning 
  • [MPI_MAX] maximum 
  • [MPI_MIN] minimum 
  • [MPI_SUM] sum 
  • [MPI_PROD] product 
  • [MPI_LAND] logical and 
  • [MPI_BAND] bit-wise and 
  • [MPI_LOR] logical or 
  • [MPI_BOR] bit-wise or 
  • [MPI_LXOR] logical xor 
  • [MPI_BXOR] bit-wise xor 
  • [MPI_MAXLOC] max value and location 
  • [MPI_MINLOC] min value and location 
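
MPI_MAXLOC and MPI_MINLOC differ from the others in that they operate on (value, location) pairs and therefore need one of the paired datatypes, such as MPI_DOUBLE_INT. A minimal sketch (with made-up per-rank values) that finds the largest value and the rank that owns it:

#include <mpi.h>
#include <stdio.h>

/* Sketch of MPI_MAXLOC: each rank contributes a (value, rank) pair using
 * the paired datatype MPI_DOUBLE_INT; the reduction returns the largest
 * value together with the rank that owned it. */
struct val_loc {
    double value;   /* value to compare        */
    int    rank;    /* "location" of the value */
};

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    struct val_loc local, global;
    local.value = (double)((rank * 37) % 11);  /* dummy per-rank value */
    local.rank  = rank;

    MPI_Allreduce(&local, &global, 1, MPI_DOUBLE_INT, MPI_MAXLOC,
                  MPI_COMM_WORLD);

    if (rank == 0)
        printf("max value %f found on rank %d\n", global.value, global.rank);

    MPI_Finalize();
    return 0;
}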

Example 1: Get the physical memory on each node and perform an MPI_SUM reduction to compute the average memory per node across the cluster.
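
A possible sketch for this example, assuming one MPI rank per node (with several ranks per node the same node would be counted more than once) and a Linux-style sysconf() interface for querying physical memory; each rank reports its node's memory in GB, MPI_SUM collects the total on rank 0, and rank 0 divides by the number of ranks:

#include <mpi.h>
#include <stdio.h>
#include <unistd.h>

/* Sketch for Example 1, assuming one MPI rank per node and a Linux-style
 * sysconf() interface (_SC_PHYS_PAGES is a glibc extension). Each rank
 * reads its node's memory, MPI_SUM combines the values on rank 0, and
 * rank 0 divides by the rank count to get the average. */
int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Physical memory on this node, in gigabytes. */
    double pages     = (double)sysconf(_SC_PHYS_PAGES);
    double page_size = (double)sysconf(_SC_PAGE_SIZE);
    double local_gb  = pages * page_size / (1024.0 * 1024.0 * 1024.0);

    double total_gb = 0.0;
    MPI_Reduce(&local_gb, &total_gb, 1, MPI_DOUBLE, MPI_SUM, 0,
               MPI_COMM_WORLD);

    if (rank == 0)
        printf("average memory per node: %.2f GB over %d ranks\n",
               total_gb / size, size);

    MPI_Finalize();
    return 0;
}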