MPI Group Management & Communicators

Introduction

MPI provides process grouping capabilities that allow the programmer to:

  • Organize tasks into task groups based on the nature of the application.
  • Enable collective communication operations across a subset of related tasks.
  • Provide a basis for implementing virtual communication topologies.
  • Ensure communication safety and help manage application complexity.

The programmer can create a group and associate a communicator with that group. The new communicator can be used in point-to-point or collective communication routines. Both groups and communicators are MPI objects (stored in system space) accessed by handles (returned from or passed to MPI routines).


Group

  • A group is an ordered set of processes. Each process in a group is associated with a unique integer rank. Rank values start at zero and go to N-1, where N is the number of processes in the group.
  • One process can belong to two or more groups.
  • Groups are represented by opaque group objects, and hence cannot be directly transferred from one process to another.
  • A group is used within a communicator to describe the participants in a communication "universe" and to rank such participants.
  • The group associated with a communicator always includes the local (calling) process. The source and destination of a message are identified by process rank within that group.
  • Groups are dynamic objects in MPI; they can be created and destroyed during program execution (a short sketch follows this list).
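
As a minimal sketch of these points (assuming at least two processes; the two-member subset is an arbitrary choice), the following extracts the group behind MPI_COMM_WORLD, builds a subgroup, and shows that a non-member sees MPI_UNDEFINED as its rank:

#include "mpi.h"
#include <stdio.h>

int main(int argc, char *argv[])
{
   MPI_Group world_group, first_two;
   int members[2] = {0, 1};   /* an arbitrary two-member subset */
   int grp_size, grp_rank;

   MPI_Init(&argc, &argv);

   /* Every communicator carries a group; extract its handle */
   MPI_Comm_group(MPI_COMM_WORLD, &world_group);

   /* Build a new group holding only world ranks 0 and 1 */
   MPI_Group_incl(world_group, 2, members, &first_two);

   MPI_Group_size(first_two, &grp_size);
   MPI_Group_rank(first_two, &grp_rank);   /* MPI_UNDEFINED for non-members */

   printf("group size= %d my group rank= %d\n", grp_size, grp_rank);

   /* Groups are dynamic: free them when no longer needed */
   MPI_Group_free(&first_two);
   MPI_Group_free(&world_group);
   MPI_Finalize();
   return 0;
}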

Communicator and Classification

  • A communicator is an opaque object with a number of attributes together with simple rules that govern its creation, use, and destruction.
  • The communicator determines the scope and the "communication universe" in which a point-to-point or collective operation is to operate.
  • Each communicator contains a group of valid participants. The source and destination of a message is identified by process rank within that group.
  • An intracommunicator is a communicator used for communication within a single group of processes.
  • An intercommunicator is used for communication between two disjoint groups of processes. In MPI-1, an intercommunicator supports only point-to-point communication between the two groups; in MPI-2, it can also be used for collective communication between them (see the sketch after this list).
  • Communicators are dynamic, i.e., they can be created and destroyed during program execution.
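
As a sketch of the intercommunicator idea (assuming at least two processes; the tag value 99 is arbitrary), the world can be split into two halves and the halves bound into a single intercommunicator:

#include "mpi.h"
#include <stdio.h>

int main(int argc, char *argv[])
{
   MPI_Comm half_comm, inter_comm;
   int rank, size, color, remote;

   MPI_Init(&argc, &argv);
   MPI_Comm_rank(MPI_COMM_WORLD, &rank);
   MPI_Comm_size(MPI_COMM_WORLD, &size);

   /* Two disjoint intracommunicators: lower and upper halves */
   color = (rank < size/2) ? 0 : 1;
   MPI_Comm_split(MPI_COMM_WORLD, color, rank, &half_comm);

   /* Bind the halves into one intercommunicator.  Each half's leader
      is its local rank 0; the remote leader is named by its rank in
      the peer communicator MPI_COMM_WORLD.  Tag 99 is arbitrary. */
   MPI_Intercomm_create(half_comm, 0, MPI_COMM_WORLD,
                        (color == 0) ? size/2 : 0, 99, &inter_comm);

   /* For an intercommunicator, the "remote" group is the other half */
   MPI_Comm_remote_size(inter_comm, &remote);
   printf("[%d]: remote group has %d processes\n", rank, remote);

   MPI_Comm_free(&inter_comm);
   MPI_Comm_free(&half_comm);
   MPI_Finalize();
   return 0;
}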

Comments on MPI_COMM_WORLD

  • MPI_COMM_WORLD is the initial universe intracommunicator, available to all processes for communication once MPI_INIT has been called.
  • In MPI-1, where the set of processes executing the MPI computation is fixed, MPI_COMM_WORLD is a communicator of all processes available for the computation. This communicator has the same value in all processes.
  • In MPI-2, where processes can dynamically join an MPI execution, it may be the case that a process starts an MPI computation without having access to all other processes. In such situations, MPI_COMM_WORLD is a communicator incorporating all processes with which the joining process can immediately communicate. Therefore, MPI_COMM_WORLD may simultaneously have different values in different processes (a sketch of MPI-2 process spawning follows).
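
As a sketch of the MPI-2 dynamic-process case (the "./worker" executable name is hypothetical, and support for dynamic processes varies across MPI installations), a parent program can spawn children that receive their own MPI_COMM_WORLD, reachable from the parents only through the returned intercommunicator:

#include "mpi.h"
#include <stdio.h>

int main(int argc, char *argv[])
{
   MPI_Comm children;   /* intercommunicator to the spawned processes */
   int rank;

   MPI_Init(&argc, &argv);
   MPI_Comm_rank(MPI_COMM_WORLD, &rank);

   /* Collectively spawn 2 copies of a separate executable.  The
      children share their own MPI_COMM_WORLD, distinct from the
      parents' MPI_COMM_WORLD; the two sides can talk only through
      the returned intercommunicator. */
   MPI_Comm_spawn("./worker", MPI_ARGV_NULL, 2, MPI_INFO_NULL,
                  0, MPI_COMM_WORLD, &children, MPI_ERRCODES_IGNORE);

   if (rank == 0) printf("spawned 2 worker processes\n");

   MPI_Comm_free(&children);
   MPI_Finalize();
   return 0;
}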


Commonly used MPI Group Routines

MPI_Group_size returns the number of processes in a group
MPI_Group_rank returns the rank of the calling process in a group
MPI_Group_compare compares group members and group order
MPI_Group_translate_ranks translates the ranks of processes in one group to ranks in another group
MPI_Comm_group returns the group associated with a communicator
MPI_Group_union creates a group by combining two groups
MPI_Group_intersection creates a group from the intersection of two groups
MPI_Group_difference creates a group from the difference of two groups
MPI_Group_incl creates a group from listed members of an existing group
MPI_Group_excl creates a group excluding listed members of an existing group
MPI_Group_range_incl creates a group from ranges of ranks, each given as (first rank, last rank, stride)
MPI_Group_range_excl creates a group by deleting ranges of ranks, each given as (first rank, last rank, stride)
MPI_Group_free marks a group for deallocation
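
As an illustration of the set-style routines above, a minimal sketch (assuming at least 7 processes; the membership lists are arbitrary):

#include "mpi.h"
#include <stdio.h>

int main(int argc, char *argv[])
{
   MPI_Group world_group, evens, low, overlap;
   int even_ranks[4] = {0, 2, 4, 6};
   int low_ranks[4]  = {0, 1, 2, 3};
   int n;

   MPI_Init(&argc, &argv);
   MPI_Comm_group(MPI_COMM_WORLD, &world_group);

   MPI_Group_incl(world_group, 4, even_ranks, &evens);
   MPI_Group_incl(world_group, 4, low_ranks, &low);

   /* The intersection contains world ranks 0 and 2;
      MPI_Group_union and MPI_Group_difference work analogously */
   MPI_Group_intersection(evens, low, &overlap);
   MPI_Group_size(overlap, &n);
   printf("intersection has %d members\n", n);

   MPI_Group_free(&overlap);
   MPI_Group_free(&low);
   MPI_Group_free(&evens);
   MPI_Group_free(&world_group);
   MPI_Finalize();
   return 0;
}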

Commonly used MPI Communicator Routines

MPI_Comm_size returns the number of processes in the communicator's group
MPI_Comm_rank returns the rank of the calling process in the communicator's group
MPI_Comm_compare compares two communicators
MPI_Comm_dup duplicates a communicator
MPI_Comm_create creates a new communicator for a group
MPI_Comm_split splits a communicator into multiple, non-overlapping communicators
MPI_Comm_free marks a communicator for deallocation

More Group and Communicator routines can be found on the MPI standard home page:
www.mpi-forum.org
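
As a minimal sketch of MPI_Comm_dup and MPI_Comm_compare (the point here is that a duplicate keeps the same group but gets a new communication context, so the comparison yields MPI_CONGRUENT rather than MPI_IDENT):

#include "mpi.h"
#include <stdio.h>

int main(int argc, char *argv[])
{
   MPI_Comm dup_comm;
   int rank, result;

   MPI_Init(&argc, &argv);
   MPI_Comm_rank(MPI_COMM_WORLD, &rank);

   /* The duplicate has the same group and ordering as the original,
      but a separate communication context */
   MPI_Comm_dup(MPI_COMM_WORLD, &dup_comm);

   /* Same group, same order, different context => MPI_CONGRUENT */
   MPI_Comm_compare(MPI_COMM_WORLD, dup_comm, &result);
   if (rank == 0)
      printf("compare returned %s\n",
             (result == MPI_CONGRUENT) ? "MPI_CONGRUENT" : "something else");

   MPI_Comm_free(&dup_comm);
   MPI_Finalize();
   return 0;
}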

Typical usage of groups and communicators

• Extract the handle of the global group from MPI_COMM_WORLD using MPI_Comm_group
• Form a new group as a subset of the global group using MPI_Group_incl
• Create a new communicator for the new group using MPI_Comm_create
• Determine the new rank in the new communicator using MPI_Comm_rank
• Conduct communications using any MPI message passing routine
• When finished, free up the new communicator and group (optional) using MPI_Comm_free and MPI_Group_free

Example 1:

To create two different process groups for separate collective communications:

/* Begin of the C code */
#include "mpi.h"
#include <stdio.h>
#define NPROCS 8

int main(int argc, char *argv[])
{
   int rank, size, new_rank, sendbuf, recvbuf;
   int ranks1[4] = {0,1,2,3}, ranks2[4] = {4,5,6,7};
   MPI_Group orig_group, new_group;
   MPI_Comm new_comm;

   MPI_Init(&argc, &argv);
   MPI_Comm_rank(MPI_COMM_WORLD, &rank);
   MPI_Comm_size(MPI_COMM_WORLD, &size);
   if (size != NPROCS) {   /* the rank lists below assume 8 processes */
      if (rank == 0) printf("This example must be run with %d processes\n", NPROCS);
      MPI_Finalize();
      return 1;
   }
   sendbuf = rank;

   /* Extract the original group handle */
   MPI_Comm_group(MPI_COMM_WORLD, &orig_group);

   /* Divide tasks into two distinct groups based upon rank */
   if (rank < NPROCS/2)
      MPI_Group_incl(orig_group, NPROCS/2, ranks1, &new_group);
   else
      MPI_Group_incl(orig_group, NPROCS/2, ranks2, &new_group);

   /* Create new communicator and then perform collective communications */
   MPI_Comm_create(MPI_COMM_WORLD, new_group, &new_comm);
   MPI_Allreduce(&sendbuf, &recvbuf, 1, MPI_INT, MPI_SUM, new_comm);
   MPI_Group_rank(new_group, &new_rank);

   printf("rank= %d newrank= %d recvbuf= %d\n", rank, new_rank, recvbuf);

   MPI_Finalize();
   return 0;
}
/* END of the C code */

Sample program output:

rank= 7 newrank= 3 recvbuf= 22
rank= 0 newrank= 0 recvbuf= 6
rank= 1 newrank= 1 recvbuf= 6
rank= 2 newrank= 2 recvbuf= 6
rank= 6 newrank= 2 recvbuf= 22
rank= 3 newrank= 3 recvbuf= 6
rank= 4 newrank= 0 recvbuf= 22
rank= 5 newrank= 1 recvbuf= 22
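
Note that each group's MPI_Allreduce sums the members' MPI_COMM_WORLD ranks: 0+1+2+3 = 6 in the first group and 4+5+6+7 = 22 in the second. The output lines appear in arbitrary order because the prints from different processes are not synchronized.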

Example 2:

To create a virtual topology with a matrix structure:

#include "mpi.h"
#include <stdio.h>

int main(int argc, char **argv)
{
   MPI_Comm row_comm, col_comm;
   int myrank, size, P = 4, Q = 3, p, q;

   MPI_Init(&argc, &argv);
   MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
   MPI_Comm_size(MPI_COMM_WORLD, &size);

   /* Determine row and column position in a P x Q grid
      (assumes size = P*Q = 12 processes) */
   p = myrank / Q;
   q = myrank % Q;   /* pick a row-major mapping */

   /* Split comm into row and column comms */
   MPI_Comm_split(MPI_COMM_WORLD, p, q, &row_comm);   /* color by row, rank by column */
   MPI_Comm_split(MPI_COMM_WORLD, q, p, &col_comm);   /* color by column, rank by row */

   printf("[%d]: My coordinates are (%d,%d)\n", myrank, p, q);

   MPI_Finalize();
   return 0;
}
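
With these two communicators in hand, a collective call such as MPI_Bcast on row_comm involves only the Q processes in the caller's row, and likewise col_comm restricts a collective to one column; this is the usual first step in building matrix-style virtual topologies by hand.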

Example 3:

To divide a communicator into two non-overlapping groups:

#include "mpi.h"
#include <stdio.h>

int main(int argc, char **argv)
{
   MPI_Comm row_comm;
   int myrank, size, p, q, Q;

   MPI_Init(&argc, &argv);
   MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
   MPI_Comm_size(MPI_COMM_WORLD, &size);

   /* Determine row and column position: the lower half of the
      ranks forms row 0, the upper half forms row 1 */
   Q = size / 2;
   p = 1;
   q = myrank - Q;
   if (myrank < Q) {
      p = 0;
      q = myrank;
   }

   /* Split comm into two groups */
   MPI_Comm_split(MPI_COMM_WORLD, p, q, &row_comm);   /* color by row, rank by column */

   printf("[%d]: My coordinates are (%d,%d)\n", myrank, p, q);

   MPI_Finalize();
   return 0;
}
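
The same two communicators could also be built the long way, with MPI_Comm_group, MPI_Group_range_incl, and MPI_Comm_create; MPI_Comm_split simply packages the group construction and communicator creation into a single collective call.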

Example 4:

To divide a communicator such that:

• all processes with even ranks are in one group
• all processes with odd ranks are in the other group
• within each group, processes are ranked in the reverse of their original order

Hints:

color = (rank % 2 == 0) ? 0 : 1;
key = size - rank;
MPI_Comm_split(comm, color, key, &newcomm);

Comments: Roles of the parameters "color" and "key" in MPI_Comm_split

"color" controls subset assignment: all processes in "comm" that pass the same color value are placed in the same new communicator.

"key" controls rank assignment within each new communicator: processes are ranked in order of increasing key, with ties broken by their rank in the old communicator. A complete program sketch follows.

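Putting the hints together, a complete sketch of Example 4 (a minimal program; the printed labels are illustrative):

#include "mpi.h"
#include <stdio.h>

int main(int argc, char *argv[])
{
   MPI_Comm newcomm;
   int rank, size, new_rank, color, key;

   MPI_Init(&argc, &argv);
   MPI_Comm_rank(MPI_COMM_WORLD, &rank);
   MPI_Comm_size(MPI_COMM_WORLD, &size);

   color = (rank % 2 == 0) ? 0 : 1;   /* evens in one group, odds in the other */
   key   = size - rank;               /* larger old rank => smaller key => lower new rank */

   MPI_Comm_split(MPI_COMM_WORLD, color, key, &newcomm);
   MPI_Comm_rank(newcomm, &new_rank);

   printf("world rank= %d color= %d new rank= %d\n", rank, color, new_rank);

   MPI_Comm_free(&newcomm);
   MPI_Finalize();
   return 0;
}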

Exercise:

• Divide the default communicator MPI_COMM_WORLD into 3 non-overlapping communicators such that
  • processes with rank 0, 3, 6, ... belong to the first group
  • processes with rank 1, 4, 7, ... belong to the second group
  • processes with rank 2, 5, 8, ... belong to the third group

• Print the rank in MPI_COMM_WORLD and the rank in the new communicator.
