Open Source Your Knowledge, Become a Contributor

Technology knowledge has to be shared and made accessible for free. Join the movement.

Create Content

Topologies

MPI topology mechanism is an extra, optional attribute that one can give to an intra-communicator; topologies cannot be added to inter-communicators. A topology can provide a convenient naming mechanism for the processes of a group (within a communicator), and additionally, may assist the runtime system in mapping the processes onto hardware.

As stated for Groups, Communicators, a process group in MPI is a collection of n processes. Each process in the group is assigned a rank between 0 and n-1. In many parallel applications a linear ranking of processes does not adequately reflect the logical communication pattern of the processes (which is usually determined by the underlying problem geometry and the numerical algorithm used).

Often the processes are arranged in topological patterns such as two- or three-dimensional grids. More generally, the logical process arrangement is described by a graph. In this chapter we will refer to this logical process arrangement as the virtual topology.

A clear distinction must be made between the virtual process topology and the topology of the underlying, physical hardware. The virtual topology can be exploited by the system in the assignment of processes to physical processors, if this helps to improve the communication performance on a given machine. How this mapping is done, however, is outside the scope of MPI. The description of the virtual topology, on the other hand, depends only on the application, and is machine-independent. The functions that are described in this chapter deal with machine-independent mapping and communication on virtual process topologies.

What are a virtual topologies?

The communication pattern of a set of processes can be represented by a graph. The nodes represent processes, and the edges connect processes that communicate with each other. MPI provides message-passing between any pair of processes in a group. There is no requirement for opening a channel explicitly. Therefore, a missing link in the user-defined process graph does not prevent the corresponding processes from exchanging messages. It means rather that this connection is neglected in the virtual topology. This strategy implies that the topology gives no convenient way of naming this pathway of communication. Another possible consequence is that an automatic mapping tool (if one exists for the runtime environment) will not take account of this edge when mapping.

Specifying the virtual topology in terms of a graph is sufficient for all applications. However, in many applications the graph structure is regular, and the detailed set-up of the graph would be inconvenient for the user and might be less efficient at run time. A large fraction of all parallel applications use process topologies like rings, two- or higher-dimensional grids, or tori. These structures are completely defined by the number of dimensions and the numbers of processes in each coordinate direction. Also, the mapping of grids and tori is generally an easier problem than that of general graphs. Thus, it is desirable to address these cases explicitly.

Example Cartesian Topology

Process coordinates in a Cartesian structure begin their numbering at 0. Row-major numbering is always used for the processes in a Cartesian structure. This means that, for example, the relation between group rank and coordinates for four processes in a (2 × 2) grid is as follows.

  • coord (0,0): rank 0
  • coord (0,1): rank 1
  • coord (1,0): rank 2
  • coord (1,1): rank 3

Topologies Constructor

In the following several MPI topologies are described. The available MPI topologies are: MPI_GRAPH the graph topology, MPI_CART the cartesian topology, and MPI_DIST_GRAPH distributed graph topology. When is not defined a topology for a communicator the value MPI_UNDEFINED is used.

Moreover, MPI provides usefull function to inquiry topology, here the most important:

  • MPI_TOPO_TEST returns the type of topology that is assigned to a communicator.
    int MPI_Topo_test(MPI_Comm comm, int *status)
    
    • IN comm, communicator (handle)
    • OUT status, topology type of communicator comm (state)
  • MPI_GRAPH_GET retrieve the graph-topologyinformation that was associated with a communicator.
    int MPI_Graph_get(MPI_Comm comm, int maxindex, int maxedges, int index[],int edges[])
    
    • IN comm, communicator with graph structure (handle)
    • IN maxindex, length of vector index in the calling program (integer)
    • IN maxedges, length of vector edges in the calling program (integer)
    • OUT index, array of integers containing the graph structure (for details see the definition of MPI_GRAPH_CREATE)
    • OUT edges, array of integers containing the graph structure
  • MPI_CART_GET return the Cartesian topology information that was associated with a communicator.
    int MPI_Cart_get(MPI_Comm comm, int maxdims, int dims[], int periods[], int coords[])
    
    • IN comm, communicator with Cartesian structure (handle)
    • IN maxdims, length of vectors dims, periods, and coords in the calling program (integer)
    • OUT dims, number of processes for each Cartesian dimension (array of integer)
    • OUT periods, periodicity ( true/ false) for each Cartesian dimension (array of logical)
    • OUT coords, coordinates of calling process in Cartesian structure (array of integer)
  • MPI_CART_RANK translates the logical process coordinates to process ranks as they are used by the point-to-point routines.
    int MPI_Cart_rank(MPI_Comm comm, const int coords[], int *rank)
    
    • IN comm, communicator with Cartesian structure (handle)
    • IN coords, integer array (of size ndims) specifying the Cartesian coordinates of a process
    • OUT rank, rank of specified process (integer)
  • MPI_CART_COORDS provides rank-to-coordinates translation (inverse mapping),
    int MPI_Cart_coords(MPI_Comm comm, int rank, int maxdims, int coords[])
    
    • IN comm, communicator with Cartesian structure (handle)
    • IN rank, rank of a process within group of comm (integer)
    • IN maxdims, length of vector coords in the calling program (integer)
    • OUT coords, integer array (of size ndims) containing the Cartesian coordinates of specified process (array of integers)

Cartesian Constructor

MPI_CART_CREATE returns a handle to a new communicator to which the Cartesian topology information is attached. If reorder is false then the rank of each process in the new group is identical to its rank in the old group. Otherwise, the function may reorder the processes (possibly so as to choose a good embedding of the virtual topology onto the physical machine). If the total size of the Cartesian grid is smaller than the size of the group of comm_old, then some processes are returned MPI_COMM_NULL, in analogy to MPI_COMM_SPLIT. If ndims is zero then a zero-dimensional Cartesian topology is created. The call is erroneous if it specifies a grid that is larger than the group size or if ndims is negative.

Cartesian

int MPI_Cart_create(MPI_Comm comm_old, int ndims, const int dims[], const int periods[], int reorder, MPI_Comm *comm_cart)
  • IN comm_old, input communicator (handle)
  • IN ndims, number of dimensions of Cartesian grid (integer)
  • IN dims, integer array of size ndims specifying the number of processes in each dimension
  • IN periods, logical array of size ndims specifying whether the grid is periodic ( true) or not ( false) in each dimension
  • IN reorder, ranking may be reordered ( true) or not ( false) (logical)
  • OUT comm_cart, communicator with new Cartesian topology (handle)

The following example uses 12 processes.

MPI_Cart_create

Graph Constructor

MPI_GRAPH_CREATE returns a handle to a new communicator to which the graph topology information is attached. If reorder = false then the rank of each process in the new group is identical to its rank in the old group. Otherwise, the function may reorder the processes. If the size, nnodes, of the graph is smaller than the size of the group of comm_old, then some processes are returned MPI_COMM_NULL, in analogy to MPI_CART_CREATE and MPI_COMM_SPLIT. If the graph is empty, i.e., nnodes == 0, then MPI_COMM_NULL is returned in all processes. The call is erroneous if it specifies a graph that is larger than the group size of the input communicator.

The three parameters nnodes, index and edges define the graph structure. nnodes is the number of nodes of the graph. The nodes are numbered from 0 to nnodes-1. The i-th entry of array index stores the total number of neighbors of the first i graph nodes. The lists of neighbors of nodes 0, 1, ..., nnodes-1 are stored in consecutive locations in array edges. The array edges is a flattened representation of the edge lists. The total number of entries in index is nnodes and the total number of entries in edges is equal to the number of graph edges.

Graph

int MPI_Graph_create(MPI_Comm comm_old, int nnodes, const int index[], const int edges[], int reorder, MPI_Comm *comm_graph)
  • IN comm_old, input communicator (handle)
  • IN nnodes, number of nodes in graph (integer)
  • IN index, array of integers describing node degrees (see below)
  • IN edges, array of integers describing graph edges (see below)
  • IN reorder, ranking may be reordered ( true) or not ( false) (logical)
  • OUT comm_graph, communicator with graph topology added (handle)

The following example uses 4 processes.

MPI_Graph_create
Open Source Your Knowledge: become a Contributor and help others learn. Create New Content