Communicating between components

CanESM5 runs in a multiple-program, multiple-data (MPMD) paradigm in which each component (atmosphere, ocean, and coupler) runs its own executable and communicates with the others using MPI. All MPI tasks exist in the same MPI_COMM_WORLD. CanCPL itself runs on a single MPI task and has no further parallelization within it. This page details how fields are sent between components and the transformations that occur along the way.

Coupler API

All three components compile com_cpl.F90, which contains routines that initialize MPI on every task, create the MPI communicators, define the interfaces to send and receive data via MPI (see the API reference for a full description), and set up the list of variables that will be passed through the coupler. The coupler API is accessed in only a handful of routines in each component:

AGCM

  • gcm18.F

  • mpi_getcpl2.F

  • mpi_putggb2.F

Ocean

  • cpl_cancpl.F90

Organization of MPI communications

CanESM5 is run in a multiple-program, multiple-data (MPMD) paradigm where each component of the Earth system (CanAM, CanCPL, and NEMO) has its own executable. Every MPI task exists within the default MPI_COMM_WORLD communicator. CanAM and NEMO each use MPI internally to exchange data between their own tasks. These calls are necessarily blocking, so MPI_COMM_WORLD must be split into separate communicators to allow each component to run in parallel.

These communicators are defined in the subroutine define_group in com_cpl.F90. During initialization each component calls this subroutine with a three-character identifier (cpl, atm, or ocn) depending on which component it is. These character strings are then mapped onto predefined integer parameters in cpl_types.F90 to avoid string comparisons. The integer parameters are then used to create the MPI communicators for each group. Additional information about each group is stored in the group_info array, which contains the ‘leader’ of each group (the task with the lowest MPI rank in that group) and the ranks associated with each communicator. Some of this information is then broadcast from the leader of each group to ensure that every MPI task has the same information.
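The bookkeeping described above can be illustrated with a small sketch. It is written in plain Python and makes no MPI calls; it only mimics the outcome of splitting MPI_COMM_WORLD by an integer identifier. The integer values, the rank layout, and the names GROUP_IDS, define_groups, and local_rank are assumptions made for illustration and are not taken from com_cpl.F90 or cpl_types.F90.

```python
# Pure-Python sketch of the bookkeeping behind define_group; no MPI is used.
# The integer values and the rank layout below are hypothetical.
CPL, ATM, OCN = 0, 1, 2
GROUP_IDS = {"cpl": CPL, "atm": ATM, "ocn": OCN}


def define_groups(component_of_rank):
    """Build per-group info from a list giving each world rank's identifier.

    Mimics what splitting MPI_COMM_WORLD by colour yields: tasks that pass
    the same integer end up in the same group, ordered by world rank.
    """
    group_info = {}
    for world_rank, name in enumerate(component_of_rank):
        gid = GROUP_IDS[name]  # string -> integer, done once at start-up
        # Ranks are visited in ascending order, so the first rank seen for a
        # group is its lowest rank, i.e. the group leader.
        info = group_info.setdefault(gid, {"leader": world_rank, "ranks": []})
        info["ranks"].append(world_rank)
    return group_info


def local_rank(group_info, gid, world_rank):
    """Rank of a task inside its own group."""
    return group_info[gid]["ranks"].index(world_rank)


# Hypothetical layout: 1 coupler task, 4 atmosphere tasks, 3 ocean tasks.
layout = ["cpl"] + ["atm"] * 4 + ["ocn"] * 3
groups = define_groups(layout)
print(groups[ATM])
```

In this hypothetical layout the atmosphere leader is world rank 1 and the ocean leader is world rank 5. In the real model each task would only know its own identifier at first, which is why the leaders broadcast the group information afterwards.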

Exchanging coupled fields

The transfer of fields between communicators is handled only by the lead task of each communicator. These fields comprise the entire global array and not just the subdomain (in the ocean) or latitude bands (in the atmosphere) associated with an individual task. After receiving the field, the lead task scatters the global array to all other tasks within its communicator.

The following describes the communication pathway using sea surface temperature as an example:

  1. The lead ocn task constructs the global SST array from the subdomain of every other ocn task.

  2. The lead ocn task sends SST to the main (and only) cpl task.

  3. The cpl task remaps SST from the ocean grid to the atmospheric grid.

  4. The cpl task sends the global SST array to the lead atm task.

  5. The lead atm task scatters the global SST array to every other atm task.

  6. Each atmospheric task copies only the part of the global array that it needs.
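The six steps above can be traced with a toy simulation. This is a sketch in plain Python in which lists stand in for MPI messages. The one-dimensional grids, the task counts, the SST values, and the block-averaging remap are all assumptions chosen to keep the example short; they do not reflect the actual grids or the remapping performed by CanCPL.

```python
# Pure-Python simulation of the six steps; lists stand in for MPI messages.
# Grid sizes, the decomposition and the remapping are all hypothetical.

def gather(subdomains):
    """Step 1: the lead ocn task concatenates every task's subdomain."""
    return [value for sub in subdomains for value in sub]


def remap(ocean_field, n_atm):
    """Step 3: stand-in remap that block-averages onto a coarser 1-D grid.

    The real remapping is more involved; this only shows that the field
    changes grid on the coupler task.
    """
    ratio = len(ocean_field) // n_atm
    return [sum(ocean_field[i * ratio:(i + 1) * ratio]) / ratio
            for i in range(n_atm)]


def copy_local_part(global_field, task, n_tasks):
    """Step 6: each atm task keeps only its own contiguous band."""
    band = len(global_field) // n_tasks
    return global_field[task * band:(task + 1) * band]


# Three ocn tasks, each holding four SST values (degrees C) of its subdomain.
ocn_subdomains = [[10.0, 12.0, 14.0, 16.0],
                  [18.0, 20.0, 22.0, 24.0],
                  [26.0, 28.0, 26.0, 24.0]]

sst_ocean = gather(ocn_subdomains)        # step 1, on the lead ocn task
sst_on_cpl = list(sst_ocean)              # step 2, lead ocn task -> cpl task
sst_atm = remap(sst_on_cpl, n_atm=6)      # step 3, on the cpl task
sst_on_atm_lead = list(sst_atm)           # step 4, cpl task -> lead atm task
n_atm_tasks = 2
# Step 5: every atm task receives a copy of the full global array.
received = [list(sst_on_atm_lead) for _ in range(n_atm_tasks)]
# Step 6: each atm task keeps only the band it owns.
local = [copy_local_part(received[t], t, n_atm_tasks)
         for t in range(n_atm_tasks)]
print(local)
```

The sketch makes the cost of the pathway visible: the full global array is assembled once on the ocean side and copied in full to every atmosphere task, even though each task keeps only a fraction of it.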

Potential future enhancements

  • Eliminate the need for steps (4) and (5) above by creating an intercommunicator between cpl and atm/ocn

  • Refactor coupler to support more tasks to enhance performance of the ESMF remapping

  • Avoid repeated code by refactoring the bcast routines

  • Generate mapping between processor domains to avoid full global gathers/scatters