Communicating between components
CanESM5 runs in a multiple-program, multiple-data (MPMD) paradigm where each component (atmosphere, ocean, and coupler)
runs its own executable and communicates with the others using MPI. All MPI tasks exist in the same MPI_COMM_WORLD.
CanCPL itself runs on a single MPI task (and has no further parallelization within it). This page details how fields
are sent between components and the transformations that occur along the way.
Coupler API
All three components compile com_cpl.F90, which contains routines that create and initialize the MPI communicators on
every task, define the interfaces to send and receive data via MPI (see the API reference for a full description), and
set up the list of variables that will be passed through the coupler. The coupler API is accessed in only a handful of
routines in each component:
AGCM
gcm18.F
mpi_getcpl2.F
mpi_putggb2.F
Ocean
cpl_cancpl.F90
Organization of MPI communications
CanESM5 is run in a multiple-program, multiple-data (MPMD) paradigm where each component of the Earth system (CanAM,
CanCPL, and NEMO) has its own executable. Every MPI task exists within the default MPI_COMM_WORLD communicator. CanAM
and NEMO each use MPI internally to exchange data between their own tasks. These calls are necessarily blocking, so
MPI_COMM_WORLD must be split into per-component communicators to allow each component to run in parallel.
The definition of these communicators is done in the subroutine define_group in com_cpl.F90. During
initialization each component calls this subroutine with a three-character identifier cpl, atm, or ocn,
depending on which component it is. These character strings are then mapped onto pre-defined, integer parameters in
cpl_types.F90 to avoid string comparisons. The integer parameters are then used to create the MPI communicators for
each group. Additional information about each group is stored in the group_info array which contains the ‘leader’
of each group (the task with the lowest MPI rank in each group) and the ranks associated with each communicator.
Some of this information is then broadcast from the leader of each group to ensure that every MPI task has the same
information.
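The grouping logic described above can be illustrated with a small, MPI-free sketch. It is not the code in com_cpl.F90: the integer values, the function name define_groups, and the dictionary layout of group_info are assumptions made for illustration only. It shows the idea of mapping each three-character identifier onto an integer, collecting the world ranks that share that integer into one group, and recording the lowest rank as the leader.

```python
# Hypothetical integer parameters standing in for those in cpl_types.F90.
CPL, ATM, OCN = 0, 1, 2
GROUP_IDS = {"cpl": CPL, "atm": ATM, "ocn": OCN}


def define_groups(task_names):
    """Simulate splitting a world communicator by component.

    task_names[i] is the three-character identifier passed by world rank i.
    Returns {group id: {"leader": lowest world rank, "ranks": [world ranks]}}.
    """
    group_info = {}
    for world_rank, name in enumerate(task_names):
        color = GROUP_IDS[name]  # the string is looked up once; integers are used afterwards
        entry = group_info.setdefault(color, {"leader": world_rank, "ranks": []})
        entry["ranks"].append(world_rank)
        entry["leader"] = min(entry["leader"], world_rank)
    return group_info


# Example layout (an assumption): 1 coupler task, 4 atmosphere tasks, 3 ocean tasks.
world = ["cpl"] + ["atm"] * 4 + ["ocn"] * 3
group_info = define_groups(world)
```

In a real MPI program the grouping step would typically be a communicator split keyed on the integer identifier, and the leader information would then be broadcast so that every task holds the same table.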
Exchanging coupled fields
The transfer of fields between communicators is handled only by the lead (master) task of each group. Each transferred field comprises the entire global array, not just the subdomain (in the ocean) or latitude bands (in the atmosphere) associated with an individual task. After receiving a field, the lead task scatters the global array to all other tasks within its communicator.
The following describes the communication pathway using sea surface temperature as an example:
1. The lead ocn task constructs the global SST array from the subdomain of every other ocn task.
2. The lead ocn task sends SST to the main (and only) cpl task.
3. The cpl task remaps SST from the ocean grid to the atmospheric grid.
4. The cpl task sends the global SST array to the master atm task.
5. The lead atm task scatters the global SST array to every other atm task.
6. Each atmospheric task copies only the part of the global array that it needs.
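The pathway above can be traced with a toy, single-process sketch in which plain lists stand in for MPI messages. All function names, grid sizes, and values are hypothetical. In particular, the remap function is a nearest-row placeholder and does not represent the ESMF remapping performed by the coupler.

```python
def gather(subdomains):
    """Step 1: the lead ocn task concatenates per-task rows into a global array."""
    return [row for sub in subdomains for row in sub]


def remap(field, n_rows_out):
    """Step 3 stand-in: nearest-row sampling onto a coarser grid (NOT the ESMF remapping)."""
    n_in = len(field)
    return [field[(i * n_in) // n_rows_out] for i in range(n_rows_out)]


def scatter_bands(field, n_tasks):
    """Steps 5-6: every atm task sees the global array and keeps only its latitude band."""
    rows_per_task = len(field) // n_tasks
    return [field[t * rows_per_task:(t + 1) * rows_per_task] for t in range(n_tasks)]


# Three ocn tasks, each owning two rows of a 6-row "SST" field (values are made up).
ocn_subdomains = [[[1.0, 1.0], [2.0, 2.0]],
                  [[3.0, 3.0], [4.0, 4.0]],
                  [[5.0, 5.0], [6.0, 6.0]]]

sst_ocean_grid = gather(ocn_subdomains)          # step 1, on the lead ocn task
sst_on_cpl = sst_ocean_grid                      # step 2, ocn leader -> cpl task
sst_atm_grid = remap(sst_on_cpl, n_rows_out=4)   # step 3, on the cpl task
sst_on_atm_leader = sst_atm_grid                 # step 4, cpl task -> atm leader
atm_bands = scatter_bands(sst_on_atm_leader, 2)  # steps 5-6, two atm tasks
```

The sketch makes the cost of the design visible: the full global array passes through a single task at every hop, even though each atmospheric task ultimately keeps only its own band.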
Potential future enhancements
Eliminate the need for steps (4) and (5) above by creating an intercommunicator between cpl and atm/ocn
Refactor coupler to support more tasks to enhance performance of the ESMF remapping
Avoid repeated code by refactoring the bcast routines
Generate mapping between processor domains to avoid full global gathers/scatters