
Draft: ENH: improve handling of multi-pass send/recv

Mark OLESEN requested to merge update-Pstream-large-sends into develop
  • the maxCommsSize variable is used to 'chunk' large data transfers (e.g., with PstreamBuffers) into a multi-pass send/recv sequence.

    The send/recv windows for chunk-wise transfers:

    iter    data window
    ----    -----------
    0       [0, chunk]
    1       [chunk, 2*chunk]
    2       [2*chunk, 3*chunk]
    ...

    In previous versions, the number of chunks was determined by the sender sizes, which required an additional MPI_Allreduce to establish a consistent overall number of chunks to walk. Because this overhead was incurred on every exchange, maxCommsSize was rarely actually enabled.

    We can, however, instead rely on the send/recv buffers having been consistently sized and simply walk through the local send/recvs until no further chunks need to be exchanged. As an additional enhancement, the message tags are connected to the chunking iteration, which allows all send/recvs to be set up without an intermediate Allwait.
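As a rough illustration of the local walk described above (not the code in this merge request), the following sketch lists the windows for one buffer. It contains no MPI calls. The names `ChunkWindow`, `chunkWindows` and `baseTag` are hypothetical, the windows are read as half-open ranges, and "tag = base tag + iteration" is only one assumed way of connecting tags to the chunking iteration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper type: one pass of a multi-pass transfer
struct ChunkWindow
{
    std::size_t begin;  // first byte of the window
    std::size_t end;    // one past the last byte of the window
    int tag;            // message tag for this pass (assumed: baseTag + iter)
};

// List the windows for a buffer of nBytes, walked in steps of chunk bytes.
// Only local sizes are used, so no global reduction is needed to
// determine how many passes this particular buffer requires.
std::vector<ChunkWindow> chunkWindows
(
    std::size_t nBytes,
    std::size_t chunk,
    int baseTag
)
{
    std::vector<ChunkWindow> windows;

    if (chunk == 0)
    {
        return windows;  // Nothing sensible to do with a zero-sized chunk
    }

    int iter = 0;
    std::size_t pos = 0;

    while (pos < nBytes)
    {
        // The final window may be shorter than a full chunk
        const std::size_t len = std::min(chunk, nBytes - pos);

        windows.push_back(ChunkWindow{pos, pos + len, baseTag + iter});

        pos += len;
        ++iter;
    }

    return windows;
}
```

If the sender and the receiver agree on the buffer size, both sides obtain the same window list independently, and the distinct tag per pass means the sends and receives of different passes cannot be confused even when they are all posted up front.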

    The default value of maxCommsSize = 0 corresponds to a chunk-size of (INT_MAX), which is the current internal limit for MPI counts. Since we mostly send/recv in bytes, this limit can be reached rather quickly.
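To put a number on how quickly the limit is reached, here is a small sketch of the arithmetic. The function names `effectiveChunkSize` and `maxElementsPerPass` are hypothetical, maxCommsSize is assumed to be given in bytes, and clamping non-zero values to INT_MAX is an assumption rather than something stated in this description.

```cpp
#include <climits>
#include <cstddef>

// Hypothetical helper: chunk size in bytes actually used for a given
// maxCommsSize. Zero selects the INT_MAX limit (as described above);
// clamping larger values to INT_MAX is an assumption of this sketch.
constexpr std::size_t effectiveChunkSize(std::size_t maxCommsSize)
{
    return
    (
        (maxCommsSize == 0 || maxCommsSize > static_cast<std::size_t>(INT_MAX))
      ? static_cast<std::size_t>(INT_MAX)
      : maxCommsSize
    );
}

// Hypothetical helper: number of whole elements of elemBytes bytes
// that fit into a single pass of chunkBytes bytes
constexpr std::size_t maxElementsPerPass
(
    std::size_t chunkBytes,
    std::size_t elemBytes
)
{
    return chunkBytes / elemBytes;
}
```

On a platform with a 32-bit int, INT_MAX is 2147483647, so a single pass carries just under 2 GiB. For 8-byte values that is 268435455 elements per pass, a size that large parallel transfers can plausibly exceed.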
