
Parallel contiguous data synchronisation not scaling

Functionality to add/problem to solve

Parallel (face/edge/point) synchronisation uses the syncTools helper functions. These assume non-contiguous data, so the data is serialised through streaming and the message sizes are exchanged before the actual data is sent. It is this exchange of sizes (an all-to-all) that limits scaling. For contiguous data with a one-to-one mapping the amount of data to receive is known in advance, so the size exchange can be skipped (see the sketch below).
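As an illustration of the pattern described above, here is a minimal MPI sketch (not the actual OpenFOAM syncTools/PstreamBuffers code; the function name and the per-neighbour byte buffers are hypothetical). Because the serialised payload lengths are unknown to the receivers, every rank first takes part in an all-to-all exchange of sizes before the payloads themselves are sent:

```cpp
#include <mpi.h>
#include <vector>

// Hypothetical: one serialised byte buffer per destination rank
// (empty for non-neighbours).
void exchangeNonContiguous
(
    const std::vector<std::vector<char>>& sendBufs,
    std::vector<std::vector<char>>& recvBufs
)
{
    int nProcs = 0;
    MPI_Comm_size(MPI_COMM_WORLD, &nProcs);

    // Phase 1: all-to-all exchange of message sizes - the scaling bottleneck
    std::vector<int> sendSizes(nProcs), recvSizes(nProcs);
    for (int p = 0; p < nProcs; ++p) sendSizes[p] = int(sendBufs[p].size());
    MPI_Alltoall
    (
        sendSizes.data(), 1, MPI_INT,
        recvSizes.data(), 1, MPI_INT,
        MPI_COMM_WORLD
    );

    // Phase 2: non-blocking exchange of the actual payloads
    std::vector<MPI_Request> reqs;
    recvBufs.assign(nProcs, {});
    for (int p = 0; p < nProcs; ++p)
    {
        if (recvSizes[p] > 0)
        {
            recvBufs[p].resize(recvSizes[p]);
            reqs.emplace_back();
            MPI_Irecv
            (
                recvBufs[p].data(), recvSizes[p], MPI_BYTE,
                p, 0, MPI_COMM_WORLD, &reqs.back()
            );
        }
        if (!sendBufs[p].empty())
        {
            reqs.emplace_back();
            MPI_Isend
            (
                sendBufs[p].data(), int(sendBufs[p].size()), MPI_BYTE,
                p, 0, MPI_COMM_WORLD, &reqs.back()
            );
        }
    }
    MPI_Waitall(int(reqs.size()), reqs.data(), MPI_STATUSES_IGNORE);
}
```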

Target audience

  • parallel runs on larger numbers of processors

Proposal

  • check for contiguous data in the templated functions that use PstreamBuffers and, where detected, skip the size exchange (see the sketch below)
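
A minimal MPI sketch of the proposed fast path (again not OpenFOAM code; the function name, the per-neighbour containers and the receive counts are hypothetical inputs). Because the data is contiguous and the mapping is one-to-one, each rank already knows how many values it will receive from each neighbour, so the receives are posted directly and the size exchange disappears:

```cpp
#include <mpi.h>
#include <map>
#include <vector>

// Hypothetical inputs: per-neighbour values to send and the receive counts
// that are already known from the mesh decomposition (e.g. shared face counts).
void exchangeContiguous
(
    const std::map<int, std::vector<double>>& sendVals,
    const std::map<int, int>& recvCounts,
    std::map<int, std::vector<double>>& recvVals
)
{
    std::vector<MPI_Request> reqs;

    // Post receives immediately - no size exchange required
    for (const auto& [proc, count] : recvCounts)
    {
        recvVals[proc].resize(count);
        reqs.emplace_back();
        MPI_Irecv
        (
            recvVals[proc].data(), count, MPI_DOUBLE,
            proc, 0, MPI_COMM_WORLD, &reqs.back()
        );
    }

    // Send the raw contiguous values directly - no streaming/serialisation
    for (const auto& [proc, vals] : sendVals)
    {
        reqs.emplace_back();
        MPI_Isend
        (
            vals.data(), int(vals.size()), MPI_DOUBLE,
            proc, 0, MPI_COMM_WORLD, &reqs.back()
        );
    }

    MPI_Waitall(int(reqs.size()), reqs.data(), MPI_STATUSES_IGNORE);
}
```

The same dispatch could sit inside the templated syncTools/PstreamBuffers layer: if the element type is contiguous and the mapping is one-to-one, take this path; otherwise fall back to the existing streamed exchange.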

Links / references

This work is based on the work done in

"Communication Optimization for Multiphase Flow Solver in the Library of OpenFOAM"

Zhipeng Lin, Wenjing Yang, Houcun Zhou, Xinhai Xu, Liaoyuan Sun, Yongjun Zhang and Yuhua Tang

(MDPI Water journal, October 2018)

It identifies two bottlenecks:

  • linear solver using blocking allreduce
  • MULES finding out the sizes to receive every sweep

The first item was tackled in the v2006 pipelined CG solver implementation. The current issue is a more general fix of the second item.

@mark

Funding

(Does the functionality already exist/is sponsorship available?)

Edited by Mattijs Janssens