  1. Apr 10, 2016
  2. Apr 08, 2016
  3. Apr 07, 2016
  4. Apr 06, 2016
  5. Apr 04, 2016
  6. Apr 03, 2016
  7. Apr 02, 2016
      Pstream: optimisation of data exchange · 56668b24
      Henry Weller authored
      Contributed by Mattijs Janssens.
      
      1. Any non-blocking data exchange needs to know in advance the sizes to
         receive so it can size the buffer.  For "halo" exchanges this is not
         a problem since the sizes are known in advance, but for all other data
         exchanges these sizes need to be exchanged in advance.
      
         This was previously done by having all processors send their send
         sizes to the master, which then sent the combined list back so that
         all processors
         - had the same information
         - could work out who was sending what to where and hence what needed to
           be received.
      
         This is now changed such that we only send the size to the
         destination processor (instead of to all as previously). This means
         that
         - the list of sizes to send is now of size nProcs vs. nProcs*nProcs before
         - we cut out the route to the master and back by using a native MPI
           call
      
         This causes a small change to the API of exchange and PstreamBuffers -
         they now return the sizes of the local buffers only (a labelList) and
         not the sizes of the buffers on all processors (labelListList).
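
The size exchange described in point 1 can be sketched as below. This is a single-process illustration only, not the Pstream implementation: the aliases `labelList` and `labelListList` are plain `std::vector` stand-ins for the OpenFOAM types, and `exchangeSizes` is a hypothetical helper that mimics what an all-to-all exchange of one integer per processor pair would deliver. The commit message does not name the native MPI call, so none is assumed here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-ins for the OpenFOAM list types (assumption: plain int labels).
using labelList = std::vector<int>;
using labelListList = std::vector<labelList>;

// Hypothetical single-process model of the size exchange.
// sendSizes[p][q] is the number of items processor p will send to q.
// The result recvSizes[p][q] is the number of items p will receive from q.
// In a real parallel run each processor would hold only its own row
// (a labelList of length nProcs), never the full nProcs*nProcs table.
labelListList exchangeSizes(const labelListList& sendSizes)
{
    const std::size_t nProcs = sendSizes.size();
    labelListList recvSizes(nProcs, labelList(nProcs, 0));

    for (std::size_t p = 0; p < nProcs; ++p)
    {
        // Every processor must provide one size per destination.
        assert(sendSizes[p].size() == nProcs);

        for (std::size_t q = 0; q < nProcs; ++q)
        {
            // The size p sends to q ends up in q's receive list, slot p.
            recvSizes[q][p] = sendSizes[p][q];
        }
    }
    return recvSizes;
}
```

Each row of the result is what one processor needs in order to size its receive buffers; no processor needs the rows belonging to the others.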
      
      2. The order in which data is sent when scattering information from the
         master processor to the other processors is now reversed. Scattering
         is done in a tree-like fashion: each processor has a set of
         processors to receive from/send to. When receiving it will first
         receive from the processors with the fewest sub-processors (i.e. the
         ones which return first). When sending it needs to do the opposite:
         start sending to the processor with the largest sub-tree, since this
         is the critical path.
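
The ordering rule in point 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the Pstream code: the `Tree` type and the functions `subTreeSize`, `receiveOrder` and `sendOrder` are hypothetical names, and the communication tree is represented simply as a list of child ranks per processor.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// children[i] lists the ranks directly below rank i (hypothetical layout).
using Tree = std::vector<std::vector<int>>;

// Number of ranks strictly below 'rank' in the tree.
int subTreeSize(const Tree& children, int rank)
{
    assert(rank >= 0 && static_cast<std::size_t>(rank) < children.size());

    int n = 0;
    for (int child : children[rank])
    {
        n += 1 + subTreeSize(children, child);
    }
    return n;
}

// Gather: receive first from the children with the smallest sub-trees,
// since those are expected to be ready first.
std::vector<int> receiveOrder(const Tree& children, int rank)
{
    std::vector<int> order(children[rank]);
    std::stable_sort
    (
        order.begin(),
        order.end(),
        [&](int a, int b)
        {
            return subTreeSize(children, a) < subTreeSize(children, b);
        }
    );
    return order;
}

// Scatter: the opposite order, so the child with the largest sub-tree
// (the critical path) is served first.
std::vector<int> sendOrder(const Tree& children, int rank)
{
    std::vector<int> order(receiveOrder(children, rank));
    std::reverse(order.begin(), order.end());
    return order;
}
```

For an 8-rank tree in which rank 0 has children 4, 1 and 2 with sub-tree sizes 3, 0 and 1, the receive order is 1, 2, 4 and the send order is 4, 2, 1.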
  8. Apr 01, 2016