Mark Olesen
authored
- the field functions use a variety of TFOR_ALL... macros to handle
  the field loops. However, these all have a __restrict__ keyword
  buried in the list access functions. This means that any operation
  with identical input and output violates the __restrict__ contract,
  which may be responsible for some of the odd results seen with
  particular compiler versions.

- updated the macros into inplace and non-inplace versions with an
  additional rename. For example,

      previous:
          TFOR_ALL_F_OP_FUNC_F(typeF1, f1, OP, FUNC, typeF2, f2)

      updated:
          TSEQ_FORALL_F_OP_FUNC_F(f1, OP, FUNC, f2)
          TSEQ_FORALL_F_OP_FUNC_F_inplace(f1, OP, FUNC, f2)

  The updated versions now start with a 'TSEQ_FORALL_' prefix to
  indicate that they roughly correspond to a <std::execution::seq>
  execution policy. The change of name is also useful since they are
  now written without supplying the parameter data types.

  The solution is still not necessarily optimal, since it involves a
  run-time check and more writing. For example,

  ```
  if (result.cdata_bytes() == f1.cdata_bytes())
  {
      // std::for_each
      TSEQ_FORALL_F_OP_F_FUNC_inplace(result, =, f1, T)
  }
  else
  {
      // std::transform
      TSEQ_FORALL_F_OP_F_FUNC(result, =, f1, T)
  }
  ```

  However, the check is cheap and is only done once (outside of the
  loop).

- possibly related to #2925, #3024, #3091, #3166
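
For illustration only, the dispatch described above can be sketched in plain C++ without the macros. This is not the actual macro expansion: the names `sqr`, `sqrInplace` and `sqrNonInplace` are hypothetical, `std::vector<double>` stands in for a field, and `__restrict__` is assumed to be available as the GCC/Clang extension.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Non-inplace variant (roughly std::transform).
// Both pointers are __restrict__ qualified, so the caller must guarantee
// that input and output do not alias.
inline void sqrNonInplace
(
    double* __restrict__ res,
    const double* __restrict__ src,
    std::size_t n
)
{
    assert(res != src);  // the contract that identical input/output would break

    for (std::size_t i = 0; i < n; ++i)
    {
        res[i] = src[i]*src[i];
    }
}

// In-place variant (roughly std::for_each).
// Only one pointer is involved, so no aliasing contract exists.
inline void sqrInplace(double* data, std::size_t n)
{
    std::for_each(data, data + n, [](double& x) { x = x*x; });
}

// Hypothetical front-end: result = f1*f1, element-wise.
// The address comparison is done once, outside of the loop.
inline void sqr(std::vector<double>& result, const std::vector<double>& f1)
{
    result.resize(f1.size());  // no-op when result and f1 are the same object

    if (result.data() == f1.data())
    {
        sqrInplace(result.data(), result.size());
    }
    else
    {
        sqrNonInplace(result.data(), f1.data(), f1.size());
    }
}
```

Calling `sqr(a, a)` takes the in-place branch, whereas `sqr(b, a)` with distinct storage takes the __restrict__ branch. As with the check in the message above, the comparison only detects identical start addresses; partially overlapping ranges are not covered by this sketch.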