Optimisation of MPI global calls

This issue was created automatically from an original CUE issue. Further discussion may take place here.

The goal here would be to improve scalability by optimising the calls to p_max/p_sum/p_min...

Sometimes 4 calls to p_max are made on 4 separate integers when it might be better to make one call on a 4-integer array.
Similarly, a call is sometimes made inside a loop on each element of an array when it could be made once on the whole array, outside the loop.
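
The grouping idea can be sketched as follows. This is only an illustration under stated assumptions: plain Python lists stand in for MPI ranks, no MPI library is involved, and `allreduce_max` and the `calls` counter are hypothetical names, not TELEMAC routines. It shows that reducing an array once gives the same result as reducing each element separately, with one collective call instead of four.

```python
# Sketch only: plain Python lists stand in for MPI ranks; no MPI library is used.
# allreduce_max and the call counter are hypothetical names, not TELEMAC routines.

calls = 0  # number of simulated collective calls


def allreduce_max(per_rank_values):
    """Simulate one global max reduction.

    per_rank_values[r] is the buffer owned by rank r (a list of numbers).
    Returns the element-wise maximum, as every rank would receive it.
    """
    global calls
    calls += 1
    return [max(column) for column in zip(*per_rank_values)]


# Three simulated ranks, each holding four integers.
local = [
    [1, 7, 3, 0],
    [4, 2, 9, 5],
    [6, 8, 1, 2],
]

# Pattern described above: one reduction per scalar (or per loop iteration).
one_by_one = [allreduce_max([[rank[i]] for rank in local])[0] for i in range(4)]
calls_one_by_one = calls

# Proposed pattern: pack the values into an array and reduce once.
calls = 0
grouped = allreduce_max(local)
calls_grouped = calls

print(one_by_one, calls_one_by_one)
print(grouped, calls_grouped)
```

In a real run each collective call is a synchronisation point across all processes, so the saving depends on how often the routine is called and on the number of processes.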

Here is a list of files where this should be investigated:

sources/utils/bief/fluxpr.f: calls to p_min, p_max within the loop on nsec could be done outside the loop
sources/gaia/flyxptr_gaia.f: same as above but for gaia
sources/telemac2d/sluxpr_telemac.f: same as above
sources/gaia/conlit_gaia.f: p_max(yadeb) within the nfrliq loop could be done outside the loop
sources/telemac2d/bord.f: p_max(yadeb) within the nfrliq loop could be done outside the loop
sources/telemac2d/oilspill.f: group the p_sum calls in the oil spill balance
sources/telemac2d/buse.f: group the p_min and p_max calls
sources/telemac2d/lecbus.f: group the p_min and p_max calls
sources/telemac2d/cltrac.f: group the p_max and p_min calls

Also, p_min + p_max is often used to retrieve a value that is set to zero on some processes and to the same value on the rest.
Creating a custom operator to handle that case in a single call might be worth it.
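
A minimal sketch of what such an operator could look like, again simulated with plain Python rather than MPI. `nonzero_op` and `allreduce` are hypothetical names introduced for illustration. The sketch assumes that every non-zero contribution carries the same value, which is what makes the operator independent of the order in which processes are combined. In an actual MPI code, a user-defined operator of this kind would be registered with `MPI_Op_create`; how it would be exposed alongside p_min/p_max is left open here.

```python
from functools import reduce


def nonzero_op(a, b):
    """Hypothetical combine step: keep whichever operand is non-zero.

    Assumes that every non-zero contribution carries the same value, so the
    result does not depend on the order in which ranks are combined.
    """
    return a if a != 0 else b


def allreduce(op, per_rank_values):
    """Simulate a global reduction of one scalar per rank with operator op."""
    return reduce(op, per_rank_values)


# Each list holds one scalar per simulated rank: zero, or a shared value.
cases = [
    [0.0, 2.5, 0.0, 2.5],    # shared positive value
    [0.0, -3.0, -3.0, 0.0],  # shared negative value
    [0.0, 0.0, 0.0, 0.0],    # no rank holds a value
]

results = []
for local in cases:
    two_calls = allreduce(min, local) + allreduce(max, local)  # current idiom
    one_call = allreduce(nonzero_op, local)                     # proposed idiom
    results.append((two_calls, one_call))

print(results)
```

The single reduction matches the p_min + p_max idiom for positive and negative shared values alike, and halves the number of collective calls for that pattern.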

Edited Nov 15, 2023 by Yoann Audouin