Hello, I would like to calculate a global net flux value within OpenIFS to verify my conservative remapping. ECMWF hides the MPI functionality under MPL. There is limited MPL documentation in Part VI of the documentation, but it does not include MPL_ALLREDUCE.

MPI_ALLREDUCE has an input and an output parameter, but MPL_ALLREDUCE seemingly has only an input. How do I get the SUM or MAX values out?


Examples:

MPI_ALLREDUCE within FESOM2:

    call MPI_AllREDUCE(shortwave_local, shortwave_global,  2, MPI_DOUBLE_PRECISION, MPI_SUM, MPI_COMM_FESOM, MPIerr)


MPL_ALLREDUCE within OpenIFS:

    CALL MPL_ALLREDUCE(shortwave_local,'SUM',ldreprod=.FALSE.,CDSTRING='AWICOUPLING:')


I use the function as it is used in many other places within OpenIFS. Does it work only for 1-D variables and return the result in the input variable? Did I misunderstand something here? I'll be using plain MPI_ALLREDUCE for now, but I'd still like to learn.


Cheers,

Jan

3 Comments

  1. Hi Jan,

It looks like the result of the requested operation ('SUM', 'MAX', etc.) is copied to the input buffer before subroutine exit. In mpl_allreduce_mod.F90, looking at MPL_ALLREDUCE_REAL8(PSENDBUF, CDOPER, ...), the call to MPI_ALLREDUCE is:

    414   CALL MPI_ALLREDUCE(PSENDBUF,ZRECVBUF,ISENDCOUNT,INT(MPI_REAL8), &
    415                   &  IOPER,ICOMM,IERROR)

    I'm ignoring the reproducible branch of the IF/THEN/ELSE clause for simplicity.

    Then right before the END SUBROUTINE statement:

    433 PSENDBUF(:) = ZRECVBUF(:)
    434 
    435 END SUBROUTINE MPL_ALLREDUCE_REAL8

    PSENDBUF is declared INTENT(INOUT), so the result should be returned in the input, i.e. in 'shortwave_local' in your case. The other overloaded versions do the same.
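
    In case it helps, here is a minimal sketch of the in-place pattern (the variable names and the use of the JPRB kind from PARKIND1 are my own illustrative assumptions, not code from OpenIFS):

        USE PARKIND1, ONLY : JPRB
        USE MPL_ALLREDUCE_MOD, ONLY : MPL_ALLREDUCE

        REAL(KIND=JPRB) :: ZFLUX(2)

        ! On entry ZFLUX holds the local contributions; MPL_ALLREDUCE
        ! overwrites its INTENT(INOUT) buffer with the reduced result.
        ZFLUX(1) = ZSHORTWAVE_LOCAL
        ZFLUX(2) = ZLONGWAVE_LOCAL
        CALL MPL_ALLREDUCE(ZFLUX, 'SUM', LDREPROD=.FALSE., CDSTRING='AWICOUPLING:')
        ! ZFLUX(1) and ZFLUX(2) now hold the sums across all MPI tasks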

    Does that answer your question?

    Cheers,   Glenn


  2. Yes, thank you!

    I did end up getting MPL_ALLREDUCE to work as I needed. All I had to do was compute local sums first and then use MPL to add those up, instead of sending in the whole arrays. I'm not very experienced with MPI communication and thought MPI_ALLREDUCE would do both summation steps in one go, i.e. that I could send local arrays in and get one global scalar sum out. In fact the reduction is element-wise across tasks, so an array goes in and an array of the same shape comes out. Confusion resolved.
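
    Roughly, the pattern I ended up with looks like this (names simplified, and the JPRB kind is just what I assume OpenIFS uses):

        ! Step 1: reduce the local array to a single partial sum on this task.
        REAL(KIND=JPRB) :: ZSUM(1)
        ZSUM(1) = SUM(ZSHORTWAVE_FIELD(:))

        ! Step 2: add the per-task partial sums across all MPI tasks;
        ! the global net flux comes back in ZSUM(1) itself.
        CALL MPL_ALLREDUCE(ZSUM, 'SUM', LDREPROD=.FALSE., CDSTRING='AWICOUPLING:')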

    Cheers, Jan



  3. Ah ok. I understand the confusion.

    One point to bear in mind is reproducibility of the global sum. Unless you force the summation between MPI tasks to happen in a fixed order, you cannot be sure the result is bit-reproducible from run to run. In the MPL_ALLREDUCE code module you'll see the LDREPROD argument controls how the calculation is done: if reproducibility is requested, MPL_ALLREDUCE actually uses a set of SEND/RECV calls to force the order of summation. Likewise for the local sums, it's better to sort first and then sum from the smallest value to the largest to control rounding. You may know this already, but I thought I'd mention it for completeness.
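
    As a rough illustration of the local sorting idea (plain Fortran, all names made up for the example):

        REAL(KIND=JPRB) :: ZVALS(N), ZTMP, ZSUM
        INTEGER :: I, J

        ! Insertion sort by absolute value, smallest magnitude first.
        DO I = 2, N
          ZTMP = ZVALS(I)
          J = I - 1
          DO WHILE (J >= 1)
            IF (ABS(ZVALS(J)) <= ABS(ZTMP)) EXIT
            ZVALS(J+1) = ZVALS(J)
            J = J - 1
          END DO
          ZVALS(J+1) = ZTMP
        END DO

        ! Accumulate in that order, so small terms are added before
        ! they can be swamped by a large running total.
        ZSUM = 0.0_JPRB
        DO I = 1, N
          ZSUM = ZSUM + ZVALS(I)
        END DO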

    Cheers,  Glenn