MPI_REDUCE, MPI_Reduce

Purpose

Applies a reduction operation to the vector sendbuf over the set of tasks specified by comm and places the result in recvbuf on root.

C synopsis

#include <mpi.h>
int MPI_Reduce(void* sendbuf, void* recvbuf, int count,
               MPI_Datatype datatype, MPI_Op op, int root, MPI_Comm comm);

C++ synopsis

#include <mpi.h>
void MPI::Comm::Reduce(const void* sendbuf, void* recvbuf, int count,
                       const MPI::Datatype& datatype, const MPI::Op& op, int root) const;

FORTRAN synopsis

include 'mpif.h' or use mpi
MPI_REDUCE(CHOICE SENDBUF, CHOICE RECVBUF, INTEGER COUNT,
           INTEGER DATATYPE, INTEGER OP, INTEGER ROOT, INTEGER COMM,
           INTEGER IERROR)

Description

This subroutine applies a reduction operation to the vector sendbuf over the set of tasks specified by comm and places the result in recvbuf on root.

The input buffer and the output buffer have the same number of elements of the same datatype. The arguments sendbuf, count, and datatype define the send (input) buffer. The arguments recvbuf, count, and datatype define the output buffer. MPI_REDUCE is called by all group members using the same arguments for count, datatype, op, and root.

If a sequence of elements is provided to a task, the reduction operation is executed element-wise on each entry of the sequence. For example, if the operation is MPI_MAX and the send buffer contains two floating-point elements (count = 2 and datatype = MPI_FLOAT), then recvbuf(1) = global max(sendbuf(1)) and recvbuf(2) = global max(sendbuf(2)).

Users can define their own operations or use the predefined operations provided by MPI. User-defined operations can be overloaded to operate on several datatypes, either basic or derived. The datatype argument of MPI_REDUCE must be compatible with op. See IBM Parallel Environment for AIX: MPI Programming Guide for a list of the MPI predefined operations.

The "in place" option for intracommunicators is specified by passing the value MPI_IN_PLACE to the sendbuf argument at the root. In this case, the input data is taken at the root from the receive buffer, where it is replaced by the output data.

If comm is an intercommunicator, the call involves all tasks in the intercommunicator, but with one group (group A) defining the root task. All tasks in the other group (group B) pass the same value in the root argument, which is the rank of the root in group A. The root itself passes the value MPI_ROOT in root. All other tasks in group A pass the value MPI_PROC_NULL in root. Only send buffer arguments are significant in group B, and only receive buffer arguments are significant at the root. MPI_IN_PLACE is not supported for intercommunicators.

When you use this subroutine in a threads application, make sure all collective operations on a particular communicator occur in the same order at each task. See IBM Parallel Environment for AIX: MPI Programming Guide for more information on programming with MPI in a threads environment.

Parameters

sendbuf
    is the address of the send buffer (choice) (IN)
recvbuf
    is the address of the receive buffer (choice, significant only at root) (OUT)
count
    is the number of elements in the send buffer (integer) (IN)
datatype
    is the datatype of elements of the send buffer (handle) (IN)
op
    is the reduction operation (handle) (IN)
root
    is the rank of the root task (integer) (IN)
comm
    is the communicator (handle) (IN)
IERROR
    is the FORTRAN return code. It is always the last argument.
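The following sketch is illustrative only and is not part of the synopses above. It assumes MPI_COMM_WORLD, a root of task 0, and hypothetical buffer names; it matches the MPI_MAX example in Description (count = 2, datatype = MPI_FLOAT).

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        float send[2], recv[2];
        int rank;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* Each task contributes two floats: count = 2, datatype = MPI_FLOAT. */
        send[0] = (float)rank;
        send[1] = (float)(10 * rank);

        /* Element-wise maximum over all tasks; the result is placed only on
           the root (task 0).  recvbuf is significant only at the root. */
        MPI_Reduce(send, recv, 2, MPI_FLOAT, MPI_MAX, 0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("global max: %f %f\n", recv[0], recv[1]);

        MPI_Finalize();
        return 0;
    }

With the "in place" option described above, the root would instead pass MPI_IN_PLACE as sendbuf and place its own contribution in recvbuf before the call; the other tasks call MPI_Reduce unchanged.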
Notes

See IBM Parallel Environment for AIX: MPI Programming Guide.

The MPI standard urges implementations to use the same evaluation order for reductions every time, even if this reduces performance. PE MPI instead adjusts its reduce algorithms for optimal performance on a given task distribution; the standard suggests, but does not mandate, this sacrifice of performance, and PE MPI puts performance ahead of the standard's recommendation. This means that two runs with the same task count may produce results that differ in the least significant bits, due to rounding effects when the evaluation order changes. Two runs that use the same task count and the same distribution of tasks across nodes will always give identical results.

In the 64-bit library, this function uses a shared memory optimization among the tasks on a node. This optimization is discussed in the chapter "Using shared memory" of IBM Parallel Environment for AIX: MPI Programming Guide, and is enabled by default. This optimization is not available to 32-bit programs.

Errors

Fatal errors:

Invalid count
    count < 0
Invalid datatype
Type not committed
Invalid op
Invalid root
    For an intracommunicator: root < 0 or root >= groupsize
    For an intercommunicator: root < 0 and is neither MPI_ROOT nor MPI_PROC_NULL, or root >= groupsize of the remote group
Invalid communicator
Unequal message lengths
Invalid use of MPI_IN_PLACE
MPI not initialized
MPI already finalized

Develop mode error if:

Inconsistent op
Inconsistent datatype
Inconsistent root
Inconsistent message length

Related information

MPE_IREDUCE
MPI_ALLREDUCE
MPI_OP_CREATE
MPI_REDUCE_SCATTER
MPI_SCAN
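As noted in Description, users can define their own reduction operations (see MPI_OP_CREATE under Related information). The sketch below is not part of the IBM documentation; it shows one way a commutative user-defined operation might be registered and used with MPI_Reduce, with hypothetical function and variable names and the assumption that the data are MPI_FLOAT.

    #include <mpi.h>
    #include <stdio.h>

    /* Hypothetical user function: element-wise product.  Its signature must
       match MPI_User_function: (invec, inoutvec, len, datatype). */
    static void my_prod(void *invec, void *inoutvec, int *len, MPI_Datatype *datatype)
    {
        float *in = (float *)invec;
        float *inout = (float *)inoutvec;
        int i;

        for (i = 0; i < *len; i++)
            inout[i] = in[i] * inout[i];   /* sketch assumes datatype is MPI_FLOAT */
    }

    int main(int argc, char *argv[])
    {
        MPI_Op my_op;
        float send[2], recv[2];
        int rank;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        send[0] = (float)(rank + 1);
        send[1] = 2.0f;

        /* Register the operation; commute = 1 declares it commutative. */
        MPI_Op_create(my_prod, 1, &my_op);

        /* A user-defined op is passed to MPI_Reduce like a predefined one. */
        MPI_Reduce(send, recv, 2, MPI_FLOAT, my_op, 0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("products: %f %f\n", recv[0], recv[1]);

        MPI_Op_free(&my_op);
        MPI_Finalize();
        return 0;
    }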