PFlotran v7.0 MS MPI issue


Snape, Christopher

Jan 16, 2026, 11:37:29 AM
to pflotr...@googlegroups.com

Hi PFlotran Dev Team,

 

We are currently attempting to build PFlotran 7.0 for Windows, but are struggling with the calls PFlotran makes to MPI. Any insight you might be able to share would be really valuable to us.

 

Full details of what we are doing are below:

 

Build environment: Windows Server 2022 x64

Build tools: MSYS2 with mingw64, GNU Compiler Collection (GCC) v15.2.0 with MPI wrappers (mpicc, mpifort)

Libraries/dependencies:

Flex v2.6.4

Bison v3.8.2

BLAS/LAPACK: openblas v0.3.30

MS MPI v10.1.3 (https://learn.microsoft.com/en-us/message-passing-interface/microsoft-mpi)

HDF5: v1.14.6

PT-Scotch: v7.0.3

PETSc: v3.21.4

PFlotran: v7.0

 

We build HDF5 (with parallel support), PT-Scotch and PETSc from source. HDF5, PT-Scotch and PETSc are configured to build Fortran bindings and we use static linking throughout.

 

Our configure/build steps for HDF5 and PT-Scotch only take the path to mpiexec.exe, and we use short DOS paths (no whitespace) to define this, e.g. “/C/PROGRA~1/MICROS~4/Bin/mpiexec”. For PETSc we define the include and lib locations for MS MPI (also using short DOS paths).

 

Regression tests for HDF5 show 29 test failures (9 of these have MPI in their name, though all other MPI tests pass, and one specifically references Fortran):

 

106:MPI_TEST_testphdf5_cchunk3

114:MPI_TEST_testphdf5_tldsc

120:MPI_TEST_t_bigio

121:MPI_TEST_t_cache

128:MPI_TEST_t_pmulti_dset

129:MPI_TEST_t_select_io_dset

131:MPI_TEST_t_filters_parallel

2356:FORTRAN_testhdf5_fortran

2366:MPI_TEST_FORT_async_test

2790:MPI_TEST_H5_f90_ph5_f90_filtered_writes_no_sel

 

Regression tests for PT-Scotch show 4 test failures (3 of these relate to file compression and we’re not concerned about those):

 

1:test_common_file_compress_bz2

2:test_common_file_compress_gz

3:test_common_file_compress_lzma

5:test_common_random_1

 

Running 'make check' for PETSc reports some warnings with tutorials ex19 and ex5f, but otherwise the C/C++ and Fortran tests using MPI processes pass.

 

PFlotran

 

PFlotran builds successfully, but in any calculation that invokes MPI_Allreduce() the call does not behave as expected and always returns 0 in the output data.

 

By adding some print debugging statements, for example, we can determine that temp_int at line 956 is correct for the supplied model, but after the call to MPI_Allreduce() it is set to 0. This then fails the check at line 959 and causes the calculation to fail. We see this in all grid calculations that call MPI_Allreduce().
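
For context, the failing pattern looks roughly like the following (a simplified sketch pieced together from our print debugging, not a verbatim copy of the PFlotran source; "comm" here stands in for whatever communicator is passed at that call site):

  print *, 'temp_int before MPI_Allreduce:', temp_int   ! prints the expected value
  call MPI_Allreduce(MPI_IN_PLACE, temp_int, ONE_INTEGER_MPI, MPIU_INTEGER, &
                     MPI_MIN, comm, ierr)
  print *, 'temp_int after MPI_Allreduce:', temp_int    ! always prints 0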

 

 

 

  1. Is there a specific step or configuration setting we are missing that would result in such behaviour?
  2. We found this article https://abhila.sh/writing/3/mpi_instructions.html, which mentions creating a Fortran library for msmpi.dll and statically linking it with PFlotran, but this did not resolve the issue.
  3. Using 1 MPI process (or even launching PFlotran without mpiexec) still produces the same behaviour.
  4. We’ve tried switching to the Intel MPI library, but this generated many errors in HDF5 and effectively all PFlotran regression tests failed.
  5. Our MPI issue doesn’t seem to be specific to calling MPI_Allreduce with MPI_MIN as the reduction operation, or to PetscInt as the data type; we see the same behaviour elsewhere with MPI_MAX and plain integers, for example.
  6. We’re not fully sure how MPI is brought into the PFlotran execution chain, because we don’t observe any “use MPI” or other indicator in the Fortran source, so it’s not clear to us whether some part of our current setup breaks something for MS MPI here (see the standalone check sketched after this list).
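
To take PETSc and PFlotran out of the picture entirely, the standalone check we have in mind for the MS MPI Fortran bindings is sketched below (a minimal, illustrative example written against the mpif.h shipped with MS MPI; we have not run it yet, and the names are ours):

  program allreduce_check
    implicit none
    include 'mpif.h'                      ! mpif.h as shipped with MS MPI
    integer :: ierr, rank, nproc, val

    call MPI_Init(ierr)
    call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
    call MPI_Comm_size(MPI_COMM_WORLD, nproc, ierr)

    val = rank + 1                        ! rank 0 contributes 1, so MPI_MIN should give 1
    call MPI_Allreduce(MPI_IN_PLACE, val, 1, MPI_INTEGER, MPI_MIN, &
                       MPI_COMM_WORLD, ierr)
    print *, 'rank', rank, 'of', nproc, ': min after MPI_Allreduce =', val

    call MPI_Finalize(ierr)
  end program allreduce_check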

 

We believe PFlotran 7.0 was never formally released for general use – is it possible there was an underlying issue with using MS MPI in this version?

 

We would appreciate any help or guidance on what might be the cause of the problem here.

 

Thanks,

 

Chris





Hammond, Glenn E

Jan 16, 2026, 4:38:57 PM
to pflotr...@googlegroups.com
Chris,

Thank you for the comprehensive description of the issue; very helpful.

I haven’t encountered the issue you described before, nor have I worked with MS MPI. I recommend trying MPICH or OpenMPI as alternatives for your MPI setup. If the issue persists, double-check that the mpirun you’re using corresponds to the same location as the MPI libraries.

In my experience, mismatches between mpirun and the libraries have been the most common cause of MPI-related issues at runtime. A typical symptom is screen output from all MPI ranks, rather than just the I/O rank.

Another way to test is to create a simple code using PETSc with an MPI_Allreduce(). For instance, you can place MPI_Allreduce() in https://bitbucket.org/pflotran/pflotran/src/master/src/test/petsc_test_f.F90 between the two print statements: if (rank == 0) print *, 'Beginning/End of Fortran90 test program'. Use make petsc_test_f to build that executable. This will help distinguish whether the issue originates from PETSc or PFLOTRAN.
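
Something along these lines should do it (a rough, untested sketch; the variable names are arbitrary, the declarations belong in the specification section at the top of the program rather than between the print statements, and I am assuming the file's existing PETSc includes make MPIU_INTEGER and PETSC_COMM_WORLD available):

  PetscInt :: temp_int
  PetscMPIInt :: one_mpi
  PetscErrorCode :: ierr

  ! executable part, placed between the two rank-0 print statements
  one_mpi = 1
  temp_int = rank + 1        ! rank 0 contributes 1, so MPI_MIN should return 1 on every rank
  call MPI_Allreduce(MPI_IN_PLACE, temp_int, one_mpi, MPIU_INTEGER, MPI_MIN, &
                     PETSC_COMM_WORLD, ierr);CHKERRQ(ierr)
  print *, 'rank ', rank, ': temp_int after MPI_Allreduce = ', temp_int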

Let me know how it goes or if you need further assistance troubleshooting.

Regards,

Glenn


Mills, Richard Tran

Jan 16, 2026, 6:33:27 PM
to pflotr...@googlegroups.com
Hi Chris,

Glenn has given you good advice. As one of the PETSc developers, let me also add: You generally shouldn't be receiving warnings from PETSc's 'make check' if everything is configured correctly. What are the exact warnings that you are seeing?

If you determine that something might be wrong with your PETSc install, please ask for help by sending your full configure.log to petsc...@mcs.anl.gov.

Thanks,
Richard


Christopher Snape

Jan 19, 2026, 5:58:48 AM
to pflotran-dev
Hi Glenn and Richard,

Thanks very much for the replies and suggestions. I'll look into testing with the petsc_test_f.F90 file, as well as switching to one of the suggested alternative MPI libraries on the PFlotran side.

Richard, 

For PETSc, I observe lots of the same warnings during the build process: mainly complaints about the use of %z character formatting, and messages about the size of PetscInt with suggestions about switching to 64-bit indices.

I did try switching to 64-bit indices in both PT-Scotch and PETSc as a test, but this produced errors in the PETSc build, so I reverted to 32-bit indices.

In terms of running 'make check', I get "Error detected during compile or link!" messages, but they seem to refer to the %z character formatting, and the tests report that they pass.

If it's better for me to post these to the PETSc developer group I'm happy to do that.

Thanks,

Chris  

Christopher Snape

Jan 19, 2026, 8:37:35 AM
to pflotran-dev
A few examples of the PETSc build warnings (I'm struggling to attach images from my work PC, unfortunately):

C:/OSSC-Compilation-Task-CS/Dependencies/PETSc/include/petscsys.h:1657:3: note: in expansion of macro 'PetscCheck'
 1657 |   PetscCheck(sizeof(PetscCount) <= sizeof(PetscInt) || a <= PETSC_MAX_INT, PETSC_COMM_SELF, PETSC_ERR_ARG_OUTOFRANGE, "%" PetscCount_FMT " is too big for PetscInt, you may need to ./configure using --with-64-bit-indices", a);


C:/OSSC-Compilation-Task-CS/Dependencies/PETSc/include/petscstring.h: In function 'PetscMemzero':
C:/OSSC-Compilation-Task-CS/Dependencies/PETSc/include/petscstring.h:736:55: warning: unknown conversion type character 'z' in format [-Wformat=]
  736 |   PetscAssert(a, PETSC_COMM_SELF, PETSC_ERR_ARG_NULL, "Trying to zero %zu bytes at a null pointer", n);
      |                                                       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


C:/OSSC-Compilation-Task-CS/Dependencies/PETSc/include/petscstring.h:684:55: warning: too many arguments for format [-Wformat-extra-args]
  684 |   PetscAssert(b, PETSC_COMM_SELF, PETSC_ERR_ARG_NULL, "Trying to copy %zu bytes from a null pointer (Argument #2)", n);
      |                                                       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Christopher Snape

Jan 20, 2026, 7:43:31 AM
to pflotran-dev
Hi Glenn,

I tried modifying the petsc_test_f.F90 file by adding the following below the "call VecDestroy(vec, ierr);CHKERRQ(ierr)" line. I'm afraid I'm not a Fortran programmer by trade, but hopefully this is all correct:

  PetscErrorCode :: ierr2
  PetscInt :: temp_int
  PetscInt :: comm
  PetscMPIInt :: ONE_INTEGER_MPI

  temp_int = 1
  comm = 2
  ONE_INTEGER_MPI = 1
  print *, 'temp_int before MPI_Allreduce:', temp_int
  call MPI_Allreduce(MPI_IN_PLACE,temp_int,ONE_INTEGER_MPI,MPIU_INTEGER,MPI_MIN,comm,ierr2);CHKERRQ(ierr2)
  print *, 'temp_int after MPI_Allreduce:', temp_int 

Running the .exe, I simply got several [0]PETSC ERROR lines on the command line; they summarised the PETSc version (3.21.4) and the configure options I used, with the final line:

[0]PETSC ERROR: #1 petsc_test_f.F90:49  

(line 49 is where I call MPI_Allreduce)

As a separate test, I tried building PFlotran using the current master branch (with PETSc 3.24.0, PT-Scotch 7.0.9 and HDF5 1.14.1, still with MS MPI), but I still have exactly the same problem with MPI_Allreduce calls. This leads me to suspect it's likely some fundamental problem in how I'm configuring PETSc to pick up MS MPI (i.e. the problem is not specific to any particular versions I'm using). I currently configure MS MPI using the mingw64-packaged MS MPI headers and lib (libmsmpi.dll.a).

I'll make a post to the PETSc dev group and send in my configure.log, in case anyone is able to spot something wrong.

Thanks,

Chris  