PETSc Segmentation Violation Error

Madelaine Griesel

Jun 11, 2024, 11:57:44 AM
to pflotr...@googlegroups.com
Hello, 

I recently updated my PFLOTRAN version to maint/v5.0 using PETSc v3.20.2, and I am getting a PETSc segmentation violation error. It appears to be an issue with the write to the HDF5 file, because the model runs when I comment out the output card.

I have inserted links below to the input files as well as the log with the error message. Please let me know if the links do not work.  

Thank you for your time, 
Madelaine

mesh.ugi
fehlman_bedforms_1m_water.h5
bedform_surface.ss
pflotran.in

Hammond, Glenn E

Jun 11, 2024, 1:15:19 PM
to pflotr...@googlegroups.com

Madelaine,

I cannot replicate the issue with v5.0 or the master branch, compiled as debug or optimized, serial or parallel. I suspect that something is corrupted in the HDF5 installation, but that is speculation. Are you running this on a Linux box or a larger machine?

Glenn

Madelaine Griesel

Jun 12, 2024, 1:31:10 PM
to pflotran-dev
Hi Glenn, 

I am running this on the MIT Lincoln Labs Supercloud. For context, the Supercloud administrator recommended we use an Open MPI module instead of MPICH to configure PETSc, and the configuration downloads HDF5 as well. Here is the configure command I used:

./configure --CFLAGS='-O3' --CXXFLAGS='-O3' --FFLAGS='-O3' --with-debugging=no --with-mpi-dir=/usr/local/pkg/openmpi/4.1.5/ --download-hdf5=yes --download-hdf5-fortran-bindings=yes --download-fblaslapack=yes --download-metis=yes --download-parmetis=yes --force
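
As a quick sanity check of the resulting PETSc/HDF5 build, PETSc's standard check target can be run after compilation (a minimal sketch, assuming PETSC_DIR and PETSC_ARCH are set for this installation):

    # Build PETSc, then run its built-in smoke tests (standard PETSc targets).
    cd $PETSC_DIR
    make PETSC_DIR=$PETSC_DIR PETSC_ARCH=$PETSC_ARCH all
    make PETSC_DIR=$PETSC_DIR PETSC_ARCH=$PETSC_ARCH check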

Thank you,
Madelaine

Hammond, Glenn E

Jun 13, 2024, 10:48:00 AM
to pflotr...@googlegroups.com

Open MPI should be fine. Do you have another machine on which you can confirm that PFLOTRAN compiles and the input deck fails? That would help rule out the MIT Supercloud installation as the issue.

Glenn

Madelaine Griesel

Jun 18, 2024, 11:57:27 AM
to pflotran-dev
Hello Glenn, 

I was able to successfully compile PFLOTRAN and run a model with my input deck on another machine. Thus, it seems like an issue with the Supercloud installation. I'll reach out to their team. 

Thank you for your advice and time, 
Madelaine

Madelaine Griesel

Jul 2, 2024, 2:00:53 PM
to pflotran-dev

Hello Glenn,

I've continued to look into this issue with the Supercloud team, and they gave me the following information:

I was able to get a bit more information out of the stack trace by adding a debug flag to the PFLOTRAN build. From that I was able to identify the function the error is happening in:

#12  0x5601b49f2a45 in outputaggregatetofile
                at pflotran/v5/pflotran/src/pflotran/output_observation.F90:288

I looked at that output_observation.F90 file on Bitbucket for version 4 (which worked) versus version 5 (https://bitbucket.org/pflotran/pflotran/src/a2104cedea1528a00aa2718572d43a4461019c60/src/pflotran/output_observation.F90?at=maint%2Fv5.0), and it looks like a few lines changed to specify an H5 output, such as in the following example:

[screenshots: output_observation.F90 comparison, v4 vs. v5]

Is it possible that the changes to that output file might be causing the PETSc segmentation violation when writing to an H5 file in PFLOTRAN version 5? PFLOTRAN versions 3 and 4 are able to write to an H5 file without any errors on the Supercloud.
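
For reference, a symbolized trace like the one above can be produced by rebuilding PETSc with debugging enabled and recompiling PFLOTRAN against it; a sketch reusing the configure options from earlier in the thread (the PFLOTRAN path is illustrative):

    # Reconfigure PETSc with debug symbols (drop the -O3 flags, enable debugging).
    cd $PETSC_DIR
    ./configure --with-debugging=yes \
      --with-mpi-dir=/usr/local/pkg/openmpi/4.1.5/ \
      --download-hdf5=yes --download-hdf5-fortran-bindings=yes \
      --download-fblaslapack=yes --download-metis=yes --download-parmetis=yes
    # Rebuild PFLOTRAN against the debug PETSc.
    cd /path/to/pflotran/src/pflotran
    make clean
    make pflotran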


Thank you for your time,

Madelaine


Hammond, Glenn E

Jul 2, 2024, 4:05:36 PM
to pflotr...@googlegroups.com

Madelaine,

I am fairly confident that it is a bug, but we would need to attempt to replicate the issue (same compiler and hardware architecture). Will you please send the file $PETSC_DIR/$PETSC_ARCH/lib/petsc/conf/configure.log to pflotr...@googlegroups.com (to avoid spamming the userbase)? I am hoping it will provide that information.

Thanks,

Glenn

Madelaine Griesel

Jul 3, 2024, 10:07:24 AM
to pflotran-dev
Hello Glenn, 

Thanks for your response; I have sent the configure.log to that email address. Please let me know if you need any additional information, and I will get it to you. Thank you!

Best,
Madelaine

Hammond, Glenn E

Jul 17, 2024, 11:07:02 AM
to pflotr...@googlegroups.com

Madelaine,

I tried configuring similarly to what is reported in your configure.log. The main differences were shared versus static libraries, and the Supercloud installing Open MPI as opposed to my configure script downloading it (same versions). Your input deck runs fine. I apologize, but I am unsure what to do. Can the Supercloud team try configuring with “--download-openmpi=yes” instead of specifying the system installation of Open MPI, to see whether that build succeeds? (The adjusted configure line is sketched below.)

Glenn
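
For reference, the suggested change amounts to replacing the system MPI path in the original configure command with PETSc's MPI download; a sketch, not a command from the thread:

    ./configure --CFLAGS='-O3' --CXXFLAGS='-O3' --FFLAGS='-O3' --with-debugging=no \
      --download-openmpi=yes \
      --download-hdf5=yes --download-hdf5-fortran-bindings=yes \
      --download-fblaslapack=yes --download-metis=yes --download-parmetis=yes --force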

Madelaine Griesel

Oct 7, 2024, 11:56:35 AM
to pflotran-dev

Hello Glenn,

I'm revisiting this thread, as I discovered that the PETSc segmentation violation error with PFLOTRAN Version 5 is related to my use of PERIODIC_OBSERVATION under the output card options. Version 5 fails if I include PERIODIC_OBSERVATION, but it does run when I comment that card out. When I used PFLOTRAN Version 3 instead, the model ran and output the appropriate files without any issue. I've uploaded the related PFLOTRAN input and log files for my attempts with both versions here: pflotran_petsc_error

Do you know what may be causing this issue? I would like to use PFLOTRAN Version 5 and the PERIODIC_OBSERVATION card to output an integral flux file.
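
For concreteness, the card in question controls how often observation output is written inside the OUTPUT block; a minimal sketch of such a block (values illustrative, not the exact deck):

    OUTPUT
      PERIODIC_OBSERVATION TIMESTEP 1  # write observation output every time step
      FORMAT HDF5                      # HDF5 output, as used in this thread
    END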


Thanks for your time,

Madelaine

Hammond, Glenn E

Oct 7, 2024, 12:07:23 PM
to pflotr...@googlegroups.com

Madelaine,

Please upload a .tar.gz of the entire input deck to your shared drive. When I try to run pflotran_v5*, there are missing files. Note that I am using the PFLOTRAN development version (an extension of v6), and I am hopeful that it will replicate the v5 issue.

Glenn

Madelaine Griesel

Oct 7, 2024, 12:36:43 PM
to pflotran-dev
Hi Glenn, 

I have uploaded the full input deck as a tar.gz to v5 in the shared folder. Please let me know if you need anything else. 

Best,
Madelaine

Hammond, Glenn E

Oct 7, 2024, 5:31:40 PM
to pflotr...@googlegroups.com

Madelaine,

Your input deck runs with 8 processes on PFLOTRAN v5, v6, and the development branch on a Mac. Can you verify that the input deck fails with 8 processes (mpirun -n 8 …)? (The invocation is sketched below.)

On which operating system are you running?

Glenn
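
A sketch of that invocation, with an illustrative executable path and the input-file name from earlier in the thread:

    mpirun -n 8 ./pflotran -pflotranin pflotran.in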

 

 

Madelaine Griesel

Oct 8, 2024, 4:09:27 PM
to pflotran-dev

Hello Glenn, 


Thank you for checking that. I'm using the MIT Lincoln Labs Supercloud. They are having system maintenance today, but I'll update you with results from running my input deck with 8 processes as soon as I can tomorrow.


Best,

Madelaine

Madelaine Griesel

Oct 10, 2024, 4:10:32 PM
to pflotran-dev
Hi Glenn, 

I get the following error on the Supercloud when running my deck with 8 processes for both PFLOTRAN v3 and v5: "mpirun noticed that process rank 13 with PID 163754 on node d-17-10-4 exited on signal 9 (Killed)". (Signal 9 means the process was killed externally, often by a memory limit, rather than crashing on its own.)

The deck does run with a minimum of 20 processes under PFLOTRAN v3, but it produces a segmentation violation error under PFLOTRAN v5.

I'll continue troubleshooting the deck with 8 processes. 

Thank you,
Madelaine

Hammond, Glenn E

Oct 10, 2024, 6:46:57 PM
to pflotr...@googlegroups.com

Madelaine,

Running on 8 processes was solely to confirm that we were running with the same core count. We have now ruled out “core count” as an issue (or difference): my installation runs, but MIT’s does not.

Have you tried v6?

How may I obtain collaborator access to the MIT machine? Out of the question?

Can you try installing PETSc locally, following the instructions at https://documentation.pflotran.org/user_guide/how_to/installation/linux.html#linux-install? (A sketch of those steps is below.)

Glenn
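
For reference, the local-install route in those instructions is roughly: clone PETSc, check out the version used earlier in this thread, and configure; a sketch (see the linked page for the authoritative steps):

    git clone https://gitlab.com/petsc/petsc.git petsc
    cd petsc
    git checkout v3.20.2
    ./configure --CFLAGS='-O3' --CXXFLAGS='-O3' --FFLAGS='-O3' --with-debugging=no \
      --download-openmpi=yes --download-hdf5=yes --download-hdf5-fortran-bindings=yes \
      --download-fblaslapack=yes --download-metis=yes --download-parmetis=yes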

 
