segmentation fault when running both serial and parallel xyce installed from source code

Alessandro

Oct 24, 2024, 4:02:07 AM
to xyce-users
Hello,

I tried installing Xyce 7.8 on a RHEL 9.4 machine.
I followed the building guide at
https://xyce.sandia.gov/documentation-tutorials/building-guide/

and used the attached reconfigure scripts.

The build completes without errors, but when I run any simulation with either the parallel or the serial version, I get the segmentation fault shown in the picture.

segmentation_fault.png

I tried running the regression tests, but essentially none of them pass; they all report "NO PRN FILE" (I guess because no simulation is actually run).

I then tried installing the RPM (serial only) version, and the simulation runs flawlessly.

I assume I am doing something wrong when building and installing from source.

Has anyone encountered a similar error?

I will give it another try by installing under a different location.
reconfigure_trilinos.sh
reconfigure_xyce.sh
reconfigure_xyce_parallel.sh
reconfigure_trilinos_parallel.sh

Thomas Russo

Nov 14, 2024, 7:30:20 PM
to xyce-users

I do not have access to RHEL9, and I suspect neither does the Xyce team (Sandia lags OS releases, rarely approving one for use internally before it's a few years old and only just before the last version hits EOL).   

However, after an OS upgrade on my system that came with a new compiler version and new STL implementations, I have been having a lot of test failures in serial and almost universal failure in parallel with newly compiled Xyce. I have begun hunting down the causes, and have identified a number of issues in the code that can lead to undefined behavior (most of these are very old pieces of code that have never had a problem before). I've opened issues on GitHub and submitted pull requests where I've found definitive solutions, but the work is ongoing and I still find hundreds of parallel test failures even after fixing a few potentially dangerous uses of STL objects.

It is *possible* that on RHEL9 there are similar failures coming up from the same issues. Once my pull requests are in place for issues 113 and 115 on GitHub, you might try them out and see if they fix your problem. But it might be a week or two before I finish, because I'm only working on it in my spare time.

stefan.s...@gmail.com

Nov 15, 2024, 5:45:29 AM
to xyce-users

I compiled and built Xyce today on my Debian desktop system with Trilinos 12.12.1, as usual. So far I don't see any crashes.
Xyce_build.mp4

stefan.s...@gmail.com

Nov 15, 2024, 5:57:20 AM
to xyce-users
The system is actually Devuan testing, so it has fairly new packages.
linux kernel: 6.11.5-amd64
libc: 2.40
g++: 14.2.0
libstdc++ : 14.2.0
libsuitesparse : 7.8.3
libblas: 3.12.0
liblapack: 3.12.0
libfftw3: 3.3.10
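
For anyone wanting to post a comparable summary from their own machine, the following is a minimal sketch that collects the kernel, C library, and C++ compiler versions. The function name `report_build_env` is made up for illustration, and the sketch deliberately leaves out the library packages (SuiteSparse, BLAS, LAPACK, FFTW), since how to query those depends on the distribution's package manager.

```shell
#!/bin/sh
# Print kernel, C library and C++ compiler versions for a build report.
# report_build_env is an illustrative name, not something from the thread.
# Every probe is guarded, so a missing tool yields a placeholder, not an error.
report_build_env() {
    echo "kernel: $(uname -sr)"
    # getconf GNU_LIBC_VERSION is only available on glibc-based systems.
    echo "libc: $(getconf GNU_LIBC_VERSION 2>/dev/null || echo 'not glibc or unknown')"
    for cxx in g++ clang++; do
        if command -v "$cxx" >/dev/null 2>&1; then
            # The first line of --version carries the compiler name and version.
            echo "$cxx: $("$cxx" --version | head -n 1)"
        else
            echo "$cxx: not found"
        fi
    done
}

report_build_env
```

Running it on both a working and a crashing system gives a quick side-by-side of the toolchain differences.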

stefan.s...@gmail.com

Nov 15, 2024, 6:23:00 AM
to xyce-users
Attached is the reconfigure script I use for Trilinos. Compared to Thomas's version, I have the additional -std=c++11 in the FLAGS variable, and my system has all libs in /usr/lib instead of /usr/lib64 (it's an amd64 system anyway), but this depends on the specific Linux system used for the build.

The configure script I use to build Xyce (in a separate my_build directory below the Xyce sources root dir) is:
../configure CXXFLAGS='-O3 -std=c++11 '  ARCHDIR=${HOME}/xyce/XyceLibs/Serial CPPFLAGS=-I/usr/include/suitesparse --prefix=${HOME}/xyce

Of course, ARCHDIR and --prefix depend on where the Trilinos libs are installed and where Xyce is to be installed, respectively.
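
As a rough sketch of how that configure line fits into an out-of-tree build, the script below wraps it with the two install-dependent paths pulled out as variables. The names `XYCE_SRC`, `BUILD_DIR`, `DRY_RUN`, `run`, and `xyce_build`, the source location `$HOME/src/Xyce`, and the trailing `make` / `make install` steps are assumptions for illustration; only the configure arguments come from the command above. By default it only prints what it would do.

```shell
#!/bin/sh
# Sketch of an out-of-tree serial Xyce build around the configure line above.
# XYCE_SRC, BUILD_DIR and DRY_RUN are illustrative names, not from the thread.
XYCE_SRC="${XYCE_SRC:-$HOME/src/Xyce}"           # assumed location of the Xyce sources
BUILD_DIR="${BUILD_DIR:-$XYCE_SRC/my_build}"     # separate build dir below the source root
ARCHDIR="${ARCHDIR:-$HOME/xyce/XyceLibs/Serial}" # where the Trilinos libs were installed
PREFIX="${PREFIX:-$HOME/xyce}"                   # where Xyce should be installed
DRY_RUN="${DRY_RUN:-1}"                          # 1 = print commands only, 0 = execute

# Print the command in dry-run mode, otherwise execute it.
# The printed form is for reading; it does not preserve shell quoting.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Stop at the first failing step so configure never runs in the wrong directory.
xyce_build() {
    run mkdir -p "$BUILD_DIR" &&
    run cd "$BUILD_DIR" &&
    run ../configure CXXFLAGS='-O3 -std=c++11' ARCHDIR="$ARCHDIR" \
        CPPFLAGS=-I/usr/include/suitesparse --prefix="$PREFIX" &&
    run make &&
    run make install
}

xyce_build
```

Setting `DRY_RUN=0` executes the steps for real; `ARCHDIR` and `PREFIX` can be overridden from the environment to match a different install layout.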
reconfigure

stefan.s...@gmail.com

Nov 15, 2024, 7:41:08 AM
to xyce-users
Following Thomas Russo's attached scripts and the Xyce building guide, I updated my Trilinos reconfigure script and my Xyce configure command to the latest suggested ones (I tested serial only). The result is the same: all my test cases run with no crashes or segfaults.

Tom Russo

Nov 15, 2024, 10:37:36 AM
to xyce-users
It's great that the issues I'm seeing on FreeBSD with Clang 18 are not universal on newer Linux systems and that it's fine on yours.  My speculation that Alessandro's problems were related to mine was just speculation and may be off the mark.

But there are definitely problems on my system and I'm slowly tracking them down to various risky uses of STL containers in ways that are known to have potential for undefined behavior.  Fixing them is the right thing to do, even if the code isn't broken on most Linux systems.


--
Tom Russo    KM5VY 
Tijeras, NM 

 echo "prpv_a'rfg_cnf_har_cvcr" | sed -e 's/_/ /g' | tr [a-m][n-z] [n-z][a-m]

Tom Russo

Nov 15, 2024, 10:41:37 AM
to xyce-users
I should also note that on my system the problems are primarily in parallel.  There were a few failures in serial, but those were easily fixed and were isolated to one particular feature (restart).

Thomas Russo

Nov 19, 2024, 11:48:43 AM
to xyce-users
I have no idea if Alessandro's problems are in any way related to those I experienced, but I have a pair of pull requests in (https://github.com/Xyce/Xyce/pull/118 and https://github.com/Xyce/Xyce/pull/114) that fixed all of my crashes in serial and parallel.  In all cases, the things I fixed were bits of old code that were risky because they made assumptions about how STL containers were implemented, and on my newer FreeBSD system that underlying implementation changed so those assumptions were invalid.

Feel free to try them.  If they help, please let me know, but if they don't then clearly your build has some completely different issue that nobody else has spotted yet.  I do not see anything in your reconfigures that would account for your experiences.
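
For readers who want to try the two pull requests before they are merged, one possible way is to fetch each into its own local branch using the `pull/<N>/head` refs that GitHub publishes. The helper name `fetch_pr` and the `pr-<N>` branch naming are illustrative choices, not anything prescribed in this thread.

```shell
#!/bin/sh
# Fetch a GitHub pull request into a local branch named pr-<number>.
# fetch_pr is a hypothetical helper name; "pull/<N>/head" is the ref that
# GitHub exposes for each pull request on the upstream repository.
fetch_pr() {
    remote=$1   # name of the remote pointing at the GitHub repository
    number=$2   # pull request number
    git fetch "$remote" "pull/${number}/head:pr-${number}"
}

# Intended use, from inside a clone of https://github.com/Xyce/Xyce:
#   fetch_pr origin 114
#   fetch_pr origin 118
#   git checkout pr-118   # then rebuild and rerun the failing simulations
```

The thread does not say whether the two pull requests need to be combined, so testing each branch separately or merging both into a scratch branch is left to the reader.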

Alessandro

Nov 19, 2024, 5:33:23 PM
to xyce-users
Wow, thanks, Thomas, and thanks, Stefan. I really appreciate your help! :)
You've been so supportive.
I haven’t tried compiling Xyce again because, well, life happens between one compilation and the next. But I’ll get back to you all as soon as I’m done reinstalling.