Segmentation Fault Issues with Larger Organic Systems


Lily Ireton

Feb 10, 2020, 12:55:23 PM
to XMVB User Mailing List
I am running a VBSCF calculation on the 0th through 2nd ionic structures of the 3n structure shown below, in which X is a carbonyl (C=O) group. I am running into segmentation faults as the VBSCF convergence begins. My output file is attached below; it shows no iterations for the VBSCF calculation. I am using a computing cluster and have given the job up to 128 GB of RAM over 8 threads. The XMVB user manual states that the OMP_STACKSIZE variable should be set to 1 GB in cases like this. I have tried that, and I have also set the variable to much higher values such as 128 GB. Once OMP_STACKSIZE gets very large (just under 100 GB), I start getting errors from OpenMP stating that memory resources cannot be allocated and that I need to decrease OMP_NUM_THREADS. I have done that as well and I am still getting segmentation faults from forrtl.


Has anyone else encountered this issue? No matter how much memory I give the job, I am always getting a segmentation fault. Attached is a zipped folder of my preint input file and my VBSCF xmi and xmo files. Also included is the segmentation fault message from the batch scheduler I am using. Any help would be appreciated.
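For readers following along, the arithmetic behind the OpenMP allocation error can be sketched roughly as follows. OMP_STACKSIZE applies to each thread the OpenMP runtime creates, so the requested stack space grows with the thread count. This is only an illustration: the helper names `parse_size` and `worker_stack_bytes` are made up for this note, and the figure is an upper bound on reserved address space, not a measurement of what XMVB actually uses.

```python
import re

# Unit suffixes accepted by OMP_STACKSIZE in the OpenMP specification.
# A bare number is interpreted as kilobytes.
_UNITS = {"B": 1, "K": 1024, "M": 1024**2, "G": 1024**3}


def parse_size(text, default_unit="K"):
    """Convert a size such as '500M' or '128G' into bytes."""
    match = re.fullmatch(r"\s*(\d+)\s*([BKMG])?\s*", text.upper())
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit or default_unit]


def worker_stack_bytes(omp_stacksize, num_threads):
    """Stack space requested for the threads OpenMP creates.

    The initial (master) thread is excluded because its stack is
    governed by the shell's stack limit, not by OMP_STACKSIZE.
    """
    return parse_size(omp_stacksize) * max(num_threads - 1, 0)


if __name__ == "__main__":
    gib = 1024**3
    for setting in ("500M", "1G", "128G"):
        total = worker_stack_bytes(setting, num_threads=8)
        print(f"OMP_STACKSIZE={setting}: {total / gib:.1f} GiB for 7 created threads")
```

With 8 threads, a per-thread stack of 128G asks for several times the 128 GB given to the job, which is consistent with the runtime suggesting a lower OMP_NUM_THREADS.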

[Image: fulvene derivatives]
7ketone.zip

应富鸣

Feb 19, 2020, 10:16:38 PM
to XMVB User Mailing List
Dear Lily,
Sorry for the delay.
I will check and try to understand what happened.

Best regards,


应富鸣

Feb 20, 2020, 9:53:13 PM
to XMVB User Mailing List
Dear Lily,
I have tested the calculation and found XMVB worked fine for your job.
I have a question: have you set "ulimit -s unlimited" yet?

The command "ulimit -s unlimited" sets the stack size of the master thread to unlimited, while OMP_STACKSIZE sets the stack size of the worker threads.
If you have not raised the stack size of the master thread, the segmentation fault occurs because the master thread's stack overflows.
You may try this first and see if it works. If not, I will send you the most recent development branch, which I have tested and which works.
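The two settings described above can be inspected, and applied to a child process, with a short script. The following is a minimal sketch for Linux, not part of XMVB: the function names are invented for this note, and the command passed to `run_with_stack_settings` is whatever launcher you normally use. Raising the soft stack limit to the hard limit matches "ulimit -s unlimited" only when the hard limit itself is unlimited.

```python
import os
import resource
import subprocess


def _fmt(limit):
    """Render an rlimit value in the style of `ulimit -s` (KiB or 'unlimited')."""
    return "unlimited" if limit == resource.RLIM_INFINITY else f"{limit // 1024} KiB"


def describe_stack_limits():
    """Report the master-thread stack limits and the current OMP_STACKSIZE."""
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
    return {
        "main_soft": _fmt(soft),
        "main_hard": _fmt(hard),
        "OMP_STACKSIZE": os.environ.get("OMP_STACKSIZE", "(unset)"),
    }


def _raise_main_stack_to_hard_limit():
    # Runs in the child just before exec; lifts the soft limit as far as allowed.
    _, hard = resource.getrlimit(resource.RLIMIT_STACK)
    resource.setrlimit(resource.RLIMIT_STACK, (hard, hard))


def run_with_stack_settings(command, omp_stacksize="500M"):
    """Run `command` with a raised master-thread stack and a given OMP_STACKSIZE."""
    env = dict(os.environ, OMP_STACKSIZE=omp_stacksize)
    return subprocess.run(
        command,
        env=env,
        preexec_fn=_raise_main_stack_to_hard_limit,
        check=False,
    )


if __name__ == "__main__":
    print(describe_stack_limits())
```

Running `describe_stack_limits()` inside the batch job, rather than on the login node, shows the limits the calculation actually inherits, since schedulers can apply their own.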

Best regards,
Fuming Ying



Lily Ireton

Feb 24, 2020, 7:05:24 PM
to XMVB User Mailing List
I have added the line "ulimit -s unlimited" to the xmvb script in the XMVB/bin directory. I had previously tried that but still had OMP_STACKSIZE set, so I assume that was an issue.

Doing this removed the segmentation faults, but the job still did not finish. I get no errors from either XMVB or OpenMP, but the job runs for 30 seconds and terminates at the same point, before any iterations occur. See the output file from the job scheduler and the output file from XMVB attached here. The job was given 8 threads with a maximum of 32 GB of RAM per thread.

When you ran this calculation, how much memory was required?

I am at a loss with no errors to clue me in to the issue here.

Lily
slurm-55506457.out
7ketone-vbscf.xmo

应富鸣

Feb 25, 2020, 6:42:32 PM
to Lily Ireton, XMVB User Mailing List
This is strange. I just set OMP_STACKSIZE=500M and everything goes smoothly.
I will try it again and let you know.
Did you use the stable branch?
Fuming Ying, Engineer
Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry
Xiamen University
Tel: 86-592-2187396
Mob: (86)15260202135
 
--
You received this message because you are subscribed to the Google Groups "XMVB User Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to xmvb-user+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/xmvb-user/eee2a5ee-f29e-4ee1-a9dd-a83599d1e171%40googlegroups.com.

Lily Ireton

Feb 26, 2020, 3:52:22 PM
to XMVB User Mailing List
I downloaded the most recent versions of XMVB for Linux (for both the GNU and Intel compilers). The most recent release I found was the one dated 2019-1-17. Both builds give me a segmentation fault (see below), even after modifying the xmvb script to include "ulimit -s unlimited" and OMP_STACKSIZE=500M. I have also tried running the job with these combinations of tools:

OpenMPI/1.8.1
GNU/4.8.3
OpenBLAS/0.2.9-LAPACK-3.5.0

~and~

GCC/8.3.0
OpenBLAS/0.3.7
OpenMPI

~and~

ifort/2018.1.163-GCC-6.4.0-2.28
impi/2018.1.163

Here is the text of the segmentation fault:

If it is possible, could you send me the stable branch you are working with?

Lily Ireton


Lily Ireton

Feb 26, 2020, 3:58:50 PM
to XMVB User Mailing List
I was able to confirm that my XMVB install was otherwise working by running a smaller calculation that I had previously completed. The issue just seems to come up on slightly larger systems.

应富鸣

Feb 27, 2020, 8:40:28 AM
to Lily Ireton, XMVB User Mailing List
Dear Lily,
Thank you and I will have a try.

Fuming Ying

应富鸣

Feb 29, 2020, 9:07:55 AM
to Lily Ireton, xmvb-user
Dear Lily,
I just pushed a snapshot of XMVB package with GNU compilers.
You may download and have a try.

Best regards,

Fuming Ying


On Sat, Feb 29, 2020 at 3:57 AM Lily Ireton <lil...@gmail.com> wrote:
My apologies. These are the incorrect 7ketone-geomopt.xmi and 7ketone-geomopt.xmo files. The ones in my previous email were the test files I ran, in which I added an additional orbital to see if that would clear the error. The original ones I ran had 35 orbitals, as did my input for the stand-alone XMVB. These are attached here. The same error appears in both sets of input/output files.

I am not looking to fix errors with my GAMESS-XMVB setup which seems to be working fine. Ultimately, if I could run these calculations in stand-alone XMVB that would be preferred but if either way would work, that would be fantastic.

Lily

On Thu, Feb 27, 2020 at 8:48 PM Lily Ireton <lil...@gmail.com> wrote:
I tried running the job through GAMESS-XMVB. The GAMESS and XMVB input/output files are attached here. I am getting a real error now that states as follows. I was able to confirm that GAMESS-XMVB was otherwise working correctly on smaller jobs.

image.png

Although it says this is not output from XMVB calculations, the entire GAMESS calculation was finished before this error appeared. I double checked my XMVB input file and it seems okay. I'm inclined to believe so especially since you ran it with no issues. Please let me know if you are familiar with this issue.

Lily


应富鸣

Mar 9, 2020, 9:40:58 AM
to Lily Ireton, xmvb-user
Dear Lily,
The version of my libc is also 2.17:
glibc-2.17-292.el7.x86_64

This is really weird.
Fuming Ying


On Mon, Mar 9, 2020 at 8:51 AM Lily Ireton <lil...@gmail.com> wrote:
I ran the same job with the program given in the download link above. I was able to get that to run on my computer but it never completed due to heat issues. What I did notice is that some material was printed to the 7ketone-vbscf.xmo file. I did not see this when the program was stopped with a segfault on the cluster. The xmo file shows that while 3906 structures exist using ion(0-2), only ten of them are generated. See the attached output file. This is different than the results I saw with the snapshot you recently pushed. That snapshot still generated a segfault on my laptop but also generated the complete 3906 structures. Are you seeing this same issue with the output files of the calculations you were able to complete? I am wondering if the program is only able to run now on my system due to the lower number of structures generated.

The program still does not work on the cluster and I am working on that issue. I believe that the libc-2.17.so library is part of the cause. The version on the cluster is 2.17 but my computer is using 2.30 and the segmentation fault seems to occur when this library is used based on the memory map spit out when the segfault is thrown. Can you tell me what version of libc.so is being called when you run this job on your program?
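For comparing libc versions between machines, as discussed above, a small script can report the glibc version on each one. This is a hedged sketch with invented function names. It reports the C library of the system running the script, which is not necessarily the one an XMVB binary resolves at run time (running `ldd` on the binary would show that). Versions are compared as integer tuples because a plain string comparison can misorder them, for example "2.9" against "2.17".

```python
import os
import platform
import re


def glibc_version():
    """Return the system glibc version as a string, or None if it cannot be found."""
    try:
        value = os.confstr("CS_GNU_LIBC_VERSION")  # typically "glibc <version>"
    except (ValueError, OSError, AttributeError):
        value = None
    if value:
        return value.split()[-1]
    # Fallback: ask the platform module about the running interpreter's libc.
    name, version = platform.libc_ver()
    return version if name == "glibc" and version else None


def version_tuple(version):
    """Turn '2.17' into (2, 17) so that versions compare numerically."""
    return tuple(int(part) for part in re.findall(r"\d+", version))


if __name__ == "__main__":
    found = glibc_version()
    print("glibc version:", found or "not detected")
    if found:
        # 2.17 is the cluster version mentioned in this thread.
        print("newer than 2.17:", version_tuple(found) > version_tuple("2.17"))
```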

Lily

On Thu, Mar 5, 2020 at 11:14 PM 应富鸣 <fmy...@gmail.com> wrote:
The download link is:

Password: 8mz1
Expires: 2020-03-13
Fuming Ying


On Sun, Mar 1, 2020 at 7:26 AM Lily Ireton <lil...@gmail.com> wrote:
I tried out the new version. An error is still occurring and it seems to be related to the free() function. Hopefully the information in this file will help. I tried using various editions of GCC, GCCcore, and GNU to see if that fixed the problem but the error was the same in each case. The 7ketone-vbscf.xmo file was empty.

