Structure fails with error munmap_chunk(): invalid pointer


Ryan Shofner

Mar 8, 2022, 10:01:36 AM
to structure-software
Hi all,

Background
I am attempting to run Structure on a machine running Pop!_OS 20.04 with a Ryzen Threadripper 3970X CPU and 256 GB of RAM. I am using the pre-compiled executable from the Structure website. I have used StrAuto to generate a list of commands for a range of Ks, and I'm using GNU Parallel 20161222 to run the commands in parallel. Example command:

structure -D 637738 -K 1 -m mainparams -o k1/amphistomus_k1_run1 2>&1 > log/k1/amphistomus_k1_run1.log
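A side note on the redirection in the example command: the shell processes redirections from left to right, so `2>&1 > file` points stderr at wherever stdout was going at that moment (the terminal or a pipe) and only then sends stdout to the file. If the goal was to capture both streams in the log, the order would need to be reversed. The sketch below illustrates the difference; `emit` is a hypothetical stand-in for a program that writes to both streams, not part of Structure.

```shell
# Hypothetical stand-in for a program that writes to both streams.
emit() {
    echo "to stdout"
    echo "to stderr" >&2
}

tmp=$(mktemp -d)

# Order used in the example command: stderr is duplicated from the
# current stdout first, then stdout alone is redirected to the file.
emit 2>&1 > "$tmp/a.log"

# Reversed order: stdout goes to the file first, then stderr follows it.
emit > "$tmp/b.log" 2>&1

wc -l < "$tmp/a.log"   # 1 line: only stdout was captured
wc -l < "$tmp/b.log"   # 2 lines: both streams were captured
```

With the reversed order, anything the program itself writes to stderr ends up in the per-run log file.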

Most, but not all, runs fail at various stages of their execution with the error code:

munmap_chunk(): invalid pointer
Aborted (core dumped)

Troubleshooting done so far
A) The only other conversation in this group that I can find that mentions this error states that the patch to the source code fixed their problem. I downloaded the source code and compiled it (using GCC 8, then again with GCC 9) with both patches that are available here.

Running these locally compiled versions did not solve my issue, but the error thrown changed to:

Segmentation fault (core dumped)

B) I compiled Structure on two different machines (using GCC 8) running the same OS, but with different CPUs: a Ryzen 9 5950X and a Ryzen 9 3900X, with 128 GB and 64 GB of RAM, respectively. On both machines, the Structure runs completed successfully using the same commands, the same dataset, and running in parallel using the same version of GNU Parallel.

From what I can tell via Googling the errors, both are memory-related problems in the C code of Structure, specifically with calls to the function free. From there, my understanding of the situation fails, as I've never programmed in C. It seems to me that the problem must be related to the CPU architecture somehow, although the Threadripper and 3900X should be highly similar.

Does anyone have any solutions to this issue? I specifically built this Threadripper system to analyse large datasets using Structure, so it's a massive bummer that the program isn't working on the system.

Thanks for the help!

Ryan

Francisco Pina Martins

Mar 9, 2022, 7:00:59 AM
to structure-software
Hi Ryan,

I've tried to reproduce your issue on a system with a Threadripper 3960X (256 GB RAM, Ubuntu-server 20.04), which shouldn't be that different from your 3970X.
The first thing I noticed was that the pre-compiled binary from here does not even run. This happens because it was compiled as i386, and I don't have the relevant libraries for 32-bit backward compatibility installed on my system (nor will I install them, since running i386 binaries on these Threadrippers is almost a crime!).
As such I have used the precompiled binary from Structure_threader, which was built on Ubuntu-server 12.04 using GCC 4.6, according to these instructions, but I was unable to reproduce your issue (wrapped under Structure_threader).
Then I tried to recompile structure, using the same script, but under Ubuntu 20.04 on the threadripper system, and my test runs all went fine as well.
Maybe my test runs missed something that you are using in your parameters?
Would it be possible for you to share your mainparams and extraparams files, along with a minimal dataset that reveals the issue for us to try to reproduce?

Francisco

Ryan Shofner

Mar 10, 2022, 8:28:20 AM
to structure-software
Hi Francisco,

Interesting that the precompiled binary doesn't run on your system; it runs on mine, even though I don't have i386 support enabled (dpkg --print-architecture returns only amd64). When I use the precompiled binary, I'd say 90% of runs fail, but 10% finish normally. This is about the same as when I compile the binary myself.

With the Structure_threader binary as well as the binary compiled using the helper script, I still get the Segmentation fault (core dumped) error. I've attached the screen output from make (make-log.txt).

I've included an archive that comprises my data and all the relevant auxiliary files. If structure is in your $PATH and GNU Parallel is installed, all you need to do is execute ./runstructure to run it exactly as I have been. You will need to edit the -j value on line 48 in runstructure to match the number of CPU cores/threads on your system.

If there is going to be an error, it will happen in the first 5-10 seconds or so.

Cheers,

Ryan

make-log.txt

Francisco Pina Martins

Mar 13, 2022, 8:51:13 AM
to structure-software
Hi again Ryan,

>Interesting that the precompiled binary doesn't run on your system; it runs on mine, even though I don't have i386 support enabled (dpkg --print-architecture returns only amd64). When I use the precompiled binary, I'd say 90% of runs fail, but 10% finish normally. This is about the same as when I compile the binary myself.
As far as I know, the GNU/Linux kernel comes with i386 compatibility built in, even if it is compiled as x86_64. You only need the i386 (or i686) compatibility libraries installed in order for 32-bit ELF binaries to run (I don't have these libs installed, but if it runs on your system, then they probably are).
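To confirm whether a given binary is a 32-bit or 64-bit ELF file, the header can be inspected directly. The following is a minimal sketch, assuming a POSIX shell with `od` available; `elf_class` is a hypothetical helper name. It relies on the ELF layout, where the file starts with the bytes `7f 45 4c 46` and the byte at offset 4 (EI_CLASS) is 1 for 32-bit and 2 for 64-bit.

```shell
# elf_class FILE: print "32-bit", "64-bit", "unknown", or "not-elf".
elf_class() {
    # The first four bytes of an ELF file are 0x7f 'E' 'L' 'F'.
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    if [ "$magic" != "7f454c46" ]; then
        echo "not-elf"
        return 1
    fi
    # Byte at offset 4 is EI_CLASS: 1 = 32-bit, 2 = 64-bit.
    class=$(od -An -tu1 -j4 -N1 "$1" | tr -d ' \n')
    case "$class" in
        1) echo "32-bit" ;;
        2) echo "64-bit" ;;
        *) echo "unknown" ;;
    esac
}
```

For example, `elf_class "$(command -v structure)"` would report the class of whichever structure binary is first in the $PATH. Where the `file` utility is installed, `file` on the same path gives the same information in more detail.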

I have used the data (and scripts) you provided to run the analysis and I was able to successfully complete all runs on my TR 3960X. That was quite a thorough way to make the issue reproducible!

>With the Structure_threader binary as well as the binary compiled using the helper script, I still get the Segmentation fault (core dumped) error. I've attached the screen output from make (make-log.txt).
So if you are getting the same error, regardless of how the binary was built, then I suspect it is either some Pop!_OS shenanigan, or something with your specific CPU model.
I have built a way to test the former, but not the latter (which you already have, by testing on a different CPU).
I have created this Dockerfile, based on Ubuntu 20.04, which installs Structure_threader and the respective binaries to the image. If you don't want to build it yourself, I've also uploaded it to Docker Hub and you can get it with the command: docker pull stunts/structure_threader:01
For this to work, you need, of course, to install docker (if I recall correctly, the package names are docker and docker.io). For ease of use, you should also add your user to the docker group.
This should allow you to run the binaries in an Ubuntu 20.04 environment, with only a very minimal performance penalty relative to running on bare iron.
Once you have pulled the image to your system (or built it yourself using the Dockerfile), you just have to find the path to the location where your files are (the same ones you sent me) and do the following:

docker run -v /path/to/the/directory/where/your/files/are:/analysis/ -it stunts/structure_threader:01 /bin/bash

This command should result in an interactive bash prompt that looks something like this:

root@a7dc6e32ea4c:/analysis#

The a7dc6e32ea4c part is the container ID, and will therefore be different from the one you see here (sorry about it being a root prompt, I didn't have the time for a proper docker build!).
If all goes well, in that prompt, you just have to run the command ./runstructure and everything should occur smoothly. Unless there is some specific issue with the TR 3970X...

Hopefully it will "just work" (TM) and you will be able to simply use docker for your runs.

Good luck, and let us know how that went!

Francisco

Ryan Shofner

Mar 23, 2022, 9:20:25 AM
to structure-software
Hi Francisco,

Well, that was my first time using docker. I was able to get the environment loaded without much fuss. Unfortunately, I am still encountering errors while running structure. The error has changed somewhat, and now there are two variations:

double free or corruption (out)
Aborted (core dumped)

and

Segmentation fault (core dumped)

The only conclusion then is that I have a memory issue, either with the onboard CPU cache or with the RAM. Time to start chasing hardware gremlins it seems.

Thanks so much for the help, it was extremely thorough, and I think we've covered all the bases we could.

Cheers,

Ryan

Francisco Pina Martins

Mar 24, 2022, 8:13:36 AM
to structure-software
Hi Ryan,

>Well, that was my first time using docker. I was able to get the environment loaded without much fuss. Unfortunately, I am still encountering errors while running structure. The error has changed somewhat, and now there are two variations:

Oh. Well, hopefully, now you have a new option on your tool-belt. =-)

>The only conclusion then is that I have a memory issue, either with the onboard CPU cache or with the RAM. Time to start chasing hardware gremlins it seems.

After everything that was tested, I too think this is the most likely explanation. RAM is the usual culprit here. I'd start by removing half the RAM DIMMs and testing again. If it works, then you know there is a bad DIMM in the half you removed. If it doesn't, try with the other half. You know the drill: bisect until you find the bad DIMM.
If it is still not working after that, then you might have been unlucky enough to get a bad CPU or motherboard. In this case, you're probably better off filing an RMA.
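Since the failures are intermittent, each hardware configuration is easier to judge from a failure count over several runs than from a single run. A minimal sketch of such a check follows; `count_failures` is a hypothetical helper, and it assumes a crashed run shows up as a non-zero exit status (which is the case for a process killed by a signal such as SIGSEGV or SIGABRT).

```shell
# count_failures N CMD [ARGS...]: run CMD N times and print how many
# runs ended with a non-zero exit status.
count_failures() {
    n=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$n" ]; do
        # Output is discarded; only the exit status matters here.
        if ! "$@" > /dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures"
}

count_failures 5 false   # prints 5
count_failures 5 true    # prints 0
```

For instance, something like `count_failures 20 structure -D 637738 -K 1 -m mainparams -o k1/test_run` could be repeated after each DIMM swap, with the counts compared between configurations. The run count of 20 and the output name are placeholders.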

>Thanks so much for the help, it was extremely thorough, and I think we've covered all the bases we could.

Your effort to provide a way to reproduce the issue made it fun to chase. =-)

Please let us know the conclusion to the story once you have it sorted.


Best,

Francisco

Ryan Shofner

Mar 25, 2022, 9:17:59 AM
to structure-software
Issue solved! All I did was disable XMP in the BIOS so that the RAM runs at 2666 MHz. STRUCTURE now runs normally.

I had posted this issue over on Reddit to see if hardware/C++ gurus could figure it out; the thread is here if anyone cares to read it. There were a lot of decent troubleshooting suggestions, including using the tool Valgrind to check for memory errors in the code itself. It's been quite the wild ride.

Thank you Francisco for all your help, I wouldn't have been able to troubleshoot it so thoroughly without your assistance.