Hi all,
I was in two minds about which mailing list to post to first, so apologies if this is the wrong place to ask. My question has three parts, so I'll break it down.
I'm currently refining a fairly large number of particles at bin1 in RELION 4, using the classic WARP-style 3D subtomograms as opposed to RELION 5 2D stacks.
1. On the final iteration of 3D refine, RELION keeps crashing at the maximisation step (see the pasted run.out and run.err at the bottom). This happens consistently across different commits of RELION 4, even if I use GPUs with huge amounts of memory, reduce threads to 1, set --free_gpu_memory 4000, or apply other memory-saving tricks. I still suspect a memory issue, however, since other refinements (e.g. at bin2 and bin4) worked fine. Has anyone come across this and found a way around it? FWIW, RELION crashes while writing out files called run_half1_class001_data_image.mrc, run_half1_class001_data_real.mrc and run_half1_class001_data_weights.mrc, which I have never seen before. I think they are temporary files that get deleted once the _unfil.mrc maps have been generated.
2. Instead of letting the final iteration complete, I tried a bit of a cheat: using relion_reconstruct to get the half maps and full map from the final iteration's _data.star:
relion_reconstruct --i final_data.star_from_refine --o out.mrc --ctf --3d_rot
The reconstructed particle looks like a ribosome (yay), but at much lower resolution and quality than the maps from the refinement iterations (boo). Something is going wrong; am I missing a flag?
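For what it's worth, I assume the half maps would need to be reconstructed separately, one per random subset. A minimal sketch, assuming relion_reconstruct's --subset option (1 or 2, selecting particles by rlnRandomSubset in the star file) is available in this build. The function name and output file names are placeholders, and the script only prints the commands rather than running them:

```shell
# Dry-run sketch: prints one relion_reconstruct command per half-set.
# print_half_map_cmds and the half1.mrc/half2.mrc names are placeholders.
# Assumes --subset 1|2 restricts the reconstruction to one random half.
print_half_map_cmds() {
  star="$1"
  for half in 1 2; do
    echo "relion_reconstruct --i $star --o half${half}.mrc --ctf --3d_rot --subset $half"
  done
}

print_half_map_cmds final_data.star_from_refine
```

This would not by itself explain the lower quality of the full map, so other flags may still be missing.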
3. I tried using the 2D stacks export tool and refining in RELION 5, but I believe the particles are being extracted in the wrong place. Using the exact same .star to define coordinates, WARP extracts 3D subtomograms correctly (the averaged particle for each tomo is a white blob as expected) whereas I don't see the same in the 2D stacks. Does WARP require the input for 2D stacks to be centred coordinates as opposed to CoordinateXYZ? That's the only reason I could think of.
Sorry for the essay. I'm crowdsourcing knowledge because this has been tricky to troubleshoot: the dataset is so large that each test takes a few days.
Thanks,
Euan
run.err:
corrupted size vs. prev_size
[gpu47:06576] *** Process received signal ***
[gpu47:06576] Signal: Aborted (6)
[gpu47:06576] Signal code: (-6)
[gpu47:06576] [ 0] /lib64/libpthread.so.0(+0x12d20)[0x7fffe3fafd20]
[gpu47:06576] [ 1] /lib64/libc.so.6(gsignal+0x10f)[0x7fffe32fd52f]
[gpu47:06576] [ 2] /lib64/libc.so.6(abort+0x127)[0x7fffe32d0e65]
[gpu47:06576] [ 3] /lib64/libc.so.6(+0x8f727)[0x7fffe333e727]
[gpu47:06576] [ 4] /lib64/libc.so.6(+0x96a2c)[0x7fffe3345a2c]
[gpu47:06576] [ 5] /lib64/libc.so.6(+0x972d6)[0x7fffe33462d6]
[gpu47:06576] [ 6] /lib64/libc.so.6(+0x97465)[0x7fffe3346465]
[gpu47:06576] [ 7] /lib64/libc.so.6(+0x99b28)[0x7fffe3348b28]
[gpu47:06576] [ 8] /lib64/libc.so.6(+0x9a65b)[0x7fffe334965b]
[gpu47:06576] [ 9] /lib64/libc.so.6(+0x9b6fa)[0x7fffe334a6fa]
[gpu47:06576] [10] /g/easybuild/x86_64/Rocky/8/genoa/software/FFTW.MPI/3.3.10-gompi-2023a/lib/libfftw3.so.3(fftw_malloc_plain+0x15)[0x7fffea70b0c5]
[gpu47:06576] [11] /g/easybuild/x86_64/Rocky/8/genoa/software/FFTW.MPI/3.3.10-gompi-2023a/lib/libfftw3.so.3(+0x39c6d)[0x7fffea70cc6d]
[gpu47:06576] [12] /g/easybuild/x86_64/Rocky/8/genoa/software/FFTW.MPI/3.3.10-gompi-2023a/lib/libfftw3.so.3(+0x3a493)[0x7fffea70d493]
[gpu47:06576] [13] /g/easybuild/x86_64/Rocky/8/genoa/software/FFTW.MPI/3.3.10-gompi-2023a/lib/libfftw3.so.3(+0x3aeca)[0x7fffea70deca]
[gpu47:06576] [14] /g/easybuild/x86_64/Rocky/8/genoa/software/FFTW.MPI/3.3.10-gompi-2023a/lib/libfftw3.so.3(fftw_mkplan_d+0xf)[0x7fffea70e34f]
[gpu47:06576] [15] /g/easybuild/x86_64/Rocky/8/genoa/software/FFTW.MPI/3.3.10-gompi-2023a/lib/libfftw3.so.3(fftw_mkplan_f_d+0x59)[0x7fffea70e3c9]
[gpu47:06576] [16] /g/easybuild/x86_64/Rocky/8/genoa/software/FFTW.MPI/3.3.10-gompi-2023a/lib/libfftw3.so.3(+0x40f89)[0x7fffea713f89]
[gpu47:06576] [17] /g/easybuild/x86_64/Rocky/8/genoa/software/FFTW.MPI/3.3.10-gompi-2023a/lib/libfftw3.so.3(+0x3ab4e)[0x7fffea70db4e]
[gpu47:06576] [18] /g/easybuild/x86_64/Rocky/8/genoa/software/FFTW.MPI/3.3.10-gompi-2023a/lib/libfftw3.so.3(+0x3adf2)[0x7fffea70ddf2]
[gpu47:06576] [19] /g/easybuild/x86_64/Rocky/8/genoa/software/FFTW.MPI/3.3.10-gompi-2023a/lib/libfftw3.so.3(fftw_mkplan_d+0xf)[0x7fffea70e34f]
[gpu47:06576] [20] /g/easybuild/x86_64/Rocky/8/genoa/software/FFTW.MPI/3.3.10-gompi-2023a/lib/libfftw3.so.3(+0x46bec)[0x7fffea719bec]
[gpu47:06576] [21] /g/easybuild/x86_64/Rocky/8/genoa/software/FFTW.MPI/3.3.10-gompi-2023a/lib/libfftw3.so.3(+0x3ab4e)[0x7fffea70db4e]
[gpu47:06576] [22] /g/easybuild/x86_64/Rocky/8/genoa/software/FFTW.MPI/3.3.10-gompi-2023a/lib/libfftw3.so.3(+0x3adf2)[0x7fffea70ddf2]
[gpu47:06576] [23] /g/easybuild/x86_64/Rocky/8/genoa/software/FFTW.MPI/3.3.10-gompi-2023a/lib/libfftw3.so.3(fftw_mkplan_d+0xf)[0x7fffea70e34f]
[gpu47:06576] [24] /g/easybuild/x86_64/Rocky/8/genoa/software/FFTW.MPI/3.3.10-gompi-2023a/lib/libfftw3.so.3(+0x41b83)[0x7fffea714b83]
[gpu47:06576] [25] /g/easybuild/x86_64/Rocky/8/genoa/software/FFTW.MPI/3.3.10-gompi-2023a/lib/libfftw3.so.3(+0x3ab4e)[0x7fffea70db4e]
[gpu47:06576] [26] /g/easybuild/x86_64/Rocky/8/genoa/software/FFTW.MPI/3.3.10-gompi-2023a/lib/libfftw3.so.3(+0x3adf2)[0x7fffea70ddf2]
[gpu47:06576] [27] /g/easybuild/x86_64/Rocky/8/genoa/software/FFTW.MPI/3.3.10-gompi-2023a/lib/libfftw3.so.3(fftw_mkplan_d+0xf)[0x7fffea70e34f]
[gpu47:06576] [28] /g/easybuild/x86_64/Rocky/8/genoa/software/FFTW.MPI/3.3.10-gompi-2023a/lib/libfftw3.so.3(+0x46bec)[0x7fffea719bec]
[gpu47:06576] [29] /g/easybuild/x86_64/Rocky/8/genoa/software/FFTW.MPI/3.3.10-gompi-2023a/lib/libfftw3.so.3(+0x3ab4e)[0x7fffea70db4e]
[gpu47:06576] *** End of error message ***
run.out:
Expectation iteration 15
2.58/7.01 hrs ......................~~,_,"> [oo]
4.35/6.85 hrs .....................................~~(,_,">
5.67/6.79 hrs ..................................................~~(,_,>
6.77/6.77 hrs ............................................................~~(,_,">
Averaging half-reconstructions up to 40 Angstrom resolution to prevent diverging orientations ...
Note that only for higher resolutions the FSC-values are according to the gold-standard!
Calculating gold-standard FSC ...
Maximization ...
000/??? sec ~~(,_,">