Troubleshooting for misalignment correction


Rajiv Ranjan Singh

Apr 17, 2024, 9:11:00 AM
to spis...@googlegroups.com
Hi,
I was running the tutorial data for the HA trimer data set.

The anisotropy calculation works fine.

I downloaded the required files from "https://ucla.box.com/s/ng459g8mhlf63z4sio5y4v432yt6k7qa",
which contains the following files:
job025_tutorial.mrcs  mask.mrc 

HA_reference.mrc       job025_tutorial.star  


I ran the job in RELION 4.0.1 and got the following error.


The run.out output is as follows:


RELION version: 4.0.1-commit-7809a7 
Precision: BASE=double

 === RELION MPI setup ===
 + Number of MPI processes = 5
 + Number of threads per MPI process = 4
 + Total number of threads therefore = 20
 + Leader (0) runs on host = spgpu
 + Follower 1 runs on host = spgpu
 + Follower 2 runs on host = spgpu
 + Follower 3 runs on host = spgpu
 + Follower 4 runs on host = spgpu
 =================
 uniqueHost spgpu has 4 ranks.
 Follower 1 will distribute threads over devices 2
 Thread 0 on follower 1 mapped to device 2
 Thread 1 on follower 1 mapped to device 2
 Thread 2 on follower 1 mapped to device 2
 Thread 3 on follower 1 mapped to device 2
 Follower 2 will distribute threads over devices 3
 Thread 0 on follower 2 mapped to device 3
 Thread 1 on follower 2 mapped to device 3
 Thread 2 on follower 2 mapped to device 3
 Thread 3 on follower 2 mapped to device 3
 Follower 3 will distribute threads over devices 2
 Thread 0 on follower 3 mapped to device 2
 Thread 1 on follower 3 mapped to device 2
 Thread 2 on follower 3 mapped to device 2
 Thread 3 on follower 3 mapped to device 2
 Follower 4 will distribute threads over devices 3
 Thread 0 on follower 4 mapped to device 3
 Thread 1 on follower 4 mapped to device 3
 Thread 2 on follower 4 mapped to device 3
 Thread 3 on follower 4 mapped to device 3
Device 2 on spgpu is split between 2 followers
Device 3 on spgpu is split between 2 followers
 Running CPU instructions in double precision. 
 + On host spgpu: free scratch space = 691.762 Gb.
 Copying particles to scratch directory: /ssd/relion_volatile/
2.15/2.15 min ............................................................~~(,_,">
 For optics_group 1, there are 85358 particles on the scratch disk.
 Estimating initial noise spectra from 1000 particles 
   3/ 3 sec ............................................................~~(,_,">
 Auto-refine: Iteration= 1
 Auto-refine: Resolution= 9.86353 (no gain for 0 iter) 
 Auto-refine: Changes in angles= 999 degrees; and in offsets= 999 Angstroms (no gain for 0 iter) 
 CurrentResolution= 9.86353 Angstroms, which requires orientationSampling of at least 6.54545 degrees for a particle of diameter 170 Angstroms
 Oversampling= 0 NrHiddenVariableSamplingPoints= 30240
 OrientationalSampling= 15 NrOrientations= 1440
 TranslationalSampling= 2.62 NrTranslations= 21
=============================
 Oversampling= 1 NrHiddenVariableSamplingPoints= 967680
 OrientationalSampling= 7.5 NrOrientations= 11520
 TranslationalSampling= 1.31 NrTranslations= 84
=============================
 Expectation iteration 1
2.55/2.55 min ............................................................~~(,_,">
 Averaging half-reconstructions up to 40 Angstrom resolution to prevent diverging orientations ...
 Note that only for higher resolutions the FSC-values are according to the gold-standard!
 Calculating gold-standard FSC ...
 Maximization ...
000/??? sec ~~(,_,"> [oo]
 + Making system call for external reconstruction: /opt/miniconda3/envs/spisonet/bin/python /opt/spIsoNet/spIsoNet/bin/relion_wrapper.py Refine3D/job001/run_it001_half1_class001_external_reconstruct.star
iter = 001
set CUDA_VISIBLE_DEVICES=None
set CONDA_ENV=spisonet
set ISONET_WHITENING=True
set ISONET_WHITENING_LOW=10
set ISONET_RETRAIN_EACH_ITER=True
set ISONET_BETA=0.5
set ISONET_ALPHA=1
set ISONET_START_HEALPIX=3
set ISONET_ACC_BATCHES=2
set ISONET_EPOCHS=5
set ISONET_KEEP_LOWRES=False
set ISONET_LOWPASS=True
set ISONET_ANGULAR_WHITEN=False
set ISONET_3DFSD=False
set ISONET_FSC_05=False
set ISONET_FSC_WEIGHTING=True
set ISONET_START_RESOLUTION=15.0
set ISONET_KEEP_LOWRES= True
healpix = 2
symmetry = C3
mask_file = mask.mrc
pixel size = 1.309998
resolution at 0.5 and 0.143 are 999.0 and 999.0
real limit resolution to 10.0

 RELION version: 4.0.1-commit-7809a7
 exiting with an error ...
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 1 in communicator MPI_COMM_WORLD 
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------



Output of run.err
The following warnings were encountered upon command-line parsing: 
WARNING: Option --keep_lowres is not a valid RELION argument
Traceback (most recent call last):
  File "/opt/spIsoNet/spIsoNet/bin/relion_wrapper.py", line 360, in <module>
    shutil.copy(mrc_unfil, mrc_unfil_backup)
  File "/opt/miniconda3/envs/spisonet/lib/python3.10/shutil.py", line 417, in copy
    copyfile(src, dst, follow_symlinks=follow_symlinks)
  File "/opt/miniconda3/envs/spisonet/lib/python3.10/shutil.py", line 254, in copyfile
    with open(src, 'rb') as fsrc:
FileNotFoundError: [Errno 2] No such file or directory: 'Refine3D/job001/run_it001_half1_class001_unfil.mrc'
Traceback (most recent call last):
  File "/opt/spIsoNet/spIsoNet/bin/relion_wrapper.py", line 360, in <module>
    shutil.copy(mrc_unfil, mrc_unfil_backup)
  File "/opt/miniconda3/envs/spisonet/lib/python3.10/shutil.py", line 417, in copy
    copyfile(src, dst, follow_symlinks=follow_symlinks)
  File "/opt/miniconda3/envs/spisonet/lib/python3.10/shutil.py", line 254, in copyfile
    with open(src, 'rb') as fsrc:
FileNotFoundError: [Errno 2] No such file or directory: 'Refine3D/job001/run_it001_half2_class001_unfil.mrc'
in: /home/install/relion/src/backprojector.cpp, line 1294
ERROR: 
 ERROR: there was something wrong with system call: /opt/miniconda3/envs/spisonet/bin/python /opt/spIsoNet/spIsoNet/bin/relion_wrapper.py Refine3D/job001/run_it001_half1_class001_external_reconstruct.star
=== Backtrace ===
/opt/relion/4.0.1//bin/relion_refine_mpi(_ZN11RelionErrorC1ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES7_l+0x76) [0x48a0a6]
/opt/relion/4.0.1//bin/relion_refine_mpi(_ZN13BackProjector19externalReconstructER13MultidimArrayIdER8FileNameS2_S2_S2_S2_ddbdi+0x24b5) [0x520375]
/opt/relion/4.0.1//bin/relion_refine_mpi(_ZN14MlOptimiserMpi12maximizationEv+0x17c7) [0x4be4d7]
/opt/relion/4.0.1//bin/relion_refine_mpi(_ZN14MlOptimiserMpi7iterateEv+0x3aa) [0x4bf4da]
/opt/relion/4.0.1//bin/relion_refine_mpi(main+0x66) [0x477246]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0) [0x7ff860db9840]
/opt/relion/4.0.1//bin/relion_refine_mpi(_start+0x29) [0x47a709]
==================
ERROR: 

 ERROR: there was something wrong with system call: /opt/miniconda3/envs/spisonet/bin/python /opt/spIsoNet/spIsoNet/bin/relion_wrapper.py Refine3D/job001/run_it001_half1_class001_external_reconstruct.star


Please advise on how to troubleshoot this.


Rajiv Ranjan Singh
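
The traceback above shows the wrapper failing when it tries to copy `*_unfil.mrc` files that do not exist. A minimal sketch for checking this before digging further: it derives the expected map path from the external-reconstruct STAR path by swapping the suffix, which is the naming pattern visible in the log. The function names are hypothetical, and the half2 STAR file name is an assumption based on the half1 name in the log.

```python
from pathlib import Path

# Suffix of the STAR file that RELION hands to the external reconstruction call.
STAR_SUFFIX = "_external_reconstruct.star"


def expected_unfil_map(star_path):
    """Return the *_unfil.mrc path expected next to an external-reconstruct STAR file."""
    star_path = str(star_path)
    if not star_path.endswith(STAR_SUFFIX):
        raise ValueError(f"not an external-reconstruct STAR file: {star_path}")
    return Path(star_path[: -len(STAR_SUFFIX)] + "_unfil.mrc")


def missing_unfil_maps(star_paths):
    """Return the expected unfiltered maps that are not present on disk."""
    return [p for p in map(expected_unfil_map, star_paths) if not p.is_file()]
```

Calling `missing_unfil_maps` with the half1 and half2 STAR paths from `Refine3D/job001/` returns the maps that were never written, which is the condition behind the `FileNotFoundError` in run.err.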

Rajiv Ranjan Singh

Apr 17, 2024, 9:26:40 AM
to spis...@googlegroups.com
The command was as follows:

`which relion_refine_mpi` --o Refine3D/job001/run --auto_refine --split_random_halves --i job025_tutorial.star --ref HA_reference.mrc --firstiter_cc --trust_ref_size --ini_high 10 --dont_combine_weights_via_disc --scratch_dir /ssd/ --pool 30 --pad 2  --ctf --particle_diameter 170 --flatten_solvent --zero_mask --solvent_mask mask.mrc --oversampling 1 --healpix_order 2 --auto_local_healpix_order 4 --offset_range 5 --offset_step 2 --sym C3 --low_resol_join_halves 40 --norm --scale  --j 4 --gpu "2:3" --external_reconstruct --keep_lowres --pipeline_control Refine3D/job001/


Best,

Rajiv Ranjan Singh


YUNTAO LIU

Apr 17, 2024, 3:41:18 PM
to Rajiv Ranjan Singh, spis...@googlegroups.com
Hi Rajiv,

I recall that this error can be solved by adding the "--solvent_correct_fsc" parameter.


--
Best Regards,
Yuntao Liu,  Postdoc.

California NanoSystem Institute
University of California Los Angeles
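
For anyone scripting their refinement jobs, the suggestion to add `--solvent_correct_fsc` could be applied as in the sketch below. It works on an argument list rather than a shell string, and the helper name `add_flag` is hypothetical. The example list is a shortened subset of the command posted in this thread, not the full command.

```python
def add_flag(args, flag="--solvent_correct_fsc"):
    """Return a copy of args with flag appended, unless it is already present."""
    args = list(args)
    if flag not in args:
        args.append(flag)
    return args


# Shortened subset of the relion_refine_mpi arguments from the thread.
base_args = [
    "relion_refine_mpi",
    "--o", "Refine3D/job001/run",
    "--auto_refine",
    "--split_random_halves",
    "--solvent_mask", "mask.mrc",
    "--external_reconstruct",
]

fixed_args = add_flag(base_args)
```

The resulting list can be passed to `subprocess.run` or joined for a job script. Calling `add_flag` twice leaves a single copy of the flag.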

Rajiv Ranjan Singh

Apr 17, 2024, 7:59:22 PM
to YUNTAO LIU, spis...@googlegroups.com
I added this argument as well, but the job still failed.
Please advise.
Best,

Rajiv Ranjan Singh


YUNTAO LIU

Apr 17, 2024, 8:15:13 PM
to Rajiv Ranjan Singh, spis...@googlegroups.com
Hi Rajiv,

OK. This "MPI_ABORT was invoked on rank 1 in communicator MPI_COMM_WORLD with errorcode 1." looks like the RELION reconstruction was not performed successfully.

Let's get more details. What files are generated in the output folder?
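
One way to answer that question in a form that is easy to paste into a reply is a short listing of the job folder with file sizes, since an empty or missing map is usually the interesting part. This is only a sketch, and `list_outputs` is a hypothetical helper name.

```python
from pathlib import Path


def list_outputs(job_dir):
    """Return (relative path, size in bytes) for every file under job_dir, sorted by path."""
    root = Path(job_dir)
    return sorted(
        (p.relative_to(root).as_posix(), p.stat().st_size)
        for p in root.rglob("*")
        if p.is_file()
    )
```

For the job in this thread, one would loop over `list_outputs("Refine3D/job001")` and print each name with its size.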



Rajiv Ranjan Singh

Apr 17, 2024, 8:16:23 PM
to YUNTAO LIU, spis...@googlegroups.com
Adding this argument in RELION 4.0.1 worked! (Earlier I had passed these arguments to RELION 5 beta 2, and that is where it was failing.)

Thank you,
Rajiv Ranjan Singh


YUNTAO LIU

Apr 17, 2024, 8:24:42 PM
to Rajiv Ranjan Singh, spis...@googlegroups.com
Sounds good. spIsoNet is currently not compatible with RELION 5.
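
Since the failure came down to which RELION version actually ran, a quick check of the log can rule this out early. The sketch below reads the major version from the `RELION version:` line that appears in run.out in this thread. The function names are hypothetical, and the sketch assumes RELION 5 logs use the same line prefix.

```python
import re

# Matches lines such as "RELION version: 4.0.1-commit-7809a7" from run.out.
VERSION_RE = re.compile(r"RELION version:\s*(\d+)")


def relion_major_version(log_text):
    """Return the RELION major version found in log_text, or None if absent."""
    match = VERSION_RE.search(log_text)
    return int(match.group(1)) if match else None


def ran_under_relion5_or_later(log_text):
    """True if the log reports a major version of 5 or higher."""
    major = relion_major_version(log_text)
    return major is not None and major >= 5
```

Passing the contents of `Refine3D/job001/run.out` to `ran_under_relion5_or_later` flags a job that was launched with a RELION 5 binary by mistake.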

Rajiv Ranjan Singh

Apr 17, 2024, 8:37:54 PM
to YUNTAO LIU, spis...@googlegroups.com
Thank you, I appreciate your troubleshooting. I am looking forward to using spIsoNet on our own datasets. Does it work for small protein complexes below 100 kDa?

Rajiv Ranjan Singh


YUNTAO LIU

Apr 17, 2024, 9:08:46 PM
to Rajiv Ranjan Singh, spis...@googlegroups.com
Hi Rajiv,

It probably works. In theory, though, larger (or higher-resolution) molecules perform better, because spIsoNet is trained only on your own limited map/data.


Jinuk Kim

Apr 17, 2024, 9:40:29 PM
to spIsoNet
Hi,

I encountered the same error with RELION 4.0.1, but the problem was fixed by using the "--solvent_correct_fsc" parameter.

Thanks!
On Thursday, April 18, 2024 at 10:08:46 AM UTC+9, yun...@g.ucla.edu wrote: