Any solutions? And thanks.


Chunyan Wang

Jul 4, 2024, 9:42:12 AM

to spIsoNet

The following modules were not unloaded:

  (Use "module --force purge" to unload all):


  1) StdEnv

RELION 4 data processing is not backwards-compatible with RELION 3.

Keep data for RELION 3 separate if switching between versions.

RELION version: 4.0.1 

Precision: BASE=double


 WARNING: No reference mask filename was found in file PseudoSubtomo/job042-bin8-icos/optimisation_set.star. Continuing without mask.


 === RELION MPI setup ===

 + Number of MPI processes             = 7

 + Leader  (0) runs on host            = r104u09n01.mccleary.ycrc.yale.edu

 + Follower     1 runs on host            = r104u09n01.mccleary.ycrc.yale.edu

 + Follower     2 runs on host            = r104u09n01.mccleary.ycrc.yale.edu

 + Follower     3 runs on host            = r104u09n01.mccleary.ycrc.yale.edu

 =================


 + Follower     4 runs on host            = r104u25n01.mccleary.ycrc.yale.edu

 + Follower     5 runs on host            = r104u25n01.mccleary.ycrc.yale.edu

 + Follower     6 runs on host            = r104u25n01.mccleary.ycrc.yale.edu

 uniqueHost r104u09n01.mccleary.ycrc.yale.edu has 3 ranks.

 uniqueHost r104u25n01.mccleary.ycrc.yale.edu has 3 ranks.

GPU-ids not specified for this rank, threads will automatically be mapped to available devices.

 Thread 0 on follower 1 mapped to device 0

GPU-ids not specified for this rank, threads will automatically be mapped to available devices.

 Thread 0 on follower 2 mapped to device 1

GPU-ids not specified for this rank, threads will automatically be mapped to available devices.

 Thread 0 on follower 3 mapped to device 2

GPU-ids not specified for this rank, threads will automatically be mapped to available devices.

 Thread 0 on follower 4 mapped to device 0

GPU-ids not specified for this rank, threads will automatically be mapped to available devices.

 Thread 0 on follower 5 mapped to device 1

GPU-ids not specified for this rank, threads will automatically be mapped to available devices.

 Thread 0 on follower 6 mapped to device 2

 Running CPU instructions in double precision. 

 Estimating initial noise spectra from 1000 particles 

   4/   4 sec ............................................................~~(,_,">

 Auto-refine: Iteration= 1

 Auto-refine: Resolution= 33.6 (no gain for 0 iter) 

 Auto-refine: Changes in angles= 999 degrees; and in offsets= 999 Angstroms (no gain for 0 iter) 

 CurrentResolution= 33.6 Angstroms, which requires orientationSampling of at least 5.14286 degrees for a particle of diameter 740 Angstroms

 Oversampling= 0 NrHiddenVariableSamplingPoints= 46656

 OrientationalSampling= 7.5 NrOrientations= 576

 TranslationalSampling= 33.6 NrTranslations= 81

=============================

 Oversampling= 1 NrHiddenVariableSamplingPoints= 2985984

 OrientationalSampling= 3.75 NrOrientations= 4608

 TranslationalSampling= 16.8 NrTranslations= 648

=============================

 Expectation iteration 1

2.25/2.25 min ............................................................~~(,_,">

 Averaging half-reconstructions up to 40 Angstrom resolution to prevent diverging orientations ...

 Note that only for higher resolutions the FSC-values are according to the gold-standard!

 Calculating gold-standard FSC ...

 Maximization ...


 + Making system call for external reconstruction: python /home/cw785/software/spIsoNet/spIsoNet/bin/relion_wrapper.py Refine3D/job068/run_it001_half1_class001_external_reconstruct.star

iter = 001
set CUDA_VISIBLE_DEVICES=0,1,2,3
set CONDA_ENV=spisonet
set ISONET_WHITENING=True
set ISONET_WHITENING_LOW=10
set ISONET_RETRAIN_EACH_ITER=True
set ISONET_BETA=0.5
set ISONET_ALPHA=1
set ISONET_START_HEALPIX=3
set ISONET_ACC_BATCHES=2
set ISONET_EPOCHS=5
set ISONET_KEEP_LOWRES=False
set ISONET_LOWPASS=True
set ISONET_ANGULAR_WHITEN=False
set ISONET_3DFSD=False
set ISONET_FSC_05=False
set ISONET_FSC_WEIGHTING=True
set ISONET_START_RESOLUTION=15.0
set ISONET_KEEP_LOWRES=False
healpix = 3
symmetry = I3
mask_file = None
pixel size = 16.8
resolution at 0.5 and 0.143 are 999.0 and 999.0
real limit resolution to 34.0

Traceback (most recent call last):
  File "/home/cw785/software/spIsoNet/spIsoNet/bin/relion_wrapper.py", line 362, in <module>
    shutil.copy(mrc_unfil, mrc_unfil_backup)
  File "/home/cw785/.conda/envs/spisonet/lib/python3.10/shutil.py", line 417, in copy
    copyfile(src, dst, follow_symlinks=follow_symlinks)
  File "/home/cw785/.conda/envs/spisonet/lib/python3.10/shutil.py", line 254, in copyfile
    with open(src, 'rb') as fsrc:
FileNotFoundError: [Errno 2] No such file or directory: 'Refine3D/job068/run_it001_half1_class001_unfil.mrc'

Traceback (most recent call last):
  File "/home/cw785/software/spIsoNet/spIsoNet/bin/relion_wrapper.py", line 362, in <module>
    shutil.copy(mrc_unfil, mrc_unfil_backup)
  File "/home/cw785/.conda/envs/spisonet/lib/python3.10/shutil.py", line 417, in copy
    copyfile(src, dst, follow_symlinks=follow_symlinks)
  File "/home/cw785/.conda/envs/spisonet/lib/python3.10/shutil.py", line 254, in copyfile
    with open(src, 'rb') as fsrc:
FileNotFoundError: [Errno 2] No such file or directory: 'Refine3D/job068/run_it001_half2_class001_unfil.mrc'

in: /dev/shm/ms725/build/RELION/4.0.1/fosscuda-2020b/relion-4.0.1/src/backprojector.cpp, line 1294

ERROR:

 ERROR: there was something wrong with system call: python /home/cw785/software/spIsoNet/spIsoNet/bin/relion_wrapper.py Refine3D/job068/run_it001_half2_class001_external_reconstruct.star

=== Backtrace  ===

/vast/palmer/apps/avx2/software/RELION/4.0.1-fosscuda-2020b/bin/relion_refine_mpi(_ZN11RelionErrorC1ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES7_l+0x63) [0x4cde53]

/vast/palmer/apps/avx2/software/RELION/4.0.1-fosscuda-2020b/bin/relion_refine_mpi() [0x4598f5]

/vast/palmer/apps/avx2/software/RELION/4.0.1-fosscuda-2020b/bin/relion_refine_mpi(_ZN14MlOptimiserMpi12maximizationEv+0x1c00) [0x5054e0]

/vast/palmer/apps/avx2/software/RELION/4.0.1-fosscuda-2020b/bin/relion_refine_mpi(_ZN14MlOptimiserMpi7iterateEv+0x3a6) [0x506216]

/vast/palmer/apps/avx2/software/RELION/4.0.1-fosscuda-2020b/bin/relion_refine_mpi(main+0x59) [0x4b9ba9]

/lib64/libc.so.6(__libc_start_main+0xe5) [0x14ce8b4ced85]

/vast/palmer/apps/avx2/software/RELION/4.0.1-fosscuda-2020b/bin/relion_refine_mpi(_start+0x2e) [0x4bcd4e]

==================

 RELION version: 4.0.1

 exiting with an error ...

--------------------------------------------------------------------------

MPI_ABORT was invoked on rank 2 in communicator MPI_COMM_WORLD

with errorcode 1.


NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.

You may or may not see output from other processes, depending on

exactly when Open MPI kills them.

--------------------------------------------------------------------------

in: /dev/shm/ms725/build/RELION/4.0.1/fosscuda-2020b/relion-4.0.1/src/backprojector.cpp, line 1294

ERROR:

 ERROR: there was something wrong with system call: python /home/cw785/software/spIsoNet/spIsoNet/bin/relion_wrapper.py Refine3D/job068/run_it001_half1_class001_external_reconstruct.star

=== Backtrace  ===

/vast/palmer/apps/avx2/software/RELION/4.0.1-fosscuda-2020b/bin/relion_refine_mpi(_ZN11RelionErrorC1ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES7_l+0x63) [0x4cde53]

/vast/palmer/apps/avx2/software/RELION/4.0.1-fosscuda-2020b/bin/relion_refine_mpi() [0x4598f5]

/vast/palmer/apps/avx2/software/RELION/4.0.1-fosscuda-2020b/bin/relion_refine_mpi(_ZN14MlOptimiserMpi12maximizationEv+0x191c) [0x5051fc]

/vast/palmer/apps/avx2/software/RELION/4.0.1-fosscuda-2020b/bin/relion_refine_mpi(_ZN14MlOptimiserMpi7iterateEv+0x3a6) [0x506216]

/vast/palmer/apps/avx2/software/RELION/4.0.1-fosscuda-2020b/bin/relion_refine_mpi(main+0x59) [0x4b9ba9]

/lib64/libc.so.6(__libc_start_main+0xe5) [0x14ded0a83d85]

/vast/palmer/apps/avx2/software/RELION/4.0.1-fosscuda-2020b/bin/relion_refine_mpi(_start+0x2e) [0x4bcd4e]

==================

 RELION version: 4.0.1

 exiting with an error ...

--------------------------------------------------------------------------

MPI_ABORT was invoked on rank 1 in communicator MPI_COMM_WORLD

with errorcode 1.


NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.

You may or may not see output from other processes, depending on

exactly when Open MPI kills them.

--------------------------------------------------------------------------

slurmstepd: error: *** STEP 37013163.0 ON r104u09n01 CANCELLED AT 2024-07-04T09:38:41 ***

srun: Job step aborted: Waiting up to 32 seconds for job step to finish.

000/??? sec ~~(,_,">
srun: error: r104u09n01: tasks 0,2-3: Killed

srun: error: r104u09n01: task 1: Killed

srun: error: r104u25n01: tasks 4-6: Killed

YUNTAO LIU

Jul 8, 2024, 5:23:24 PM

to Chunyan Wang, spIsoNet
Hi Chunyan,

I guess this probably happens because spIsoNet requires using a mask in RELION refinement: the wrapper reads the reference mask from optimisation_set.star, and your log shows "No reference mask filename was found" along with mask_file = None.
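If that is the cause, one option is to create a solvent mask from an earlier reconstruction and supply it to Refine3D so the optimisation set carries a reference mask. A minimal sketch using RELION 4's relion_mask_create; all paths, job numbers, and threshold values below are placeholders you would adapt to your own project:

```shell
# Hedged sketch, not your exact commands; paths and values are placeholders.
# Create a soft-edged solvent mask from a previous map:
#   --ini_threshold   binarization threshold (inspect your map in a viewer first)
#   --extend_inimask  grow the binary mask by N pixels
#   --width_soft_edge add a raised-cosine soft edge of N pixels
relion_mask_create \
    --i Refine3D/job042/run_class001.mrc \
    --o MaskCreate/job069/mask.mrc \
    --ini_threshold 0.01 \
    --extend_inimask 3 \
    --width_soft_edge 6

# Then re-run the refinement with the mask, e.g. via the Refine3D GUI's
# "Reference mask" field or on the command line:
#   relion_refine_mpi ... --solvent_mask MaskCreate/job069/mask.mrc
```

The soft edge matters: a hard binary mask can bias the FSC, so a few pixels of cosine falloff is the usual choice.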


--
Best Regards,
Yuntao Liu, Postdoc.

California NanoSystems Institute
University of California Los Angeles