cryoET 3D classification

Esther Bullitt

Apr 7, 2026, 10:29:29 AM
to EMAN2
Hi,
Is there somewhere I can read about how to choose the 3D classification scheme for a cryoET dataset? EMAN2 includes ortho-proj (which uses MSA), PCA-based, and multi-reference 3D classification.

Do I try them all, or is there a way to make an informed decision?

thanks very much,
Esther

Muyuan Chen

Apr 7, 2026, 11:49:22 AM
to em...@googlegroups.com
Unfortunately I don’t think there is one... A main reason is that I keep making new programs and changing existing ones.

I often start from e2spt_sgd_new.py. Its --ncls and --classify options do classification, and it can also load from existing aligned particles. More recently it also supports --realspace and --mask, which let you focus classification on local regions. Notably, it only produces structures for multiple classes; it does not assign every particle to a class. For that you need e2spt_refinemulti_new.py. This one also works with or without initial alignment, with or without a mask, and it splits the particles into their corresponding classes. It also takes subtilt refinement results for higher-resolution classification. However, it converges much more slowly, so it is better to run it with references. There is also now e2gmm_spt_heter_refine.py, which is probably better at continuous motion of local regions.

Muyuan



--
--
----------------------------------------------------------------------------------------------
You received this message because you are subscribed to the Google
Groups "EMAN2" group.
To post to this group, send email to em...@googlegroups.com
To unsubscribe from this group, send email to eman2+un...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/eman2

---
You received this message because you are subscribed to the Google Groups "EMAN2" group.
To unsubscribe from this group and stop receiving emails from it, send an email to eman2+un...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/eman2/310239d2-6a55-4d89-93fb-373aeebaf10an%40googlegroups.com.

Steve Ludtke

Apr 7, 2026, 12:42:23 PM
to em...@googlegroups.com
You left out the GMM-based dynamics classification. That's actually probably the one I'd use as a first attempt nowadays, unless you are confident that your data should divide into N<5 discrete classes (and you know N). Its use on CryoET data is still a bit experimental, but it worked pretty well in some testing I did.

Esther Bullitt

Apr 7, 2026, 2:51:41 PM
to EMAN2
The GMM-based dynamics classification looks promising.
For those interested, here is the documentation website:  https://blake.bcm.edu/emanwiki/doku.php?id=eman2:e2gmm

Thanks!

Steve Ludtke

Apr 7, 2026, 3:11:16 PM
to em...@googlegroups.com
I'd actually recommend starting with the video tutorial, which explains some of the interface. It's a few years old now and probably needs an update, but it will at least cover the basics:

The earlier videos in that series also have good tutorials on tomography and single particle work (as of 2022). Most of it is still relevant.

Bullitt, Esther

8:05 AM
to em...@googlegroups.com
  1.  $ e2version.py
EMAN 2.99.72 ( GITHUB: 2026-03-06 12:57 - commit: 1bb4904d8 )
Your EMAN2 is running on: Linux-6.5.0-28-generic-x86_64-with-glibc2.39 6.5.0-28-generic
Your Python version is: 3.12.13

  2.  e2spt_refine_new.py --ptcls=sets/extracted_tomobox.lst --ref=clip72_CS20_mask.mrc --startres=30.0 --goldstandard --sym=H1:3:111.8:8.95 --iters=p3,t2,p,t,r,d --keep=0.95 --localrefine --maxres=0.0 --minres=0.0 --parallel=thread:4 --threads=6

  3.  tomography, ~500K particles, 72 pix^3

We have been having problems with the computer hanging, and needing to be rebooted by pressing the power button (it works on smaller datasets).
The first problem was a full SSD drive, from old cached data (NOT eman2), which we deleted.

Our IT person wrote me to say:
Something is wrong with your EMAN2 installation or with how you are running it. You were running out of RAM; just as a test I created a 200 GB swap file, and over 100 GB of it is already in use. EMAN2 is supposed to create its own cache, but it does not.

He created a swap file on the SSD, but I don't know how to get EMAN2 to use it. Is there a configuration file somewhere that I can edit?

Thanks!
Sincerely,
Esther
-- 

Esther Bullitt, PhD

Professor of Pharmacology, Physiology & Biophysics

Director, Cryogenic Electron Microscopy Core

Boston University Chobanian & Avedisian School of Medicine

700 Albany St, W-302

Boston, MA. 02118

Steve Ludtke

4:07 PM
to em...@googlegroups.com
Ok, it's hard to say with absolute certainty what's causing the excessive memory issue from this information. Is the problem happening during the orientation refinement or is it happening during the 3-D reconstruction at the end of each iteration?  

Some things to consider:
- If the problem is happening at the 3-D reconstruction stage, try adding the --m3dthread option

- Helical symmetry can sometimes cause some peculiar problems. With your specifications, it looks like you would have roughly 3*72/8.95 = 24 symmetric copies. If you have a 2 degree tilt step, this effectively means you're working with 500,000 * 60 * 24 = 720,000,000 2-D particle images. 
  - This is far in excess of the 7 digit precision supported by single precision floating point numbers, and can potentially produce severe mathematical artifacts.
  - Setting the previous point aside, this is also far more particles than are likely to contribute usefully to a 3-D structure without subdividing into many classes. This is one of the reasons why Muyuan's picking approach focused on random sites along a filament. Once you exceed a certain number of particles there just isn't anything useful to be gained (other than noise bias).

- It is conceivable that there is a bug somewhere, possibly relating to the helical symmetry, causing memory leaks of some sort.  To help figure out what's going on, launch the job, then run "top" in a separate window, and press "M". This will sort in order of memory usage. Watch the job as it starts to run. If you see the memory usage gradually rising (after the initial program has fully started running) then we need to know which program it is. 

- I would consider an initial run with maybe 10,000 particles to see what happens, e.g.:
head -10000 sets/extracted_tomobox.lst >sets/test_10k.lst
e2spt_refine_new.py --ptcls=sets/test_10k.lst (remaining options as in your original command)

- I would omit the "r" and "d" iterations at least initially. They rarely do anything very useful, and are very time consuming.
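
The particle-count estimate in the helical symmetry point above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic quoted in the thread, not anything EMAN2 runs itself. The variable names are made up for illustration, and the comparison against 2^24 is one assumed way of reading the "7 digit precision" remark: above 2^24, single precision can no longer represent every integer exactly.

```python
import struct

# Figures quoted in the thread.
n_particles = 500_000   # ~500K particles
box_px = 72             # box size in pixels
rise = 8.95             # last field of --sym=H1:3:111.8:8.95
multiplier = 3          # factor of 3 used in the estimate
n_tilts = 60            # tilt images per particle used in the estimate (2 degree tilt step)

# Roughly 3 * 72 / 8.95 symmetric copies per particle, rounded to a whole number.
sym_copies = round(multiplier * box_px / rise)
total_images = n_particles * n_tilts * sym_copies


def as_float32(x):
    """Round-trip a number through IEEE 754 single precision."""
    return struct.unpack("f", struct.pack("f", x))[0]


# Every integer up to 2**24 is exactly representable in float32; beyond it, gaps appear.
float32_exact_limit = 2 ** 24

print(f"symmetric copies      : {sym_copies}")
print(f"effective 2-D images  : {total_images:,}")
print(f"float32 exact integers: up to {float32_exact_limit:,}")
print(f"ratio                 : {total_images / float32_exact_limit:.1f}x")
# 2**24 + 1 cannot be stored in float32 and collapses back to 2**24.
print(as_float32(float32_exact_limit + 1) == float32_exact_limit)
```

Changing n_particles to 10,000 in the same sketch shows how far the suggested test subset brings the effective image count down.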

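For the 10,000-particle test run suggested above, note that head counts every line, so any header lines at the top of the list file count toward the 10,000. If an exact particle count matters, something like the following could be used instead. It is a minimal sketch that assumes the .lst file is plain text, with header or comment lines starting with "#" followed by one particle per line; the function names are hypothetical and not part of EMAN2.

```python
def subset_lst_lines(lines, n_particles):
    """Keep every '#' header line plus the first n_particles particle lines."""
    out = []
    kept = 0
    for line in lines:
        if line.startswith("#"):
            out.append(line)          # header/comment line: always kept
        elif kept < n_particles:
            out.append(line)          # particle entry, copied unchanged
            kept += 1
    return out


def write_lst_subset(src, dst, n_particles):
    """Write a subset list file and return the number of particle lines written."""
    with open(src) as fin:
        subset = subset_lst_lines(fin, n_particles)
    with open(dst, "w") as fout:
        fout.writelines(subset)
    return sum(1 for line in subset if not line.startswith("#"))
```

Usage would look like write_lst_subset("sets/extracted_tomobox.lst", "sets/test_10k.lst", 10000). For a quick test the plain head command is just as good; the only difference is whether header lines are counted.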
