Extract the raw particles that go into each class-average in EMAN2.1


Junjie Zhang

Feb 5, 2014, 9:59:38 PM
to em...@googlegroups.com
Hi Steve,

Is there a python program or command that we can use to extract the raw particles which go into a class-average from the e2refine2d result?

Best,
Junjie Zhang
Department of Biochemistry and Biophysics
Texas A&M University

Steven Ludtke

Feb 5, 2014, 10:08:50 PM
to em...@googlegroups.com
The graphical program (e2evalparticles) is still broken, though I was working on it today, and it's getting close to being functional again.

The command-line program e2classextract.py can also be used for this purpose.

----------------------------------------------------------------------------
Steven Ludtke, Ph.D.
Professor, Dept of Biochemistry and Mol. Biol.         (www.bcm.edu/biochem)
Co-Director National Center For Macromolecular Imaging        (ncmi.bcm.edu)
Co-Director CIBR Center                          (www.bcm.edu/research/cibr)
Baylor College of Medicine                             






Junjie Zhang

Feb 5, 2014, 10:52:24 PM
to em...@googlegroups.com
Thank you!
--
Regards,
Junjie (George) Zhang

Ximing XU

Jun 11, 2015, 1:40:11 PM
to em...@googlegroups.com
Hi Steven,
  I have a question about e2classextract.py usage. For example, I classified all the particles (say the raw particle file is particles.hdf), and I got a file avg.hdf (5 images of each class, each image is the average image of one class). According to the help information for e2classextract.py, we need an --input_set file, so I made an input_set file myself, say myset.lst, with a format like this:

#---input set start----
0       particles.hdf
1       particles.hdf
2       particles.hdf
3       particles.hdf
4       particles.hdf
5       particles.hdf
... # means a lot of numbers
n       particles.hdf
# to the end, n is the total number of particles 
#---input set end----

and then I prepared a list of classes, cls.lst file (I just have two classes)
# start list
0    avg.hdf
1    avg.hdf
#end list


Then I ran the command:
e2classextract.py --input_set myset.lst --classlist cls.lst

and the error information is:
Error, no class-averages found
Traceback (most recent call last):
  File "/home/biocheming/Programs/EMAN2/bin/e2classextract.py", line 146, in main
    ncls=EMUtil.get_image_count(args[0])
IndexError: list index out of range



Steven, help please. Thanks a lot.
Ximing

Steven Ludtke

Jun 11, 2015, 1:59:24 PM
to em...@googlegroups.com
It is much easier to do this process with e2evalparticles.py where you can mark the classes graphically: http://blake.bcm.edu/emanwiki/EMAN2/Programs/e2evalparticles

If you want to use e2classextract:

The input set file should be the particle set used when generating the class-averages. If you did this with a regular image stack rather than a set, it should be OK to use e2proclst.py to create a .lst for the stack.

You also need a list of classes to extract, as per the help:

  --classlist CLASSLIST
                        Filename of a text file containing a (comma or
                        whitespace separated) list of class average numbers to
                        operate on.

That file should just be a list of numbers.
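
As a small illustration of the class list format described in the help text above (comma or whitespace separated class-average numbers), here is a minimal Python sketch that writes such a file and reads it back. The function names and the file name are made up for the example and are not part of EMAN2; the parsing only mirrors the help text and is not taken from the e2classextract.py source.

```python
import re
from pathlib import Path


def write_classlist(path, class_numbers):
    """Write class-average numbers as a single comma-separated line."""
    Path(path).write_text(",".join(str(n) for n in class_numbers) + "\n")


def read_classlist(path):
    """Read back a comma- or whitespace-separated list of class numbers."""
    text = Path(path).read_text()
    return [int(tok) for tok in re.split(r"[,\s]+", text) if tok]


if __name__ == "__main__":
    # Hypothetical file name, chosen only for this example
    write_classlist("classlist_example.txt", [0, 1])
    print(read_classlist("classlist_example.txt"))
```

Note that the file holds only class numbers; it does not name the class-average file or the particle file.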


----------------------------------------------------------------------------
Steven Ludtke, Ph.D.
Professor, Dept. of Biochemistry and Mol. Biol.                Those who do
Co-Director National Center For Macromolecular Imaging            ARE
Baylor College of Medicine                                     The converse
slu...@bcm.edu  -or-  ste...@alumni.caltech.edu               also applies
http://ncmi.bcm.edu/~stevel

Ximing XU

Jun 11, 2015, 6:18:08 PM
to em...@googlegroups.com
Hi Steven,
  Thanks for your reply. However, the usage of e2classextract.py does not seem logical to me. I think the following command is what you meant:

  e2classextract.py --input_set=particle.hdf --classlist=cls.lst

Am I right?

My understanding of this command is as follows. First, the input_set is the raw particle file that was used for classification, and it is a stacked image file.

Second, the classlist file is a list file that contains only numbers (the class numbers).

Now the problems come. Let's say we have only two files: the raw particles (particles.hdf) and the averaged images of each class (avg.hdf). If the classlist file contains only numbers, how does e2classextract.py know the averaged images of each class? How do the images in the particle file know which classes they go to?


I think that we beginners still need some examples for all of EMAN2's commands, especially for those commands that depend on the EMAN2 database.

Thank you very much
Best,
Ximing  

Ximing XU

Jun 11, 2015, 6:30:53 PM
to em...@googlegroups.com
Hi Steven,
  I know how to do it now:
e2classifykmeans.py particles.hdf --ncls=2 -C
which is much easier.
Best,
Ximing


Steven Ludtke

Jun 11, 2015, 8:28:50 PM
to em...@googlegroups.com
Hi Ximing. Unfortunately, that doesn't do the same thing at all. e2classifykmeans will not do any alignment of the particles; it will just directly measure the symmetry of the unaligned particles and split them into two classes. In most situations this isn't what you want to do at all. Extracting the particles associated with the classes from the original process is the correct thing to do, and it is actually quite straightforward to accomplish once you understand how the file/folder conventions work in EMAN2.

The trick is that in standard EMAN2.1 processing, you do not give an HDF stack directly to e2refine2d or e2refine_easy. You normally create a 'set' (a .lst file) and then use that as input to these programs (standards and reasoning are explained here: http://blake.bcm.edu/emanwiki/EMAN2/DirectoryStructure).  The main reason for this is to avoid making many copies of the particle data, which uses a lot of disk space. Instead, you create .lst files which are just text files representing subsets of particles from other files.

In any case, you CAN give an HDF file directly to these programs, and the output will still be fine. However, if you want to extract subsets of particles, you need to provide a .lst file representing the particles in the .hdf file.  For example:

e2refine2d.py --input particles.hdf ... other options

produces, for example, r2d_01/

Let's say you want to extract the particles associated with classes 1, 3, and 8 from r2d_01/classes_05.hdf:

e2proclst.py particles.hdf --create particles.lst
echo "1,3,8" > goodclasses.txt
e2classextract.py --input_set particles.lst --classlist goodclasses.txt --setname output.lst r2d_01/classes_05.hdf 

This will produce a file called 'output.lst' containing the particles in classes 1, 3, and 8. ".lst" files can be treated as if they were image stacks for input in all EMAN2.1 programs. They do not contain actual images, just pointers to existing images in other files.
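
For anyone who repeats this often, the steps above can be wrapped in a short Python script. This is only a sketch: the function names and the dry_run switch are invented for the example, and the only EMAN2 commands and options it uses are the ones shown in this message (e2proclst.py --create, and e2classextract.py with --input_set, --classlist, and --setname). It assumes the EMAN2 programs are on your PATH, and by default it only prints the commands without running anything.

```python
import shlex
import subprocess
from pathlib import Path


def build_commands(stack, particle_set, classlist, class_averages, output_set):
    """Return the two EMAN2 command lines from this thread as argument lists."""
    return [
        # Create a .lst that points at every particle in the HDF stack
        ["e2proclst.py", stack, "--create", particle_set],
        # Extract the particles belonging to the classes named in `classlist`
        ["e2classextract.py", "--input_set", particle_set,
         "--classlist", classlist, "--setname", output_set, class_averages],
    ]


def extract_classes(class_numbers,
                    stack="particles.hdf",
                    particle_set="particles.lst",
                    classlist="goodclasses.txt",
                    class_averages="r2d_01/classes_05.hdf",
                    output_set="output.lst",
                    dry_run=True):
    """Print the commands; write the class list and run them unless dry_run."""
    commands = build_commands(stack, particle_set, classlist,
                              class_averages, output_set)
    if not dry_run:
        # Same content as: echo "1,3,8" > goodclasses.txt
        Path(classlist).write_text(",".join(map(str, class_numbers)) + "\n")
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop on the first failure
    return commands


if __name__ == "__main__":
    extract_classes([1, 3, 8])  # dry run: prints the commands only
```

Calling extract_classes([1, 3, 8], dry_run=False) would write goodclasses.txt and then run both programs in order, stopping if either one fails.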