Problem with e2refine2d.py


huyb

Jan 23, 2013, 6:55:28 AM
to EMAN2
I imported a stack of particles into EMAN2 using e2projectmanager.py
and have already built a set. I tried to start e2refine2d.py
(Generate Classes) with a reference containing several class averages,
since starting without a reference gives me only a few good classes.
However, I always get an error message from EMAN2. The command line
is this:

e2refine2d.py --iter=1 --input=bdb:sets#Y_prealn --ncls=80 --naliref=5 \
  --initial=ref_20.spi --nbasisfp=5 --simalign=rotate_translate_flip \
  --simaligncmp=ccc --simraligncmp=dot --simcmp=ccc --classkeep=0.85 \
  --classiter=1 --classalign=rotate_translate_flip --classaligncmp=ccc \
  --classraligncmp=ccc --classaverager=mean --classcmp=ccc \
  --classnormproc=normalize.edgemean

The error msgs are:
Beginning image sort/alignment
Traceback (most recent call last):
  File "/g/software/linux/pack/eman2-20120822/bin/e2stacksort.py", line 325, in <module>
    main()
  File "/g/software/linux/pack/eman2-20120822/bin/e2stacksort.py", line 115, in main
    b=sortstackheader(a,options.nsort,options.byheader)
  File "/g/software/linux/pack/eman2-20120822/bin/e2stacksort.py", line 312, in sortstackheader
    stack.sort(key=lambda B:B.get_attr(header))
  File "/g/software/linux/pack/eman2-20120822/bin/e2stacksort.py", line 312, in <lambda>
    stack.sort(key=lambda B:B.get_attr(header))
RuntimeError
Error running:
e2stacksort.py bdb:r2d_02#allrefs_01 bdb:r2d_02#allrefs_01 --byheader=class_ptcl_qual

Running without the initial reference works fine. What did I do wrong?

Ludtke Steven

Jan 23, 2013, 8:28:01 AM
to em...@googlegroups.com
It expects the provided class-averages to have a header value that yours don't have. Generally giving it starting class-averages isn't a very good approach. If you aren't getting very good classification, it's probably a better strategy to adjust the parameters.
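
To illustrate the failure mode, here is a minimal sketch in plain Python. The `FakeImage` class and the `missing_header` helper are hypothetical stand-ins, not EMAN2 API; they only mimic what the traceback shows, namely that `get_attr()` raises a RuntimeError when the requested header key (`class_ptcl_qual` here) is absent. Sorting a stack by that key fails as soon as one image lacks it, so checking first shows which references are the problem.

```python
# Hypothetical stand-in for an image with a header dictionary.
# This is NOT the EMAN2 image class; it only mimics the behaviour seen
# in the traceback, where get_attr() raises when the key is absent.
class FakeImage:
    def __init__(self, **header):
        self.header = dict(header)

    def get_attr(self, key):
        if key not in self.header:
            raise RuntimeError(f"header key not found: {key}")
        return self.header[key]


def missing_header(stack, key):
    """Return the indices of images in `stack` that lack header `key`."""
    missing = []
    for i, img in enumerate(stack):
        try:
            img.get_attr(key)
        except RuntimeError:
            missing.append(i)
    return missing


# Two averages carry the key (as averages from a previous run would);
# the third, standing in for an externally imported reference, does not.
stack = [
    FakeImage(class_ptcl_qual=0.8),
    FakeImage(class_ptcl_qual=0.3),
    FakeImage(),
]

bad = missing_header(stack, "class_ptcl_qual")
if bad:
    print("images without class_ptcl_qual:", bad)
else:
    # Only safe to sort once every image has the key.
    stack.sort(key=lambda b: b.get_attr("class_ptcl_qual"))
```

With the made-up stack above, the check reports the third image instead of failing inside the sort.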

--normproj You probably should specify this option. Without it, e2refine2d will intentionally give you a bunch of bad classes, as effectively the SNR of the image becomes part of the classification. If you use --normproj, you will get more 'good' class-averages, but each one will have some of the 'bad' particles. Running without this option can be used to help reduce the population of overly noisy particles from your data set.
--iter=1 This may have only been a test, but generally ~10 is probably a better value, particularly if you're having problems
--nbasisfp=5 In many cases this is too small, and you should increase to 10 or even 15.
--naliref=5 This may be sufficient or not.
--classiter=1 You will generally get faster convergence with a larger value here, perhaps 5.
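
Put together, the suggestions above might translate into a command like the one assembled below. This is only a sketch: the values 10, 10 and 5 are assumptions picked from the ranges mentioned above rather than tuned settings, `--initial` is dropped, all other options are copied from the original command, and nothing is executed.

```python
import shlex

# Options copied unchanged from the original command line.
unchanged = [
    "--input=bdb:sets#Y_prealn",
    "--ncls=80",
    "--naliref=5",
    "--simalign=rotate_translate_flip",
    "--simaligncmp=ccc",
    "--simraligncmp=dot",
    "--simcmp=ccc",
    "--classkeep=0.85",
    "--classalign=rotate_translate_flip",
    "--classaligncmp=ccc",
    "--classraligncmp=ccc",
    "--classaverager=mean",
    "--classcmp=ccc",
    "--classnormproc=normalize.edgemean",
]

# Options changed per the suggestions above. The exact values are
# assumptions picked from the ranges mentioned, not tuned settings.
changed = [
    "--normproj",     # newly added
    "--iter=10",      # was 1
    "--nbasisfp=10",  # was 5
    "--classiter=5",  # was 1
]

# --initial=ref_20.spi is deliberately left out.
args = ["e2refine2d.py"] + unchanged + changed

# Quote the arguments for a shell and print them; nothing is run here.
command = shlex.join(args)
print(command)
```

Building the command as a list keeps each option on its own line, which makes it easier to see what differs between runs.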

If it still isn't giving you the results you expect, it would help to see a screenshot of what you are getting...

huyb

Jan 24, 2013, 5:16:43 AM
to EMAN2
Dear Steven,

Your suggestion definitely improved the classification. Is there any
way to improve it further, such as using the good references and
running the classification again?



Ludtke Steven

Jan 24, 2013, 9:12:54 AM
to em...@googlegroups.com
This process is really only designed for three purposes:
1) Data assessment - to get a feel for any possible symmetry in your particles, see some of the orientation distributions present in the data, and look for possible conformational flexibility
2) Initial models - generating a few (5-20) class-averages to use in building an initial model
3) Bad data - Particles in 'bad' classes can be automatically marked and excluded from the data set for further processing.
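
As an illustration of purpose 3, here is a minimal sketch in plain Python. The class assignments and the set of 'bad' class numbers are made-up data, and no EMAN2 call is used: once each particle has a class number, excluding the particles in 'bad' classes amounts to a simple filter.

```python
# Hypothetical classification result: particle index -> class number.
particle_class = {0: 3, 1: 7, 2: 3, 3: 12, 4: 7, 5: 1}

# Classes judged 'bad' by inspection (made-up numbers for illustration).
bad_classes = {7, 12}

# Particles to keep for further processing, and those to exclude.
keep = sorted(p for p, c in particle_class.items() if c not in bad_classes)
excluded = sorted(p for p, c in particle_class.items() if c in bad_classes)

print("keep:", keep)
print("excluded:", excluded)
```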

It is not designed to produce large sets of robust, reproducible class-averages. During 3-D refinement, class-averages are generated via projection matching, and these averages aren't used. That is, from an EMAN2 perspective, there is very little point in putting a lot of time and effort into optimizing these class-averages. If you show me a screenshot of your class-averages and your eigenimages, I may be able to offer a suggestion to further tweak the parameters, but otherwise, I would argue it isn't worth spending a lot of time on. If you're looking for structural variability, then subclassifying the particles that went into one or more of your existing classes is the typical approach.

Paul Penczek

Jan 24, 2013, 10:34:58 AM
to em...@googlegroups.com
Hi,

ISAC was designed to deliver high-quality 2D class averages, or at least alert one to the limitation of the data set:

Yang Z, Fang J, Chittuluru J, Asturias FJ, Penczek PA. Iterative stable alignment and clustering of 2D transmission electron microscope images. Structure 2012, 20:237-47. doi: http://dx.doi.org/10.1016/j.str.2011.12.007. PMID: 22325773.


http://sparx-em.org/sparxwiki/sxisac


The code is part of the installed EMAN2 package.


Regards,

Pawel Penczek.




huyb

Jan 26, 2013, 9:08:14 AM
to em...@googlegroups.com
Thanks a lot, Pawel. I will try ISAC, but the computational time seems quite intimidating. We have a PBS cluster, so I'm not sure how easy it is to set up an MPI parallel run.

I have another question regarding the class averages.

My imported particles were already pre-aligned in SPIDER. However, the class averages that come out seem to be quite random in orientation. Is this not a bad thing? Are the class averages supposed to be reasonably aligned? I thought e2refine2d.py also ran e2stacksort.py between iterations.

One of my colleagues ran e2refine2d.py on her dataset. The class averages are beautiful but not aligned.

Could you explain a bit about what happens to the particles in e2refine2d.py at each iteration?

Best,
Huy

Paul Penczek

Jan 26, 2013, 12:49:11 PM
to em...@googlegroups.com
Hi,

I am not sure what PBS is.  It is very easy to set up MPI, just try to follow instructions here:

If it works, it takes a few minutes. If it does not, please send me a description of the problem.

There is a substantial difference between 2D class averages obtained by ISAC and those obtained using older methods, including SPIDER and EMAN2.
You cannot compare them. Class averages from e2refine2d and other similar approaches are rather random, by which I mean
the content of the classes is incidental.

I am not sure what you mean by "intimidating". Note that you have already spent much more time trying this and that, apparently without being satisfied
with the results.

In what sense are the class averages in random orientations? They should be more or less centered, but in-plane angles are indeed undefined,
aka "random". However, it is trivial to run standard 2D alignment (sxali2d) on the class averages to get them more or less in order.

I cannot comment on the inner workings of e2refine2d; it was written by Steve.

Regards,
Pawel



huyb

Jan 28, 2013, 3:42:00 AM
to em...@googlegroups.com
Dear Pawel,

By intimidating, I meant that I was thinking of doing a test run on our 8-core workstation, which would probably take quite long. But I agree with you that I should spend the time to set up ISAC with MPI on our PBS cluster (a cluster with a queuing system).

Sorry, I sent the class-average image to Steven privately earlier but did not post it publicly here. My complex is Y-shaped. There is flexibility, but most of the particles still look like a Y. I pre-aligned them roughly, so I expected that after classification the class averages would differ subtly but still be in roughly the same orientation as after the pre-alignment step.

Best,
Huy

Paul Penczek

Jan 28, 2013, 10:50:19 AM
to em...@googlegroups.com
Hi,

Indeed, from this point of view it might be intimidating.
However, it's been a while since I saw an EM project that could be completed on an 8-CPU workstation...

Regards,
Pawel.
