FSC curve interpretation


ashuthe...@gmail.com

Feb 17, 2017, 10:03:07 AM
to EMAN2

Dear All,

I have a couple of questions on the attached FSC curve.

What does the dip at a spatial frequency of ~0.12 mean? Does it mean that there is no data corresponding to that resolution range? And how should I proceed with the refinement from this point?

The curve comes from the refinement of ~100K particles, with a target resolution of 8.5 Å and speed 3 (angular sampling of ~4.29 deg). The data were collected in the defocus range of -2 to -4, and apix is 1.5.
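
For reference, the numbers above can be related to each other with a quick calculation. This is only a minimal sketch: it assumes the x-axis of the FSC plot is in 1/Å (the units are not stated in the post) and that apix is in Å/pixel. The function names are illustrative and are not part of EMAN2.

```python
def resolution_from_frequency(spatial_freq):
    """Convert a spatial frequency (assumed to be in 1/Å) to a resolution in Å."""
    if spatial_freq <= 0:
        raise ValueError("spatial frequency must be positive")
    return 1.0 / spatial_freq


def nyquist_resolution(apix):
    """Finest resolution (Å) that can be represented at a pixel size in Å/pixel."""
    return 2.0 * apix


if __name__ == "__main__":
    apix = 1.5       # Å/pixel, from the post
    dip_freq = 0.12  # position of the dip; units assumed to be 1/Å
    print(f"dip at ~{resolution_from_frequency(dip_freq):.1f} Å")
    print(f"Nyquist limit at {nyquist_resolution(apix):.1f} Å")
```

Under these assumptions the dip sits near 8.3 Å, close to the 8.5 Å target, while the Nyquist limit for this sampling is 3.0 Å.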

Also, can the refinement be considered converged?

This refinement was carried out in steps (although Steve mentioned in one of the threads that this is not needed):

4 iterations with speed 7 against fullres data --> 8 more iterations with speed 5 --> further iterations at speed 5, but the refinement stopped after 3 iterations because of a computer glitch --> refinement continued from the last model, with another glitch after 3 iterations --> 2 more iterations at speed 5 from the last model. The curve comes from the last 2 iterations.

Thanks a lot !

Ashu


FSC.pdf

Paul Penczek

Feb 17, 2017, 10:12:22 AM
to em...@googlegroups.com
The curves are suspect, as they never drop to zero. They indicate you reached the Nyquist frequency in refinement.

Regards,
Pawel
--
--
----------------------------------------------------------------------------------------------
You received this message because you are subscribed to the Google
Groups "EMAN2" group.
To post to this group, send email to em...@googlegroups.com
To unsubscribe from this group, send email to eman2+un...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/eman2

---
You received this message because you are subscribed to the Google Groups "EMAN2" group.
To unsubscribe from this group and stop receiving emails from it, send an email to eman2+un...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

ashuthe...@gmail.com

Feb 17, 2017, 12:32:43 PM
to EMAN2

Thanks Pawel,

Does it mean that there is no data to refine up to the target resolution (8.5 Å)?

Thanks,
Ashu

Paul Penczek

Feb 17, 2017, 1:14:11 PM
to em...@googlegroups.com
No, it means something went wrong with the refinement.

You can look at Penczek, P.A.: Resolution measures in molecular electron microscopy. Methods Enzymol 482, 73-100, 2010.
for a discussion and examples of 'strange' FSC curves.

A possibility is that you set part of the Fourier transform to zero - at the point where you have a dip.

Hard to tell. You should wait for a comment from somebody familiar with the refinement you are trying to use.

Pawel.



Steve Ludtke

Feb 17, 2017, 2:11:45 PM
to em...@googlegroups.com
Hi Ashu, sorry for the slow reply.

The falloff you see at ~8 Å is from the data. The rise you see after the falloff is related to the resolution target and speed parameters, combined with the very large number of particles in the data set. Effectively this produces a pattern in Fourier space which leads to a false correlation effect. Understand that, given the even/odd split, this _should_ not happen even in this case, unless something else is also going on. Typically, when you see this in a situation like yours, it means that a significant fraction of your particles are bad (or not real particles), which causes this sort of "background correlation" effect. It isn't noise bias or initial model bias; rather, it's an algorithmic bias caused by the specific parameters combined with a lot of noise.

There are several specific changes to the refinement which will avoid the bad FSC curve, but this will have no real impact on the structure itself. I would take this as a warning sign that you may have been too liberal in your particle picking process. Have you gone through the e2evalrefine.py --evalptclqual  process on this data?  Take a look at the 2017 tutorial for a description of this process. It can tell you a lot about your particle/data quality.

There is also a very real possibility that you simply have SO many more particles than you need for the resolution you are targeting that the residual high resolution noise permits this effect even with good particles.

If you run another refinement with a higher target resolution, or do a run with speed=1, it should dramatically reduce this problem in the FSC curve, and make the structure a bit better.

Alternatively, if you
grep orientgen refine_XX/0_refine_parms.json

you will see something like 
"orientgen": "eman:delta=5.62437:inc_mirror=0:perturb=0"

If you run your next refinement with the

--orientgen=eman:...
option (fill in the string above), but with perturb=1 instead of 0, it will randomize the orientations slightly and prevent this type of false correlation. This, however, won't improve the structure, just the FSC assessment.
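
As an illustration of the edit described above, here is a minimal sketch that takes the orientgen string reported by grep and switches perturb to 1. It assumes the string keeps the `name:key=value:key=value` form shown in the example; the helper name `enable_perturb` is hypothetical and not part of EMAN2.

```python
def enable_perturb(orientgen):
    """Return the orientgen string with perturb set to 1.

    Assumes the 'name:key=value:key=value' layout seen in the grep output.
    """
    name, *options = orientgen.split(":")
    updated = []
    found = False
    for opt in options:
        key, sep, _value = opt.partition("=")
        if sep and key == "perturb":
            updated.append("perturb=1")  # replace whatever value was there
            found = True
        else:
            updated.append(opt)          # leave every other option untouched
    if not found:
        updated.append("perturb=1")      # add the option if it was missing
    return ":".join([name] + updated)


if __name__ == "__main__":
    current = "eman:delta=5.62437:inc_mirror=0:perturb=0"
    print("--orientgen=" + enable_perturb(current))
```

For the example string above this prints `--orientgen=eman:delta=5.62437:inc_mirror=0:perturb=1`; the delta value should of course be the one found in your own refine_XX directory.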
----------------------------------------------------------------------------
Steven Ludtke, Ph.D.
Professor, Dept. of Biochemistry and Mol. Biol.                Those who do
Co-Director National Center For Macromolecular Imaging            ARE
Baylor College of Medicine                                     The converse
slu...@bcm.edu  -or-  ste...@alumni.caltech.edu               also applies
http://ncmi.bcm.edu/~stevel

Ashu Sharma

Feb 17, 2017, 2:37:55 PM
to em...@googlegroups.com
Hi Steve,

Thanks so much for your reply.

Yes, I have been trying to get rid of bad particles for quite some time, though without any success yet. This is a small protein (~300 kDa), a monomer (therefore no symmetry) with predicted flexible domains. Autopicking wasn't working well (when I started picking particles, the new neural-network-based picker wasn't available yet) and the image quality was not great, so essentially all 100K particles were picked manually. Given the asymmetric and flexible nature of the molecule, I was indeed liberal in picking particles, for fear of losing important projections.

I did try evalptclqual, and the resulting plot is not bilobed. I attach the plot (with default axes) here. In another thread you suggested that small particle size could be one of the reasons for this pattern. Having failed to discard bad particles with this method, I resorted to making class averages for the entire dataset. I'll try to discard bad particles there once class averaging is complete.

This morning, I already started a refinement with higher (5A) target resolution and speed 1. I'll keep you posted.

Thanks,
Ashu
 
Screenshot-1.png

Steve Ludtke

Feb 17, 2017, 11:00:18 PM
to em...@googlegroups.com
Hi Ashu. Actually, that plot isn't quite as bad as I pictured. While you are correct that it lacks the characteristic elbow, in truly bad cases, a linear fit of the points would have zero slope. Yours at least has a clear slope, and proportionally few negative values.  And, actually, I'd be curious how it looks if you reduce the point size. This plot is pretty saturated, but I get the hint of a modest separation between the left and right side of the plot. With a smaller point size it may be easier to distinguish.


Ashu Sharma

Feb 18, 2017, 3:26:43 PM
to em...@googlegroups.com
Hi Steve,

Attached is the plot with reduced point size. As you rightly suspected, there is a trail towards lower correlation values. Would it be possible to select and fetch the good particles from this list? What are the different columns in the ptclfsc_??.txt output file? Could I use it to identify good particles by some user-defined criteria with a script (awk) and then make a list file for refinement?
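
A selection of this kind could be scripted along the following lines. This is only a sketch under stated assumptions: it treats the file as whitespace-separated numeric columns, optionally followed by a `#` comment that identifies the particle. That layout, the meaning of each column, and the helper name `select_particles` are assumptions, not something confirmed in this thread, so the column index and threshold have to be chosen after checking the file documentation.

```python
def select_particles(lines, column, threshold):
    """Keep rows whose value in `column` is >= `threshold`.

    Assumed layout (not confirmed): numeric columns, then an optional
    '# ...' comment identifying the particle.
    Returns a list of (values, label) tuples.
    """
    kept = []
    for line in lines:
        data, _, label = line.partition("#")
        fields = data.split()
        if not fields:
            continue  # blank line, or a line that is only a comment
        values = [float(f) for f in fields]
        if values[column] >= threshold:
            kept.append((values, label.strip()))
    return kept


if __name__ == "__main__":
    # Invented rows, for illustration only; not real program output.
    demo = [
        "# hypothetical header line",
        "0.31 0.12 0.05  # particle 0",
        "-0.02 0.01 0.00  # particle 1",
        "0.18 0.07 0.02  # particle 2",
    ]
    for values, label in select_particles(demo, column=0, threshold=0.1):
        print(label, values)
```

With a real file, the lines would come from `open("ptclfsc_XX.txt")`, and the kept labels would then need to be turned into a list file in whatever form the refinement expects.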


The dip in the FSC curve on refinement with higher target resolution and speed 1 indeed vanished. I have attached the FSC curve for the first two iterations, continued from the previous FSC curve which I posted earlier.
However, as Pawel pointed out, the 2nd-iteration FSC still doesn't cross zero anywhere.

Thanks,
Ashu


Screenshot-3.png
FSC_speed1.pdf

Steve Ludtke

Feb 18, 2017, 3:32:12 PM
to em...@googlegroups.com
Hi Ashu,
Indeed, while this isn't quite the clear separation that we normally hope to see, I think you may get some benefit from making a reduced set with only the best particles. Please look at step 13 in the EMAN2.2 tutorial on the Wiki. It contains a description of how this plot can be used to extract a good subset. The automatic separation into 2 groups may not be optimal in your case since the data isn't well separated, but if you keep reading through the full tutorial you can see how you might manually subdivide the data into, say, 3 or 4 groups, then keep the 1 or 2 groups with the best particles.


Steve Ludtke

Feb 18, 2017, 4:08:10 PM
to em...@googlegroups.com
If you want to work with the ptclfsc_XX.txt file yourself, it is documented here:



Ashu Sharma

Feb 18, 2017, 4:22:22 PM
to em...@googlegroups.com
Thanks Steve,

I just followed step 13 of the tutorial, and it removed nearly 46% of the particles! The plot was symmetric in the 8-15 Å resolution range itself. I'll use this reduced set, with roughly half of the particles, to refine the map and, if required, repeat step 13 or try to make the list myself with the documentation you provided.

Best,
Ashu


Steve Ludtke

Feb 19, 2017, 12:02:47 PM
to em...@googlegroups.com
I should point out that the fact that it removed about half the particles is not really a positive sign, but it is to be expected in this situation. It runs a clustering algorithm on the multicolumn data and splits into 2 groups. If there is a clear break between the data it will use it, but if there is no clear break it will produce a near 50/50 split...
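
The behaviour described above can be illustrated with a toy example. This is a simplified sketch, not the clustering EMAN2 actually runs: it uses a basic 2-means loop on a single column of made-up values, whereas the real procedure works on multicolumn data. The function name `two_means_split` is hypothetical.

```python
def two_means_split(values, iterations=100):
    """Split 1-D values into (low, high) groups with a basic 2-means loop."""
    lo, hi = min(values), max(values)
    if lo == hi:
        return list(values), []  # all values identical: nothing to split
    for _ in range(iterations):
        boundary = (lo + hi) / 2.0
        low = [v for v in values if v < boundary]
        high = [v for v in values if v >= boundary]
        new_lo = sum(low) / len(low)
        new_hi = sum(high) / len(high)
        if (new_lo, new_hi) == (lo, hi):
            break                # centers stopped moving
        lo, hi = new_lo, new_hi
    return low, high


if __name__ == "__main__":
    # Made-up quality scores with no gap: the split lands in the middle.
    continuous = [i / 100 for i in range(100)]
    low, high = two_means_split(continuous)
    print("no clear break:", len(low), "vs", len(high))

    # Made-up scores with a clear gap: the split follows the gap.
    separated = [i * 0.001 for i in range(20)] + [1.0 + i * 0.001 for i in range(80)]
    low, high = two_means_split(separated)
    print("clear break:", len(low), "vs", len(high))
```

With the evenly spread values the two groups come out equal in size, while with the well-separated values the split follows the gap, which is why a near 50/50 result says little about how many particles are actually bad.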

