classification using cross-correlation scores

Molly Gravett

Mar 10, 2025, 7:31:39 AM

to EMAN2

Hello,

I am trying to write a Python script to automate assigning microtubule protofilament number and direction by comparing each image to a set of references and using the ccc score, but it doesn't seem to be performing as well as I would expect. Currently, as references I use 2D projections of MT EM maps for which I have inverted the contrast, applied a circular mask, and made a rotational average with symmetry depending on the number of protofilaments (e.g. c12 symmetry for 12 pf).

I have an image stack of rotated and z-projected subtomograms, so that you are looking down the microtubule axis. I have high-pass filtered these and applied a circular mask:

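z_proj = data.process('misc.directional_sum',{'axis':'z'}) #sum the subtomogram along z
z_proj.process_inplace('filter.highpass.gauss', {'cutoff_abs':0.01}) #process() returns a new image, process_inplace() modifies z_proj
z_proj.process_inplace('normalize')
z_proj = z_proj * zmask #apply the 2D circular mask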

The high pass filter seems to help with the translational alignment to the reference.

I then use my check_pf function to compare each image in the stack to every reference image in the reference stack.

def check_pf(stack, references, mask):
    #stack is a list of z projected subtomos
    #references is the list of reference microtubules 11-15pf and alternating in direction
    #mask is the 2D circular mask
    #writer_h is the csv writer, defined outside this function
    top_classes = []
    top_scores = []
    for j in range(len(stack)): #iterate through image stack
        img = stack[j].process('xform.center') #center MT in image, not sure necessary? (process() returns a new image, so keep the result)
        score = []
        best = (-2, -1)
        for item in range(12):
            pf = 11 + item // 2 #works out pf number of reference based on position in stack (integer division, so the symmetry string is 'c11' rather than 'c11.0')
            rotated = img.process('xform.applysym',{'sym':f'c{pf}'}) #apply pf symmetry to image
            reference = references[item].process('xform.center') #center reference MT
            reference = reference.process('normalize') #normalize
            aligned = rotated.align("rotate_translate", reference, {'useflcf':1, 'maxshift':2}) #do you think flcf is better?
            c = aligned.cmp("ccc", reference, {"negative":0, "mask":mask}) #cross correlation coefficient of image to reference
            best = max(best, (c, item))
            score.append(c)
        top_score = best[0] #value of top score
        top_score_index = best[1] + 1 #class top score belongs to (1-based)
        top_classes.append(top_score_index)
        top_scores.append(top_score)
        to_write = [j, top_score_index, top_score]
        writer_h.writerow(to_write) #write image, class, and score to csv file
    return top_classes, top_scores
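
As an aside on the index arithmetic in check_pf: it appears to reduce to integer division. The following is a minimal sketch, assuming the reference stack holds two references per protofilament number, stored consecutively and starting at 11 pf. The helper name `ref_index_to_pf` is hypothetical, and which index parity corresponds to which polarity is an assumption that depends on how the stack was built.

```python
def ref_index_to_pf(item, first_pf=11):
    """Map a position in the reference stack to (pf number, direction flag).

    Assumes two references per protofilament number, stored consecutively,
    starting at first_pf. The direction flag is only the parity of the index;
    which parity means which polarity depends on how the stack was built.
    """
    pf = first_pf + item // 2   # integer division keeps pf an int
    direction = item % 2
    return pf, direction


if __name__ == "__main__":
    for item in range(12):
        pf, direction = ref_index_to_pf(item)
        # The expression from the post, which yields a float in Python 3.
        original = item + (11 - ((item + (item % 2)) / 2))
        print(item, f"c{pf}", direction, original)
```

The two expressions agree numerically for every index; the difference is only that the `/` form produces a float, so an f-string built from it reads 'c11.0' rather than 'c11'.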

Do you think comparing the rotational averages is better than comparing the raw image?

Do you think the flcf would align these better?

Would you recommend a better or different way of doing this?

Let me know if you need any more info!

Many thanks!

Molly

Steve Ludtke

Mar 10, 2025, 8:24:24 AM

to em...@googlegroups.com

Ok, it’s a bit unclear exactly what data you have here and where the references are coming from. I started to write a reply offering suggestions for each of the questions, but without understanding exactly what you’re doing, it’s hard to offer good advice.

- Is the data in which you are trying to determine direction and N tomographic data?

- You’ve already done subtomogram averaging of individual microtubules?

- If so:

  - did you force it to preserve overall orientation of the individual “particles”?

  - I assume you didn’t impose any symmetry on these during averaging?

- Do the reconstructions look good? (might be worth posting some representative images you are trying to classify)

- To be clear, you are making a Z projection of each subtomogram average, each of which comes from one filament? (clearly this eliminates any direction information)

- What defocus range did you use for the data collection?

- What magnification (A/pix)?

- Where did your reference with different N come from?

Posting a few representative images (data and references) would help a lot to explain what you’re trying to achieve.

In answer to the script questions.

- Centering is likely more useful than alignment in this context, but would need to see what your images look like. Obviously alignment won’t work well if N is different...

- No, FLCF likely not useful in this context, and honestly I’m not sure it’s working properly at all.
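
To make the centering point concrete, here is a minimal pure-Python sketch of centering by intensity-weighted centroid. It only illustrates the idea and is not a description of what EMAN2's 'xform.center' does internally. The function name `center_of_mass_offset` is hypothetical, and the image centre is assumed to sit at pixel (nx//2, ny//2).

```python
def center_of_mass_offset(image):
    """Return (dx, dy), the shift that moves the intensity centroid to the image centre.

    image is a list of rows of non-negative floats. Normalised images have a
    mean near zero, so clamp or offset them first; otherwise the total
    intensity is close to zero and the centroid is ill-defined.
    """
    ny = len(image)
    nx = len(image[0])
    total = sum(sum(row) for row in image)
    if total == 0:
        return 0.0, 0.0
    # Intensity-weighted mean x and y coordinates.
    cx = sum(x * v for row in image for x, v in enumerate(row)) / total
    cy = sum(y * v for y, row in enumerate(image) for v in row) / total
    return nx // 2 - cx, ny // 2 - cy


if __name__ == "__main__":
    # Toy 9x9 image with a single bright pixel at x=6, y=3.
    img = [[0.0] * 9 for _ in range(9)]
    img[3][6] = 1.0
    print(center_of_mass_offset(img))
```

For an end-on ring, the centroid does not depend on in-plane rotation, which may be part of why centering alone can be enough here.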

--
--
----------------------------------------------------------------------------------------------
You received this message because you are subscribed to the Google
Groups "EMAN2" group.
To post to this group, send email to em...@googlegroups.com
To unsubscribe from this group, send email to eman2+un...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/eman2

---
You received this message because you are subscribed to the Google Groups "EMAN2" group.
To unsubscribe from this group and stop receiving emails from it, send an email to eman2+un...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/eman2/7c894164-4f63-4c0c-8226-e367f5c7899bn%40googlegroups.com.

Tanvir Shaikh

Mar 10, 2025, 10:14:32 AM

to em...@googlegroups.com

On Mon, Mar 10, 2025 at 12:31 PM Molly Gravett <moll...@gmail.com> wrote:

rotated = img.process('xform.applysym',{'sym':f'c{pf}'}) #apply pf symmetry to image

This step worries me a little bit: enforcing the symmetry. I'm guessing that, if the image is noisy, the higher symmetry will give a better correlation whether it's correct or not.

I haven't played with microtubules in depth (only the IHRSR tutorial), but assuming that the protofilament spacing is invariant, would the diameter be enough to sort them? The equator encodes the radial distribution function, and is invariant to translation and rotation around the helical axis. You'd need alignment information to distinguish up from down still.
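
A real-space analogue of the diameter idea, for end-on projections, would be to reduce each image to a 1D radial profile and read off the ring radius. The following is a minimal pure-Python sketch under that assumption. `radial_profile` and `synthetic_ring` are hypothetical helpers, and the toy Gaussian ring stands in for a real z projection.

```python
import math


def radial_profile(image, cx=None, cy=None):
    """Mean intensity in 1-pixel-wide rings around (cx, cy)."""
    ny, nx = len(image), len(image[0])
    cx = nx // 2 if cx is None else cx
    cy = ny // 2 if cy is None else cy
    nbins = min(nx, ny) // 2
    sums = [0.0] * nbins
    counts = [0] * nbins
    for y in range(ny):
        for x in range(nx):
            r = int(round(math.hypot(x - cx, y - cy)))
            if r < nbins:
                sums[r] += image[y][x]
                counts[r] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def synthetic_ring(size, radius, width=1.5):
    """Toy end-on microtubule: a Gaussian ring of the given radius in pixels."""
    c = size // 2
    return [[math.exp(-((math.hypot(x - c, y - c) - radius) ** 2) / (2 * width ** 2))
             for x in range(size)] for y in range(size)]


if __name__ == "__main__":
    profile = radial_profile(synthetic_ring(64, 14.0))
    peak_radius = max(range(len(profile)), key=profile.__getitem__)
    print("peak radius (pixels):", peak_radius)
```

The profile depends on good centering but not on in-plane rotation, so no rotational alignment is needed before comparing it to references.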

-Tapu Shaikh

Steve Ludtke

Mar 10, 2025, 10:20:57 AM

to em...@googlegroups.com

I’m waiting to see what the images look like. Generally a Z-projection along the length of a microtubule is going to basically be circularly symmetric because you’re averaging over the whole pseudo-helical pitch. Any Cn signal left would be because of averaging over an incomplete pitch, so imposing it isn’t going to do anything useful, but also won’t be especially harmful unless centering is poor. So pretty much classifying based on diameter (which certainly could work for N). I’m concerned that there may be strong CTF effects influencing this attempt, but I may be misunderstanding what she’s working with here... Hoping for more information to go on still...
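
If the classification really does come down to diameter, one way to picture it is as a correlation between 1D radial profiles rather than 2D images. The following is a rough sketch with toy numbers, not a tested method. The reference radii (12.0 pixels at 11 pf, stepping by 1.1 pixels per protofilament) are made up for illustration, and `pearson`, `ring_profile` and `classify_by_profile` are hypothetical helpers.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / math.sqrt(va * vb)


def ring_profile(radius, nbins=32, width=1.5):
    """Toy radial profile: a Gaussian peak at the given radius in pixels."""
    return [math.exp(-((r - radius) ** 2) / (2 * width ** 2)) for r in range(nbins)]


def classify_by_profile(profile, references):
    """references maps a label to a profile; returns (best_label, all_scores)."""
    scores = {label: pearson(profile, ref) for label, ref in references.items()}
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    # Made-up radii, about one pixel apart per protofilament.
    references = {n: ring_profile(12.0 + 1.1 * (n - 11)) for n in range(11, 17)}
    measured = ring_profile(14.3)  # stands in for a measured profile
    best, scores = classify_by_profile(measured, references)
    print("best match:", best)
    for n in sorted(scores):
        print(n, round(scores[n], 3))
```

Printing every score, not only the best one, shows how close the runner-up is, which is a quick check on whether an assignment can be trusted.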


Molly Gravett

Mar 11, 2025, 6:59:51 AM

to em...@googlegroups.com

Hi Steve & Tanvir,

Thanks for your replies.

I am using the raw subtomograms, not a subtomogram average, which I have rotated so that the MT lies along the z axis, based on the picking coordinates. Please find attached the reference image stack (MT_reference_stack_masked_rotational_average_center_normalized.hdf), an example of a z-projected raw subtomogram (MT1.0_ptcl030_zproj.mrc), the same projection after high-pass filtering, normalisation, and masking (MT1.0_ptcl030_zproj_hpf_norm_masked.mrc), its rotational averages (MT1.0_ptcl030_zproj_hpf_norm_masked_rotav_cN.mrc), and the mask (zmask.mrc). The reference image stack comes from simulated MT maps with different PF numbers supplied with the mirp package, which I have inverted and normalised.

Let me know if you need anything else!

Thanks!

Molly


MT1.0_ptcl030_zproj_hpf_norm_masked.mrc

MT1.0_ptcl030_zproj_hpf_norm_masked_rotav_c16.mrc

zmask.mrc

MT1.0_ptcl030_zproj.mrc

MT_reference_stack_masked_rotational_average_center_normalized.hdf

MT1.0_ptcl030_zproj_hpf_norm_masked_rotav_c12.mrc

MT1.0_ptcl030_zproj_hpf_norm_masked_rotav_c14.mrc

MT1.0_ptcl030_zproj_hpf_norm_masked_rotav_c13.mrc

MT1.0_ptcl030_zproj_hpf_norm_masked_rotav_c15.mrc

MT1.0_ptcl030_zproj_hpf_norm_masked_rotav_c11.mrc

Steve Ludtke

Mar 11, 2025, 8:31:53 AM

to em...@googlegroups.com

Hi Molly,

OK, you have a lot of potential problems here:

1) Your templates with different “N” all have the same apparent diameter? While there are variations of helical arrangement as well as number of protofilaments, I would generally expect an increase of N to correspond to an increase in diameter?

2) You didn’t mention defocus in your reply. If you are generating idealized templates (i.e. from a PDB of a simulated filament), then you need to apply an amplitude CTF filter comparable to the tomogram you’re comparing to for optimal results. While “a highpass filter” can mimic a little bit of the CTF effect, it isn’t really a substitute if you’re targeting precision.

3) Tomograms have a missing wedge, meaning they will not be very circularly symmetric along the MT axis for MTs lying parallel to the grid surface. Clearly you can’t really use the trick of imposing symmetry on the MT to help fill in the missing wedge, since detecting the symmetry is your goal, but to some extent the information you are trying to extract simply doesn’t exist in the tomogram. Having said that, in the directions where you do have sampling, there may be sufficient information to pull out something like radius, which should be the determining factor in “N”, but a straight correlation coefficient may not be the optimal strategy. It may be OK, since your data is fixed and the synthetic references are varying (and don’t have missing information), but getting optimal results may require somewhat deeper thought.

4) A prototypical microtubule is ~250 Å in diameter, meaning that distinguishing between different Ns will require measuring diameter accurately with ~20 Å precision. The sampling of your MT data appears to be ~8.7 Å/pix, meaning you will be looking for ~1 pixel (on each side) size changes for an integral change of N. While this is likely detectable with implicit rotational averaging, you’re a little closer to the sampling limit than I’d like if I were doing it, i.e. ideally you’d have higher-mag tomograms.

5) Depending on the cellular situation, there may be other molecules associated with the microtubules, which may impact their apparent diameter, particularly with a sampling of only ~9 Å/pix.
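
The arithmetic behind point 4 can be sketched as follows. This assumes the ~250 Å figure refers to a 13-protofilament microtubule and that protofilament spacing is constant, so diameter scales linearly with N. Both are assumptions for illustration, and `diameter_for_n` is a hypothetical helper. The 8.7 Å/pix sampling is the value quoted above.

```python
DIAMETER_REF_A = 250.0  # Å, from the post; taking this as 13 pf is an assumption
REF_N = 13              # assumed protofilament number for the ~250 Å figure
APIX = 8.7              # Å/pixel, from the post


def diameter_for_n(n, ref_n=REF_N, ref_diameter=DIAMETER_REF_A):
    """Diameter in Å, assuming constant protofilament spacing (diameter scales with N)."""
    return ref_diameter * n / ref_n


if __name__ == "__main__":
    for n in range(11, 17):
        d = diameter_for_n(n)
        print(n, "pf:", round(d, 1), "Å diameter,", round(d / APIX / 2, 2), "px radius")
    step = diameter_for_n(14) - diameter_for_n(13)
    print("step per protofilament:", round(step, 1), "Å,",
          round(step / APIX / 2, 2), "px per side")
```

Under these assumptions each extra protofilament adds about 19 Å to the diameter, or roughly 1.1 pixels per side at 8.7 Å/pix, in line with the ~20 Å and ~1 pixel figures above.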

Anyway, hope that gives you something to start from at least...

cheers

