Automated particle selection - Advanced tomography tool

parrell...@gmail.com

Aug 19, 2022, 6:14:40 PM

to EMAN2

Hi There,

I am using the Automated particle selection tool from this page (https://blake.bcm.edu/emanwiki/EMAN2/e2tomo_more) to pick particles in some tomograms that I reconstructed in etomo. It works great for my particles, and I can take the coordinates on to my next steps. However, I have been running into some problems with it since updating the continuous build; I'll list the e2version.py info below. Here is the situation.

I am using tomograms from etomo (.mrc files). I import the tomograms inside e2projectmanager.py, using either the copy or the move option, with success. But if I then run the "e2spt_boxer_convnet.py --label xxx" command, the tomogram files do not appear in the e2spt_boxer_convnet window.

I can, however, get the tomograms to show up in this window if I do a specific set of operations. After importing, I must run preprocessing (with the defaults) on any tomograms I want to pick particles in. I have to do this even if I do not need or want the preprocessed tomograms and only want to use the originals; if I skip it, the next steps don't work. After the preprocessing is finished, I need to add "apix_unbin":3.86 (or whatever the pixel size is) to the .json file for each tomogram in the "info" directory, even for the original. I add it on the top line, right after the opening brace. After adding the "apix_unbin" information, I need to go into "box training references", add at least two particles to a particle list, and save it. Once I have done these things, running "e2spt_boxer_convnet.py --label xxx" will show my tomograms in the window. I have to do these steps in this specific order for it to work. Unfortunately, in the fairly recent version of the continuous build, opening a tomogram from the list causes a crash with an error that I list below (almost at the bottom). This did not happen with a version of the continuous build from around December of last year.
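For readers who want to script the manual edit described above, here is a minimal sketch. It assumes the files in the "info" directory parse as plain JSON with Python's standard json module; EMAN2 normally reads and writes these files through its own JSON layer, so treat this as an illustration of the idea rather than a supported tool. The function name add_apix_unbin and its arguments are hypothetical.

```python
import json
from pathlib import Path


def add_apix_unbin(info_dir, apix_unbin):
    """Set "apix_unbin" in every .json file under info_dir; return the names changed."""
    changed = []
    for path in sorted(Path(info_dir).glob("*.json")):
        text = path.read_text().strip()
        # Treat an empty file like an empty dictionary.
        data = json.loads(text) if text else {}
        if data.get("apix_unbin") != apix_unbin:
            data["apix_unbin"] = apix_unbin
            path.write_text(json.dumps(data, indent=1))
            changed.append(path.name)
    return changed
```

Calling add_apix_unbin("info", 3.86) from the project directory would touch only files that lack the key or hold a different value, and it leaves any other keys (such as "boxes_3d") in place.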

This is a rather roundabout way to make it work. However, as I said, with a version of the continuous build from around December of last year I was at least able to get things working, and it works great! I can look up that version and let you know later; I don't have it in front of me right now.

Am I doing something fundamentally incorrect while importing the tomograms to cause my trouble? I am going to paste the output from the terminal below for each step and the text from the json file for my example file after each step in case this will help you. Please let me know if I can clarify, I think this was a lot to unpack.

Thank you very much!

Daniel Parrell

e2version.py

EMAN 2.99 ( GITHUB: 2022-08-02 21:05 - commit: 8a24bb112b5837b69fa6c0696c7ae1b88262a637 )

Your EMAN2 is running on: Linux-3.10.0-1160.59.1.el7.x86_64-x86_64-with-glibc2.17 3.10.0-1160.59.1.el7.x86_64

Your Python version is: 3.9.13

IMPORT TOMOGRAMS:

e2projectmanager.py

/opt/EMAN2/continuous-release-2022-08-16/bin/e2projectmanager.py:1799: DeprecationWarning: an integer is required (got type float). Implicit conversion to integers using __int__ is deprecated, and may be removed in a future version of Python.

self.scwidget.setMinimumWidth(self.width()-1.5*self.verticalScrollBar().width())

/opt/EMAN2/continuous-release-2022-08-16/bin/e2projectmanager.py:1800: DeprecationWarning: an integer is required (got type float). Implicit conversion to integers using __int__ is deprecated, and may be removed in a future version of Python.

self.scwidget.setMaximumWidth(self.width()-1.5*self.verticalScrollBar().width())

NOT Writing notes, ppid=-2

Fri Aug 19 16:29:57 2022: e2proc3d.py W1223_G4_130Kx_ts_15_Alimc.rec-bin4.mrc ./tomograms/W1223_G4_130Kx_ts_15_Alimc.rec-bin4.hdf --clip 1024,1024,240 --process normalize

Done.

JSON file:

not created yet.

If I run "e2spt_boxer_convnet.py --label xxx" now

No tomograms listed in window. No terminal message. I close the windows.

Preprocess Tomograms: Use defaults

NOT Writing notes, ppid=-2

1/1

JSON file:

{}

Manually edit JSON file:

{

"apix_unbin":3.86,

}

Box training References: Use defaults

Saving 2D particles to particles/W1223_G4_130Kx_ts_15_Alimc.rec-bin4__particles_00.hdf

Exiting

None

JSON file:

{

"apix_unbin":3.86,

"boxes_3d": [[425.5,586.5,120,"manual",0.0,0],

[654.5,455.5,120,"manual",0.0,0]

],

"class_list": {

"0": {

"boxsize": 64,

"name": "particles_00"

}

Launching "e2spt_boxer_convnet.py --label xxx" again:

If I click on a tomogram in the list of files, the following error appears.

Reading tomograms/W1223_G4_130Kx_ts_15_Alimc_preproc.hdf...

Traceback (most recent call last):

File "/opt/EMAN2/continuous-release-2022-08-16/bin/e2spt_boxer_convnet.py", line 565, in on_list_selected

self.set_data(self.curinfo["name"])

File "/opt/EMAN2/continuous-release-2022-08-16/bin/e2spt_boxer_convnet.py", line 590, in set_data

apix_unbin=js["apix_unbin"]

File "/opt/EMAN2/continuous-release-2022-08-16/lib/python3.9/site-packages/EMAN2jsondb.py", line 875, in get

raise KeyError(key)

KeyError: 'apix_unbin'

Aborted

Muyuan Chen

Aug 19, 2022, 6:58:50 PM

to em...@googlegroups.com

You are correct, and I am quite impressed that you figured out the workaround... The “apix_unbin” key in the info file indeed marks the major difference between tomograms reconstructed in EMAN and imported ones. The idea was that, since we allow tomograms of different sizes to be output from the reconstruction, when you open different versions of the same tomogram (bin2 and bin4, for example) in the boxer, you would find your particles in the correct locations. When you reconstruct the tomogram, the program writes the Apix of the tilt series as “apix_unbin” in the info file, then compares that with the header of the tomogram when boxing particles to decide the correct scaling.
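To make the scaling idea concrete, here is a small sketch of the arithmetic. It is not EMAN2's code; the function names and the convention that positions are measured from a shared origin (the tomogram center) are assumptions for illustration only.

```python
def binning_scale(apix_tomogram, apix_unbin):
    """Ratio of the opened tomogram's pixel size to the unbinned pixel size."""
    if apix_unbin <= 0:
        raise ValueError("apix_unbin must be positive")
    return apix_tomogram / apix_unbin


def rescale_box(xyz, scale_from, scale_to):
    """Map a position picked at one binning onto another binning of the same tomogram."""
    return [c * scale_from / scale_to for c in xyz]
```

For example, with apix_unbin of 3.86, a tomogram whose header reports 15.44 gives a scale of about 4 (bin4). A particle picked 100 voxels from the center in that tomogram would then sit 200 voxels from the center in the bin2 version.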

That said, I don’t really have a quick fix for it. I have probably not touched a tomogram that was not reconstructed in EMAN in the last 3 years, so most of the newer programs lack the compatibility. Adding “apix_unbin” to the info files is certainly necessary. While it might be a good idea to do that automatically at the import tomogram step, it might also break some older code in e2spt_boxer that was implemented before my time (which uses the corner of the tomogram instead of the center as the origin).

Why you need to add two extra particles isn’t very clear, however, and I think it should still be able to display tomograms without that. If you get an error message with “apix_unbin” set but without the two particles, it would be good to let me know. I think the easiest solution for your problem now is that I fix the two-particle issue and add an --addkey option to e2procjson, so you can add the apix_unbin key without manually modifying every json file. Then at least it is a more convenient workaround…

Muyuan

--
--
----------------------------------------------------------------------------------------------
You received this message because you are subscribed to the Google
Groups "EMAN2" group.
To post to this group, send email to em...@googlegroups.com
To unsubscribe from this group, send email to eman2+un...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/eman2

---
You received this message because you are subscribed to the Google Groups "EMAN2" group.
To unsubscribe from this group and stop receiving emails from it, send an email to eman2+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/eman2/985a9a13-99c2-4825-8973-e1cbea7d9been%40googlegroups.com.

Daniel Parrell

Aug 22, 2022, 4:55:26 PM

to em...@googlegroups.com

Hi Muyuan,

I am glad you understood my question. That all makes sense to me, and I did some more testing. Yes, if you can add an option for the apix_unbin so I don't have to manually edit the info files that would make life easier... and after some testing that I will explain below, this may be all I need you to do.

I did a little more testing after your comments, and I think that my file name was causing some additional issues. Due to a quirk in my own preprocessing there was a file name that used two extensions essentially, "xxx.rec-yyy.mrc". You can see the exact format in my previous message notes if you are curious. If I rename that file to xxx.mrc the process becomes much simpler...
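If many files carry such a double extension, the renaming could be scripted. The sketch below only computes the new name; it replaces every dot except the final extension dot with an underscore. That convention, and the function name simplify_name, are assumptions for illustration. Whether the extra dot was really the cause of the trouble is still only a hypothesis.

```python
from pathlib import PurePosixPath


def simplify_name(filename):
    """Replace every dot except the final extension dot with an underscore."""
    p = PurePosixPath(filename)
    stem = p.stem.replace(".", "_")
    return str(p.with_name(stem + p.suffix))
```

With this rule, "W1223_G4_130Kx_ts_15_Alimc.rec-bin4.mrc" becomes "W1223_G4_130Kx_ts_15_Alimc_rec-bin4.mrc", which keeps the binning label and so avoids collisions between different binnings of the same tomogram.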

After importing xxx.mrc files, and preprocessing them, now I just have to manually add "apix_unbin":3.86 to the info JSON files. After doing this, running "e2spt_boxer_convnet.py --label xxx" displays my tomograms in the e2spt_boxer_convnet window and they open! So as you expected there was no need to add particles... and the apix_unbin error is gone. I wonder if the added complexity to my workflow before and the error was due to my possibly unexpected extension. Anyway, I can solve that on my end by using a simpler file name.

I also wonder if the requirement to run preprocessing is just to create a .JSON file for the tomograms in the info directory, that I can then edit.

It is funny to me... I noticed that the preprocessed tomograms do not appear in e2spt_boxer_convnet after these new steps. However, using my trick of adding a particle via "box training references" in the GUI, even to one of the original tomograms, causes the "preproc.hdf" files to be added to the list of files in the e2spt_boxer_convnet window and causes JSON files for them to be generated in the info folder, even for files I did not interact with. If I manually give these files the "apix_unbin":3.86 value, then they will open as well! Just giving you my observations, in case they are helpful.

To summarize, I think adding an option to generate a .json file with a defined "apix_unbin" value will solve the issue for me. I also think some of the added complexity before was a file naming issue that I can solve on my end.

Thank you for your help, and for supporting my use case, I really appreciate your tools!

Daniel Parrell

PS: The version of the program I ran from December was downloaded on 12-2-2021. After more testing, though, I now think some of the problems were caused by my file names, because the program now behaves much as I expected.

Muyuan Chen

Aug 22, 2022, 7:59:08 PM

to em...@googlegroups.com

I committed the new option to GitHub. I have not triggered a continuous build yet, so for now it is only available if you build from source. Alternatively, just grab the latest e2procjson.py from GitHub and replace your version.

Now, running

e2procjson.py info/xxx.json --addkey apix_unbin:1.0

will add the key to the json file. Alternatively, running

e2procjson.py tomograms/xxx.mrc --addkey apix_unbin:1.0 --infofile

will look for the json file corresponding to the mrc in the info folder, create it if needed, and set the apix_unbin key. Giving it something like tomograms/*.mrc should work too.

Muyuan

Daniel Parrell

Aug 24, 2022, 1:52:03 PM

to em...@googlegroups.com

Hi Muyuan,

Thank you for making this change. I had a chance to give it a try, and I am happy to report that it works! Right after importing the tomograms, I ran the program from within my project directory using the following command: "python ./e2procjson.py tomograms/*.mrc --addkey apix_unbin:3.86 --infofile".

I ran this on a whole set of tomograms, and they all have the apix_unbin key now. If I run "e2spt_boxer_convnet.py --label xxx", I can open a tomogram and pick particles.

I noticed one new problem. If I close the first tomogram I opened and try to open it again, or open another tomogram, the second tomogram window comes up white. Here is a screenshot. However, if I close all of the windows and rerun "e2spt_boxer_convnet.py --label DPS", I can open any of them again, and the progress was saved. So this is a small problem with an easy workaround, given how fast the training is.

I also noticed that if you press save before training a neural network, the program crashes with the following error. This is not a problem if you do things in the correct order, though.

Traceback (most recent call last):

File "/opt/EMAN2/continuous-release-2022-08-16/bin/e2spt_boxer_convnet.py", line 831, in save_nnet

self.nnet.save_network("neuralnets/nnet_save.hdf")

AttributeError: 'NoneType' object has no attribute 'save_network'

Aborted

Thank you, I think this is mostly solved!

Daniel

[Attachment: Screen Shot 2022-08-24 at 12.46.54 PM.png]

Muyuan Chen

Aug 24, 2022, 2:02:36 PM

to em...@googlegroups.com

The blank window issue is a known bug introduced after the recent Qt upgrade. We added a workaround for e2display and e2tomo_eval, but apparently forgot about this one. As long as you don’t close the image window and just switch to a new tomogram, it should display correctly. I can add the workaround later too.

Yes, the bug with saving nonexistent networks is quite stupid. I shall change that too…
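A guard of roughly the following shape would avoid that AttributeError. This is only a sketch of the pattern, not the change that went into EMAN2: BoxerSketch is a hypothetical stand-in for the boxer window, and only the attribute name nnet, the method names save_nnet and save_network, and the default path are taken from the traceback.

```python
class BoxerSketch:
    """Hypothetical stand-in for the boxer window; not EMAN2's actual class."""

    def __init__(self):
        self.nnet = None  # no network exists until training has run

    def save_nnet(self, path="neuralnets/nnet_save.hdf"):
        # Refuse politely instead of calling a method on None.
        if self.nnet is None:
            print("No neural network to save yet; train one first.")
            return False
        self.nnet.save_network(path)
        return True
```

Returning a flag rather than raising keeps the GUI alive, so pressing save too early costs nothing more than a message.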

Muyuan

parrell...@gmail.com

Aug 26, 2022, 3:25:37 PM

to EMAN2

Hi Muyuan,

I have been able to train a neural network using these new strategies, pick particles, and extract those particles' coordinates. I have run into one quirk that I think you referred to in an earlier response: the origin of the particle coordinates is at the center rather than the corner of the tomogram. I think I can figure this out using programs like imod trans, but is there something I can do differently to get coordinates with the origin in the corner directly from e2spt_boxer_convnet?

Thank you

Daniel

Muyuan Chen

Aug 26, 2022, 3:38:19 PM

to em...@googlegroups.com

Placing the origin at the tomogram corner would mess up many things, particularly because we allow different binning and clipping of the same tomogram. The easiest way to do the conversion is probably to write a small Python script yourself. Something like:

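from EMAN2 import *
import numpy as np

tomo="tomograms/xx.hdf"
apix=x.xx  # placeholder: fill in the value for your tomogram

# open the info file that belongs to this tomogram and read the box centers
js=js_open_dict(info_name(tomo))
box=np.array([b[:3] for b in js["boxes_3d"]])

# read only the header to get the tomogram dimensions
a=EMData(tomo,0,True)

# rescale, then shift the origin from the tomogram center to the corner
box=box/apix+np.array([a["nx"], a["ny"], a["nz"]])/2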

I just typed this in email, so it may or may not work. It may need some debugging, but that's the basic concept.
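The same arithmetic can be tried without EMAN2 installed. The following pure-Python sketch mirrors the script above under stated assumptions: positions are stored relative to the tomogram center, a single divisor (the script's apix value) brings them to the voxel grid of the tomogram being overlaid, and all three dimensions (nx, ny, nz) are used for the shift. The function name center_to_corner is hypothetical.

```python
def center_to_corner(boxes, scale, shape):
    """Convert center-origin box positions to corner-origin voxel coordinates.

    boxes: iterable of (x, y, z, ...) entries measured from the tomogram center
    scale: divisor that brings stored coordinates to this tomogram's voxel grid
    shape: (nx, ny, nz) of the tomogram being overlaid
    """
    half = [n / 2 for n in shape]
    # Only the first three fields of each entry are coordinates.
    return [[c / scale + h for c, h in zip(box[:3], half)] for box in boxes]
```

For a 1024 x 1024 x 240 tomogram and a scale of 4, a box stored at (0, 0, 0) lands at the middle of the volume, (512, 512, 120), which is a quick sanity check before applying the conversion to real picks.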

Muyuan

parrell...@gmail.com

Aug 26, 2022, 3:52:49 PM

to EMAN2

Hi Muyuan,

Okay, that is what I thought based on what you said before. I think I can go from this. Thank you again. Have a good weekend!

Daniel

Daniel Parrell

Sep 2, 2022, 12:42:02 PM

to em...@googlegroups.com

Hi Again,

Just a brief update in case anyone is following along, and to document this for future readers. I found that if you close the e2spt_boxer_convnet window after applying the neural network to your tomograms, then go into e2projectmanager and open the tomogram in the "box training references" step of the segmentation workflow, the particles are listed as a particle set with your tag. Clicking "file", then "savebox coord" in the tomogram window saves a text file with the coordinates, with the origin in the corner, as I was looking for. The format is perfect for the imod program point2model, and the resulting model overlays with the tomogram in 3dmod. You do have to open the tomogram and save the box coordinates file for each one, but this is easier than some of the other solutions I was working on.

Daniel
