troubleshooting neural network particle picking


Zachary Maben

Sep 11, 2018, 10:35:27 AM
to EMAN2

Hello,


I tried to pick particles using the neural network, but it found zero particles.  I’ve copied the training outputs and the outputs from autoboxing below.


In trying to troubleshoot I realized I had a few questions:


-How do I assess if the neural network training was successful?  The costs and learning rates change between training epochs, but how much change is sufficient?


-How many reference boxes of each category are needed/preferred/typically used?


-Do good reference boxes need to contain single particles?  My particles are fairly dense in the images and it is hard to place boxes without somewhat including other particles.




Thank you for your help.


Sincerely,

Zach Maben

Stern Lab

UMass Medical School





e2version.py


EMAN 2.21a final (GITHUB: 2018-02-14 08:53 - commit: c573a6e )

Your EMAN2 is running on: Mac OS 10.11.6 x86_64

Your Python version is: 2.7.13



e2projectmanager.py -> e2boxer.py GUI


Importing dependencies...

Setting up network for particle picking ...

Shape of neural networl input:  (10, 1, 64, 64)

Pre-processing particles...

Now Training...

Training epoch 0, cost  -4.05604119605 , learning rate 0.00096

Training epoch 1, cost  -4.63877650128 , learning rate 0.0009216

Training epoch 2, cost  -4.83559883175 , learning rate 0.000884736

Training epoch 3, cost  -5.02184668002 , learning rate 0.00084934656

Training epoch 4, cost  -5.14526304446 , learning rate 0.0008153726976

Training epoch 5, cost  -5.23339055521 , learning rate 0.000782757789696

Training epoch 6, cost  -5.31457074115 , learning rate 0.000751447478108

Training epoch 7, cost  -5.38761362516 , learning rate 0.000721389578984

Training epoch 8, cost  -5.4332089344 , learning rate 0.000692533995824

Training epoch 9, cost  -5.48386359553 , learning rate 0.000664832635992

Training epoch 10, cost  -5.52086509277 , learning rate 0.000638239330552

Training epoch 11, cost  -5.57545953628 , learning rate 0.00061270975733

Training epoch 12, cost  -5.63242122387 , learning rate 0.000588201367037

Training epoch 13, cost  -5.68789606062 , learning rate 0.000564673312355

Training epoch 14, cost  -5.71317328855 , learning rate 0.000542086379861

Training epoch 15, cost  -5.75839212495 , learning rate 0.000520402924666

Training epoch 16, cost  -5.79777988743 , learning rate 0.00049958680768

Training epoch 17, cost  -5.80663344365 , learning rate 0.000479603335373

Training epoch 18, cost  -5.83798400777 , learning rate 0.000460419201958

Training epoch 19, cost  -5.90723518193 , learning rate 0.000442002433879

Saving the trained net to nnet_pickptcls.hdf...

Setting up  network for bad particle exclusion ...

Shape of neural networl input:  (10, 1, 64, 64)

Pre-processing particles...

Now Training...

Training epoch 0, cost  0.508335282476 , learning rate 0.0048

Training epoch 1, cost  0.510935925278 , learning rate 0.004608

Training epoch 2, cost  0.510931222807 , learning rate 0.00442368

Training epoch 3, cost  0.51092670944 , learning rate 0.0042467328

Training epoch 4, cost  0.51092237747 , learning rate 0.004076863488

Training epoch 5, cost  0.510918219421 , learning rate 0.00391378894848

Training epoch 6, cost  0.51091422821 , learning rate 0.00375723739054

Training epoch 7, cost  0.510910397166 , learning rate 0.00360694789492

Training epoch 8, cost  0.510906720047 , learning rate 0.00346266997912

Training epoch 9, cost  0.510903190467 , learning rate 0.00332416317996

Training epoch 10, cost  0.510899802557 , learning rate 0.00319119665276

Training epoch 11, cost  0.510896550638 , learning rate 0.00306354878665

Training epoch 12, cost  0.510893429228 , learning rate 0.00294100683518

Training epoch 13, cost  0.510890432931 , learning rate 0.00282336656178

Training epoch 14, cost  0.510887556742 , learning rate 0.0027104318993

Training epoch 15, cost  0.510884795826 , learning rate 0.00260201462333

Training epoch 16, cost  0.510882145539 , learning rate 0.0024979340384

Training epoch 17, cost  0.510879601549 , learning rate 0.00239801667686

Training epoch 18, cost  0.5108771595 , learning rate 0.00230209600979

Training epoch 19, cost  0.51087481531 , learning rate 0.0022100121694

Saving the trained net to nnet_classify.hdf...

Loading the Neural Net...

Loading the Neural Net...

Starting on img 0...

Starting on img 1...

Starting on img 2...

Starting on img 3...

Starting on img 4...

12) 0 boxes -> micrographs/A-g2_00000_ali.hdf

Starting on img 5...

12) 0 boxes -> micrographs/A-g2_00100_ali.hdf

Starting on img 6...

12) 0 boxes -> micrographs/A-g2_00200_ali.hdf

Starting on img 7...

12) 0 boxes -> micrographs/A-g2_00300_ali.hdf

Starting on img 8...

12) 0 boxes -> micrographs/A-g2_00400_ali.hdf

Starting on img 9...

12) 0 boxes -> micrographs/A-g2_00500_ali.hdf

Starting on img 10...

12) 0 boxes -> micrographs/A-g2_00700_ali.hdf

Starting on img 11...

12) 0 boxes -> micrographs/A-g2_00600_ali.hdf

Starting on img 12...

12) 0 boxes -> micrographs/A-g2_00800_ali.hdf

12) 0 boxes -> micrographs/A-g2_00900_ali.hdf

12) 0 boxes -> micrographs/A-g2_01000_ali.hdf

12) 0 boxes -> micrographs/A-g2_01100_ali.hdf

12) 0 boxes -> micrographs/A-g2_01182_ali.hdf

Steve Ludtke

Sep 11, 2018, 3:59:30 PM
to em...@googlegroups.com, Muyuan Chen
Hi Zach,
Learning how to train the network is the hardest step, and the most difficult thing to teach. The "good" boxes need to contain only particles (no contamination, etc.). It is better if you find well-isolated particles for this, but it can be OK to have some particles impinging on the edge of the box.

The background references need to be exactly that: empty ice with no particles present. You are training the network to discriminate between nothing and particles with these references, so if the "nothing" includes density from particles, the training process will try to learn to discriminate between the particles in the good references and the particles in the background references, instead of between particles and background. Please remember that you can and should draw both particles and references from more than one micrograph. Give it a couple of examples from each of several images.

The "bad" reference particles should be high-contrast objects which are not particles. Do not include low contrast background regions with nothing in them in this set. This is for things like ice contamination and the edge of the carbon film.

If you still have problems, it would be very useful to see screenshots of your "good", "background" and "bad" particles. 

--------------------------------------------------------------------------------------
Steven Ludtke, Ph.D. <slu...@bcm.edu>                      Baylor College of Medicine 
Charles C. Bell Jr., Professor of Structural Biology
Dept. of Biochemistry and Molecular Biology                      (www.bcm.edu/biochem)
Academic Director, CryoEM Core                                        (cryoem.bcm.edu)
Co-Director CIBR Center                                    (www.bcm.edu/research/cibr)




Muyuan Chen

Sep 11, 2018, 4:30:43 PM
to EMAN2
It seems that the second network, the one for bad particle exclusion, failed in this case. It is hard to know why from the output alone.
Normally the "cost" for the first network ends below -5, and the cost for the second ends at least below 0.1. The values vary between datasets, but if the cost does not drop at all, or increases as it does here, something has to be wrong.
You can check the result by looking at "trainout_pickptcl.hdf" and "trainout_classify.hdf" in the project folder. In "trainout_pickptcl.hdf", the image next to each particle should show a small bright spot, while the images next to non-particles should be blank. In "trainout_classify.hdf", there should be very dark spots corresponding to contamination features, and the image should be bright elsewhere.
Usually 5-50 particles per reference set is enough.
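
As a rough way to apply that rule of thumb to a pasted log, here is a minimal sketch in plain Python. It only parses the "Training epoch ..." lines in the format shown earlier in this thread; `parse_training_runs` and `summarize` are hypothetical helper names, not part of EMAN2. It also reports the ratio between consecutive learning rates. In the log above that ratio is 0.96 for both networks, including the one that failed, so the learning rate appears to follow a fixed decay schedule and says nothing about whether training worked. The cost is the number to watch.

```python
import re

# Matches lines such as:
#   Training epoch 3, cost  -5.02184668002 , learning rate 0.00084934656
NUM = r"-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?"
EPOCH_RE = re.compile(
    r"Training epoch\s+(\d+),\s*cost\s+(" + NUM + r")\s*,"
    r"\s*learning rate\s+(" + NUM + r")"
)


def parse_training_runs(log_text):
    """Split a pasted log into one list of (epoch, cost, lr) per network."""
    runs = []
    for line in log_text.splitlines():
        m = EPOCH_RE.search(line)
        if not m:
            continue
        epoch, cost, lr = int(m.group(1)), float(m.group(2)), float(m.group(3))
        if epoch == 0 or not runs:
            runs.append([])  # epoch 0 marks the start of a new network
        runs[-1].append((epoch, cost, lr))
    return runs


def summarize(run):
    """Report how far the cost moved and how the learning rate decayed."""
    first_cost, last_cost = run[0][1], run[-1][1]
    # Only compare learning rates of directly consecutive epochs.
    ratios = [b[2] / a[2] for a, b in zip(run, run[1:]) if b[0] == a[0] + 1]
    return {
        "first_cost": first_cost,
        "last_cost": last_cost,
        "cost_drop": first_cost - last_cost,  # positive means the cost fell
        "lr_ratio": sum(ratios) / len(ratios) if ratios else None,
    }


# A few lines copied from the log in this thread (both networks).
SAMPLE_LOG = """
Training epoch 0, cost  -4.05604119605 , learning rate 0.00096
Training epoch 1, cost  -4.63877650128 , learning rate 0.0009216
Training epoch 19, cost  -5.90723518193 , learning rate 0.000442002433879
Saving the trained net to nnet_pickptcls.hdf...
Training epoch 0, cost  0.508335282476 , learning rate 0.0048
Training epoch 1, cost  0.510935925278 , learning rate 0.004608
Training epoch 19, cost  0.51087481531 , learning rate 0.0022100121694
"""

runs = parse_training_runs(SAMPLE_LOG)
summaries = [summarize(run) for run in runs]
for name, s in zip(["particle picking", "bad particle exclusion"], summaries):
    print(name, s)
```

For the log in this thread, the first network's cost falls steadily and ends below -5, while the second network's cost starts near 0.51 and never comes down, which is the failure pattern described above.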

I would guess the problem happened because there are particle-like things in the "bad reference" set. You can also make it partially work by removing all "bad references", so that it just skips the second network. I do not recall exactly what is in 2.21a, but if you upgrade to 2.22, there should be two slider bars for adjusting the thresholds of the two networks. Simply setting the second threshold to something very small, like -100, should also work.
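
To illustrate why a very low second threshold has the same effect as skipping the second network, here is a toy sketch. It is not EMAN2's code: the scores are made-up numbers, `keep_candidates` is a hypothetical function, and it assumes a candidate is kept only when it passes both thresholds.

```python
def keep_candidates(candidates, pick_thr, classify_thr):
    """Keep candidates whose two scores both exceed their thresholds.

    Assumption for illustration only: higher "pick" means more
    particle-like, and lower "classify" means more contamination-like.
    """
    return [
        c for c in candidates
        if c["pick"] > pick_thr and c["classify"] > classify_thr
    ]


# Made-up scores for three candidate boxes.
candidates = [
    {"name": "a", "pick": 0.9, "classify": 0.8},
    {"name": "b", "pick": 0.7, "classify": -3.0},
    {"name": "c", "pick": 0.1, "classify": 0.9},
]

# With a normal second threshold, "b" is rejected as contamination.
strict = keep_candidates(candidates, pick_thr=0.5, classify_thr=0.0)

# With the second threshold at -100, nothing is rejected by the second
# score, so only the first network decides.
relaxed = keep_candidates(candidates, pick_thr=0.5, classify_thr=-100)

print([c["name"] for c in strict])
print([c["name"] for c in relaxed])
```

Under the same assumption, a second network that has failed to train and scores every box as contamination would reject everything, which would be consistent with the "0 boxes" output in the original post.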

Still, look at the bad particle references and the trainout_ files first to make sure there is not something obviously wrong.

Muyuan