Hi Michael,
this reply contains more information than you really need,
but it gives an overview of all the different ways you can control
the particle selection. Part of it is already in the documentation
on the website.
SELECTION CRITERIA
You can select a subset of particles according to several criteria.
For each criterion you need to add new lines or modify existing
lines in the master/continue file, which specify the value of the
corresponding parameters. You can add them anywhere, but for
visual convenience, it is advised to do it in the section that
starts as
# Particle selection criteria involving scores or CCs
Global search (based on the program PPFT) and refinement (based on
PO2R) procedures are controlled by separate parameters. These two
procedures correspond to 'auto mode search' and 'auto mode
refine', respectively. Note that in each master/continue file
you can specify only one selection criterion for global search and
one for refinement.
For PPFT, the criteria to choose from are:
pft_cc_fraction
prj_cc_fraction
cmp_cc_fraction
pft_cc_nstd
prj_cc_nstd
cmp_cc_nstd
However, by default PPFT only writes the pft score, so in this case
you can only set either pft_cc_fraction or pft_cc_nstd.
For PO2R you can choose between
score_fraction
score_nstd
In the case of refinement (but this extends easily to the global
search), if you want to select particles by fraction, the
parameter is
score_fraction: if given a value less than 1, the fraction of
images to accept on the basis of the score generated by program
PO2R; otherwise, the actual number of best particles to use.
For example,
auto score_fraction 0.8
will select the best 80% of particles, while
auto score_fraction 1056
will select the best 1056 particles.
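To make the two readings of score_fraction concrete, here is a
minimal Python sketch. It is only an illustration, not code from
auto3dem: the function name is hypothetical, and it assumes that a
higher score is better and that the fraction is rounded to the
nearest whole particle.

```python
def select_by_score_fraction(scores, score_fraction):
    """Return the indices of the selected particles, best score first.

    Assumptions (not taken from auto3dem itself): a higher score is
    better, and a fractional count is rounded to the nearest integer.
    """
    if score_fraction <= 0:
        raise ValueError("score_fraction must be positive")
    if score_fraction < 1:
        # Value below 1: interpret it as a fraction of all particles.
        n_keep = int(round(score_fraction * len(scores)))
    else:
        # Otherwise: interpret it as an absolute number of particles.
        # A value of exactly 1 lands here under a literal reading of
        # the parameter description; that reading is an assumption.
        n_keep = min(int(score_fraction), len(scores))
    # Rank particle indices from best to worst score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:n_keep]


scores = [0.1, 0.9, 0.5, 0.7, 0.3, 0.2, 0.8, 0.4, 0.6, 1.0]
best_80_percent = select_by_score_fraction(scores, 0.8)
best_three = select_by_score_fraction(scores, 3)
```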
If you want to select particles by standard deviations with
respect to the average score, the parameter is
score_nstd: number of standard deviations to add to the average
score when setting cutoff. Negative values are less restrictive,
positive values are more restrictive.
For example,
auto score_nstd 2
will select all the particles whose score is better than average
plus two times the standard deviation. In your specific case, you
would set this parameter to 0 to select all the particles whose
score is above the average.
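The cutoff described above can be sketched in Python as follows.
This is an illustration under stated assumptions, not the actual
implementation: the function name is hypothetical, the population
standard deviation is used, and particles exactly at the cutoff
are rejected.

```python
import statistics


def select_by_score_nstd(scores, score_nstd):
    """Return indices of particles whose score exceeds the cutoff.

    cutoff = average score + score_nstd * standard deviation.
    Assumptions: population standard deviation, strict comparison.
    """
    mean = statistics.fmean(scores)
    std = statistics.pstdev(scores)
    cutoff = mean + score_nstd * std
    return [i for i, s in enumerate(scores) if s > cutoff]


scores = [1.0, 2.0, 3.0, 4.0, 5.0]
above_average = select_by_score_nstd(scores, 0)   # cutoff = average
stricter = select_by_score_nstd(scores, 1)        # fewer particles pass
looser = select_by_score_nstd(scores, -1)         # more particles pass
```

With score_nstd set to 0 the cutoff is simply the average, which
is the case relevant to your question.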
In addition, you can control whether the threshold is applied to
each stack separately with the parameter global_select, already
present in the continue file.
global_select: if set to a non-zero value, then selection criteria
are applied globally across particle parameter files. Otherwise,
selection criteria are applied on a per stack basis.
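The difference between the two settings of global_select is
easiest to see on a toy case. The Python sketch below is only an
illustration: the function names and the dictionary of stacks are
hypothetical, a fractional criterion is used as the example, and a
higher score is assumed to be better.

```python
def select_fraction(scored, fraction):
    """Keep the best `fraction` of (label, score) pairs (higher is better)."""
    n_keep = int(round(fraction * len(scored)))
    ranked = sorted(scored, key=lambda item: item[1], reverse=True)
    return [label for label, _ in ranked[:n_keep]]


def select_particles(stacks, fraction, global_select):
    """Apply the criterion across all stacks or one stack at a time."""
    if global_select != 0:
        # Non-zero: pool every particle, then apply the criterion once.
        pooled = [((name, i), s)
                  for name, scores in stacks.items()
                  for i, s in enumerate(scores)]
        return select_fraction(pooled, fraction)
    # Zero: apply the criterion to each stack independently.
    selected = []
    for name, scores in stacks.items():
        per_stack = [((name, i), s) for i, s in enumerate(scores)]
        selected.extend(select_fraction(per_stack, fraction))
    return selected


# Stack "a" has uniformly better scores than stack "b".
stacks = {"a": [0.9, 0.8, 0.7, 0.6], "b": [0.4, 0.3, 0.2, 0.1]}
chosen_globally = select_particles(stacks, 0.5, global_select=1)
chosen_per_stack = select_particles(stacks, 0.5, global_select=0)
```

In this toy case the global selection keeps only particles from
stack "a", while the per-stack selection keeps the best half of
each stack.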
TESTING MULTIPLE THRESHOLDS DURING THE ITERATIVE ALIGNMENT
In addition to setting just one threshold, you can try different
thresholds, but only within one selection criterion, in order to
see which one gives the reconstruction with the highest resolution.
The map generated at each iteration will then be calculated
from this optimal subset. As a note of caution, the test is
repeated at each iteration, and it involves generating multiple
reconstructions, which can slow down the program considerably.
If you want to do this, use two additional parameters:
nselect_offset and select_delta.
nselect_offset determines how many thresholds are tried
(nselect_offset*2+1, because the threshold is always changed
in both directions around the central value), while
select_delta is the step applied between consecutive tries.
Here are some simple examples.
Example 1:
auto score_fraction 0.75
auto score_nstd 0
auto nselect_offset 0
auto select_delta 0
At each iteration the program will only use 75% of particles,
those with the best score.
Example 2:
auto score_fraction 0.75
auto score_nstd 0
auto nselect_offset 5
auto select_delta 0.05
At each iteration the program will try the following fractions
of best particles (for each one it will determine the resolution)
0.5 0.55 0.6 0.65 0.7 0.75 0.8 0.85 0.9 0.95 1.0
and only for the value giving the best resolution will it generate
a reconstruction to use at the next iteration.
Example 3:
auto score_fraction 550
auto score_nstd 0
auto nselect_offset 3
auto select_delta 50
At each iteration it will try to select the following numbers of
particles:
400 450 500 550 600 650 700
The same concept applies if you use score_nstd or a PPFT-related
selection.
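The lists of thresholds in Examples 2 and 3 follow directly from
the nselect_offset*2+1 rule. A short Python sketch reproduces them;
the function name is hypothetical, and how the program treats
values that fall outside a sensible range is not covered here.

```python
def candidate_thresholds(center, nselect_offset, select_delta):
    """List the 2*nselect_offset + 1 thresholds around `center`.

    Rounding to 10 decimals only hides floating-point noise in the
    illustration; it is not something the program is known to do.
    """
    return [round(center + k * select_delta, 10)
            for k in range(-nselect_offset, nselect_offset + 1)]


example_1 = candidate_thresholds(0.75, 0, 0)     # single threshold
example_2 = candidate_thresholds(0.75, 5, 0.05)  # 11 fractions
example_3 = candidate_thresholds(550, 3, 50)     # 7 particle counts
```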
TESTING MULTIPLE THRESHOLDS AT THE END OF THE ITERATIVE ALIGNMENT
Finally, if you want to test how the map comes out using a reduced
number of particles, but only after the planned iterations of
refinement, you can run auto3dem using the continue file as usual,
after changing the following two lines
# Map reconstruction mode
# accepted values: yes, no or only
auto generate_map yes
# none for not appending anything to the file name
auto map_suffix none
to
auto generate_map only
auto map_suffix test1
In this way the program will not advance in iterations, regardless
of the other parameters, and it will only generate a
reconstruction applying the given selection criterion. Because
map_suffix specifies a string to append at the end of the file
name, the new reconstruction will not overwrite the one obtained
at that iteration. Furthermore, the summary file will contain
an additional line giving the number of particles used and the
estimated resolution.
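If you plan to test several criteria this way, editing the
continue file by hand each time gets tedious. Below is a small
Python sketch that rewrites the relevant lines. It is not part of
auto3dem: the function name is hypothetical, and it assumes that
parameter lines have the whitespace-separated form
'auto <name> <value>' shown above.

```python
def set_auto_parameters(text, updates):
    """Return `text` with selected 'auto <name> <value>' lines rewritten.

    `updates` maps parameter names to new values. Comment lines and
    parameters not named in `updates` are left untouched.
    """
    out = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "auto" and fields[1] in updates:
            line = f"auto {fields[1]} {updates[fields[1]]}"
        out.append(line)
    return "\n".join(out) + "\n"


continue_text = (
    "# accepted values: yes, no or only\n"
    "auto generate_map yes\n"
    "auto map_suffix none\n"
    "auto score_fraction 0.75\n"
)
updated_text = set_auto_parameters(
    continue_text, {"generate_map": "only", "map_suffix": "test1"}
)
```

Using a different map_suffix for each test keeps the resulting
maps from overwriting one another.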
I hope this helps.
Giovanni