Re: cytoseg questions

Rick Giuly

Mar 2, 2013, 6:45:44 PM
to kurt weiss, cyt...@googlegroups.com

Hi Kurt,

Interesting questions.

kurt weiss wrote:
> Hi Rick,
>
> Thanks for the update on negative examples. A few questions come to
> mind: sorry if they are long-winded:
>
> 1. So void of any highlighted negative examples, does cytoseg still read
> contours of all the negative regions based on intensity or texture. So
> for example it would read a membrane or vesicle as a 'bad contour' even
> if we do not highlight it? If so, what is the exact reason to highlight
> a few negative examples anyway, since it sounds like all non-mito
> regions would then essentially be treated the same, i.e. 'bad' pixels
> and 'bad' contours.

The pixel classification step can be set to just process certain
positive and negative examples and ignore the rest.

The contour classification step is set up a little differently. It
detects a bunch of contours all over the image and then marks as "valid"
the ones that lie on the positive training data. The rest end up marked
negative. This is sort of a quirk of the way that the code is written.
Theoretically it could be changed.
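
Roughly, the labeling described above could be sketched like this in Python. This is not the actual cytoseg code: the function name, the pixel-set representation, and the min_overlap cutoff are all made up for illustration.

```python
def label_contours(contours, positive_pixels, min_overlap=0.5):
    """Label each contour True ("valid") or False (negative).

    contours: list of contours, each a list of (x, y) pixel coordinates.
    positive_pixels: set of (x, y) pixels marked as positive training data.
    min_overlap: hypothetical cutoff, the fraction of a contour's pixels
        that must lie on positive training data for it to count as valid.
    """
    labels = []
    for contour in contours:
        if not contour:
            labels.append(False)  # an empty contour cannot be on positive data
            continue
        on_positive = sum(1 for pixel in contour if pixel in positive_pixels)
        labels.append(on_positive / len(contour) >= min_overlap)
    return labels


positive_pixels = {(0, 0), (0, 1), (1, 0), (1, 1)}
contours = [
    [(0, 0), (0, 1), (1, 1)],  # entirely on positive training data
    [(5, 5), (5, 6)],          # nowhere near it, e.g. an unmarked vesicle
    [(1, 1), (2, 2), (3, 3)],  # only one pixel in three overlaps
]
labels = label_contours(contours, positive_pixels)
```

Note that the second contour ends up negative even though nobody highlighted it as a negative example, which is the quirk mentioned above.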

In the pixel classification stage, it may be that picking the negative
regions carefully will lead to better results. It's a way of telling the
classifier the most problematic types of regions or structures that need
to be avoided rather than avoiding everything equally.


> 2. Does the contour processing stage take place in 2D or 3D? And an
> important implication here would be that higher z-resolution should
> almost always be better if this takes place in 3D. For example, a
> single vesicle which only spans 40nm, and thus one image given a 40 nm
> z-section, would not really have any contour in 3D, however, a single
> vesicle at 20nm resolution should have some rough/linear contour in the
> z-direction. Is this logic correct?

It's set by default to use two contours in adjacent planes that are
nearby. You can set it to use 1, 2, 3, etc. if you change a number in
the code. With the default setting, it's really classifying pairs of
contours.

If a structure really only exists in one slice, that would be an issue
for N=2 contours.
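
To illustrate why that happens, here is a rough Python sketch of pairing contours across adjacent planes for N=2. It is an assumption-laden illustration, not cytoseg's code: the function name is hypothetical, and "nearby" is taken to mean centroid distance within a max_distance that I made up.

```python
import math
from collections import defaultdict


def pair_adjacent_contours(contours, max_distance):
    """Return index pairs (i, j) of contours in planes z and z + 1.

    contours: list of (z, (cx, cy)) tuples, a plane index plus a centroid.
    max_distance: hypothetical cutoff for what counts as "nearby".
    """
    by_plane = defaultdict(list)
    for index, (z, centroid) in enumerate(contours):
        by_plane[z].append((index, centroid))

    pairs = []
    for z in sorted(by_plane):
        for i, centroid_a in by_plane[z]:
            # .get() avoids creating empty planes while we iterate
            for j, centroid_b in by_plane.get(z + 1, []):
                if math.dist(centroid_a, centroid_b) <= max_distance:
                    pairs.append((i, j))
    return pairs


contours = [
    (0, (10.0, 10.0)),  # index 0: structure seen in plane 0
    (1, (11.0, 10.5)),  # index 1: same structure, next plane
    (1, (50.0, 50.0)),  # index 2: something far away in plane 1
    (3, (10.0, 10.0)),  # index 3: exists in a single slice only
]
pairs = pair_adjacent_contours(contours, max_distance=5.0)
```

Contours 2 and 3 never appear in any pair, so a classifier that only sees pairs never gets to consider them.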


> 3. On another topic: Balancing examples. Two questions:
> A. What does the overall number mean. For example: you suggest 1000 mito
> and 1500 other. What would the implication be if our examples came out
> to 3000 mito and 4500 other (same ratio, but more)?

That's fine. It's just giving the classifier more examples. It should
work with higher accuracy than with a smaller number. It may just take
longer to process.


> And I guess
> likewise, is the 0.003 value we enter in the script for 'mito weight' a
> probability? And what does the 'number of examples' really mean when it
> calculates it for you: did it actually look for mitochondria in the
> training set that quickly?

Not totally sure which output this refers to, but probably yes.

It's pulling out a random subset of all possible examples. Every pixel
that you marked can be a possible example. Weight 1 means use 100% of
examples. Weight 0.5 means 50%, and 0.003 means 0.3%, etc.
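
As a minimal Python sketch of that kind of subsampling (the function name and the fixed seed are hypothetical, and cytoseg may pick its subset differently, e.g. per-example rather than by exact count):

```python
import random


def subsample_examples(examples, weight, seed=0):
    """Keep a random fraction `weight` of the marked examples."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    count = round(len(examples) * weight)
    return rng.sample(examples, count)


# Pretend a 100 x 100 patch was marked: 10,000 candidate pixels.
marked_pixels = [(x, y) for x in range(100) for y in range(100)]

all_of_them = subsample_examples(marked_pixels, 1)    # 100% of examples
half = subsample_examples(marked_pixels, 0.5)         # 50%
few = subsample_examples(marked_pixels, 0.003)        # 0.3%
```

So the weight is a sampling fraction, not a probability that a pixel is mitochondria. With 10,000 marked pixels, a weight of 0.003 keeps 30 of them.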


> B. Can you verify my findings and understanding please? (I only
> attempted to really quantify this on one dataset, so it may not be
> representative). I find that if I increase the first number (mito) and
> not the second (other) in the run script, I will actually get more false
> negatives in the output.

We have to consider two settings here:

---------------
For the parameter --voxelWeights=XX,XX

If you are looking at the final output "blobs," what you observe could
happen. The effect is complicated. If it gets a bad weighting, it might
not detect contours well at all anymore.

If you are looking at the probability map output (this is the output of
pixel classification only), you should get what you expect. You could
threshold it and count the positive and negative predictions, treating
each pixel as one prediction.
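
For example, a small Python sketch of that counting, assuming the probability map has been loaded as rows of values between 0 and 1 (the function name and the 0.5 threshold are just illustrative choices):

```python
def count_predictions(probability_map, threshold=0.5):
    """Count pixels at or above the threshold (positive) and below it."""
    positives = sum(
        1 for row in probability_map for p in row if p >= threshold
    )
    total = sum(len(row) for row in probability_map)
    return positives, total - positives


# Tiny made-up probability map, 2 rows by 3 columns.
probability_map = [
    [0.9, 0.2, 0.7],
    [0.1, 0.6, 0.3],
]
positives, negatives = count_predictions(probability_map, threshold=0.5)
```

Running this on the probability maps from two runs with different --voxelWeights values, at the same threshold, would let you compare the positive counts directly instead of judging from the final blobs.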


---------------
For the parameter --contourListWeights=XX,XX

This is dealing with "contour pairs," so weighting positives higher
probably will bring more into the final results as expected. This has no
effect on pixel classification.


> Logically, it seems that increasing the weight
> of mito would increase the number found in the output, a positive
> correlation, but I don't see this result. I could also imagine a
> negative correlation whereby you are defining the number of positive
> regions in the training set relative to negative. Thus, when you simply
> increase this first number without changing the second one, or the
> training data itself, you would be saying that there are more
> mitochondria in the training set relative to non-mitochondria (more
> relative to the prior run when the mito number was smaller), but since
> you did not actually mark any more mitochondria the classifier would
> actually interpret this as there being fewer mito relative to 'other'
> and you'll actually get fewer in your output.

Are you thinking about pixel classification here or "contour pair"
classification? I'd have to consider these separately.


Best,
-Rick