Hi Josh,
I hope we did not lose anybody in the conversation, but it is really too
much to ask of me to reinstate the Cc: list.
On Sun, 11 Nov 2012, Josh Moore wrote:
> On Nov 9, 2012, at 6:11 PM, Johannes Schindelin wrote:
>
> > On Wed, 24 Oct 2012, Roger Leigh wrote:
> >
> > A general comment on the specification: it seems to be very focused on
> > serialization.
>
> Though at first glance the tight, binary serialization that Roger has
> proposed might be quite useful, I agree it's not a priority. In fact,
> we'll likely have multiple serializations, since we'll need something
> (unfortunately) that'll work well in XML, etc. etc. Nevertheless, I
> think more of the specification is about getting the model API straight.
> And this week is about hammering on it, modifying, repeating, etc.
No. The focus needs to be on the use cases of ROIs. Otherwise we do not
even *know* what to serialize.
Just think about the following use case: a user chooses Li auto-threshold
to select voxels.
What to serialize? Li? The value Li calculated? The BinaryType mask
representing the selected voxels? A geometric approximation of the
hyper-volume?
This is not a theoretical question. It has very real consequences on what
we can, and what we cannot do, with the ROI model.
In my opinion, we have to concentrate on *what* people need to do with
regions of interest. For example, in the above use case, scientists
frequently need to be able to store the fact that it was a Li
auto-threshold. But to apply the same ROI to another channel, the binary
mask must be transported.
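To make the two needs concrete, here is a minimal sketch in plain Java. It
is not ImgLib2 code: the class names ThresholdProvenance and ThresholdedRoi
are made up for illustration, flat double[] arrays stand in for image
channels, and the threshold value is simply passed in (computing Li itself
is not shown). The ROI keeps both the provenance (method name and computed
value) and the realized mask, so the mask can be applied to another
channel.

```java
import java.util.BitSet;

/** Hypothetical provenance record: which method produced the ROI, and its value. */
final class ThresholdProvenance {
    final String method;    // e.g. "Li"
    final double threshold; // value the method computed on the source channel

    ThresholdProvenance(String method, double threshold) {
        this.method = method;
        this.threshold = threshold;
    }
}

/** Hypothetical ROI that keeps both the provenance and the realized binary mask. */
final class ThresholdedRoi {
    final ThresholdProvenance provenance;
    final BitSet mask; // one bit per sample of the source channel

    private ThresholdedRoi(ThresholdProvenance provenance, BitSet mask) {
        this.provenance = provenance;
        this.mask = mask;
    }

    /** Realizes the mask by selecting every sample strictly above the threshold. */
    static ThresholdedRoi select(String method, double threshold, double[] sourceChannel) {
        BitSet mask = new BitSet(sourceChannel.length);
        for (int i = 0; i < sourceChannel.length; i++) {
            if (sourceChannel[i] > threshold) {
                mask.set(i);
            }
        }
        return new ThresholdedRoi(new ThresholdProvenance(method, threshold), mask);
    }

    /** Applies the same selection to another channel with the same layout. */
    double sumInside(double[] otherChannel) {
        double sum = 0;
        for (int i = mask.nextSetBit(0); i >= 0; i = mask.nextSetBit(i + 1)) {
            sum += otherChannel[i];
        }
        return sum;
    }
}
```

Serializing only the provenance would lose the ability to call sumInside()
on a second channel; serializing only the mask would lose the fact that it
came from Li.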
However, my primary focus these two days lies purely in implementing ROIs
on top of ImgLib2. Everything else would be premature, or too limited in
applicability.
> > Applications I can think of right now:
> >
> > - serialization
> > - visualization (AKA rendering)
> > - processing (AKA iterating, both *inside* and *outside*)
> > - set operations (i.e. merging, intersecting, set difference)
> > - interaction (i.e. letting the users define ROIs)
> >
> > I am sure that I forgot some.
No comments here???
> > The main concern I have for this meeting is that we might end up with
> > something that is limiting us too much in the future.
> >
> > For example, we need to think far enough ahead to allow for
> > backwards-compatible and future-proof extensions (as Tobias suggested,
> > we need to define a way how to ignore unimplemented features safely).
>
> +1, though it's of course hard to know what one is forgetting.
The point, of course, is to allow ourselves to forget something. We need
to use duck-typing on the implementation side, and versioning on the
specification side (although I sincerely believe that we have to get the
implementation side right first before we can dare to think about the
specification).
> In terms of dealing with the unimplemented or undesirable features, I'd
> vote for allowing one to choose between either the implementations (say
> via a factory interface) or the error-handling itself (say via a
> strategy interface) to allow for minimally "safe" and "unsafe" versions.
No. The only way that proved practical was to use an interface-driven
design, not a factory-driven one. I.e. "what can it do?" instead of "give
me something, but I don't know what, exactly"
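As a rough illustration of "what can it do?", the following plain-Java
sketch uses small capability interfaces. All names here (Region,
Translatable, BoxRegion, OpaqueRegion, RegionOps) are hypothetical and not
part of ImgLib2. The caller asks whether a region advertises a capability
and safely skips the operation if it does not.

```java
import java.util.Optional;

/** Minimal ROI contract (hypothetical): membership of a real-valued position. */
interface Region {
    boolean contains(double[] position);
}

/** Optional capability (hypothetical): the region can produce a shifted copy. */
interface Translatable {
    Region translated(double[] offset);
}

/** Axis-aligned box in any number of dimensions; supports translation. */
final class BoxRegion implements Region, Translatable {
    private final double[] min;
    private final double[] max;

    BoxRegion(double[] min, double[] max) {
        this.min = min.clone();
        this.max = max.clone();
    }

    @Override
    public boolean contains(double[] position) {
        for (int d = 0; d < min.length; d++) {
            if (position[d] < min[d] || position[d] > max[d]) {
                return false;
            }
        }
        return true;
    }

    @Override
    public Region translated(double[] offset) {
        double[] newMin = min.clone();
        double[] newMax = max.clone();
        for (int d = 0; d < newMin.length; d++) {
            newMin[d] += offset[d];
            newMax[d] += offset[d];
        }
        return new BoxRegion(newMin, newMax);
    }
}

/** A region that only answers membership queries; it offers no translation. */
final class OpaqueRegion implements Region {
    @Override
    public boolean contains(double[] position) {
        return position[0] > 0;
    }
}

final class RegionOps {
    /** Translates only if the region advertises the Translatable capability. */
    static Optional<Region> tryTranslate(Region region, double[] offset) {
        if (region instanceof Translatable) {
            return Optional.of(((Translatable) region).translated(offset));
        }
        return Optional.empty();
    }
}
```

An unimplemented feature then shows up as an empty Optional at the call
site instead of an exception thrown from deep inside a factory-produced
object.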
> > Another limitation we should avoid at all cost is the limitation to a
> > certain number of dimensions (such as an inherent 2D model that
> > somehow gets extruded to nD; my experience with the Virtual Insect
> > Protocol shows that this is a terrible way to think about regions of
> > interest).
>
> The proposal as it stands certainly has limitations. Some we discussed
> this week, which Roger is either working on now, or we'll be mentioning
> once everyone is in Dundee. But even with those changes, one of the
> initial points of discussion will be the utility and mathematics behind
> higher-dimensional shapes (i.e. beyond 3D). Very roughly, the updated
> proposal will be that 3D shapes be combined with 3D transforms to
> produce what we workingly called "Shape trees". There is one of these
> per ROI along with an embedding in other higher dimensions.
Again, this is too focused on low dimensions. 2D and 3D are not how we
need to think. If we limit ourselves that way, we can stop thinking right
now about a general ROI specification that will serve us well for decades
to come.
Instead, we need to focus on defining what a ROI can do. Completely
independent of the number of dimensions of the space it lives in.
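A minimal sketch of what "independent of the number of dimensions" can
look like, in plain Java. The NdRegion interface and the HyperSphere class
are assumptions made up for this example, not ImgLib2 types. The same
class serves 2D, 3D or 5D without any notion of extruding a 2D shape.

```java
/** Hypothetical dimension-agnostic ROI contract. */
interface NdRegion {
    int numDimensions();

    boolean contains(double[] position);
}

/** A ball around a center point; the dimensionality comes from the center. */
final class HyperSphere implements NdRegion {
    private final double[] center;
    private final double radius;

    HyperSphere(double[] center, double radius) {
        this.center = center.clone();
        this.radius = radius;
    }

    @Override
    public int numDimensions() {
        return center.length;
    }

    @Override
    public boolean contains(double[] position) {
        if (position.length != center.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double squaredDistance = 0;
        for (int d = 0; d < center.length; d++) {
            double diff = position[d] - center[d];
            squaredDistance += diff * diff;
        }
        return squaredDistance <= radius * radius;
    }
}
```

Code that only calls contains() never needs to know whether it is dealing
with a disk, a ball, or something in five dimensions.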
> We discussed 2 of the possibly several down-sides from others'
> perspectives. The first is that one can't use an arbitrary MxN
> transformation across all the dimensions in the shape tree. We couldn't
> figure out the mathematics behind why that would be useful (or legal).
> Perhaps someone could talk us through that. The second is that this form
> of "N-3" restriction doesn't work with the ImgLib API.
In real life, transformations on ROIs are rarely needed, unless
accompanied by a transformation on the original image's space. In which
case it is a space transformation applied to the ROI.
Thinking about the problem that way liberates us from the very limited
fixed-dimensional view of ROIs. We can say that one capability some ROIs
offer is to provide a RealRandomAccessibleInterval<ProbabilityType>
instead. That is much more useful when you have to work *with* the ROIs as
opposed to just trying to save and load them.
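To illustrate the idea of a probability-valued ROI over real coordinates,
here is a small plain-Java sketch. It only mimics the concept and does not
use the ImgLib2 interfaces named above; SoftRegion, GaussianRegion and
SoftRegionOps are hypothetical names, and the Gaussian falloff is just one
arbitrary choice of membership function.

```java
/** Hypothetical "soft" ROI: membership is a probability in [0, 1] at any real position. */
interface SoftRegion {
    double probabilityAt(double[] position);
}

/** Gaussian falloff around a center, in any number of dimensions. */
final class GaussianRegion implements SoftRegion {
    private final double[] center;
    private final double sigma;

    GaussianRegion(double[] center, double sigma) {
        this.center = center.clone();
        this.sigma = sigma;
    }

    @Override
    public double probabilityAt(double[] position) {
        double squaredDistance = 0;
        for (int d = 0; d < center.length; d++) {
            double diff = position[d] - center[d];
            squaredDistance += diff * diff;
        }
        return Math.exp(-squaredDistance / (2 * sigma * sigma));
    }
}

final class SoftRegionOps {
    /** Probability-weighted mean of sample values taken at the given positions. */
    static double weightedMean(SoftRegion region, double[][] positions, double[] values) {
        double weightSum = 0;
        double valueSum = 0;
        for (int i = 0; i < positions.length; i++) {
            double weight = region.probabilityAt(positions[i]);
            weightSum += weight;
            valueSum += weight * values[i];
        }
        return weightSum == 0 ? Double.NaN : valueSum / weightSum;
    }
}
```

Processing code such as weightedMean() works directly with the ROI, at
sub-pixel positions and in any dimension, without ever touching a
serialized form.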
> If we can agree that this second reason is actually the bigger issue,
> then we'd propose:
>
> * Make the ROI package something external to ImgLib, e.g. package
> "scijava.roi"
There is little reason to make the ROI package independent of ImgLib.
ImgLib is a versatile library to work with n-dimensional data, and ROIs
are very much subsets of those data. Trying to overgeneralize will only
cost us unnecessary gray hair.
> * Add Safe and Unsafe strategies as above so that if ImgLib tries to
> perform a higher-dimensional transformation that's not supported, the
> implementation can either throw (Safe) or blindly convert the
> ROI+shapes to a NDim matrix (Unsafe).
Again, transformations are not something inherently specific to ROIs. So
they should not be part of the specification of ROIs. Instead, ROIs that
can be transformed (my Li example from above cannot, unless realized into
a specific RandomAccessible<BitType>) should implement a corresponding
interface.
> For a division like that to work, it'll be important to nail down the
> minimal number of ImgLib interfaces that this package (whatever the
> name) MUST implement.
Now we're talking.
> There may already be too many ROI elements in ImgLib for this to be
> feasible without burdensome refactoring;
We are already sure that there needs to be refactoring. This is what this
hackathon is about.
> if so, back to the drawing board.
Or alternatively, hacking. That is what a hackathon is good for.
Remember, a hackathon is not about nailing down a design document. It is
about hacking on code. Whether that code serves as a proof-of-concept, or
as a demonstration of why a particular approach does not work, does not
matter.
A hackathon is about code.
> But in an effort to prevent everything from falling into ImgLib over
> time, we'd suggest at least discussing a place for it in the
> Dresden-DAG, similar to the RnR package.
Since ImgLib2 is already shared between so many SciJava projects, it does
not make sense to bend over backwards to define things outside it.
Instead, we should implement things on top of ImgLib2. Even better: there
is already a ton of code out there, on top of ImgLib2 (for those
unfamiliar with ImgLib2, I highly recommend familiarizing yourself with
it; you will have to do that in the future anyway).
> P.S. Another, secondary question might be: how will axis ordering/naming
> be specified at the ROI level (say for transferring a ROI between 2
> images)? What's the naming scheme used? But that seems like getting into
> details at this point.
Indeed.
So far, the assumption of ImgLib2's core is that if you work on two images
simultaneously, the axis ordering/naming is the same. And if it is not, it
is very easy to use Views to make it so. Therefore I consider this a
non-issue for the moment.
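The same trick carries over to ROIs. The sketch below is plain Java in the
spirit of what Views does for images; it does not call ImgLib2, and
AxisRegion and AxisSwappedRegion are hypothetical names. A thin view swaps
two axes of a source region without copying it, so a ROI defined in one
axis order can be queried in another.

```java
/** Hypothetical ROI contract used only for this example. */
interface AxisRegion {
    boolean contains(double[] position);
}

/** View that swaps two axes of a source region without copying it. */
final class AxisSwappedRegion implements AxisRegion {
    private final AxisRegion source;
    private final int axisA;
    private final int axisB;

    AxisSwappedRegion(AxisRegion source, int axisA, int axisB) {
        this.source = source;
        this.axisA = axisA;
        this.axisB = axisB;
    }

    @Override
    public boolean contains(double[] position) {
        // Map the query from the view's axis order back to the source's order.
        double[] sourcePosition = position.clone();
        sourcePosition[axisA] = position[axisB];
        sourcePosition[axisB] = position[axisA];
        return source.contains(sourcePosition);
    }
}
```

For example, a region defined in (x, y) order can be wrapped as
new AxisSwappedRegion(region, 0, 1) and then queried with (y, x)
coordinates.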
> > And I would also like to caution against making the specification so
> > centric on serialization.
I really worry about this. If we focus too much on specification, we will
end up -- again -- with no tangible code.
Ciao,
Johannes