On Mon, 2015-10-26 at 15:18 -0700, Manolis wrote:
> On Friday, October 23, 2015 at 7:55:43 PM UTC-4, Dan Kortschak wrote:
> > The issue here is that not all features have an orientation.
> >
> That's not entirely true.
> For BED the ones that don't have it (BED3-BED5) can return NotOriented.
> That's perfectly legit.
> As for the GFF.Feature, this one should have it and return the strand.
For the classes of features that exist in featio, you are right, but
features are more general than that.
> > In all cases where you are reading from a stream, you must know the
> > format of the stream and therefore you must know the underlying type. My
> > pattern is to call `f := sc.Feature().(T)` at the first line of the
> > scanner loop. I don't think this is cognitively or mechanically costly.
> >
>
> That's not always true. Sometimes you don't know the format of the stream a
> priori.
> func DoSth (r featio.Reader) {} does not know the underlying type of r.
In this case you don't know that the feature stream that you have holds
the data that you need for the analysis. feat.Feature is the lowest
common denominator.
If your analysis does not require some of the data elements of a
feature, then you can write a feat.Feature implementation that wraps
the feat.Feature obtained from the Scanner and implements the
additional methods, returning the wrapped value's data only where it
exists. Alternatively, a filter func that does the same thing can be
used. I do both, usually the latter unless I need the type to match an
interface.
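
As a rough sketch of the filter func alternative: the function below
asks whether a feature can report an orientation and falls back to
NotOriented when it cannot. Feature, Orienter, Orientation and the
plain and stranded types are local stand-ins invented for this example
so that it compiles without biogo; they are not the feat definitions,
and orientationOf is a hypothetical name.

```go
package main

import "fmt"

// Feature is a local stand-in for feat.Feature, reduced to a few
// methods so that this sketch compiles without biogo (assumption).
type Feature interface {
	Start() int
	End() int
	Name() string
}

// Orientation and Orienter are likewise stand-ins for the
// corresponding feat types (assumption).
type Orientation int8

const (
	Reverse     Orientation = -1
	NotOriented Orientation = 0
	Forward     Orientation = 1
)

type Orienter interface {
	Orientation() Orientation
}

// orientationOf is the filter func: it returns the orientation of f
// if f can provide one, and NotOriented otherwise.
func orientationOf(f Feature) Orientation {
	if o, ok := f.(Orienter); ok {
		return o.Orientation()
	}
	return NotOriented
}

// plain is a feature without an orientation (think BED3).
type plain struct {
	name       string
	start, end int
}

func (p plain) Start() int   { return p.start }
func (p plain) End() int     { return p.end }
func (p plain) Name() string { return p.name }

// stranded is a feature that does carry an orientation (think BED6).
type stranded struct {
	plain
	strand Orientation
}

func (s stranded) Orientation() Orientation { return s.strand }

func main() {
	feats := []Feature{
		plain{"bed3-like", 10, 20},
		stranded{plain{"bed6-like", 30, 40}, Reverse},
	}
	for _, f := range feats {
		fmt.Printf("%s: %d\n", f.Name(), orientationOf(f))
	}
}
```

The caller keeps working with the lowest common denominator and only
pays for the type assertion at the point where orientation matters.
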
> I actually believe that it will increase genericity. You will have parsers
> that work as tokenizers/scanners and can even be used directly without
> having to dive into the rest of biogo. You will still have types that
> satisfy the featio.Reader interface (all you have to do is just embed the
> parser types and create the appropriate signature for the Read method).
> For backwards compatibility you are absolutely right it will break stuff
> that access the concrete types.
It doesn't just break that. It breaks anything that previously
implemented feat.Feature and doesn't return the orientation.
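
To illustrate the breakage, here is a small sketch. FeatureV1 and
FeatureV2 are hypothetical stand-ins for the interface before and after
adding Orientation, and interval plays the part of a type somebody has
already written against the narrower interface. In real code the
failure shows up at compile time wherever such a type is used as the
wider interface; the type assertion here only makes the same fact
observable in a runnable program.

```go
package main

import "fmt"

// FeatureV1 stands in for the current, narrower feature interface
// (hypothetical, reduced to two methods).
type FeatureV1 interface {
	Start() int
	End() int
}

// FeatureV2 is a hypothetical widened interface that also demands an
// orientation.
type FeatureV2 interface {
	FeatureV1
	Orientation() int8
}

// interval is the kind of type a user may already have written. It
// satisfies FeatureV1 and knows nothing about orientation.
type interval struct{ start, end int }

func (i interval) Start() int { return i.start }
func (i interval) End() int   { return i.end }

// satisfiesV2 reports whether f would still count as a feature if the
// interface were widened to FeatureV2.
func satisfiesV2(f FeatureV1) bool {
	_, ok := f.(FeatureV2)
	return ok
}

func main() {
	var f FeatureV1 = interval{5, 15}
	fmt.Println("still a feature after widening:", satisfiesV2(f))
}
```

The existing type fails the check, so every such implementation would
need an Orientation method added before it could be used again.
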
Part of the reason behind the absence of Orientation is the biological
distinction between orientation (the general concept) and strand (the
nucleic acid-specific concept which is isomorphic). In hindsight, this
is probably unfortunate - maybe sequences should have had orientation
rather than strand. But it is what it is now. It's unreasonable to have
both Orientation and Strand on a seq.Sequence. I am involved in
designing APIs with two methods that return essentially the same thing
(we are still working through this - but that case is far more
conceptually complicated than here), and it's not something I want to
add to sequences and features.
This is how I would do that if I found that it was happening a lot.

	type orientedFeature interface {
		feat.Feature
		feat.Oriented
	}

	// oriented wraps a feat.Feature that has no orientation of its own.
	type oriented struct{ feat.Feature }

	func (f oriented) Orientation() feat.Orientation { return feat.NotOriented }

	// asOriented returns f unchanged if it already has an orientation,
	// otherwise it wraps f so that it reports feat.NotOriented.
	func asOriented(f feat.Feature) orientedFeature {
		if f, ok := f.(orientedFeature); ok {
			return f
		}
		return oriented{f}
	}

	func printMatchingBedFeats(file string, orientedFeats []orientedFeature) {
		fileReader, _ := os.Open(file)
		r, _ := bed.NewReader(fileReader, guessBedType(file))
		for {
			f, err := r.Read()
			if err != nil {
				// handle error
				break
			}
			if matchFound(orientedFeats, asOriented(f)) {
				fmt.Print(f)
			}
		}
	}

> Also a maybe not so irrelevant point from my admittedly small experience
> with Go. As a best practice, I avoid returning interfaces from functions or
> at least return the widest possible interface. On the other hand, I use the
> narrowest possible interface to define input requirements for a function.
Rather than using o() and O(), I'd advise you use 𝛺() for your method
set complexity. When it's internal, this is generally fairly easy to
define.