hi there,
in a couple of days, I am giving an introductory Go tutorial to (mostly) biologists.
in the morning, we go through most of what Go is (types, funcs, vars, interfaces, goroutines/channels).
in the afternoon, the idea is to work through a bio-related exercise.
as I am not a biologist, I got the organizers to give me a Python-based exercise that I would translate to Go.
The original exercise wasn't using any BioPython module (just the stdlib), so it was easy to translate.
(The original exercise was: given chromosomes in a FASTA file and the associated GFF3 annotation file, extract the nucleotide sequences of the CDS.)
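For readers who don't know the formats, the core step of that exercise can be sketched roughly as below, with the Go stdlib only. This is not the code from my attempt, just a minimal illustration: the names feature, parseGFF3Line, revComp and extract are made up for this example, it assumes the chromosome is already loaded as a string, and it ignores that a CDS split over several exons needs its pieces concatenated. GFF3 start/end coordinates are 1-based and inclusive, and a feature on the "-" strand has to be reverse-complemented.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// feature holds the GFF3 columns needed here (hypothetical type for this sketch).
type feature struct {
	seqID  string
	kind   string // GFF3 "type" column, e.g. "CDS"
	start  int    // 1-based, inclusive
	end    int    // 1-based, inclusive
	strand byte   // '+', '-', '.' or '?'
}

// parseGFF3Line parses one tab-separated GFF3 record (9 columns).
func parseGFF3Line(line string) (feature, error) {
	cols := strings.Split(line, "\t")
	if len(cols) != 9 {
		return feature{}, fmt.Errorf("gff3: want 9 columns, got %d", len(cols))
	}
	start, err := strconv.Atoi(cols[3])
	if err != nil {
		return feature{}, fmt.Errorf("gff3: bad start: %w", err)
	}
	end, err := strconv.Atoi(cols[4])
	if err != nil {
		return feature{}, fmt.Errorf("gff3: bad end: %w", err)
	}
	if len(cols[6]) != 1 {
		return feature{}, fmt.Errorf("gff3: bad strand %q", cols[6])
	}
	return feature{seqID: cols[0], kind: cols[2], start: start, end: end, strand: cols[6][0]}, nil
}

// complement maps a nucleotide to its complement; anything else becomes 'N'.
func complement(b byte) byte {
	switch b {
	case 'A':
		return 'T'
	case 'T':
		return 'A'
	case 'C':
		return 'G'
	case 'G':
		return 'C'
	}
	return 'N'
}

// revComp returns the reverse complement of an upper-case DNA sequence.
func revComp(seq string) string {
	out := make([]byte, len(seq))
	for i := 0; i < len(seq); i++ {
		out[len(seq)-1-i] = complement(seq[i])
	}
	return string(out)
}

// extract returns the nucleotides covered by f, reverse-complemented
// when the feature lies on the minus strand.
func extract(chrom string, f feature) (string, error) {
	if f.start < 1 || f.end > len(chrom) || f.start > f.end {
		return "", fmt.Errorf("feature %d..%d out of range for sequence of length %d", f.start, f.end, len(chrom))
	}
	sub := chrom[f.start-1 : f.end] // 1-based inclusive -> 0-based half-open
	if f.strand == '-' {
		sub = revComp(sub)
	}
	return sub, nil
}

func main() {
	chrom := "AAACCGTGGTTT" // toy sequence, not real data
	f, err := parseGFF3Line("chr1\t.\tCDS\t4\t9\t.\t-\t0\tID=cds1")
	if err != nil {
		panic(err)
	}
	seq, err := extract(chrom, f)
	if err != nil {
		panic(err)
	}
	fmt.Println(f.seqID, f.kind, seq)
}
```

Working on a plain string keeps the slicing obvious for a tutorial; for real chromosomes a []byte would avoid copies.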
here is my attempt:
the data files are available at:
(I checked that I get the same answer as the original Python code, only faster.)
so far so good, but I figured it would be interesting to also show how biogo could be leveraged.
so I came up with:
any ideas on how to improve it?
I am especially interested in:
- improving the handling of the reverse complement
- dealing with the GFF3 input file
- producing the final CDS output file
thanks!
-s