This is an excellent response.
If you must eliminate all overlaps (not recommended), you can go into
the source code (the files node.h and dprog.h) and set MAX_SAM_OVLP and
MAX_OPP_OVLP to 0. I can't vouch for the program's behavior if you do
this, although I tried it on E. coli and it worked. Again, I agree with
Torsten that this is a bad idea.
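
If the aim is only to see where the overlaps are, a post-processing script is a safer route than editing the source. Here is a minimal sketch, not part of Prodigal: it assumes you have already parsed the predictions into (start, end, strand) tuples with 1-based inclusive coordinates, and the function name overlapping_pairs is made up for illustration. It only compares neighbouring genes after sorting by start position, which is usually enough for gene calls on one contig.

```python
def overlapping_pairs(genes):
    """Return (gene_a, gene_b, overlap_length) for each pair of
    neighbouring genes whose coordinates overlap.

    genes: iterable of (start, end, strand) tuples, 1-based inclusive.
    This is an assumed input format, not something Prodigal emits directly.
    """
    ordered = sorted(genes, key=lambda g: (g[0], g[1]))
    pairs = []
    for a, b in zip(ordered, ordered[1:]):
        # Number of shared bases between a and its right-hand neighbour b.
        overlap = min(a[1], b[1]) - b[0] + 1
        if overlap > 0:
            pairs.append((a, b, overlap))
    return pairs


if __name__ == "__main__":
    example = [(1, 100, "+"), (98, 300, "+"), (400, 500, "-")]
    for a, b, n in overlapping_pairs(example):
        print(a, b, n)
```

The script reports overlaps and leaves the coordinates alone, since the overlapping genes are usually real.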
In addition to the partial=11, 10, or 01 field, Prodigal's Genbank
output uses the normal Genbank convention of < and > for genes that run
off the edge. For example, <2..399 means the gene runs off the left
edge, and 50..>1399 means it runs off the right edge.
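
To make the mapping between the two notations concrete, here is a small sketch that turns a Genbank-style location string into the matching partial flag. The function name partial_flag is hypothetical, and the sketch assumes simple single-span locations, optionally wrapped in complement(...); joined locations are not handled.

```python
import re

# Matches simple locations such as "<2..399", "50..>1399" or
# "complement(<2..399)". Multi-segment (join) locations are not handled.
_LOCATION = re.compile(r"^(complement\()?(<?)(\d+)\.\.(>?)(\d+)\)?$")


def partial_flag(location):
    """Return '00', '10', '01' or '11' for a Genbank-style location.

    First digit: gene runs off the left edge ('<' present).
    Second digit: gene runs off the right edge ('>' present).
    """
    match = _LOCATION.match(location.strip())
    if match is None:
        raise ValueError("unsupported location: %r" % location)
    left = "1" if match.group(2) else "0"
    right = "1" if match.group(4) else "0"
    return left + right


if __name__ == "__main__":
    for loc in ("<2..399", "50..>1399", "10..500"):
        print(loc, partial_flag(loc))
```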
regards,
doug
On Mar 24, 7:31 pm, Torsten Seemann <torsten.seem...@monash.edu> wrote:
> > I am quite interested in using Prodigal for prokaryotic gene
> > predictions but at times I get overlapping prediction results, e.g.
> > the 5' end of a gene falls within the 3' end of the previous gene.
>
> Bacteria contain lots of these overlapping genes - are you sure you want to
> ignore them?
> They are valid, e.g. in operons.
>
> > Is there a way to control it and avoid overlapping predictions, e.g.
> > shifting start position of the next gene where it overlaps with
> > previous gene's 3'?
>
> AFAIK, Prodigal does not do this, because you will miss real genes, as
> overlapping genes are real. You could write a Bio{Perl/Python} script to do
> this.
>
> > Furthermore, is it possible to have explicitly whether the predicted
> > gene lack 5' or 3' end etc?
>
> The README talks about the /note field in the .gbk output:
>
> The "partial=01", etc., field is used to indicate if genes continue off
> the edges of the contig. A '0' indicates that the gene is contained
> within the contig, and a '1' indicates the gene runs off that edge. So
> '11' runs off both edges of the contig, '10' runs off the left edge,
> '01' runs off the right edge, and '00' is fully contained within the
> contig.
>
> --
> --Dr Torsten Seemann
> --Scientific Director : Victorian Bioinformatics Consortium, Monash
> University, AUSTRALIA
> --Senior Researcher : VLSCI Life Sciences Computation Centre, Parkville,
> AUSTRALIA
> --
> http://www.bioinformatics.net.au/