You would have problems in anonymous/metagenomic mode because Prodigal will predict GTG and TTG starts there, which you don't want for eukaryotic transcripts.
My recommended workflow for this would be to combine a bunch of eukaryotic transcripts from the same organism and with similar GC content
into a single file, and then just run Prodigal on that file with "-c -g 1" to force closed ends and ATG-only starts. If the organism has a tight
GC-content distribution, you may be able to put everything in one file.
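
A minimal sketch of that binning step, in case it helps. The function names, the 5% bin width, and the output file naming are my own assumptions for illustration, not anything Prodigal requires. It only builds the Prodigal command line (using the -i, -a, -c and -g options); it does not run it.

```python
from collections import defaultdict


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def gc_percent(seq):
    """Integer GC percentage (rounded down), ignoring non-ACGT characters."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    return 100 * gc // acgt if acgt else 0


def bin_by_gc(records, bin_pct=5):
    """Group records into GC bins that are bin_pct wide (hypothetical default)."""
    bins = defaultdict(list)
    for header, seq in records:
        bins[gc_percent(seq) // bin_pct].append((header, seq))
    return dict(bins)


def write_fasta(path, records):
    """Write (header, sequence) pairs to a FASTA file, one line per sequence."""
    with open(path, "w") as handle:
        for header, seq in records:
            handle.write(f">{header}\n{seq}\n")


def prodigal_command(fasta_path, protein_path):
    """Build the command for one bin: -c closed ends, -g 1 translation table 1."""
    return ["prodigal", "-i", fasta_path, "-a", protein_path, "-c", "-g", "1"]
```

You would write each bin out with write_fasta (for example to a hypothetical gc_bin_10.fna) and then pass the result of prodigal_command to subprocess.run, assuming prodigal is on your PATH.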
In this paper, they compared Prodigal to other tools, and it did OK, but not amazing (not really a fair comparison imo, since Prodigal
was never designed for this purpose). Prodigal would struggle to recognize a Kozak sequence, since it's got a lot of microbial-specific
rules.
Honestly, I could probably write a simple Python script in a day that would do this better than existing tools.
Just a basic coding/noncoding Bayes classifier + Kozak sequence / upstream sequence analysis.
Not sure how much interest there is in this sort of thing, since I don't do much with eukaryotic gene prediction.
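
To make the idea concrete, here is a rough sketch of what such a script could look like. This is not an existing tool and not tested on real data. Everything in it is an assumption for illustration: the function names, codon-frequency naive Bayes with equal priors and add-one smoothing, scoring only the -3 purine and +4 G positions of the Kozak context, the kozak_weight parameter, and scanning the forward strand only.

```python
import math
from collections import Counter
from itertools import product

CODONS = ["".join(p) for p in product("ACGT", repeat=3)]
STOPS = {"TAA", "TAG", "TGA"}


def codon_log_probs(seqs, pseudocount=1.0):
    """Log codon frequencies from frame-0 codons of training sequences."""
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        for i in range(0, len(seq) - 2, 3):
            counts[seq[i:i + 3]] += 1
    total = sum(counts[c] for c in CODONS) + pseudocount * len(CODONS)
    return {c: math.log((counts[c] + pseudocount) / total) for c in CODONS}


def coding_log_odds(orf, coding_lp, noncoding_lp):
    """Naive Bayes log-odds (coding vs. noncoding), assuming equal priors."""
    orf = orf.upper()
    score = 0.0
    for i in range(0, len(orf) - 2, 3):
        codon = orf[i:i + 3]
        if codon in coding_lp:  # skip codons containing N etc.
            score += coding_lp[codon] - noncoding_lp[codon]
    return score


def kozak_score(transcript, atg_pos):
    """Count matches at the two strongest Kozak positions: -3 purine, +4 G."""
    seq = transcript.upper()
    score = 0
    if atg_pos >= 3 and seq[atg_pos - 3] in "AG":
        score += 1
    if atg_pos + 3 < len(seq) and seq[atg_pos + 3] == "G":
        score += 1
    return score


def closed_orfs(transcript, min_codons=30):
    """Yield (start, end) for every forward-strand ATG with an in-frame stop."""
    seq = transcript.upper()
    for frame in range(3):
        starts = []
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if codon == "ATG":
                starts.append(i)
            elif codon in STOPS:
                for s in starts:
                    if (i - s) // 3 >= min_codons:
                        yield s, i + 3  # end is exclusive and includes the stop
                starts = []


def best_orf(transcript, coding_lp, noncoding_lp, kozak_weight=1.0, min_codons=30):
    """Return (start, end, score) for the top-scoring closed ORF, or None."""
    best = None
    for start, end in closed_orfs(transcript, min_codons):
        score = coding_log_odds(transcript[start:end - 3], coding_lp, noncoding_lp)
        score += kozak_weight * kozak_score(transcript, start)
        if best is None or score > best[2]:
            best = (start, end, score)
    return best
```

You would train coding_lp on known in-frame CDS sequences from the organism and noncoding_lp on something like UTRs or introns. Note that this does nothing about frame shifts.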
The hardest part is dealing with frame shifts in transcripts/transcript assemblies, but I don't know of any tools that do a good job
there. Having worked with millions of poplar transcripts, I can say they definitely happen a fair bit: you get a hit to a known protein
in two different reading frames, etc.
I've used TransDecoder quite a bit and it's still my favorite, but yes, it's very slow.
regards,
doug