[maker-devel] GFFDB error


Anna Bennett

Nov 2, 2011, 10:43:17 PM
to maker...@yandell-lab.org
Hi,

I am hoping to use Maker as a consensus gene set generator for several
gene prediction sets that have already been produced outside of Maker.
I did use Maker to align my ESTs and I recursively copied that entire
directory a couple of times in order to generate a couple different
consensus set versions. I used the protein_gff= option to enter
several GFF3 sets of protein alignments run outside of Maker and the
pred_gff= option to enter all of the gene GFF3 sets I want to include.
All of the prediction sets have passed our GFF3 validation steps and
successfully load into Chado and Apollo, so I don't suspect an error
with the file formats; however, Maker v2.15 is failing on the GFF3
files. Is there a chance this error is tied to bad predictions that
potentially have internal stop codons or some other issue? How can I
determine which specific errors are causing the failure?

When running Maker v2.15 I get these errors:

Gathering GFF3 input into hits - chunk:0
WARNING: Problem in GFFDB::_get_t_offset_and_end
WARNING: Problem cause by bad CDS entries in GFF3 file for Group10.1.006.1
Maker will just figure out a new CDS entry internally

ERROR: Failed on Group10.1.006.1
Check your input GFF3 file for errors! (from GFFDB)

--FAILURE--
ERROR: Failed while prepare section files!!

ERROR: Chunk failed at level:12, tier_type:2
!!
FAILED CONTIG:Group10.1

ERROR: Chunk failed at level:5, tier_type:0
!!
FAILED CONTIG:Group10.1


I appreciate any help you can offer.
Thanks,
Anna

_______________________________________________
maker-devel mailing list
maker...@box290.bluehost.com
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

Carson Holt

Nov 3, 2011, 9:53:55 AM
to Anna Bennett, maker...@yandell-lab.org
Could you send me the GFF3 file containing Group10.1.006.1? I think the
CDS and exon coordinates may have a non-overlap issue (Chado and the GFF3
validator don't check for this).
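
A minimal sketch of how one might look for that kind of mismatch before running MAKER, assuming a tab-delimited GFF3 file with standard `Parent=` attributes. This is not MAKER's own check, and the function names `parse_attrs` and `cds_outside_exons` are hypothetical. It reports every transcript that has a CDS segment not fully contained in one of its exons, which includes transcripts with no exon features at all.

```python
from collections import defaultdict


def parse_attrs(col9):
    """Turn 'ID=a;Parent=b' into {'ID': 'a', 'Parent': 'b'}."""
    return dict(kv.split("=", 1) for kv in col9.strip().split(";") if "=" in kv)


def cds_outside_exons(gff_lines):
    """Return sorted transcript IDs whose CDS is not covered by their exons."""
    exons = defaultdict(list)
    cds = defaultdict(list)
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] not in ("exon", "CDS"):
            continue
        start, end = int(cols[3]), int(cols[4])
        target = exons if cols[2] == "exon" else cds
        # A feature may list several parents separated by commas.
        for parent in parse_attrs(cols[8]).get("Parent", "").split(","):
            if parent:
                target[parent].append((start, end))

    bad = []
    for parent, segments in cds.items():
        for c_start, c_end in segments:
            # Each CDS segment must sit inside a single exon of the same parent.
            if not any(e_start <= c_start and c_end <= e_end
                       for e_start, e_end in exons[parent]):
                bad.append(parent)
                break
    return sorted(bad)
```

Calling `cds_outside_exons(open("predictions.gff3"))` (the file name is a placeholder) would return the transcript IDs worth inspecting first.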

Thanks,
Carson

Anna Bennett

Nov 3, 2011, 11:47:27 AM
to Carson Holt, maker...@yandell-lab.org
Hi Carson,

This same general error message was actually reported for all of the scaffolds that I allowed to run before killing the job, though, so a systematic issue like the one you mentioned is the most likely cause. Going back through my log file, it looks like for some scaffolds it did give an additional error on specific predictions:


Gathering GFF3 input into hits - chunk:0
WARNING: Problem in GFFDB::_get_t_offset_and_end
WARNING: Problem cause by bad CDS entries in GFF3 file for gnomon_20414_mRNA
Maker will just figure out a new CDS entry internally

ERROR: Failed on gnomon_20414_mRNA
Check your input GFF3 file for errors! (from GFFDB)

--FAILURE--
ERROR: Failed while prepare section files!!

ERROR: Chunk failed at level:12, tier_type:2
!!
FAILED CONTIG:Group10.4

ERROR: Chunk failed at level:5, tier_type:0
!!
FAILED CONTIG:Group10.4



This is the GFF3 for that feature:
Group10.4       GNOMON  gene    19452   19796   .       -       .       ID=gnomon_20414;Name=gnomon_20414
Group10.4       GNOMON  mRNA    19452   19796   .       -       .       ID=gnomon_20414_mRNA;Name=gnomon_20414_mRNA;Parent=gnomon_20414
Group10.4       GNOMON  CDS     19452   19796   .       -       0       Parent=gnomon_20414_mRNA



I can parse out a scaffold's worth of predictions for all seven of the prediction sets I have, but before I do that: are exon features required by Maker in order for it to resolve CDS issues? Some of the prediction files do not have exon features in the GFF3, only gene/mRNA/CDS, and some have UTR. If exon features are required by Maker, I can write a script to generate them and try Maker again on those files.

Thanks,
Anna
--
Anna Bennett
Doctoral Student
Elsik Computational Genomics Laboratory
Department of Biology
406 Reiss Science Building
Georgetown University
Washington, DC 20057

Carson Holt

Nov 3, 2011, 12:12:24 PM
to Anna Bennett, maker...@yandell-lab.org
Is that the full feature? I guess there is no exon, just CDS. The fix might then be as simple as just changing all your CDS entries to exon. Why is this important? Well, explicit UTR is optional in GFF3 and is defined as exon excluding CDS, so the missing exon feature creates logic issues. Chado and the GFF3 validator only check for inheritance conformity (i.e. mRNA can't be the child of a promoter based on the ontology tree).

This is from NOTE 3 in the GFF3 spec:

NOTE 3 - UTRs, splice sites and translational start and stop sites. These are implied by the combination of exon and CDS and do not need to be explicitly annotated as part of the canonical gene. In the case of annotating predicted splice or translational start/stop sites independently of a particular gene, it is suggested that they be attached directly to the genomic sequence and not to a gene or a subpart of a gene.
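
To make the "exon excluding CDS" definition concrete, here is a small sketch that subtracts CDS intervals from exon intervals. The function name `implied_utr` is hypothetical, and intervals are assumed to be 1-based inclusive `(start, end)` tuples as in GFF3. It is only an illustration of the arithmetic, not code taken from MAKER.

```python
def implied_utr(exons, cds):
    """Return the parts of the exon intervals that no CDS interval covers."""
    utr = []
    for ex_start, ex_end in sorted(exons):
        pos = ex_start  # first exon base not yet accounted for
        for c_start, c_end in sorted(cds):
            if c_end < pos or c_start > ex_end:
                continue  # this CDS segment does not touch the remaining exon
            if c_start > pos:
                utr.append((pos, c_start - 1))  # gap before the CDS is UTR
            pos = max(pos, c_end + 1)
        if pos <= ex_end:
            utr.append((pos, ex_end))  # tail of the exon after the last CDS
    return utr
```

With exons at `(100, 200)` and `(300, 400)` and CDS at `(150, 200)` and `(300, 350)`, this gives `[(100, 149), (351, 400)]`. With no exon features at all, there is nothing to subtract from, which is the logic gap described above.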


Thanks,
Carson

Anna Bennett

Nov 3, 2011, 2:16:32 PM
to Carson Holt, maker...@yandell-lab.org
This may help me understand why the output of some prediction programs has exon features that only correspond to the CDS features, but does not include the coordinates of the explicit UTR features in the exon coordinates. When I saw that, I believe I stripped out the exon features and allowed the Chado loader to generate exons.

The prediction file that does have specific error reports in my log file (the example from my last email) does not have any UTR features, but also does not have exons. I'll try generating exons based on the CDS and see if that eliminates the error.
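
A script along these lines might look like the following sketch. It assumes tab-delimited GFF3 with `Parent=` attributes and no UTR features (if UTR features were present, the exons would need to span UTR plus CDS, which this does not handle). The function name `add_exons_from_cds` is hypothetical.

```python
def add_exons_from_cds(gff_lines):
    """Insert an exon line before each CDS whose parent has no exon features."""

    def parents(col9):
        for kv in col9.split(";"):
            if kv.startswith("Parent="):
                return kv[len("Parent="):].split(",")
        return []

    lines = [line.rstrip("\n") for line in gff_lines]

    # First pass: remember which transcripts already have exons.
    has_exon = set()
    for line in lines:
        cols = line.split("\t")
        if len(cols) == 9 and cols[2] == "exon":
            has_exon.update(parents(cols[8]))

    # Second pass: copy each line, adding an exon with the CDS coordinates.
    out = []
    for line in lines:
        cols = line.split("\t")
        if len(cols) == 9 and cols[2] == "CDS":
            missing = [p for p in parents(cols[8]) if p not in has_exon]
            if missing:
                # Same seqid, source, start, end, score, strand; exons carry no phase.
                exon = cols[:2] + ["exon"] + cols[3:7] + [".", "Parent=" + ",".join(missing)]
                out.append("\t".join(exon))
        out.append(line)
    return out
```

Transcripts that already have exon features are left untouched, so the same script could be run over all of the prediction sets.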

Thanks again,
Anna  

Carson Holt

Nov 3, 2011, 3:04:23 PM
to Anna Bennett, maker...@yandell-lab.org
Also, if you just change CDS to exon in the feature type column, MAKER will try to make the CDS for you by looking for the longest open reading frame of the tiled exons of the mRNA.
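
For readers unfamiliar with the idea, the sketch below shows what "longest open reading frame of the tiled exons" can mean in the simplest case. It is not MAKER's implementation: it assumes a forward-strand transcript, ATG start codons, the three standard stop codons, and complete ORFs only. The names `splice` and `longest_orf` are hypothetical.

```python
STOPS = {"TAA", "TAG", "TGA"}


def splice(genome, exons):
    """Concatenate 1-based inclusive exon intervals (forward strand only)."""
    return "".join(genome[start - 1:end] for start, end in sorted(exons))


def longest_orf(seq):
    """Return (start, end), 0-based half-open, of the longest ATG..stop ORF."""
    seq = seq.upper()
    best = None
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon in this open stretch
            elif start is not None and codon in STOPS:
                if best is None or (i + 3 - start) > (best[1] - best[0]):
                    best = (start, i + 3)  # include the stop codon
                start = None
    return best
```

A reverse-strand feature such as gnomon_20414_mRNA would need the spliced sequence reverse-complemented before the search, which is left out here for brevity.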