
Issue with coordinates in GFF output


Alyssa Proia

Sep 18, 2024, 11:15:07 PM
to apollo
Has anyone had issues with the GFF pulled from Apollo where sub-feature coordinates extend beyond parent features (e.g., an exon extends beyond its mRNA), even though the visualization of the features is correct?  I have several instances where the sub-feature coordinates are correct, but the parent feature coordinates are not.  Please see the attached picture for an example; the coordinates are highlighted.

Each issue has been with a user-created/edited track. Thanks!


Apollo_coordinate_issue.jpg

Monica Poelchau

Sep 19, 2024, 11:58:02 AM
to Alyssa Proia, apollo

 

Yes, we’ve seen this. Our gff3toolkit software can detect and fix these errors. https://github.com/nal-i5k/gff3toolkit

 

You first need to run gff3_QC, which writes an output file listing all the errors in the gff3 file. To have gff3_fix repair just the mismatched feature boundaries, create a new error file containing 1) the header from the error file and 2) all the lines with error code Ema0003 (see https://github.com/NAL-i5K/GFF3toolkit/blob/master/docs/Detection-of-GFF3-format-errors.rst for all the error codes). gff3_fix will then update the parent's coordinates to the minimum and maximum coordinates of its child features.
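To make the min/max rule concrete, here is a minimal awk sketch of the idea. It is not gff3_fix's actual code, only an illustration. It assumes a tab-separated GFF3 file in which each line carries one ID= and at most one Parent= attribute, and the helper name fit_parents is made up for this example.

```shell
# Sketch only: illustrates the min/max rule, NOT the actual gff3_fix code.
# Assumes tab-separated GFF3 with one ID= and at most one Parent= per line.
# fit_parents is a hypothetical helper name.
fit_parents() {
  awk -F'\t' -v OFS='\t' '
    # Return the value of attribute "key" from column 9, or "" if absent.
    function attr(key,   n, i, parts) {
      n = split($9, parts, ";")
      for (i = 1; i <= n; i++)
        if (index(parts[i], key "=") == 1)
          return substr(parts[i], length(key) + 2)
      return ""
    }
    # Comments and non-feature lines: skip in pass 1, echo in pass 2.
    /^#/ || NF < 9 { if (NR != FNR) print; next }
    # Pass 1: track the smallest start and largest end among the children
    # of each parent ID.
    NR == FNR {
      p = attr("Parent")
      if (p != "") {
        if (!(p in lo) || $4 + 0 < lo[p] + 0) lo[p] = $4
        if (!(p in hi) || $5 + 0 > hi[p] + 0) hi[p] = $5
      }
      next
    }
    # Pass 2: overwrite the coordinates of every feature that has children.
    {
      id = attr("ID")
      if (id in lo) { $4 = lo[id]; $5 = hi[id] }
      print
    }
  ' "$1" "$1"
}
```

Usage would look like `fit_parents example.gff3 > resized.gff3` (file names are placeholders). Each run adjusts one level of the hierarchy from the coordinates as they stood in the input, so a gene sitting above a corrected mRNA would need a second run. For real data the tools in this thread are the safer choice.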

 

e.g. 

#detect errors

gff3_QC -g example_file/example.gff3 -f example_file/reference.fa -o error.txt -s statistic.txt

 

#create file to fix just Ema0003 errors (although you can have it fix more types of errors if you want)

head -1 error.txt > errors-for-gff3fix.txt

grep Ema0003 error.txt >> errors-for-gff3fix.txt
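If you do want more than one error type in the file, the two commands above generalize to a small helper. This is only a sketch: select_errors is a made-up name, it assumes (as described above) that the first line of the gff3_QC report is the header, and it does not know which codes gff3_fix can actually repair, so check the error-code documentation first.

```shell
# Sketch only: select_errors is a hypothetical helper, not part of gff3toolkit.
# Usage: select_errors REPORT CODE [CODE...]
# Prints the header line of REPORT plus every line mentioning one of the codes.
select_errors() {
  [ "$#" -ge 2 ] || return 1
  report=$1
  shift
  # Join the codes into one alternation pattern, e.g. "CODE1|CODE2".
  pattern=$(printf '%s|' "$@")
  pattern=${pattern%|}
  head -n 1 "$report"
  # Search from line 2 onward so the header is never printed twice.
  tail -n +2 "$report" | grep -E "$pattern" || true
}
```

For example, `select_errors error.txt Ema0003 > errors-for-gff3fix.txt` reproduces the two commands above, and further codes can be appended as extra arguments.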

 

#run gff3 fix

gff3_fix -qc_r errors-for-gff3fix.txt -g example_file/example.gff3 -og corrected.gff3

 

#I like to run gff3_QC again on the output as a sanity check

gff3_QC -g corrected.gff3 -f example_file/reference.fa -o corrected-error.txt -s corrected-statistic.txt
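As a second, tool-independent sanity check, something like the awk sketch below lists any child feature that still falls outside its parent. It is not part of gff3toolkit. It assumes a tab-separated GFF3 file with one Parent= per child line, and children_outside_parent is a made-up name.

```shell
# Sketch only: an independent re-check, not part of gff3toolkit.
# Assumes tab-separated GFF3 with at most one Parent= per line.
# children_outside_parent is a hypothetical helper name.
children_outside_parent() {
  awk -F'\t' '
    # Return the value of attribute "key" from column 9, or "" if absent.
    function attr(key,   n, i, parts) {
      n = split($9, parts, ";")
      for (i = 1; i <= n; i++)
        if (index(parts[i], key "=") == 1)
          return substr(parts[i], length(key) + 2)
      return ""
    }
    /^#/ || NF < 9 { next }
    # Pass 1: remember start and end of every feature that has an ID.
    NR == FNR {
      id = attr("ID")
      if (id != "") { s[id] = $4; e[id] = $5 }
      next
    }
    # Pass 2: report children that start before or end after their parent.
    {
      p = attr("Parent")
      if ((p in s) && ($4 + 0 < s[p] + 0 || $5 + 0 > e[p] + 0))
        print attr("ID"), $3, $4 "-" $5, "outside", p, s[p] "-" e[p]
    }
  ' "$1" "$1"
}
```

Running `children_outside_parent corrected.gff3` should print nothing if every child now sits inside its parent. Each reported line gives the child ID, its type and range, then the parent ID and its range.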

 

Hth, let me know if you run into any snags.

 

Monica 



Alyssa Proia

Sep 23, 2024, 9:54:29 AM
to apollo, Monica Poelchau, apollo, Alyssa Proia
Hi Monica, 

Thank you so much!  That worked perfectly!

Alyssa

Jacques DAINAT

Sep 26, 2024, 4:57:18 AM
to Alyssa Proia, apollo
Hi,

Another solution is to run agat_convert_sp_gxf2gxf.pl from AGAT (https://github.com/NBISweden/AGAT).
It detects and fixes such cases automatically.

agat_convert_sp_gxf2gxf.pl --gff infile.gff -o infile_fixed.gff
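Whichever tool you use, it can be worth checking which features actually had their coordinates changed. The awk sketch below compares two GFF3 files by ID. It is not part of AGAT or gff3toolkit, and coord_changes is a made-up name. It assumes tab-separated GFF3 and matches on the ID attribute only, so features whose IDs were added or renamed by the tool are not compared, and features that share one ID across several lines (such as a multi-segment CDS) may be misreported.

```shell
# Sketch only: not part of AGAT or gff3toolkit.
# Usage: coord_changes BEFORE.gff AFTER.gff
# coord_changes is a hypothetical helper name; features are matched by ID only.
coord_changes() {
  awk -F'\t' '
    # Return the value of attribute "key" from column 9, or "" if absent.
    function attr(key,   n, i, parts) {
      n = split($9, parts, ";")
      for (i = 1; i <= n; i++)
        if (index(parts[i], key "=") == 1)
          return substr(parts[i], length(key) + 2)
      return ""
    }
    /^#/ || NF < 9 { next }
    # First file: remember the original range of every feature with an ID.
    NR == FNR {
      id = attr("ID")
      if (id != "") before[id] = $4 "-" $5
      next
    }
    # Second file: print features whose range differs from the original.
    {
      id = attr("ID")
      cur = $4 "-" $5
      if ((id in before) && before[id] != cur)
        print id, $3, before[id], "->", cur
    }
  ' "$1" "$2"
}
```

For example, `coord_changes infile.gff infile_fixed.gff` prints one line per changed feature: its ID, type, old range, and new range.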

Best regards,

/Jacques

