[maker-devel] annotation comparison aed plots


Robert King (RRes-Roth)

Mar 10, 2014, 8:17:07 AM
to maker...@yandell-lab.org

Dear Maker Developers, 

 

I’ve updated a reference that had errors and was a little incomplete, and I am now trying to produce an annotation for it. Please note the reference has not changed dramatically. I’ve produced two annotations using the following evidence:

 

Annotation 1:

Uniprot proteins search using species keyword “fusarium”

Pubmed mRNA for the name of the organism

Prior annotation reference transcripts

 

Annotation 2:

Uniprot proteins search using species keyword “fusarium”

Pubmed mRNA for the name of the organism

Prior annotation reference transcripts

mRNA Trinity assembly (PASAfly) of a different strain (only RNA-seq available)

 

I’m not sure whether it was a smart move to use the prior annotation reference transcripts.

 

I want to compare these two annotations and have produced AED scores. How do I generate summary stats/figures to compare annotations? You mentioned in a post last year that Mike Campbell has a script to produce these; do you know if he will post it? I’ve got the Eval program and converted the annotations to GTF format using the provided script. I’m waiting on some Perl modules to be installed by our administrator before I can test it, along with the “Evaluator” and “compare” programs. What do those two programs do?

 

Best Wishes

Rob



Carson Holt

Mar 10, 2014, 12:25:50 PM
to Robert King (RRes-Roth), maker...@yandell-lab.org
I don’t know about Michael’s script, but I’ve always used eval.  It produces sensitivity/specificity metrics.  It assumes the first set of models is 100% correct, and then tells you the sensitivity/specificity values for the second set.

It is therefore not a quality metric.  Instead you should view it as a change metric. Lower sensitivity tells you that models/exons have been lost between versions, and lower specificity tells you that models/exons have been gained.  There will also be a lot of generic statistics on exon/intron distribution and UTR length.  Then the AED values from the MAKER run can be used independently to evaluate how well models match the evidence.
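To make the change-metric reading concrete, here is a minimal Python sketch. It is not eval's implementation. It assumes exons are compared by exact coordinates, and it assumes the gene-prediction convention in which specificity is the fraction of the second set's exons that also occur in the first set. The function name and the example coordinates are made up for illustration.

```python
def exon_change_metrics(reference_exons, new_exons):
    """Compare two exon sets, treating the first set as 100% correct.

    Each exon is a (seqid, start, end, strand) tuple; only exact matches
    count as shared (an assumption made for this sketch).
    """
    reference = set(reference_exons)
    new = set(new_exons)
    shared = reference & new
    # Sensitivity drops when reference exons are missing from the new set.
    sensitivity = len(shared) / len(reference) if reference else 0.0
    # Specificity drops when the new set contains exons the reference lacks.
    specificity = len(shared) / len(new) if new else 0.0
    return sensitivity, specificity


# Hypothetical data: one exon lost, two exons gained between versions.
old_exons = {
    ("contig1", 100, 200, "+"),
    ("contig1", 300, 400, "+"),
    ("contig1", 500, 600, "+"),
    ("contig1", 700, 800, "+"),
}
new_exons = {
    ("contig1", 100, 200, "+"),
    ("contig1", 300, 400, "+"),
    ("contig1", 500, 600, "+"),
    ("contig1", 900, 1000, "+"),
    ("contig1", 1100, 1200, "+"),
}

sensitivity, specificity = exon_change_metrics(old_exons, new_exons)
print(f"sensitivity={sensitivity:.2f} specificity={specificity:.2f}")
```

In this toy case the lost exon lowers sensitivity to 0.75 and the two gained exons lower specificity to 0.60, and neither number says which version is better.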

—Carson


_______________________________________________ maker-devel mailing list maker...@box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

Michael Campbell

Mar 10, 2014, 11:50:53 AM
to Robert King (RRes-Roth), maker...@yandell-lab.org
One more point. The sensitivity, specificity, and accuracy produced by the compare_annotations_3.2.pl script are gene level, and overlap between annotation sets is defined very liberally: at least one nucleotide of exon overlap.
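As a rough illustration of that liberal overlap rule, here is a Python sketch. It is not taken from compare_annotations_3.2.pl. It assumes all genes sit on the same sequence and strand, represents each gene as a list of inclusive (start, end) exon intervals, and treats the first set as the reference. Accuracy is left out because its definition is not given here. All names and coordinates are hypothetical.

```python
def exons_overlap(a, b):
    """True if two inclusive (start, end) intervals share at least one base."""
    return a[0] <= b[1] and b[0] <= a[1]


def genes_overlap(gene_a, gene_b):
    """True if any exon of gene_a overlaps any exon of gene_b."""
    return any(exons_overlap(x, y) for x in gene_a for y in gene_b)


def gene_level_metrics(reference_genes, new_genes):
    """Gene-level sensitivity and specificity under the one-base overlap rule."""
    ref_hits = sum(
        any(genes_overlap(r, n) for n in new_genes) for r in reference_genes
    )
    new_hits = sum(
        any(genes_overlap(n, r) for r in reference_genes) for n in new_genes
    )
    sensitivity = ref_hits / len(reference_genes) if reference_genes else 0.0
    specificity = new_hits / len(new_genes) if new_genes else 0.0
    return sensitivity, specificity


# Hypothetical genes; each gene is a list of exon intervals.
reference_genes = [[(100, 200), (300, 400)], [(1000, 1200)]]
new_genes = [
    [(150, 180)],    # overlaps an exon of the first reference gene
    [(220, 280)],    # falls inside an intron, so it does not count
    [(5000, 5100)],  # no reference gene nearby
]

gene_sn, gene_sp = gene_level_metrics(reference_genes, new_genes)
print(f"gene sensitivity={gene_sn:.2f} gene specificity={gene_sp:.2f}")
```

Because a single shared base is enough, a new model with very different exon structure still counts as a match, so these gene-level numbers tend to look more favourable than exon-level ones.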
Mike


--
Michael Campbell MS, RD.
Doctoral Candidate
Eccles Institute of Human Genetics
University of Utah
15 North 2030 East, Room 2100
Salt Lake City, UT 84112-5330
ph:585-3543





Michael Campbell

Mar 10, 2014, 11:47:50 AM
to Robert King (RRes-Roth), maker...@yandell-lab.org
Hi Robert,

Here are the scripts that were mentioned before.

The AED_cdf_generator.pl script is for making cumulative distribution function plots based on annotation edit distance. This script is quite simple and straightforward in its internals.
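For readers who want to see what such a cumulative distribution involves, here is a minimal Python sketch. It is not the AED_cdf_generator.pl script. It assumes MAKER-style GFF3 in which each mRNA feature carries an `_AED=` attribute in column 9. The function names, the bin count, and the sample lines are illustrative assumptions.

```python
import re

# Match "_AED=" only at the start of the attribute column or right after ";",
# so that the separate "_eAED=" attribute is not picked up by mistake.
AED_PATTERN = re.compile(r"(?:^|;)_AED=([0-9.]+)")


def read_aed_values(gff3_lines):
    """Collect the AED value of every mRNA feature in GFF3 text lines."""
    values = []
    for line in gff3_lines:
        if line.startswith("#"):
            continue
        columns = line.rstrip("\n").split("\t")
        if len(columns) != 9 or columns[2] != "mRNA":
            continue
        match = AED_PATTERN.search(columns[8])
        if match:
            values.append(float(match.group(1)))
    return values


def aed_cdf(values, bins=4):
    """Return (threshold, fraction of transcripts with AED <= threshold) pairs."""
    if not values:
        return []
    thresholds = [i / bins for i in range(bins + 1)]
    return [(t, sum(v <= t for v in values) / len(values)) for t in thresholds]


# Hypothetical GFF3 lines standing in for one annotation run.
sample_gff3 = [
    "##gff-version 3",
    "contig1\tmaker\tgene\t100\t900\t.\t+\t.\tID=g1",
    "contig1\tmaker\tmRNA\t100\t900\t.\t+\t.\tID=g1-mRNA-1;Parent=g1;_AED=0.10;_eAED=0.10",
    "contig1\tmaker\tmRNA\t1500\t2400\t.\t-\t.\tID=g2-mRNA-1;Parent=g2;_AED=0.30;_eAED=0.32",
    "contig1\tmaker\tmRNA\t3000\t3900\t.\t+\t.\tID=g3-mRNA-1;Parent=g3;_AED=0.50;_eAED=0.50",
    "contig1\tmaker\tmRNA\t5000\t5600\t.\t+\t.\tID=g4-mRNA-1;Parent=g4;_AED=1.00;_eAED=1.00",
]

aed_values = read_aed_values(sample_gff3)
cdf = aed_cdf(aed_values)
for threshold, fraction in cdf:
    print(f"AED <= {threshold:.2f}: {fraction:.2f}")
```

Computing one such curve per annotation and plotting them on the same axes gives the comparison: a curve that rises faster at low AED means more transcripts agree closely with the evidence.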

The compare_annotations_3.2.pl script is for generating summary stats for annotations and will compare two annotations of the same assembly. 

You can run either script without arguments to get a usage statement.

Thanks,
Mike

AED_cdf_generator.pl
compare_annotations_3.2.pl