[maker-devel] associating gene names between related strains


Janna Fierst

Mar 14, 2014, 12:06:26 PM
to maker...@yandell-lab.org
Hi,

We are assembling and annotating genomes for several related strains of Caenorhabditis worms, and I was wondering if there is a way to coordinate the gene naming so that orthologs between species can be associated by name. I have been playing around a little with the est_forward option but can't figure out a good workflow that preserves names while still using the strain-specific RNA-Seq EST set for the actual gene models. Thanks! -Janna

Daniel Ence

Mar 14, 2014, 1:32:02 PM
to Janna Fierst, maker...@yandell-lab.org
Hi Janna,

Do you have one strain that you want to use as the reference for all the others? MAKER comes with a script called maker_map_ids that lets you apply a common prefix or suffix to the entries in a FASTA file from one strain. You can then use est_forward to carry those IDs into the gene models for the other species.

Let me know if that's not what you're looking for, 
Daniel

Daniel Ence
Graduate Student
Eccles Institute of Human Genetics
University of Utah
15 North 2030 East, Room 2100
Salt Lake City, UT 84112-5330


Janna Fierst

Mar 14, 2014, 2:01:16 PM
to Daniel Ence, maker...@yandell-lab.org
I will try it today. Thanks for the quick reply!

Carson Holt

Mar 14, 2014, 2:02:48 PM
to Daniel Ence, Janna Fierst, maker...@yandell-lab.org
maker_map_ids does a translation (e.g. change gene-A to smug1), so you need to know which genes you want to translate names to (two-column input file: column 1 is the original ID, column 2 is the new ID). I’m not sure est_forward is the best way to do this, although I do think maker_map_ids is the tool to use in the end. The question is how to build the list of IDs to translate as the input to maker_map_ids.

I would actually just use BLASTP against the reference strain and take reciprocal best BLAST hits. To do this, you BLAST your reference proteins against your MAKER proteins, then do the opposite and BLAST your MAKER proteins against your reference proteins. If two proteins are each other's best hit, you can treat them as orthologous and safely make a two-column entry for the maker_map_ids input (e.g. maker-gene-1 translates into smug1).

—Carson
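
The reciprocal-best-hit step described above could be sketched roughly as follows. This is only an illustration, not part of MAKER: it assumes both BLASTP runs were saved in tabular format (-outfmt 6, with the bit score in column 12), it picks the "best" hit per query by highest bit score, and the function names are made up for this example.

```python
import csv


def best_hits(blast_tab):
    """Return {query_id: subject_id}, keeping the highest-bitscore hit per query.

    blast_tab is any iterable of lines in BLAST tabular format (-outfmt 6):
    qseqid, sseqid, ..., with the bit score in the 12th column.
    """
    best = {}
    for row in csv.reader(blast_tab, delimiter="\t"):
        if len(row) < 12 or row[0].startswith("#"):
            continue  # skip blank, comment, or truncated lines
        query, subject, bitscore = row[0], row[1], float(row[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(maker_vs_ref, ref_vs_maker):
    """Return sorted (maker_id, reference_id) pairs that are each other's best hit."""
    forward = best_hits(maker_vs_ref)   # MAKER protein -> best reference protein
    reverse = best_hits(ref_vs_maker)   # reference protein -> best MAKER protein
    return sorted(
        (maker_id, ref_id)
        for maker_id, ref_id in forward.items()
        if reverse.get(ref_id) == maker_id
    )


def write_id_map(pairs, out):
    """Write the two-column map: original ID, tab, new ID."""
    for maker_id, ref_id in pairs:
        out.write(f"{maker_id}\t{ref_id}\n")
```

To use it, open the two BLAST result files, pass the file handles to reciprocal_best_hits, and write the pairs to a map file with write_id_map. Note that the IDs in the BLAST output are whatever appears in the protein FASTA headers, so the first column of the map will contain those IDs.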



Janna Fierst

Mar 21, 2014, 11:43:59 AM
to Carson Holt, maker...@yandell-lab.org
Hi,

I just wanted to say thanks for all your help. I did the reciprocal best BLAST hits and then used the MAKER scripts (map_fasta_ids, map_gff_ids) to associate names between strain assemblies/annotations. Worked perfectly! -Janna
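
For readers who want to see what the renaming step amounts to, here is a minimal sketch of applying a two-column ID map to FASTA headers. It is not the code of map_fasta_ids; it only illustrates the idea, and the function names are hypothetical. It assumes the sequence ID is the first whitespace-delimited token of each header and leaves unmapped IDs unchanged.

```python
def load_id_map(lines):
    """Parse a two-column map (original ID, new ID) into a dict."""
    id_map = {}
    for line in lines:
        fields = line.split()
        if len(fields) >= 2:
            id_map[fields[0]] = fields[1]
    return id_map


def rename_fasta(lines, id_map):
    """Yield FASTA lines with header IDs replaced according to id_map.

    Only the first token after '>' is treated as the ID; the rest of the
    header and all sequence lines are passed through untouched.
    """
    for line in lines:
        if line.startswith(">"):
            header = line[1:].rstrip("\n")
            seq_id, _, rest = header.partition(" ")
            new_id = id_map.get(seq_id, seq_id)  # keep the ID if it is not in the map
            yield ">" + new_id + (" " + rest if rest else "") + "\n"
        else:
            yield line
```

The IDs in the map have to match the header IDs exactly, so a map built from protein BLAST results applies to the protein FASTA as-is.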

Carson Holt

Mar 21, 2014, 11:54:15 AM
to Janna Fierst, maker...@yandell-lab.org
I'm glad we could help.

--Carson