[maker-devel] maker gene order in gff output


Guohong Cai

May 29, 2014, 3:16:06 PM
to <maker-devel@yandell-lab.org>
Hi Carson,

In the MAKER output, genes have names like "genemark-scaffold17-processed-gene-0.0". Many users will probably want to rename them eventually, for example to GSGxxx (Genus Species Gene #).

In the GFF output, the scaffolds are not in order (neither numerical order nor the order of the input assembly), and the genes on a given scaffold are not listed in order either. This makes it a little harder for users to change the gene IDs. We would like to name the genes in order from scaffold 1 to scaffold N and, on each scaffold, from left to right (e.g. GSG00001, GSG00002). Could you order the genes in the GFF output? For example, order the scaffolds according to the input genome assembly and, on each scaffold, order the genes from 5' to 3'.

Thanks.

Guohong
Rutgers University

Carson Holt

May 29, 2014, 11:11:43 PM
to Guohong Cai, <maker-devel@yandell-lab.org>
The maker_map_ids script that comes with MAKER can be used to generate new names in the style PREFIX###### or PREFIX_######. You can use the --sort_order flag to sort the contigs in your preferred order before generating the new names.
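
If the preferred order is the order of the input assembly, the contig names can be pulled from the genome FASTA headers. The following is a minimal Python sketch, assuming the file given to --sort_order is a plain list of contig names, one per line (check the script's usage message for the format it actually expects). The function names contig_order and write_sort_order are hypothetical helpers, not part of MAKER.

```python
import io


def contig_order(fasta_handle):
    """Return sequence names in the order they appear in a FASTA stream."""
    names = []
    for line in fasta_handle:
        if line.startswith(">"):
            # The name is the first whitespace-delimited token after '>'.
            fields = line[1:].split()
            if fields:
                names.append(fields[0])
    return names


def write_sort_order(fasta_path, out_path):
    """Write one contig name per line (assumed --sort_order file format)."""
    with open(fasta_path) as fasta, open(out_path, "w") as out:
        for name in contig_order(fasta):
            out.write(name + "\n")


# Small in-memory demo; the sequences are made up.
demo = io.StringIO(">scaffold2 len=10\nACGT\n>scaffold17\nGGCC\n>scaffold1\nTTAA\n")
demo_order = contig_order(demo)
print(demo_order)
```

The demo keeps the scaffolds in file order rather than numerical order, which is what "order of the input assembly" would mean here.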

Then use map_gff_ids and map_fasta_ids to change the names in the GFF3 and FASTA files, respectively.
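
To make the idea of an ordered ID map concrete, here is a rough Python sketch of the renaming described in the question: gene features are sorted by contig rank and then by start coordinate, and each gets a sequential name. This only illustrates the concept and is not how maker_map_ids is implemented; build_id_map, its parameters, and the sample GFF3 lines are all made up for the example.

```python
def build_id_map(gff_lines, contig_order, prefix="GSG", width=5):
    """Map old gene IDs to sequential names, ordered by contig then start."""
    rank = {name: i for i, name in enumerate(contig_order)}
    genes = []
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "gene":
            continue  # only top-level gene features get a new name here
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        if "ID" not in attrs:
            continue
        # Contigs missing from contig_order sort last, by name.
        genes.append((rank.get(cols[0], len(rank)), cols[0], int(cols[3]), attrs["ID"]))
    genes.sort()
    return {
        old_id: f"{prefix}{n:0{width}d}"
        for n, (_, _, _, old_id) in enumerate(genes, start=1)
    }


# Made-up GFF3 lines, deliberately out of order.
demo_gff = [
    "##gff-version 3\n",
    "scaffold17\tmaker\tgene\t500\t900\t.\t+\t.\tID=geneB;Name=geneB\n",
    "scaffold1\tmaker\tgene\t200\t400\t.\t-\t.\tID=geneC\n",
    "scaffold17\tmaker\tgene\t100\t300\t.\t+\t.\tID=geneA\n",
    "scaffold17\tmaker\tmRNA\t100\t300\t.\t+\t.\tID=geneA-mRNA-1;Parent=geneA\n",
]
demo_map = build_id_map(demo_gff, ["scaffold1", "scaffold17"])
print(demo_map)
```

Note that the sketch orders genes by start coordinate on the scaffold's forward strand, regardless of which strand the gene is on, and it does not rename transcripts or rewrite the GFF3 itself.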

Here is some extra information from a tutorial where the maker_map_ids script is used --> http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/MAKER_Tutorial_for_GMOD_Online_Training_2014#Post_Processing_of_Annotations

--Carson



Guohong Cai

May 30, 2014, 7:40:34 AM
to Carson Holt, <maker-devel@yandell-lab.org>
Great!

Guohong