Origin of NemaGene contigs


gary.w...@wormbase.org

May 22, 2014, 6:42:15 AM
to nematode...@googlegroups.com
I'm trying to determine if the NemaGene contigs are from EST sequences that are all publicly available in the INSDC.

On the NemaGene Cluster Data Download page
http://nematode.net/NN3_frontpage.cgi?navbar_selection=nemagene&subnav_selection=cluster_data_ftp
there are HISTORY files which give the ESTs that contribute to a cluster, for example for M. incognita:


050621 meloidogyne_incognita.fasta.screen.ace.1: 
MI00001, Contig1, EST#=1, 9033909.c1;
###############################

MI00002, Contig2, EST#=1, ra06g05.y2;
###############################

MI00003, Contig3, EST#=1, ra11a02.y1;
###############################

MI00004, Contig4, EST#=1, ra14f03.y2;
###############################

MI00005, Contig5, EST#=1, ra16c12.y2;
etc.
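
For anyone who wants to pull the contributing read names out of a HISTORY file programmatically, here is a minimal Python sketch. It assumes every cluster line follows the `ID, ContigN, EST#=k, reads;` layout seen in the excerpt above. The function name `parse_history` is hypothetical, and the handling of clusters with more than one read (split on whitespace or commas) is a guess, since the excerpt only shows single-read clusters.

```python
import re

# Matches lines such as "MI00001, Contig1, EST#=1, 9033909.c1;"
# Layout inferred from the excerpt only; real files may differ.
CLUSTER_LINE = re.compile(
    r"^(?P<cluster>\w+),\s*(?P<contig>Contig\d+),\s*"
    r"EST#=(?P<count>\d+),\s*(?P<reads>[^;]*);"
)

def parse_history(text):
    """Return one dict per cluster line; header and '####' lines are skipped."""
    clusters = []
    for line in text.splitlines():
        match = CLUSTER_LINE.match(line.strip())
        if match is None:
            continue
        # Assumption: multiple reads are separated by whitespace and/or commas.
        reads = [r for r in re.split(r"[,\s]+", match.group("reads")) if r]
        clusters.append({
            "cluster": match.group("cluster"),
            "contig": match.group("contig"),
            "est_count": int(match.group("count")),
            "reads": reads,
        })
    return clusters

sample = """050621 meloidogyne_incognita.fasta.screen.ace.1:
MI00001, Contig1, EST#=1, 9033909.c1;
###############################

MI00002, Contig2, EST#=1, ra06g05.y2;
"""

for record in parse_history(sample):
    print(record["cluster"], record["reads"])
```
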

Most of the EST clone IDs (ra06g05.y2, ra11a02.y1, ra14f03.y2, ra16c12.y2 etc.) are easily identifiable as ESTs in the INSDC.

However I cannot identify the ID '9033909.c1' on the second line.

Is this an ID of a 454 sequence?
Which public database is this sequence held in?

Thanks,
Gary Williams
WormBase

gary.w...@wormbase.org

May 22, 2014, 9:47:23 AM
to nematode...@googlegroups.com
I have also found EST IDs in H. contortus that are not in the INSDC. They have IDs like:
27539529.y1
10122154.y1
27539257.y1
10122189.y1
10122242.y1
15498852.y1


Are these also 454 sequence IDs?
In which public database have they been placed?

Thanks,
Gary

John Martin

Jun 20, 2014, 6:58:17 PM
to nematode...@googlegroups.com
I believe that in all those cases (and in general when you see an EST named in that format), the numeric part is a GenBank gi number. Those are cases where we included external cDNA data among the reads we assembled.

If you just grab the numeric portion of those read names, they will show up in GenBank's nucleotide division. They are just reads that were submitted to GenBank by sources other than WashU. I believe they would all be ESTs produced by Sanger sequencing, although if it's important you should check their annotation in GenBank to be sure about that.
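
As a rough illustration of "grab the numeric portion", here is a minimal Python sketch. It assumes the gi-style names always look like digits, a dot, one letter, and a number (as in `9033909.c1` or `27539529.y1`), which is inferred only from the examples in this thread. The function name `gi_from_read_name` is hypothetical. Names that do not fit the pattern, such as `ra06g05.y2`, are treated as ordinary clone IDs.

```python
import re

# Assumed pattern for gi-style read names: "<digits>.<letter><digits>".
# Inferred from the IDs quoted in this thread; not an official naming spec.
GI_STYLE = re.compile(r"^(\d+)\.[A-Za-z]\d+$")

def gi_from_read_name(name):
    """Return the numeric portion of a gi-style read name, or None otherwise."""
    match = GI_STYLE.match(name.strip())
    return match.group(1) if match else None

reads = ["9033909.c1", "ra06g05.y2", "27539529.y1", "10122154.y1"]

# Keep only the reads whose names look like GenBank gi numbers.
candidate_gis = [gi for gi in map(gi_from_read_name, reads) if gi is not None]
print(candidate_gis)
```

Each number collected this way can then be searched in GenBank as described above; checking the record's annotation is still the way to confirm it really is a Sanger EST.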

I apologize for not seeing your question sooner.


John Martin