Finding Gene DNA sequence

Iffy

Nov 12, 2008, 11:14:45 AM
to Group-4-Bioinformatics
Hi,

I have searched for a gene's DNA sequence at NCBI and Ensembl, but I
am not sure whether I got the right sequence, because the sequence in
Ensembl is different from the one in NCBI. At NCBI, after entering the
gene name, I get two options for the sequence, FASTA and GenBank, and
those also look different. How can I find a gene's DNA sequence with
only the coding part?

Can anyone explain this?

Fred

Nov 13, 2008, 2:54:26 AM
to Group-4-Bioinformatics
Hello Iffy,

NCBI and Ensembl are two good places to look for gene information.

Suppose you are looking for a given gene, let's say ANKS1A.
At NCBI (http://www.ncbi.nlm.nih.gov/sites/entrez?db=gene&cmd=Retrieve&dopt=Graphics&list_uids=23294),
the Entrez Gene card gives you one sequence, which is the only
transcript listed in RefSeq for this gene (NM_015245).

At Ensembl (http://www.ensembl.org/Homo_sapiens/geneview?gene=ENSG00000064999),
the Ensembl card gives you two sequences, which are the two
transcripts listed in Ensembl for this gene (ENST00000360359 and
ENST00000373990).

Actually, the RefSeq ID (NM_015245) and the Ensembl ID
(ENST00000360359) refer to the same transcript, and it seems that
Ensembl has found an additional transcript associated with this gene
(ENST00000373990).

So it is quite normal to find some differences between Ensembl and
NCBI when you are looking for the transcripts of a given gene.
Sometimes Ensembl lists more transcripts for a gene, and sometimes
NCBI does.

Concerning the FASTA and GenBank options offered by NCBI: these are
just two different formats for the same record. The difference is the
amount of information that accompanies the sequence. In FASTA format
you only have a one-line description header followed by the sequence.
GenBank format carries more information about the sequence. For more
details on sequence formats, you can check this webpage
(http://www.genomatix.de/online_help/help/sequence_formats.html).
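
To make the "header line followed by the sequence" layout concrete, here is a minimal sketch of reading FASTA text in Python using only the standard library. The function name parse_fasta and the example record are made up for illustration; the sequence is a placeholder, not a real database entry.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A '>' line starts a new record; store the previous one first.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence lines are wrapped, so join them back together.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Hypothetical record: the ID and the bases are placeholders.
example = """>example_id made-up description (placeholder, not a real record)
ATGGCCAAGT
TTGA
"""

for header, sequence in parse_fasta(example):
    print(header.split()[0], len(sequence))
```

A GenBank record for the same sequence would not fit this simple reader, since it adds annotation sections around the sequence; a dedicated parser is the safer choice there.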

Hope this helps,

Fred

Saneth Samarakoon

Nov 13, 2008, 5:27:11 AM
to group4bioi...@googlegroups.com

Hi Iffy,

When we search NCBI with a gene name after selecting the "Nucleotide" option in the dropdown menu of the NCBI search toolbar, we receive a long list of records as results. This makes it hard to pick the correct gene sequence. I faced the same problem, and most of the time I use the following method to deal with it.

        1. Select the "Gene" option in the dropdown menu of the NCBI search toolbar, type the gene symbol and perform the search
            (It is quite rare for the same human gene to appear twice in the result list)

        2. Select the gene of interest; this will take you to the relevant entry in the Gene database.

        3. Confirm that it is the right gene using the links in the Summary category

        4. Scroll down to the NCBI Reference Sequences (RefSeq) category and

                     a. if you need mRNA and protein isoform sequences, use the links in the RefSeqs maintained independently of Annotated Genomes => mRNA and Protein(s) group
                         (prefix NM_ is for mRNAs and NP_ is for proteins)

                     b. if you need genomic sequences, use the links in the RefSeqs of Annotated Genomes => Reference assembly => Genomic group
                         (prefix NC_ is for chromosomes and NT_ for contigs)
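
The prefixes in step 4 can also be checked in a script. Below is a small Python sketch that covers only the four prefixes listed above; the names REFSEQ_PREFIXES and classify_refseq are made up for illustration, and RefSeq uses other prefixes that this table does not handle.

```python
# Only the prefixes mentioned in the steps above; anything else is "unknown".
REFSEQ_PREFIXES = {
    "NM_": "mRNA",
    "NP_": "protein",
    "NC_": "chromosome (genomic)",
    "NT_": "contig (genomic)",
}


def classify_refseq(accession):
    """Return the molecule type implied by a RefSeq accession prefix."""
    prefix = accession.strip().upper()[:3]
    return REFSEQ_PREFIXES.get(prefix, "unknown")


print(classify_refseq("NM_015245"))
print(classify_refseq("ENST00000360359"))
```

An Ensembl transcript ID such as ENST00000360359 comes back as "unknown" here, which is expected: it is not a RefSeq accession.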

This method works most of the time for me.
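
Once you have an accession from step 4, the same record can also be requested in a script through NCBI's E-utilities (efetch). The sketch below only builds the request URLs and does not contact NCBI. The helper name efetch_url is made up, the accession is the one mentioned earlier in this thread, and the "fasta_cds_na" return type (coding regions only) is my understanding of the E-utilities documentation, so please verify it there before relying on it.

```python
from urllib.parse import urlencode

EFETCH_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(accession, rettype="fasta"):
    """Build an efetch URL for a nucleotide record; no request is sent."""
    params = {
        "db": "nuccore",     # nucleotide database
        "id": accession,
        "rettype": rettype,  # "fasta" or "gb"; "fasta_cds_na" is an assumption to verify
        "retmode": "text",
    }
    return EFETCH_BASE + "?" + urlencode(params)


for rettype in ("fasta", "gb", "fasta_cds_na"):
    print(efetch_url("NM_015245", rettype))
```

Opening the printed URLs in a browser should show the FASTA and GenBank views of the same record, which makes the difference between the two formats easy to see side by side.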

Anyway, if you can't find the correct link to the Gene database, you can use the NCBI Map Viewer: enter the gene symbol and chromosome number and select the reference assembly.

In the Genes_Seq map, click on the gene symbol and use the links that appear in the pop-up menu.

Good luck
Pubudu