Looking for a webservice that returns the nucleotide sequence of a human genome region


Fred

Nov 12, 2008, 8:57:52 AM
to Group-4-Bioinformatics
Dear all,

I am looking for a webservice that returns the nucleotide
sequence of a given human genome region.
Ideally, I would only have to supply the following
kind of parameters:
- The chromosome number (e.g. : 2)
- The nucleotide start position (e.g. : 128 000)
- The nucleotide end position (e.g. : 128 700)
- The strand (e.g. : +1)

And then I would get the sequence in return.

I know that NCBI provides such a service (EFetch for
Sequence and other Molecular Biology Databases:
http://www.ncbi.nlm.nih.gov/entrez/query/static/efetchseq_help.html)
but it seems that I have to provide a chromosome contig ID instead of
a chromosome number.
Maybe it is possible to give a chromosome number, but the
documentation is, in my opinion, not very friendly.

And I would rather not download the entire human genome sequence
onto my computer.

Does anyone know how to easily perform such a task?

Thanks in advance for your suggestions,

Fred

ahmed essaghir

Nov 12, 2008, 9:12:21 AM
to group4bioi...@googlegroups.com
Hi!
I think the easiest way is to use the Ensembl exportview tool:
http://www.ensembl.org/Homo_sapiens/exportview?action=select;option=fasta;type1=region
There you can specify, for instance:
- chromosome : 2
- from base pair: 128000
- to base pair: 128700
- select export format
and you're done.
 
good luck!

Fred

Nov 12, 2008, 10:27:55 AM
to Group-4-Bioinformatics
Thanks Ahmed,

But I would like to run this kind of query dynamically from my PHP
scripts.
I will check whether this query is available through their webservice.

By the way, below is a solution provided by another
Group-4-Bioinformatics user (originally in French, translated here; the
referenced website is in English and quite nice).
##############
Hello Frédéric,

The RSAT package (http://rsat.scmbb.ulb.ac.be/rsat/), developed in
the lab where I work, provides the "retrieve-ensembl-seq" tool, which
does exactly this.

This option is not yet available through the website. However,
the SOAP Web Service interface does support this mode of operation.
In the left-hand menu, click on Web Services
to get information about our Web Services (about
thirty of them).

The tool connects to EnsEMBL to access the sequences, and supports
many different queries (masking of repeated regions, of
coding regions...).

Best regards,

Morgane
##############



ahmed essaghir

Nov 12, 2008, 10:36:43 AM
to group4bioi...@googlegroups.com
Hi!
I think you could do this dynamically: build the link below from variables.
For example, set the parameters (seq_region_name, anchor1 and anchor2) in this link and you'll get back a text file containing your sequence.
Let me know if it works.
http://www.ensembl.org/Homo_sapiens/exportview?seq_region_name=2&type1=bp&anchor1=128000&type2=bp&anchor2=128700&downstream=&upstream=&format=fasta&action=export&_format=Text&output=txt&submit=Continue+%3E%3E

good luck
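
Building that link from variables could look roughly like the Python sketch below. It is only an illustration: the helper name `build_exportview_url` is made up for this example, the query parameters are copied as-is from the link in the post, and this 2008-era exportview address may no longer be served by Ensembl.

```python
from urllib.parse import urlencode

# Base address copied from the link in the post; it may no longer be served.
EXPORTVIEW_URL = "http://www.ensembl.org/Homo_sapiens/exportview"


def build_exportview_url(chromosome, start, end):
    """Return the exportview link for one region (hypothetical helper)."""
    # Parameter names and fixed values are taken verbatim from the posted link.
    params = [
        ("seq_region_name", chromosome),
        ("type1", "bp"),
        ("anchor1", start),
        ("type2", "bp"),
        ("anchor2", end),
        ("downstream", ""),
        ("upstream", ""),
        ("format", "fasta"),
        ("action", "export"),
        ("_format", "Text"),
        ("output", "txt"),
        ("submit", "Continue >>"),
    ]
    # urlencode escapes "Continue >>" as "Continue+%3E%3E", as in the post.
    return EXPORTVIEW_URL + "?" + urlencode(params)
```

Calling `build_exportview_url(2, 128000, 128700)` rebuilds the link quoted in the post; only the three region values change from one query to the next.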



Fred

Nov 12, 2008, 10:41:42 AM
to Group-4-Bioinformatics
Actually, I just found out that what I was looking for is possible
with the NCBI EFetch service.

All the Homo sapiens chromosomes are listed in the NCBI nucleotide
database as reference assembly, complete sequence, with accession
numbers in the following format:
Chromosome 1 => NC_000001
|
|
Chromosome 11 => NC_000011
|
|
Chromosome Y => NC_000024
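
The accession pattern listed above could be turned into a small lookup, sketched here in Python. The function name `chromosome_to_accession` is hypothetical, and mapping chromosome X to NC_000023 is an assumption inferred from the pattern (the post itself only lists 1, 11 and Y).

```python
def chromosome_to_accession(chromosome):
    """Map a human chromosome name (1-22, X, Y) to its NC_ accession.

    Hypothetical helper following the pattern in the post:
    chromosome 1 -> NC_000001, chromosome 11 -> NC_000011, Y -> NC_000024.
    """
    name = str(chromosome).strip().upper()
    if name == "X":
        number = 23  # assumption: X sits between 22 and Y (24) in the series
    elif name == "Y":
        number = 24  # stated in the post
    elif name.isdigit() and 1 <= int(name) <= 22:
        number = int(name)
    else:
        raise ValueError(f"unsupported chromosome: {chromosome!r}")
    # Accessions are "NC_" followed by the number zero-padded to six digits.
    return f"NC_{number:06d}"
```

For the example region, `chromosome_to_accession(2)` gives `NC_000002`, which is the `id` value used in the EFetch query.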

So, to run the query for my example, I just need to construct this
URL:
http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=NC_000002&rettype=fasta&seq_start=128000&seq_stop=128700&strand=1

And that's all.
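
For anyone scripting this, constructing the EFetch URL could be sketched in Python as below. The names `build_efetch_url` and `fetch_fasta` are made up for this example, and the base address and parameters are copied from the URL above. One assumption beyond the post: as far as I understand the E-utilities documentation, the minus strand is requested with `strand=2`, so the sketch translates -1 to 2.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base address copied from the URL in the post.
EFETCH_URL = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession, start, end, strand=1):
    """Return an EFetch URL for one region of a sequence (hypothetical helper).

    strand is +1 or -1, as in the original question.
    """
    if strand not in (1, -1):
        raise ValueError("strand must be +1 or -1")
    params = [
        ("db", "nucleotide"),
        ("id", accession),
        ("rettype", "fasta"),
        ("seq_start", start),
        ("seq_stop", end),
        # Assumption: EFetch encodes plus as 1 and minus as 2.
        ("strand", 1 if strand == 1 else 2),
    ]
    return EFETCH_URL + "?" + urlencode(params)


def fetch_fasta(url, timeout=30):
    """Download the FASTA text behind a URL (performs a network request)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With these, `build_efetch_url("NC_000002", 128000, 128700)` reproduces the URL above, and passing the result to `fetch_fasta` would retrieve the sequence without downloading the whole genome.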

Hope this helps someone else in the future.

Fred

frabotta

Dec 9, 2008, 2:37:47 PM
to Group-4-Bioinformatics
Fred,

Also consider the UCSC Genome Browser (if you haven't already), since
you can explore your region of interest; instantly look at
evolutionary conservation; search for genomic elements such as
microsatellites, LINEs, SINEs; etc.

The Browser also performs in silico PCR and BLAT (BLAST-Like Alignment
Tool) searches, and lets you stay in the same genome region while
quickly switching the target taxon (Homo > Mus > Chimp, etc.).

http://genome.ucsc.edu/

Have fun,
Laurence