rRNA track

Bogdan Tanasa

Oct 7, 2015, 10:14:25 AM
to gen...@soe.ucsc.edu, Qi Ma, Sreejith Nair
Dear all,

Could you please advise on finding a good rRNA (5S, 5.8S, 28S) track in the Genome Browser? Many thanks,

-- Qi and Bogdan

Matthew Speir

Oct 8, 2015, 5:39:34 PM
to Bogdan Tanasa, gen...@soe.ucsc.edu, Qi Ma, Sreejith Nair
Hi Bogdan,

Thank you for your question about finding ribosomal RNA (rRNA) in the UCSC Genome Browser. How you identify rRNA in the Genome Browser depends on which assembly you are using, as some assemblies are better annotated than others. If you are looking at the human (hg19, hg38) or mouse (mm10) genomes, you can use the "GENCODE Gene Annotation" tracks to view rRNA. You can also use the Ensembl Genes track to view rRNA genes, as it is available for many more organisms.

To find these genes in the Browser, you can use the Table Browser to filter these tables for only the rRNA genes. In the following steps, I've used GENCODE Genes on hg38 as my example, but you should be able to modify these steps to use Ensembl Genes for a different organism.

1. Navigate to the Table Browser, http://genome.ucsc.edu/cgi-bin/hgTables.
2. Make the following selections:
    clade: Mammal
    genome: Human
    assembly: Dec. 2013 (GRCh38/hg38)
    group: Genes and Gene Predictions
    track: All GENCODE V22
    table: Basic (wgEncodeGencodeBasicV22)
    output: Hyperlinks to Genome Browser

3. Next to 'filter', click "create".
4. Under 'Linked Tables', check the box next to 'wgEncodeGencodeAttrsV22'.
5. Click 'allow filtering using fields in checked tables'.
6. Under 'hg38.wgEncodeGencodeAttrsV22 based filters', type 'rRNA' in the 'geneType' and 'transcriptType' fields.
        The "geneType" line should read: geneType does match rRNA
        The "transcriptType" line should read: transcriptType does match rRNA

7. Click 'submit'.
8. Click 'get output'.

You will now see a page full of links to rRNA genes in the Genome Browser. Note that some of these may be rRNA pseudogenes. You will need to click through to the Ensembl site to see more information about each gene.
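
For anyone who would rather script this filter than click through it, the join in steps 4-6 can be sketched offline in Python. This is only a minimal sketch under stated assumptions, not how the Table Browser itself works: it assumes the two tables have already been loaded as lists of dicts, that the field names (`name`, `chrom`, `txStart`, `txEnd`, `transcriptId`, `geneType`, `transcriptType`) match the table schemas for your GENCODE version, and that the filter behaves like exact string equality. The function name `rrna_transcripts` is hypothetical.

```python
def rrna_transcripts(basic_rows, attrs_rows, wanted="rRNA"):
    """Return (transcript, position) pairs for transcripts whose geneType
    and transcriptType both equal `wanted`.

    basic_rows: dicts with assumed keys name, chrom, txStart, txEnd
    attrs_rows: dicts with assumed keys transcriptId, geneType, transcriptType
    """
    # Collect the transcript IDs that pass both attribute filters.
    keep = {
        a["transcriptId"]
        for a in attrs_rows
        if a["geneType"] == wanted and a["transcriptType"] == wanted
    }
    hits = []
    for row in basic_rows:
        if row["name"] in keep:
            # UCSC tables store 0-based starts; browser positions are 1-based.
            start = int(row["txStart"]) + 1
            hits.append((row["name"], f'{row["chrom"]}:{start}-{row["txEnd"]}'))
    return hits


# Toy rows with made-up IDs and coordinates, for illustration only.
basic = [
    {"name": "T1", "chrom": "chr1", "txStart": "99", "txEnd": "200"},
    {"name": "T2", "chrom": "chr2", "txStart": "0", "txEnd": "50"},
]
attrs = [
    {"transcriptId": "T1", "geneType": "rRNA", "transcriptType": "rRNA"},
    {"transcriptId": "T2", "geneType": "protein_coding",
     "transcriptType": "protein_coding"},
]
print(rrna_transcripts(basic, attrs))
```

Each returned position string can be pasted into the Genome Browser position box, which is roughly what the hyperlink output gives you.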

I hope this is helpful. If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible Google Groups forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Matthew Speir
UCSC Genome Bioinformatics Group
--


Qi Ma

Oct 8, 2015, 7:15:23 PM
to Matthew Speir, Bogdan Tanasa, gen...@soe.ucsc.edu, Sreejith Nair
Dear Dr. Matthew,

Thank you so much for your reply. It is very helpful.

However, we are looking for the locations of rRNA on each chromosome in the hg18 assembly, where the sequence should include the components indicated in http://www.ncbi.nlm.nih.gov/nuccore/555853/. Could you give us any pointers on how to get that information? Many thanks.

Best,
Qi
--
Qi Ma, Postdoctoral Scholar
Department of Bioengineering & Department of Medicine,
University of California, San Diego (UCSD)
9500 Gilman Dr. #0419
La Jolla, CA 92093-0419
(858)5983866
q1...@ucsd.edu

Qi Ma

Oct 8, 2015, 7:16:08 PM
to Matthew Speir, Bogdan Tanasa, gen...@soe.ucsc.edu, Sreejith Nair
Attention Sree:

Could you also add your comments to this discussion and explain to Dr. Matthew in more detail what we are looking for?
Many thanks,

Best,
Qi

Nair, Sreejith

Oct 9, 2015, 11:23:51 AM
to Qi Ma, Matthew Speir, btanasa-forward, gen...@soe.ucsc.edu
Dear Dr. Speir,

Thanks for the explanation. One specific problem we face is distinguishing the rDNA sequences on different chromosomes. As you know, rDNA clusters (200-400 repeats of a 43 kb region) are distributed on the p-arms of five different acrocentric chromosomes in the human genome. The 43 kb region is assumed to be the same throughout the genome. However, there may be markers within the repeats, or in other parts of the p-arm of each chromosome, that would help us align sequences in a chromosome-specific manner. We need this information in order to align our various next-generation sequencing data to the rDNA repeat on the p-arm of chromosome 21.

We would greatly appreciate it if you could provide any insight into this matter.
Sincerely,

Sree Nair


Steve Heitner

Oct 14, 2015, 12:41:58 PM
to Nair, Sreejith; Qi Ma; Matthew Speir; btanasa-forward; gen...@soe.ucsc.edu

Hello, Sree.

We cannot provide assistance with sequence analysis, but we can suggest a way to align your region of interest on chr21 to other regions in the reference genome to see whether the alignments show informative differences between similar regions. Perform the following steps:

1. Get your coordinates of interest from a Table Browser query as previously outlined by my colleague Matt Speir

2. View this region in the Browser:

  2.1. Navigate to http://genome.ucsc.edu/cgi-bin/hgGateway

  2.2. Enter your assembly of choice and enter your coordinates in the “search term” box

  2.3. Click the “submit” button

3. In the blue navigation bar at the top of the screen, click “View/DNA”

4. Click the “get DNA” button

5. Copy the DNA sequence

6. Navigate to http://genome.ucsc.edu/cgi-bin/hgBlat

7. Paste the sequence into the text box (note that blat has a limit of 25,000 bases, so if your region is larger than this, you will need to trim the sequence – this can be done more easily by just viewing a smaller region in the Browser before obtaining the DNA sequence in steps 3-5)

8. Click the “submit” button
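
If the region from step 1 is longer than the 25,000-base limit mentioned in step 7, one way to prepare it is to cut the coordinates into smaller windows before fetching the DNA. The following is a minimal sketch, assuming 1-based inclusive browser coordinates; the function name `split_region` and the example coordinates are hypothetical placeholders, not real rDNA positions.

```python
def split_region(chrom, start, end, max_len=25_000):
    """Split a 1-based inclusive region into consecutive windows of at
    most max_len bases, returned as browser-style position strings."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    windows = []
    pos = start
    while pos <= end:
        # Each window ends max_len - 1 bases after its start, or at `end`.
        stop = min(pos + max_len - 1, end)
        windows.append(f"{chrom}:{pos}-{stop}")
        pos = stop + 1
    return windows


# Placeholder coordinates for a 43,000-base region.
for window in split_region("chr21", 1, 43_000):
    print(window)
```

Each window can then be viewed in the Browser and carried through steps 3-8 separately.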

You may also wish to contact the producers of the reference genome assembly (NCBI for hg18 and the Genome Reference Consortium for hg19 and hg38) to see if they have any comments about how the rDNA repetitive regions were handled and whether different sequences were assigned to different chromosomes in those assemblies.

---
Steve Heitner
UCSC Genome Bioinformatics Group

--

Galt Barber

Oct 14, 2015, 2:31:09 PM
to Nair, Sreejith; Qi Ma; btanasa-forward; gen...@soe.ucsc.edu

BLAT will have problems dealing with highly repetitive sequences. Not only are highly used tiles masked out, providing no seeds at those locations, but it also has a built-in limit of only 16 alignments returned per chromosome per strand.

Perhaps another aligner like Bowtie or BWA would work better.

Repetitive DNA and next-generation sequencing: computational challenges and solutions
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3324860/

-Galt

--


Angie Hinrichs

Oct 14, 2015, 3:49:38 PM
to Galt Barber; Nair, Sreejith; Qi Ma; btanasa-forward; gen...@soe.ucsc.edu
I believe Galt is referring to the alignment of highly repeated rRNA subunits, but the intention was actually to align larger regions of the reference assembly sequence.  The suitability of BLAT depends on the size of the region that you're searching in addition to the amount of repetitive content.  The tiles that Galt referred to are 11-base sequences that are overrepresented in the genome.  In addition, parts of the genome that are annotated by RepeatMasker or short Tandem Repeats are soft-masked so that alignments cannot begin there -- but alignments can extend through those tiles and regions from adjacent non-repetitive sequence.  
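
To make the idea of an overrepresented tile concrete, here is a small Python sketch that counts every k-base substring of a sequence and reports those seen at least a chosen number of times. It is only an illustration of the concept and is not BLAT's implementation: the sliding-window counting, the `min_count` threshold, and the function name `overrepresented_tiles` are all assumptions made for this example.

```python
from collections import Counter


def overrepresented_tiles(seq, k=11, min_count=3):
    """Return {tile: count} for every k-base substring of seq that occurs
    at least min_count times (overlapping occurrences are counted)."""
    seq = seq.upper()
    # Slide a window of width k along the sequence, one base at a time.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return {tile: n for tile, n in counts.items() if n >= min_count}


# Toy input with a short k so the result is easy to check by eye.
print(overrepresented_tiles("AAAAACGT", k=3, min_count=2))
```

A seed-based aligner that skips such tiles has nowhere to start an alignment inside a region made up entirely of them, which is why adjacent non-repetitive sequence matters.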

> rDNA clusters (200-400 repeats of 43kb region)

43kb is too long for online BLAT's 25kb limit, so that region could be split in half.  

I strongly suggest contacting the producers of the reference genome assembly as Steve said, to ask how the clusters were placed in the assembly.  Given the repetitive and highly polymorphic nature of these clusters (http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2134781/) it seems that those regions of the genome would be extremely difficult to assemble.   The discussion here might also be of interest: https://www.biostars.org/p/12325/ 

Angie

--


Nair, Sreejith

Oct 15, 2015, 11:11:48 AM
to Angie Hinrichs; Galt Barber; Qi Ma; btanasa-forward; gen...@soe.ucsc.edu
Thank you all for the illuminating discussion. We appreciate your input. We will try your suggestions and read more on this issue. One question: where can we find contact information for the producers of the reference genome assembly?

Sincerely,

Sree

Matthew Speir

Oct 15, 2015, 11:52:15 AM
to Nair, Sreejith; Angie Hinrichs; Galt Barber; Qi Ma; btanasa-forward; gen...@soe.ucsc.edu
Hi Sree,

The hg18/NCBI36 assembly of the human genome was produced by NCBI. You can contact the NCBI Help Desk at in...@ncbi.nlm.nih.gov for questions about this assembly.

Note that later assemblies, such as hg19/GRCh37 and hg38/GRCh38, were produced by the Genome Reference Consortium (GRC). You can contact the GRC by filling out this form.


Matthew Speir
UCSC Genome Bioinformatics Group


--


Nair, Sreejith

Oct 15, 2015, 1:44:13 PM
to Matthew Speir; Angie Hinrichs; Galt Barber; Qi Ma; btanasa-forward; gen...@soe.ucsc.edu
Thank you so much!

Qi Ma

Oct 19, 2015, 1:18:29 PM
to Nair, Sreejith; Matthew Speir; Angie Hinrichs; Galt Barber; btanasa-forward; gen...@soe.ucsc.edu
Thank you all.
Let's contact NCBI and the GRC to see what we can find out.
Many thanks for your help.

Best,
Qi

Sreejith Nair

Jul 19, 2017, 4:45:25 PM
to Bogdan Tanasa, gen...@soe.ucsc.edu, Qi Ma, Sreejith Nair
--
A mind that is stretched by a new experience can never go back to its old dimensions. - Oliver Wendell Holmes

Christopher Lee

Jul 20, 2017, 3:33:29 PM
to Sreejith Nair, Bogdan Tanasa, gen...@soe.ucsc.edu, Qi Ma, Sreejith Nair
Hi Sree,

Thanks for your question on rRNA data. Unfortunately there has not
been a significant update to the method described here:
https://groups.google.com/a/soe.ucsc.edu/d/msg/genome/0X06cAZgHjU/YekSLOG7CQAJ

There have, however, been some small updates: you could perhaps use the RefSeq
Genes track as described here:
https://groups.google.com/a/soe.ucsc.edu/d/msg/genome/qUuelwZ1LxQ/z21rDd6hKAAJ

You may also find the following discussion interesting:
http://seqanswers.com/forums/archive/index.php/t-41868.html

Thanks,

Christopher Lee
UCSC Genomics Institute