Help finding TSS for a large list of genes

Lia Mayorga

Jan 3, 2023, 3:57:26 PM
to gen...@soe.ucsc.edu
Hello!
I was wondering how to use the UCSC Genome Browser to find the TSS of a large list of genes in order to create a BED file. Could you help me?
Thank you in advance. 
Regards, 

Lia

--
Lía Mayorga, M.D, PhD
Pediatrician, Inborn Errors of Metabolism
Instituto de Histología y Embriología de Mendoza (IHEM)
CONICET-Universidad Nacional de Cuyo
Mendoza, Argentina

Gerardo Perez

Jan 12, 2023, 4:31:00 PM
to Lia Mayorga, gen...@soe.ucsc.edu

Hello, Lia.

Thank you for your interest in the Genome Browser and your question about finding the TSS from a large list of genes.

You can use the Table Browser with a list of genes to download the TSS data in BED format. You will need to select a gene track dataset from which to extract the TSS data. Two gene tracks that we offer for human hg38/hg19 and mouse mm10/mm39 are NCBI RefSeq and GENCODE.

The following FAQ entry can explain the differences between the NCBI RefSeq and GENCODE tracks: https://genome.ucsc.edu/FAQ/FAQgenes.html#ensRefseq

As an example, you can do the following to extract TSS data from the hg38 GENCODE V41 track by using a list of gene names:

1. Navigate to the Table Browser (http://genome.ucsc.edu/cgi-bin/hgTables) and make the following selections:

clade: Mammal
genome: Human
assembly: Dec. 2013 (GRCh38/hg38)
group: Genes and Gene Predictions
track: GENCODE V41
table: knownGene

2. Set the region to “genome”

3. Click "paste list" or "upload list" next to identifiers (names/accessions): and enter or upload a list of identifiers (gene names or accessions), such as the following, and then click "submit":

ENST00000452429.5
ENST00000420650.2
ENST00000437141.1
ENST00000226319.8
CRKAS
ZCYTO7
NM_145870
uc003uew.4
Q6UWX9

4. Set output format to "BED - browser extensible data"

5. Insert a name next to output filename:, such as gencodeV41genes_hg38

6. Click “get output”

7. Click “get BED”

The output will be in BED format, where the 2nd column is the txStart and the 3rd column is the txEnd. Keep in mind that for items on the minus (-) strand, the txStart is actually the txEnd, and vice versa, because all entries are given from the perspective of the plus (+) strand. The TSS of a minus-strand item is therefore the position in the 3rd column, not the 2nd. You can learn more about the standard BED columns on the following help page: http://genome.ucsc.edu/FAQ/FAQformat.html#format1.
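
If you want a BED file that contains only the TSS position of each item, the strand rule above can be applied with a short script. The following Python is a minimal sketch, not an official UCSC tool. It assumes the downloaded file is tab-delimited with at least six columns (chrom, start, end, name, score, strand) and uses the usual BED convention of 0-based, half-open coordinates. The function name bed_to_tss and the example coordinates are hypothetical.

```python
def bed_to_tss(bed_lines):
    """Yield one single-base TSS record per BED row (assumes >= 6 tab-separated columns)."""
    for line in bed_lines:
        fields = line.rstrip("\n").split("\t")
        # Skip comments, blank lines, and "track"/"browser" header lines,
        # none of which have six tab-separated columns.
        if len(fields) < 6 or fields[0].startswith("#"):
            continue
        chrom, start, end, name, score, strand = fields[:6]
        start, end = int(start), int(end)
        if strand == "-":
            # Minus strand: transcription starts at the txEnd side (3rd column).
            tss_start, tss_end = end - 1, end
        else:
            # Plus strand: transcription starts at txStart (2nd column).
            tss_start, tss_end = start, start + 1
        yield (chrom, tss_start, tss_end, name, score, strand)


if __name__ == "__main__":
    # Hypothetical rows for illustration only; real input would be the
    # lines of the file saved from the Table Browser.
    example = [
        "chr1\t100\t200\tgeneA\t0\t+",
        "chr2\t300\t500\tgeneB\t0\t-",
    ]
    for record in bed_to_tss(example):
        print("\t".join(str(value) for value in record))
```

To run this on a real download, open the saved file and pass the file object to bed_to_tss in place of the example list. Note that a gene with several transcripts will produce one TSS record per transcript.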

For more information on using the Table Browser, please review the Table Browser User Guide:
https://genome.ucsc.edu/goldenPath/help/hgTablesHelp.html

I hope this is helpful. If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly accessible Google Groups forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Gerardo Perez
UCSC Genomics Institute

