How to make RefSeq genomic regions compatible with UCSC browser


Tsuda, Shanel M.

Sep 26, 2022, 3:50:11 PM
to gen...@soe.ucsc.edu, NAGARAJA, SHASHANK
Hi UCSC friends,

We are having trouble visualizing our ATAC-seq data tracks on the UCSC genome browser. We used NCBI RefSeq (mm39) to align our data, so our genomic regions have the following format:

NC_000067.7_3050129_3050218_+

However, we know the genome browser only takes "chr1..." for genomic regions. Is the only way to convert our RefSeq regions to the UCSC annotation by using the Liftover tool (https://genome.ucsc.edu/cgi-bin/hgLiftOver)? Or is there another way, such as using an "alias file" like IGV does?

Best,
Shanel

Gerardo Perez

Sep 28, 2022, 8:44:12 PM
to Tsuda, Shanel M., gen...@soe.ucsc.edu, NAGARAJA, SHASHANK

Hello, Shanel.

Thank you for your interest in the Genome Browser and your question about how to make RefSeq genomic regions compatible.

Could you share the input file with us? Custom track input performs this conversion automatically, so there should not be a problem. We also provide an IGV-format alias file that can be used for conversions:
https://hgdownload.soe.ucsc.edu/goldenPath/mm39/bigZips/mm39.chromAlias.txt
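
For region names like the one in the original question, an alias table can be applied with a few lines of Python. This is only a sketch under stated assumptions: it assumes the alias file is tab-separated with the UCSC name in the first column and comment lines starting with #, and that region names follow the accession_start_end_strand pattern shown in the question. The function names and the one-row sample table are illustrative and are not part of any UCSC tool.

```python
import io

def load_alias_map(handle):
    """Map every alias in a chromAlias-style table to its UCSC name.

    Assumption: tab-separated, UCSC name in column 1, '#' starts a comment.
    """
    alias_to_ucsc = {}
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        ucsc = fields[0]
        alias_to_ucsc[ucsc] = ucsc
        for alias in fields[1:]:
            if alias:
                alias_to_ucsc[alias] = ucsc
    return alias_to_ucsc

def region_to_ucsc(region, alias_to_ucsc):
    """Split a region name into (UCSC chromosome, start, end, strand)."""
    # Split from the right, because the accession itself contains an underscore.
    accession, start, end, strand = region.rsplit("_", 3)
    return alias_to_ucsc[accession], int(start), int(end), strand

# Tiny stand-in for mm39.chromAlias.txt; the real file has more rows and columns.
SAMPLE = "# ucsc\trefseq\nchr1\tNC_000067.7\n"
aliases = load_alias_map(io.StringIO(SAMPLE))
print(region_to_ucsc("NC_000067.7_3050129_3050218_+", aliases))
```

With the real alias file, the `io.StringIO(SAMPLE)` argument would be replaced by an open file handle. An accession missing from the table raises a KeyError here, which makes unmapped sequences easy to spot.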

You can also use the chromToUcsc tool, which is available to download from the utilities directory, https://hgdownload.soe.ucsc.edu/downloads.html#utilities_downloads. You can find chromToUcsc under the directory that matches your operating system. For example, here is the direct link for Linux:
http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/chromToUcsc

You can run the utility on its own to see a help message, e.g.:

$ ./chromToUcsc
Usage: chromToUcsc [options] filename - change NCBI or Ensembl chromosome names to UCSC names in tabular or wiggle files, using a chromAlias table.

    Supports these UCSC file formats:
    BED, genePred, PSL, wiggle (all formats), bedGraph, VCF, SAM, GTF, Chain
    ... or any other csv or tsv format where the sequence (chromosome) name is a separate field.

    Requires a <genome>.chromAlias.tsv file which can be downloaded like this:
        chromToUcsc --get hg19              # download the file hg19.chromAlias.tsv into current directory

    If you do not want to use the --get option to retrieve the mapping tables, you can also download the alias mapping
    files yourself, e.g. for mm10 with 'wget https://hgdownload.soe.ucsc.edu/goldenPath/mm10/database/chromAlias.txt.gz'

    Then the script can be run like this:
        chromToUcsc -i in.bed -o out.bed -a hg19.chromAlias.tsv
        chromToUcsc -i in.bed -o out.bed -a https://hgdownload.soe.ucsc.edu/goldenPath/hg19/database/chromAlias.txt.gz
    Or in pipes, like this:
        cat test.bed | chromToUcsc -a mm10.chromAlias.tsv > test.ucsc.bed
    For BAM files use this program in a pipe with samtools:
        samtools view -h in.bam | ./chromToUcsc -a mm10.chromAlias.tsv | samtools -bS > out.bam

    By default, this script expects the chromosome name in the first field.
    The default works for BED, bedGraph, GTF, wiggle, VCF.
    For the following file formats, you will need to set the -k option to these values manually:
    genePred: 2 -- PSL: 10 (query) or 14 (target) -- chain: 2 (target) or 7 (query) -- SAM: 2
    (If a line starts with @ (SAM format), -k is automatically set to 2.)

Options:
...
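
The -k option described in the help text selects which field holds the sequence name. As a rough illustration of that idea (not the actual chromToUcsc implementation), the following Python sketch rewrites one tab-separated field using an alias dictionary. The function name rename_field and the sample alias mapping are hypothetical. Comment lines and unknown names are passed through unchanged here, which may differ from how the real tool behaves.

```python
def rename_field(lines, alias_to_ucsc, key_field=1):
    """Yield lines with the 1-based key_field replaced by its UCSC name."""
    index = key_field - 1
    for line in lines:
        line = line.rstrip("\n")
        fields = line.split("\t")
        # Leave comments and lines that are too short untouched.
        if line.startswith("#") or index >= len(fields):
            yield line
            continue
        # Unknown names fall back to the original value.
        fields[index] = alias_to_ucsc.get(fields[index], fields[index])
        yield "\t".join(fields)

# Hypothetical alias mapping; in practice it would come from a chromAlias file.
aliases = {"NC_000067.7": "chr1"}

# BED-like row: sequence name in field 1 (the default).
bed = ["NC_000067.7\t3050129\t3050218\tpeak1"]
print(list(rename_field(bed, aliases)))

# Truncated genePred-like row: sequence name in field 2.
gene_pred = ["geneA\tNC_000067.7\t+\t3050129\t3050218"]
print(list(rename_field(gene_pred, aliases, key_field=2)))
```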

If you don't want to share this information with our publicly-archived mailing list (gen...@soe.ucsc.edu), you can send it to our confidential support list at genom...@soe.ucsc.edu.

Gerardo Perez
UCSC Genomics Institute



Tsuda, Shanel M.

Sep 29, 2022, 1:23:18 PM
to Gerardo Perez, gen...@soe.ucsc.edu, NAGARAJA, SHASHANK
Hi Gerardo,

Thanks for your response!

Here is the hub.txt file I used to make a track hub:

Here is one of the input files, in bigWig format:

When I tried both the track hub and a custom track with the single bigWig, I got the same issue: the track would technically load, but no peaks would appear (i.e., it looks empty).

Best,
Shanel



Gerardo Perez

Sep 29, 2022, 8:52:52 PM
to Tsuda, Shanel M., gen...@soe.ucsc.edu, NAGARAJA, SHASHANK

Hello, Shanel.

Thank you for providing more details and for bringing this to our attention.

We were able to reproduce your issue, and it is now fixed. You should now be able to visualize your ATAC-seq data tracks on the UCSC genome browser.

Let us know if you experience any additional unexpected behavior. Thank you again for bringing this to our attention.

I hope this is helpful. If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible Google Groups forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Gerardo Perez
UCSC Genomics Institute

Tsuda, Shanel M.

Sep 30, 2022, 1:33:28 PM
to Gerardo Perez, gen...@soe.ucsc.edu, NAGARAJA, SHASHANK
Hi Gerardo,

Thank you so much! We really appreciate your help.

Best,
Shanel


Tsuda, Shanel M.

Sep 30, 2022, 1:57:09 PM
to Gerardo Perez, gen...@soe.ucsc.edu, NAGARAJA, SHASHANK
Hi Gerardo,

The tracks were loading just fine for a few hours, but I just got this error message:

"hgTracks object is missing from the response"

The track hub loaded successfully, but when I tried to change the view to "full" it gave the error message and did not show the track.

Best,
Shanel


Luis Nassar

Sep 30, 2022, 8:29:19 PM
to Tsuda, Shanel M., Gerardo Perez, gen...@soe.ucsc.edu, NAGARAJA, SHASHANK

Hello, Shanel.

The error you are seeing occurs because the hub sets autoScale group in its trackDb, e.g.:

track D7_IEL_rep1
shortLabel D7 IEL rep1
longLabel D7 IEL rep1 LCMV-Arm Milner 2017 Nature
type bigWig
maxHeightPixels 60
autoScale group
bigDataUrl https://data.cyverse.org/dav-anon/iplant/home/stsuda/D7_IEL_Rep1.bw

The group option for that setting (https://genome.ucsc.edu/goldenPath/help/trackDb/trackDbHub.html#autoScale) should only be used for composite (container) tracks; see the NOTE in the trackDb definition document. If you remove that line or change the value to off or on, everything should work.
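
For hubs with many stanzas, a check like this can be scripted. The following Python sketch is a heuristic, not a UCSC tool: it splits trackDb text into blank-line-separated stanzas and flags any track that sets autoScale group without also declaring a compositeTrack or container setting. Treating those two settings as the marker of a container track is an assumption made for this example, the function names are hypothetical, and the parser ignores include statements and line continuations.

```python
def parse_stanzas(text):
    """Split trackDb text into stanzas; each stanza is a dict of setting -> value."""
    stanzas, current = [], {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            # A blank line ends the current stanza.
            if current:
                stanzas.append(current)
                current = {}
            continue
        if line.startswith("#"):
            continue
        key, _, value = line.partition(" ")
        current[key] = value.strip()
    if current:
        stanzas.append(current)
    return stanzas

def misplaced_autoscale_group(stanzas):
    """Return names of tracks that set 'autoScale group' outside a container."""
    flagged = []
    for stanza in stanzas:
        # Assumption: container tracks declare compositeTrack or container.
        is_container = "compositeTrack" in stanza or "container" in stanza
        if stanza.get("autoScale") == "group" and not is_container:
            flagged.append(stanza.get("track", "<unnamed>"))
    return flagged

# Abbreviated copy of the stanza quoted above (bigDataUrl omitted).
TRACKDB = """\
track D7_IEL_rep1
shortLabel D7 IEL rep1
type bigWig
maxHeightPixels 60
autoScale group
"""
print(misplaced_autoscale_group(parse_stanzas(TRACKDB)))
```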

I hope this is helpful. Please include gen...@soe.ucsc.edu in any replies to ensure visibility by the team. All messages sent to that address are archived on our public forum. If your question includes sensitive information, you may send it instead to genom...@soe.ucsc.edu.

Lou Nassar
UCSC Genomics Institute

