How to view this TXT file on the UCSC Genome Browser


Terumi Kohwi-Shigematsu

Jul 21, 2021, 8:36:42 PM
to gen...@soe.ucsc.edu, Terumi Kohwi-Shigematsu
Dear members,
The technician who used to help me with this kind of thing is not available right now.
Could you tell me how to view the attached data set in the UCSC Genome Browser?
The GEO accession does not say whether the data are on hg19 or hg38.
Original website is:

I extracted one file from the site above and have attached it here. It is provided as a TXT file.

Thank you for your help.
Terumi Kohwi-Shigematsu
GSE90550_AHR-only_bound_peaks (1).txt

Jairo Navarro Gonzalez

Jul 26, 2021, 6:36:21 PM
to Terumi Kohwi-Shigematsu, UCSC Genome Browser Discussion List

Hello,

Thank you for using the UCSC Genome Browser and sending your inquiry.

According to the associated study, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5773648/, the data
were aligned to the hg19 assembly. To load the file into the UCSC Genome Browser as a custom track,
you will need to do a bit of scripting to convert the file to BED format. You can run the
following command:

zcat GSE90550_AHR-only_bound_peaks.txt.gz | awk '{ print $1,$2,$3,$5 }' > GSE90550_AHR-only_bound_peaks.bed

This will produce output like the following:

chr start end #
chr15 100016696 100016979 MEF2A
chr16 85182626 85182849 LOC400548
chr20 31849943 31850100 BPIFB1
chr3 11816912 11817262 VGLL4
chr7 101627577 101627843 CUX1
chr12 131706942 131707184 RP11-638F5.1
chr18 21595389 21595589 TTC39C
chr3 72359981 72360269 RYBP
chr9 128274716 128274922 MAPKAP1

You will then have to remove the first line from the output, or add a # at the start of that line
to mark it as a header line. You can find an example of the BED file you can load on hg19 here:

https://hgwdev.gi.ucsc.edu/~jairo/MLQ/27884/GSE90550_AHR-only_bound_peaks.bed
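
The conversion and the header cleanup described above can be combined into a single step. The following is a minimal sketch, not part of the original reply: `peaks_to_bed` is a hypothetical helper name, and it assumes that the first input line is the header and that columns 1, 2, 3 and 5 hold the chromosome, start, end and gene name, as in the command above. Writing tab-separated output is a choice made here for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read a peaks table on stdin, write 4-column BED on stdout.
# Assumes line 1 is a header and columns 1, 2, 3, 5 are chrom, start, end, name.
peaks_to_bed() {
    # NR > 1 skips the header line; OFS makes the output tab-separated.
    awk 'BEGIN { OFS = "\t" } NR > 1 { print $1, $2, $3, $5 }'
}

# Example usage with the file from this thread (uncomment to run):
# gzip -dc GSE90550_AHR-only_bound_peaks.txt.gz | peaks_to_bed > GSE90550_AHR-only_bound_peaks.bed
```

With the header dropped during conversion, the resulting file should need no further editing before it is uploaded as a custom track on hg19.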

I hope this is helpful. If you have any further questions, please reply to gen...@soe.ucsc.edu.
All messages sent to that address are archived on a publicly accessible Google Groups forum.
If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Jairo Navarro
UCSC Genome Browser

Want to share the Browser with colleagues?
Host a workshop: https://bit.ly/ucscTraining



Mari Grange

Jul 27, 2021, 7:10:50 PM
to gen...@soe.ucsc.edu, jnav...@ucsc.edu
Hello,

Thank you for the command. I tried it on the same sample (GSE90550_AHR-only_bound_peaks.txt.gz) but got the error message "not in gzip format", and the same error occurred with other files from https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE90550. If you have any ideas on how to solve this error, that would be great.

Thank you, 

Mari Grange

University of California, San Francisco

513 Parnassus Avenue

San Francisco, CA 94143

HSW860


Matthew Speir

Jul 28, 2021, 10:10:35 PM
to Mari Grange, gen...@soe.ucsc.edu, jnav...@ucsc.edu
Hello, Mari. 

Can you share the steps you used to download these files? It sounds like something may have gone wrong during the download, so that the files are not being recognized as gzipped files.

If you're downloading these files to a server with curl or wget, then I'd recommend copying the FTP link on the GEO page (e.g. https://ftp.ncbi.nlm.nih.gov/geo/series/GSE90nnn/GSE90550/suppl/GSE90550_AHR-only_bound_peaks.txt.gz). If you copy the HTTP link and use it with curl/wget, it doesn't seem to download the file itself, but rather an HTML page about the file. In that case, zcat would not recognize the result as a gzip file.
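
One way to confirm whether a download is really gzip-compressed before piping it to zcat is sketched below. This is an illustration rather than part of the original reply: `cat_maybe_gz` is a hypothetical helper name. It relies on `gzip -t`, which tests a file and exits with a non-zero status when the data are not in gzip format. If the file turns out to be plain text (for example an HTML page, or a table that was already decompressed), it is printed unchanged so that its first lines can be inspected.

```shell
#!/bin/sh
# Hypothetical helper: print a file's contents, decompressing only if it is gzip.
cat_maybe_gz() {
    if gzip -t "$1" 2>/dev/null; then
        gzip -dc "$1"    # real gzip data: decompress to stdout
    else
        cat "$1"         # not gzip (HTML page or plain text): print as-is
    fi
}

# Example usage with the FTP link from this thread (uncomment to run):
# curl -L -O https://ftp.ncbi.nlm.nih.gov/geo/series/GSE90nnn/GSE90550/suppl/GSE90550_AHR-only_bound_peaks.txt.gz
# cat_maybe_gz GSE90550_AHR-only_bound_peaks.txt.gz | head -n 3
```

If the first lines look like HTML, the download likely fetched a web page rather than the data file; if they look like a peaks table, the file may already be decompressed and can be passed to awk directly.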

Training videos & resources: http://genome.ucsc.edu/training/index.html

---

Matthew Speir

UCSC Cell Browser, Quality Assurance and Data Wrangler

Human Cell Atlas, User Experience Researcher

UCSC Genome Browser, User Support

UC Santa Cruz Genomics Institute

Revealing life’s code.


