bigWigToWig and wig indexing


Alex Reynolds

Feb 19, 2015, 3:41:21 PM
to gen...@soe.ucsc.edu
I'd like to ask how bigWig files are indexed and whether they are re-indexed (or not) when converted from bigWig to wig with bigWigToWig.

For instance, when converting the following file to wig, the output appears to contain one (or more) wig elements with a start coordinate of 0:


If the convention is that wig is 1-based and closed, should there be elements with a start coordinate of 0 in the output of bigWigToWig, or are there problems with this particular bigWig file? 

Or is there not really a convention for how wig data are indexed, and downstream tools should not rely on any particular indexing?

Thanks for any advice,

Regards,
Alex

Brian Lee

Feb 21, 2015, 12:25:26 AM
to Alex Reynolds, gen...@soe.ucsc.edu

Dear Alex,

Thank you for using the UCSC Genome Browser and for your question about how bigWig files are indexed and whether they are re-indexed when converted from bigWig to wig with bigWigToWig.

The original data used to generate a bigWig can come from different formats: bedGraph, which is zero-relative, and wiggle, which is 1-relative. In summary, if a bedGraph was used, the output of bigWigToWig will keep the bedGraph zero-relative coordinates. The output will also include a commented note, for example "#bedGraph section chr1:10451-568419" at the head of the wgEncodeSydhTfbsK562Pol3StdSig file you mentioned. Thus, the data are not re-indexed. If you use bigWigToBedGraph instead, the data will always be returned as 0-based bedGraph.
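To check which kind of sections a given bigWigToWig output contains, a short script can tally the section headers. This is only a minimal sketch and not part of the UCSC tools: the function name `count_section_kinds` is hypothetical, and it assumes the output marks bedGraph sections with a "#bedGraph section" comment (as in the example above) and wiggle sections with the standard "variableStep" and "fixedStep" declaration lines.

```python
from collections import Counter


def count_section_kinds(lines):
    """Tally section headers by kind in bigWigToWig text output.

    Hypothetical helper: assumes bedGraph sections start with a
    '#bedGraph section' comment and wiggle sections start with a
    'variableStep' or 'fixedStep' declaration line.
    """
    kinds = Counter()
    for line in lines:
        if line.startswith("#bedGraph section"):
            kinds["bedGraph"] += 1
        elif line.startswith("variableStep"):
            kinds["variableStep"] += 1
        elif line.startswith("fixedStep"):
            kinds["fixedStep"] += 1
    return kinds
```

Called on an open file handle of the converted output, a result with only a "bedGraph" count suggests the coordinates are zero-relative throughout.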

Most ENCODE data, such as the file you were looking at, originated from a BAM file that was processed through steps like:

bamToBed file.bam -> file.bedGraph
bedGraphToBigWig -> file.bw

Thus, there is no problem with this file. It is what you should expect to see for most BAM-derived bigWig files from the ENCODE project.

As to your last question, it is best not to rely on all bigWigs being indexed the same way. Some will come from bedGraphs and some from wigs, depending on their originating files. Most likely, though, all ENCODE data will exit bigWigToWig as bedGraph sections, since they were probably encoded as bedGraphs from BAM files.
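For downstream tools, one option is to normalize everything to a single convention before use. The following is a rough sketch under stated assumptions, not a UCSC utility: the function name `to_zero_based` is hypothetical, four-column data lines are treated as bedGraph (already 0-based, half-open), and variableStep and fixedStep positions are treated as 1-based per the wiggle format description.

```python
def to_zero_based(lines):
    """Yield (chrom, start, end, value) as 0-based, half-open intervals.

    Hypothetical helper for mixed bigWigToWig output. Data lines are
    told apart by the most recent declaration line and by field count.
    """
    mode = None          # "variableStep" or "fixedStep" once declared
    chrom = None
    span = 1
    step = 1
    next_start = 0       # 0-based start of the next fixedStep item
    for raw in lines:
        line = raw.strip()
        # Skip blanks, comments (including '#bedGraph section'), and
        # track/browser lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        if fields[0] in ("variableStep", "fixedStep"):
            mode = fields[0]
            params = dict(f.split("=", 1) for f in fields[1:])
            chrom = params["chrom"]
            span = int(params.get("span", 1))
            if mode == "fixedStep":
                step = int(params.get("step", 1))
                next_start = int(params["start"]) - 1  # 1-based -> 0-based
            continue
        if len(fields) == 4:
            # bedGraph line: chrom, start, end, value (already 0-based)
            yield fields[0], int(fields[1]), int(fields[2]), float(fields[3])
        elif mode == "variableStep":
            start = int(fields[0]) - 1                 # 1-based -> 0-based
            yield chrom, start, start + span, float(fields[1])
        elif mode == "fixedStep":
            yield chrom, next_start, next_start + span, float(fields[0])
            next_start += step
```

With this approach a bedGraph element starting at 0 passes through unchanged, while a wiggle position of 1 is also mapped to a start of 0, so both sources end up in the same coordinate system.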

Here is further background information. There are two bigWig encoders, bedGraphToBigWig and wigToBigWig, which between them accept bedGraph and the two wiggle types, variableStep and fixedStep. There are also two ways back: bigWigToBedGraph and bigWigToWig. If you wish to explore these formats, please see the following pages, the last being the location for obtaining precompiled binaries:
http://genome.ucsc.edu/goldenPath/help/bedgraph.html
http://genome.ucsc.edu/goldenPath/help/wiggle.html
http://hgdownload.soe.ucsc.edu/admin/exe/

Thank you again for your inquiry and for using the UCSC Genome Browser. If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly accessible forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

All the best,

Brian Lee
UCSC Genome Bioinformatics Group

