Trouble viewing ChIP-seq data in juicebox


Mariam Okhovat

Mar 14, 2017, 8:51:29 PM
to 3D Genomics

Hi,

I am trying to view some custom ChIP-seq tracks along with Hi-C data for hg19 in Juicebox. This is the format of my bedGraph files:

chr9    12    43    3
chr9    43    299    4
chr9    299    340    0
.
.
.

The first, second, and third columns are chromosome, start, and end. The fourth column is alignment enrichment. When I load the file into Juicebox, it does not seem to recognize the values in the fourth column and shows none of the variation in enrichment. I also tried wig format, with the same result: the track shows only the regions covered by the ChIP-seq data, not the enrichment levels. Is there a way to visualize custom ChIP-seq data values along with the Hi-C map?
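
The four-column layout described above can be sanity-checked before loading a file into a viewer. The following is a minimal sketch, assuming whitespace-delimited bedGraph lines; the `BedGraphRecord` and `parse_bedgraph` names are hypothetical and are not part of Juicebox.

```python
from typing import Iterable, Iterator, NamedTuple


class BedGraphRecord(NamedTuple):
    chrom: str
    start: int
    end: int
    value: float


def parse_bedgraph(lines: Iterable[str]) -> Iterator[BedGraphRecord]:
    """Yield one record per data line, skipping blanks, comments, and header lines."""
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        if len(fields) != 4:
            raise ValueError(f"expected 4 columns, got {len(fields)}: {line!r}")
        chrom, start, end, value = fields
        yield BedGraphRecord(chrom, int(start), int(end), float(value))


# The three sample lines from the post above.
sample = [
    "chr9    12    43    3",
    "chr9    43    299    4",
    "chr9    299    340    0",
]
records = list(parse_bedgraph(sample))
for rec in records:
    print(rec.chrom, rec.start, rec.end, rec.value)
```

If every line parses, the fourth column is at least well-formed, and any display problem is more likely on the viewer side than in the file.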

I also tried to load the ENCODE tracks and I keep getting this error:  Encode tracks are not available for hg19/chrom.sizes

Any help would be appreciated. Thank you!!

Mariam

Neva Durand

Mar 15, 2017, 6:46:41 AM
to Mariam Okhovat, 3D Genomics
Hello Mariam,

The value of the bedgraph shows when you mouse over the track, in the window at the right.

You should make your .hic file using "hg19" as the genome ID, or use a file named "hg19.chrom.sizes".  Otherwise Juicebox has no way of parsing the genome ID and thus can't link it to ENCODE tracks.  You can just remake the file using the Pre command:  https://github.com/theaidenlab/juicer/wiki/Pre

Best
Neva


--
Neva Cherniavsky Durand, Ph.D.
Staff Scientist, Aiden Lab

Mariam Okhovat

Mar 16, 2017, 1:46:56 PM
to Neva Durand, 3D Genomics
Hi Neva,

Thank you for the response. 
I see the values now. But is there a way to visualize the enrichment levels as bar heights on the y-axis, the way the ENCODE data is displayed?

Also, if I understand correctly, the Pre command is mainly used when .hic files are created by programs other than Juicer. My .hic files were generated by Juicer (they are the output of the run that gave me errors about the number of fastq and stats not matching). I used a manually downloaded hg19 genome and chrom.sizes, so I am a little confused as to why Juicebox does not recognize the genome.

In any case, I downloaded juicer_tools and ran the command below, but got a strange error:

$ java -Xmx2g -jar juicer_tools_linux_0.8.jar pre aligned/inter.hic aligned/inter-hg19.hic hg19
Not including fragment map
Start preprocess
Writing header
Writing body
java.io.IOException: Unexpected column count.  Only 11 or 16 columns supported.  Check file format
    at juicebox.tools.utils.original.AsciiPairIterator.advance(AsciiPairIterator.java:108)
    at juicebox.tools.utils.original.AsciiPairIterator.<init>(AsciiPairIterator.java:70)
    at juicebox.tools.utils.original.Preprocessor.computeWholeGenomeMatrix(Preprocessor.java:487)
    at juicebox.tools.utils.original.Preprocessor.writeBody(Preprocessor.java:371)
    at juicebox.tools.utils.original.Preprocessor.preprocess(Preprocessor.java:283)
    at juicebox.tools.clt.old.PreProcessing.run(PreProcessing.java:108)
    at juicebox.tools.HiCTools.main(HiCTools.java:86)
java.lang.RuntimeException: No reads in Hi-C contact matrices. This could be because the MAPQ filter is set too high (-q) or because all reads map to the same fragment.
    at juicebox.tools.utils.original.Preprocessor$MatrixZoomDataPP.mergeAndWriteBlocks(Preprocessor.java:1466)
    at juicebox.tools.utils.original.Preprocessor$MatrixZoomDataPP.access$000(Preprocessor.java:1237)
    at juicebox.tools.utils.original.Preprocessor.writeMatrix(Preprocessor.java:651)
    at juicebox.tools.utils.original.Preprocessor.writeBody(Preprocessor.java:373)
    at juicebox.tools.utils.original.Preprocessor.preprocess(Preprocessor.java:283)
    at juicebox.tools.clt.old.PreProcessing.run(PreProcessing.java:108)
    at juicebox.tools.HiCTools.main(HiCTools.java:86)
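
The first exception above says the input had neither 11 nor 16 columns, which suggests the input is being read as delimited text. A minimal sketch for inspecting a candidate input file follows, assuming whitespace-delimited fields; the helper names are hypothetical, and this is not the check that juicer_tools itself performs.

```python
# Column counts named in the exception message above.
SUPPORTED_COLUMN_COUNTS = (11, 16)


def count_columns(line: str) -> int:
    """Number of whitespace-delimited fields on one line."""
    return len(line.split())


def first_line_is_supported(path: str) -> bool:
    """Check only the first line of a file (hypothetical helper)."""
    # errors="replace" keeps the check from crashing on non-text bytes.
    with open(path, "r", errors="replace") as handle:
        first = handle.readline()
    return count_columns(first) in SUPPORTED_COLUMN_COUNTS


# Placeholder fields only; this is not a real record.
example = " ".join(f"f{i}" for i in range(1, 12))
print(count_columns(example))  # prints 11
```

A file whose first line fails this check would be expected to trigger the same kind of "Unexpected column count" error.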


Is the .hic format not supported as input, or is something wrong with my file?

Thank you so much for all your help!

M

Mariam Okhovat

Mar 16, 2017, 2:15:20 PM
to Neva Durand, 3D Genomics
I also just noticed that the hg19.fa genome and the corresponding chrom.sizes file that I used to generate the .hic file contain only the main chromosomes (i.e. chr1, ..., chrY) and none of the random haplotypes, etc., listed here: https://genome.ucsc.edu/goldenpath/help/hg19.chrom.sizes

Also, the FASTA header for each chromosome is quite strange in the hg19.fa file:
$ grep "chr" hg19.fa
>chr1 dna:chromosome chromosome:GRCh37:1:1:249250621:1
>chr2 dna:chromosome chromosome:GRCh37:2:1:243199373:1
...

Could that be the reason why the genome is not recognized as hg19 by Juicebox?
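
By common FASTA convention, the sequence name is the first whitespace-delimited token after `>`, and the rest of the header line is free-text description. A minimal sketch under that assumption (the function name is hypothetical) shows which names a tool following the convention would see for the headers quoted above:

```python
from typing import Iterable, List


def fasta_sequence_names(lines: Iterable[str]) -> List[str]:
    """Return the name token from each FASTA header line."""
    names = []
    for line in lines:
        if line.startswith(">"):
            # Drop '>' and keep only the token before the first whitespace.
            fields = line[1:].split()
            if fields:
                names.append(fields[0])
    return names


headers = [
    ">chr1 dna:chromosome chromosome:GRCh37:1:1:249250621:1",
    ">chr2 dna:chromosome chromosome:GRCh37:2:1:243199373:1",
]
print(fasta_sequence_names(headers))  # prints ['chr1', 'chr2']
```

Under this convention the extra text after the name would not change the chromosome names themselves, although individual tools may parse headers differently.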


Thanks,
Mariam

Neva Durand

Apr 4, 2017, 10:38:56 AM
to Mariam Okhovat, 3D Genomics
Hello Mariam, 

We pushed a major update having to do with chromosome names.  Can you try the latest (dev) version of Juicebox and see if it fixes the problem?


Best
Neva

Mariam Okhovat

Apr 4, 2017, 12:18:14 PM
to Neva Durand, 3D Genomics
Hi Neva,

I tried the dev version (1.6.1) on both hg19 and hg38 Hi-C data generated by Juicer. In both cases I got the same error:

Encode tracks are not available for hg19/chrom.sizes  
or
Encode tracks are not available for hg38/chrom.sizes    

I wonder why it calls the genome hg38/chrom.sizes. Is that normal, or could this be a small format issue where the genome name is saved as hg38/chrom.sizes rather than hg38?

Thanks,
Mariam

Neva Durand

Apr 4, 2017, 1:56:10 PM
to Mariam Okhovat, 3D Genomics
Hi Mariam,

If you can recreate your hic files, you should name your chrom.sizes file hg19.chrom.sizes (or hg38.chrom.sizes), whichever it is.

Or you can just use "hg19" (or hg38) without the file; Juicebox will recognize the genomeID.

Unfortunately Juicebox does not recognize hg19/chrom.sizes as this is not a standard way of naming chrom.sizes files.
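
The naming convention described above, `<genomeID>.chrom.sizes`, can be checked before rerunning a pipeline. This is a minimal sketch of such a check only; the function name is hypothetical, and it is an assumption-laden illustration rather than the logic Juicebox actually uses to parse genome IDs.

```python
import os
from typing import Optional

SUFFIX = ".chrom.sizes"


def genome_id_from_filename(path: str) -> Optional[str]:
    """Return the <genomeID> part of '<genomeID>.chrom.sizes', else None."""
    base = os.path.basename(path)
    if base.endswith(SUFFIX) and len(base) > len(SUFFIX):
        return base[: -len(SUFFIX)]
    return None


for candidate in ("hg19.chrom.sizes", "hg38.chrom.sizes", "hg19/chrom.sizes"):
    print(candidate, "->", genome_id_from_filename(candidate))
```

The first two names yield `hg19` and `hg38`; `hg19/chrom.sizes` yields `None`, because the genome ID sits in the directory name rather than in the file name.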

Best
Neva

Mariam Okhovat

Apr 4, 2017, 1:59:10 PM
to Neva Durand, 3D Genomics
Oh, I see! I will try to rerun with corrected file names. 

Thanks a lot.

Mariam Okhovat

Apr 7, 2017, 12:40:05 PM
to Neva Durand, 3D Genomics
Hi Neva,

I reran Juicer with the correct name (hg38.chrom.size), but now I get an error saying:

Encode tracks are not available for hg38

Do you know what might be the issue?

Thanks,
Mariam

Neva Durand

Apr 12, 2017, 8:22:54 AM
to Mariam Okhovat, 3D Genomics
Hello Mariam,

The ENCODE viewer is for ENCODE v2, which didn't have hg38 maps.  (It's actually the same as what is in IGV, under "Load from ENCODE".) 

The most recent ENCODE organizes its tracks differently, so there is no "Load from ENCODE" capability for it in either IGV or Juicebox. There are no hg38 tracks in the viewer because the older version of ENCODE had none.

For now, I would suggest finding the URL (e.g. https://www.encodeproject.org/files/ENCFF584BXF/@@download/ENCFF584BXF.bigWig) and loading the track via URL.
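
The example URL above embeds the file accession twice. A minimal sketch for building such a URL from an accession follows, assuming other ENCODE files follow the same pattern as this one example; the function name is hypothetical, and the resulting URL should be confirmed on the ENCODE file page before use.

```python
# Pattern inferred from the single example URL in the message above.
ENCODE_FILE_URL = "https://www.encodeproject.org/files/{acc}/@@download/{acc}.{ext}"


def encode_download_url(accession: str, extension: str = "bigWig") -> str:
    """Build a download URL for one ENCODE file accession."""
    return ENCODE_FILE_URL.format(acc=accession, ext=extension)


# Reproduces the URL quoted in the message above.
print(encode_download_url("ENCFF584BXF"))
```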

Best
Neva

Mariam Okhovat

Apr 18, 2017, 2:14:38 PM
to Neva Durand, 3D Genomics
Thank you!