Dumping .hic files into dense matrix format?

ajk...@g.harvard.edu

Sep 14, 2016, 5:11:49 PM
to 3D Genomics
Hello,

I noticed that Juicebox's 'dump' command line tool currently outputs observed matrices in sparse matrix format (row, column, value). I was wondering if there was a way to dump a Juicebox .hic file into a dense matrix format? i.e., when I run dump now using:

juicebox dump observed VC_SQRT filename.hic 1 1  BP 25000

I get an output like:

375000 375000 4
375000 400000 3
400000 400000 8

and I would like an output like:

          375000    400000
375000         4         3
400000         3         8

Best,
Andrea

Neva Durand

Sep 15, 2016, 3:57:48 PM
to ajk...@g.harvard.edu, 3D Genomics
Hello Andrea,

I'm sorry, but we only support sparse matrix format, since the size of dense matrices at high resolution is prohibitive and unnecessary.  Indeed, what you wrote is not exactly dense since it doesn't include all the bins that have 0s.  

May I ask for what application you would like a dense matrix?  It should be fairly straightforward in any language to convert to a dense matrix, but again might be prohibitively expensive memory-wise.
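
For readers who want to do the conversion themselves, here is a minimal sketch in Python. It assumes the dump output is whitespace-separated "row column value" lines, that row and column are bin start positions in base pairs, and that only one triangle of the symmetric intrachromosomal matrix is listed, as in the example output above. The function name `sparse_to_dense` and its parameters are hypothetical and not part of Juicebox or Juicer tools.

```python
def sparse_to_dense(lines, resolution, start=None, end=None):
    """Convert 'row col value' triplets into a dense, symmetric matrix.

    Hypothetical helper, not part of any Aiden Lab tool.
    lines:      iterable of strings such as "375000 400000 3"
    resolution: bin size in bp (e.g. 25000)
    start, end: optional first/last bin start positions to include;
                default to the smallest/largest position seen.
    Returns (matrix, positions) where matrix is a list of lists.
    """
    triplets = []
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        triplets.append((int(parts[0]), int(parts[1]), float(parts[2])))
    if not triplets:
        return [], []

    lo = min(min(r, c) for r, c, _ in triplets) if start is None else start
    hi = max(max(r, c) for r, c, _ in triplets) if end is None else end
    n = (hi - lo) // resolution + 1

    # Bins absent from the sparse output are filled with 0.
    dense = [[0.0] * n for _ in range(n)]
    for r, c, v in triplets:
        i = (r - lo) // resolution
        j = (c - lo) // resolution
        if 0 <= i < n and 0 <= j < n:
            dense[i][j] = v
            dense[j][i] = v  # mirror across the diagonal (assumes symmetry)
    positions = [lo + k * resolution for k in range(n)]
    return dense, positions


example = ["375000 375000 4", "375000 400000 3", "400000 400000 8"]
matrix, positions = sparse_to_dense(example, 25000)
print("\t" + "\t".join(str(p) for p in positions))
for pos, row in zip(positions, matrix):
    print(str(pos) + "\t" + "\t".join(str(v) for v in row))
```

The result has n × n entries for n bins, so memory grows quadratically with the size of the region; restricting `start` and `end` to a small region keeps it manageable.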

Best
Neva

--
You received this message because you are subscribed to the Google Groups "3D Genomics" group.
To unsubscribe from this group and stop receiving emails from it, send an email to 3d-genomics+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/3d-genomics/a69a55e9-0cd8-4229-9405-db7060a42602%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
Neva Cherniavsky Durand, Ph.D.
Staff Scientist, Aiden Lab
Message has been deleted

Neva Durand

Feb 24, 2017, 9:05:29 AM
to Andrea Kriz, 3D Genomics
Hello Andrea,

The latest version of the (newly renamed) Juicer tools has this functionality.  Use the "-d" flag to extract data in dense format.

You can also use our data API, Straw, which allows programmatic access to the hic files (enabling you to manipulate your matrices more easily).

Best
Neva

On Fri, Sep 16, 2016 at 7:37 PM, Andrea Kriz <ajk...@g.harvard.edu> wrote:
Hi Neva,

Thanks for your response! I've found that many tools developed by others for analyzing HiC data (e.g., Crane et al., 2015) require a dense HiC matrix as input, so perhaps you will consider adding the feature for low resolution or small regional matrices in the future :)

Best,
Andrea


Andrea Kriz

Mar 1, 2017, 5:19:33 PM
to Neva Durand, 3D Genomics
Hi Neva,

Thank you again for all your help. I'm trying to download the latest version of Juicer tools from the code section on GitHub (https://github.com/theaidenlab/juicer). I noticed the last update is from three months ago. Is that the newest version with the "-d" flag option to extract matrices in dense format, or is there a newer version?

Best,
Andrea

Neva Durand

Mar 2, 2017, 1:25:23 AM
to Andrea Kriz, 3D Genomics
The best place to download the newest Juicer tools is here:


Best
Neva


