How to convert a contact matrix to a .hic file

Zhilan Li

Mar 27, 2019, 4:01:40 AM
to 3D Genomics
Hi everyone, I'm new to Hi-C and Juicer and have run into some difficulties. Can someone help me?
We can extract data from .hic files with dump, which produces a coordinate list file. I have done some processing on that coordinate list, so it now differs from the original.
How can I create a new .hic file from the new coordinate list?
Thanks!

Neva Durand

Mar 27, 2019, 9:36:30 AM
to Zhilan Li, 3D Genomics
Hello, 

Please have a look at the Pre command: https://github.com/aidenlab/juicer/wiki/Pre

You can create a pseudo contact list and use the "Short with Score" format.

Neva

--
You received this message because you are subscribed to the Google Groups "3D Genomics" group.
To unsubscribe from this group and stop receiving emails from it, send an email to 3d-genomics...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/3d-genomics/82413a50-1a00-42cd-b0aa-3b7e1076c7f1%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


--
Neva Cherniavsky Durand, Ph.D.
Staff Scientist, Aiden Lab

Zhilan Li

Mar 28, 2019, 12:33:22 AM
to 3D Genomics
Hi,
Thank you very much for your answer, but I have a few more questions. The Juicer dump command produces a coordinate list whose coordinates are binned (multiples of the bin size), like this:
       10000   10000   30.0
       40000   50000   1.0
       50000   50000   48.0
       40000   60000   10.0

The short format is:
    <str1> <chr1> <pos1> <frag1> <str2> <chr2> <pos2> <frag2>

Following your advice, I need to convert the pseudo contact list into a short format file. What bothers me is that the coordinates are integer multiples of the bin size. Can I add an offset to each one so that it does not sit on the boundary between two bins? For the pseudo coordinate list above, may I change it to the following (1kb resolution)?
       (0 1 1 0000+500 0 0 1 10000+500 0) * 30.0 (because it has 30 contacts)
       0 1 1 40000+500 0 0 1 50000+500

Zhilan Li

Mar 28, 2019, 12:34:41 AM
to 3D Genomics
Looking forward to your reply, and best wishes!

Neva Durand

Mar 28, 2019, 7:19:55 AM
to Zhilan Li, 3D Genomics
Yes, you should use the "short with score" format, which has the bin counts in the last field. So it is a nine-field format: the same fields as before, with the contact count added as the last field.


Neva Durand

Mar 28, 2019, 7:22:19 AM
to Zhilan Li, 3D Genomics
And be sure the fragment number is different. So

0 1 10001 0 0 1 10001 1 30

For the first entry and so on. 
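
A minimal Python sketch of the conversion discussed above, turning dump-style `bin1 bin2 count` rows into nine-field "short with score" lines. The function name `dump_to_short_with_score`, the `+1` offset, and the dummy fragment numbers 0 and 1 are illustrative choices modeled on the example entry above, not part of Juicer itself; the chromosome name is passed in because the dump rows shown earlier do not carry one.

```python
from typing import Iterable, Iterator


def dump_to_short_with_score(
    rows: Iterable[str], chrom: str, offset: int = 1
) -> Iterator[str]:
    """Turn 'bin1 bin2 count' rows into nine-field short with score lines."""
    for row in rows:
        parts = row.split()
        if len(parts) != 3:
            continue  # skip blank or malformed rows
        # Shift each bin start by a small offset so the position falls
        # inside the bin rather than exactly on its boundary.
        pos1 = int(parts[0]) + offset
        pos2 = int(parts[1]) + offset
        score = float(parts[2])
        # Field order: str1 chr1 pos1 frag1 str2 chr2 pos2 frag2 score
        # frag1=0 and frag2=1 are dummy values that only need to differ.
        yield f"0 {chrom} {pos1} 0 0 {chrom} {pos2} 1 {score:g}"


# The rows quoted earlier in this thread, assumed here to be on chromosome 1.
dump_rows = [
    "10000   10000   30.0",
    "40000   50000   1.0",
    "50000   50000   48.0",
    "40000   60000  10.0",
]
short_lines = list(dump_to_short_with_score(dump_rows, chrom="1"))
for line in short_lines:
    print(line)
```

The first printed line is `0 1 10001 0 0 1 10001 1 30`, matching the entry given above. Writing one line per bin pair with the count as the score avoids repeating the same line once per contact.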

Zhilan Li

Mar 28, 2019, 11:34:01 PM
to 3D Genomics
Thank you for your answer. Let me try it!


Zhilan Li

Mar 29, 2019, 10:38:18 AM
to 3D Genomics
Thank you. I gave it a try and was able to do it. Thank you again.

Neva Durand

Mar 30, 2019, 6:14:28 AM
to Zhilan Li, 3D Genomics
This would happen because one of your positions is bigger than the chromosome. Double check that the positions aren't too big. You might look at the end of the file you converted.

By the way, don't use the "-f" flag - you made pseudo fragment numbers and those maps are meaningless.
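
A small Python sketch of the check suggested above: scan a short with score file for positions that exceed the chromosome length. The helper name `find_out_of_range` and the chromosome length in the example are hypothetical; real lengths would come from the chrom.sizes file of the genome passed to Pre.

```python
from typing import Iterable, List, Mapping, Tuple


def find_out_of_range(
    lines: Iterable[str], chrom_sizes: Mapping[str, int]
) -> List[Tuple[int, str]]:
    """Return (line_number, line) for entries with a position past the chromosome end."""
    bad = []
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        if len(fields) < 8:
            continue  # not a short-format line
        # Field order: str1 chr1 pos1 frag1 str2 chr2 pos2 frag2 [score]
        chr1, pos1 = fields[1], int(fields[2])
        chr2, pos2 = fields[5], int(fields[6])
        if pos1 > chrom_sizes[chr1] or pos2 > chrom_sizes[chr2]:
            bad.append((number, line))
    return bad


# Hypothetical chromosome length, used only to make the example small.
sizes = {"22": 100_000}
entries = [
    "0 22 10001 0 0 22 10001 1 30",
    "0 22 90001 0 0 22 120001 1 2",  # pos2 is past the assumed end
]
problems = find_out_of_range(entries, sizes)
print(problems)
```

Because a converted file often goes wrong only in its last bin, entries near the end of the file are the most likely to be reported.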

On Fri, Mar 29, 2019 at 10:38 PM Zhilan Li <zhila...@gmail.com> wrote:
Hi, when I processed my experimental data, I got two different errors:

java.png

My pseudo-coordinate list is as follows:

coor.png

I have no idea what is wrong. The error mentions ArrayIndexOutOfBounds, but I checked my data carefully and it is consistent with the short with score format.

Could you help me solve this, please?


Zhilan Li

Apr 1, 2019, 7:19:27 AM
to 3D Genomics
Thank you for your many patient answers. You were right: the problem was that positions in the pseudo list extended past the end of the real chromosome. I have fixed it.

But today, when I ran hiccupsdiff to look for differential loops between two .hic files, I ran into the following problem:

problems.png

Both .hic files contain only chr22, at 10 kb (BP) resolution. Looking forward to your reply!

Neva Durand

Apr 1, 2019, 7:23:27 AM
to Zhilan Li, 3D Genomics
You should be able to send in only chr 22 with -c 22


Zhilan Li

Apr 1, 2019, 7:25:48 AM
to 3D Genomics

I tried that, and I get the following problem:

problem2.png

Zhilan Li

Apr 1, 2019, 8:29:18 AM
to 3D Genomics
I tried hiccupsdiff on GSE63525_GM12878_insitu_primary.hic (Cell, 2014), and it also produced an error:

problem3.png

I wonder whether hiccupsdiff does not support running on the CPU, or whether something else is causing this.

Muhammad Saad Shamim

Apr 1, 2019, 9:14:44 AM
to Zhilan Li, 3D Genomics
Hi,

You'll need to use the --cpu flag if you want to use the CPU version, as it's not the default.
Let me know if that works.

Best,



Zhilan Li

Apr 1, 2019, 10:05:22 PM
to 3D Genomics
Hi,
Thank you for your answer. After posting yesterday, I checked the relevant documentation and solved this problem; the fix was exactly what you said.
But I still can't solve the problem from my earlier post. Could you take a look at it for me?
Sincerely.

Zhilan Li

Apr 1, 2019, 10:11:40 PM
to 3D Genomics

problem4.png

With the -c 22 (or chr22) flag, I get the warning shown above; without the -c flag, it works fine.

Hannah

Feb 18, 2020, 9:58:12 PM
to 3D Genomics
Is it possible to use this short with score format to create a .hic file with multiple resolutions by starting with either a high resolution matrix or a collection of matrices at different resolutions?

Neva Durand

Feb 19, 2020, 7:26:06 AM
to Hannah, 3D Genomics
Yes, you can start with your highest resolution and it will bin from there.
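
To illustrate what "bin from there" means, here is a rough Python sketch that sums high-resolution bin counts into coarser bins. It is only a model of the idea, not Juicer's implementation; the function name `coarsen` and the 50 kb target resolution are assumptions made for the example.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def coarsen(
    contacts: Iterable[Tuple[int, int, float]], bin_size: int
) -> Dict[Tuple[int, int], float]:
    """Sum (pos1, pos2, count) triples into bins of width bin_size."""
    totals: Counter = Counter()
    for pos1, pos2, count in contacts:
        # Snap each position down to the start of its coarse bin.
        key = (pos1 // bin_size * bin_size, pos2 // bin_size * bin_size)
        totals[key] += count
    return dict(totals)


# The 10 kb counts quoted earlier in this thread.
fine = [
    (10000, 10000, 30.0),
    (40000, 50000, 1.0),
    (50000, 50000, 48.0),
    (40000, 60000, 10.0),
]
coarse = coarsen(fine, 50000)
print(coarse)
```

Aggregation like this is exact only when the coarse bin size is a multiple of the fine one, which is one reason to start from the highest-resolution matrix rather than from a coarser one.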


--
Neva Cherniavsky Durand, Ph.D.
Pronouns: she, her, hers
Assistant Professor, Aiden Lab