Arima Hi-C experiment - check if restriction map was included?


Alexander Predeus

Jan 27, 2023, 7:13:34 AM
to 3D Genomics
Hi all, 

I'm new to the Juicer pipeline and am currently trying to apply it to some Arima Genomics Hi-C experiments. I managed to generate all the expected outputs. However, I'm not sure how to tell whether the restriction site information was included correctly (I generated a custom Arima file for this). From what I understood, the restriction file should add some extra resolutions to the hic file - am I right?

Would appreciate any extra info. 

All the best, 

-- Alex

Olga Dudchenko

Jan 27, 2023, 12:36:01 PM
to 3D Genomics
Hi Alex,

You are probably thinking of fragment resolution. No, by default fragment-resolution maps are not built by the Juicer pipeline. I also do not think having fragment-resolution maps would help you see whether you added the restriction site information correctly. You can try to examine the calculated statistics, things like intra-fragment reads, but the counts there depend not only on how you ran Juicer but also on the experiment, so this may be tricky. Note that if you are running for assembly with 3d-dna, the peculiarities of the RE are not very important: you are mostly relying on inter-scaffold data anyway, which is annotated and used the same way irrespective of the restriction site label.
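
For readers unsure what "fragment" means here: a restriction site map lists the cut positions on each chromosome, a fragment is the interval between two consecutive cut sites, and an intra-fragment read pair has both ends inside the same fragment. The sketch below illustrates that idea only. The function names, the toy cut sites, and the boundary convention (a position equal to a cut site belongs to the fragment on its right) are assumptions for illustration, not Juicer's actual code.

```python
from bisect import bisect_right

def fragment_index(cut_sites, pos):
    """Return the index of the fragment containing pos.

    cut_sites must be sorted in ascending order. Fragment 0 lies before
    the first cut site, fragment 1 between the first and second, etc.
    """
    return bisect_right(cut_sites, pos)

def is_intra_fragment(sites_by_chrom, chrom1, pos1, chrom2, pos2):
    """True if both read ends fall inside the same restriction fragment."""
    if chrom1 != chrom2:
        return False
    sites = sites_by_chrom[chrom1]
    return fragment_index(sites, pos1) == fragment_index(sites, pos2)

# Hypothetical toy map: three cut sites on chr1.
toy_sites = {"chr1": [100, 250, 400]}

print(is_intra_fragment(toy_sites, "chr1", 120, "chr1", 240))  # same fragment
print(is_intra_fragment(toy_sites, "chr1", 120, "chr1", 300))  # different fragments
```

If no fragment index is ever assigned to the reads, no pair can be counted as intra-fragment, so that statistic stays at zero regardless of the experiment.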

Olga

Alexander Predeus

Jan 28, 2023, 5:41:55 PM
to 3D Genomics
Hi Olga, 

> You are probably thinking fragment-resolution.

I actually have no idea what it means or why we need a restriction map :( I've read the Wiki, but it didn't become much clearer.

What I'm trying to do is run differential loop detection for several cancer & normal samples. I'm trying to run HiCCUPS (which didn't run as part of Juicer since I didn't have a GPU) and HiCCUPSDiff, and also the motif finder - although I'm not sure if that's possible without matched ChIP-seq experiments?

Here's an example of "inter_30.txt" file that I have and the command that was run:

Experiment description: Juicer version 1.6; BWA 0.7.17-r1188; 32 threads; openjdk version "11.0.17" 2022-10-18; /nfs/cellgeni/Juicer//scripts/juicer.sh -D /nfs/cellgeni/Juicer/ -d /lustre/scratch126/cellgen/cellgeni/tickets/tic-1997/data/MY_200531_12930315 -z /nfs/cellgeni/Juicer/references/GRCh38_v32_modified.fa -y /nfs/cellgeni/Juicer/restriction_sites/GRCh38_v32_modified_Arima.txt -t 32 -p /nfs/cellgeni/Juicer/references/GRCh38_v32_modified.chrom.sizes -g /nfs/cellgeni/Juicer/references/GRCh38_v32_modified.chrom.sizes
Sequenced Read Pairs:  452,127,408
 Normal Paired: 285,227,765 (63.09%)
 Chimeric Paired: 138,275,409 (30.58%)
 Chimeric Ambiguous: 26,042,296 (5.76%)
 Unmapped: 2,581,938 (0.57%)
 Ligation Motif Present: 0 (0.00%)
Alignable (Normal+Chimeric Paired): 423,503,174 (93.67%)
WARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance.
WARN [2023-01-20T01:47:19,086]  [Globals.java:138] [main]  Development mode is enabled
Unique Reads: 268,181,460 (59.32%)
PCR Duplicates: 152,958,908 (33.83%)
Optical Duplicates: 2,362,806 (0.52%)
Library Complexity Estimate: 428,688,650
Intra-fragment Reads: 0 (0.00% / 0.00%)
Below MAPQ Threshold: 41,615,413 (9.20% / 15.52%)
Hi-C Contacts: 226,566,047 (50.11% / 84.48%)
 Ligation Motif Present: 0  (0.00% / 0.00%)
 3' Bias (Long Range): 50% - 50%
 Pair Type %(L-I-O-R): 25% - 25% - 25% - 25%
Inter-chromosomal: 45,705,964  (10.11% / 17.04%)
Intra-chromosomal: 180,860,083  (40.00% / 67.44%)
Short Range (<20Kb): 87,445,722  (19.34% / 32.61%)
Long Range (>20Kb): 93,373,677  (20.65% / 34.82%)


From the looks of it, intra-fragment reads are exactly 0, so the map was probably not included? Also, HiCCUPS generates messages "Data not available for chr1 at 5000 resolution" for all the default resolutions (5k, 10k, 25k) - could you please suggest what can be changed in the command for it to work?
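
The check described above (looking for zero counts in the statistics file) can be scripted. This is a minimal sketch that assumes the statistics file uses the "Label: count (percent)" layout shown in this post. The function names and warning texts are made up for illustration, and a zero count is only a hint, not proof, that the site file or ligation motif was not applied.

```python
import re

# Matches lines such as "Intra-fragment Reads: 0 (0.00% / 0.00%)".
# Percentage-only lines ("3' Bias (Long Range): 50% - 50%") are skipped
# because the number must be followed by whitespace or end of line.
STAT_LINE = re.compile(r"^\s*([^:]+):\s+([\d,]+)(?=\s|$)")

def parse_stats(text):
    """Return {label: count}; if a label repeats, the first occurrence wins."""
    counts = {}
    for line in text.splitlines():
        m = STAT_LINE.match(line)
        if m:
            label = m.group(1).strip()
            counts.setdefault(label, int(m.group(2).replace(",", "")))
    return counts

def fragment_warnings(counts):
    """List hints that restriction-site information may not have been used."""
    warnings = []
    if counts.get("Intra-fragment Reads") == 0:
        warnings.append("Intra-fragment Reads is 0: site file may not have been applied")
    if counts.get("Ligation Motif Present") == 0:
        warnings.append("Ligation Motif Present is 0: ligation motif may not have been set")
    return warnings

# A few lines copied from the statistics above.
SAMPLE = """Sequenced Read Pairs:  452,127,408
 Ligation Motif Present: 0 (0.00%)
Intra-fragment Reads: 0 (0.00% / 0.00%)
Hi-C Contacts: 226,566,047 (50.11% / 84.48%)
 3' Bias (Long Range): 50% - 50%
"""

for warning in fragment_warnings(parse_stats(SAMPLE)):
    print(warning)
```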

Thank you very much in advance!

-- Alex

Alexander Predeus

Jan 28, 2023, 5:41:55 PM
to 3D Genomics
Hi Olga, 

Thank you for your reply! I'm trying to run differential loop detection for cancer samples, not assembly. (I would maybe also like to run the motif finder, but I still can't tell from the documentation whether that's possible for samples without matched ChIP-seq experiments, and what exactly the benefits of motif finding are.)

Here's an example of the "inter_30.txt" (note that Juicer was run on a CPU-only node, so the pipeline stopped at Arrowhead and did not run HiCCUPS):


Experiment description: Juicer version 1.6; BWA 0.7.17-r1188; 32 threads; openjdk version "11.0.17" 2022-10-18; /nfs/cellgeni/Juicer//scripts/juicer.sh -D /nfs/cellgeni/Juicer/ -d /lustre/scratch126/cellgen/cellgeni/tickets/tic-1997/data/MY_200531_12930315 -z /nfs/cellgeni/Juicer/references/GRCh38_v32_modified.fa -y /nfs/cellgeni/Juicer/restriction_sites/GRCh38_v32_modified_Arima.txt -t 32 -p /nfs/cellgeni/Juicer/references/GRCh38_v32_modified.chrom.sizes -g /nfs/cellgeni/Juicer/references/GRCh38_v32_modified.chrom.sizes
Sequenced Read Pairs:  452,127,408
 Normal Paired: 285,227,765 (63.09%)
 Chimeric Paired: 138,275,409 (30.58%)
 Chimeric Ambiguous: 26,042,296 (5.76%)
 Unmapped: 2,581,938 (0.57%)
 Ligation Motif Present: 0 (0.00%)
Alignable (Normal+Chimeric Paired): 423,503,174 (93.67%)
WARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance.
WARN [2023-01-20T01:47:19,086]  [Globals.java:138] [main]  Development mode is enabled
Unique Reads: 268,181,460 (59.32%)
PCR Duplicates: 152,958,908 (33.83%)
Optical Duplicates: 2,362,806 (0.52%)
Library Complexity Estimate: 428,688,650
Intra-fragment Reads: 0 (0.00% / 0.00%)
Below MAPQ Threshold: 41,615,413 (9.20% / 15.52%)
Hi-C Contacts: 226,566,047 (50.11% / 84.48%)
 Ligation Motif Present: 0  (0.00% / 0.00%)
 3' Bias (Long Range): 50% - 50%
 Pair Type %(L-I-O-R): 25% - 25% - 25% - 25%
Inter-chromosomal: 45,705,964  (10.11% / 17.04%)
Intra-chromosomal: 180,860,083  (40.00% / 67.44%)
Short Range (<20Kb): 87,445,722  (19.34% / 32.61%)
Long Range (>20Kb): 93,373,677  (20.65% / 34.82%)


From the looks of it (zero intra-fragment reads), the map was not used. I'm also having trouble running HiCCUPS: I get a bunch of "Data not available for chr1 at 5000 resolution" errors. Could you please suggest what needs to be corrected in the command?

Thank you in advance!

-- Alex

Olga Dudchenko

Feb 10, 2023, 2:52:51 AM
to 3D Genomics
Hello,

The zero intra-fragment count suggests that the fragment information was indeed not properly passed on. I would suggest re-checking the fragment file. Also, I don't see you passing along the ligation motif.
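
One way to re-check the fragment file is a quick consistency test against the chrom.sizes file. The sketch below assumes that each line of the site file holds a chromosome name followed by whitespace-separated cut positions in ascending order. That layout is an assumption about the custom Arima file, and `check_site_file` is a hypothetical helper, not part of Juicer.

```python
def check_site_file(site_lines, chrom_sizes):
    """Return a list of problems found in a restriction site file.

    site_lines:  iterable of text lines, "chrom pos1 pos2 ..." (assumed layout)
    chrom_sizes: dict mapping chromosome name to length
    """
    problems = []
    seen = set()
    for n, line in enumerate(site_lines, start=1):
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        chrom, raw_positions = fields[0], fields[1:]
        seen.add(chrom)
        if chrom not in chrom_sizes:
            problems.append(f"line {n}: {chrom} is not in chrom.sizes")
            continue
        try:
            positions = [int(p) for p in raw_positions]
        except ValueError:
            problems.append(f"line {n}: non-integer position for {chrom}")
            continue
        if not positions:
            problems.append(f"line {n}: no cut sites listed for {chrom}")
        elif any(b < a for a, b in zip(positions, positions[1:])):
            problems.append(f"line {n}: positions for {chrom} are not ascending")
        elif positions[-1] > chrom_sizes[chrom]:
            problems.append(f"line {n}: position beyond the end of {chrom}")
    for chrom in chrom_sizes:
        if chrom not in seen:
            problems.append(f"{chrom} is in chrom.sizes but has no site line")
    return problems

# Hypothetical toy input: chrX has no entry in sizes, chr2 is out of order.
sizes = {"chr1": 1000, "chr2": 500}
lines = ["chr1 100 250 1000", "chrX 10 20", "chr2 300 200 500"]
for problem in check_site_file(lines, sizes):
    print(problem)
```

To run it on real files, the lines can come from `open(path)` and the sizes from a two-column read of the chrom.sizes file. A chromosome-name mismatch between the two files is the kind of problem this would surface.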

Olga
