mega.sh

Scott

Mar 25, 2022, 3:47:10 AM
to 3D Genomics
Hi,

I'm using Juicer CPU (jar version: juicer_tools_1.22.01.jar) to generate *.hic contact matrices for multiple replicates and then merge them with mega.sh. Generating the inter.hic and inter_30.hic files with Juicer works without much difficulty, but merging them with the mega.sh script consistently fails with a sorting error.

The pipeline gets about midway through writing the body of the merged *.hic file, then exits with an error that suggests a sorting problem and deletes the .hic file:

................................................................................................................Error: the chromosome combination 12_26 appears in multiple blocks

I have tried pre-sorting the merged1.txt and merged30.txt files using 'sort -k2,2d -k6,6d', but the error persists. I've also tried modifying the sort commands in mega.sh:

sort --parallel=40 -T "${tmpdir}" -m -k2,2d -k6,6d ${merged_names} > "${outputDir}"/merged1.txt
sort --parallel=40 -T "${tmpdir}" -m -k2,2d -k6,6d ${merged_names30} > "${outputDir}"/merged30.txt

replacing them with:

sort --parallel=40 -T ${tmpdir} -m -k2,2d -k6,6d -k4,4n -k8,8n -k1,1n -k5,5n -k3,3n ${merged_names} > ${outputDir}/merged1.txt
sort --parallel=40 -T ${tmpdir} -m -k2,2d -k6,6d -k4,4n -k8,8n -k1,1n -k5,5n -k3,3n ${merged_names30} > ${outputDir}/merged30.txt

I realize this issue has been discussed here and on the git repo to some extent, but none of the posted solutions have worked for me. Any help or insight would be much appreciated and I'd be happy to provide more details if it could be useful. Looking forward to getting your response!

Thanks,
Scott



Yichen LI

May 22, 2022, 12:45:33 AM
to 3D Genomics
Hi,

I am encountering the same error and have tried the same thing but with no luck. Have you found a way to solve it? Many thanks for the help!

Thanks,
Kelly

Pavla Navratilova

May 23, 2022, 9:02:13 AM
to 3D Genomics
Same problem here with building the .hic file
.............................Error: the chromosome combination 1_1 appears in multiple blocks

I tried sort -k2,2d -k6,6d but the problem persisted.

Pavla Navratilova

May 23, 2022, 9:21:41 AM
to 3D Genomics
My file looks sorted:
head -100 merged1.txt
0 chr1H 157 0 16 chr1H 36 2
0 chr1H 162 1 16 chr1H 96 2
0 chr1H 162 1 16 chr1H 117 3
0 chr1H 162 1 16 chr1H 129 3
0 chr1H 189 1 16 chr1H 122 3
0 chr1H 252 1 16 chr1H 129 3
0 chr1H 252 1 16 chr1H 176 3
0 chr1H 191 1 16 chr1H 8430 55
0 chr1H 252 1 16 chr1H 8430 55
0 chr1H 252 1 16 chr1H 89173846 673842
0 chr1H 252 1 0 chr1H 130644422 1004264
0 chr1H 250 1 16 chr1H 137467211 1057886
0 chr1H 252 1 0 chr1H 152177789 1171115
0 chr1H 252 1 0 chr1H 196368711 1509614
0 chr1H 252 1 0 chr1H 310104724 2379832
0 chr1H 252 1 0 chr1H 368487161 2850065
0 chr1H 252 1 0 chr1H 400855287 3113666
0 chr1H 252 1 0 chr1H 434031217 3383240
0 chr1H 252 1 0 chr1H 448537838 3500815
0 chr1H 252 1 16 chr1H 451745485 3527220
0 chr1H 251 1 16 chr1H 483297718 3786048
0 chr1H 251 2 16 chr1H 129 3
0 chr1H 252 2 16 chr1H 151 3
0 chr1H 370 2 16 chr1H 345 3
0 chr1H 373 2 16 chr1H 283 3
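
Looking at the first 100 lines only shows the start of the file. A rough way to scan the whole file is sketched below, assuming the "appears in multiple blocks" error means that a chromosome pair (columns 2 and 6) shows up again after a different pair has intervened. The function name check_pair_blocks and the demo records are hypothetical and not part of Juicer.

```shell
#!/bin/sh
set -eu

# Hypothetical helper (not part of Juicer): report every chromosome pair
# (columns 2 and 6) that starts a new block after it has already been seen.
check_pair_blocks() {
  awk '
    BEGIN { bad = 0 }
    {
      pair = $2 "_" $6
      if (pair != prev) {
        # A new block starts here; it is a problem if this pair was seen before.
        if (pair in seen) {
          printf "line %d: pair %s starts a second block\n", NR, pair
          bad = 1
        }
        seen[pair] = 1
        prev = pair
      }
    }
    END { exit bad }
  ' "$1"
}

# Made-up demo records in the same column layout as merged1.txt.
# The chr1/chr1 pair is deliberately interrupted by a chr1/chr2 record.
demo=$(mktemp)
printf '%s\n' \
  "0 chr1 100 1 16 chr1 200 1" \
  "0 chr1 300 2 16 chr2 150 1" \
  "0 chr1 400 2 16 chr1 500 3" > "${demo}"

if check_pair_blocks "${demo}"; then
  echo "each chromosome pair forms a single block"
else
  echo "at least one chromosome pair is split across blocks"
fi
```

On a real run the argument would be the merged1.txt or merged30.txt written by mega.sh rather than the demo file.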

Moshe Olshansky

May 23, 2022, 9:14:18 PM
to 3D Genomics
Hi Scott (and the previous authors),

sort -m only merges: it assumes that every input file is already sorted correctly and does not sort anything itself. So I would try sorting each file separately (sort -k2,2d -k6,6d -k3,3n -k7,7n) and only then merging them with sort -m -k2,2d -k6,6d -k3,3n -k7,7n
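
The two steps can be sketched roughly as follows. The replicate files here are made-up demo data written to a temporary directory, and fixing the locale with LC_ALL=C is an assumption added so that every sort call collates the same way; in a real run the inputs would be the per-replicate files that mega.sh collects in ${merged_names}.

```shell
#!/bin/sh
set -eu

# Assumption: pin the locale so all sort invocations use the same collation.
export LC_ALL=C

# Made-up demo inputs: two unsorted "replicate" files in the merged1.txt layout.
workdir=$(mktemp -d)
printf '%s\n' \
  "0 chr2 500 3 16 chr2 900 5" \
  "0 chr1 300 2 16 chr1 100 1" \
  "0 chr1 100 1 16 chr2 400 2" > "${workdir}/rep1.txt"
printf '%s\n' \
  "0 chr1 200 1 16 chr1 50 0" \
  "0 chr2 100 1 16 chr2 700 4" \
  "0 chr1 900 4 16 chr2 100 1" > "${workdir}/rep2.txt"

# Sort keys from the suggestion above (left unquoted on purpose so they split into words).
keys="-k2,2d -k6,6d -k3,3n -k7,7n"

# Step 1: fully sort each replicate file on its own.
for rep in rep1 rep2; do
  sort $keys "${workdir}/${rep}.txt" > "${workdir}/${rep}.sorted.txt"
done

# Step 2: merge the already-sorted files with the same keys (-m merges, it does not sort).
sort -m $keys "${workdir}/rep1.sorted.txt" "${workdir}/rep2.sorted.txt" \
  > "${workdir}/merged1.txt"

# Step 3: sort -c exits non-zero if the merged file is not in the expected order.
sort -c $keys "${workdir}/merged1.txt" && echo "merged1.txt is sorted"
```

The same key list has to be used for the per-file sort and for the merge; otherwise sort -m interleaves the inputs under an order they were never sorted in.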

Best regards,
Moshe.

Yichen LI

May 24, 2022, 10:40:51 AM
to 3D Genomics
Hi Moshe,

Many thanks for the suggestion. It works for me!

Best regards,
Kelly

Pavla Navratilova

May 26, 2022, 3:55:48 AM
to 3D Genomics
Hi, could you please elaborate a bit on that? Am I supposed to sort each replicate's merged_dedup.bam and then restart mega.sh?
I had hoped I could use the output of the merge command from mega.sh (samtools merge -c -t cb "$cThreadString" "${outputDir}"/mega_merged_dedup.bam "${bams_to_merge}"). But I cannot get past the "# Create statistics file" step, and "pre" fails as I described in my previous post.
Thanks,
Pavla