merged.txt file sizes don't match with mega.sh vs mega_from_bams.sh

Mark Mackiewicz

Sep 18, 2022, 7:22:17 PM
to 3D Genomics
Hi,
I have processed six independent samples with the CPU version of juicer.sh (juicer_tools_v2.13.07). I then used the mega.sh script to merge them and got merged1.txt and merged30.txt files of ~14 Gb and ~11 Gb, respectively. When I instead run mega_from_bams.sh on the merged_dedup.bam files that juicer.sh produced for each individual sample, I get merged1.txt and merged30.txt files of only ~8 Gb and ~6 Gb, respectively.

I know mega.sh merges the individual merged1.txt and merged30.txt files produced by juicer.sh, whereas mega_from_bams.sh uses samtools view to create the merged1.txt and merged30.txt files (although this is similar to what juicer.sh does to generate the per-sample merged1.txt and merged30.txt files). Is there any reason to expect the merged .txt outputs from mega.sh and mega_from_bams.sh to differ in file size? I have not yet run juicer pre to compare the sizes of the corresponding .hic files. (The mega scripts crash for me during the .hic generation step because they can't find the chrom.sizes file, so I have to run juicer pre on the outputs separately.)
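
File size depends on both the number of records and the number of columns per record, so one way to narrow this down might be to compare line counts and per-line field counts between the two outputs. The following is only a minimal sketch: the function names `summarize_file` and `compare_files` are hypothetical, and it assumes the merged .txt files are plain text with one whitespace-delimited record per line. It is not part of Juicer.

```python
from collections import Counter


def summarize_file(path):
    """Return (line_count, Counter mapping fields-per-line -> number of lines).

    Assumes a plain-text file with one whitespace-delimited record per line.
    The file is streamed, so memory use stays small even for multi-GB inputs.
    """
    line_count = 0
    field_counts = Counter()
    with open(path, "r", errors="replace") as handle:
        for line in handle:
            line_count += 1
            field_counts[len(line.split())] += 1
    return line_count, field_counts


def compare_files(path_a, path_b):
    """Compare record counts and column layouts of two text files."""
    a_lines, a_fields = summarize_file(path_a)
    b_lines, b_fields = summarize_file(path_b)
    return {
        # Same number of records in both files?
        "same_line_count": a_lines == b_lines,
        # Same set of distinct fields-per-line values in both files?
        "same_field_layout": set(a_fields) == set(b_fields),
        "lines": (a_lines, b_lines),
        "fields": (dict(a_fields), dict(b_fields)),
    }
```

Calling `compare_files` on the merged1.txt from mega.sh and the one from mega_from_bams.sh would show whether the two have the same number of records but different columns per record, or a genuinely different number of records. Either result would point to a different cause for the size gap.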

Mark