Hello
Sorry for the delayed reply.
1. I have about 750 GB of paired-end Hi-C reads and a 3.1 Gb genome assembly. When I ran the CPU scripts on the head node, the job was killed, so I submitted the CPU task through SLURM:
#!/bin/sh
#SBATCH -c 40 --mem 40G
#SBATCH --partition=pNormal
#SBATCH --qos=normal
#SBATCH --get-user-env
#SBATCH -o Ma6.juicer.sh.job2022117115111/Ma6.juicer.sh.split.sh.1.sl.out
#SBATCH -e Ma6.juicer.sh.job2022117115111/Ma6.juicer.sh.split.sh.1.sl.err
#SBATCH -D /data/01/user157/fenshu/HIFI/Hic-anchor/Ma6/juicer/work
./scripts/juicer.sh -g Ma6.genome -d /data/01/user157/fenshu/HIFI/Hic-anchor/Ma6/juicer/work/hic_data -s MboI -p restriction_sites/Ma6.genome.chrom.sizes -y restriction_sites/Ma6.genome_MboI.txt -z references/M.aspalax.6.bp.p_ctg.fasta -D /data/01/user157/fenshu/HIFI/Hic-anchor/Ma6/juicer/work -t 80
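One thing worth noting about the command above: the script requests 40 CPUs (`-c 40`) but passes `-t 80` to juicer.sh. A small sketch (my own suggestion, not from the Juicer docs) that derives the thread count from the actual SLURM allocation so the two values cannot drift apart:

```shell
# Size Juicer's -t from the SLURM allocation instead of hard-coding it.
# SLURM_CPUS_PER_TASK is set inside the job to the value given via -c.
THREADS=${SLURM_CPUS_PER_TASK:-8}   # 8 is an arbitrary fallback outside SLURM
echo "juicer.sh would be launched with -t ${THREADS}"
```

Inside the batch job this keeps `-t` equal to the CPUs SLURM actually granted.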
Although I set `-t 80`, most steps run on only a single thread. The paths of bwa, samtools, and java have been added to my .zshrc and are available on the current node. Maybe I can remove all the sections such as "load bwa" and "load java", and instead define the job like this in SLURM:
#!/bin/bash
#SBATCH -N 1
#SBATCH -n 8
#SBATCH -t 5-5:00:00
#SBATCH --mem-per-cpu=2G
#SBATCH --partition=pNormal
#SBATCH --qos=normal
#SBATCH --get-user-env
#SBATCH --job-name=Ma6.juicer
#SBATCH --mail-type=end
#SBATCH --export=ALL
#SBATCH --array=1-10
#SBATCH --output=Ma6.juicer.out
#SBATCH --error=Ma6.juicer.err
./scripts/juicer.sh -g Ma6.genome -d /data/01/user157/fenshu/HIFI/Hic-anchor/Ma6/juicer/work/hic_data -s MboI -p restriction_sites/Ma6.genome.chrom.sizes -y restriction_sites/Ma6.genome_MboI.txt -z references/M.aspalax.6.bp.p_ctg.fasta -D /data/01/user157/fenshu/HIFI/Hic-anchor/Ma6/juicer/work -S final -t 80
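On the .zshrc point: SLURM batch jobs run in non-interactive shells, which typically do not source ~/.zshrc, so PATH additions made there can silently disappear inside the job. A quick check (my own sanity-check snippet, not part of Juicer) that can be placed at the top of the batch script:

```shell
# Report whether each tool Juicer needs is visible inside the batch job's PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: $(command -v "$1")"
  else
    echo "$1: NOT FOUND in batch PATH"
  fi
}
for t in bwa samtools java; do check_tool "$t"; done
```

If any tool reports NOT FOUND, the module-load lines (or explicit PATH exports) are still needed in the batch script even though the tools work on the login node.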
2. Also, when the CPU run finished, I got an aligned folder containing these results: header, inter_30.txt, inter.txt, merged1.txt, merged30.txt, and merged_dedup.bam. However, it lacks collisions.txt, dups.txt, opt_dups.txt, merged_sort, and merged_nodups.txt, which are needed to run 3D-DNA. Because I only need merged_nodups.txt for 3D-DNA, this is the command I used after the first run crashed due to out of memory:
#!/bin/sh
#SBATCH -c 40 --mem 90G
#SBATCH --partition=pNormal
#SBATCH --qos=normal
#SBATCH --get-user-env
#SBATCH -o Ma6.juicer.sh.job2022121173823/Ma6.juicer.sh.split.sh.1.sl.out
#SBATCH -e Ma6.juicer.sh.job2022121173823/Ma6.juicer.sh.split.sh.1.sl.err
#SBATCH -D /data/01/user157/fenshu/HIFI/Hic-anchor/Ma6/juicer/work
./scripts/juicer.sh -g Ma6.genome -d /data/01/user157/fenshu/HIFI/Hic-anchor/Ma6/juicer/work/hic_data -s MboI -p restriction_sites/Ma6.genome.chrom.sizes -y restriction_sites/Ma6.genome_MboI.txt -z references/M.aspalax.6.bp.p_ctg.fasta -D /data/01/user157/fenshu/HIFI/Hic-anchor/Ma6/juicer/work -S dedup -t 80 -e early
(1) Why can't I generate merged_nodups.txt? My log file shows:
Using restriction_sites/Ma6.genome_MboI.txt as site file
(-: Mark duplicates done successfully
(-: Pipeline successfully completed (-:
Run cleanup.sh to remove the splits directory
Check /data/01/user157/fenshu/HIFI/Hic-anchor/Ma6/juicer/work/hic_data/aligned for results
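For context on why merged_dedup.bam appears instead of merged_nodups.txt: newer (2.x) versions of Juicer keep the deduplicated pairs in BAM form. The following is only a sketch of regenerating the text file from that BAM; it assumes samtools is available and that the checkout ships scripts/common/sam_to_pre.awk, so please verify both paths in your own tree before relying on it:

```shell
# Hypothetical conversion of merged_dedup.bam back to merged_nodups.txt,
# guarded so it only runs when the inputs are actually present.
juicer_dir=/data/01/user157/fenshu/HIFI/Hic-anchor/Ma6/juicer/work
bam="$juicer_dir/hic_data/aligned/merged_dedup.bam"
if [ ! -f "$bam" ]; then
  echo "merged_dedup.bam not found at $bam"
elif ! command -v samtools >/dev/null 2>&1; then
  echo "samtools not found in PATH"
else
  samtools view -@ 8 -O SAM "$bam" \
    | awk -v mnd=1 -f "$juicer_dir/scripts/common/sam_to_pre.awk" \
    > "$juicer_dir/hic_data/aligned/merged_nodups.txt"
fi
```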
(2) Why do different versions of juicer.sh exist, and which one is correct?