Hi all,
I am relatively new to Nextflow and very new to running jobs on a cluster. My Nextflow script sits idle for hours at a time on the cluster, doing nothing. Also, Nextflow submits jobs to nodes other than the one I reserved, so I cannot request much before I hit resource limits, which really dampens the parallelization.
A bit more detail: my job does heavy I/O with light processing on a large number of files, each of which takes roughly 1-3 minutes. I removed all directives from my processes to simplify things.
IT is not thrilled with my script as is, since it bogs down the scheduler. I think this should be addressable. Due to these scheduling issues, my script actually runs faster on my laptop than on the HPC: about 18 hours on my laptop versus 2 days on the cluster.
Where can I find more information on controlling how Nextflow submits jobs to SLURM?
Thanks in advance,
Matt
Here is my SLURM submission script:
############################################################
#!/bin/bash
#SBATCH -n 7
#SBATCH -N 1
#SBATCH --partition=general
#SBATCH --mem=8GB
#SBATCH -t 01-00:00:00
#SBATCH --mail-type=BEGIN,REQUEUE,FAIL,END
#SBATCH --mail-user= foo bar
work_dir=/pine/scr/m/j/mjrich
image_set=$1
$work_dir/nextflow -c $work_dir/nextflow.config run $work_dir/image_processing.nf --folder $image_set
###########################################################
Here is my Nextflow config:
############################################################
process.container = './longleaf.sif'
singularity.enabled = true
singularity.autoMounts = true
executor {
    name = 'slurm'
    queueSize = 25
    pollInterval = '10 sec'
    dumpInterval = '10 min'
    exitReadTimeout = '5 min'
    killBatchSize = 50
}
#############################################################
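In case it helps, here is a simplified sketch of what one of my processes looks like after stripping the directives (the process name, file pattern, and script are placeholders, not my actual code):

```nextflow
// Hypothetical example process: light per-file work, no directives.
// Each task is one SLURM job under the 'slurm' executor above.
process processImage {
    input:
    path image_file

    output:
    path "${image_file.baseName}.out"

    script:
    """
    # placeholder for the real 1-3 minute I/O-heavy step
    process_one_image.sh ${image_file} > ${image_file.baseName}.out
    """
}
```

Since each task is so short, I suspect the per-job scheduling overhead is what bogs things down, but I am not sure what the recommended fix is.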