GPUs are not found so HICCUPs cannot be run

Assa Yeroslaviz

Mar 23, 2022, 3:59:37 PM
to 3d-ge...@googlegroups.com
Hi,

Maybe you can help me solve this mystery: something doesn't work the way it should.

I'm running the juicer.sh script on a SLURM cluster with GPUs. I tested it with CUDA version 10.1.
I keep getting stuck with an error message.

This is the part I have changed in juicer.sh:

else
    isVoltron=1
    echo "we are Voltron"
    #export PATH=/gpfs0/biobuild/biobuilds-2016.11/bin:$PATH
    # unset MALLOC_ARENA_MAX # on IBM platform this parameter has significant speed efect but may result in memory leaks
    load_bwa=""   # "spack load b...@0.7.17 arch=\`spack arch\`"
    load_awk=""   # "spack load ga...@4.1.4 arch=\`spack arch\`"
    load_gpu="module load gcc/8;module load cuda/10.1;"   # "spack load ...
    load_samtools="" # "spack load samt...@1.13...
    call_bwameth="" # "/gpfs0/home/neva/bwa-meth/bwameth.py"


On our cluster, modules are loaded with the module load command.
To be able to use bwa and samtools, I have created a conda env with these tools. I even installed cuda-nvcc to make sure that nvcc is available.

But I keep getting stuck with this error:

$ less debug/hiccups_wrap-55866.err
Warning Hi-C map may be too sparse to find many loops via HiCCUPS.
jcuda.CudaException: Failed to initialize the driver: CUDA_ERROR_NO_DEVICE
        at jcuda.utils.KernelLauncher.initialize(KernelLauncher.java:606)
        at jcuda.utils.KernelLauncher.<init>(KernelLauncher.java:586)
        at jcuda.utils.KernelLauncher.create(KernelLauncher.java:393)
        at jcuda.utils.KernelLauncher.create(KernelLauncher.java:321)
        at jcuda.utils.KernelLauncher.compile(KernelLauncher.java:270)
        at juicebox.tools.utils.juicer.hiccups.GPUController.<init>(GPUController.java:72)
        at juicebox.tools.clt.juicer.HiCCUPS.buildGPUController(HiCCUPS.java:558)
        at juicebox.tools.clt.juicer.HiCCUPS.runCoreCodeForHiCCUPS(HiCCUPS.java:485)
        at juicebox.tools.clt.juicer.HiCCUPS.access$200(HiCCUPS.java:158)
        at juicebox.tools.clt.juicer.HiCCUPS$1.run(HiCCUPS.java:414)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
GPU/CUDA Installation Not Detected
Exiting HiCCUPS


I don't know what is going wrong here. I can load the cuda/10.1 module manually and then run the HiCCUPS script on its own without errors.
This is when the module is not manually loaded:

$ module list
Currently Loaded Modulefiles:
  1) intel/18.0.5   2) openmpi/4      3) jdk/8          4) bwa/0.7.17

(base) yeroslaviz@hpcl8001:Juicer_SLURM$ ./scripts/juicer_hiccups.sh -j scripts/juicer_tools

HiCCUPS:

GPUs are not installed so HiCCUPs cannot be run
...

But when I load the module:

(base) yeroslaviz@hpcl8001:Juicer_SLURM$ module load cuda/10.1
(base) yeroslaviz@hpcl8001:Juicer_SLURM$ module list
Currently Loaded Modulefiles:
  1) intel/18.0.5   2) openmpi/4      3) jdk/8          4) bwa/0.7.17     5) cuda/10.1

(base) yeroslaviz@hpcl8001:Juicer_SLURM$ ./scripts/juicer_hiccups.sh -j scripts/juicer_tools

HiCCUPS:

scripts/juicer_tools hiccups /fs/home/yeroslaviz/projects/Dept_Tachibana/Laura/Juicer_SLURM/aligned/inter_30.hic /fs/home/yeroslaviz/projects/Dept_Tachibana/Laura/Juicer_SLURM/aligned/inter_30_loops
Reading file: /fs/home/yeroslaviz/projects/Dept_Tachibana/Laura/Juicer_SLURM/aligned/inter_30.hic
No valid configurations specified, using default settings
Using 1 CPU thread(s)
Warning Hi-C map may be too sparse to find many loops via HiCCUPS.
Default settings for 5kb, 10kb, and 25kb being used
Running HiCCUPS for resolution 5000
...


So I know that I have GPUs and I know how to load them. Why doesn't it work within the script?
I'm able to create the .hic file, but I can't process it. 
I'm not even sure where to look for the error. 

What am I missing?

thanks

Assa

Muhammad Shamim

Apr 3, 2022, 3:39:00 PM
to 3D Genomics
Can you run this test?

Assa Yeroslaviz

Apr 4, 2022, 4:05:23 AM
to 3D Genomics
Yes, I did that already, and yes, I can see the GPUs there:

$ ./detect_gpu
Number of devices found: 2
Device Number: 0
  Device name: Quadro RTX 5000
  Memory Clock Rate (KHz): 7001000
  Memory Bus Width (bits): 256
  Peak Memory Bandwidth (GB/s): 448.064000

Device Number: 1
  Device name: Quadro RTX 5000
  Memory Clock Rate (KHz): 7001000
  Memory Bus Width (bits): 256
  Peak Memory Bandwidth (GB/s): 448.064000


I can also run juicer_hiccups.sh by itself, outside of the juicer.sh script:

$ ./scripts/juicer_hiccups.sh -j scripts/juicer_tools -i aligned_manualHICCUP_succeeded/inter_30.hic -g hg19

HiCCUPS:

scripts/juicer_tools hiccups aligned_manualHICCUP_succeeded/inter_30.hic aligned_manualHICCUP_succeeded/inter_30_loops
Reading file: aligned_manualHICCUP_succeeded/inter_30.hic
No valid configurations specified, using default settings
Using 1 CPU thread(s)
Warning Hi-C map may be too sparse to find many loops via HiCCUPS.
Default settings for 5kb, 10kb, and 25kb being used
Running HiCCUPS for resolution 5000
...


I don't understand the problem, as I can't pin down the error. Any idea how to test this?

To make sure that the GPUs can be detected from within the script, I added the detect_gpu call to juicer.sh. When it runs, I see the same output on STDOUT. So the script can detect the GPUs, but somehow can't load the module.
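
One way to narrow this down is to compare what an interactive shell and a submitted job each see. The sketch below is a hypothetical helper (report_gpu_env is not part of Juicer). It prints the node name, the CUDA_VISIBLE_DEVICES variable (which Slurm typically sets for jobs that request GPUs via --gres, depending on how the cluster is configured), and whether nvcc and nvidia-smi are on PATH. Running it once on the command line and once inside the batch job should show where the two environments differ.

```shell
#!/usr/bin/env bash
# Hypothetical diagnostic helper, not part of Juicer.
# Prints what the current shell can see of the GPU software stack.
report_gpu_env() {
    # Node the code is actually running on (login node vs. compute node).
    echo "host=$(uname -n)"
    # Assumption: on many Slurm setups this is set for jobs that request a GPU.
    echo "CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES:-<unset>}"
    # Check whether the CUDA compiler and the driver utility are on PATH.
    for tool in nvcc nvidia-smi; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool=$(command -v "$tool")"
        else
            echo "$tool=<not on PATH>"
        fi
    done
}

report_gpu_env
```

If the interactive run lists nvcc and the job run does not, the module load line is probably not reaching the job; if CUDA_VISIBLE_DEVICES is unset only inside the job, the job may not have been given a GPU at all.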


I also want to make sure I'm using the correct version of everything. I have downloaded and am running juicer_tools.jar together with module load gcc/8;module load cuda/10.1. Is this combination correct, or should I switch to other versions?


thanks

Assa

Muhammad Shamim

Apr 5, 2022, 7:13:48 AM
to 3D Genomics
So the GPU version of HiCCUPS is working correctly when you call it manually, but not within the juicer script?
Are you using Juicer 1.6 or Juicer 2? And which Juicer Tools jar version?

Can you try replacing lines 70-133 here with just the variable settings for your system?

You might need to specify the request for the GPU in the job header
e.g. something like
sbatch_req="#SBATCH --gres=gpu:kepler:1"

What are the exact instructions you use when manually loading the GPU/CUDA?
Is there a cluster/systems admin who may be able to advise you on the setup for your cluster?



Assa Yeroslaviz

Apr 6, 2022, 12:44:18 AM
to 3d-ge...@googlegroups.com
Hi Muhammad, and thanks for the reply.

It looks like I have solved it. I can't be 100% sure that I didn't break something else along the way, so I would appreciate it if you could look at the modified script and tell me whether everything looks right.

I'm using Juicer 2.0 (based on the shell script) and Juicer Tools version 2.10.1.

I have done the following modifications to the script:
1. I deleted all the lines related to Rice or BCM at the beginning, as well as those around lines ~464 (memory allocation) and ~1034 (CPU allocation).
2. Following our cluster documentation, I also changed line ~1553 to this:
        sbatch_req="#SBATCH --gres=gpu:1" # modified from sbatch_req="#SBATCH --gres=gpu:kepler:1"
3. In the part where the batch script for HiCCUPS is written, I commented out many lines that depend on the variable $isNots. Can you please explain what this variable is for? I think this is the main reason the GPU module was not loaded.
The condition if [[  "$isNots" -eq 1 ]] was never true, so the line that loads the GPU module was never added. In my case the value of $isNots was always 0, so the GPU load command was skipped.
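
The effect described in point 3 can be reproduced in isolation. The following is a minimal sketch, not code taken from juicer.sh: build_hiccups_job and its flag argument are hypothetical stand-ins for the script's generated batch job and the $isNots check, while load_gpu and sbatch_req use the values quoted in this thread. It shows that when the flag is 0 the generated job contains no module load line, which would leave the job without CUDA on its path.

```shell
#!/usr/bin/env bash
# Illustrative sketch only: the variable values follow the thread, but
# build_hiccups_job is a hypothetical function, not code from juicer.sh.

load_gpu="module load gcc/8;module load cuda/10.1;"
sbatch_req="#SBATCH --gres=gpu:1"

# Print the batch script that would be submitted for a given flag value.
build_hiccups_job() {
    flag=$1
    gpu_setup=""
    # The GPU module line is only added when the flag equals 1.
    if [ "$flag" -eq 1 ]; then
        gpu_setup=$load_gpu
    fi
    cat <<EOF
#!/bin/bash -l
${sbatch_req}
${gpu_setup}
./scripts/juicer_hiccups.sh -j scripts/juicer_tools
EOF
}

echo "--- flag=0 ---"; build_hiccups_job 0
echo "--- flag=1 ---"; build_hiccups_job 1
```

Printing the generated job text like this, before it is submitted, is a quick way to confirm whether the module load line made it into the job.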

I also don't see where the batch script for HiCCUPS is saved. Is it written to a file somewhere? I have a few batch files in the debug folder (*.slurm), but nothing for the HiCCUPS step. Is it saved somewhere else, or not at all?
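
If the HiCCUPS job is submitted by feeding a here-document straight to sbatch on standard input (an assumption about how this step is submitted, suggested by the absence of a file in debug/), then the submitting script itself writes nothing to disk. A rough sketch of keeping a copy is shown below. submit_and_keep and SUBMIT_CMD are hypothetical names, and SUBMIT_CMD defaults to cat so the example runs without Slurm; on a cluster it could be set to sbatch.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of Juicer: keep a copy of a batch script that
# arrives on stdin before handing it to the submit command.
# SUBMIT_CMD defaults to "cat" so the sketch runs without Slurm.
SUBMIT_CMD=${SUBMIT_CMD:-cat}

submit_and_keep() {
    copy_path=$1
    # tee writes stdin to the file and passes it on unchanged.
    tee "$copy_path" | $SUBMIT_CMD
}

# Demo: write the copy into a temporary directory.
out_dir=$(mktemp -d)
submit_and_keep "$out_dir/hiccups_wrap.slurm" <<EOF
#!/bin/bash -l
#SBATCH --gres=gpu:1
module load gcc/8;module load cuda/10.1;
./scripts/juicer_hiccups.sh -j scripts/juicer_tools
EOF
echo "saved copy: $out_dir/hiccups_wrap.slurm"
```

The saved file can then be inspected to check that the GPU request and the module load line are both present.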

Please see the attached modified script and let me know if I missed something.

thanks

Assa



juicer_modified.sh