Juicer pipeline breakdown before alignment

Aoi Summer

Apr 28, 2017, 12:10:48 AM
to 3D Genomics

Hi,

    I am using Juicer to analyze raw Hi-C data, but the pipeline breaks down before the alignment step, like this:

(-: Looking for fastq files...fastq files exist

Job <2858> is submitted to queue <normal>.

(-: Aligning files matching /work/home/myjuicer/fastq/*_R*.fastq*

 in queue normal to genome zebrafish with site file /work/home/myjuicer/restriction_sites/zebrafish_MboI.txt

(-: Created /work/home/jfwang/myjuicer/splits and /work/home/myjuicer/aligned.

(-: Starting job to launch other jobs once splitting is complete

Job <2859> is submitted to queue <normal>.

Job <2860> is submitted to queue <normal>.

Job <2861> is submitted to queue <normal>.

long: No such queue. Job not submitted.

/a1493261906_mergeHi-C-lib_S1_L003_001.fastq.gz: No matching job found. Job not submitted.

/a1493261906_mergeHi-C-lib_S1_L003_001.fastq.gz: No matching job found. Job not submitted.

/a1493261906_chimeric*: No matching job found. Job not submitted.

long: No such queue. Job not submitted.

/a1493261906_fragmerge: No matching job found. Job not submitted.

/a1493261906_fragmerge*: No matching job found. Job not submitted.

/a1493261906_osplit: No matching job found. Job not submitted.

/a1493261906_osplit: No matching job found. Job not submitted.

/a1493261906_launch: No matching job found. Job not submitted.

/a1493261906_postproc_wrap: No matching job found. Job not submitted.

/a1493261906_postproc_wrap: No matching job found. Job not submitted.

(-: Finished adding all jobs... please wait while processing.


    I checked the results; it seems the alignment step never ran, and I can't use the -S merge argument.

                                                                                                                             

(-: Looking for fastq files...fastq files exist                                                                              

Job <2875> is submitted to queue <normal>.                                                                                   

long: No such queue. Job not submitted.                                                                                      

/a1493347801_fragmerge: No matching job found. Job not submitted.                                                            

/a1493347801_fragmerge*: No matching job found. Job not submitted.                                                           

/a1493347801_osplit: No matching job found. Job not submitted.                                                               

/a1493347801_osplit: No matching job found. Job not submitted.                                                               

/a1493347801_launch: No matching job found. Job not submitted.                                                               

/a1493347801_postproc_wrap: No matching job found. Job not submitted.                                                        

/a1493347801_postproc_wrap: No matching job found. Job not submitted.

(-: Finished adding all jobs... please wait while processing.

   Could you help me figure it out?

   Your tools are really great, with a complete pipeline that covers almost all the essential analyses for Hi-C data.

   Really appreciate your work.

   Regards,

   Aoi

                                                              



Neva Durand

Apr 28, 2017, 2:08:34 AM
to 3D Genomics, Aoi Summer
Thanks for the kind words!

The pipeline calls two queues because on our cluster systems there is usually a "short" queue, where jobs are only allowed to run for a limited amount of time, and a "long" queue, where the time limit is much longer. You can adjust this directly in the script or pass the -l flag. E.g., if your system has just one queue, you would pass -q myqueue -l myqueue
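To confirm which queue names the scheduler rejected, the pipeline output can be scanned for the "No such queue" lines. The sketch below is only illustrative: `missing_queues` is a hypothetical helper, not part of Juicer, and it assumes the message format shown in the log earlier in this thread. The later "No matching job found" lines are most likely a knock-on effect, since those jobs depend on jobs that were never submitted.

```shell
#!/bin/bash
# Sketch: list queue names that the scheduler reported as missing.
# missing_queues is a hypothetical helper; it reads pipeline output on stdin
# and assumes lines of the form "long: No such queue. Job not submitted."
missing_queues() {
    # Keep only the queue name from matching lines, tolerate trailing spaces,
    # and report each name once.
    sed -n 's/^\(.*\): No such queue\. Job not submitted\.[[:space:]]*$/\1/p' | sort -u
}

# Demonstration on three lines copied from the log in this thread.
printf '%s\n' \
    'Job <2861> is submitted to queue <normal>.' \
    'long: No such queue. Job not submitted.' \
    'long: No such queue. Job not submitted.' | missing_queues
```

Any name printed here is a queue that needs to be replaced through -q or -l.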

If your jobs have successfully aligned, you can restart the pipeline at the merge stage with -S merge 
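As a rough sketch of the two suggestions above, the helper below only prints the command line rather than submitting anything. `build_juicer_cmd` is a hypothetical name, `normal` is the queue that appears in the log earlier in this thread, and any other options your run needs (genome, site file, and so on) are left out and would have to be added.

```shell
#!/bin/bash
# Sketch: compose a juicer.sh call that uses one queue for both roles.
# build_juicer_cmd is a hypothetical helper; it prints the command only.
build_juicer_cmd() {
    local queue="$1" stage="${2:-}"
    # -q names the short queue and -l the long queue; both get the same name.
    local args=(-q "$queue" -l "$queue")
    # -S restarts at a later stage, e.g. "merge" once alignment has succeeded.
    if [ -n "$stage" ]; then
        args+=(-S "$stage")
    fi
    echo "juicer.sh ${args[*]}"
}

build_juicer_cmd normal         # prints: juicer.sh -q normal -l normal
build_juicer_cmd normal merge   # prints: juicer.sh -q normal -l normal -S merge
```

Printing first makes it easy to check the flags before anything is sent to the scheduler.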

Hope that helps!
Neva

--


You received this message because you are subscribed to the Google Groups "3D Genomics" group.


To unsubscribe from this group and stop receiving emails from it, send an email to 3d-genomics...@googlegroups.com.


To view this discussion on the web visit https://groups.google.com/d/msgid/3d-genomics/cc523bdb-5a3b-411d-8f95-fb31f2574524%40googlegroups.com.


For more options, visit https://groups.google.com/d/optout.


Aoi Summer

May 2, 2017, 5:23:45 AM
to 3D Genomics
Thanks for your reply!
There is still a problem: my LSF system has only one short queue, with a time limit under 12 h. Changing bsub -W did not work, because juicer.sh used the default queue parameters.
# LSBATCH: User input
#!/bin/bash
#BSUB -q normal  # Soo
        #BSUB -W 12:00
#BSUB -o /work/home/myjuicer/lsf.out
#BSUB -w " exit(/a1493357396_align1Hi-C-lib_S1_L003_001.fastq.gz) || exit(/a1493357396_align2Hi-C-lib_S1_L003_001.fastq.gz) || exit(/a1493357396_mergeHi-C-lib_S1_L003_001.fastq.gz) || exit(/a1493357396_chimericHi-C-lib_S1_L003_001.fastq.gz)  "
#BSUB -J "cleanup_/a1493357396_0"
        #BSUB -g /a1493357396_clean
bkill -g /a1493357396 0  # Soo
        bkill -g /a1493357396_clean 0

------------------------------------------------------------

TERM_OWNER: job killed by owner.
Exited with signal termination: Alarm clock.

Is there any way to solve this? Or should I try the CPU mode?

Neva Durand

May 2, 2017, 5:46:30 AM
to Aoi Summer, 3D Genomics
Hello,

You should talk to your system administrators. 12 h is a very short maximum run time to have across all available queues.

If you have access to a computer with a decent amount of RAM and disk space that doesn't have time limits, you could run the merge sort step on that computer; you would then run the deduplication back on the cluster, and the creation of the hic file and the normalization would also need to run on the single computer.
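For the merge sort step on a standalone machine, the general technique is `sort -m`, which merges files that are each already sorted without re-sorting them. The sketch below uses toy data and toy key columns; `merge_sorted_chunks` and the file names are hypothetical, and the real key list should be copied from the sort command in your copy of juicer.sh rather than from here.

```shell
#!/bin/bash
# Sketch: merge already-sorted chunk files on one machine with `sort -m`.
# merge_sorted_chunks, the file names and the key columns are illustrative only.
merge_sorted_chunks() {
    local out="$1"; shift
    # -m merges inputs that are each already sorted on the same keys.
    # LC_ALL=C keeps the ordering byte-wise and identical across machines.
    LC_ALL=C sort -m -k1,1 -k2,2n "$@" > "$out"
}

# Toy demonstration: two chunks sorted by name (column 1), then position (column 2).
tmpdir="$(mktemp -d)"
printf 'chr1 100\nchr2 50\n' > "$tmpdir/a.sort.txt"
printf 'chr1 20\nchr1 300\n' > "$tmpdir/b.sort.txt"
merge_sorted_chunks "$tmpdir/merged.txt" "$tmpdir"/*.sort.txt
merged="$(cat "$tmpdir/merged.txt")"
rm -rf "$tmpdir"
echo "$merged"
```

Because the inputs are only merged, memory use stays low even for large files; disk space for the output and any temporary files is the main requirement.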

I wouldn't suggest the CPU version in general if you have hundreds of millions of reads, but you could do a hybrid like I suggested above if need be.

Best
Neva




--
Neva Cherniavsky Durand, Ph.D.
Staff Scientist, Aiden Lab

Aoi Summer

May 9, 2017, 9:47:01 AM
to 3D Genomics, stca...@gmail.com
Hello,
       I asked my system administrator to change the queue settings, and the pipeline finished the dedup stage.
       But errors appeared in the subsequent steps:

/work/home/.lsbatch/1494317131.3508: /usr/bin/modulecmd: No such file or directory

/work/home/.lsbatch/1494317131.3508: line 8: java: command not found

/work/home/myjuicer/scripts/juicer_tools: line 24: java: command not found

       My cluster does have Java, and running the java command on its own works well. I changed the module path, but I don't think that caused the error.
      Do you have any idea how to solve this?

Neva Durand

May 9, 2017, 9:57:24 AM
to Aoi Summer, 3D Genomics
Hello,

At the top of the Juicer script, you will see something like

load_java="module load dev/java/jdk1.7"

The module names differ from cluster to cluster (and also depend on the package manager used). You should ask your sysadmin how to load Java, or try the command "module avail" and look for something with Java 1.7. Then modify that line to be whatever is appropriate for your system.
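A quick way to look for a Java module is to filter the output of "module avail". The sketch below assumes the cluster uses an environment-modules style `module` command; `list_java_modules` is a hypothetical helper, and the fallback message simply echoes the alternative of calling Java by its full path.

```shell
#!/bin/bash
# Sketch: list candidate Java modules, if a `module` command is available.
# list_java_modules is a hypothetical helper, not part of Juicer.
list_java_modules() {
    # `module` is usually a shell function, so test for it with `type`.
    if type module >/dev/null 2>&1; then
        # `module avail` commonly writes its listing to stderr, hence 2>&1.
        module avail 2>&1 | grep -i java
    else
        echo "no module command found; call java by its full path instead"
    fi
}

list_java_modules
```

Whatever name turns up would replace dev/java/jdk1.7 in the load_java line.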

If you have a merged_nodups.txt file, once you've fixed this you can restart from the final stage by passing the flag "-S final".
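Before restarting, it can help to check that merged_nodups.txt really exists and is not empty. In the sketch below, `pick_restart_flag` is a hypothetical helper, and the assumption that the file sits in the aligned directory (the one the pipeline reported creating earlier in this thread) should be verified on your system.

```shell
#!/bin/bash
# Sketch: decide whether a restart with "-S final" is possible.
# pick_restart_flag is a hypothetical helper; the directory layout is assumed.
pick_restart_flag() {
    local aligned_dir="$1"
    # -s is true only if the file exists and has a size greater than zero.
    if [ -s "$aligned_dir/merged_nodups.txt" ]; then
        echo "-S final"
    else
        echo "merged_nodups.txt missing or empty in $aligned_dir" >&2
        return 1
    fi
}

# ALIGNED_DIR is an illustrative variable; point it at your own aligned directory.
pick_restart_flag "${ALIGNED_DIR:-./aligned}" || true
```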

Best
Neva


Aoi Summer

May 9, 2017, 10:29:07 AM
to 3D Genomics, stca...@gmail.com
Hello,
      My cluster does not have Java module files, so I tried to skip the load step and use java directly, but that did not seem to work.
      Is it necessary to use the module load command? If so, I'll try contacting my sysadmin again.

Neva Durand

May 9, 2017, 10:30:59 AM
to Aoi Summer, 3D Genomics
Hello,

You should find out how to call Java on your system, yes.  Perhaps you need the full path.
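One possible way to use the full path is to find where java lives on a node where it works and put that directory on PATH in place of the module load line. This is only a sketch: `suggest_load_java` is a hypothetical helper, and it assumes the script runs the contents of load_java as a command before calling java, which the "module load" form suggests but which should be checked in your copy of the script.

```shell
#!/bin/bash
# Sketch: derive a replacement load_java line from java's full path.
# suggest_load_java is a hypothetical helper, not part of Juicer.
suggest_load_java() {
    local java_bin="$1"
    local java_dir
    java_dir="$(dirname "$java_bin")"
    # Prepending the directory to PATH lets later plain `java` calls resolve.
    # The single quotes keep $PATH literal in the printed line.
    printf 'load_java="export PATH=%s:$PATH"\n' "$java_dir"
}

# Run this where `java` already works, e.g. on the login node.
if java_bin="$(command -v java)"; then
    suggest_load_java "$java_bin"
else
    echo "java is not on PATH in this shell" >&2
fi
```

The printed line is meant to be pasted over the existing load_java line; compute nodes must be able to see the same Java directory for this to help.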

Best
Neva
