Question about AQUAS pipeline


Han-Qin Zheng

Feb 17, 2017, 2:02:17 AM
to idr-discuss
Hello,

I tried to use the AQUAS pipeline, but I ran into some problems:

1. MACS2 is included in the AQUAS pipeline, and the "-gensz" and "-chrsz" parameters are required. However, -gensz accepts only "hs" and "mm", and the species of my data is not among these options. How do I solve this?

2. It seems that the AQUAS pipeline uses "bedtools intersect" to remove peaks that overlap the blacklist. However, I don't have a blacklist, and this error occurs:

                # SYS command. line 35

                 TASKTIME=$[$(date +%s)-${STARTTIME}]; echo "Task has finished (${TASKTIME} seconds)."
        StdErr (100000000 lines)  :

                ***** ERROR: -b option given, but no database file specified. *****

                Tool:    bedtools intersect (aka intersectBed)
                Version: v2.26.0
                Summary: Report overlaps between two feature files.

                Usage:   bedtools intersect [OPTIONS] -a <bed/gff/vcf/bam> -b <bed/gff/vcf/bam>

                        Note: -b may be followed with multiple databases and/or
                        wildcard (*) character(s).
                Options:
                       ...
                Notes:
                        (1) When a BAM file is used for the A file, the alignment is retained if overlaps exist,
                        and exlcuded if an overlap cannot be found.  If multiple overlaps exist, they are not
                        reported, as we are only testing for one or more overlaps.

Fatal error: chipseq.bds, line 1272, pos 2. Task/s failed.

Creating checkpoint file: Config or command line option disabled checkpoint file creation, nothing done.



How do I solve it?


3. "-species_file" specifies the path to the reference data. If I want to add a reference for my own species, what kinds of data should I prepare (e.g. BED files, a bowtie index)?


Thank you.

Anshul Kundaje

Feb 17, 2017, 2:04:30 AM
to idr-d...@googlegroups.com, Jin Wook Lee
I'm ccing Jin who can help you with these questions. 

Anshul 

--
You received this message because you are subscribed to the Google Groups "idr-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to idr-discuss+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Jin

Feb 17, 2017, 2:20:25 AM
to idr-d...@googlegroups.com, Anshul Kundaje
1. -chrsz is a chromosome sizes file. You can generate it from a reference sequence .fa using faidx. For -gensz, you can just sum column 2 of the chrsz file. That said, please see item 3: you don't actually have to specify these manually. If you have a URL for the reference .fa file, the installer will automatically generate the genome database from it.

2. Fixed. Please git pull the latest code.

3. You need to modify install_genome_data.sh. Add your genome name to https://github.com/kundajelab/chipseq_pipeline/blob/master/install_genome_data.sh#L39 and https://github.com/kundajelab/chipseq_pipeline/blob/master/install_genome_data.sh#L140 with an appropriate downloadable URL for REF_FA.
```
elif [ $GENOME == YOUR_GENOME_NAME ]; then
  REF_FA=YOUR_GENOME_FA_URL
fi
```
Then simply run install_genome_data.sh [GENOME_NAME] [DEST_DIR]
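
For item 1, the manual route can be sketched roughly as follows. This is only a sketch: it assumes `samtools` provides the faidx step, and the function names and file names (`genome.fa`, `genome.chrom.sizes`) are hypothetical placeholders. The installer in item 3 should make this unnecessary when a FASTA URL is available.

```shell
#!/usr/bin/env bash
# Sketch only: function and file names are hypothetical placeholders.

# make_chrsz REF_FA OUT: index the FASTA with samtools faidx, then keep
# the first two columns of the .fai index (sequence name and length).
make_chrsz() {
  samtools faidx "$1" && cut -f1,2 "$1.fai" > "$2"
}

# sum_gensz CHRSZ: print the sum of column 2 (total sequence length).
# %.0f avoids scientific notation for multi-gigabase totals.
sum_gensz() {
  awk '{ sum += $2 } END { printf "%.0f\n", sum }' "$1"
}

# Usage on a real reference (requires samtools):
#   make_chrsz genome.fa genome.chrom.sizes
#   sum_gensz genome.chrom.sizes

# Self-contained demonstration with a tiny hand-written sizes file:
demo_sizes=$(mktemp)
printf 'chrA\t100\nchrB\t250\nchrC\t50\n' > "$demo_sizes"
sum_gensz "$demo_sizes"   # prints 400
rm -f "$demo_sizes"
```

The printed total is the value to pass as -gensz, and the two-column file is what -chrsz expects.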

Thanks,

Jin

Han-Qin Zheng

Feb 21, 2017, 12:15:51 AM
to idr-discuss, ans...@kundaje.net
Thank you for your suggestion.
But I ran into another problem.
I reinstalled the newest AQUAS pipeline (including all required packages) and followed your suggestion to add the species.

Then this error occurred:

Fatal error: modules/callpeak_spp.bds, line 40, pos 15. Cannot convert 'HashMap' to int

Creating checkpoint file: Config or command line option disabled checkpoint file creation, nothing done.


How do I solve it?




Han-Qin Zheng

Feb 22, 2017, 8:24:33 PM
to idr-discuss, ans...@kundaje.net
Thank you for your help.
I saw the update to the AQUAS pipeline and ran "git pull" to get the latest code.
Then I re-ran the pipeline from the checkpoint (starting from the peak calling stage).
However, an error occurred at the reporting stage:

Fatal error: modules/report.bds, line 421, pos 2. Task/s failed.

Creating checkpoint file: Config or command line option disabled checkpoint file creation, nothing done.


Please fix the bug. Thank you.



Han-Qin Zheng

Feb 23, 2017, 9:37:10 PM
to idr-discuss, ans...@kundaje.net
Here is some additional information about "Fatal error: modules/report.bds, line 421, pos 2. Task/s failed.".

gzip: /data/CHIP/software/AQUAS_TF_ChIP-seq/TF_chipseq_pipeline/../test_file/peak/spp/overlap/zfh1EF_2-4h.nodup_zfh1RL_2-4h.nodup.tagAlign_x_input_2-4h.nodup.tagAlign.naive_overlap.filt.regionPeak.gz: No such file or directory
Task failed:
        Program & line     : 'modules/report.bds', line 412
        Task Name          : 'peak2hammock'
        Task ID            : 'chipseq.bds.20170224_104132_426/task.report.peak2hammock.line_412.id_18'
        Task PID           : '28712'
        Task hint          : 'zcat /data/CHIP/software/AQUAS_TF_ChIP-seq/TF_chipseq_pipeline/../test_file/peak/spp/overlap/zfh1EF_2-4h.nodup_zfh1RL_2-4h.nodup.tagAlign_x_input_2-4h.nodup.tagAlign.naive_overlap.filt.regionPeak.gz | sed  /^ (chr )/!d  | sort -k1,1V -k2,2n > /data/CHIP/software/AQUAS_TF_ChIP-seq/TF_chipseq_pipeline'
        Task resources     : 'cpus: -1  mem: -1.0 B     wall-timeout: 8640000'
        State              : 'ERROR'
        Dependency state   : 'ERROR'
        Retries available  : '1'
        Input files        : '[/data/CHIP/software/AQUAS_TF_ChIP-seq/TF_chipseq_pipeline/../test_file/peak/spp/overlap/zfh1EF_2-4h.nodup_zfh1RL_2-4h.nodup.tagAlign_x_input_2-4h.nodup.tagAlign.naive_overlap.filt.regionPeak.gz]'
        Output files       : '[/data/CHIP/software/AQUAS_TF_ChIP-seq/TF_chipseq_pipeline/../test_file/peak/spp/overlap/zfh1EF_2-4h.nodup_zfh1RL_2-4h.nodup.tagAlign_x_input_2-4h.nodup.tagAlign.naive_overlap.filt.regionPeak.hammock.gz]'
        Script file        : '/data/CHIP/software/AQUAS_TF_ChIP-seq/TF_chipseq_pipeline/chipseq.bds.20170224_104132_426/task.report.peak2hammock.line_412.id_18.sh'
        Exit status        : '1'
        Program            :

                # SYS command. line 414

                 if [[ -f $(which conda) && $(conda env list | grep aquas_chipseq | wc -l) != "0" ]]; then source activate aquas_chipseq; sleep 5; fi;  export PATH=/data/CHIP/software/AQUAS_TF_ChIP-seq/TF_chipseq_pipeline/.:/data/CHIP/software/AQUAS_TF_ChIP-seq/TF_chipseq_pipeline/modules:/data/CHIP/software/AQUAS_TF_ChIP-seq/TF_chipseq_pipeline/utils:${PATH}:/bin:/usr/bin:/usr/local/bin:${HOME}/.bds; set -o pipefail; STARTTIME=$(date +%s)

                # SYS command. line 415

                 zcat /data/CHIP/software/AQUAS_TF_ChIP-seq/TF_chipseq_pipeline/../test_file/peak/spp/overlap/zfh1EF_2-4h.nodup_zfh1RL_2-4h.nodup.tagAlign_x_input_2-4h.nodup.tagAlign.naive_overlap.filt.regionPeak.gz | sed '/^\(chr\)/!d' | sort -k1,1V -k2,2n > /data/CHIP/software/AQUAS_TF_ChIP-seq/TF_chipseq_pipeline/../test_file/peak/spp/overlap/zfh1EF_2-4h.nodup_zfh1RL_2-4h.nodup.tagAlign_x_input_2-4h.nodup.tagAlign.naive_overlap.filt.regionPeak.tmp

                # SYS command. line 417

                 /data/CHIP/software/AQUAS_TF_ChIP-seq/TF_chipseq_pipeline/utils/narrowpeak.py /data/CHIP/software/AQUAS_TF_ChIP-seq/TF_chipseq_pipeline/../test_file/peak/spp/overlap/zfh1EF_2-4h.nodup_zfh1RL_2-4h.nodup.tagAlign_x_input_2-4h.nodup.tagAlign.naive_overlap.filt.regionPeak.tmp /data/CHIP/software/AQUAS_TF_ChIP-seq/TF_chipseq_pipeline/../test_file/peak/spp/overlap/zfh1EF_2-4h.nodup_zfh1RL_2-4h.nodup.tagAlign_x_input_2-4h.nodup.tagAlign.naive_overlap.filt.regionPeak.hammock

                # SYS command. line 418

                 rm -f /data/CHIP/software/AQUAS_TF_ChIP-seq/TF_chipseq_pipeline/../test_file/peak/spp/overlap/zfh1EF_2-4h.nodup_zfh1RL_2-4h.nodup.tagAlign_x_input_2-4h.nodup.tagAlign.naive_overlap.filt.regionPeak.tmp
        StdErr (100000000 lines)  :
                gzip: /data/CHIP/software/AQUAS_TF_ChIP-seq/TF_chipseq_pipeline/../test_file/peak/spp/overlap/zfh1EF_2-4h.nodup_zfh1RL_2-4h.nodup.tagAlign_x_input_2-4h.nodup.tagAlign.naive_overlap.filt.regionPeak.gz: No such file or directory

Fatal error: modules/report.bds, line 421, pos 2. Task/s failed.

Creating checkpoint file: Config or command line option disabled checkpoint file creation, nothing done.


The "Fatal error: modules/report.bds, line 421, pos 2. Task/s failed." error may be caused by a wrong input file (highlighted in yellow in the original post).
But I cannot solve it.
Please help me. Thank you.


lee...@gmail.com

Feb 23, 2017, 10:19:33 PM
to idr-discuss, ans...@kundaje.net
Can you post the full log and the command line you used to run the pipeline? What files are in your /data/CHIP/software/AQUAS_TF_ChIP-seq/TF_chipseq_pipeline/../test_file/peak/spp/overlap?

Han-Qin Zheng

Feb 24, 2017, 2:11:05 AM
to idr-discuss, ans...@kundaje.net
There are two files in "/data/CHIP/software/AQUAS_TF_ChIP-seq/TF_chipseq_pipeline/../test_file/peak/spp/overlap":

zfh1EF_2-4h.nodup_zfh1RL_2-4h.nodup.tagAlign_x_input_2-4h.nodup.tagAlign.naive_overlap.filt.regionPeak.tmp
zfh1EF_2-4h.nodup_zfh1RL_2-4h.nodup.tagAlign_x_input_2-4h.nodup.tagAlign.naive_overlap.regionPeak.gz

My command:
bds chipseq.bds -se -bam1 /data/CHIP/bam_correct_name/zfh1EF_2-4h.bam -bam2 /data/CHIP/bam_correct_name/zfh1RL_2-4h.bam -ctl_bam /data/CHIP/bam_correct_name/input_2-4h.bam -nth 5 -out_dir ../test_file -species dm3



Han-Qin Zheng

Feb 24, 2017, 2:25:01 AM
to idr-discuss, ans...@kundaje.net
An error occurred when I tried to submit a reply with the log file (chipseq.bds.(date).report.html) attached.
Could you give me an email address so that I can send you the log file?

lee...@gmail.com

Feb 24, 2017, 5:30:38 PM
to idr-discuss, ans...@kundaje.net
Can you check if the following file exists?

/data/CHIP/software/AQUAS_TF_ChIP-seq/TF_chipseq_pipeline/../test_file/peak/spp/overlap/zfh1EF_2-4h.nodup_zfh1RL_2-4h.nodup.tagAlign_x_input_2-4h.nodup.tagAlign.naive_overlap.filt.regionPeak.gz

If not, what is in
/data/CHIP/software/AQUAS_TF_ChIP-seq/TF_chipseq_pipeline/../test_file/peak/spp/overlap/?

My email address is leepc12 at gmail

Jin

Han-Qin Zheng

Feb 24, 2017, 9:20:11 PM
to idr-discuss, ans...@kundaje.net
The file does not exist:
/data/CHIP/software/AQUAS_TF_ChIP-seq/TF_chipseq_pipeline/../test_file/peak/spp/overlap/zfh1EF_2-4h.nodup_zfh1RL_2-4h.nodup.tagAlign_x_input_2-4h.nodup.tagAlign.naive_overlap.filt.regionPeak.gz

In "/data/CHIP/software/AQUAS_TF_ChIP-seq/TF_chipseq_pipeline/../test_file/peak/spp/overlap/", I see two files:

zfh1EF_2-4h.nodup_zfh1RL_2-4h.nodup.tagAlign_x_input_2-4h.nodup.tagAlign.naive_overlap.filt.regionPeak.tmp
zfh1EF_2-4h.nodup_zfh1RL_2-4h.nodup.tagAlign_x_input_2-4h.nodup.tagAlign.naive_overlap.regionPeak.gz







lee...@gmail.com

Feb 25, 2017, 10:34:43 AM
to idr-discuss, ans...@kundaje.net
I think I fixed it. It turned out to be a bedtools intersect bug (still present even in the latest version) that occurs when the input is a gzipped BED file.

Please get the latest pipeline, remove all files in ./peak/spp/overlap/ and then re-run the pipeline.

Thanks,

Jin

Han-Qin Zheng

Feb 26, 2017, 2:26:26 AM
to idr-discuss, ans...@kundaje.net
The problem still occurs.
I checked the code and saw that some scripts (e.g. utils/narrowpeak.py) use the command-line tools "bgzip" and "tabix".
These tools were not installed on my Linux system (OS: CentOS 6.8),
so I installed them and re-ran the command.

Here is what I found:
When I run the AQUAS pipeline from the command line, there are two files in peak/spp/overlap/:
zfh1EF_2-4h.nodup_zfh1RL_2-4h.nodup.tagAlign_x_input_2-4h.nodup.tagAlign.naive_overlap.filt.regionPeak.tmp
zfh1EF_2-4h.nodup_zfh1RL_2-4h.nodup.tagAlign_x_input_2-4h.nodup.tagAlign.naive_overlap.regionPeak.gz

But when I run the sub-command under "# SYS command. line 417" directly (below):
/data/CHIP/software/AQUAS_TF_ChIP-seq/chipseq_pipeline/utils/narrowpeak.py /data/CHIP/software/AQUAS_TF_ChIP-seq/chipseq_pipeline/../test_file4/peak/spp/overlap/zfh1EF_2-4h.nodup_zfh1RL_2-4h.nodup.tagAlign_x_input_2-4h.nodup.tagAlign.naive_overlap.filt.regionPeak.tmp /data/CHIP/software/AQUAS_TF_ChIP-seq/chipseq_pipeline/../test_file4/peak/spp/overlap/zfh1EF_2-4h.nodup_zfh1RL_2-4h.nodup.tagAlign_x_input_2-4h.nodup.tagAlign.naive_overlap.filt.regionPeak.hammock

Two additional files appeared:
zfh1EF_2-4h.nodup_zfh1RL_2-4h.nodup.tagAlign_x_input_2-4h.nodup.tagAlign.naive_overlap.filt.regionPeak.hammock.gz
zfh1EF_2-4h.nodup_zfh1RL_2-4h.nodup.tagAlign_x_input_2-4h.nodup.tagAlign.naive_overlap.filt.regionPeak.hammock.gz.tbi

Therefore, I suspect that the AQUAS pipeline is not picking up "bgzip" and "tabix" when it runs, so some necessary files cannot be created (e.g. zfh1EF_2-4h.nodup_zfh1RL_2-4h.nodup.tagAlign_x_input_2-4h.nodup.tagAlign.naive_overlap.filt.regionPeak.gz).

If the pipeline could detect and use "bgzip" and "tabix", the error might be resolved.

Thank you

== Done do_idr()
gzip: /data/CHIP/software/AQUAS_TF_ChIP-seq/chipseq_pipeline/../test_file4/peak/spp/overlap/zfh1EF_2-4h.nodup_zfh1RL_2-4h.nodup.tagAlign_x_input_2-4h.nodup.tagAlign.naive_overlap.filt.regionPeak.gz: No such file or directory
Task failed:
        Program & line     : 'modules/report.bds', line 412
        Task Name          : 'peak2hammock'
        Task ID            : 'chipseq.bds.20170226_151636_914/task.report.peak2hammock.line_412.id_18'
        Task PID           : '9662'
        Task hint          : 'zcat /data/CHIP/software/AQUAS_TF_ChIP-seq/chipseq_pipeline/../test_file4/peak/spp/overlap/zfh1EF_2-4h.nodup_zfh1RL_2-4h.nodup.tagAlign_x_input_2-4h.nodup.tagAlign.naive_overlap.filt.regionPeak.gz | sed  /^ (chr )/!d  | sort -k1,1V -k2,2n > /data/CHIP/software/AQUAS_TF_ChIP-seq/chipseq_pipeline/../t'
        Task resources     : 'cpus: -1  mem: -1.0 B     wall-timeout: 8640000'
        State              : 'ERROR'
        Dependency state   : 'ERROR'
        Retries available  : '1'
        Input files        : '[/data/CHIP/software/AQUAS_TF_ChIP-seq/chipseq_pipeline/../test_file4/peak/spp/overlap/zfh1EF_2-4h.nodup_zfh1RL_2-4h.nodup.tagAlign_x_input_2-4h.nodup.tagAlign.naive_overlap.filt.regionPeak.gz]'
        Output files       : '[/data/CHIP/software/AQUAS_TF_ChIP-seq/chipseq_pipeline/../test_file4/peak/spp/overlap/zfh1EF_2-4h.nodup_zfh1RL_2-4h.nodup.tagAlign_x_input_2-4h.nodup.tagAlign.naive_overlap.filt.regionPeak.hammock.gz]'
        Script file        : '/data/CHIP/software/AQUAS_TF_ChIP-seq/chipseq_pipeline/chipseq.bds.20170226_151636_914/task.report.peak2hammock.line_412.id_18.sh'
        Exit status        : '1'
        Program            :

                # SYS command. line 414

                 if [[ -f $(which conda) && $(conda env list | grep aquas_chipseq | wc -l) != "0" ]]; then source activate aquas_chipseq; sleep 5; fi;  export PATH=/data/CHIP/software/AQUAS_TF_ChIP-seq/chipseq_pipeline/.:/data/CHIP/software/AQUAS_TF_ChIP-seq/chipseq_pipeline/modules:/data/CHIP/software/AQUAS_TF_ChIP-seq/chipseq_pipeline/utils:${PATH}:/bin:/usr/bin:/usr/local/bin:${HOME}/.bds; set -o pipefail; STARTTIME=$(date +%s)

                # SYS command. line 415

                 zcat /data/CHIP/software/AQUAS_TF_ChIP-seq/chipseq_pipeline/../test_file4/peak/spp/overlap/zfh1EF_2-4h.nodup_zfh1RL_2-4h.nodup.tagAlign_x_input_2-4h.nodup.tagAlign.naive_overlap.filt.regionPeak.gz | sed '/^\(chr\)/!d' | sort -k1,1V -k2,2n > /data/CHIP/software/AQUAS_TF_ChIP-seq/chipseq_pipeline/../test_file4/peak/spp/overlap/zfh1EF_2-4h.nodup_zfh1RL_2-4h.nodup.tagAlign_x_input_2-4h.nodup.tagAlign.naive_overlap.filt.regionPeak.tmp

                # SYS command. line 417

                 /data/CHIP/software/AQUAS_TF_ChIP-seq/chipseq_pipeline/utils/narrowpeak.py /data/CHIP/software/AQUAS_TF_ChIP-seq/chipseq_pipeline/../test_file4/peak/spp/overlap/zfh1EF_2-4h.nodup_zfh1RL_2-4h.nodup.tagAlign_x_input_2-4h.nodup.tagAlign.naive_overlap.filt.regionPeak.tmp /data/CHIP/software/AQUAS_TF_ChIP-seq/chipseq_pipeline/../test_file4/peak/spp/overlap/zfh1EF_2-4h.nodup_zfh1RL_2-4h.nodup.tagAlign_x_input_2-4h.nodup.tagAlign.naive_overlap.filt.regionPeak.hammock

                # SYS command. line 418

                 rm -f /data/CHIP/software/AQUAS_TF_ChIP-seq/chipseq_pipeline/../test_file4/peak/spp/overlap/zfh1EF_2-4h.nodup_zfh1RL_2-4h.nodup.tagAlign_x_input_2-4h.nodup.tagAlign.naive_overlap.filt.regionPeak.tmp
        StdErr (100000000 lines)  :
                gzip: /data/CHIP/software/AQUAS_TF_ChIP-seq/chipseq_pipeline/../test_file4/peak/spp/overlap/zfh1EF_2-4h.nodup_zfh1RL_2-4h.nodup.tagAlign_x_input_2-4h.nodup.tagAlign.naive_overlap.filt.regionPeak.gz: No such file or directory

Fatal error: modules/report.bds, line 421, pos 2. Task/s failed.

Creating checkpoint file: Config or command line option disabled checkpoint file creation, nothing done.



Han-Qin Zheng

Feb 26, 2017, 2:56:48 AM
to idr-discuss, ans...@kundaje.net
I searched the code for "filt.regionPeak.gz" and found that it appears only in modules/callpeak_spp.bds:

line 41:   filt_rpeakfile  := "$prefix_x.filt.regionPeak.gz"
line 81~84:
                sys if [[ $blacklist_exists == "true" ]]; then \
                        bedtools intersect -v -a <(zcat -f $rpeakfile) -b <(zcat -f $blacklist) \
                        | awk 'BEGIN{OFS="\t"} {if ($5>1000) $5=1000; print $0}' | grep -P 'chr[\dXY]+[ \t]' \
                        | gzip -nc > $filt_rpeakfile; \

It seems that the file with the ".filt.regionPeak.gz" suffix is created only when a blacklist file exists.
There is no blacklist file in my command.
Therefore, lines 81~84 were apparently skipped, and $prefix_x.filt.regionPeak.gz was never created
(e.g. peak/spp/overlap/zfh1EF_2-4h.nodup_zfh1RL_2-4h.nodup.tagAlign_x_input_2-4h.nodup.tagAlign.naive_overlap.filt.regionPeak.gz).

As a result, "modules/report.bds" cannot find "*filt.regionPeak.gz" and reports an error.
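
The mismatch described here can be illustrated with a short shell sketch. It is not the pipeline's actual code: `pick_peak_file` and its arguments are hypothetical names, and it only shows the idea that later stages should ask for the `.filt.` file name only when a blacklist was actually applied.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the AQUAS pipeline.
# pick_peak_file PREFIX [BLACKLIST]: print the peak file name that
# downstream steps (e.g. reporting) should read.
pick_peak_file() {
  local prefix=$1 blacklist=${2:-}
  if [[ -n "$blacklist" && -f "$blacklist" ]]; then
    # A blacklist was supplied, so the filtered file is expected.
    echo "${prefix}.filt.regionPeak.gz"
  else
    # No blacklist: only the unfiltered file is expected.
    echo "${prefix}.regionPeak.gz"
  fi
}

pick_peak_file "naive_overlap"   # prints naive_overlap.regionPeak.gz
```

With a rule like this, a run without a blacklist would hand the existing `.regionPeak.gz` file to the reporting step instead of a `.filt.` name that was never written.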




Jin

Feb 26, 2017, 8:14:08 AM
to idr-d...@googlegroups.com, Anshul Kundaje
The pipeline uses bgzip and tabix (in ./utils/narrowpeak.py) to generate .hammock files to be shown on a genome browser.

I forgot to tell you to reinstall the dependencies (run ./uninstall_dependencies.sh and then ./install_dependencies). But if you have already installed bgzip and tabix on your system, that's fine; there is no need to reinstall the dependencies.
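
A quick way to confirm that both tools are visible before re-running is sketched below. `check_tools` is a hypothetical helper, not part of the pipeline; it relies only on the shell builtin `command -v`.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the AQUAS pipeline.
# check_tools NAME...: report tools missing from PATH on stderr and
# return non-zero if any are missing.
check_tools() {
  local missing=0 tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

check_tools bgzip tabix || echo "Install bgzip and tabix before re-running."
```

It is worth running this in the same environment the pipeline uses (the task log earlier in the thread shows it activating a conda environment named aquas_chipseq), since PATH may differ there.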

About the peak2hammock error: there was a bug in the code (it returned the wrong filename (*.filt.*) when there is no blacklist). It's fixed now. Please remove all files in peak/spp/overlap and re-run.

Thanks,

Jin




Han-Qin Zheng

Feb 26, 2017, 7:20:10 PM
to idr-discuss, ans...@kundaje.net, lee...@stanford.edu
OK, it works well now, and there are no error messages.
Thanks a lot for your help!

