On 7. Aug 2020, at 13:06, Rautenstrauch, Pia <Pia.Raut...@mdc-berlin.de> wrote:
Hi Alex,
Thanks a lot for the fast reply! Manually gzipping the files and rerunning the pipeline works - the pipeline continues.
Unfortunately, I ran into a further weird bug in the "linking the genome fasta" step.
```
[Fri Aug 7 12:00:09 2020]
Job 99:
Linking genome fasta:
input : /fast/AG_Ohler/prauten/data/Schuelke_Dummy_ChiP-seq/pigx/mm9/genome/mm9_primary_assembly.fa
output: /fast/AG_Ohler/prauten/data/Schuelke_Dummy_ChiP-seq/pigx/mm9/out/Bowtie2_Index/Main/mm9_primary_assembly.fa

Job counts:
    count   jobs
    1       link_genome
    1
[Fri Aug 7 12:00:12 2020]
Error in rule link_genome:
    jobid: 0
    output: /fast/AG_Ohler/prauten/data/Schuelke_Dummy_ChiP-seq/pigx/mm9/out/Bowtie2_Index/Main/mm9_primary_assembly.fa
    log: /fast/AG_Ohler/prauten/data/Schuelke_Dummy_ChiP-seq/pigx/mm9/out/Log/link_Main_genome_mm9_primary_assembly.log (check log file(s) for error message)

SystemExit in line 45 of /gnu/store/nxk52pcmvizq6vlb4xmg2x0c3s74xprb-pigx-chipseq-0.0.42/libexec/pigx_chipseq/Rules/Mapping.py:
Unknow input genome file format
File "/gnu/store/09a5iq080g9b641jyl363dr5jkkvnhcn-python-3.8.2/lib/python3.8/concurrent/futures/thread.py", line 57, in run
File "/gnu/store/nxk52pcmvizq6vlb4xmg2x0c3s74xprb-pigx-chipseq-0.0.42/libexec/pigx_chipseq/Rules/Mapping.py", line 67, in __rule_link_genome
File "/gnu/store/nxk52pcmvizq6vlb4xmg2x0c3s74xprb-pigx-chipseq-0.0.42/libexec/pigx_chipseq/Rules/Mapping.py", line 45, in link_genome
Exiting because a job execution failed. Look above for error message
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /fast/AG_Ohler/prauten/data/Schuelke_Dummy_ChiP-seq/pigx/mm9/out/.snakemake/log/2020-08-07T120005.091283.snakemake.log
```
I think I pinpointed the error after checking the code of pigx_chipseq/Rules/Mapping.py.
The problem is that the script checks the file type with magic, and my FASTA files are identified as 'ASCII text'; however, only 'text/plain' is accepted.
Before I looked into the source code, I thought the problem might be that I had initially tried a FASTA file in which the whole sequence for each header was on a single line; I subsequently split these lines. For the very long FASTA lines, magic identifies the file type as 'ASCII text, with very long lines'. However, neither version works ;).
I also checked the original genome file that I downloaded from UCSC (mouse mm9), and magic also says it is 'ASCII text'. It is quite an old assembly, so I don't know whether more recent ones are identified as plain text instead of ASCII.
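libmagic can report both a human-readable description (such as 'ASCII text') and a MIME type (such as 'text/plain') for the same file, and the description varies with the content, as the 'with very long lines' variant shows. A minimal sketch of a format check that does not depend on either string could look like the following. This is not the code in Mapping.py; the function name `sniff_genome_format` is hypothetical, and it assumes that looking at the first bytes is enough to tell a gzip file from an uncompressed FASTA file.

```python
GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def sniff_genome_format(path):
    """Return 'gzip', 'fasta', or 'unknown' based on the file's first bytes."""
    with open(path, "rb") as handle:
        head = handle.read(2)
    if head == GZIP_MAGIC:
        return "gzip"
    if head[:1] == b">":
        # An uncompressed FASTA file starts with a '>' header line.
        return "fasta"
    return "unknown"
```

With a check of this kind, 'ASCII text' and 'ASCII text, with very long lines' files would both be classified as 'fasta', since only the leading '>' is inspected.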
I tried to circumvent this problem by just gzipping my input genome. However, the function check_fasta_header (File "/gnu/store/nxk52pcmvizq6vlb4xmg2x0c3s74xprb-pigx-chipseq-0.0.42/libexec/pigx_chipseq/scripts/Check_Config.py", line 219, in check_fasta_header) then says that I have whitespace in the FASTA headers. I checked, and there is none in the uncompressed FASTA file.
```
SystemExit in line 26 of /gnu/store/nxk52pcmvizq6vlb4xmg2x0c3s74xprb-pigx-chipseq-0.0.42/libexec/pigx_chipseq/scripts/Check_Config.py:
ERROR: Settings file is not properly formated:
Genome fasta headers contain whitespaces.
Please reformat the headers
File "/gnu/store/nxk52pcmvizq6vlb4xmg2x0c3s74xprb-pigx-chipseq-0.0.42/libexec/pigx_chipseq/Snake_ChIPseq.py", line 44, in <module>
File "/gnu/store/nxk52pcmvizq6vlb4xmg2x0c3s74xprb-pigx-chipseq-0.0.42/libexec/pigx_chipseq/scripts/Check_Config.py", line 26, in validate_config
```
I copied just this function and checked both my gzipped genome and my ungzipped genome fasta file.
For the ungzipped file I get no error message; for the gzipped one I do. Hence, the pipeline also terminates.
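One possible explanation, and it is only an assumption since the body of check_fasta_header is not shown in this thread, is that the check reads the compressed bytes directly instead of the decompressed text. A minimal sketch of a header check that decompresses first is given below; `open_maybe_gzip` and `headers_with_whitespace` are hypothetical names, not functions from Check_Config.py.

```python
import gzip


def open_maybe_gzip(path):
    """Open a plain or gzip-compressed file for reading as text."""
    with open(path, "rb") as probe:
        is_gzip = probe.read(2) == b"\x1f\x8b"  # gzip magic bytes
    if is_gzip:
        return gzip.open(path, "rt")
    return open(path, "r")


def headers_with_whitespace(path):
    """Return the FASTA headers (without '>') that contain whitespace."""
    bad = []
    with open_maybe_gzip(path) as handle:
        for line in handle:
            if line.startswith(">"):
                header = line[1:].rstrip("\r\n")
                if any(ch.isspace() for ch in header):
                    bad.append(header)
    return bad
```

Run on the same genome in both forms, a check written this way should return the same list for the gzipped and the uncompressed file.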
I hope this information is sufficient to understand my problem. I am happy to provide more information.
For a quick fix, I am also happy to change my input file format/transform etc.
Thanks a lot for your help,
Pia
On 6. Aug 2020, at 17:21, Blume, Alexander <Alexand...@mdc-berlin.de> wrote:
Hi Pia,
I am one of the developers responsible for the pigx-chipseq pipeline.
Sorry that you are having problems with the pipeline, but thank you for sharing with us.
I will have a closer look at your problem in the next few days, but it seems that the gzip argument is not forwarded to the trimming tool. A fix will hopefully be available by next week. Until then, I suggest you try to trick the pipeline by gzipping the trimmed fastq files yourself and testing whether it continues.
If you find any other bugs or weird things please let us know, we are very happy to receive feedback.
Best,
Alex
--
You received this message because you are subscribed to the Google Groups "pigx" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pigx+uns...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/pigx/EAE5B5D5-D292-4465-A013-C08DBEF1483E%40mdc-berlin.de.
On 7. Aug 2020, at 15:59, Rautenstrauch, Pia <Pia.Raut...@mdc-berlin.de> wrote:
Hi Alex,
Thanks a lot for the fast and helpful reply! Changing the path to the actual file path instead of using a soft link solved the error.
I am sorry to bother you again, I hope there are not a lot more problems coming up.
In the "Mapping with bowtie2" step I get the following error (for all parallel jobs that got started):
```
    output: /fast/AG_Ohler/prauten/data/Schuelke_Dummy_ChiP-seq/pigx/mm9/out/Mapped/Bowtie/Main/Myf5_rep1/Myf5_rep1.bam
    log: /fast/AG_Ohler/prauten/data/Schuelke_Dummy_ChiP-seq/pigx/mm9/out/Log/Myf5_rep1/Myf5_rep2.bowtie2_Main.log (check log file(s) for error message)
RuleException:
TypeError in line 174 of /gnu/store/nxk52pcmvizq6vlb4xmg2x0c3s74xprb-pigx-chipseq-0.0.42/libexec/pigx_chipseq/scripts/SnakeFunctions.py:
'NoneType' object is not iterable
File "/gnu/store/nxk52pcmvizq6vlb4xmg2x0c3s74xprb-pigx-chipseq-0.0.42/libexec/pigx_chipseq/Rules/Mapping.py", line 220, in __rule_bowtie2
File "/gnu/store/nxk52pcmvizq6vlb4xmg2x0c3s74xprb-pigx-chipseq-0.0.42/libexec/pigx_chipseq/scripts/SnakeFunctions.py", line 174, in join_params
File "/gnu/store/09a5iq080g9b641jyl363dr5jkkvnhcn-python-3.8.2/lib/python3.8/concurrent/futures/thread.py", line 57, in run
```
And some perl warnings:
```
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
    LANGUAGE = (unset),
    LC_ALL = (unset),
    LANG = "en_US.UTF-8"
are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
```
Thanks again and kind regards,
Pia
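"'NoneType' object is not iterable" is the message Python gives when code loops over None. In YAML, a section whose entries are all commented out parses to null, which becomes None in Python, and a later message in this thread reports that leaving the bowtie2 N:0 option uncommented avoided the error. The sketch below shows how a parameter-joining helper could tolerate an empty section. It is an illustration under those assumptions, not the join_params function from SnakeFunctions.py; the name `join_params_safe` and the flag formatting are hypothetical.

```python
def join_params_safe(params):
    """Turn a settings mapping such as {'N': 0} into a command-line string.

    A missing or empty section (None or {}) is treated as "no extra options".
    """
    if not params:
        return ""
    parts = []
    for key, value in params.items():
        key = str(key)
        # Single-letter options get one dash, longer ones get two.
        flag = ("-" if len(key) == 1 else "--") + key
        parts.append(f"{flag} {value}")
    return " ".join(parts)
```

Called with `{"N": 0}` this yields `-N 0`, and called with `None` it yields an empty string instead of raising a TypeError.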
On 7. Aug 2020, at 16:23, Rautenstrauch, Pia <Pia.Raut...@mdc-berlin.de> wrote:
Hi Alex,
Thanks for taking the time! Please find the settings file attached. While I am writing to you anyhow: it was not clear to me from the tutorial and manual which programs bam_filter and extract_signal belong to (and hence which manual to check for the respective parameters/flags). I did not worry about it too much for now, as I just wanted to test the pipeline in general.
Kind regards,
Pia
<settings.yaml>
On 11. Aug 2020, at 13:48, Rautenstrauch, Pia <Pia.Raut...@mdc-berlin.de> wrote:
Hi Alex,
Thank you very much for the fast and helpful replies and the explanation of the settings section!
I was on holiday yesterday and only continued with the PiGX pipeline today. I left the N:0 option of bowtie2 uncommented, which fixed the previously reported problem. However, I ran into another error:
```
Error in rule bowtie2:
    jobid: 0
    output: /fast/AG_Ohler/prauten/data/Schuelke_Dummy_ChiP-seq/pigx/mm9/out/Mapped/Bowtie/Main/Myf5_rep2/Myf5_rep2.bam
    log: /fast/AG_Ohler/prauten/data/Schuelke_Dummy_ChiP-seq/pigx/mm9/out/Log/Myf5_rep2/Myf5_rep2.bowtie2_Main.log (check log file(s) for error message)

RuleException:
CalledProcessError in line 224 of /gnu/store/nxk52pcmvizq6vlb4xmg2x0c3s74xprb-pigx-chipseq-0.0.42/libexec/pigx_chipseq/Rules/Mapping.py:
Command 'set -euo pipefail; /gnu/store/7sicb2zgpqnav53gaiizmarjdn03ydp2-bowtie-2.3.4.3/bin/bowtie2 -p 2 -x /fast/AG_Ohler/prauten/data/Schuelke_Dummy_ChiP-seq/pigx/mm9/out/Bowtie2_Index/Main/mm9 -U /fast/AG_Ohler/prauten/data/Schuelke_Dummy_ChiP-seq/pigx/mm9/out/Trimmed/Trim_Galore/Myf5_rep2/Myf5_rep2_R.fastq.gz -N 0 2> /fast/AG_Ohler/prauten/data/Schuelke_Dummy_ChiP-seq/pigx/mm9/out/Log/Myf5_rep2/Myf5_rep2.bowtie2_Main.log | /gnu/store/jx7i0x6jzfy9vlwsv6hycdyvih2q16lm-samtools-1.9/bin/samtools view -bhS > /fast/AG_Ohler/prauten/data/Schuelke_Dummy_ChiP-seq/pigx/mm9/out/Mapped/Bowtie/Main/Myf5_rep2/Myf5_rep2.bam' returned non-zero exit status 1.
File "/gnu/store/nxk52pcmvizq6vlb4xmg2x0c3s74xprb-pigx-chipseq-0.0.42/libexec/pigx_chipseq/Rules/Mapping.py", line 224, in __rule_bowtie2
File "/gnu/store/09a5iq080g9b641jyl363dr5jkkvnhcn-python-3.8.2/lib/python3.8/concurrent/futures/thread.py", line 57, in run
Exiting because a job execution failed. Look above for error message
```
The log file contains the following information:
```
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
    LANGUAGE = (unset),
    LC_ALL = (unset),
    LANG = "en_GB.UTF-8"
are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
Error: reads file does not look like a FASTQ file
terminate called after throwing an instance of 'int'
(ERR): bowtie2-align died with signal 6 (ABRT) (core dumped)
```
I checked the input FASTQ file (/fast/AG_Ohler/prauten/data/Schuelke_Dummy_ChiP-seq/pigx/mm9/out/Trimmed/Trim_Galore/Myf5_rep2/Myf5_rep2_R.fastq.gz). To me, it looks fine.
I think the following thread could relate to the problem: https://sourceforge.net/p/bowtie-bio/bugs/163/. However, according to the website, the problem should be resolved.
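A visual check of a FASTQ file can miss structural problems, so a small script that tests the first records may help narrow this down. The sketch below is an illustration only and does not reproduce how bowtie2 validates its input: the function name `check_fastq_gz` is hypothetical, and it assumes four-line records (no wrapped sequence lines) in a gzip-compressed file.

```python
import gzip
from itertools import islice


def check_fastq_gz(path, max_records=1000):
    """Return a list of problems found in the first records of a FASTQ.gz file."""
    problems = []
    n_records = 0
    try:
        with gzip.open(path, "rt") as handle:
            while n_records < max_records:
                # Read the next four lines: name, sequence, separator, quality.
                record = [line.rstrip("\r\n") for line in islice(handle, 4)]
                if not record:
                    break
                n_records += 1
                if len(record) < 4:
                    problems.append(f"record {n_records}: truncated ({len(record)} of 4 lines)")
                    break
                name, seq, plus, qual = record
                if not name.startswith("@"):
                    problems.append(f"record {n_records}: name line does not start with '@'")
                if not plus.startswith("+"):
                    problems.append(f"record {n_records}: separator line does not start with '+'")
                if len(seq) != len(qual):
                    problems.append(f"record {n_records}: sequence and quality lengths differ")
    except (OSError, EOFError) as err:
        # Raised when the file is not valid gzip or the stream is cut short.
        problems.append(f"could not read as gzip: {err}")
        return problems
    if n_records == 0:
        problems.append("file contains no records")
    return problems
```

An empty list means that none of these particular checks failed for the records inspected; it does not guarantee that bowtie2 will accept the file.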
I would appreciate your help and am sorry for running into so many problems.
zcat GCF_000001405.39_GRCh38.p13_genomic.fna.gz | head
>NC_000001.11 Homo sapiens chromosome 1, GRCh38.p13 Primary Assembly
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
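The first line of the output above carries a description after the accession, separated by spaces, and the header check quoted earlier in this thread stops with "Genome fasta headers contain whitespaces". A minimal sketch of one way to reformat such headers is to keep only the first word of each header line. The function name `shorten_fasta_headers` and the output path are hypothetical, and the sketch assumes a gzip-compressed input and an uncompressed output.

```python
import gzip


def shorten_fasta_headers(src_gz, dst):
    """Copy a gzipped FASTA to plain-text `dst`, keeping only the first word of each header."""
    with gzip.open(src_gz, "rt") as fin, open(dst, "w") as fout:
        for line in fin:
            if line.startswith(">"):
                fields = line[1:].split()
                # Keep the accession (first word) and drop the description.
                line = ">" + (fields[0] if fields else "") + "\n"
            fout.write(line)
```

After such a rewrite the sequence names would be the bare accessions (for example NC_000001.11), so any annotation used alongside the genome would need to use the same names.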
<20210920_PiGx_ChIP_Seq_Report.html>