samtools sort fail to open file

757 views
Skip to first unread message

Ira Herniter

unread,
Dec 13, 2021, 8:17:28 AM12/13/21
to Stacks
Hi,

I've been trying to use the shell script in Appendix 4 of the new bioRxiv manuscript, edited for my specific folder configuration:

$ cat popmap.tsv | cut -f 1 | while read sample; do fq1=./samples/${sample}.1.fq.gz; fq2=./samples/${sample}.2.fq.gz; bam=./aligned_samples2/${sample}.bam | bwa mem ./vcor_db/vcor_db "$fq1" "$fq2" | samtools view -b -h | samtools sort -o "$bam"; done

bwa mem seems to work fine, and it processes the samples. However, there seems to be an issue with outputting the *.bam files at the end. I get the following readout:

[bam_sort_core] merging from 7 files and 1 in-memory blocks...
[E::hts_open_format] Failed to open file
samtools sort: failed to create "": No such file or directory


My guess is that it isn't creating the *.bam files. Do I need to create the files prior to running the shell loop? If so, how should I go about doing that?

I have found that running the command one individual at a time for the alignment does work, and there I don't have to individually create the output file:

$ bwa mem ./vcor_db/vcor_db ./samples/16-4-434.1.fq.gz ./samples/16-4-434.2.fq.gz | samtools view -b -h | samtools sort -o ./aligned_samples2/16-4-434

Have I just done something silly like forgetting a character?

Thanks,
Ira

Ira Herniter

unread,
Dec 13, 2021, 11:53:35 AM12/13/21
to Stacks
I realized i left the file extension off the last bit of code. It should say:

bwa mem ./vcor_db/vcor_db ./samples/16-4-434.1.fq.gz ./samples/16-4-434.2.fq.gz | samtools view -b -h | samtools sort -o ./aligned_samples2/16-4-434.bam

-Ira

Catchen, Julian

unread,
Dec 13, 2021, 12:52:43 PM12/13/21
to stacks...@googlegroups.com

Hi,

 

The variable in your loop, $BAM, is not defined (you defined ‘$bam’). When you run the command one at a time you are not using this variable, but handcoding the outputfile. Just use the appropriate file name, which is stored in $sample, or use the proper case for the variable name:

 

$ cat popmap.tsv | cut -f 1 | while read sample; do bwa mem ./vcor_db/vcor_db ./samples/${sample}.1.fq.gz ./samples/${sample}.2.fq.gz | samtools view -b -h | samtools sort -o "${sample}.bam"; done

 

Or

 

$ cat popmap.tsv | cut -f 1 | while read sample; do fq1=./samples/${sample}.1.fq.gz; fq2=./samples/${sample}.2.fq.gz; bam=./aligned_samples2/${sample}.bam | bwa mem ./vcor_db/vcor_db "$fq1" "$fq2" | samtools view -b -h | samtools sort -o "$bam"; done

 

Cheers,

julian

Ira Herniter

unread,
Dec 13, 2021, 2:56:29 PM12/13/21
to Stacks
Hi Julian,

Thanks for the quick response though I'm not sure I follow. My initial code is exactly the same are your second code. I don't have a variable labelled "$BAM" unless I'm missing something.

Thanks,
Ira

Catchen, Julian

unread,
Dec 13, 2021, 3:17:49 PM12/13/21
to stacks...@googlegroups.com

Right, sorry about that. The first version of the command I gave will work. Also, looking at your code again, it wasn’t the case of the variable name (I somehow imagined that), but instead you have an extra pipe (fixed in red). It should read:

 

$ cat popmap.tsv | cut -f 1 | while read sample; do fq1=./samples/${sample}.1.fq.gz; fq2=./samples/${sample}.2.fq.gz; bam=./aligned_samples2/${sample}.bam; bwa mem ./vcor_db/vcor_db "$fq1" "$fq2" | samtools view -b -h | samtools sort -o "$bam"; done

 

 

From: stacks...@googlegroups.com <stacks...@googlegroups.com> on behalf of Ira Herniter <iher...@gmail.com>
Date: Monday, December 13, 2021 at 1:56 PM
To: Stacks <stacks...@googlegroups.com>
Subject: Re: [stacks] Re: samtools sort fail to open file

Hi Julian,

 

Thanks for the quick response though I'm not sure I follow. My initial code is exactly the same are your second code. I don't have a variable labelled "$BAM" unless I'm missing something.

 

Thanks,

Ira

On Monday, December 13, 2021 at 12:52:43 PM UTC-5 jcatchen wrote:

Hi,

 

The variable in your loop, $BAM, is not defined (you defined ‘$bam’). When you run the command one at a time you are not using this variable, but handcoding the outputfile. Just use the appropriate file name, which is stored in $sample, or use the proper case for the variable name:

 

$ cat popmap.tsv | cut -f 1 | while read sample; do bwa mem ./vcor_db/vcor_db ./samples/${sample}.1.fq.gz ./samples/${sample}.2.fq.gz | samtools view -b -h | samtools sort -o "./aligned_samples2/${sample}.bam"; done

 

Or

Ira Herniter

unread,
Dec 13, 2021, 3:23:15 PM12/13/21
to Stacks
Thanks. Very much appreciate the help. I'm running that first version you gave and it seems to be working. It's made it through two samples without incident.
-Ira

Angel Rivera-Colón

unread,
Dec 13, 2021, 3:24:25 PM12/13/21
to Stacks
Hi Ira,

I think I found the error. Your command has a pipe (|) between the bam=./aligned_samples2/${sample}.bam and the bwa mem ./vcor_db/vcor_db. This should be a semicolon (;) instead. Because of this it is trying to pipe the "output" of the bam= line into bwa mem, but never actually defining the bam variable.

Julian's first command should also work because the bam variable is never defined and instead the "${sample}.bam" is used instead.

The fixed command should be (highlighting the new semicolon):


$ cat popmap.tsv | cut -f 1 | while read sample; do fq1=./samples/${sample}.1.fq.gz; fq2=./samples/${sample}.2.fq.gz; bam=./aligned_samples2/${sample}.bam; bwa mem ./vcor_db/vcor_db "$fq1" "$fq2" | samtools view -b -h | samtools sort -o "$bam"; done

Hope this helps.

Thanks,
Angel

Ira Herniter

unread,
Dec 15, 2021, 11:13:10 PM12/15/21
to Stacks
Hi Angel,

I'll try that with the next set. Sometimes these programs really drive me up the wall with their very required inputs.

-Ira

Reply all
Reply to author
Forward
0 new messages