error dDocent exits

Chris Brauer

Oct 13, 2014, 11:12:12 PM
to ddo...@googlegroups.com
Hi,

I am running dDocent on a Linux machine running Ubuntu 14.04.1 with 240 paired-end ddRAD samples. I am using .fq files from process_radtags, and everything seems to run OK at first, but then I get the following error messages and the program exits.

3       970226
4       532473
66831583 pairs of reads input. 60619264 pairs of reads output, discarded 6212319 pairs of reads, 9.30% clone reads.
[M::main_mem] read 5194806 sequences (400000062 bp)...
/usr/local/bin/dDocent.FB: line 529:  7660 Segmentation fault      (core dumped) bwa mem reference.fasta forward1 reverse1 -t $NUMProc -a -M -T 10 -A $optA -B $optB -O $optO -R "@RG\tID:cat\tSM:cat\tPL:Illumina"
      7661                       (core dumped) | samtools view -q1 -@ $NUMProc -SbT reference.fasta - > cat.bam 2> cat.bam.log
[bam_header_read] EOF marker is absent. The input is probably truncated.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
/usr/local/bin/dDocent.FB: line 529:  7675 Segmentation fault      (core dumped) samtools sort -@$NUMProc cat.bam cat
[bam_header_read] EOF marker is absent. The input is probably truncated.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
[bam_index_core] Invalid BAM header.[bam_index_build2] fail to index the BAM file.
Failed to open BAM file cat-RRG.bam
Error: The requested bed file (map.bed) could not be opened. Exiting!
split: 0: invalid number of lines
Try 'split --help' for more information.
ls: cannot access splitmap*: No such file or directory
mv: cannot stat ‘raw.1.vcf’: No such file or directory
mv: cannot stat ‘raw.2.vcf’: No such file or directory
mv: cannot stat ‘raw.3.vcf’: No such file or directory
mv: cannot stat ‘raw.4.vcf’: No such file or directory
mv: cannot stat ‘raw.5.vcf’: No such file or directory
mv: cannot stat ‘raw.6.vcf’: No such file or directory
mv: cannot stat ‘raw.7.vcf’: No such file or directory
mv: cannot stat ‘raw.8.vcf’: No such file or directory
mv: cannot stat ‘raw.9.vcf’: No such file or directory
terminate called after throwing an instance of 'std::out_of_range'
  what():  basic_string::substr
/usr/local/bin/dDocent.FB: line 181:  7739 Aborted                 (core dumped) vcfcombine raw.*.vcf > TotalRawSNPs1.vcf
terminate called after throwing an instance of 'std::out_of_range'
  what():  basic_string::substr
terminate called after throwing an instance of 'std::out_of_range'
  what():  basic_string::substr
/usr/local/bin/dDocent.FB: line 181:  7741 Aborted                 (core dumped) vcfglxgt TotalRawSNPs1.vcf
      7742                       (core dumped) | vcffixup - > TotalRawSNPs.vcf
Using VCFtools to parse SNPS.vcf for SNPS that are called in at least 90% of individuals
/usr/local/bin/dDocent.FB: line 339: mailx: command not found
mawk: cannot open ./logfiles/Final.log (No such file or directory)
Phoinix@bio111817:~/Chris/stacks/M4N6n7/read_files$

It has created the -RG.bam files for each sample, but they are all empty, as are the bam.log files. The bwa log files all contain the following: [E::main_mem] fail to open file `Pop7_Na133.R1.fq'.
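
One way to rule out missing or unreadable input files before a run is a check like the one below. This is only a sketch: `check_pairs` is a hypothetical helper, not part of dDocent, and the `.R2.fq` mate name is an assumption based on the `.R1.fq` name in the bwa log.

```bash
# check_pairs DIR
# Report every <sample>.R1.fq in DIR that is unreadable or has no readable
# <sample>.R2.fq mate. The .R2.fq naming is an assumption (see lead-in).
# Returns 0 if nothing was reported, 1 otherwise.
check_pairs() {
    dir=${1:-.}
    bad=0
    for r1 in "$dir"/*.R1.fq; do
        [ -e "$r1" ] || continue            # glob matched nothing
        r2="${r1%.R1.fq}.R2.fq"             # derive the mate's file name
        if [ ! -r "$r1" ]; then
            echo "unreadable: $r1"
            bad=1
        fi
        if [ ! -r "$r2" ]; then
            echo "missing mate: $r2"
            bad=1
        fi
    done
    return $bad
}
```

Running `check_pairs .` from the directory that holds the read files lists any problems. Note that a directory with no `.R1.fq` files at all passes silently.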

I have a little Perl scripting experience, but I'm not really sure where to start here.

Any help appreciated

Chris Brauer

Quynh Quach

Feb 24, 2016, 10:47:27 AM
to dDocent
Hi Chris and Jon,
Did you ever figure this out? I am running into a similar error. This is what is printed to my terminal:

ls: cannot access *-RG.bam: No such file or directory
Using FreeBayes to call SNPs
ls: cannot access mapped.*.bed: No such file or directory
rm: cannot remove ‘mapped.*.bed’: No such file or directory

mv: cannot stat ‘raw.1.vcf’: No such file or directory
mv: cannot stat ‘raw.2.vcf’: No such file or directory
mv: cannot stat ‘raw.3.vcf’: No such file or directory
mv: cannot stat ‘raw.4.vcf’: No such file or directory
mv: cannot stat ‘raw.5.vcf’: No such file or directory
mv: cannot stat ‘raw.6.vcf’: No such file or directory
mv: cannot stat ‘raw.7.vcf’: No such file or directory
mv: cannot stat ‘raw.8.vcf’: No such file or directory
mv: cannot stat ‘raw.9.vcf’: No such file or directory
terminate called after throwing an instance of 'std::out_of_range'
  what():  basic_string::substr
mv: cannot stat ‘raw.*.vcf’: No such file or directory
Using VCFtools to parse SNPS.vcf for SNPs that are called in at least 90% of individuals
mawk: cannot open *.bam.log (No such file or directory)
/usr/local/bin/dDocent: line 363: mailx: command not found

It looks like I may have mis-installed something, with the problem starting at the -RG.bam files, but I'm not sure how to fix that.
I only have the cat-RRG.bam.bai file created.
One additional question: when I was installing Rainbow 2.0.4, there was no rbmergetag to chmod and cp into my /usr/local/bin. Could this be causing the problem?
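
A quick way to see which programs are actually on the PATH is a check like the one below. It is a sketch only: `missing_tools` is a hypothetical helper, and the tool names you pass in are up to you (the thread mentions bwa, samtools, mawk, mailx and rbmergetag, among others).

```bash
# missing_tools NAME...
# Print the names that cannot be found on the PATH, separated by spaces.
# Returns 0 if every tool was found, 1 otherwise.
missing_tools() {
    missing=""
    for tool in "$@"; do
        # command -v succeeds only if the shell can resolve the name
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done
    echo "${missing# }"      # drop the leading space
    [ -z "$missing" ]
}
```

For example, `missing_tools bwa samtools mawk mailx rbmergetag` prints whichever of those names are not installed.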

Thank you,
Quynh

Chris Brauer

Feb 24, 2016, 10:12:44 PM
to dDocent
Hi Quynh,

When we experienced our issue, it was due to excessive RAM usage. In the end we did a test run using just one core to see how much RAM FreeBayes used per core, and then specified the maximum number of cores that could be used without exceeding the total RAM available on the system. From memory, RAM usage seems to scale up very quickly as you increase the number of samples rather than the number of sequences. I have a feeling Jon may have addressed this issue in the last update, so maybe also check that you are running the latest version. I'm not sure, however, that yours is the same issue, as the error message is different.
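
The core count comes down to integer division: total RAM divided by the peak RAM that one process used in the single-core test. A minimal sketch, assuming both figures are in MB; `max_cores` and its arguments are hypothetical names for illustration, not part of dDocent.

```bash
# max_cores TOTAL_MB PER_CORE_MB [AVAILABLE_CORES]
# Print the largest number of cores whose combined RAM use stays within
# TOTAL_MB, never less than 1 and never more than AVAILABLE_CORES if given.
max_cores() {
    total_mb=$1
    per_core_mb=$2
    avail=${3:-0}                        # 0 means "no upper limit given"
    n=$(( total_mb / per_core_mb ))      # integer division rounds down
    if [ "$n" -lt 1 ]; then
        n=1
    fi
    if [ "$avail" -gt 0 ] && [ "$n" -gt "$avail" ]; then
        n=$avail
    fi
    echo "$n"
}
```

For example, `max_cores 64000 9000` prints 7. On Linux, GNU time (`/usr/bin/time -v <command>`) reports "Maximum resident set size", which is one way to get the per-core figure.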

I'm sure Jon will be able to shed more light on this than I can.

Chris

Jon Puritz

Feb 25, 2016, 8:08:19 AM
to dDocent
Hi Quynh,

It seems that something failed during the read mapping process.  Could you look through some files in your ./logfiles directory?  Take a look through the bwa.log files and the bam.log files.  Please paste anything that looks like an error in your email.  This should help us pinpoint the problem.
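
A quick way to scan the logs, as a sketch: the helper below greps every `.log` file in a directory for a few strings that showed up in the errors earlier in this thread. `scan_logs` and its pattern list are assumptions for illustration, not an exhaustive error filter.

```bash
# scan_logs [DIR]
# Print file name, line number and text of every line in DIR/*.log that
# looks like an error. DIR defaults to ./logfiles.
# Returns grep's status: 0 if something matched, non-zero otherwise.
scan_logs() {
    dir=${1:-./logfiles}
    # -i ignore case, -H always show the file name, -n show line numbers,
    # -E extended regex; stderr is silenced in case no .log files exist
    grep -iHnE 'error|fail|segmentation fault|core dumped|truncated|\[E::' \
        "$dir"/*.log 2>/dev/null
}
```

Running `scan_logs` from the dDocent working directory prints any matching lines, ready to paste into a reply.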

Cheers,

Jon

Quynh Quach

Feb 25, 2016, 11:00:01 AM
to dDocent
Hi Jon and Chris,
Thanks for getting back to me so quickly. I actually don't have any bwa.log or bam.log files. I have a log for each of my samples, plus a final.log and a VCFtools.log.
It seems like the cat-RRG.bam file wasn't recognized, and therefore cov.stats, bamlist.list, map.bed, and mapped.bed are all empty.
I will try to run it on one core to see if Chris's solution will work for me as well.
Thanks again,
Quynh

Quynh Quach

Feb 25, 2016, 11:10:42 AM
to dDocent
Hi Jon,
To clarify a bit more about the project, in case the details shed some light on the issue: I have 48 individuals with SE RAD reads, and I'm not looking to map them to a reference genome (de novo is preferred).

Quynh Quach

Feb 25, 2016, 1:43:23 PM
to dDocent
Hi Jon,
So running it on one core and 2g of RAM gave the same errors. I took a look at the dDocent code to see where -RG.bam is created, and it looks like it's created when the mapping option is set to "yes", which I have set to "no". However, in the SNP calling part that follows, the program looks for -RG.bam files to create bamlist.list, but because I didn't want mapping, no -RG.bam file was ever created. Is this the same file as the cat-RRG.bam file? Here's the portion of code I'm referring to:

##SNP Calling Section of code

if [ "$SNP" != "no" ]; then
    #Create list of BAM files
    ls *-RG.bam >bamlist.list
    #If mapping is not being performed, but intervals do not exist they are created
    if [[ "$MAP" == "no" && ! -f "cat-RRG.bam" ]]; then
        CreateIntervals
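
A guard like the following, run before SNP calling, would make this failure explicit rather than letting `ls *-RG.bam` fail and the later steps run on empty inputs. It is a hypothetical helper sketched for illustration, not code from dDocent.

```bash
# require_bams [DIR]
# Succeed only if DIR contains at least one non-empty file matching *-RG.bam.
# Otherwise print a message to stderr and return 1.
require_bams() {
    dir=${1:-.}
    for f in "$dir"/*-RG.bam; do
        # -s is true only for files that exist and are larger than 0 bytes,
        # so an unmatched glob and empty BAM files are both rejected
        if [ -s "$f" ]; then
            return 0
        fi
    done
    echo "No non-empty *-RG.bam files in $dir; run the read-mapping step first." >&2
    return 1
}
```

A calling script could then use `require_bams . || exit 1` before building bamlist.list.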

Quynh

Jon Puritz

Feb 25, 2016, 1:46:29 PM
to ddo...@googlegroups.com
Why aren't you mapping reads?

Quynh Quach

Feb 25, 2016, 2:36:23 PM
to dDocent
Hi Jon,
I might have misunderstood the program, but I didn't choose to map reads because I thought that option was for aligning sequences to a reference genome, and I don't have a reference genome for the species I'm working with. I wanted to call variants against a genome assembled de novo from my reads. Is de novo also done in the read-mapping portion of the pipeline?
Quynh

Jon Puritz

Feb 25, 2016, 2:38:37 PM
to ddo...@googlegroups.com
Hi Quynh,

The read mapping needs to happen whether you are using a reference genome or a de novo assembled one.  This is the step where all the reads in your data set are aligned to the reference (or de novo reference) and provide support for SNP calls.  So, please run the mapping step!
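
To make the step concrete, here is a rough sketch of what mapping one sample looks like, modeled on the bwa mem and samtools commands in the error output earlier in this thread. It only prints the command so it can be inspected. `print_map_cmd`, the sample name and the read file name are hypothetical, and this is not dDocent's actual implementation, which sets more options.

```bash
# print_map_cmd REF SAMPLE READS [THREADS]
# Print (do not run) a pipeline that aligns READS to REF with bwa mem, tags
# the reads with a read group named after SAMPLE, and writes alignments with
# mapping quality >= 1 to SAMPLE-RG.bam.
# REF may be a published genome or a de novo assembled reference; either way
# it must already be indexed with "bwa index REF".
print_map_cmd() {
    ref=$1
    sample=$2
    reads=$3
    threads=${4:-1}
    # \\t in the format prints a literal backslash-t, which bwa expands to a tab
    printf 'bwa mem -t %s -R "@RG\\tID:%s\\tSM:%s\\tPL:Illumina" %s %s | samtools view -b -q 1 - > %s-RG.bam\n' \
        "$threads" "$sample" "$sample" "$ref" "$reads" "$sample"
}
```

For example, `print_map_cmd reference.fasta PopA_01 PopA_01.F.fq 2` prints the pipeline for a single-end sample. The resulting -RG.bam files are what the SNP calling section looks for.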

Jon

-- 
Jon Puritz, PhD
Postdoctoral Research Associate
Northeastern University
Marine Science Center
430 Nahant Rd, Nahant, MA 01908

Webpage: http://staff.tamucc.edu/jpuritz

Email: jpu...@gmail.com

Cell: 401-338-8739

"The most valuable of all talents is that of never using two words when one will do."
-Thomas Jefferson

Quynh Quach

Feb 26, 2016, 8:37:50 AM
to dDocent
Hi Jon,
Thank you for helping me. The run went smoothly this time and all the output files were created.
I did get this error at the end:
mawk: read error (Is a directory)

/usr/local/bin/dDocent: line 363: mailx: command not found
But I think that's just for the email notification when the job finishes, unless you think otherwise.
Thanks again!
Quynh