Hello,
Did anyone ever figure out what was causing the error? I have run into a similar problem. I am working with about 104G of data, which is about 25X coverage for my genome, on a 64-bit Ubuntu 12.04 system with 32G of RAM. When I run this data through without the -s option, everything seems to work, and I get the following output:
Program start..
Program: KmerFreq_AR
Version: v2.0
Author: BGI-ShenZhen
CompileDate: Dec 17 2012 time: 10:56:25
Current time: Tue Oct 15 12:16:10 2013
Command line: KmerFreq_AR -t 8 -k 15 -p bullata FileList.txt
Begin to construct Kmer frequency table...
Kmer frequency table initialization completed
Run time: 0s.
Start to parse read file: W.1B.fq ...
8 threads were created!
Finished parse reads file: W.1B.fq
Run time: 193s.
Start to parse read file: W.2B.fq ...
8 threads were created!
Finished parse reads file: W.2B.fq
Run time: 387s.
Start to parse read file: W.1S.fq ...
8 threads were created!
Finished parse reads file: W.1S.fq
Run time: 389s.
Start to parse read file: W.2S.fq ...
8 threads were created!
Finished parse reads file: W.2S.fq
Run time: 389s.
Parsed all the reads files completed
Run time: 389s.
Start to combine forward and reverse strands of Kmers...
8 threads were created!
Complete to combine forward and reverse strands of Kmers!
Construction of Kmer frequency table completed!
Run time: 396s.
Start to output compressed kmer frequency file...
8 threads were created!
Complete to generate the kmer frequency compressed file!
Run time: 416s.
Start to generate kmer frequency statistics file...
Complete to generate the kmer frequency statistics file!
Run time: 416s.
Please check the peek position carefully.
Start to generate genome characters estimate-file...
Complete to generate the genome characters estimate-file!
All done!
Run time: 416s.
When I run the exact same setup with the -s 2 option, I get:
Program start..
Program: KmerFreq_AR
Version: v2.0
Author: BGI-ShenZhen
CompileDate: Dec 17 2012 time: 10:56:25
Current time: Tue Oct 15 12:14:41 2013
Command line: KmerFreq_AR -t 8 -s 2 -k 15 -p bullata FileList.txt
Begin to construct Kmer frequency table...
Kmer frequency table initialization completed
Run time: 1s.
Start to parse read file: W.1B.fq ...
8 threads were created!
At that point, standard error reports: Segmentation fault (core dumped)
Is it possible that I am running into a memory problem? I have 32G of memory, and without the -s flag SOAPec uses around 1G. I do not know whether it is related, but while trying to find out whether too little memory was the cause, I ran into another issue. Because the example data provided with SOAPdenovo is so small, I decided to try a few things out with it. When running multithreaded with the -s option, the program started thrashing at the "output compressed kmer frequency file" stage:
peyton@Cranium:~/Downloads/test$ KmerFreq_AR -t 8 -k 11 -s 4 -p Test TestList.txt
Program start..
Program: KmerFreq_AR
Version: v2.0
Author: BGI-ShenZhen
CompileDate: Dec 17 2012 time: 10:56:25
Current time: Mon Oct 14 15:35:21 2013
Command line: KmerFreq_AR -t 8 -k 11 -s 4 -p Test TestList.txt
Begin to construct Kmer frequency table...
Kmer frequency table initialization completed
Run time: 0s.
Start to parse read file: test_PE1.fa ...
8 threads were created!
Finished parse reads file: test_PE1.fa
Run time: 0s.
Start to parse read file: test_PE2.fa ...
8 threads were created!
Finished parse reads file: test_PE2.fa
Run time: 0s.
Parsed all the reads files completed
Run time: 0s.
Start to combine forward and reverse strands of Kmers...
8 threads were created!
Complete to combine forward and reverse strands of Kmers!
Construction of Kmer frequency table completed!
Run time: 0s.
Start to output compressed kmer frequency file...
8 threads were created!
^C
If, however, I run essentially the same setup single-threaded, it runs like a champ:
Program start..
Program: KmerFreq_AR
Version: v2.0
Author: BGI-ShenZhen
CompileDate: Dec 17 2012 time: 10:56:25
Current time: Mon Oct 14 17:07:39 2013
Command line: KmerFreq_AR -k 11 -s 4 -p Test TestList.txt
Begin to construct Kmer frequency table...
Kmer frequency table initialization completed
Run time: 0s.
Start to parse read file: test_PE1.fa ...
1 threads were created!
Finished parse reads file: test_PE1.fa
Run time: 1s.
Start to parse read file: test_PE2.fa ...
1 threads were created!
Finished parse reads file: test_PE2.fa
Run time: 1s.
Parsed all the reads files completed
Run time: 1s.
Start to combine forward and reverse strands of Kmers...
1 threads were created!
Complete to combine forward and reverse strands of Kmers!
Construction of Kmer frequency table completed!
Run time: 1s.
Start to output compressed kmer frequency file...
1 threads were created!
Complete to generate the kmer frequency compressed file!
Run time: 2s.
Start to generate kmer frequency statistics file...
Complete to generate the kmer frequency statistics file!
Run time: 2s.
Please check the peek position carefully.
Start to generate genome characters estimate-file...
Complete to generate the genome characters estimate-file!
All done!
Run time: 2s.
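On the memory question above, a rough back-of-the-envelope sketch may help. It assumes, without confirmation from the SOAPec source, that KmerFreq_AR keeps a direct-addressed table with one fixed-size counter for each of the 4^k possible k-mers; the function name and the one-byte counter size are hypothetical. Under that assumption, k = 15 needs exactly 1 GiB, which would be consistent with the roughly 1G I see without -s. The sketch does not model how -s changes the table size.

```python
def kmer_table_bytes(k: int, bytes_per_entry: int = 1) -> int:
    """Bytes needed by a hypothetical direct-addressed k-mer table.

    Assumption (not taken from the SOAPec source): one counter of
    `bytes_per_entry` bytes for each of the 4**k possible k-mers.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    return (4 ** k) * bytes_per_entry


def to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB (2**30 bytes)."""
    return n_bytes / 2 ** 30


if __name__ == "__main__":
    # Compare the table size for the k values used in this post
    # (11 and 15) against slightly larger ones.
    for k in (11, 15, 17, 18):
        print(f"k={k}: {to_gib(kmer_table_bytes(k)):.4f} GiB")
```

Under this model the table grows fourfold per extra base, so k = 17 would already need 16 GiB and k = 18 would need 64 GiB at one byte per entry. If -s effectively lengthens the k-mer, the table could outgrow 32G quickly, so it may be worth watching memory use right before the crash.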
I also tried running the large data set single-threaded, but it still failed in the same way. Thank you in advance for the help.
Justin