Hello EMIRGE users,
I'm new to the program and can't figure out how to troubleshoot the error below. It resembles the error discussed in a previous post, where no reads were aligning, but that doesn't appear to be the problem in my case: bowtie reports 682 paired-end alignments. I'm on a Mac running OS X El Capitan, if that matters. Any thoughts?
Thanks!
>emirge.py ./EMIRGE_32 -1 32_S32_L001_R1_001.fastq -2 32_S32_L001_R2_001.fastq -l 302 -f ~/refs/SSURef/SSURef_111_candidate_db.fasta -b ~/refs/SSURef/SSURef_111_candidate_db_formated -i 500 -s 500 --phred33
If you use EMIRGE in your work, please cite these manuscripts, as appropriate.
Miller CS, Baker BJ, Thomas BC, Singer SW, Banfield JF (2011)
EMIRGE: reconstruction of full-length ribosomal genes from microbial community short read sequencing data.
Genome biology 12: R44. doi:10.1186/gb-2011-12-5-r44.
Miller CS, Handley KM, Wrighton KC, Frischkorn KR, Thomas BC, Banfield JF (2013)
Short-Read Assembly of Full-Length 16S Amplicons Reveals Bacterial Diversity in Subsurface Sediments.
PloS one 8: e56018. doi:10.1371/journal.pone.0056018.
imported _emirge C functions from: /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/_emirge.so
Command:
/Users/Professional/bin/emirge ./EMIRGE_32 -1 32_S32_L001_R1_001.fastq -2 32_S32_L001_R2_001.fastq -l 302 -f /Users/Professional/refs/SSURef/SSURef_111_candidate_db.fasta -b /Users/Professional/refs/SSURef/SSURef_111_candidate_db_formated -i 500 -s 500 --phred33
EMIRGE started at Tue Jan 26 12:47:19 2016
Performing initial mapping with command:
cat /Users/Professional/Dropbox/Water_Heater_Microbes/Sequencing/whole_genomes/2014_May_MiSeq/fastq_files/32_S32_L001_R1_001.fastq | bowtie --phred33-quals -t -p 1 -n 3 -l 20 -e 300 --best --sam --chunkmbs 128 --minins 302 --maxins 2000 /Users/Professional/refs/SSURef/SSURef_111_candidate_db_formated -1 - -2 /Users/Professional/Dropbox/Water_Heater_Microbes/Sequencing/whole_genomes/2014_May_MiSeq/fastq_files/32_S32_L001_R2_001.fastq | samtools view -b -S -u -F 0x0004 - > /Users/Professional/Dropbox/Water_Heater_Microbes/Sequencing/whole_genomes/2014_May_MiSeq/fastq_files/EMIRGE_32/initial_mapping/initial_bowtie_mapping.PE.bam
Time loading reference: 00:00:00
Time loading forward index: 00:00:00
Time loading mirror index: 00:00:00
[samopen] SAM header is present: 150807 sequences.
Seeded quality full-index search: 00:11:44
# reads processed: 849199
# reads with at least one reported alignment: 682 (0.08%)
# reads that failed to align: 848517 (99.92%)
Reported 682 paired-end alignments to 1 output stream(s)
Time searching: 00:11:44
Overall time: 00:11:44
Beginning initialization at Tue Jan 26 12:59:04 2016...
Reading bam file /Users/Professional/Dropbox/Water_Heater_Microbes/Sequencing/whole_genomes/2014_May_MiSeq/fastq_files/EMIRGE_32/initial_mapping/initial_bowtie_mapping.PE.bam at Tue Jan 26 12:59:04 2016...
Traceback (most recent call last):
  File "/Users/Professional/bin/emirge", line 1697, in <module>
    main()
  File "/Users/Professional/bin/emirge", line 1681, in main
    em.initialize_EM(options.mapping, options.fasta_db)
  File "/Users/Professional/bin/emirge", line 337, in initialize_EM
    self.read_bam(bam_filename, reference_fasta_filename)
  File "/Users/Professional/bin/emirge", line 297, in read_bam
    self.probN = [None for x in range(max(self.sequence_name2sequence_i[-1].values())+1)]
ValueError: max() arg is an empty sequence
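For context on the last line of the traceback: Python's built-in `max()` raises `ValueError` when it is given an empty iterable, so the message suggests that the dictionary at `self.sequence_name2sequence_i[-1]` had no entries when `read_bam` reached line 297. It does not, by itself, say why that dictionary was empty. A minimal sketch of the failing expression, using a hypothetical stand-in dictionary and hypothetical function names rather than EMIRGE's actual data structures:

```python
# Hypothetical stand-in for the name -> index mapping built while reading the BAM.
# This is NOT EMIRGE's code; it only illustrates the expression in the traceback.

def raises_value_error(name_to_index):
    """Return True if the unguarded expression from the traceback raises ValueError."""
    try:
        max(name_to_index.values()) + 1
    except ValueError:
        return True
    return False


def prob_n_size(name_to_index):
    """Guarded variant: the list length that would be allocated, or 0 if the mapping is empty."""
    if not name_to_index:
        return 0
    return max(name_to_index.values()) + 1


print(raises_value_error({}))                     # empty mapping triggers the error
print(prob_n_size({"seq_a": 0, "seq_b": 4}))      # non-empty mapping works
```

The guard only makes the empty case explicit; whatever left the mapping empty would still need to be tracked down.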
imported _emirge C functions from: /home/z3382651/bin/mypythondir/mypythonenv/mypythonenv/lib/python2.7/site-packages/_emirge.so
Command:
/home/z3382651/bin/mypythondir/mypythonenv/mypythonenv/bin/emirge.py /srv/scratch/z3382651/metagenomes -1 /srv/scratch/z3382651/metagenomes/high_quality_genomes/028-LFA_S1_L001_R1_001_val_1.fq -2 /srv/scratch/z3382651/metagenomes/high_quality_genomes/028-LFA_S1_L001_R2_001_val_2.fq -f /srv/scratch/z3382651/emirge_db/SSURef_111_candidate_db.fasta -b /srv/scratch/z3382651/emirge_db/SSURef_111_candidate_db -l 880 -i 480 -s 300 -a 12 --phred33
EMIRGE started at Thu Feb 4 16:41:28 2016
Time loading reference: 00:00:01
Time loading forward index: 00:00:00
Time loading mirror index: 00:00:01
[samopen] SAM header is present: 150807 sequences.
Seeded quality full-index search: 00:06:31
# reads processed: 4903537
# reads with at least one reported alignment: 0 (0.00%)
# reads that failed to align: 4903537 (100.00%)
No alignments
Time searching: 00:06:33
Overall time: 00:06:33
Performing initial mapping with command:
cat /srv/scratch/z3382651/metagenomes/high_quality_genomes/028-LFA_S1_L001_R1_001_val_1.fq | bowtie --phred33-quals -t -p 12 -n 3 -l 20 -e 300 --best --sam --chunkmbs 128 --minins 880 --maxins 1380 /srv/scratch/z3382651/emirge_db/SSURef_111_candidate_db -1 - -2 /srv/scratch/z3382651/metagenomes/high_quality_genomes/028-LFA_S1_L001_R2_001_val_2.fq | samtools view -b -S -u -F 0x0004 - > /srv/scratch/z3382651/metagenomes/initial_mapping/initial_bowtie_mapping.PE.bam
Beginning initialization at Thu Feb 4 16:48:01 2016...
Reading bam file /srv/scratch/z3382651/metagenomes/initial_mapping/initial_bowtie_mapping.PE.bam at Thu Feb 4 16:48:01 2016...
Traceback (most recent call last):
  File "/home/z3382651/bin/mypythondir/mypythonenv/mypythonenv/bin/emirge.py", line 1697, in <module>
    main()
  File "/home/z3382651/bin/mypythondir/mypythonenv/mypythonenv/bin/emirge.py", line 1681, in main
    em.initialize_EM(options.mapping, options.fasta_db)
  File "/home/z3382651/bin/mypythondir/mypythonenv/mypythonenv/bin/emirge.py", line 337, in initialize_EM
    self.read_bam(bam_filename, reference_fasta_filename)
  File "/home/z3382651/bin/mypythondir/mypythonenv/mypythonenv/bin/emirge.py", line 297, in read_bam
    self.probN = [None for x in range(max(self.sequence_name2sequence_i[-1].values())+1)]
ValueError: max() arg is an empty sequence
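The bowtie summary printed earlier in this log already reports how many reads aligned, so it can be checked before reading the traceback. A minimal sketch that pulls those counts out of the summary text; the function name `parse_bowtie_summary` is hypothetical, and the line formats are assumed only from the logs pasted in this thread:

```python
import re


def parse_bowtie_summary(log_text):
    """Extract read counts from bowtie's '# reads ...' summary lines."""
    patterns = {
        "processed": r"# reads processed: (\d+)",
        "aligned": r"# reads with at least one reported alignment: (\d+)",
        "failed": r"# reads that failed to align: (\d+)",
    }
    counts = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, log_text)
        if match:
            counts[key] = int(match.group(1))
    return counts


# Summary lines copied from the log above.
sample = """# reads processed: 4903537
# reads with at least one reported alignment: 0 (0.00%)
# reads that failed to align: 4903537 (100.00%)
No alignments"""

counts = parse_bowtie_summary(sample)
if counts.get("aligned", 0) == 0:
    print("bowtie reported no alignments for this run")
```

Saving the bowtie output to a file and passing its contents to the function would give the same check for other runs.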
samtools flagstat initial_bowtie_mapping.PE.bam
0 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 duplicates
0 + 0 mapped (nan%:nan%)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (nan%:nan%)
0 + 0 with itself and mate mapped
0 + 0 singletons (nan%:nan%)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
samtools view -h initial_bowtie_mapping.PE.bam | head
@HD VN:1.0 SO:unsorted
@SQ SN:FJ788112.1.2000 LN:2000
@SQ SN:JF742194.1.2062 LN:2000
@SQ SN:GU290080.1.2000 LN:2000
@SQ SN:FJ572900.1.2000 LN:2000
@SQ SN:GU556149.1.2000 LN:2000
@SQ SN:AY288699.1.2000 LN:2000
@SQ SN:AF012514.1.2000 LN:2000
@SQ SN:AF093247.1.2007 LN:2000
@SQ SN:AF249194.1.1999 LN:1999
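The `samtools view -h` output above shows only header lines (`@HD`, `@SQ`), which matches the `0 + 0 in total` reported by `flagstat`. In SAM text, header lines begin with `@` and alignment records do not, so a short sketch can confirm whether any records follow the header. The function name `count_sam_lines` is hypothetical, and the input is assumed to be SAM text as produced by `samtools view -h`:

```python
def count_sam_lines(sam_text):
    """Return (header_lines, alignment_records) for a block of SAM text."""
    headers = 0
    records = 0
    for line in sam_text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        if line.startswith("@"):
            headers += 1  # header lines such as @HD and @SQ
        else:
            records += 1  # anything else is an alignment record
    return headers, records


# Tab-separated header lines modelled on the output above; no records follow.
sample = "@HD\tVN:1.0\tSO:unsorted\n@SQ\tSN:FJ788112.1.2000\tLN:2000\n@SQ\tSN:JF742194.1.2062\tLN:2000\n"

headers, records = count_sam_lines(sample)
print(headers, records)
```

For a quick check on the command line, `samtools view -c` prints the number of alignment records directly.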