-Multiple databases, including kraken2 standard, kraken2 standard 8gb, RefSeqCompleteV205 100gb, SILVA132, kraken NCBI
-Adjusting '--confidence' at multiple values between 0 and 0.5
-Input concatenated reads file as fastq
-Input concatenated reads file as fasta
-Input paired (R1 and R2) sequence files (fastq) using the '--paired' flag
-Running with and without '--use-names'
-Running with and without '--memory-mapping'
-Running with and without '--quick'
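For anyone wanting to repeat this kind of parameter sweep systematically, here is a minimal sketch that only builds the kraken2 command lines for a range of '--confidence' values (it does not run them). The database name, file names, and output directory are placeholders, not the paths used in this thread, and the output naming scheme is just one possible convention.

```python
import shlex
from pathlib import Path


def build_kraken2_commands(db, r1, r2, out_dir, confidences, threads=12):
    """Return one kraken2 argument list per confidence value.

    Nothing is executed here; the lists can be passed to subprocess.run()
    or printed and pasted into a shell.
    """
    commands = []
    for conf in confidences:
        tag = f"conf{conf:.2f}"  # e.g. conf0.25, used to keep outputs apart
        commands.append([
            "kraken2",
            "--db", str(db),
            "--threads", str(threads),
            "--confidence", f"{conf:.2f}",
            "--paired",
            "--report", str(Path(out_dir) / f"{tag}.report.txt"),
            "--output", str(Path(out_dir) / f"{tag}.kraken2.txt"),
            str(r1), str(r2),
        ])
    return commands


if __name__ == "__main__":
    # Placeholder inputs (hypothetical names, not from the original post).
    sweep = build_kraken2_commands(
        db="k2_standard",
        r1="sample_R1.fastq",
        r2="sample_R2.fastq",
        out_dir="sweep_out",
        confidences=[0.0, 0.1, 0.25, 0.5],
    )
    for cmd in sweep:
        print(shlex.join(cmd))
```

Keeping one report per confidence value makes it easy to see afterwards whether the classified fraction changes at all across the sweep.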
Any ideas on what might be happening? To rule out a pre-processing error, I (1) ran FastQC on the concatenated sequencing lanes and on the concatenated reads (F and R) following the KneadData step (everything looks good to me), and (2) tested a sample from a different dataset that was pre-processed following the MG SOPv2 (same sample type; it also did not classify above 3%). These samples came from marine biofilms on steel, and I don't think they are that novel; however, maybe I am missing something obvious.
Thank you,
Rachel
![]() |
ANDRÉ M. COMEAU, PhD |
Address for deliveries: |
Research Associate (Lab Manager)
Morgan Langille Lab •
Dept. of Pharmacology |
||
"Without fantasy, there is no science. Without fact, there is no art." - Nabokov |
kraken2 --db /Volumes/FireFly_Promise_Pegasus/Databases/kraken2/k2_standard \
  --threads 12 --quick \
  --output /Volumes/FireFly_Promise_Pegasus/RMugge/DISSERTATION/Ch3/Metagenome_Microcosm2.0/kraken2_testing_random/output/output.kraken2.txt \
  --report /Volumes/FireFly_Promise_Pegasus/RMugge/DISSERTATION/Ch3/Metagenome_Microcosm2.0/kraken2_testing_random/output/kraken2_report \
  --use-names --memory-mapping --confidence 0.5 \
  /Volumes/FireFly_Promise_Pegasus/RMugge/DISSERTATION/Ch3/Metagenome_Microcosm2.0/kraken2_testing_random/input/*.fastq
Output:
289348 sequences classified (1.02%)
27965686 sequences unclassified (98.98%)
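To look past the summary line, the file written by '--report' can be parsed directly. The sketch below assumes the standard six-column, tab-separated Kraken2 report layout (percent, clade reads, direct reads, rank code, taxid, name). In the demo data, the unclassified count and the root clade count are the two numbers from the output above; the root direct-read count and the Bacteria and Proteobacteria rows are placeholders invented purely to show the format.

```python
def parse_report(lines):
    """Parse a standard 6-column Kraken2 report into a list of dicts."""
    rows = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        rows.append({
            "percent": float(fields[0]),
            "clade_reads": int(fields[1]),   # reads in this clade (incl. children)
            "direct_reads": int(fields[2]),  # reads assigned exactly to this taxon
            "rank": fields[3].strip(),       # U, R, D, P, C, O, F, G, S, ...
            "taxid": fields[4].strip(),
            "name": fields[5].strip(),       # names are indented in the report
        })
    return rows


def classified_fraction(rows):
    """Fraction of reads under root (rank R) out of root + unclassified (rank U)."""
    unclassified = sum(r["clade_reads"] for r in rows if r["rank"] == "U")
    classified = sum(r["clade_reads"] for r in rows if r["rank"] == "R")
    total = classified + unclassified
    return classified / total if total else 0.0


def top_taxa(rows, rank="P", n=5):
    """Most abundant taxa at one rank code, by clade read count."""
    hits = [r for r in rows if r["rank"] == rank]
    return sorted(hits, key=lambda r: r["clade_reads"], reverse=True)[:n]


if __name__ == "__main__":
    # U and root clade counts come from the run above; the rest are placeholders.
    demo = [
        " 98.98\t27965686\t27965686\tU\t0\tunclassified\n",
        "  1.02\t289348\t1000\tR\t1\troot\n",
        "  0.90\t254000\t2000\tD\t2\t    Bacteria\n",
        "  0.70\t198000\t3000\tP\t1224\t      Proteobacteria\n",
    ]
    rows = parse_report(demo)
    print(f"classified fraction: {classified_fraction(rows):.4f}")
    for taxon in top_taxa(rows, rank="P"):
        print(taxon["name"], taxon["clade_reads"])
```

Listing the top taxa at a few ranks shows whether the small classified fraction is spread thinly across many lineages or concentrated in one group.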
OK, good to know about the 16S. However, were there dominant ASVs in there that were only classified down to a shallow tax level (such as Proteobacteria-unclassified)? Those will still show up as matching query sequences, but just get labeled with much less fine taxonomy.
For comparison, I extracted all the rRNA genes from the metaG sample, and the largest number of hits for the LSU were coming back as the following NCBI accession (see attached pivot table of the results as well):
Interestingly, this seems to be an unclassified zeta-proteobacterium scaffold from Bigelow that has since been removed from the dbase (it says due to awaiting publication, but that was submitted in 2013) and hence would not be in the Kraken2 dbases.
Maybe you are getting a highly specific consortium of these zetas being selected for in these biofilms. It might be worth trying some MAG assembly with these samples (either individually or "pooled" if they behave the same way) to see if you can get some larger scaffolds that would confirm you have a few dominant organisms/strains here and that they may be pretty novel. You would then be able to map this LSU sequence back to the scaffolds to test whether it belongs to this new zeta. You could also map the reads back to the assembled scaffolds to measure what fraction of the original reads they recover, which would tell you whether the majority of reads missed by Kraken2 map to these new scaffolds and hence why they were missed.
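As a rough illustration of that last read-recovery check, the sketch below counts what fraction of primary read records in a SAM file are mapped. It assumes the reads were already aligned to the assembled scaffolds with some aligner that writes SAM (the choice of aligner is left open), and the demo records are synthetic. It relies only on the standard SAM flag bits: 0x4 (unmapped), 0x100 (secondary), and 0x800 (supplementary).

```python
UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def mapped_fraction(sam_lines):
    """Return (mapped, total, fraction) over primary alignment records.

    Header lines (starting with '@') are skipped. Secondary and supplementary
    records are ignored so that each read is counted once. For paired data,
    each mate is its own record and is counted separately.
    """
    mapped = 0
    total = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # column 2 of a SAM record is the bitwise FLAG
        if flag & (SECONDARY | SUPPLEMENTARY):
            continue
        total += 1
        if not flag & UNMAPPED:
            mapped += 1
    fraction = mapped / total if total else 0.0
    return mapped, total, fraction


if __name__ == "__main__":
    # Synthetic records for illustration only (hypothetical scaffold names).
    demo = [
        "@SQ\tSN:scaffold_1\tLN:50000\n",
        "r1\t0\tscaffold_1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
        "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\n",
        "r3\t16\tscaffold_1\t900\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
        "r3\t2048\tscaffold_1\t5000\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
    ]
    mapped, total, fraction = mapped_fraction(demo)
    print(f"{mapped}/{total} primary reads mapped ({fraction:.1%})")
```

If most reads that Kraken2 left unclassified map back to a handful of scaffolds, that would support the idea of a few dominant organisms missing from the reference databases.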
ANDRÉ M. COMEAU, PhD
Manager • Integrated Microbiome Resource (IMR)
T: 902.494.2684 | E: andre....@dal.ca
Address for deliveries:
Dept. of Pharmacology
Tupper Med. Bldg., room 5D
Dalhousie University
5850 College St.
Halifax NS B3H 4R2
Research Associate (Lab Manager)
Morgan Langille Lab • Dept. of Pharmacology
ResearchGate Profile • GoogleScholar Publications
"Without fantasy, there is no science. Without fact, there is no art." - Nabokov
"The good thing about science is that it's true whether or not you believe in it." - Neil deGrasse Tyson
From: microbio...@googlegroups.com <microbio...@googlegroups.com> on behalf of Rachel Mugge <rachel...@gmail.com>