Bowtie/diamond parameters - combined DNA/RNA - read length


Flo

Sep 11, 2019, 10:42:01 AM
to MetaPhlAn-users

Dear MetaPhlAn2/HUMAnN2 users and developers,

 

Thanks a lot for developing and maintaining these tools. I am working on oral microbiome samples using metagenomics (2 x ~250 nt after quality trimming) and metatranscriptomics (2 x ~100 nt after quality trimming).

 

I recently realized that read length should be taken into account when running metaphlan2 and humann2 (e.g., https://groups.google.com/forum/#!searchin/humann-users/sensitive-local%7Csort:date/humann-users/UTJMPKaO59w/CLnVREUjAQAJ).

 

Do you have recommendations for metaphlan2 taxonomic profiling of 250 nt DNA reads?

1-    Are the options --min_alignment_len 100 --bt2_ps sensitive-local appropriate, or would I miss some marker genes by setting the minimum alignment length to 100 nt? Is there any reason to use very-sensitive-local instead?

 

For humann2, the advice is also to switch the bowtie2 options for the pangenome search to local mode, and to relax the translated search by lowering --translated-query-coverage-threshold from 90% to 50%.

 

2-    I can’t find the argument that sets the bowtie2 preset in humann2 (v2.8.1). How can I do that?

 

I am working with metagenomes and metatranscriptomes (human oral sites), and after quality trimming my paired-end reads are 250 nt and 100 nt long, respectively.

 

3-    Would you advise using --bt2_ps sensitive-local --min_alignment_len 100 and --translated-query-coverage-threshold 50 for the 250 nt DNA reads, and keeping the defaults for the 100 nt RNA reads?
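
To make question 3 concrete, the two setups could be written out roughly as below. This is only a sketch of what is being asked, not a recommended configuration: it uses the flags named in this post plus --input_type, -o, --input and --output, and all file and directory names are placeholders. It only builds and prints the command lines; it does not run either tool.

```python
import shlex

# Extra options per library type, taken from the question above.
# An empty list means "keep the tool defaults".
METAPHLAN2_EXTRA = {
    "DNA": ["--bt2_ps", "sensitive-local", "--min_alignment_len", "100"],
    "RNA": [],
}
HUMANN2_EXTRA = {
    "DNA": ["--translated-query-coverage-threshold", "50"],
    "RNA": [],
}


def metaphlan2_cmd(library, reads, profile):
    """Build the metaphlan2.py argument list for one library type."""
    return (["metaphlan2.py", reads, "--input_type", "fastq"]
            + METAPHLAN2_EXTRA[library]
            + ["-o", profile])


def humann2_cmd(library, reads, profile, outdir):
    """Build the humann2 argument list for one library type."""
    return (["humann2", "--input", reads, "--output", outdir,
             "--taxonomic-profile", profile,
             "--prescreen-threshold", "0.01"]
            + HUMANN2_EXTRA[library])


if __name__ == "__main__":
    # Placeholder file names, one FASTQ per library type.
    for library, reads in [("DNA", "sample_DNA.fastq"),
                           ("RNA", "sample_RNA.fastq")]:
        profile = "metaphlan2/sample_%s_profile.txt" % library
        print(shlex.join(metaphlan2_cmd(library, reads, profile)))
        print(shlex.join(humann2_cmd(library, reads, profile,
                                     "humann2_" + library)))
```

Keeping the per-library options in two small tables makes it easy to see that the DNA and RNA runs differ only in the alignment settings under discussion.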

 


Thanks a ton.



Florentin





P.S.: Below is a summary of the log output for one sample I ran with humann2 (defaults except --prescreen-threshold 0.01 --taxonomic-profile metaphlan2/ ${NAME}*_profile.txt).

 


Metric                                                                            DNA - 2x 250 nts    RNA - 2x 100 nts
Quality trimmed human free reads                                                  4991454             1493062
Aligned exactly 1 time against nucleotide database                                23.05%              20.64%
Aligned >1 times (nucleotide database)                                            30.23%              48.03%
Overall alignment rate (nucleotide database)                                      53.28%              68.67%
Unaligned reads after nucleotide alignment                                        51.1878102052 %     33.8590079951 %
Unaligned reads after translated alignment                                        50.1295814807 %     32.7953541295 %
Proteins with coverage greater than threshold (50.0)                              6296                2001
Total translated alignments not included based on small query coverage            6703489             1025460
Total translated alignments not included based on small subject coverage value    56900               160339
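
As a quick reading aid for the log summary above, the short sketch below checks that the two bowtie2 rates add up to the overall alignment rate and computes how many percentage points of reads the translated search recovers. The numbers are copied from the summary; the dictionary keys and function names are only illustrative.

```python
# Values copied from the log summary above (percent of reads).
LOG = {
    "DNA": {"unique": 23.05, "multi": 30.23, "overall": 53.28,
            "unaligned_nt": 51.1878102052, "unaligned_tr": 50.1295814807},
    "RNA": {"unique": 20.64, "multi": 48.03, "overall": 68.67,
            "unaligned_nt": 33.8590079951, "unaligned_tr": 32.7953541295},
}


def rates_consistent(row, tol=0.01):
    """True if 'exactly 1 time' + '>1 times' matches the overall rate."""
    return abs(row["unique"] + row["multi"] - row["overall"]) <= tol


def translated_gain(row):
    """Percentage points of reads recovered by the translated search."""
    return row["unaligned_nt"] - row["unaligned_tr"]


if __name__ == "__main__":
    for library, row in LOG.items():
        print(library, rates_consistent(row), round(translated_gain(row), 2))
```

With these values the translated search recovers about 1.06 percentage points of reads for both the DNA and the RNA sample.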


