Can't reproduce the basic tutorial results...just 100% unclassified

허지원

Sep 14, 2018, 2:11:28 AM
to MetaPhlAn-users
Hello!

I am new to this field and am trying to learn microbial community analysis using whole-genome sequencing.

I have installed MetaPhlAn2 and the required programs (numpy, bowtie2...).

I added metaphlan2 and bowtie2 to my PATH, and my Python and Bowtie2 versions meet the requirements.

However, when I run the command below, it fails to classify the input data and produces an empty bowtie2out.txt:

[jwhuh@lsg03 tutorial]$ metaphlan2.py SRS014459-Stool.fasta.gz --input_type fasta > stool.txt
Help message for read_fastx.py

[jwhuh@lsg03 tutorial]$ ll
total 700
-rw-r--r-- 1 jwhuh users 705910 Sep 14 14:55 SRS014459-Stool.fasta.gz
-rw-r--r-- 1 jwhuh users 0 Sep 14 14:56 SRS014459-Stool.fasta.gz.bowtie2out.txt
-rw-r--r-- 1 jwhuh users 49 Sep 14 14:56 stool.txt

[jwhuh@lsg03 tutorial]$ head stool.txt
#SampleID Metaphlan2_Analysis
unclassified 100.0

I have changed the permissions on both the sample data and the metaphlan_databases directory, but that didn't help.

What am I doing wrong? I hope you can help me.
Thank you in advance.

Best,
JW Huh

Francesco Asnicar

Sep 14, 2018, 7:49:41 AM
to 허지원, MetaPhlAn-users
Hello JW Huh,

Thanks for writing. I tried to reproduce your issue, without success. Here are the steps I followed:
1) get the data

2) run MetaPhlAn2
$ metaphlan2.py --input_type fasta SRS014459-Stool.fasta.gz > SRS014459-Stool.profile
Help message for read_fastx.py

3) check the generated files
$ ls -l
total 744K
-rw-rw---- 1 f.asnicar CM 690K Sep 14 12:11 SRS014459-Stool.fasta.gz
-rw-rw---- 1 f.asnicar CM  45K Sep 14 12:13 SRS014459-Stool.fasta.gz.bowtie2out.txt
-rw-rw---- 1 f.asnicar CM 3.6K Sep 14 12:14 SRS014459-Stool.profile

4) the profile of the sample
$ head SRS014459-Stool.profile 
#SampleID       Metaphlan2_Analysis
k__Bacteria     100.0
k__Bacteria|p__Firmicutes       65.04229
k__Bacteria|p__Bacteroidetes    34.95771
k__Bacteria|p__Firmicutes|c__Clostridia 65.04229
k__Bacteria|p__Bacteroidetes|c__Bacteroidia     34.95771
k__Bacteria|p__Firmicutes|c__Clostridia|o__Clostridiales        65.04229
k__Bacteria|p__Bacteroidetes|c__Bacteroidia|o__Bacteroidales    34.95771
k__Bacteria|p__Firmicutes|c__Clostridia|o__Clostridiales|f__Ruminococcaceae     39.25462
k__Bacteria|p__Bacteroidetes|c__Bacteroidia|o__Bacteroidales|f__Bacteroidaceae  31.02488

Could you check which versions of the tools you installed, the sizes of the database files, and which versions of Python and the Python libraries you have installed for MetaPhlAn2? That way we can try to work out whether there is an incompatibility somewhere.
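
The checks requested here could be gathered with a short script along these lines. This is only a sketch: the helper names (`library_versions`, `file_sizes`, `bowtie2_version`) are hypothetical and not part of MetaPhlAn2, and it assumes `bowtie2` is discoverable on the PATH.

```python
import importlib
import os
import shutil
import subprocess


def library_versions(names):
    """Return {module name: version string}, or None for modules that fail to import."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
        else:
            # Not every module exposes __version__, so fall back to "unknown".
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


def file_sizes(directory):
    """Return {file name: size in bytes} for the regular files in a directory."""
    sizes = {}
    for entry in sorted(os.listdir(directory)):
        path = os.path.join(directory, entry)
        if os.path.isfile(path):
            sizes[entry] = os.path.getsize(path)
    return sizes


def bowtie2_version():
    """Return the first line printed by `bowtie2 --version`, or None if not on PATH."""
    exe = shutil.which("bowtie2")
    if exe is None:
        return None
    result = subprocess.run(
        [exe, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else None
```

For example, `library_versions(["numpy", "biom"])` reports the numpy and biom-format versions, and `file_sizes("metaphlan_databases")` lists the exact byte size of each database file (the directory name is an assumption; adjust it to your install).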


Many thanks,
Francesco

허지원

Sep 16, 2018, 5:50:34 AM
to MetaPhlAn-users
Thank you for the kind and fast response!

Your process looks really clear. I hope I can do that as well.
Anyway, the versions of the tools I installed for MetaPhlAn2 are listed below:

- Python 3.6.5
- Bowtie2-2.3.4.1
(these two were already installed before I installed MetaPhlAn2)

- numpy 1.15.1
- biom-format 2.1.6
- pandas 0.23.4
- scipy 1.1.0
- jupyter 1.0.0
- sympy 1.2
- nose 1.3.7
- biopython 1.72
- ipython 6.5.0
- matplotlib 2.2.3


The MetaPhlAn2 database files and their sizes are shown below:

[jwhuh@lsg03 metaphlan_databases]$ ll -h
total 553M
-rwxrwxrwx 1 jwhuh users 20M Sep 12 15:17 mpa_v20_m200.1.bt2
-rwxrwxrwx 1 jwhuh users 9.4M Sep 12 15:17 mpa_v20_m200.2.bt2
-rwxrwxrwx 1 jwhuh users 509K Sep 12 15:16 mpa_v20_m200.3.bt2
-rwxrwxrwx 1 jwhuh users 9.4M Sep 12 15:16 mpa_v20_m200.4.bt2
-rwxrwxrwx 1 jwhuh users 204M Nov 2 2017 mpa_v20_m200.fna.bz2
-rwxrwxrwx 1 jwhuh users 51 Sep 12 15:16 mpa_v20_m200.md5
-rwxrwxrwx 1 jwhuh users 39M Nov 3 2017 mpa_v20_m200.pkl
-rwxrwxrwx 1 jwhuh users 20M Sep 12 15:17 mpa_v20_m200.rev.1.bt2
-rwxrwxrwx 1 jwhuh users 9.4M Sep 12 15:17 mpa_v20_m200.rev.2.bt2
-rwxrwxrwx 1 jwhuh users 242M Sep 12 15:16 mpa_v20_m200.tar

Thanks again for your help.

Sincerely,
Huh

허지원

Sep 17, 2018, 1:56:19 AM
to MetaPhlAn-users
Let me add more detail.
When I ran bowtie2 on its own, as below, it also failed to align any reads:

[jwhuh@lsg03 tutorial]$ bowtie2 --sensitive -S stool.sam -x /data/program/Metaphlan2/biobakery-metaphlan2-097a52362c79/metaphlan_databases/mpa_v20_m200 -fU SRS014459-Stool.fasta.gz

20000 reads; of these:
  20000 (100.00%) were unpaired; of these:
    20000 (100.00%) aligned 0 times
    0 (0.00%) aligned exactly 1 time
    0 (0.00%) aligned >1 times
0.00% overall alignment rate

Since you successfully classified the same stool sample, I think the bowtie2 step on my machine is what is going wrong.

Alternatively, could you provide the intermediate SAM file produced by the bowtie2 command, so that I can confirm whether my MetaPhlAn2 works properly?
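
A 0.00% overall alignment rate on the tutorial sample is the signal that points at the index rather than at MetaPhlAn2 itself. As a rough sketch, the rate can be pulled out of the bowtie2 summary (which bowtie2 writes to standard error) so that a wrapper script can flag it. The function name `overall_alignment_rate` is hypothetical, and the summary text is the one pasted above.

```python
import re

# Matches the final line of the bowtie2 summary, e.g. "0.00% overall alignment rate".
_RATE = re.compile(
    r"^\s*([0-9]+(?:\.[0-9]+)?)% overall alignment rate\s*$",
    re.MULTILINE,
)


def overall_alignment_rate(summary_text):
    """Return the overall alignment rate as a float, or None if no such line exists."""
    match = _RATE.search(summary_text)
    if match is None:
        return None
    return float(match.group(1))


SUMMARY = """20000 reads; of these:
  20000 (100.00%) were unpaired; of these:
    20000 (100.00%) aligned 0 times
    0 (0.00%) aligned exactly 1 time
    0 (0.00%) aligned >1 times
0.00% overall alignment rate
"""

rate = overall_alignment_rate(SUMMARY)
if rate == 0.0:
    print("No reads aligned: check the bowtie2 index files before anything else.")
```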


Thank you!

Best,
Huh

허지원

Sep 17, 2018, 3:25:02 AM
to MetaPhlAn-users
Now, it worked!!

It turned out that the original bowtie2 index files in the metaphlan_databases directory were not working. I ran the command below:

$ metaphlan2.py SRS014459-Stool.fasta --input_type fasta --mpa_pkl ./metaphlan_databases/mpa_v20_m200.pkl --bowtie2db ./metaphlan_databases/mpa_v20_m200 --bowtie2_exe /data/program/bowtie2-2.3.4.1 --bowtie2out stool.out --tmp_dir .

(I linked the database directory)

The command downloaded the MetaPhlAn2 database afresh, like this:

------------------------------------------------------------------------
Downloading MetaPhlAn2 database
Please note due to the size this might take a few minutes

Downloading https://bitbucket.org/biobakery/metaphlan2/downloads/mpa_v20_m200.tar
Downloading file of size: 241.78 MB
241.78 MB 100.00 % 7.22 MB/sec 0 min -0 sec
Downloading https://bitbucket.org/biobakery/metaphlan2/downloads/mpa_v20_m200.md5
Downloading file of size: 0.00 MB
0.01 MB 16062.75 % 3.95 MB/sec 0 min -0 sec

Decompressing ./metaphlan_databases/mpa_v20_m200/mpa_v20_m200.fna.bz2 into ./metaphlan_databases/mpa_v20_m200/mpa_v20_m200.fna

Building Bowtie2 indexes
Removing uncompress database ./metaphlan_databases/mpa_v20_m200/mpa_v20_m200.fna

Download complete
Help message for read_fastx.py
OSError: "[Errno 13] Permission denied: '/data/program/bowtie2-2.3.4.1'"
Fatal error running BowTie2. Is BowTie2 in the system path?
----------------------------------------------------------------------------
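
A note on the OSError in the log above: Python raises Errno 13 when asked to execute a path that is a directory or a file without the execute bit set. One possible reading (an assumption; the thread does not confirm it) is that --bowtie2_exe was given the Bowtie2 install directory rather than the bowtie2 program inside it. A minimal sketch for checking a candidate path follows; `describe_executable` is a hypothetical helper, not part of MetaPhlAn2.

```python
import os


def describe_executable(path):
    """Classify a path as 'missing', 'directory', 'not executable', or 'ok'."""
    if not os.path.exists(path):
        return "missing"
    if os.path.isdir(path):
        # Executing a directory fails with "Permission denied" (Errno 13).
        return "directory"
    if not os.access(path, os.X_OK):
        return "not executable"
    return "ok"
```

Calling `describe_executable("/data/program/bowtie2-2.3.4.1")` on the machine in question would show which of these cases applies.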

Once I fixed the permissions and used the new indexes for classification, it worked perfectly.

So I replaced the original indexes with the new ones, and now the first command that didn't work, "$ metaphlan2.py SRS014459-Stool.fasta --input_type fasta > stool.txt", gives the expected results!!

I greatly appreciate your help. Your mention of the index size gave me the hint.

I will be back when I run into another problem :)
Thank you again

Best,
Huh

Francesco Asnicar

Sep 17, 2018, 3:25:03 AM
to 허지원, MetaPhlAn-users
Hi Huh,

I think something is not right with your bowtie2 indexes. The sizes of the automatically downloaded files are correct, but the 6 files generated by bowtie2 do not match. These are the sizes of the 6 files that I have:
291M  mpa_v20_m200.1.bt2
170M  mpa_v20_m200.2.bt2
9.0M  mpa_v20_m200.3.bt2
170M  mpa_v20_m200.4.bt2
291M  mpa_v20_m200.rev.1.bt2
170M  mpa_v20_m200.rev.2.bt2
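
The comparison can be sketched in a few lines. The reference sizes are the ones listed just above and the observed sizes are those from the earlier `ll -h` listing; the helper names and the 0.9 threshold are assumptions chosen for illustration, and `ls -h` sizes are rounded, so this is only a coarse check.

```python
# Binary multipliers, matching how `ls -h` abbreviates sizes.
_UNITS = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(text):
    """Convert a size such as '291M', '509K' or '51' to an approximate byte count."""
    text = text.strip()
    unit = text[-1].upper() if text[-1].isalpha() else ""
    number = text[:-1] if unit else text
    return int(float(number) * _UNITS[unit])


def undersized(observed, reference, min_ratio=0.9):
    """Return names that are missing or smaller than min_ratio times the reference size."""
    flagged = []
    for name, ref in reference.items():
        size = observed.get(name)
        if size is None or parse_size(size) < min_ratio * parse_size(ref):
            flagged.append(name)
    return flagged


REFERENCE = {
    "mpa_v20_m200.1.bt2": "291M",
    "mpa_v20_m200.2.bt2": "170M",
    "mpa_v20_m200.3.bt2": "9.0M",
    "mpa_v20_m200.4.bt2": "170M",
    "mpa_v20_m200.rev.1.bt2": "291M",
    "mpa_v20_m200.rev.2.bt2": "170M",
}

OBSERVED = {
    "mpa_v20_m200.1.bt2": "20M",
    "mpa_v20_m200.2.bt2": "9.4M",
    "mpa_v20_m200.3.bt2": "509K",
    "mpa_v20_m200.4.bt2": "9.4M",
    "mpa_v20_m200.rev.1.bt2": "20M",
    "mpa_v20_m200.rev.2.bt2": "9.4M",
}

for name in undersized(OBSERVED, REFERENCE):
    print("suspiciously small index file:", name)
```

With the sizes from this thread, all six index files are flagged, which is consistent with the indexes having been built incorrectly.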

From the bioBakery tutorial for MetaPhlAn2 (https://bitbucket.org/biobakery/biobakery/wiki/metaphlan2#rst-header-input-files) you can get the bowtie2out file for the example files. This should be the one for the stool example: https://bitbucket.org/biobakery/biobakery/raw/tip/demos/biobakery_demos/data/metaphlan2/output/SRS014459-Stool.fasta.gz.bowtie2out.txt

I remember that at a certain point bowtie2 had a bug related to this. This should be the issue on GitHub: https://github.com/BenLangmead/bowtie2/issues/194#issuecomment-406467586

Many thanks,
Francesco

허지원

Sep 17, 2018, 3:30:25 AM
to MetaPhlAn-users
Oh, you and I posted at the same time :)

You are correct! Once I re-downloaded the database, it worked perfectly.
I would like to express my deepest thanks.

Thank you

H

Francesco Asnicar

Sep 17, 2018, 3:37:37 AM
to 허지원, MetaPhlAn-users
Hi Huh,

Awesome, I'm glad that we were able to fix this problem and now you have MetaPhlAn2 working.

All the best,
Francesco
