No MetaPhlAn BowTie2 database found

charlottec

Jun 27, 2018, 12:34:09 PM
to MetaPhlAn-users
Hi all,

I'm wondering if anyone can help me...

I have installed MetaPhlAn2 and all of its dependencies via Conda, in a conda environment.

I am trying to profile paired-end metagenomic samples. As a preliminary test, I ran this command:

metaphlan2.py S1_QC_1.fastq,S1_QC_2.fastq --bowtie2out bowtie_output_S1.bz2 -t rel_ab_w_read_stats --nproc 12 --input_type fastq

But it keeps giving me the error:
No MetaPhlAn BowTie2 database found (--index option)!
Expecting location bowtie2db

I have downloaded the database files and they are all present in /miniconda2/envs/metaphlan/bin/metaphlan_databases/
mpa_v20_m200.fna mpa_v20_m200.fna.bz2 mpa_v20_m200_marker_info.txt mpa_v20_m200_marker_info.txt.bz2 mpa_v20_m200.md5 mpa_v20_m200.pkl mpa_v20_m200.tar

I'm not sure whether I've left something out of my command, or whether I should have used --bowtie2db?

Thanks in advance!

Francesco Asnicar

Jun 28, 2018, 7:23:51 AM
to charlottec, MetaPhlAn-users
Hi,

So, the error occurs because MetaPhlAn can't find the Bowtie2 indexes. By default it looks for them in the "metaphlan_databases" folder, but in the error you reported the expected folder is "bowtie2db", which is not the default, and this is strange...

If I understood correctly, you downloaded the MetaPhlAn database manually. That shouldn't be a problem, but let's try the following just to rule out this possibility.
Could you rename "/miniconda2/envs/metaphlan/bin/metaphlan_databases/" to something like "/miniconda2/envs/metaphlan/bin/metaphlan_databases.bkp/" and then re-run MetaPhlAn (keeping the same parameters as in the command you posted in your previous email)? At that point MetaPhlAn should automatically download the database and then perform the profiling of your metagenomes.


Many thanks,
Francesco

--
You received this message because you are subscribed to the Google Groups "MetaPhlAn-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to metaphlan-use...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

charlottec

Jun 28, 2018, 8:49:03 AM
to MetaPhlAn-users
Hi Francesco,

Thank you for your reply.

I installed MetaPhlAn in a conda environment using this command: conda install -c bioconda metaphlan2

I just tried your solution, but when I re-ran MetaPhlAn it simply created another metaphlan_databases directory in miniconda2/envs/metaphlan/bin/

and gave this error:

Downloading MetaPhlAn2 database
Please note due to the size this might take a few minutes

Downloading https://bitbucket.org/biobakery/metaphlan2/downloads/mpa_v20_m200.tar
Downloading file of size: 241.78 MB
241.78 MB 100.00 % 13.27 MB/sec 0 min -0 sec
Downloading https://bitbucket.org/biobakery/metaphlan2/downloads/mpa_v20_m200.md5
Downloading file of size: 0.00 MB
0.01 MB 16062.75 % 12.19 MB/sec 0 min -0 sec

Decompressing /pub38/cchong/miniconda2/envs/metaphlan/bin/metaphlan_databases/mpa_v20_m200.fna.bz2 into /pub38/cchong/miniconda2/envs/metaphlan/bin/metaphlan_databases/mpa_v20_m200.fna

Building Bowtie2 indexes
Removing uncompress database /pub38/cchong/miniconda2/envs/metaphlan/bin/metaphlan_databases/mpa_v20_m200.fna

Download complete
No MetaPhlAn BowTie2 database found (--index option)!
Expecting location /pub38/cchong/miniconda2/envs/metaphlan/bin/metaphlan_databases/mpa_v20_m200
Exiting...(metaphlan)


Thanks,

Charlotte

Francesco Asnicar

Jun 29, 2018, 3:10:16 AM
to charlottec, MetaPhlAn-users
Hi Charlotte,

Can you please post the contents of the (new) "metaphlan_databases" folder created automatically by MetaPhlAn (ls -l /pub38/cchong/miniconda2/envs/metaphlan/bin/metaphlan_databases/)?

That error occurs when MetaPhlAn does not find the Bowtie2 indexes of the database.

Many thanks,
Francesco

charlottec

Jun 29, 2018, 5:23:40 AM
to MetaPhlAn-users
Hi Francesco,

The contents of the new metaphlan_databases folder are:

/miniconda2/envs/metaphlan/bin/metaphlan_databases$ ls
mpa_v20_m200.1.bt2 mpa_v20_m200.2.bt2 mpa_v20_m200.3.bt2 mpa_v20_m200.4.bt2 mpa_v20_m200.fna.bz2 mpa_v20_m200.md5 mpa_v20_m200.pkl mpa_v20_m200.tar

It seems that all of the correct files are present?

Thank you for your help!

Charlotte

Francesco Asnicar

Jun 29, 2018, 5:52:32 AM
to charlottec, MetaPhlAn-users
Thanks Charlotte for the ls output.

I'm not sure why, but 2 files are missing: besides the 1, 2, 3, and 4 .bt2 indexes there should also be rev.1.bt2 and rev.2.bt2.
Can you try manually running the following commands in the "metaphlan_databases" folder, to see whether the problem is related to Bowtie2:
$ bzip2 -kd mpa_v20_m200.fna.bz2
$ bowtie2-build -f mpa_v20_m200.fna mpa_v20_m200

The first command uncompresses the database and the second should create 6 Bowtie2 index files:
mpa_v20_m200.1.bt2
mpa_v20_m200.2.bt2
mpa_v20_m200.3.bt2
mpa_v20_m200.4.bt2
mpa_v20_m200.rev.1.bt2
mpa_v20_m200.rev.2.bt2
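
A quick way to confirm whether all six index files are present is a short script such as the sketch below. It is only an illustration: the function name missing_index_files and the constant BT2_SUFFIXES are made up here, and it assumes a small (non-large) index with the default name mpa_v20_m200.

```python
from pathlib import Path

# Suffixes of the six Bowtie2 index files listed above (assumes a small index).
BT2_SUFFIXES = (".1.bt2", ".2.bt2", ".3.bt2", ".4.bt2", ".rev.1.bt2", ".rev.2.bt2")


def missing_index_files(db_dir, index="mpa_v20_m200"):
    """Return the names of expected Bowtie2 index files that are absent from db_dir.

    missing_index_files is a hypothetical helper, not part of MetaPhlAn or Bowtie2.
    """
    db_dir = Path(db_dir)
    return [index + suffix
            for suffix in BT2_SUFFIXES
            if not (db_dir / (index + suffix)).is_file()]
```

For a folder that holds only the .1 to .4 files, the returned list contains the two .rev names; an empty list means all six files are there.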

BTW, I just created an empty conda environment, installed MetaPhlAn from Bioconda, and ran it without any problem on a sample fastq file... So, I'm thinking that your error could be due to a different bowtie2 version (?).
The Bowtie2 version installed in the conda environment is:
$ bowtie2 --version
/shares/CIBIO-Storage/CM/scratch/users/f.asnicar/anaconda3/envs/test_mpa2/bin/bowtie2-align-s version 2.3.4.1
64-bit
Built on default-df05fd51-3d07-4109-abba-6883676f3ae8
Mon Jun 25 23:12:07 UTC 2018
Compiler: gcc version 4.8.2 20140120 (Red Hat 4.8.2-15) (GCC) 
Options: -O3 -m64 -msse2 -funroll-loops -g3  -DBOOST_MATH_DISABLE_FLOAT128 -m64 -fPIC -std=c++98 -DPOPCNT_CAPABILITY -DWITH_TBB -DNO_SPINLOCK -DWITH_QUEUELOCK=1
Sizeof {int, long, long long, void*, size_t, off_t}: {4, 8, 8, 8, 8, 8}

Can you check which bowtie2 version you are using?


Many thanks,
Francesco

charlottec

Jun 29, 2018, 6:31:12 AM
to MetaPhlAn-users
I ran the two commands:

$ bzip2 -kd mpa_v20_m200.fna.bz2
$ bowtie2-build -f mpa_v20_m200.fna mpa_v20_m200

Giving me the output:
Settings:
Output files: "mpa_v20_m200.*.bt2"
Line rate: 6 (line is 64 bytes)
Lines per side: 1 (side is 64 bytes)
Offset rate: 4 (one in 16)
FTable chars: 10
Strings: unpacked
Max bucket size: default
Max bucket size, sqrt multiplier: default
Max bucket size, len divisor: 4
Difference-cover sample period: 1024
Endianness: little
Actual local endianness: little
Sanity checking: disabled
Assertions: disabled
Random seed: 0
Sizeofs: void*:8, int:4, long:8, size_t:8
Input files DNA, FASTA:
mpa_v20_m200.fna
Building a SMALL index
Reading reference sizes
Time reading reference sizes: 00:00:07
Calculating joined length
Writing header
Reserving space for joined string
Joining reference sequences
Time to join reference sequences: 00:00:07
bmax according to bmaxDivN setting: 177889396
Using parameters --bmax 133417047 --dcv 1024
Doing ahead-of-time memory usage test
Passed! Constructing with these parameters: --bmax 133417047 --dcv 1024
Constructing suffix-array element generator

$ls
mpa_v20_m200.1.bt2 mpa_v20_m200.3.bt2 mpa_v20_m200.fna mpa_v20_m200.md5 mpa_v20_m200.tar
mpa_v20_m200.2.bt2 mpa_v20_m200.4.bt2 mpa_v20_m200.fna.bz2 mpa_v20_m200.pkl

I don't appear to have the mpa_v20_m200.rev.1.bt2 mpa_v20_m200.rev.2.bt2 files.

And the version of bowtie2 that I have is the same as what you found:

/pub38/cchong/miniconda2/envs/metaphlan/bin/bowtie2-align-s version 2.3.4.1
64-bit
Built on default-df05fd51-3d07-4109-abba-6883676f3ae8
Mon Jun 25 23:12:07 UTC 2018
Compiler: gcc version 4.8.2 20140120 (Red Hat 4.8.2-15) (GCC)
Options: -O3 -m64 -msse2 -funroll-loops -g3 -DBOOST_MATH_DISABLE_FLOAT128 -m64 -fPIC -std=c++98 -DPOPCNT_CAPABILITY -DWITH_TBB -DNO_SPINLOCK -DWITH_QUEUELOCK=1
Sizeof {int, long, long long, void*, size_t, off_t}: {4, 8, 8, 8, 8, 8}

Thank you!

Charlotte

Francesco Asnicar

Jul 2, 2018, 1:56:04 PM
to charlottec, MetaPhlAn-users
Hi Charlotte,

I tried, but I wasn't able to get Bowtie2 to build just the 4 index files, and I think the files I get might be different from the ones you got.
So, to get your MetaPhlAn running, I uploaded the correct (or better, what we expect Bowtie2 to create) database files here: https://www.dropbox.com/sh/w4j4yr1b0o7xu9v/AAAx1yiV6enIGR7SuC8B34cKa?dl=0
If you put them in the "metaphlan_databases" folder, then MetaPhlAn should run fine.

It would be great for debugging if you could upload the 4 Bowtie2 files you got somewhere and share them with me, so that I can test them.

Many thanks,
Francesco

wulin...@126.com

Jul 4, 2018, 1:33:18 AM
to MetaPhlAn-users

Hi Francesco,

I got exactly the same issue as Charlotte. I've added the 4 Bowtie2 files (generated with bowtie2-build -f mpa_v20_m200.fna mpa_v20_m200) here: https://www.dropbox.com/sh/m6xnqfd33woy5x9/AADWauB5LnS0H8kC0vaxFIjQa?dl=0

For your information, I installed MetaPhlAn from conda, and the Bowtie2 version is the same as the one you posted (2.3.4.1). I tried removing and reinstalling MetaPhlAn/Bowtie2, but that didn't work.

Using the 6 Bowtie2 files you shared, my MetaPhlAn program is running now. But can we trust the results if there's a potential problem in Bowtie2?

Thanks so much!

River

Francesco Asnicar

Jul 4, 2018, 5:29:59 AM
to wulin...@126.com, MetaPhlAn-users
Hi River,

Thanks for sharing the files!

It seems that this is a problem with the bowtie2-build command. There is an open issue on GitHub (https://github.com/BenLangmead/bowtie2/issues/194) and I hope they can figure out what's happening.

In the meantime, the MetaPhlAn results are OK when using the Bowtie2 indexes I shared previously, as the problem is related to the bowtie2-build command and not to the bowtie2 aligner.


Many thanks,
Francesco

charlottec

Jul 9, 2018, 10:51:53 AM
to MetaPhlAn-users
Hi Francesco,

Thank you very much for your help. I have downloaded the metaphlan database files that you linked and my original command now runs.

However, I am now getting another error. It appears that the script cannot find the read_fastx.py file, despite this being in the correct place: "/pub38/cchong/miniconda2/envs/metaphlan/bin/read_fastx.py".

The error I get is:

Help message for read_fastx.py
Traceback (most recent call last):
File "/pub38/cchong/miniconda2/envs/metaphlan/bin/read_fastx.py", line 123, in <module>
read_and_write_raw(f, opened=False, min_len=min_len)
File "/pub38/cchong/miniconda2/envs/metaphlan/bin/read_fastx.py", line 88, in read_and_write_raw
with fopen(fd) as inf:
File "/pub38/cchong/miniconda2/envs/metaphlan/bin/read_fastx.py", line 47, in fopen
return open(fn)
IOError: [Errno 2] No such file or directory: ''

Is this file linked to the ones Bowtie2 builds? I'm unsure of how to fix this error.

Thank you in advance!

Charlotte

Francesco Asnicar

Jul 10, 2018, 4:42:52 AM
to charlottec, MetaPhlAn-users
Hi Charlotte,

Great that it is working now!

About read_fastx.py: I don't think the error is due to the read_fastx.py command not being found (in that case MetaPhlAn should say "OSError: fatal error running 'read_fastx.py'. Is it in the system path?"). Instead, it seems that read_fastx.py is crashing because the file name passed to it is an empty string (see "IOError: [Errno 2] No such file or directory: ''").
Can you check (or post here) the full command line?

Many thanks,
Francesco

charlottec

Jul 10, 2018, 5:10:07 AM
to MetaPhlAn-users
Hi Francesco,

The command I used was:

metaphlan2.py /pub38/cchong/metaphlan_analysis/fastq_input/S1_QC_1.fastq, /pub38/cchong/metaphlan_analysis/fastq_input/S1_QC_2.fastq --bowtie2out /pub38/cchong/metaphlan_analysis/metaphlan2_output/bowtie_output/bowtie_output_S1.bz2 -t rel_ab_w_read_stats --nproc 12 --input_type fastq

I think I must be missing something from my command!

Thank you!

Charlotte

Francesco Asnicar

Jul 10, 2018, 12:02:04 PM
to charlottec, MetaPhlAn-users
Thanks for the command.
I think the problem is the space after the comma between the two input files. Please make sure that there is no space between them; the command should look something like:

$ metaphlan2.py /pub38/cchong/metaphlan_analysis/fastq_input/S1_QC_1.fastq,/pub38/cchong/metaphlan_analysis/fastq_input/S1_QC_2.fastq --bowtie2out /pub38/cchong/metaphlan_analysis/metaphlan2_output/bowtie_output/bowtie_output_S1.bz2 -t rel_ab_w_read_stats --nproc 12 --input_type fastq
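
To see why the stray space produces an empty file name, the sketch below mimics the parsing in Python. It assumes (this is not confirmed in the thread) that the first positional argument is split on commas to obtain the input files. input_files is a hypothetical helper, and the file names are shortened for readability.

```python
import shlex


def input_files(command_line):
    """Split a command line the way a POSIX shell would, then split the first
    positional argument on commas (assumed behaviour, for illustration only)."""
    argv = shlex.split(command_line)
    return argv[1].split(",")


# With a space after the comma, the shell ends the first argument at "a_1.fastq,",
# so the comma split leaves an empty string as the second file name.
print(input_files("metaphlan2.py a_1.fastq, a_2.fastq --input_type fastq"))
# -> ['a_1.fastq', '']

# Without the space, both file names arrive in the same argument.
print(input_files("metaphlan2.py a_1.fastq,a_2.fastq --input_type fastq"))
# -> ['a_1.fastq', 'a_2.fastq']
```

The empty string in the first result matches the '' shown in the "No such file or directory" message.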


Thanks,
Francesco

Ashok Dinasarapu

Jul 15, 2018, 6:37:40 PM
to MetaPhlAn-users
I have the same issue with metaphlan2 when running humann2:

CRITICAL ERROR: Error executing: /home/adinasarapu/anaconda3/envs/ddocent_env/bin/metaphlan2.py /scratch/269684_JvK797BW/G45250_non_rRNA.fastq --input_type fastq --no_map --mpa_pkl /home/adinasarapu/anaconda3/envs/ddocent_env/bin/db_v20/mpa_v20_m200.pkl --bowtie2db /home/adinasarapu/anaconda3/envs/ddocent_env/bin/db_v20/mpa_v20_m200 -o /scratch/269684_JvK797BW/G45250_humann2/G45250_non_rRNA_humann2_temp/G45250_non_rRNA_metaphlan_bugs_list.tsv --input_type multifastq --bowtie2out /scratch/269684_JvK797BW/G45250_humann2/G45250_non_rRNA_humann2_temp/G45250_non_rRNA_metaphlan_bowtie2.txt --nproc 4

Error message returned from metaphlan2.py :

Downloading MetaPhlAn2 database
Please note due to the size this might take a few minutes

Downloading https://bitbucket.org/biobakery/metaphlan2/downloads/mpa_v20_m200.tar
Downloading file of size: 241.78 MB

241.78 MB 100.00 % 93.69 MB/sec 0 min -0 sec

0.01 MB 16062.75 % 11.87 MB/sec 0 min -0 sec

Decompressing /home/adinasarapu/anaconda3/envs/ddocent_env/bin/db_v20/mpa_v20_m200/mpa_v20_m200.fna.bz2 into /home/adinasarapu/anaconda3/envs/ddocent_env/bin/db_v20/mpa_v20_m200/mpa_v20_m200.fna

Building Bowtie2 indexes
Error: could not open /home/adinasarapu/anaconda3/envs/ddocent_env/bin/db_v20/mpa_v20_m200/mpa_v20_m200.fna
Error: Encountered internal Bowtie 2 exception (#1)
Command: /mnt/icebreaker/data2/home/adinasarapu/anaconda3/envs/ddocent_env/bin/bowtie2-build-s --wrapper basic-0 --quiet --threads 4 -f /home/adinasarapu/anaconda3/envs/ddocent_env/bin/db_v20/mpa_v20_m200/mpa_v20_m200.fna /home/adinasarapu/anaconda3/envs/ddocent_env/bin/db_v20/mpa_v20_m200/mpa_v20_m200
Removing uncompress database /home/adinasarapu/anaconda3/envs/ddocent_env/bin/db_v20/mpa_v20_m200/mpa_v20_m200.fna


Traceback (most recent call last):

File "/home/adinasarapu/anaconda3/envs/ddocent_env/bin/metaphlan2.py", line 1564, in <module>
metaphlan2()
File "/home/adinasarapu/anaconda3/envs/ddocent_env/bin/metaphlan2.py", line 1357, in metaphlan2
check_and_install_database(pars['index'], pars['bowtie2db'], pars['bowtie2_build'], pars['nproc'])
File "/home/adinasarapu/anaconda3/envs/ddocent_env/bin/metaphlan2.py", line 826, in check_and_install_database
download_unpack_tar(DATABASE_DOWNLOAD, index, bowtie2_db, bowtie2_build, nproc)
File "/home/adinasarapu/anaconda3/envs/ddocent_env/bin/metaphlan2.py", line 814, in download_unpack_tar
os.remove(fna_file)
OSError: [Errno 2] No such file or directory: '/home/adinasarapu/anaconda3/envs/ddocent_env/bin/db_v20/mpa_v20_m200/mpa_v20_m200.fna'

Francesco Asnicar

Jul 16, 2018, 5:23:46 AM
to Ashok Dinasarapu, MetaPhlAn-users, Lauren McIver
Hi,

Thanks for reporting this. I'm CCing Lauren, who can provide further help since the error is happening within HUMAnN2.

In the meantime, it would be very useful if you could provide the full command line and the versions of the tools (HUMAnN2, MetaPhlAn2, etc.).

Additionally, if you manually specified the "--bowtie2db" and "--mpa_pkl" MetaPhlAn2 options in HUMAnN2, you can try re-running HUMAnN2 without those options (the latest MetaPhlAn2 can handle them automatically). If you haven't specified them, then we should wait for some input from Lauren.


Many thanks,
Francesco

ARD

Jul 16, 2018, 8:01:31 AM
to MetaPhlAn-users
Thank you Francesco.

It's working now with the following command.

Setting the bowtie2db path and the number of cores (#CORES) fixed my problem.

humann2 \
--threads $CORES \
--input $TMP_DIR/${SID}_non_rRNA.fastq \
--metaphlan-options='--input_type fastq --mpa_pkl /path/to/db_v20/mpa_v20_m200.pkl --bowtie2db /path/to/db_v20' \
--output $TMP_DIR/${SID}_humann2

Thank you,
ARD

Tianqi

Jun 19, 2019, 8:43:25 AM
to MetaPhlAn-users
Hi Francesco,

I followed your instructions to solve the same bowtie2 problem that Charlotte had, and it worked.

Now I am also having a problem with read_fastx.py, after I used the command line: metaphlan2.py SRS014476-Supragingival_plaque.fasta.gz --input_type fasta > SRS014476-Supragingival_plaque_profile.txt --nproc 5.

The detailed error is shown in the figure.

I installed MetaPhlAn2 from the zip file provided by bioBakery. My version is MetaPhlAn version 2.7.7 (31 May 2018).

Thank you very much.
Tianqi

[Attachment: error 1.JPG]

Francesco Beghini

Jul 3, 2019, 10:07:09 AM
to Tianqi, MetaPhlAn-users
Hi Tianqi,
From the command you sent, it seems that you put the --nproc parameter after the output redirect. Here is the corrected command:
metaphlan2.py SRS014476-Supragingival_plaque.fasta.gz --input_type fasta --nproc 5 > SRS014476-Supragingival_plaque_profile.txt

Also, from the screenshot you attached, it seems that the input file is corrupted. Could you inspect it with zcat?
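
If zcat is not at hand, a similar integrity check can be sketched in Python with the standard gzip module. This is an illustration, not part of MetaPhlAn, and gzip_is_readable is a hypothetical helper name. It decompresses the whole file in chunks and reports whether that succeeds.

```python
import gzip
import zlib


def gzip_is_readable(path, chunk_size=1 << 20):
    """Return True if the gzip file at path decompresses completely, else False."""
    try:
        with gzip.open(path, "rb") as handle:
            # Read to the end so truncation or checksum errors are triggered.
            while handle.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        # OSError covers missing files and invalid gzip data,
        # EOFError a truncated stream, zlib.error a corrupt deflate stream.
        return False
    return True
```

A call such as gzip_is_readable("SRS014476-Supragingival_plaque.fasta.gz") returning False would suggest re-downloading the file before running MetaPhlAn again.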

Best,
Francesco


Francesco Beghini
PhD Student - Laboratory of Computational Metagenomics
Department of Cellular, Computational and Integrative Biology - CIBIO
University of Trento


To view this discussion on the web visit https://groups.google.com/d/msgid/metaphlan-users/b2c953b4-5c44-40bc-bdeb-fd6be6cb6ffd%40googlegroups.com.