anvi-import-collections from MaxBin results


Nastassia Patin

Feb 22, 2017, 9:27:35 AM
to Anvi'o
Hi there,

I have just installed Anvi'o and so far I think it is running smoothly! I am working with one metagenome sample to start with, and I would like to compare the binning results to those from MaxBin using anvi-import-collection. I feel silly asking this, but what is the output text file that I am meant to import? MaxBin gave me three bins in FASTA format. Do I need to convert those to text files and merge them before importing?

Also, unrelated question - if I just have one metagenome sample, do I still need to do the anvi-merge step? 

Thanks,
Nastassia

A. Murat Eren

Feb 22, 2017, 10:21:14 AM
to an...@googlegroups.com
Hi Nastassia,

No, you don't need the anvi-merge step if you have one sample.

Your question about binning and the format of the input file (with an example file) is explained here:


Best,

--

A. Murat Eren (meren)
http://merenlab.org :: gpg

--
Anvi'o Paper: https://peerj.com/articles/1319/
Project Page: http://merenlab.org/projects/anvio/
Code Repository: https://github.com/meren/anvio
---
You received this message because you are subscribed to the Google Groups "Anvi'o" group.
To unsubscribe from this group and stop receiving emails from it, send an email to anvio+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/anvio/9d8b5735-db32-4611-9d29-b7401fc5341f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Nastassia Patin

Feb 22, 2017, 12:27:16 PM
to Anvi'o
Thanks for your quick response! From what I can tell, the input file maps each contig to its designated bin. This isn't entirely clear from the tutorial, and MaxBin doesn't produce such a file automatically, so it might be worth mentioning in the tutorial.

Unfortunately, I'm getting an error when I try to import it with the following command:

anvi-import-collection Maxbins_for_anvio.txt -p anvio_profile/PROFILE.db -c 03_CONTIGS/contigs.db --contigs-mode -C collections --bins-info bin-info-anvio.txt

File/Path Error: The expected number of fileds for 'Maxbins_for_anvio.txt' is 2. Yet, it has 3 of them :/


There are indeed three bins from MaxBin, why would the expected number be two?

-Nastassia


A. Murat Eren

Feb 22, 2017, 1:47:20 PM
to an...@googlegroups.com
Hi Nastassia,

On Wed, Feb 22, 2017 at 11:27 AM, Nastassia Patin <n.v....@gmail.com> wrote:
File/Path Error: The expected number of fileds for 'Maxbins_for_anvio.txt' is 2. Yet, it has 3 of them :/

There are indeed three bins from MaxBin, why would the expected number be two?

The error means that the file `Maxbins_for_anvio.txt` must have exactly two columns, but yours has three (I will fix the error message so it is clearer).

The input file must be a TAB-delimited file with 2 columns. Each line should contain the contig name, and the bin name the contig is affiliated with.
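
A quick way to check the two-column requirement before importing is sketched below. It assumes a shell with awk available; the function name check_collection_file is made up for illustration and is not part of anvi'o.

```bash
#!/bin/bash
# Sketch: report every line that does not have exactly two TAB-separated
# fields. check_collection_file is an illustrative name, not an anvi'o tool.
check_collection_file() {
    awk -F'\t' '
        NF != 2 { printf "line %d has %d fields\n", NR, NF; bad = 1 }
        END     { exit bad }
    ' "$1"
}
```

Calling check_collection_file Maxbins_for_anvio.txt prints one line per offending row and returns a non-zero exit status if any row is malformed, so it can be used as a guard before running the import.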

I am not sure how to improve the tutorial, but I will try :)

Nastassia Patin

Feb 22, 2017, 2:07:43 PM
to Anvi'o
Ahh thanks so much, there was an extra tab between the two columns in that file. 
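
If stray extra TABs are the only problem, they can be collapsed with tr -s, as in this sketch. The function name and its arguments are illustrative. It assumes no field is meant to be empty, which holds for a two-column contig/bin file.

```bash
#!/bin/bash
# Sketch: squeeze each run of consecutive TABs into a single TAB.
# squeeze_tabs and its arguments are illustrative names.
squeeze_tabs() {
    tr -s '\t' < "$1" > "$2"
}
```

For example, squeeze_tabs Maxbins_for_anvio.txt Maxbins_for_anvio.fixed.txt writes a cleaned copy and leaves the original file untouched.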

I'm not trying to give you more work! Maybe most people won't have trouble with this step but I was expecting to input actual bins in fasta format instead of just the mapping information. 

Thanks again for your quick response, it's much appreciated!

-Nastassia

Nastassia Patin

Feb 22, 2017, 2:46:35 PM
to an...@googlegroups.com
Sorry Meren. I now see where the tutorial specifically describes the input file necessary for the collection. I got excited because I was near the end and must have gotten lazy with reading! For the record it's a phenomenal tutorial that even took a relative metagenomics newbie like me through almost everything without a hitch.

Thanks again!

-Nastassia




--

Dr. Nastassia Patin

Postdoctoral Researcher

School of Biology

Georgia Institute of Technology

A. Murat Eren

Feb 22, 2017, 2:56:00 PM
to an...@googlegroups.com
Hi Nastassia,

Thank you. There is absolutely no need for you to be sorry :) I am very glad the tutorial has been helpful so far, and I hope it will continue to give you insights into the inner workings of anvi'o.

It is important for us to make sure things work without much hassle (regardless of how complex they may be), and we try not to cut corners in our writing at the expense of clarity. So please continue to let us know if you run into issues, and we will try to fix them.


Best,


Bruno Gomez-Gil

May 11, 2019, 10:13:10 AM
to Anvi'o
Here are my 2 cents.
I ran into the same problem. MaxBin produces bins in FASTA format, with the name of each contig in the FASTA sequence header. So we only need to extract the contig names from each bin, make a list of them, and add the bin name in a second column. The bin name can be obtained from the FASTA filename, so a simple bash script can build this from all FASTA files in a directory:
#!/bin/bash
# A simple script to convert maxbin results to anvio
FILES=$(find *.fasta)
for f in $FILES; do
 NAME=$(basename $f .fasta)
 grep ">" $f | sed 's/>//' | sed -e "s/$/\t$NAME/" | sed 's/\./_/' >> maxbins4anvio.tsv
done

Place this script in the directory with all the FASTA files, and only those files, and execute it. I named it maxbin2anvio.sh. The resulting file, maxbins4anvio.tsv, has the contigs in the first column and the bin in the second. Because the script appends to the output file, delete any earlier maxbins4anvio.tsv before running it again:

contig_55       maxbin_001
contig_80       maxbin_001
contig_110      maxbin_001
contig_171      maxbin_001
...
contig_25030    maxbin_004
contig_25841    maxbin_004
contig_26157    maxbin_004
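
The same idea can be written a little more defensively, as in the sketch below. It is not the script above, only a variant under stated assumptions: the function name maxbin_to_collection and its two arguments (bin directory, output file) are made up for illustration, the output file is truncated on each run instead of appended to, and dots are replaced only in the bin name, so contig names are written exactly as they appear in the FASTA headers.

```bash
#!/bin/bash
# Sketch: build a two-column (contig <TAB> bin) file from a directory of
# MaxBin FASTA bins. Function and argument names are illustrative.
maxbin_to_collection() {
    bin_dir="$1"
    out_file="$2"
    : > "$out_file"                  # start from an empty output file
    for f in "$bin_dir"/*.fasta; do
        [ -e "$f" ] || continue      # no .fasta files: nothing to do
        # Bin name from the file name, with dots replaced by underscores
        name=$(basename "$f" .fasta | tr '.' '_')
        # Header lines start with ">"; strip it and append TAB + bin name
        awk -v bin="$name" '/^>/ { sub(/^>/, ""); print $0 "\t" bin }' "$f" >> "$out_file"
    done
}
```

For example, maxbin_to_collection . maxbins4anvio.tsv processes the FASTA files in the current directory. Quoting "$f" keeps file names with spaces intact, and matching only lines that begin with ">" avoids picking up the character elsewhere in a file.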



