I have about 50 metagenomes that were each assembled separately into contigs. I'd like to carry out a binning workflow similar to the one described in this post, except without the co-assembly. To do this, it seems I need to concatenate the per-sample contig FASTA files into one file (to build a single contigs database), then map each sample's reads to the concatenated contigs. After that, I'd follow the workflow described in the blog post (merging each sample's profile, then clustering contigs).
However, this 2016 thread mentions that when combining contig files, "multiple scaffolds might overlap (i.e., same part of a genome). It would dilute mapping signal and render the analysis noisy and difficult." Does anvi-cluster-contigs essentially fix this problem? Or could something like anvi-dereplicate-genomes be used to remove near-duplicate contigs?
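
For reference, here is a minimal sketch of the concatenation step I have in mind. It is plain Python, not an anvi'o command, and the function name `concatenate_contigs`, the sample names, and the file paths are all hypothetical. It renames every contig to `<sample>_<number>` so that names stay unique across samples and each contig can be traced back to the assembly it came from. It does nothing about overlapping or near-duplicate contigs, which is the part I'm asking about.

```python
import re


def concatenate_contigs(sample_fastas, out_path):
    """Concatenate per-sample FASTA files into one file.

    sample_fastas: dict mapping a sample name to its contigs FASTA path
                   (hypothetical names, for illustration only).
    Headers are rewritten as <sample>_<9-digit index>; the original
    headers are discarded. Returns the number of contigs written.
    """
    written = 0
    seen_prefixes = set()
    with open(out_path, "w") as out:
        for sample, fasta in sample_fastas.items():
            # Keep only letters, digits, and underscores in the prefix.
            prefix = re.sub(r"[^A-Za-z0-9_]", "_", sample)
            if prefix in seen_prefixes:
                raise ValueError(f"duplicate sample prefix: {prefix}")
            seen_prefixes.add(prefix)

            index = 0
            with open(fasta) as handle:
                for line in handle:
                    if line.startswith(">"):
                        index += 1
                        out.write(f">{prefix}_{index:09d}\n")
                        written += 1
                    elif line.strip():
                        # Copy sequence lines, ensuring a trailing newline.
                        out.write(line if line.endswith("\n") else line + "\n")
    return written


# Example call with made-up paths:
# concatenate_contigs({"sample_01": "sample_01/contigs.fa",
#                      "sample_02": "sample_02/contigs.fa"},
#                     "all_contigs.fa")
```

Each sample's reads would then be mapped against the resulting single FASTA file, which is also the file the contigs database would be built from.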