anvi-pan-genome bad alignment with Muscle

jayos...@gmail.com

Jun 14, 2023, 8:28:30 AM
to Anvi'o
I am running the following command (error below):
anvi-pan-genome -g CHLAM-GENOMES.db --project-name "TEST" --output-dir TEST_pangenome --num-threads 32 --mcl-inflation 5 --enforce-hierarchical-clustering --min-occurrence 3 --debug

On this version of anvi'o:

Anvi'o .......................................: hope (v7.1)

Profile database .............................: 38
Contigs database .............................: 20
Pan database .................................: 15
Genome data storage ..........................: 7
Auxiliary data storage .......................: 2
Structure database ...........................: 2
Metabolic modules database ...................: 2
tRNA-seq database ............................: 2


I am running into the following types of errors when trying to calculate a pangenome:

WARNING
===============================================
VERY BAD NEWS. The alignment of sequences with 'Muscle' in the gene cluster
'GC_00001527' failed for some reason. Since the real answer to 'why' is too deep
in the matrix, there is no reliable solution for anvi'o to find it for you, BUT
THIS WILL AFFECT YOUR SCIENCE GOING FORWARD, SO YOU SHOULD CONSIDER ADDRESSING
THIS ISSUE FIRST. The 2 sequences in gene cluster GC_00001527 are stored in the
temporary file '/tmp/ANVIO_GC_GC_00001527vguwyc9w'

{
  "name": "GC_00001522",
  "entry": [
    {
      "gene_caller_id": 1528,
      "gene_cluster_id": "GC_00001522",
      "genome_name": "SIMNZ",
      "alignment_summary": ".|350"
    },
    {
      "gene_caller_id": 933,
      "gene_cluster_id": "GC_00001522",
      "genome_name": "PL25a_bin_130",
      "alignment_summary": ".|227"
    }
  ]
}

AND this error, which might be a result of the muscle error:

Process Process-66:
[14 Jun 23 14:21:02 Computing gene cluster homogeneity indices] Processed 85 gene clusters using 32 threads    ETA: ∞:∞:∞
Traceback (most recent call last):
  File "/home/apps/conda/miniconda3/envs/anvio-7.1/lib/python3.6/site-packages/anvio/dbops.py", line 1627, in homogeneity_worker
    funct_index, geo_index, combined_index = homogeneity_calculator.get_homogeneity_dicts(gene_cluster)
  File "/home/apps/conda/miniconda3/envs/anvio-7.1/lib/python3.6/site-packages/anvio/homogeneityindex.py", line 174, in get_homogeneity_dicts
    fun = self.compute_functional_index(cluster_sequences)
  File "/home/apps/conda/miniconda3/envs/anvio-7.1/lib/python3.6/site-packages/anvio/homogeneityindex.py", line 50, in compute_functional_index
    residues.append(gene_cluster_sequences[gene_sequence][residue_number])
IndexError: string index out of range

During handling of the above exception, another exception occurred:

[14 Jun 23 14:21:02 Computing gene cluster homogeneity indices] Processed 86 gene clusters using 32 threads    ETA: ∞:∞:∞
Traceback (most recent call last):
  File "/home/apps/conda/miniconda3/envs/anvio-7.1/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/home/apps/conda/miniconda3/envs/anvio-7.1/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/home/apps/conda/miniconda3/envs/anvio-7.1/lib/python3.6/site-packages/anvio/dbops.py", line 1638, in homogeneity_worker
    combined_index[gene_cluster_name] = -1
UnboundLocalError: local variable 'combined_index' referenced before assignment

Is there a way to fix this?

Thanks!

A. Murat Eren (Meren)

Jun 26, 2023, 5:27:34 AM
to an...@googlegroups.com
Hey Jay,

Sorry for the very late response! I was on a break, and I'm not sure if this is still a problem, but I will respond to it anyway since it is still in my inbox :)

The way to fix this is to try anvio-dev. I remember a bug in v7.1 that made anvi'o crash while calculating homogeneity indices when an earlier alignment had failed. I think we have already fixed it in the active development branch.
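
For anyone trying to see why a failed alignment can crash a later step, here is a minimal sketch. It is not anvi'o code, and every function and variable name in it is hypothetical. It assumes only what the output above suggests: the two sequences in the cluster have different lengths (350 and 227), and the homogeneity step indexes every sequence at every column of the first one.

```python
# Hypothetical illustration; none of these names come from anvi'o itself.

def column_residues(sequences):
    """Collect residues column by column, assuming every sequence is as
    long as the first one (true only for a successful alignment)."""
    names = list(sequences)
    length = len(sequences[names[0]])
    columns = []
    for position in range(length):
        # Raises IndexError as soon as a shorter sequence runs out.
        columns.append([sequences[name][position] for name in names])
    return columns


def is_aligned(sequences):
    """True when all sequences share one length, as aligned sequences do."""
    return len({len(seq) for seq in sequences.values()}) <= 1


def safe_index(sequences):
    """Return -1 as a placeholder instead of raising when alignment failed."""
    if not is_aligned(sequences):
        return -1
    return len(column_residues(sequences))


# Lengths taken from the alignment summaries in the original post.
unaligned = {"SIMNZ": "M" * 350, "PL25a_bin_130": "M" * 227}
aligned = {"a": "MK-L", "b": "MKAL"}

print(safe_index(unaligned))  # placeholder value, no exception
print(safe_index(aligned))    # number of alignment columns
```

Calling `column_residues(unaligned)` directly raises the same kind of `IndexError` as in the traceback. Checking lengths before indexing, as `safe_index` does, is one way to avoid it; whether the development branch takes this approach is not something this sketch claims.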


Best wishes,
--

A. Murat Eren (Meren) | he/him


--
Anvi'o Paper: https://peerj.com/articles/1319/
Project Page: http://merenlab.org/projects/anvio/
Code Repository: https://github.com/meren/anvio