Error when running SnakeMake Contigs workflow

Dennis Chan

Sep 2, 2020, 11:59:42 AM
to Anvi'o
Hi Anvio Team,

I am going through the snakemake workflows tutorial. Being but a novice in bioinformatics, I have encountered some errors that are beyond my troubleshooting skills.

My first errors were fixable: I got errors saying that the config-contigs.json file included in the mock data was lacking a "workflow_name" parameter, so I added it ("workflow_name": "contigs") and then ran anvi-migrate to update the file to the correct version.
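For anyone repeating this step, the edit can also be scripted. The sketch below is a minimal example, assuming the config is plain JSON with "workflow_name" as a top-level key, as described above. The helper names `ensure_workflow_name` and `patch_config_file` are made up for illustration and are not part of anvi'o.

```python
import json


def ensure_workflow_name(config, workflow_name="contigs"):
    """Return a copy of the config dict with a top-level "workflow_name" key.

    An existing value is left untouched; the key is only added when missing.
    """
    updated = dict(config)
    updated.setdefault("workflow_name", workflow_name)
    return updated


def patch_config_file(path, workflow_name="contigs"):
    """Hypothetical helper: rewrite a JSON config file in place."""
    with open(path) as handle:
        config = json.load(handle)
    with open(path, "w") as handle:
        json.dump(ensure_workflow_name(config, workflow_name), handle, indent=4)
        handle.write("\n")
```

Calling `patch_config_file("config-contigs.json")` would add the missing key. Running anvi-migrate afterwards is still a separate step.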

Then when I try to run the Contigs workflow with anvi-run-workflow using the abovementioned config file I get this: 

anvi-run-workflow -w contigs -c config-contigs.json

WARNING
===============================================
If you publish results from this workflow, please do not forget to cite
snakemake (doi:10.1093/bioinformatics/bts480)

WARNING
===============================================
We are initiating parameters for the contigs workflow

WARNING
===============================================
We are initiating parameters for the contigs workflow

Shell programs for the workflow
===============================================
Needed .......................................: gunzip, anvi-script-reformat-fasta, anvi-script-reformat-fasta, anvi-gen-contigs-database, anvi-import-functions, anvi-get-sequences-for-gene-calls, centrifuge, anvi-import-taxonomy-for-genes, anvi-run-hmms, anvi-run-pfams, anvi-run-ncbi-cogs, anvi-run-scg-taxonomy, anvi-scan-trnas, anvi-get-sequences-for-gene-calls
Missing ......................................: None

Building DAG of jobs...
InputFunctionException in line 178 of /home/dennistcc/miniconda3/envs/anvio-6.2/lib/python3.6/site-packages/anvio/workflows/contigs/Snakefile:
KeyError: '{group}'
Wildcards:
group={group}

I have done the one-time runs of anvi-setup-ncbi-cogs and anvi-setup-scg-databases, although one weird thing with anvi-setup-scg-databases is that it says the database is successfully downloaded, but it still ends with this error message: 

(anvio-6.2) anvi-setup-scg-databases --reset

WARNING
===============================================
The existing directory for SCG taxonomy data dir has been removed. Just so you
know.

WARNING
===============================================
Please remember that the data anvi'o uses for SCG taxonomy is a courtesy of The
Genome Taxonomy Database (GTDB), an initiative to establish a standardised
microbial taxonomy based on genome phylogeny, primarly funded by tax payers in
Australia. Please don't forget to cite the original work, doi:10.1038/nbt.4229
by Parks et al to explicitly mention the source of databases anvi'o relies upon
to estimate genome level taxonomy. If you are not sure how it should look like
in your methods sections, anvi'o developers will be happy to help you if you
can't find any published example to get inspiration.

Local directory to setup .....................: /home/dennistcc/miniconda3/envs/anvio-6.2/lib/python3.6/site-packages/anvio/data/misc/SCG_TAXONOMY/GTDB
Reset the directory first ....................: True

Remote database ..............................: GTDB
Remote URL to download files .................: https://data.ace.uq.edu.au/public/gtdb/data/releases/latest/
Remote files of interest .....................: VERSION, ar122_msa_individual_genes.tar.gz, ar122_taxonomy.tsv, bac120_msa_individual_genes.tar.gz, bac120_taxonomy.tsv
GTDB release found ...........................: v95 (Released July 17, 2020)
Downloaded succesfully .......................: /home/dennistcc/miniconda3/envs/anvio-6.2/lib/python3.6/site-packages/anvio/data/misc/SCG_TAXONOMY/GTDB/VERSION

Config Error: Something went wrong with your download attempt. Here is the problem: 'HTTP
              Error 404: Not Found'

Could it be related? Any help would be greatly appreciated. 
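One way to narrow down a 404 like this is to probe each remote file separately. The sketch below is only a diagnostic idea, not how anvi'o downloads its data: the base URL and file names are copied from the output above, and the helper names `build_urls`, `probe`, and `report` are hypothetical.

```python
import urllib.error
import urllib.request
from urllib.parse import urljoin

# Values copied from the anvi-setup-scg-databases output above.
BASE_URL = "https://data.ace.uq.edu.au/public/gtdb/data/releases/latest/"
REMOTE_FILES = [
    "VERSION",
    "ar122_msa_individual_genes.tar.gz",
    "ar122_taxonomy.tsv",
    "bac120_msa_individual_genes.tar.gz",
    "bac120_taxonomy.tsv",
]


def build_urls(base_url, file_names):
    """Join the base URL with each file name."""
    return [urljoin(base_url, name) for name in file_names]


def probe(url, timeout=30):
    """Return the HTTP status code of a HEAD request to `url`.

    HTTP errors such as 404 are returned as codes; network failures
    (urllib.error.URLError) are not caught and will propagate.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code


def report(base_url=BASE_URL, file_names=REMOTE_FILES):
    """Print one status code per remote file."""
    for url in build_urls(base_url, file_names):
        print(probe(url), url)
```

Calling `report()` prints a status code next to each URL, which would show whether a single file is missing on the server or the whole location is unreachable.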

Sincerely, Dennis (an overwhelmed bioinformatics padawan)

kiefl...@gmail.com

Sep 2, 2020, 12:42:48 PM
to Anvi'o
Hi Dennis,

I think the first error is due to an outdated snakemake version. You can check your snakemake version with

```
snakemake --version
```

You should be aiming for `5.10.0`.
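If you would rather check this from a script, here is a minimal sketch. It treats `5.10.0` as a minimum, which is an assumption (the advice above only says to aim for that version), and the function names are made up for illustration. Versions are compared as integer tuples because plain string comparison would rank "5.9.0" above "5.10.0".

```python
import re
import subprocess


def parse_version(text):
    """Turn a string such as '5.10.0' into a tuple of ints, e.g. (5, 10, 0)."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no x.y.z version found in {text!r}")
    return tuple(int(part) for part in match.groups())


def is_at_least(version_text, minimum_text="5.10.0"):
    """Compare versions numerically, component by component."""
    return parse_version(version_text) >= parse_version(minimum_text)


def installed_snakemake_version():
    """Return the output of `snakemake --version` from the active environment."""
    result = subprocess.run(
        ["snakemake", "--version"],
        stdout=subprocess.PIPE,
        universal_newlines=True,  # works on Python 3.6, unlike text=True
        check=True,
    )
    return result.stdout.strip()
```

With snakemake on your PATH, `is_at_least(installed_snakemake_version())` tells you whether the active environment meets that threshold.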

I have less advice for the second error. It sounds like either your internet connection became unstable or anvi'o is trying to access a dead link, but this is just conjecture. Good luck!

Dennis Chan

Sep 2, 2020, 4:07:25 PM
to Anvi'o
I see! I have tried updating my snakemake to the latest version, or at least to 5.10.0 as you recommend, but I keep failing. I have tried the following:
  • conda install snakemake=5.10.0 gets stuck at "solving environment".
  • Installing it through mamba, as recommended in the Snakemake documentation, installs it in a sub-environment of my anvio-6.2 environment. I then ran anvi-run-workflow -w contigs -c config-contigs.json again, but it gave me the same error as in my first post.
  • My latest attempt was conda update snakemake, but then I somehow ended up with version 4.3.0.
Any suggestions?
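When an install lands in a different environment, the old executable is usually still the one found on PATH. The sketch below shows one way to check which snakemake the active environment would run. It assumes a Unix-like layout where an environment keeps its executables in `<prefix>/bin`, and the helper names are hypothetical.

```python
import os
import shutil
import sys


def is_env_executable(path, prefix):
    """True if `path` sits directly in `<prefix>/bin` (Unix-like layout assumed)."""
    bin_dir = os.path.join(os.path.abspath(prefix), "bin")
    return os.path.dirname(os.path.abspath(path)) == bin_dir


def describe_snakemake(prefix=sys.prefix):
    """Report where the snakemake found on PATH lives relative to `prefix`."""
    found = shutil.which("snakemake")
    if found is None:
        return "snakemake is not on PATH"
    if is_env_executable(found, prefix):
        return f"{found} belongs to the active environment ({prefix})"
    return f"{found} is NOT in the bin directory of {prefix}"
```

Running `print(describe_snakemake())` with the Python of the activated anvio-6.2 environment shows whether the snakemake that anvi-run-workflow would pick up lives in that environment's own bin directory.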

Evan Kiefl

Sep 2, 2020, 4:34:46 PM
to an...@googlegroups.com
It sounds like you are on the right track. https://bioconda.github.io/recipes/snakemake/README.html suggests adding the bioconda channel to your conda before `conda install` or `conda update`. Regardless, when you think you've got it right, check with `snakemake --version`. 


Dennis Chan

Sep 2, 2020, 7:32:54 PM
to Anvi'o
Hi again! So I restarted my Ubuntu Terminal and reordered the priority of my channels according to the conda documentation:

conda config --add channels defaults 
conda config --add channels bioconda 
conda config --add channels conda-forge    

Then I retried a number of installation options. I can't quite remember which ones I did or in what order (I took a break out of frustration), but I ended up with snakemake 5.23.0. Then I did one final run of conda install snakemake=5.10.0, which worked this time (it did not get stuck at "solving environment"), but my snakemake was still at version 5.23.0. I figured, oh well.

I then encountered this error: 

anvi-run-scg-taxonomy -c 02_CONTIGS_contigs_workflow/G02-contigs.db -T 6  >> 00_LOGS_contigs_workflow/G02-anvi_run_scg_taxonomy.log 2>&1
[Wed Sep  2 22:48:19 2020]
Error in rule anvi_run_scg_taxonomy:
    jobid: 11
    output: 02_CONTIGS_contigs_workflow/anvi_run_scg_taxonomy-G02.done
    log: 00_LOGS_contigs_workflow/G02-anvi_run_scg_taxonomy.log (check log file(s) for error message)
    shell:
        anvi-run-scg-taxonomy -c 02_CONTIGS_contigs_workflow/G02-contigs.db -T 6  >> 00_LOGS_contigs_workflow/G02-anvi_run_scg_taxonomy.log 2>&1
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /home/dennistcc/pan/p_fl/anvio_tutor/WORKFLOW_TUTORIAL_DATA/.snakemake/log/2020-09-02T224712.256292.snakemake.log

But I "fixed" it by changing the "run" parameter of the anvi-run-scg-taxonomy rule to "false", and then I got nothing but green, happy messages. I suppose there is something wrong with my SCG database?
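For readers who want to apply the same workaround, here is a minimal sketch of that config change. It assumes the rule is keyed in the JSON config by the rule name shown in the log (`anvi_run_scg_taxonomy`) and that "run" is a JSON boolean; the helper names `set_rule_run` and `render` are made up for illustration. Note that this only skips the step; it does not repair the SCG database.

```python
import json


def set_rule_run(config, rule_name, run):
    """Return a copy of the config with `rule_name`'s "run" flag set.

    Other settings of the rule are preserved; the input dict is not modified.
    """
    updated = dict(config)
    rule_settings = dict(updated.get(rule_name, {}))
    rule_settings["run"] = run
    updated[rule_name] = rule_settings
    return updated


def render(config):
    """Serialize the config; Python's False becomes JSON's false."""
    return json.dumps(config, indent=4)
```

For example, `set_rule_run(config, "anvi_run_scg_taxonomy", False)` returns a config in which that rule is switched off, ready to be written back to config-contigs.json.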

Ana

Sep 23, 2020, 6:28:39 PM
to Anvi'o
Thanks for this post and for providing the solution.
I also got the same error as soon as the first anvi-run-scg-taxonomy job started to run.

In addition, I got the warning shown below as soon as I restarted the program with anvi-run-workflow -w contigs -c config-contigs.json --additional-params --jobs 6 --resources nodes=6 (as suggested in http://merenlab.org/2019/03/14/ncbi-genome-download-magic/). I hope the job finishes this time ;).

Thanks!

WARNING
===============================================
OK, SO THIS IS SERIOUS, AND WHEN THINGS ARE SERIOUS THEN WE USE CAPS. WE SEE
THAT YOU ARE USING --additional-params AND THAT'S GREAT, BUT WE WANT TO REMIND
YOU THAT ANYTHING THAT FOLLOWS --additional-params WILL BE CONSIDERED AS A
snakemake PARAM THAT IS TRANSFERRED TO snakemake DIRECTLY. So make sure that
these don't include anything that you didn't mean to include as an additional
param: --jobs, 6, --resources, nodes=6.
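The behaviour this warning describes can be illustrated with a short sketch. This is not anvi'o's actual argument handling, only an illustration of the rule that everything after `--additional-params` is handed on to snakemake; the function name `split_additional_params` is made up.

```python
def split_additional_params(argv, marker="--additional-params"):
    """Split a command line into (own arguments, pass-through arguments).

    Everything after the first occurrence of `marker` is treated as a
    pass-through parameter, whatever it looks like.
    """
    if marker not in argv:
        return list(argv), []
    index = argv.index(marker)
    return list(argv[:index]), list(argv[index + 1:])


command = [
    "-w", "contigs", "-c", "config-contigs.json",
    "--additional-params", "--jobs", "6", "--resources", "nodes=6",
]
own, passed_through = split_additional_params(command)
print(passed_through)
```

In this sketch the pass-through list matches the four items the warning echoes back (`--jobs`, `6`, `--resources`, `nodes=6`), so any workflow option placed after `--additional-params` would be handed to snakemake too.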