Using QIIME (Python2) with Snakemake (Python3)


Daniel Hwang

Jul 21, 2016, 12:38:59 PM
to Qiime 1 Forum
Hi all,

I am wondering if anyone here has successfully used QIIME in their Snakemake workflows. As the documentation for each says, QIIME runs on Python2 (and not 3) while Snakemake runs on Python3 (and not 2).

I have a virtual environment for running Snakemake in Python3. I attempted to activate a Python2 virtual environment within Snakemake's shell command, but had no luck and received this error:

venv/bin/activate: line 57: PS1: unbound variable

This is the region of the virtualenv activation code the error above points to:

# unset PYTHONHOME if set
if ! [ -z "${PYTHONHOME+_}" ] ; then
    _OLD_VIRTUAL_PYTHONHOME="$PYTHONHOME"
    unset PYTHONHOME
fi
if [ -z "${VIRTUAL_ENV_DISABLE_PROMPT-}" ] ; then
    _OLD_VIRTUAL_PS1="$PS1"
    if [ "x" != x ] ; then
        PS1="$PS1"
    else
        PS1="(`basename \"$VIRTUAL_ENV\"`) $PS1"
    fi
    export PS1
fi




What I can't wrap my head around is:

* I can manually launch my virtual environment for Snakemake (in Python3)
* Then launch my virtual environment for QIIME (in Python2)
* Then I am able to successfully run QIIME

However, if I try to recreate it through a bash script, it seems that the python versions clash and I get errors.



Here is my Snakefile, which attempts to activate a Python2 virtual environment inside the rule and then execute QIIME commands:

configfile: "config.yaml"
#ALGO, = glob_wildcards("../pandatest/subset_test_fasta2/algo_used_{algo}.txt")
ALGO = ['ea_util']
#SAMP, = glob_wildcards("../test_fastq_subset/{sample}_R1_subset.fastq")
SAMP = ['2-D10_S142_L001']
rule cluster_uclust:
    input:
        'sample.fasta'
    output:
        expand("{algo}_uclust_otus/otu_table.biom", algo = ALGO),
        expand('{algo}_uclust_otus/pynast_aligned_seqs/{sample}_pandaseq_rep_set_aligned.fasta', algo = ALGO, sample = SAMP),
        expand('{algo}_uclust_otus/pynast_aligned_seqs/{sample}_pandaseq_rep_set_aligned_pfiltered.fasta', algo = ALGO, sample = SAMP),
        expand('{algo}_uclust_otus/pynast_aligned_seqs/{sample}_pandaseq_rep_set_failures.fasta', algo = ALGO, sample = SAMP),
        expand('{algo}_uclust_otus/pynast_aligned_seqs/{sample}_pandaseq_rep_set_log.txt', algo = ALGO, sample = SAMP),
        expand('{algo}_uclust_otus/rep_set/{sample}_pandaseq_rep_set.fasta', algo = ALGO, sample = SAMP),
        expand('{algo}_uclust_otus/rep_set/{sample}_pandaseq_rep_set.log', algo = ALGO, sample = SAMP),
        expand('{algo}_uclust_otus/rep_set.tre', algo = ALGO),
        expand('{algo}_uclust_otus/uclust_assigned_taxonomy/{sample}_pandaseq_rep_set_tax_assignments.log', algo = ALGO, sample = SAMP),
        expand('{algo}_uclust_otus/uclust_assigned_taxonomy/{sample}_pandaseq_rep_set_tax_assignments.txt', algo = ALGO, sample = SAMP),
        expand('{algo}_uclust_otus/uclust_picked_otus/{sample}_pandaseq_clusters.uc', algo = ALGO, sample = SAMP),
        expand('{algo}_uclust_otus/uclust_picked_otus/{sample}_pandaseq_otus.log', algo = ALGO, sample = SAMP),
        expand('{algo}_uclust_otus/uclust_picked_otus/{sample}_pandaseq_otus.txt', algo = ALGO, sample = SAMP)
    shell:
        """
        source venv/bin/activate
        pick_de_novo_otus.py -i {input} -o {output}_uclust_otus/
        """
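
A side note on the shell command above, separate from the Python clash: Snakemake expands {output} to every declared output path joined by spaces, so `-o {output}_uclust_otus/` most likely does not hand pick_de_novo_otus.py a single directory. Below is a minimal sketch of deriving the shared directory in plain Python, assuming all outputs sit under one top-level folder. The helper name common_output_dir is hypothetical and not part of Snakemake.

```python
import posixpath


def common_output_dir(outputs):
    """Return the single top-level directory shared by all output paths.

    Hypothetical helper: assumes relative, '/'-separated paths such as the
    ones produced by expand() in the Snakefile above.
    """
    top_levels = {posixpath.normpath(path).split("/")[0] for path in outputs}
    if len(top_levels) != 1:
        raise ValueError(
            "outputs do not share one top-level directory: %s" % sorted(top_levels)
        )
    return top_levels.pop()
```

In a Snakefile, a value like this could be exposed through a params entry and used as `-o {params.outdir}` in place of `{output}`.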



Here is the error for the snakefile above:



Error processing line 1 of /cbcb/sw/RedHat-7-x86_64/common/local/pythonext/2.7.9/lib/python2.7/site-packages/matplotlib-1.4.3-py2.7-nspkg.pth:

Failed to import the site module
Traceback (most recent call last):
  File "/cbcb/project2-scratch/nolson/miniconda2/envs/snakeenv/lib/python3.5/site.py", line 167, in addpackage
    exec(line)
  File "<string>", line 1, in <module>
  File "/cbcb/project2-scratch/nolson/miniconda2/envs/snakeenv/lib/python3.5/types.py", line 166, in <module>
    import functools as _functools
  File "/cbcb/project2-scratch/nolson/miniconda2/envs/snakeenv/lib/python3.5/functools.py", line 21, in <module>
    from collections import namedtuple
  File "/cbcb/project2-scratch/nolson/miniconda2/envs/snakeenv/lib/python3.5/collections/__init__.py", line 16, in <module>
    from reprlib import recursive_repr as _recursive_repr
  File "/cbcb/sw/RedHat-7-x86_64/common/local/pythonext/2.7.9/lib/python2.7/site-packages/reprlib/__init__.py", line 7, in <module>
    raise ImportError('This package should not be accessible on Python 3. '
ImportError: This package should not be accessible on Python 3. Either you are trying to run from the python-future src folder or your installation of python-future is corrupted.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/cbcb/sw/RedHat-7-x86_64/common/local/pythonext/2.7.9/lib/python2.7/site-packages/site.py", line 74, in <module>
    __boot()
  File "/cbcb/sw/RedHat-7-x86_64/common/local/pythonext/2.7.9/lib/python2.7/site-packages/site.py", line 49, in __boot
    addsitedir(item)
  File "/cbcb/project2-scratch/nolson/miniconda2/envs/snakeenv/lib/python3.5/site.py", line 206, in addsitedir
    addpackage(sitedir, name, known_paths)
  File "/cbcb/project2-scratch/nolson/miniconda2/envs/snakeenv/lib/python3.5/site.py", line 177, in addpackage
    import traceback
  File "/cbcb/project2-scratch/nolson/miniconda2/envs/snakeenv/lib/python3.5/traceback.py", line 3, in <module>
    import collections
  File "/cbcb/project2-scratch/nolson/miniconda2/envs/snakeenv/lib/python3.5/collections/__init__.py", line 16, in <module>
    from reprlib import recursive_repr as _recursive_repr
  File "/cbcb/sw/RedHat-7-x86_64/common/local/pythonext/2.7.9/lib/python2.7/site-packages/reprlib/__init__.py", line 7, in <module>
    raise ImportError('This package should not be accessible on Python 3. '
ImportError: This package should not be accessible on Python 3. Either you are trying to run from the python-future src folder or your installation of python-future is corrupted.




Any help would be appreciated.

Justine Debelius

Jul 22, 2016, 1:11:25 PM
to Qiime 1 Forum
Hi Daniel,

This seems like more of a snakemake problem. Perhaps their devs might be better able to help you?

Thanks,
Justine

Daniel Hwang

Jul 22, 2016, 1:56:44 PM
to Qiime 1 Forum
Hi Justine,

I posted in the snakemake forum as well. I just thought someone here might have used QIIME with snakemake and run into a similar problem.

To wrap some things up in case anyone else comes across an issue similar to mine: I was able to resolve the PS1 unbound variable issue.
When activating a virtual environment inside of snakemake, the virtualenv activate script violates the `bash strict mode` that snakemake uses for shell commands (see the Snakemake FAQ).
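
For readers hitting the same PS1 error, here is a small sketch of why it happens and of the usual workaround, which is to relax `nounset` only around the `source` line (`set +u; source venv/bin/activate; set -u`). It does not use a real virtualenv: ACTIVATE_STUB is a hypothetical stand-in that reads `$PS1` the way the activate snippet quoted earlier does, and run_strict and demo are illustrative names.

```python
import os
import subprocess
import tempfile

# Hypothetical stand-in for venv/bin/activate: like the real script, it reads $PS1.
ACTIVATE_STUB = '_OLD_VIRTUAL_PS1="$PS1"\nPS1="(venv) $PS1"\nexport PS1\n'


def run_strict(script_path, relax_nounset):
    """Source script_path under bash strict mode, optionally relaxing nounset."""
    source = 'source "%s"' % script_path
    if relax_nounset:
        # Turn off "unbound variable" errors only while sourcing the script.
        source = "set +u; %s; set -u" % source
    command = "set -euo pipefail; %s; echo activated" % source
    # Drop PS1 so the shell looks like a non-interactive batch shell.
    env = {key: value for key, value in os.environ.items() if key != "PS1"}
    return subprocess.run(
        ["bash", "-c", command], env=env, capture_output=True, text=True
    )


def demo():
    """Return (strict_result, relaxed_result) for the stub activate script."""
    with tempfile.NamedTemporaryFile("w", suffix=".sh", delete=False) as handle:
        handle.write(ACTIVATE_STUB)
    try:
        return run_strict(handle.name, False), run_strict(handle.name, True)
    finally:
        os.unlink(handle.name)
```

The strict run should fail on the first `$PS1` reference, while the relaxed run reaches the final `echo`. Since the quoted activate code only touches PS1 when VIRTUAL_ENV_DISABLE_PROMPT is empty, exporting that variable before sourcing is another option that may work.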

The other errors, from the Python versions clashing between Snakemake (python3) and QIIME (python2), still persist. I think QIIME tries to run its scripts through python3, which is why the errors pop up. They happen when I use a workflow program like snakemake to streamline a lot of runs, or a shell script file to run many commands in sequence.
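
The traceback earlier in the thread shows the Python 3.5 interpreter from snakeenv importing modules from a `python2.7/site-packages` directory. One plausible cause, which is an assumption and not confirmed anywhere in this thread, is a PYTHONPATH that points at the Python 2.7 directory and is inherited by every interpreter. A rough diagnostic sketch follows; foreign_site_dirs is a hypothetical helper name.

```python
import os
import re
import sys

# Matches path components such as "python2.7" or "python3.5".
_VERSION_IN_PATH = re.compile(r"python(\d)\.(\d+)")


def foreign_site_dirs(pythonpath, running=None):
    """List PYTHONPATH entries that name a different Python version.

    pythonpath: the raw PYTHONPATH string (entries joined by os.pathsep).
    running: (major, minor) of the interpreter to check against; defaults
    to the interpreter executing this function.
    """
    if running is None:
        running = sys.version_info[:2]
    flagged = []
    for entry in pythonpath.split(os.pathsep):
        match = _VERSION_IN_PATH.search(entry)
        if match and (int(match.group(1)), int(match.group(2))) != tuple(running):
            flagged.append(entry)
    return flagged
```

Calling `foreign_site_dirs(os.environ.get("PYTHONPATH", ""))` from inside the Snakemake environment, both on the head node and inside a submitted job, is one way to check whether the two contexts differ.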

However, when I run these manually, everything works! Even with nested python environments (a python 2 virtual environment for QIIME within a python 3 virtual environment for Snakemake)! That left me scratching my head, because the manual way works perfectly, but a bash script that has exactly the same commands written down in the same order fails. The difference is that I submit the bash script to a cluster to execute. So I'm tentatively thinking that virtual environments may behave differently when run through a cluster submission than when I run things manually from a head node.


Anyway, if anyone does find him/herself here with similar issues I hope some of my thoughts might help in any way. I will definitely post a followup if I find a solution!

-Daniel



Colin Brislawn

Jul 22, 2016, 2:52:04 PM
to Qiime 1 Forum
Thanks for the detailed update Daniel. I have not used snakemake with qiime, but have colleagues who have used it with other programs to great success.

Qiime 2 will be built in Python 3, so hopefully this will mitigate problems like this in the future. You can follow the devs here:

Thanks!
Colin

Daniel Hwang

Jul 26, 2016, 11:28:22 AM
to Qiime 1 Forum
Hi,

I tried using separate conda environments for snakemake and qiime, but I had some issues with libraries not being correctly linked. I think I saw some people who had success with that method. It may be an issue with how my environment in general is set up on the cluster I am working in.

Thankfully, I have had success, but it was with a setup I had tried before, so I'm scratching my head over that.

My setup was (for anyone who sees a similar error, I hope this helps you as well):


Submission Shell Script:
  • activate snakemake environment (python3 environment)
  • run Snakemake
Snakefile
  • activate virtual environment (python2 environment)
  • run QIIME code
It works.
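
The two-layer setup above relies on the inner activation taking over from the outer Python 3 settings. Below is a sketch of what the environment for the QIIME step might need to look like, assuming (this is not confirmed in the thread) that inherited PYTHONPATH or PYTHONHOME values are what leak between the two interpreters. The function name python2_tool_env is hypothetical, and `venv` stands for the Python 2 virtualenv directory used in the Snakefile.

```python
import os


def python2_tool_env(base_env, venv_dir):
    """Return a copy of base_env adjusted so venv_dir's interpreter wins.

    Roughly mirrors what sourcing venv/bin/activate does (prepend bin/ to
    PATH, unset PYTHONHOME) and additionally drops PYTHONPATH, which is
    the assumed source of the cross-version imports.
    """
    env = dict(base_env)  # never mutate the caller's mapping
    env.pop("PYTHONHOME", None)
    env.pop("PYTHONPATH", None)
    venv_dir = os.path.abspath(venv_dir)
    env["VIRTUAL_ENV"] = venv_dir
    bin_dir = os.path.join(venv_dir, "bin")
    old_path = env.get("PATH", "")
    env["PATH"] = bin_dir + os.pathsep + old_path if old_path else bin_dir
    return env
```

The returned mapping could be passed as `env=` to `subprocess.run` when launching a QIIME script, which avoids depending on whatever the submission node placed in the job's environment.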


It is a bit odd, since this was the first thing I tried. One thing that changed between this setup failing and succeeding was that I had set my LD_LIBRARY_PATH environment variable so it could find the libraries it was complaining about. However, this did not work, so I removed that setting from my .bashrc. Now it works.

A bit strange, but if I find the actual reason why this is working I will provide an update. Currently, I'm doing a few sanity checks with submissions from different submission nodes on my cluster.

-Daniel

Colin Brislawn

Jul 26, 2016, 2:05:17 PM
to Qiime 1 Forum
Hello Daniel,

Glad you got it working! Automated build environments can be a pain to set up, but are so worth it once running.

Let us know if you learn something more, or if you have more questions.

Colin
