running macqiime from R via system() or system2() call


Colin A

Feb 13, 2016, 1:54:35 PM
to Qiime 1 Forum
Hi All,

I would really love to be able to run macqiime from R using the system() command. It would let me carry over the skills I have from RStudio and its GitHub integration in a really seamless way. Also, once QIIME generates my OTU tables I conduct further analysis using my own custom R scripts. It would be ideal to create an R project in RStudio, run some sequential scripts that chain together output, and then push to GitHub when I am done.

I tried to do this using the system() command in R, which runs a command in the shell and prints its output in the R console. So, I wrote out:

system('macqiime')

Which prints out:
MacQIIME version:
MacQIIME 1.9.0-20140227

Sourcing MacQIIME environment variables...

  This is the same as a normal terminal shell, except your default
  python is DIFFERENT (/macqiime/bin/python) and there are other new
  QIIME-related things in your PATH.

  Type "exit" (return) to go back to your normal shell


So, it launches macqiime successfully, but then the R console hangs and won't accept new commands. The same thing happens with system2('macqiime'). Any idea what is causing this, and any ideas on a workaround that would allow me to call and run macqiime from R?

Colin

Colin Brislawn

Feb 14, 2016, 12:53:06 AM
to Qiime 1 Forum
Hello Colin,

This isn't really answering your question, but I wonder if R and R studio are a good fit for data processing using qiime.

I currently use bash scripts to run 'sequential scripts, chaining together output' just like you describe, and I find that the simplicity of bash is a good fit for qiime. The qiime devs use Jupyter notebooks to chain together their scripts. The Illumina tutorial is inside a Jupyter notebook: http://nbviewer.jupyter.org/github/biocore/qiime/blob/1.9.1/examples/ipynb/illumina_overview_tutorial.ipynb

I guess my suggestion is to have one series of scripts for qiime (using bash or jupyter) and another for R processing. While you could do it all in R, I think these other programs may be a better fit.


Let me know what you think...
Oh and +1 for reproducible science! "then push to GitHub when I am done" :-)
Colin Brislawn

Colin A

Feb 14, 2016, 10:45:22 AM
to Qiime 1 Forum
Thanks Colin. I went ahead and installed the Jupyter notebook, per your suggestion, and started trying to replicate the tutorial here: nbviewer.jupyter.org/github/biocore/qiime/blob/1.9.1/examples/ipynb/illumina_overview_tutorial.ipynb

Everything works fine until I start running QIIME commands. This may have something to do with the fact that I run macqiime. For instance, I make a new code block, and execute:

!macqiime

And the notebook prints:
MacQIIME version:
MacQIIME 1.9.0-20140227

Sourcing MacQIIME environment variables...

  This is the same as a normal terminal shell, except your default
  python is DIFFERENT (/macqiime/bin/python) and there are other new
  QIIME-related things in your PATH.

  Type "exit" (return) to go back to your normal shell

MacQIIME Colins-MacBook-Pro:illumina $ 

Then the notebook hangs, and to regain control I have to interrupt the kernel via the top menu, kernel -> interrupt.

I also tried including the call to macqiime, the mapping file check, and then an exit command like this, but it still hangs and I have to interrupt the kernel.
!macqiime
!validate_mapping_file.py -o vmf-map/ -m map.tsv
!exit


Any suggestions on how this works with macqiime? I like the notebook format, but if it never plays nicely with macqiime, then it's kind of useless for me. I just have to go back to copy-pasting into the terminal.
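
A likely explanation for the hang, offered as a sketch rather than a confirmed diagnosis: the banner says macqiime opens an interactive shell ("Type "exit" (return) to go back"), and each `!` line in a notebook (like each system() call in R) runs in its own separate shell. So `!macqiime` sits waiting for typed input that never arrives, and the later `!exit` never reaches it. One way around this is to source the environment file and run the QIIME script inside a single non-interactive shell. The path in MACQIIME_ENV is an assumption about where the macqiime launcher keeps its environment file, and run_with_env is a hypothetical helper name.

```python
import shlex
import subprocess

# Assumption: this is the file the `macqiime` launcher sources.
# Check your own install before relying on this path.
MACQIIME_ENV = "/macqiime/configs/bash_profile.txt"


def run_with_env(command, env_file=MACQIIME_ENV):
    """Run one command in a non-interactive bash that first sources env_file."""
    # Source the environment and run the command in the SAME shell,
    # so the PATH changes are visible to the command.
    script = "source {} && {}".format(shlex.quote(env_file), command)
    return subprocess.run(
        ["bash", "-c", script],
        stdin=subprocess.DEVNULL,   # nothing can block waiting for typed input
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,   # fold error messages into the captured output
        universal_newlines=True,    # return text rather than bytes
    )
```

In a notebook cell this might be used as `print(run_with_env("validate_mapping_file.py -o vmf-map/ -m map.tsv").stdout)`. The same idea should carry over to R by passing the equivalent `bash -c "source ... && ..."` string to system().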

Colin A

Feb 14, 2016, 11:03:28 AM
to Qiime 1 Forum
Update: macqiime works fine, so long as you launch macqiime BEFORE you launch jupyter notebook. So, to start your jupyter session enter at the command line:

Colins-MacBook-Pro:~ colin$ macqiime


MacQIIME version:
MacQIIME 1.9.0-20140227

Sourcing MacQIIME environment variables...

  This is the same as a normal terminal shell, except your default
  python is DIFFERENT (/macqiime/bin/python) and there are other new
  QIIME-related things in your PATH.

  Type "exit" (return) to go back to your normal shell

MacQIIME Colins-MacBook-Pro:~ $ jupyter notebook

Thanks for your suggestions Colin (Brislawn), I am enjoying the Jupyter notebook environment!

Colin Brislawn

Feb 14, 2016, 1:17:52 PM
to Qiime 1 Forum
Hi other Colin,

I'm glad you got macqiime to play nice with jupyter. Many other people have had this issue so I'm happy to know that the workaround is to launch macqiime before starting the notebook server. Thanks!

Jupyter can run R code too. I currently run R through RStudio, but some of my collaborators have had success running R through Jupyter, as shown here:

I also want to quickly promote the R package phyloseq. It elegantly holds all the parts of an amplicon experiment and makes beautiful, extensible graphs with ggplot2. Here, the phyloseq author fully replicates another study using only R and phyloseq: http://joey711.github.io/phyloseq-demo/Restroom-Biogeography

Good luck other Colin,
Colin Brislawn


Colin A

Feb 14, 2016, 2:14:43 PM
to Qiime 1 Forum
Thanks again other Colin, I appreciate the suggestions. The Jupyter notebook is working great; however, I have a follow-up question:

I frequently run QIIME on a remote computer via ssh. This computer has a native install of QIIME, and many more processing cores than my laptop, so I can run a lot of the bioinformatic analysis in parallel. Is there a way to connect to the remote computer within the Jupyter Notebook, while still working with my local copy of these tools? I'd basically just like to create the notebook script on my local machine, but have it download files and execute QIIME commands on my remote computer. This remote computer does not have Jupyter Notebook or IPython Notebook installed. If this doesn't work I'm back to copy-pasting into the terminal!
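
One simple way to do roughly what is described here, sketched under assumptions: have the local notebook shell out to ssh for each remote command. This assumes key-based ssh login is already set up (with `BatchMode=yes`, ssh fails immediately instead of prompting for a password, which would otherwise hang the notebook). The host string is a placeholder and both function names are hypothetical. Whether QIIME is on the PATH in a non-interactive ssh session depends on how the remote machine is configured.

```python
import subprocess


def build_remote_command(user_at_host, command):
    """Return the argument list for running `command` on a remote host via ssh."""
    return [
        "ssh",
        "-o", "BatchMode=yes",  # fail instead of prompting for a password
        user_at_host,
        command,
    ]


def run_remote(user_at_host, command):
    """Run `command` on the remote host and capture its combined output."""
    return subprocess.run(
        build_remote_command(user_at_host, command),
        stdin=subprocess.DEVNULL,   # never wait for keyboard input
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
```

For example, `print(run_remote("user@remote.example.edu", "print_qiime_config.py").stdout)` would show the remote QIIME configuration, with the placeholder host swapped for a real login.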

Colin Brislawn

Feb 14, 2016, 3:01:17 PM
to Qiime 1 Forum
Hey Colin,

So Jupyter does support something like this. It has a client-server model where jobs can be farmed out to separate remote computers, but I've never set it up. I also use qiime on a supercomputer and here is what I have found works well for me:

Instead of spinning up the notebook server on my laptop, I spin up the notebook server on the head node of the supercomputer, then use my laptop to access this remote server using its IP address and Jupyter notebook port. So while I'm viewing and editing scripts from the web browser of my laptop, the scripts are living and running on the remote supercomputer.

Does that sound like it may work for you? You can install a Jupyter server using miniconda (no sudo needed) if you want to give this a shot.

The qiime devs use Jupyter more than I do. Maybe they have some advice.

Colin

Colin A

Feb 14, 2016, 3:25:39 PM
to Qiime 1 Forum
Sounds like it might work for me! I tried searching "install Jupyter server using miniconda". Any link to a page walking through how to do this? Would love to try.

It would be super rad if the QIIME tutorial page had an example of how to establish a connection and run code on a remote supercomputer from within the Jupyter Notebook environment, w/o having Jupyter Notebook installed on the remote computer, or super user access.

Colin Brislawn

Feb 14, 2016, 4:01:33 PM
to Qiime 1 Forum
Yeah, a tutorial like that would be great. Qiime 2 is being built to run entirely through a Jupyter notebook, so documentation about setting up a notebook server would be helpful. There is nothing qiime specific that I know of, but here is the Jupyter documentation on how to do it.
https://jupyter-notebook.readthedocs.org/en/latest/public_server.html#running-a-public-notebook-server 

Anaconda is a commercially supported open source platform, and conda, its package manager, makes setting up software environments easy, all without sudo access. You can install miniconda here:
Then install qiime and all your other software like I do here:
Installing Jupyter is also easy:
conda install jupyter


Keep in touch,
Colin

Colin A

Feb 15, 2016, 7:36:54 PM
to Qiime 1 Forum
Hi Colin,

Just wanted to follow up and say I got miniconda and jupyter notebook running on my remote computer fine, however I still can't get my notebook and the remote computer talking. I made a stackoverflow post about it, but haven't had much success. I plan to add a bounty in a day, and if I get a working response I'll be sure to post it here.

Thanks again for all your help! Made everything go much faster!

Colin

Colin Brislawn

Feb 15, 2016, 8:29:43 PM
to Qiime 1 Forum
Glad you got it running. Now for the connection...

Check out the second point in this post. I use ssh tunneling as shown here:

Colin

Colin A

Feb 15, 2016, 8:52:22 PM
to Qiime 1 Forum
Okay, this seems like it will work; however, I don't know "/path/to/ssh/key", and googling hasn't solved this quickly! Without it the jupyter notebook hangs, since it asks me to enter my password, and stuff like sshpass doesn't work. I'm sure this is obvious, but how would I figure out the /path/to/ssh/key?

Colin A

Feb 15, 2016, 9:00:39 PM
to qiime...@googlegroups.com
Also- if I try to do this from the terminal (rather than from within jupyter notebook), I enter something like this:

Colins-MacBook-Pro:~ colin$ ssh -i $HOME/.ssh/id_dsa -NL 8157:localhost:8888 user...@remotecomputer.edu
Warning: Identity file /Users/colin/.ssh/id_dsa not accessible: No such file or directory.
user...@remotecomputer.edu's password:
bind: Address already in use
channel_setup_fwd_listener: cannot listen to port: 8157
Could not request local forwarding.
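
Reading the output above: the first warning says there is no key file at $HOME/.ssh/id_dsa, so ssh falls back to asking for a password, and "bind: Address already in use" says something on the laptop is already listening on port 8157 (possibly an earlier tunnel that is still running). A small sketch for checking the local port before opening the tunnel is below. The function names and the example host are hypothetical, and the port numbers are just the ones from the command above.

```python
import socket


def local_port_is_free(port, host="127.0.0.1"):
    """Return True if nothing is currently bound to host:port on this machine."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Typically "Address already in use": another process holds the port.
            return False
        return True


def build_tunnel_command(user_at_host, local_port=8157, remote_port=8888,
                         key_path=None):
    """Build the ssh argument list that forwards local_port to the remote notebook port."""
    cmd = ["ssh", "-N", "-L", "{}:localhost:{}".format(local_port, remote_port)]
    if key_path is not None:
        # Only pass -i when a private key file actually exists at this path.
        cmd += ["-i", key_path]
    cmd.append(user_at_host)
    return cmd
```

If `local_port_is_free(8157)` returns False, either pick a different local port or find and stop whatever is holding it before running the command from `build_tunnel_command(...)`.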

Colin Brislawn

Feb 16, 2016, 2:53:08 AM
to Qiime 1 Forum
OK, so a core part of SSH is security, and a core part of security is proving that you are who you say you are: you have to prove that you are 'authentically' you. For ssh, the two common methods of authentication are a password and a key pair, and the "/path/to/ssh/key" in that command is the private key file of such a pair. When we use ssh, we usually use a password. The notebook server has its own login, and for that you need a hashed password.

Hashed password authentication is described here. The basic idea is that, instead of storing the password you type in, the notebook config stores a hash that could only have been made from your real password.

Colin A

Feb 16, 2016, 8:16:19 AM
to Qiime 1 Forum
Slowly but surely getting there. I get to the "Adding hashed password to your network configuration file", and get hung up again. I have generated the config file successfully. Within the jupyter notebook I am executing the line:

c.NotebookApp.password = u'myreallylonghashedpassword'

and getting the error:
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-18-1910d48be5a8> in <module>()
      1 #c = get_config()
----> 2 c.NotebookApp.password = u'myreallylonghashedpassword'

NameError: name 'c' is not defined

I tried adding in a definition of 'c' like this:

c = get_config()
c.NotebookApp.password = u'myreallylonghashedpassword'

But this generates the error:

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-19-8ef1a8bf9cbe> in <module>()
----> 1 c = get_config()
      2 c.NotebookApp.password = u'myreallylonghashedpassword'

NameError: name 'get_config' is not defined
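
A note on these two errors, as far as I understand how Jupyter loads its settings: `c` and `get_config` are only defined while Jupyter is reading a config file, so the `c.NotebookApp.password = ...` line is meant to live in the generated config file, not in a notebook cell. A minimal sketch for adding the line to that file follows. The function name is hypothetical, and the usual location of the generated file (~/.jupyter/jupyter_notebook_config.py) is an assumption to check against your own setup.

```python
import os


def add_password_to_config(config_path, hashed_password):
    """Append the NotebookApp.password setting to a Jupyter config file."""
    # repr() adds the quotes, producing e.g. c.NotebookApp.password = 'sha1:...'
    line = "c.NotebookApp.password = {!r}\n".format(hashed_password)
    # Make sure the directory exists, then append so existing settings are kept.
    os.makedirs(os.path.dirname(os.path.abspath(config_path)), exist_ok=True)
    with open(config_path, "a") as handle:
        handle.write(line)
    return line
```

For example: `add_password_to_config(os.path.expanduser("~/.jupyter/jupyter_notebook_config.py"), "myreallylonghashedpassword")`, run on the machine where the notebook server will start. The server has to be restarted to pick up the change.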




Colin Brislawn

Feb 16, 2016, 2:40:20 PM
to Qiime 1 Forum
This is getting a little over my head, but I think this page may be helpful:

They show c = get_config() being used after importing these libraries:
from jupyter_core.paths import jupyter_config_dir, jupyter_data_dir

Make sure you are importing those too,
Colin
