Beginners Linux Tutorial

Antonio González Peña

Jan 21, 2011, 1:11:51 PM

to qiime...@googlegroups.com

Dear QIIME Users:

This is a great resource for beginners in the Linux/Mac terminal:

http://www.linuxcommand.org/learning_the_shell.php

This should be very useful for beginner Linux users and for users
whose first experience with Linux is the QIIME Virtual Box.

Happy QIIMEing.

--
Antonio González Peña
Research Assistant, Knight Lab
University of Colorado at Boulder
https://chem.colorado.edu/knightgroup/

Antonio González Peña

May 2, 2011, 1:37:36 PM

to qiime...@googlegroups.com

Hi,

Here is an OS X specific tutorial:
http://www.osxfaq.com/Tutorials/LearningCenter/

Thanks Mike.

William Hickey

Sep 25, 2012, 6:05:56 PM

to qiime...@googlegroups.com

That link is dead

Yoshiki Vázquez Baeza

Sep 25, 2012, 6:33:03 PM

to qiime...@googlegroups.com

Hello William,

Yes, the first link is dead, but the link posted in the reply to it still works.

If you are interested in something very Apple-specific, Apple provides a couple of documents (this and this). Once you feel more comfortable, I suggest checking their Shell Scripting Primer document.

Thanks!

Yoshiki.

Daniel McDonald

Jan 3, 2013, 4:32:38 PM

to qiime...@googlegroups.com

Learn Linux the Hard Way just came out. The other tutorials by Shaw are great, and I suspect
this one is as well.

http://37.200.69.165/doku.php?id=llthw

-Daniel

louisville 1

Feb 11, 2013, 11:03:29 AM

to qiime...@googlegroups.com

Dear all

I just installed the virtual machine and started the tutorial. I was not able to run the command that downloads the FASTA files:
wget http://greengenes.lbl.gov/Download/Sequence_Data/Fasta_data_files/core_set_aligned.fasta.imputed

I tried several times, but each time I get either an error or "service not found". Thanks for your help.

Bassam

Daniel McDonald

Feb 11, 2013, 11:06:36 AM

to qiime...@googlegroups.com

Bassam,

Can you send the full output from wget?

Thanks,
Daniel

Bassam Abomoelak

Feb 11, 2013, 12:17:37 PM

to qiime...@googlegroups.com

Dear Daniel

How can I cut or copy the output from the terminal into Word or an email? Thanks for your patience with me.

Bassam

Daniel McDonald

Feb 11, 2013, 12:43:03 PM

to qiime...@googlegroups.com

Bassam,

You should be able to highlight the text and right-click to copy. You
cannot easily copy/paste from within the virtual machine to outside of
the virtual machine (AFAIK), but you should be able to send an email
from within the virtual machine by logging into your gmail account
from there.

Best,
Daniel
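
A related option, offered as a minimal sketch rather than something suggested in the thread: a command's output can be saved to a text file from inside the terminal, which is often easier to attach to an email than highlighted text. The 'ls' call on a made-up path is only a stand-in for whatever command is failing, and the log file location is arbitrary.

```shell
# Hypothetical log file; any writable path works.
log_file="$(mktemp)"

# Stand-in for the command that fails. '2>&1' merges error messages into
# the normal output, and 'tee' shows the text on screen while saving it.
# '|| true' only keeps this demo going after the deliberate failure.
ls /no_such_directory_for_demo 2>&1 | tee "$log_file" || true

# The saved text can now be opened in an editor or attached to an email.
echo "Output saved to: $log_file"
```

The same '2>&1 | tee some_file.txt' suffix could be added to the end of the wget command that is not working.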

Bassam Abomoelak

Feb 11, 2013, 12:53:29 PM

to qiime...@googlegroups.com

Thanks Daniel. Here is the output. I really appreciate your patience as I take my first steps in QIIME.
Bassam

ubuntu@ubuntu:~$ unzip qiime_tutorial-v1.3.0.zip
Archive: qiime_tutorial-v1.3.0.zip
creating: qiime_tutorial-v1.3.0/
creating: qiime_tutorial-v1.3.0/18S_tutorial_files/
inflating: qiime_tutorial-v1.3.0/18S_tutorial_files/18S_tutorial_sample_seqs.fna
inflating: qiime_tutorial-v1.3.0/Fasting_Example.fna
inflating: qiime_tutorial-v1.3.0/Fasting_Example.qual
inflating: qiime_tutorial-v1.3.0/Fasting_Example.sff
inflating: qiime_tutorial-v1.3.0/Fasting_Example.sff.txt
inflating: qiime_tutorial-v1.3.0/Fasting_Map.txt
inflating: qiime_tutorial-v1.3.0/qiime_tutorial_commands_parallel.sh
inflating: qiime_tutorial-v1.3.0/qiime_tutorial_commands_serial.sh
inflating: qiime_tutorial-v1.3.0/README
ubuntu@ubuntu:~$ cd qiime_tutorial-v1.3.0
ubuntu@ubuntu:~/qiime_tutorial-v1.3.0$
ubuntu@ubuntu:~/qiime_tutorial-v1.3.0$
ubuntu@ubuntu:~/qiime_tutorial-v1.3.0$
ubuntu@ubuntu:~/qiime_tutorial-v1.3.0$ unzip qiime_tutorial-v1.3.0.zipcd qiime_tutorial-v1.3.0
unzip: cannot find or open qiime_tutorial-v1.3.0.zipcd, qiime_tutorial-v1.3.0.zipcd.zip or qiime_tutorial-v1.3.0.zipcd.ZIP.
ubuntu@ubuntu:~/qiime_tutorial-v1.3.0$
ubuntu@ubuntu:~/qiime_tutorial-v1.3.0$ check_id_map.py -m Fasting_Map.txt -o mapping_output
check_id_map.py: command not found
ubuntu@ubuntu:~/qiime_tutorial-v1.3.0$
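
The 'unzip ... .zipcd' error above comes from two commands typed on one line with no separator, so the shell read 'qiime_tutorial-v1.3.0.zipcd' as a single file name. Below is a minimal sketch of the two usual ways to run commands in sequence. It uses a throwaway directory, and the names tutorial_demo and second_demo are made up for illustration.

```shell
# Throwaway working directory so nothing real is touched.
work_dir="$(mktemp -d)"
cd "$work_dir"

# Option 1: one command per line.
mkdir tutorial_demo
cd tutorial_demo
cd ..

# Option 2: join commands with '&&'; the second runs only if the first succeeds.
mkdir second_demo && cd second_demo

# Show where we ended up; the printed path ends in second_demo.
pwd
```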

Daniel McDonald

Feb 11, 2013, 12:55:25 PM

to qiime...@googlegroups.com

Can you send the output for the wget command that is not working? Is
there a reason you're using QIIME-1.3.0 instead of the latest?

Thanks,
Daniel

Bassam Abomoelak

Feb 11, 2013, 1:02:51 PM

to qiime...@googlegroups.com

The first command finally worked and I downloaded the files. I kept repeating it until it worked. Can you send me the link for the latest version of QIIME? Thanks so much.

Bassam

Will Van Treuren

Feb 11, 2013, 1:30:04 PM

to qiime...@googlegroups.com

Hi Bassam,

If you follow the instructions at the following link, they will guide you through downloading and installing the latest QIIME 1.6 VM:
http://qiime.org/install/virtual_box.html#installing-the-qiime-virtual-box

Hope this helps,

Will

Bassam Abomoelak

Feb 11, 2013, 1:42:12 PM

to qiime...@googlegroups.com

Thanks Will. I got this output on my first use. Can you help? I think the space between the commands is the problem. Thanks a lot.
Bassam

Will Van Treuren

Feb 11, 2013, 1:45:50 PM

to qiime...@googlegroups.com

Hi Bassam,

I am not sure which error you received or at what step. Can you please tell me whether you were able to get the QIIME 1.6 VM working and, if so, where you ran into an error?

Thanks,

Will

Bassam Abomoelak

Feb 12, 2013, 9:46:47 AM

to qiime...@googlegroups.com

Dear Will
I'm a beginner in QIIME and I am trying to follow the tutorial. I got an error with cd qiime_tutorial-v1.3.0 when unzipping the file. Also, the command to check the mapping file didn't work. Any help is appreciated.
Bassam

ubuntu@ubuntu:~$ cd qiime_tutorial-v1.3.0

ubuntu@ubuntu:~/qiime_tutorial-v1.3.0$ check_id_map.py -m Fasting_Map.txt -o mapping_output
check_id_map.py: command not found

ubuntu@ubuntu:~/qiime_tutorial-v1.3.0$ cdqiime_tutorial-v1.3.0
cdqiime_tutorial-v1.3.0: command not found
ubuntu@ubuntu:~/qiime_tutorial-v1.3.0$ cd qiime_tutorial-v1.3.0
bash: cd: qiime_tutorial-v1.3.0: No such file or directory
ubuntu@ubuntu:~/qiime_tutorial-v1.3.0$ cd

Jose Navas

Feb 12, 2013, 11:45:55 AM

to qiime...@googlegroups.com

Hi Bassam,

Did you install the qiime-1.6.0 virtual box? To complete the tutorial successfully, you should have the latest version of QIIME installed, since some commands change from time to time.

To install the latest version of the QIIME virtual box, follow these instructions: http://qiime.org/install/virtual_box.html

Once you've installed the qiime-1.6.0 virtual box, can you run:

print_qiime_config.py -t

and post the output?

--
Jose Navas

Bassam Abomoelak

Feb 12, 2013, 11:56:46 AM

to qiime...@googlegroups.com

Dear Jose

I did download the qiime-1.6.0 version and it's on the desktop of my computer. I don't know whether I need to activate it or extract the files from the folder. Thanks for the help.

Bassam

Jose Navas

Feb 12, 2013, 12:04:36 PM

to qiime...@googlegroups.com

Hi Bassam,

Please follow all the instructions at this link to install the QIIME virtual box:

http://qiime.org/install/virtual_box.html

A virtual box runs a new operating system inside your current operating system. So you will have to start up a new 'virtual machine', and inside this new virtual machine you will have QIIME installed along with all the files for the tutorial.

Bassam Abomoelak

Feb 13, 2013, 3:28:19 PM

to qiime...@googlegroups.com

Dear Jose
I am using QIIME 1.6 and have downloaded the tutorial, but I couldn't check the mapping file. Thanks
Bassam
ubuntu@ubuntu:~$ check_id_map.py -m Fasting_Map.txt -o mapping_output
check_id_map.py: command not found
ubuntu@ubuntu:~$ check_id_map.py -m Fasting_Map.txt -o mapping_output
check_id_map.py: command not found
ubuntu@ubuntu:~$

Will Van Treuren

Feb 13, 2013, 3:33:31 PM

to qiime...@googlegroups.com

Hi Bassam,

Did you follow the setup instructions listed in the 'before you start' folder? Maybe viewing this video will help:

https://www.youtube.com/watch?v=1jYupkquaME

Thanks,

Will

Bassam Abomoelak

Feb 15, 2013, 5:54:08 PM

to qiime...@googlegroups.com

Dear Jose

Thanks for your help. I installed the virtual machine and I'm working through the whole process with the tutorial files. I have some questions about splitting the libraries. When you type the command for checking the mapping file, where should the output files end up? Will they be in the home directory? I'm sorry, I feel I will trouble you until I have a good knowledge of the software. Please be patient with me and my questions. Thanks so much.

Bassam

Jose Navas

Feb 15, 2013, 6:29:36 PM

to qiime...@googlegroups.com

Hi Bassam,

You can keep the files wherever you want. The only limitation is that you have to be in the folder where you downloaded the files in order to run the tutorial commands. For example, if you have all the files in:

/home/qiime/qiime-tutorial

you have to move to that folder:

cd /home/qiime/qiime-tutorial

You can check which folder you are in by running:

pwd

Hope this helps!!

--
Jose Navas
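
A minimal sketch of the 'cd' and 'pwd' steps above, using a temporary directory in place of /home/qiime/qiime-tutorial so that it can be run anywhere. It also shows why running 'cd' with the folder's name a second time fails, as in the earlier 'No such file or directory' message: the shell looks for a folder of that name inside the current one.

```shell
# Temporary stand-in for the home directory (an assumption for this demo).
demo_home="$(mktemp -d)"
mkdir "$demo_home/qiime-tutorial"

# Move into the tutorial folder and confirm the location.
cd "$demo_home/qiime-tutorial"
current_dir="$(pwd)"
echo "Now in: $current_dir"

# A relative 'cd' is resolved inside the current folder, so repeating it fails.
if cd qiime-tutorial 2>/dev/null; then
    repeat_cd="worked"
else
    repeat_cd="failed"
fi
echo "Second cd: $repeat_cd"
```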

Jai Ram Rideout

Feb 15, 2013, 7:20:38 PM

to qiime...@googlegroups.com

Hi Bassam,

We have a short tutorial that may help you get more comfortable with using the command line:

http://qiime.org/tutorials/unix_commands.html

Hope this helps,

Jai

Bassam Abomoelak

Feb 15, 2013, 8:29:14 PM

to qiime...@googlegroups.com

Thanks Jai. I have a question regarding the commands. When you split the libraries for the tutorial and assign the reads, should the generated files appear in the terminal? If so, should you direct them to the home directory with cd? In other words, how do you know that the files have been generated? Thanks again.

Bassam

Jai Ram Rideout

Feb 18, 2013, 11:59:13 AM

to qiime...@googlegroups.com

Hi Bassam,

All commands in the tutorial must be executed from within the terminal. The output files for the split_libraries.py command will be placed into the directory specified by the -o option to the script. So, if you are running the split_libraries.py command in the QIIME overview tutorial (http://qiime.org/tutorials/tutorial.html), the output will be placed in the 'split_library_output' directory, relative to the current directory that you are in when you ran the command.

For example, if your current directory is /home/ubuntu/ and you run the split_libraries.py command using '-o split_library_output', the output will be placed in /home/ubuntu/split_library_output/.

I'm not sure if I understood your question correctly; please let me know if you have additional questions.

Thanks,

Jai
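
To illustrate the point about relative output paths, here is a small sketch. It does not run split_libraries.py: 'mkdir' stands in for the script creating its output directory, and a temporary directory stands in for /home/ubuntu/.

```shell
# Temporary stand-in for the current directory (e.g. /home/ubuntu/).
work_dir="$(mktemp -d)"
cd "$work_dir"

# Stand-in for what a script does with '-o split_library_output':
# a relative name is created inside the current directory.
mkdir split_library_output

# $PWD holds the current directory, so this is the full path to the output.
output_path="$PWD/split_library_output"
echo "Output directory: $output_path"
ls -d "$output_path"
```

Running 'ls' after a command finishes is a simple way to confirm that a new output directory has appeared.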

Bassam Abomoelak

Feb 21, 2013, 3:06:29 PM

to qiime...@googlegroups.com

Dear Jose
This is what I got with the map check command. Can you tell me whether I'm following the steps correctly? I'm still on the tutorial. Thanks for your great patience with me.
Bassam

qiime@qiime-VirtualBox:~$ check_id_map.py -m Fasting_Map.txt -o mapping_output -v
Usage: check_id_map.py [options] {-m/--mapping_fp MAPPING_FP}

[] indicates optional input (order unimportant)
{} indicates required input (order unimportant)

Specifically, we check that:

    - The BarcodeSequence, LinkerPrimerSequences, and ReversePrimer fields
       have valid IUPAC DNA characters, and BarcodeSequence characters
       are non-degenerate (error)
    - The SampleID, BarcodeSequence, LinkerPrimerSequence, and Description
       headers are present. (error)
    - There are not duplicate header fields (error)
    - There are not duplicate barcodes (error)
    - Barcodes are of the same length. Suppressed when
       variable_len_barcode flag is passed (warning)
    - The headers do not contain invalid characters (alphanumeric and
       underscore only) (warning)
    - The data fields do not contain invalid characters (alphanumeric,
       underscore, space, and +-%./:,; characters) (warning)
    - SampleID fields are MIENS compliant (only alphanumeric
       and . characters). (warning)
    - There are no duplicates when the primer and variable length
       barcodes are appended (error)
    - There are no duplicates when barcodes and added demultiplex
       fields (-j option) are combined (error)
    - Data fields are not found beyond the Description column (warning)

    Details about the metadata mapping file format can be found here:
    http://www.qiime.org/documentation/file_formats.html#metadata-mapping-files

    Errors and warnings are saved to a log file. Errors can be caused by
    problems with the headers, invalid characters in barcodes or primers, or
    by duplications in SampleIDs or barcodes.

    Warnings can arise from invalid characters and variable length barcodes that
    are not specified with the --variable_len_barcode.
    Warnings will contain a reference to the cell (row,column) that the
    warning arose from.

    In addition to the log file, a "corrected_mapping" file will be created.
    Any invalid characters will be replaced with '.' characters in
    the SampleID fields (to enforce MIENS compliance) and text in other data
    fields will be replaced with the character specified by the -c parameter,
    which is an underscore "_" by default.

    A html file will be created as well, which will show locations of
    warnings and errors, highlighted in yellow and red respectively. If no
    errors or warnings were present the file will display a message saying
    such. Header errors can mask other errors, so these should be corrected
    first.

    If pooled primers are used, separate with a comma. For instance, a pooled
    set of three 27f primers (used to increase taxonomic coverage) could be
    specified in the LinkerPrimerSequence fields as such:
    AGGGTTCGATTCTGGCTCAG,AGAGTTTGATCCTGGCTTAG,AGAATTTGATCTTGGTTCAG

Example usage:
Print help message and exit
check_id_map.py -h

Example: Check the Fasting_Map.txt mapping file for problems, supplying the required mapping file, and output the results in the check_id_map_output directory
check_id_map.py -m Fasting_Map.txt -o check_id_map_output

check_id_map.py: error: option -m: file does not exist: 'Fasting_Map.txt'
qiime@qiime-VirtualBox:~$

Jai Ram Rideout

Feb 21, 2013, 3:18:29 PM

to qiime...@googlegroups.com

Hi Bassam,

You need to either be in the same directory as your Fasting_Map.txt file to have that command work, or you can change the filepath you used with the -m option to wherever your Fasting_Map.txt file is, relative to your current directory.

For example, it looks like you are in your home directory, /home/ubuntu/. If Fasting_Map.txt is in /home/ubuntu/qiime_tutorial/, you could run the following command:

check_id_map.py -m qiime_tutorial/Fasting_Map.txt -o mapping_output -v

You will need to modify the path to point to wherever your Fasting_Map.txt file is.

Hope this helps,

Jai
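
A quick way to see whether a path will work before running the script is to test it with '[ -f ... ]'. The sketch below recreates the layout described above in a temporary directory (an empty placeholder file stands in for the real Fasting_Map.txt) and checks both the bare file name and the path that includes the folder.

```shell
# Temporary stand-in for the home directory, with the tutorial folder inside.
home_dir="$(mktemp -d)"
mkdir "$home_dir/qiime_tutorial"
touch "$home_dir/qiime_tutorial/Fasting_Map.txt"   # empty placeholder file
cd "$home_dir"

# The bare file name is looked up in the current directory only.
if [ -f Fasting_Map.txt ]; then
    bare_name="found"
else
    bare_name="missing"
fi

# Including the folder gives the path relative to the current directory.
if [ -f qiime_tutorial/Fasting_Map.txt ]; then
    with_folder="found"
else
    with_folder="missing"
fi

echo "Fasting_Map.txt: $bare_name"
echo "qiime_tutorial/Fasting_Map.txt: $with_folder"
```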

Bassam Abomoelak

Feb 22, 2013, 9:20:10 AM

to qiime...@googlegroups.com

Dear Jose
Thanks for the help. I managed to run the ID check for the tutorial. The next step, splitting the library, gave me the following message. I think you have to specify the path for the command again, as before. Is that right? Thanks for your great input.
Bassam

qiime@qiime-VirtualBox:~$ split_libraries.py -m qiime_tutorial/Fasting_Map.txt -f Fasting_Example.fna -q Fasting_Example.qual -0 split_library_output
Usage: split_libraries.py [options] {-m/--map MAP_FNAME -f/--fasta FASTA_FNAMES}

[] indicates optional input (order unimportant)
{} indicates required input (order unimportant)

Since newer sequencing technologies provide many reads per run (e.g. the 454 GS FLX Titanium series can produce 400-600 million base pairs with 400-500 base pair read lengths) researchers are now finding it useful to combine multiple samples into a single 454 run. This multiplexing is achieved through the application of a pyrosequencing-tailored nucleotide barcode design (described in (Parameswaran et al., 2007)). By assigning individual, unique sample specific barcodes, multiple sequencing runs may be performed in parallel and the resulting reads can later be binned according to sample. The script split_libraries.py performs this task, in addition to several quality filtering steps including user defined cut-offs for: sequence lengths; end-trimming; minimum quality score. To summarize, by using the fasta, mapping, and quality files, the program split_libraries.py will parse sequences that meet user defined quality thresholds and then rename each read with the appropriate Sample ID, thus formatting the sequence data for downstream analysis. If a combination of different sequencing technologies are used in any particular study, split_libraries.py can be used to perform the quality-filtering for each library individually and the output may then be combined.

Sequences from samples that are not found in the mapping file (no corresponding barcode) and sequences without the correct primer sequence will be excluded. Additional scripts can be used to exclude sequences that match a given reference sequence (e.g. the human genome; exclude_seqs_by_blast.py) and/or sequences that are flagged as chimeras (identify_chimeric_seqs.py).

Example usage:
Print help message and exit

split_libraries.py -h

Standard Example: Using a single 454 run, which contains a single FASTA, QUAL, and mapping file while using default parameters and outputting the data into the Directory "Split_Library_Output"
split_libraries.py -m Mapping_File.txt -f 1.TCA.454Reads.fna -q 1.TCA.454Reads.qual -o Split_Library_Output/

Multiple FASTA and QUAL Files Example: For the case where there are multiple FASTA and QUAL files, the user can run the following comma-separated command as long as there are not duplicate barcodes listed in the mapping file
split_libraries.py -m Mapping_File.txt -f 1.TCA.454Reads.fna,2.TCA.454Reads.fna -q 1.TCA.454Reads.qual,2.TCA.454Reads.qual -o Split_Library_Output_comma_separated/

Duplicate Barcode Example: An example of this situation would be a study with 1200 samples. You wish to have 400 samples per run, so you split the analysis into three runs and reuse barcoded primers (you only have 600). After initial analysis you determine a small subset is underrepresented (<500 sequences per samples) and you boost the number of sequences per sample for this subset by running a fourth run. Since the same sample IDs are in more than one run, it is likely that some sequences will be assigned the same unique identifier by split_libraries.py when it is run separately on the four different runs, each with their own barcode file. This will cause a problem in file concatenation of the four different runs into a single large file. To avoid this, you can use the '-s' parameter which defines a start index for split_libraries.py. From experience, most FLX runs (when combining both files for a single plate) will have 350,000 to 650,000 sequences. Thus, if Run 1 for split_libraries.py uses '-n 1000000', Run 2 uses '-n 2000000', etc., then you are guaranteed to have unique identifiers after concatenating the results of multiple FLX runs. With newer technologies you will just need to make sure that your start index spacing is greater than the potential number of sequences.

To run split_libraries.py, you will need two or more (depending on the number of times the barcodes were reused) separate mapping files (one for each Run, for example one for Run1 and another one for Run2), then you can run split_libraries.py using the FASTA and mapping file for Run1 and FASTA and mapping file for Run2. Once you have run split libraries on each file independently, you can concatenate (e.g. using the 'cat' command) the sequence files that were generated by split_libraries.py. You can also concatenate the mapping files, since the barcodes are not necessary for downstream analyses, unless the same sample IDs are found in multiple mapping files.

Run split_libraries.py on Run 1
split_libraries.py -m Mapping_File.txt -f 1.TCA.454Reads.fna -q 1.TCA.454Reads.qual -o Split_Library_Run1_Output/ -n 1000000

Run split_libraries.py on Run 2. The resulting FASTA files from Run 1 and Run 2 can then be concatenated using the 'cat' command (e.g. cat Split_Library_Run1_Output/seqs.fna Split_Library_Run2_Output/seqs.fna > Combined_seqs.fna) and used in downstream analyses.
split_libraries.py -m Mapping_File.txt -f 2.TCA.454Reads.fna -q 2.TCA.454Reads.qual -o Split_Library_Run2_Output/ -n 2000000

Barcode Decoding Example: The standard barcode types supported by split_libraries.py are golay (Length: 12 NTs) and hamming (Length: 8 NTs). For situations where the barcodes are of a different length than golay and hamming, the user can define a generic barcode type "-b" as an integer, where the integer is the length of the barcode used in the study.

Note: When analyzing large datasets (>100,000 seqs), users may want to use a generic barcode type, even for length 8 and 12 NTs, since the golay and hamming decoding processes can be computationally intensive, which causes the script to run slow. Barcode correction can be disabled with the -c option if desired.

For the case where the 8 base pair barcodes were used, you can use the following command
split_libraries.py -m Mapping_File_8bp_barcodes.txt -f 1.TCA.454Reads.fna -q 1.TCA.454Reads.qual -o split_Library_output_8bp/ -b 8

Linkers and Primers: The linker and primer sequence (or all the degenerate possibilities) are associated with each barcode from the mapping file. If a barcode cannot be identified, all the possible primers in the mapping file are tested to find a matching sequence. Using truncated forms of the same primer can lead to unexpected results for rare circumstances where the barcode cannot be identified and the sequence following the barcode matches multiple primers.

In many cases, sequence reads are long enough to sequence through the reverse primer and sequencing adapter. To remove these primers and all following sequences, the -z option can be used. By default, this option is set to 'disable'. If it is set to 'truncate_only', split_libraries will trim the primer and any sequence following it if the primer is found. If the 'truncate_remove' option is set, split_libraries.py will trim the primer if found, and will not write the sequence if the primer is not found. The allowed mismatches for the reverse primer are set with the --reverse_primer_mismatches parameter (default 0). To use reverse primer removal, one must include a 'ReversePrimer' column in the mapping file, with the reverse primer recorded in the 5' to 3' orientation.

Example reverse primer removal, where primers are trimmed if found, and sequence is written unchanged if not found. Mismatches are increased to 1 from the default 0
split_libraries.py -m Mapping_File_reverse_primer.txt -f 1.TCA.454Reads.fna -q 1.TCA.454Reads.qual -o split_libraries_output_revprimer/ --reverse_primer_mismatches 1 -z truncate_only

split_libraries.py: error: option -m: file does not exist: 'qiime_tutorial/Fasting_Map.txt'
qiime@qiime-VirtualBox:~$

Jai Ram Rideout

Feb 22, 2013, 11:12:12 AM

to qiime...@googlegroups.com

Hi Bassam,

Yes, for any QIIME commands that require input files or directories, you will need to specify the correct paths to where those files are located. So for the split_libraries.py step, you will need to specify the paths to your mapping file, .fna, and .qual files.

Hope this helps,

Jai
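
Since each of the three inputs needs a valid path, it can help to check them all at once before launching the command. This is only a sketch: check_inputs is a hypothetical helper, not part of QIIME, and the demo deliberately leaves out the .qual file so that one path is reported as missing.

```shell
# Hypothetical helper: counts how many of the given paths are not regular files.
check_inputs() {
    missing_count=0
    for path in "$@"; do
        if [ ! -f "$path" ]; then
            echo "missing: $path"
            missing_count=$((missing_count + 1))
        fi
    done
}

# Demo layout in a temporary directory; the .qual file is left out on purpose.
demo_dir="$(mktemp -d)"
mkdir "$demo_dir/qiime_tutorial"
touch "$demo_dir/qiime_tutorial/Fasting_Map.txt" \
      "$demo_dir/qiime_tutorial/Fasting_Example.fna"
cd "$demo_dir"

check_inputs qiime_tutorial/Fasting_Map.txt \
             qiime_tutorial/Fasting_Example.fna \
             qiime_tutorial/Fasting_Example.qual
echo "Missing inputs: $missing_count"
```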

Bassam Abomoelak

Feb 22, 2013, 11:32:45 AM

to qiime...@googlegroups.com

Dear Jai
Here is the command that I used. I didn't get the split output. Thanks

qiime@qiime-VirtualBox:~$ split_libraries.py -m qiime_tutorial/Fasting_Map.txt -f qiime_tutorial/Fasting_Example.fna -q qiime_tutorial/Fasting_Example.qual -o split_library_output

Tony Walters

Feb 22, 2013, 11:46:42 AM

to qiime...@googlegroups.com

Bassam,

If you type:
ls qiime_tutorial/

Do you see the Fasting_Map.txt, Fasting_Example.fna, and Fasting_Example.qual files listed?

-Tony

Bassam Abomoelak

Feb 22, 2013, 11:54:23 AM

to qiime...@googlegroups.com

Dear Tony

I turned off the machine for now, but to answer your question, I know that the 3 files are present in the qiime tutorial folder in my home directory. Thanks for your great assistance.

Bassam

Jai Ram Rideout

Feb 22, 2013, 12:01:00 PM

to qiime...@googlegroups.com

Hi Bassam,

When you have your VM on again, can you please send us the output of running the following 3 commands:

ls

ls qiime_tutorial/

split_libraries.py -m qiime_tutorial/Fasting_Map.txt -f qiime_tutorial/Fasting_Example.fna -q qiime_tutorial/Fasting_Example.qual -o split_library_output

Thanks,

Jai

Bassam Abomoelak

Feb 22, 2013, 12:20:29 PM

to qiime...@googlegroups.com

Dear Jai
here are the 3 commands

qiime@qiime-VirtualBox:~$ Is
Is: command not found
qiime@qiime-VirtualBox:~$ is
is: command not found
qiime@qiime-VirtualBox:~$ Is qiime_tutorial/
Is: command not found
qiime@qiime-VirtualBox:~$ 1s
1s: command not found

qiime@qiime-VirtualBox:~$ split_libraries.py -m qiime_tutorial/Fasting_Map.txt -f qiime_tutorial/Fasting_Example.fna -q qiime_tutorial/Fasting_Example.qual -o split_library_output

Jai Ram Rideout

Feb 22, 2013, 12:29:58 PM

to qiime...@googlegroups.com

Hi Bassam,

The 'ls' command is a lowercase L followed by a lowercase s. It looks like you were typing capital i as the first character. The 'ls' command lists the contents of a directory.

I highly recommend that you work through some beginning Unix/Linux command line tutorials to get comfortable with using the command line. The time you invest there will help you in the long run if you plan to continue using QIIME.

Here's a great interactive tutorial that will help you get started:

http://nixsrv.com/llthw

These additional tutorials may also be useful:

http://qiime.org/tutorials/unix_commands.html

http://www.linuxcommand.org/learning_the_shell.php

Hope this helps,

Jai
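
One way to check whether a command name is spelled the way the shell expects is 'command -v', which succeeds only when the shell can find the named command. Here is a small sketch comparing 'ls' with the capital-i spelling from the output above. It assumes no program named 'Is' happens to be installed.

```shell
# 'command -v' succeeds only when the shell can find the named command.
if command -v ls >/dev/null 2>&1; then
    ls_status="found"
else
    ls_status="not found"
fi

# Same check with a capital I in place of the lowercase L.
if command -v Is >/dev/null 2>&1; then
    typo_status="found"
else
    typo_status="not found"
fi

echo "ls: $ls_status"
echo "Is: $typo_status"
```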

Bassam Abomoelak

Feb 22, 2013, 12:43:10 PM

to qiime...@googlegroups.com

Thanks guys, you are so great

Bassam

Jai Ram Rideout

Feb 22, 2013, 2:09:21 PM

to qiime...@googlegroups.com

Glad to help! Please let us know if you are still stuck on the QIIME tutorial commands after working through the Unix/Linux tutorial.

-Jai

Bassam Abomoelak

Feb 22, 2013, 3:05:30 PM

to qiime...@googlegroups.com

Dear all
Here is what I got with the 3 commands.
Bassam
qiime@qiime-VirtualBox:~$ ls
core_set_aligned.fasta.imputed lanemask_in_1s_and_0s qiime_config_default
Desktop                         mapping_output         qiime_software
Documents                       Music                  qiime_tutorial-v1.5.0
Downloads                       Pictures               Templates
examples.desktop                Public                 Videos
qiime@qiime-VirtualBox:~$ ls qiime_tutorial/
ls: cannot access qiime_tutorial/: No such file or directory

qiime@qiime-VirtualBox:~$ ls qiime_tutorial/
ls: cannot access qiime_tutorial/: No such file or directory

qiime@qiime-VirtualBox:~$ split_libraries.py -m qiime_tutorial/Fasting_Map.txt -f qiime_tutorial/Fasting_Example.fna -q qiime_tutorial/Fasting_Example.qual -o split_library_output
Usage: split_libraries.py [options] {-m/--map MAP_FNAME -f/--fasta FASTA_FNAMES}

[] indicates optional input (order unimportant)
{} indicates required input (order unimportant)

Since newer sequencing technologies provide many reads per run (e.g. the 454 GS FLX Titanium series can produce 400-600 million base pairs with 400-500 base pair read lengths) researchers are now finding it useful to combine multiple samples into a single 454 run. This multiplexing is achieved through the application of a pyrosequencing-tailored nucleotide barcode design (described in (Parameswaran et al., 2007)). By assigning individual, unique sample specific barcodes, multiple sequencing runs may be performed in parallel and the resulting reads can later be binned according to sample. The script split_libraries.py performs this task, in addition to several quality filtering steps including user defined cut-offs for: sequence lengths; end-trimming; minimum quality score. To summarize, by using the fasta, mapping, and quality files, the program split_libraries.py will parse sequences that meet user defined quality thresholds and then rename each read with the appropriate Sample ID, thus formatting the sequence data for downstream analysis. If a combination of different sequencing technologies are used in any particular study, split_libraries.py can be used to perform the quality-filtering for each library individually and the output may then be combined.

Sequences from samples that are not found in the mapping file (no corresponding barcode) and sequences without the correct primer sequence will be excluded. Additional scripts can be used to exclude sequences that match a given reference sequence (e.g. the human genome; exclude_seqs_by_blast.py) and/or sequences that are flagged as chimeras (identify_chimeric_seqs.py).

Example usage:
Print help message and exit
split_libraries.py -h

Standard Example: Using a single 454 run, which contains a single FASTA, QUAL, and mapping file while using default parameters and outputting the data into the Directory "Split_Library_Output"
split_libraries.py -m Mapping_File.txt -f 1.TCA.454Reads.fna -q 1.TCA.454Reads.qual -o Split_Library_Output/

Multiple FASTA and QUAL Files Example: For the case where there are multiple FASTA and QUAL files, the user can run the following comma-separated command as long as there are not duplicate barcodes listed in the mapping file
split_libraries.py -m Mapping_File.txt -f 1.TCA.454Reads.fna,2.TCA.454Reads.fna -q 1.TCA.454Reads.qual,2.TCA.454Reads.qual -o Split_Library_Output_comma_separated/

Jai Ram Rideout

Feb 22, 2013, 3:09:21 PM

to qiime...@googlegroups.com

Hi Bassam,

The QIIME tutorial files will be in the qiime_tutorial-v1.5.0/ directory, so your split_libraries.py command should look like this:

split_libraries.py -m qiime_tutorial-v1.5.0/Fasting_Map.txt -f qiime_tutorial-v1.5.0/Fasting_Example.fna -q qiime_tutorial-v1.5.0/Fasting_Example.qual -o split_library_output

-Jai
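
When a directory name is only partly remembered, a wildcard can reveal the exact name. The sketch below recreates the situation in a temporary directory. The '*' matches whatever follows 'qiime_tutorial', including the version suffix.

```shell
# Temporary stand-in for the home directory shown in the 'ls' output.
home_dir="$(mktemp -d)"
mkdir "$home_dir/qiime_tutorial-v1.5.0"
cd "$home_dir"

# '-d' lists the matching directories themselves rather than their contents.
tutorial_dir="$(ls -d qiime_tutorial*/)"
echo "Tutorial files are in: $tutorial_dir"
```

Pressing the Tab key after typing the first few letters of a name has a similar effect in most terminals, since the shell completes the rest of the name when it is unambiguous.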

Bassam Abomoelak

Feb 24, 2013, 12:38:36 PM

to qiime...@googlegroups.com

Dear Jai
I tried to create the OTU for the tutorial and I got this message. Is the command for creating the OTU right? Thanks
qiime@qiime-VirtualBox:~$ pick_otus_through_otu_table.py -i qiime_tutorial-v1.5.0/split_library_output/seqs.fna -o otus
Usage: pick_otus_through_otu_table.py [options] {-i/--input_fp INPUT_FP -o/--output_dir OUTPUT_DIR}

[] indicates optional input (order unimportant)
{} indicates required input (order unimportant)

This script takes a sequence file and performs all processing steps through building the OTU table.

Example usage:
Print help message and exit

pick_otus_through_otu_table.py -h

Simple example: The following command will start an analysis on seqs.fna (-i), which is a post-split_libraries fasta file. The sequence identifiers in this file should be of the form <sample_id>_<unique_seq_id>. The following steps, corresponding to the preliminary data preparation, are applied: Pick de novo OTUs at 97%; pick a representative sequence for each OTU (the OTU centroid sequence); align the representative set with PyNAST; assign taxonomy with RDP classifier; filter the alignment prior to tree building - remove positions which are all gaps, and specified as 0 in the lanemask; build a phylogenetic tree with FastTree; build an OTU table. All output files will be written to the directory specified by -o, and subdirectories as appropriate. ALWAYS SPECIFY ABSOLUTE FILE PATHS (absolute path represented here as $PWD, but will generally look something like /home/ubuntu/my_analysis/).
pick_otus_through_otu_table.py -i $PWD/seqs.fna -o $PWD/otus/

pick_otus_through_otu_table.py: error: option -i: file does not exist: 'qiime_tutorial-v1.5.0/split_library_output/seqs.fna'
qiime@qiime-VirtualBox:~$ ls
Desktop    examples.desktop Pictures        qiime_tutorial-v1.5.0 Videos
Documents mapping_output    Public          split_library_output
Downloads Music             qiime_software Templates
qiime@qiime-VirtualBox:~$ pick_otus_through_otus_table.py -i qiime_tutorial-v1.5.0/split_library_output/seqs.fna -o qiime_tutorial-v1.5.0 otus
pick_otus_through_otus_table.py: command not found
qiime@qiime-VirtualBox:~$

Jose Navas

Feb 24, 2013, 12:52:04 PM

to qiime...@googlegroups.com

Hi Bassam,

Note that your split_library_output folder is not in the qiime_tutorial-v1.5.0 directory. From the output of your ls command, I can see that it is in your home directory. So your command should be:

pick_otus_through_otu_table.py -i split_library_output/seqs.fna -o otus

When a script gives you an error saying that one of your input files doesn't exist, use the ls command to find out where your input file is (note that your input files can be in a different location than in the tutorial).
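
The check described above can be sketched as a small shell function. This is only a sketch, assuming a bash-like shell; locate_input is a hypothetical helper name and not part of QIIME. It reports whether a path exists and, if it does not, lists files with the same name under a search directory (your home directory by default).

```shell
# locate_input is a hypothetical helper, not part of QIIME.
# Usage: locate_input PATH [SEARCH_ROOT]
locate_input() {
    path="$1"
    root="${2:-$HOME}"
    if [ -e "$path" ]; then
        echo "found: $path"
    else
        echo "missing: $path"
        # List every file with the same name under the search root.
        find "$root" -name "$(basename "$path")" 2>/dev/null
    fi
}

# Example call, using the path from the failed command above:
# locate_input qiime_tutorial-v1.5.0/split_library_output/seqs.fna
```

Any path it prints after the "missing" line is a candidate to pass to -i instead.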

2013/2/24 Bassam Abomoelak <babom...@gmail.com>

--
Jose Navas

Bassam Abomoelak

Feb 24, 2013, 1:55:29 PM

to qiime...@googlegroups.com

Jai
Thanks a lot. It worked and I created the OTUs. In the otus directory, the rep_set has the two files as the tutorial says, but the rdp_assigned_taxonomy directory is empty although it's supposed to have two files (log and text files). Any suggestions? Again, thanks
Bassam

Laura Wegener Parfrey

Feb 25, 2013, 10:43:55 AM

to qiime...@googlegroups.com

Hi Bassam,
Can you post the error message you got? It is likely that RDP failed because not enough memory was allocated to it.
You can run RDP separately with the assign_taxonomy.py script. Increase the memory with the option --rdp_max_memory 4000
(you can vary this number to change the amount of memory allocated).

Then you can add the taxonomy to your otu table with the add_metadata.py script.
See: http://biom-format.org/documentation/adding_metadata.html
example (but make sure to use FULL PATHS for all commands):
add_metadata.py -i otu_table.biom -o otu_table_with_tax.biom --observation_mapping_fp taxonomy_map_from_RDP.txt --observation_header OTUID,taxonomy
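
A minimal sketch of one way to get full paths without typing them by hand, assuming a bash-like shell. abs_path is a hypothetical helper name, not a QIIME command, and the rep_set file name in the commented example is an assumption; use whatever ls shows in your own otus/rep_set directory.

```shell
# abs_path is a hypothetical helper: print the absolute form of a path
# whose parent directory already exists.
abs_path() {
    dir=$(cd "$(dirname "$1")" && pwd) || return 1
    echo "$dir/$(basename "$1")"
}

# Example (the input file name is an assumption; adjust to your own output):
# assign_taxonomy.py -i "$(abs_path otus/rep_set/seqs_rep_set.fasta)" \
#     -m rdp --rdp_max_memory 4000
```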

Laura

Laura Wegener Parfrey
Postdoctoral Research Associate
University of Colorado
Boulder, CO 80309

Bassam Abomoelak

Feb 25, 2013, 10:56:59 AM

to qiime...@googlegroups.com

Dear Laura
qiime@qiime-VirtualBox:~$ assign_taxonomy.py
Usage: assign_taxonomy.py [options] {-i/--input_fasta_fp INPUT_FASTA_FP}

[] indicates optional input (order unimportant)
{} indicates required input (order unimportant)

Contains code for assigning taxonomy, using several techniques.

Given a set of sequences, assign_taxonomy.py attempts to assign the taxonomy of each sequence. Currently there are three methods implemented: assignment with BLAST, assignment with the RDP classifier, and assignment with the RTAX classifier. The output of this step is a mapping of input sequence identifiers (1st column of output file) to taxonomy (2nd column) and quality score (3rd column). The sequence identifier of the best BLAST hit is also included if the blast method is used (4th column).

Example reference data sets and id_to_taxonomy maps can be found in the Greengenes OTUs. To get the latest build of those click the "Most recent Greengenes OTUs" link on the top right of http://blog.qiime.org. After downloading and unzipping you can use the following following files as -r and -t. As of this writing the latest build was gg_otus_4feb2011, but that portion of path to these files will change with future builds. Modify these paths accordining when calling assign_taxonomy.py.

-r gg_otus_4feb2011/rep_set/gg_97_otus_4feb2011.fasta
-t gg_otus_4feb2011/taxonomies/greengenes_tax_rdp_train.txt (best for retraining the RDP classifier)
-t gg_otus_4feb2011/taxonomies/greengenes_tax.txt (best for BLAST taxonomy assignment)

Example usage:
Print help message and exit

assign_taxonomy.py -h

Sample Assignment with BLAST: Taxonomy assignments are made by searching input sequences against a blast database of pre-assigned reference sequences. If a satisfactory match is found, the reference assignment is given to the input sequence. This method does not take the hierarchical structure of the taxonomy into account, but it is very fast and flexible. If a file of reference sequences is provided, a temporary blast database is built on-the-fly. The quality scores assigned by the BLAST taxonomy assigner are e-values.

To assign the sequences to the representative sequence set, using a reference set of sequences and a taxonomy to id assignment text file, where the results are output to default directory "blast_assigned_taxonomy", you can run the following command
assign_taxonomy.py -i repr_set_seqs.fasta -r ref_seq_set.fna -t id_to_taxonomy.txt

Optionally, the user could changed the E-value ("-e"), using the following command
assign_taxonomy.py -i repr_set_seqs.fasta -r ref_seq_set.fna -t id_to_taxonomy.txt -e 0.01

Assignment with the RDP Classifier: The RDP Classifier program (Wang, Garrity, Tiedje, & Cole, 2007) assigns taxonomies by matching sequence segments of length 8 to a database of previously assigned sequences. It uses a naive bayesian algorithm, which means that for each potential assignment, it attempts to calculate the probability of the observed matches, assuming that the assignment is correct and that the sequence segments are completely independent. The RDP Classifier is distributed with a pre-built database of assigned sequence, which is used by default. The quality scores provided by the RDP classifier are confidence values.

Note: If a reference set of sequences and taxonomy to id assignment file are provided, the script will use them to generate a new training dataset for the RDP Classifier on-the-fly. Because of the RDP Classifier's implementation, all lineages in the training dataset must contain the same number of ranks.

To assign the representative sequence set, where the output directory is "rdp_assigned_taxonomy", you can run the following command
assign_taxonomy.py -i repr_set_seqs.fasta -m rdp

Alternatively, the user could change the minimum confidence score ("-c"), using the following command
assign_taxonomy.py -i repr_set_seqs.fasta -m rdp -c 0.85

Sample Assignment with RTAX: Taxonomy assignments are made by searching input sequences against a fasta database of pre-assigned reference sequences. All matches are collected which match the query within 0.5% identity of the best match. A taxonomy assignment is made to the lowest rank at which more than half of these hits agree. Note that both unclustered read fasta files are required as inputs in addition to the representative sequence file.

To make taxonomic classifications of the representative sequences, using a reference set of sequences and a taxonomy to id assignment text file, where the results are output to default directory "rtax_assigned_taxonomy", you can run the following command
assign_taxonomy.py -i rtax_repr_set_seqs.fasta -m rtax --read_1_seqs_fp read_1.seqs.fna --read_2_seqs_fp read_2.seqs.fna -r rtax_ref_seq_set.fna -t rtax_id_to_taxonomy.txt

Sample Assignment with Mothur: The Mothur software provides a naive bayes classifier similar to the RDP Classifier. A set of training sequences and id-to-taxonomy assignments must be provided. Unlike the RDP Classifier, sequences in the training set may be assigned at any level of the taxonomy.

To make taxonomic classifications of the representative sequences, where the results are output to default directory "mothur_assigned_taxonomy", you can run the following command
assign_taxonomy.py -i mothur_repr_set_seqs.fasta -m mothur -r mothur_ref_seq_set.fna -t mothur_id_to_taxonomy.txt

qiime@qiime-VirtualBox:~$

Jai Ram Rideout

Feb 25, 2013, 11:07:39 AM

to qiime...@googlegroups.com

Hi Bassam,

Additionally, can you please send the log file as an attachment? It will be under the otus/ directory and will be named something like log_20130225084020.txt (the numbers will be different for you though).
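
A minimal sketch for picking out that log file, assuming the names follow the log_YYYYMMDDHHMMSS.txt pattern described above, so that the name that sorts last belongs to the most recent run. newest_log is a hypothetical helper name, not a QIIME command.

```shell
# newest_log is a hypothetical helper: print the log file whose
# timestamped name sorts last in the given directory.
newest_log() {
    ls "$1"/log_*.txt 2>/dev/null | sort | tail -n 1
}

# Example: newest_log otus
```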

Can you also send the output of running the following command:

print_qiime_config.py -t

Thanks,

Jai

Feb 25, 2013, 3:19:05 PM

to qiime...@googlegroups.com

Dear Jai
I tried to make the OTU network for the tutorial. Here is what I got. Thanks
qiime@qiime-VirtualBox:~$ make_otu_network.py -m mapping.txt -i otus/otu_table.biom -o otus/OTU_Network
Usage: make_otu_network.py [options] {-i/--input_fp INPUT_FP -m/--map_fname MAP_FNAME -o/--output_dir OUTPUT_DIR}

[] indicates optional input (order unimportant)
{} indicates required input (order unimportant)

This script generates the otu network files to be passed into cytoscape and statistics for those networks. It uses the OTU fileand the user metadata mapping file.

Network-based analysis is used to display and analyze how OTUs are partitioned between samples. This is a powerful way to display visually large and highly complex datasets in such a way that similarities and differences between samples are emphasized. The visual output of this analysis is a clustering of samples according to their shared OTUs - samples that share more OTUs cluster closer together. The degree to which samples cluster is based on the number of OTUs shared between samples (when OTUs are found in more than one sample) and this is weighted according to the number of sequences within an OTU. In the network diagram, there are two kinds of "nodes" represented, OTU-nodes and sample-nodes. These are shown with symbols such as filled circles and filled squares. If an OTU is found within a sample, the two nodes are connected with a line (an "edge"). (OTUs found only in one sample are given a second, distinct OTU-node shape.) The nodes and edges can then be colored to emphasize certain aspects of the data. For instance, in the initial application of this analysis in a microbial ecology study, the gut bacteria of a variety of mammals was surveyed, and the network diagrams were colored according to the diets of the animals, which highlighted the clustering of hosts by diet category (herbivores, carnivores, omnivores). In a meta-analysis of bacterial surveys across habitat types, the networks were colored in such a way that the phylogenetic classification of the OTUs was highlighted: this revealed the dominance of shared Firmicutes in vertebrate gut samples versus a much higher diversity of phyla represented amongst the OTUs shared by environmental samples.

Not just pretty pictures: the connections within the network are analyzed statistically to provide support for the clustering patterns displayed in the network. A G-test for independence is used to test whether sample-nodes within categories (such as diet group for the animal example used above) are more connected within than a group than expected by chance. Each pair of samples is classified according to whether its members shared at least one OTU, and whether they share a category. Pairs are then tested for independence in these categories (this asks whether pairs that share a category also are equally likely to share an OTU). This statistical test can also provide support for an apparent lack of clustering when it appears that a parameter is not contributing to the clustering.

This OTU-based approach to comparisons between samples provides a counterpoint to the tree-based PCoA graphs derived from the UniFrac analyses. In most studies, the two approaches reveal the same patterns. They can reveal different aspects of the data, however. The network analysis can provide phylogenetic information in a visual manner, whereas PCoA-UniFrac clustering can reveal subclusters that may be obscured in the network. The PCs can be pulled out individually and regressed against other metadata; the network analysis can provide a visual display of shared versus unique OTUs. Thus, together these tools can be used to draw attention to disparate aspects of a dataset, as desired by the author.

In more technical language: OTUs and samples are designated as two types of nodes in a bipartite network in which OTU-nodes are connected via edges to sample-nodes in which their sequences are found. Edge weights are defined as the number of sequences in an OTU. To cluster the OTUs and samples in the network, a stochastic spring-embedded algorithm is used, where nodes act like physical objects that repel each other, and connections act a springs with a spring constant and a resting length: the nodes are organized in a way that minimized forces in the network. These algorithms are implemented in Cytoscape (Shannon et al., 2003).

Example usage:
Print help message and exit

make_otu_network.py -h

Example: Create network cytoscape and statistic files in a user-specified output directory. This example uses an OTU table (-i) and the metadata mapping file (-m), and the results are written to the "otu_network/" folder.
make_otu_network.py -i otu_table.biom -m Fasting_Map.txt -o otu_network

make_otu_network.py: error: option -m: file does not exist: 'mapping.txt'
qiime@qiime-VirtualBox:~$

Bassam

Jai Ram Rideout

Feb 25, 2013, 3:30:50 PM

to qiime...@googlegroups.com

Hi Bassam,

You need to specify the path to your mapping file with the -m option. As Jose previously mentioned:

> When a script gives you an error saying that one of your input files doesn't exist, use the ls command to find out where your input file is (note that your input files can be in a different location than in the tutorial).

-Jai

Bassam Abomoelak

Feb 27, 2013, 4:11:14 PM

to qiime...@googlegroups.com

Dear Jai

I finished the tutorial and I want to load my samples onto the VM. Should I load the seq.fna, qual, and mapping files from a USB drive? I really appreciate your guidance through this process. Thanks guys, you are doing a great job.

Bassam

Jose Navas

Feb 27, 2013, 4:41:14 PM

to qiime...@googlegroups.com

Hi Bassam,

On the VM's Desktop there is a folder called 'Before you start'. In this folder there is a document called '4.Transferring_files_to_your_virtual_box'. Follow the instructions in that document to get your own files into your VM.

Hope this helps,

2013/2/27 Bassam Abomoelak <babom...@gmail.com>

--
Jose Navas

Bassam Abomoelak

Feb 28, 2013, 7:39:30 AM

to qiime...@googlegroups.com

Dear all

I have a question about running an analysis in a new VM. If I need to upload my samples for analysis, should I delete the old VM that I used to analyse the tutorial files? Thanks

Bassam

Bassam Abomoelak

Feb 28, 2013, 9:09:21 AM

to qiime...@googlegroups.com

Dear Laura
I tried to split the library of my 454 samples and here is what I got. Thanks
Bassam

qiime@qiime-VirtualBox:~$ split_libraries.py -m NR2.1.TCA.454Reads.fna -q NR2.1.TCA.454Reads.qual -o split_library_output

split_libraries.py: error: Required option --fasta_fnames omitted.
qiime@qiime-VirtualBox:~$ split_libraries.py -m NR2.1.TCA.454Reads.fna -q NR2.1.TCA.454Reads.qual -o split_library_output

split_libraries.py: error: Required option --fasta_fnames omitted.
qiime@qiime-VirtualBox:~$

Bassam Abomoelak

Feb 28, 2013, 9:39:23 AM

to qiime...@googlegroups.com

Dear Laura
Please ignore my first email. I tried to split the library and my folder was empty. Here is the error. Any suggestions? Thanks
Bassam

qiime@qiime-VirtualBox:~$ Split_libraries.py -m NR2_mappingfile.txt -f NR2.1.TCA.454Reads.fna -q NR2.1.TCA.454Reads.qual -o split_library_output
Split_libraries.py: command not found
qiime@qiime-VirtualBox:~$ split_libraries.py -m NR2_Mappingfile_corrected.txt -f NR2.1.TCA.454Reads.fna -q NR2.1.TCA.454Reads.qual -o split_library_output

Traceback (most recent call last):

File "/home/qiime/qiime_software/qiime-1.6.0-release/bin/split_libraries.py", line 286, in <module>
    main()
File "/home/qiime/qiime_software/qiime-1.6.0-release/bin/split_libraries.py", line 283, in main
    truncate_ambi_bases = opts.truncate_ambi_bases)
File "/home/qiime/qiime_software/qiime-1.6.0-release/lib/qiime/split_libraries.py", line 1357, in preprocess
    'length of the barcode used. E.g. -b 4 for 4 base pair barcodes.')
ValueError: Barcode length detected in the mapping file, 10 does not match specified barcode length, 12. To specify a barcode length use -b golay_12 or -b hamming_8 for 12 and 8 base pair golay or hamming codes respectively, or -b # where # is the length of the barcode used. E.g. -b 4 for 4 base pair barcodes.
qiime@qiime-VirtualBox:~$ split_libraries.py -m NR2_Mappingfile_corrected.txt -f NR2.1.TCA.454Reads.fna -q NR2.1.TCA.454Reads.qual -o split_library_output

Traceback (most recent call last):

File "/home/qiime/qiime_software/qiime-1.6.0-release/bin/split_libraries.py", line 286, in <module>
    main()
File "/home/qiime/qiime_software/qiime-1.6.0-release/bin/split_libraries.py", line 283, in main
    truncate_ambi_bases = opts.truncate_ambi_bases)
File "/home/qiime/qiime_software/qiime-1.6.0-release/lib/qiime/split_libraries.py", line 1357, in preprocess
    'length of the barcode used. E.g. -b 4 for 4 base pair barcodes.')
ValueError: Barcode length detected in the mapping file, 10 does not match specified barcode length, 12. To specify a barcode length use -b golay_12 or -b hamming_8 for 12 and 8 base pair golay or hamming codes respectively, or -b # where # is the length of the barcode used. E.g. -b 4 for 4 base pair barcodes.
qiime@qiime-VirtualBox:~$ split_libraries.py -m Mapping_File_8bp_barcodes.txt -f 1.TCA.454Reads.fna -q 1.TCA.454Reads.qual -o split_Library_output_8bp/ -b 8

split_libraries.py: error: option -m: file does not exist: 'Mapping_File_8bp_barcodes.txt'
qiime@qiime-VirtualBox:~$

Laura Wegener Parfrey

Feb 28, 2013, 10:06:05 AM

to qiime...@googlegroups.com

Bassam,
You need to be very careful about typing in the commands correctly. The problem is that you are calling Split_libraries.py, which does not exist; you need split_libraries.py. The previous problem was that the options were specified incorrectly (sequences passed with -m instead of -f).

The error messages are informative; use them to diagnose your problems and then try again. Please try several things before posting to the forum.

In general you should be using the autocomplete function. Start typing a command or file and then hit the tab key. If you are typing it correctly and the file exists it should autofill.
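
A minimal sketch of checking a command name before running it, assuming a bash-like shell on a case-sensitive Linux system such as the VM. check_command is a hypothetical helper name, not a QIIME command; it relies only on the shell's built-in command -v.

```shell
# check_command is a hypothetical helper: report whether a command name,
# exactly as typed, can be found on the PATH.
check_command() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok: $1"
    else
        echo "not found: $1 (check spelling and capitalization)"
    fi
}

check_command ls   # the lowercase name exists
check_command Ls   # differs only in case, so it is a different name
```

The same check applies to split_libraries.py versus Split_libraries.py.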

Laura

--

---
You received this message because you are subscribed to the Google Groups "Qiime Forum" group.
To unsubscribe from this group and stop receiving emails from it, send an email to qiime-forum...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Bassam Abomoelak

Feb 28, 2013, 10:34:20 AM

to qiime...@googlegroups.com

Dear Laura

You are great. I appreciate your dedicated patience with me. Thanks

Bassam

Bassam Abomoelak

Mar 6, 2013, 9:47:49 AM

to qiime...@googlegroups.com

Dear Laura

If I need to remove one sample from my analysis, should I design a new mapping file and start the whole analysis again? I guess there is a command for this step. Thanks for the help

Bassam

Laura Wegener Parfrey

Mar 6, 2013, 9:57:40 AM

to qiime...@googlegroups.com

Hi Bassam,
You do not need to start over.
You can use the command filter_samples_from_otu_table.py

Best,
Laura

Bassam Abomoelak

Mar 6, 2013, 1:44:17 PM

to qiime...@googlegroups.com

Dear Laura

Is this the right command to use to filter the sample, and should the sample ID go in the file passed to --sample_id_fp? Will filtering the sample from the BIOM table affect the downstream analysis, given that the original mapping file will still be there? Thanks for the guidance.

Bassam

filter_samples_from_otu_table.py -i otu_table.biom -o otu_table_samples_to_keep.biom --sample_id_fp samples_to_keep.txt

Laura Wegener Parfrey

Mar 6, 2013, 2:02:04 PM

to qiime...@googlegroups.com

Hi Bassam,
This is the correct command as long as samples_to_keep.txt is a list of the sample IDs that you want to keep. It will not change the mapping file. You could change the mapping file by just deleting the row(s) in the mapping file in Excel and saving under a new name. Likely not necessary, though.
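
A minimal sketch of building samples_to_keep.txt from the mapping file, assuming the mapping file is tab-separated with sample IDs in the first column and header or comment lines starting with #. make_keep_list is a hypothetical helper name, not a QIIME script, and the sample ID in the commented example is a placeholder.

```shell
# make_keep_list is a hypothetical helper, not a QIIME script.
# Usage: make_keep_list MAPPING_FILE SAMPLE_ID_TO_DROP > samples_to_keep.txt
make_keep_list() {
    # Skip header/comment lines and blank IDs, then print every sample ID
    # in the first column except the one to drop.
    awk -F'\t' -v drop="$2" '!/^#/ && $1 != "" && $1 != drop { print $1 }' "$1"
}

# Example (SampleX is a placeholder ID):
# make_keep_list NR2_Mappingfile_corrected.txt SampleX > samples_to_keep.txt
```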
Laura

Bassam Abomoelak

Mar 7, 2013, 10:27:12 AM

to qiime...@googlegroups.com

Dear Laura
Alpha diversity failed after I removed one sample from the mapping file and saved it under a new name. Any suggestions?

qiime@qiime-VirtualBox:~$ echo "alpha_diversity:metrics shannon,PD_whole_tree,choa1,observed_species" > alpha_params.txt
qiime@qiime-VirtualBox:~$ alpha_rarefaction.py -i otus/otu_table.biom -m NR3_Mappingfile.txt -o wf_arare/ -p alpha_params.txt -t otus/rep_set.tre

Traceback (most recent call last):

File "/home/qiime/qiime_software/qiime-1.6.0-release/bin/alpha_rarefaction.py", line 147, in <module>
    main()
File "/home/qiime/qiime_software/qiime-1.6.0-release/bin/alpha_rarefaction.py", line 144, in main
    status_update_callback=status_update_callback)
File "/home/qiime/qiime_software/qiime-1.6.0-release/lib/qiime/workflow.py", line 1011, in run_qiime_alpha_rarefaction
    close_logger_on_success=close_logger_on_success)
File "/home/qiime/qiime_software/qiime-1.6.0-release/lib/qiime/workflow.py", line 135, in call_commands_serially
    raise WorkflowError, msg
qiime.workflow.WorkflowError:

*** ERROR RAISED DURING STEP: Alpha diversity on rarefied OTU tables
Command run was:
/home/qiime/qiime_software/python-2.7.3-release/bin/python /home/qiime/qiime_software/qiime-1.6.0-release/bin/alpha_diversity.py -i wf_arare//rarefaction/ -o wf_arare//alpha_div/ --metrics shannon,PD_whole_tree,choa1,observed_species -t otus/rep_set.tre
Command returned exit status: 1
Stdout:

Stderr

Traceback (most recent call last):

Mar 9, 2013, 1:29:22 PM

to qiime...@googlegroups.com

Hi Bassam,
Yes this is possible. See: http://biom-format.org/documentation/biom_conversion.html
You will likely want to use a version of this command:

convert_biom.py -i otu_table.taxonomy.biom -o otu_table.txt -b --header_key taxonomy --biom_table_type="otu table" --process_obs_metadata taxonomy

One way to find answers more quickly is to search the QIIME forum to see if someone else has already asked this question. You can do this either by searching in Google and adding "QIIME forum" to your query, or within the forum itself.
Best,
Laura

Bassam

Bassam Abomoelak

Mar 22, 2013, 4:20:42 PM

to qiime...@googlegroups.com

Dear Laura

I need to download some sff files from the NIH website for analysis. I should convert them to fna and qual and merge them into one file. Is that right? Another question: in this case, should I also design my own mapping file? Thanks for the support.

Apr 4, 2013, 12:15:50 PM

to qiime...@googlegroups.com

Hi,
I do not know. You will need to explore the data and figure out the best options yourself.

Laura

Bassam Abomoelak

Apr 8, 2013, 1:13:16 PM

to qiime...@googlegroups.com

Dear Laura

I downloaded 11 SFF files from the NIH website that contain my data for analysis. I converted all of them into 11 fna and qual files with process_sff.py. Should I combine all of them into one fna file and one qual file before splitting the library, and if so, how? I understand I will need to generate a corresponding mapping file. Thanks in advance

Bassam

Bassam Abomoelak

Apr 10, 2013, 10:45:21 AM

to qiime...@googlegroups.com

Dear Laura

I downloaded 11 SFF files from the NIH website that contain my data for analysis. I converted all of them into 11 fna and qual files with process_sff.py. I designed a mapping file according to the QIIME format and it passed the check without error. When I split the libraries for each file, it gives the 3 required files in the designated output directory, but the sequences are empty (I mean zero). The barcodes were of variable lengths, so I used -b variable_length. Any suggestions? Thanks

Bassam

Char Oskam

Apr 16, 2014, 5:31:33 AM

to qiime...@googlegroups.com

Hi there,

I'm also getting:

qiime@qiime-VirtualBox:~$ pick_otus_through_otu_table.py -i $HOME/qiime_tutorial-v1.5.0/split_library_output/seqs.fna -o $HOME/qiime_tutorial-v1.5.0/split_library/otus

pick_otus_through_otu_table.py: command not found

qiime@qiime-VirtualBox:~$

Here is my configuration:

qiime@qiime-VirtualBox:~$ print_qiime_config.py -t

System information

==================

Platform: linux2

Python version: 2.7.3 (default, Dec 19 2013, 03:13:59) [GCC 4.6.3]

Python executable: /home/qiime/qiime_software/python-2.7.3-release/bin/python

Dependency versions

===================

PyCogent version: 1.5.3

NumPy version: 1.7.1

matplotlib version: 1.3.1

biom-format version: 1.3.1

qcli version: 0.1.0

QIIME library version: 1.8.0

QIIME script version: 1.8.0

PyNAST version (if installed): 1.2.2

Emperor version: 0.9.3

RDP Classifier version (if installed): rdp_classifier-2.2.jar

Java version (if installed): 1.6.0_30

QIIME config values

===================

blastmat_dir: /home/qiime/qiime_software/blast-2.2.22-release/data

sc_queue: all.q

topiaryexplorer_project_dir: None

pynast_template_alignment_fp: /home/qiime/qiime_software/core_set_aligned.fasta.imputed

cluster_jobs_fp: /home/qiime/qiime_software/qiime-1.8.0-release/bin/start_parallel_jobs.py

pynast_template_alignment_blastdb: None

assign_taxonomy_reference_seqs_fp: /home/qiime/qiime_software/gg_otus-13_8-release/rep_set/97_otus.fasta

torque_queue: friendlyq

template_alignment_lanemask_fp: /home/qiime/qiime_software/lanemask_in_1s_and_0s

jobs_to_start: 1

cloud_environment: False

qiime_scripts_dir: /home/qiime/qiime_software/qiime-1.8.0-release/bin

denoiser_min_per_core: 50

working_dir: /tmp/

python_exe_fp: /home/qiime/qiime_software/python-2.7.3-release/bin/python

temp_dir: /tmp/

blastall_fp: /home/qiime/qiime_software/blast-2.2.22-release/bin/blastall

seconds_to_sleep: 60

assign_taxonomy_id_to_taxonomy_fp: /home/qiime/qiime_software/gg_otus-13_8-release/taxonomy/97_otu_taxonomy.txt

...................................

----------------------------------------------------------------------

Ran 35 tests in 0.246s

OK

and when I do

qiime@qiime-VirtualBox:~$ ls pick_otus_through_otu_table.py

ls: cannot access pick_otus_through_otu_table.py: No such file or directory

qiime@qiime-VirtualBox:~$ ls pick_otus_through_otus_table.py

ls: cannot access pick_otus_through_otus_table.py: No such file or directory

qiime@qiime-VirtualBox:~$

Any ideas?

Char

On Fri, Feb 22, 2013 at 1:05 PM, Bassam Abomoelak <babom...@gmail.com> wrote:

Dear all
Here what I got with the 3 commands.
bassam
qiime@qiime-VirtualBox:~$ ls
core_set_aligned.fasta.imputed lanemask_in_1s_and_0s qiime_config_default
Desktop                         mapping_output         qiime_software
Documents                       Music                  qiime_tutorial-v1.5.0
Downloads                       Pictures               Templates
examples.desktop                Public                 Videos
qiime@qiime-VirtualBox:~$ ls qiime_tutorial/
ls: cannot access qiime_tutorial/: No such file or directory

qiime@qiime-VirtualBox:~$ split_libraries.py -m qiime_tutorial/Fasting_Map.txt -f qiime_tutorial/Fasting_Example.fna -q qiime_tutorial/Fasting_Example.qual -o split_library_output

split_libraries.py: error: option -m: file does not exist: 'qiime_tutorial/Fasting_Map.txt'

qiime@qiime-VirtualBox:~$ ls qiime_tutorial/
ls: cannot access qiime_tutorial/: No such file or directory

qiime@qiime-VirtualBox:~$ split_libraries.py -m qiime_tutorial/Fasting_Map.txt -f qiime_tutorial/Fasting_Example.fna -q qiime_tutorial/Fasting_Example.qual -o split_library_output

Usage: split_libraries.py [options] {-m/--map MAP_FNAME -f/--fasta FASTA_FNAMES}

[] indicates optional input (order unimportant)
{} indicates required input (order unimportant)

Since newer sequencing technologies provide many reads per run (e.g. the 454 GS FLX Titanium series can produce 400-600 million base pairs with 400-500 base pair read lengths) researchers are now finding it useful to combine multiple samples into a single 454 run. This multiplexing is achieved through the application of a pyrosequencing-tailored nucleotide barcode design (described in (Parameswaran et al., 2007)). By assigning individual, unique sample specific barcodes, multiple sequencing runs may be performed in parallel and the resulting reads can later be binned according to sample. The script split_libraries.py performs this task, in addition to several quality filtering steps including user defined cut-offs for: sequence lengths; end-trimming; minimum quality score. To summarize, by using the fasta, mapping, and quality files, the program split_libraries.py will parse sequences that meet user defined quality thresholds and then rename each read with the appropriate Sample ID, thus formatting the sequence data for downstream analysis. If a combination of different sequencing technologies are used in any particular study, split_libraries.py can be used to perform the quality-filtering for each library individually and the output may then be combined.

Sequences from samples that are not found in the mapping file (no corresponding barcode) and sequences without the correct primer sequence will be excluded. Additional scripts can be used to exclude sequences that match a given reference sequence (e.g. the human genome; exclude_seqs_by_blast.py) and/or sequences that are flagged as chimeras (identify_chimeric_seqs.py).

Example usage:
Print help message and exit
split_libraries.py -h

Standard Example: Using a single 454 run, which contains a single FASTA, QUAL, and mapping file while using default parameters and outputting the data into the Directory "Split_Library_Output"
split_libraries.py -m Mapping_File.txt -f 1.TCA.454Reads.fna -q 1.TCA.454Reads.qual -o Split_Library_Output/

Multiple FASTA and QUAL Files Example: For the case where there are multiple FASTA and QUAL files, the user can run the following comma-separated command as long as there are not duplicate barcodes listed in the mapping file
split_libraries.py -m Mapping_File.txt -f 1.TCA.454Reads.fna,2.TCA.454Reads.fna -q 1.TCA.454Reads.qual,2.TCA.454Reads.qual -o Split_Library_Output_comma_separated/

On Fri, Feb 22, 2013 at 12:09 PM, Jai Ram Rideout <jai.r...@gmail.com> wrote:

Glad to help! Please let us know if you are still stuck on the QIIME tutorial commands after working through the Unix/Linux tutorial.

-Jai

On Fri, Feb 22, 2013 at 10:43 AM, Bassam Abomoelak <babom...@gmail.com> wrote:

Thanks guys, you are so great
Bassam

On Fri, Feb 22, 2013 at 12:29 PM, Jai Ram Rideout <jai.r...@gmail.com> wrote:

Hi Bassam,

The 'ls' command is a lowercase L followed by a lowercase s. It looks like you were typing capital i as the first character. The 'ls' command lists the contents of a directory.
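
As a small illustration (a sketch only; the directory and file names below are made up for the demo and are created in a throwaway temporary directory), this shows 'ls' listing a directory, and uses the shell's 'command -v' to check whether a command name can be found before you try to run it:

```shell
# Create a throwaway directory with two empty files (demo names only).
demo=$(mktemp -d)
touch "$demo/Fasting_Map.txt" "$demo/Fasting_Example.fna"

# 'ls' is a lowercase L followed by a lowercase s.
listing=$(ls "$demo")
echo "$listing"

# 'command -v' succeeds only if the shell can find a command with that name.
if command -v ls >/dev/null 2>&1; then
    ls_status="found"
else
    ls_status="not found"
fi
echo "ls: $ls_status"

# Remove the throwaway directory again.
rm -rf "$demo"
```

If 'command -v' prints nothing for a name you typed, the shell will answer "command not found" when you try to run it, which usually points to a typo.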

I highly recommend that you work through some beginning Unix/Linux command line tutorials to get comfortable with using the command line. The time you invest there will help you in the long run if you plan to continue using QIIME.

Here's a great interactive tutorial that will help you get started:

http://nixsrv.com/llthw

These additional tutorials may also be useful:

http://qiime.org/tutorials/unix_commands.html
http://www.linuxcommand.org/learning_the_shell.php

Hope this helps,
Jai

On Fri, Feb 22, 2013 at 10:20 AM, Bassam Abomoelak <babom...@gmail.com> wrote:

Dear Jai
here are the 3 commands

qiime@qiime-VirtualBox:~$ Is

Is: command not found
qiime@qiime-VirtualBox:~$ is
is: command not found
qiime@qiime-VirtualBox:~$ Is qiime_tutorial/
Is: command not found
qiime@qiime-VirtualBox:~$ 1s
1s: command not found

qiime@qiime-VirtualBox:~$ split_libraries.py -m qiime_tutorial/Fasting_Map.txt -f qiime_tutorial/Fasting_Example.fna -q qiime_tutorial/Fasting_Example.qual -o split_library_output

-Tony

Bassam

qiime@qiime-VirtualBox:~$ split_libraries.py -m qiime_tutorial/Fasting_Map.txt -f Fasting_Example.fna -q Fasting_Example.qual -0 split_library_output

split_libraries.py: error: option -m: file does not exist: 'qiime_tutorial/Fasting_Map.txt'
qiime@qiime-VirtualBox:~$

On Thu, Feb 21, 2013 at 1:18 PM, Jai Ram Rideout <jai.r...@gmail.com> wrote:

Hi Bassam,

For that command to work, you either need to be in the same directory as your Fasting_Map.txt file, or you need to change the filepath given to the -m option so that it points to your Fasting_Map.txt file relative to your current directory.

For example, it looks like you are in your home directory, /home/ubuntu/. If Fasting_Map.txt is in /home/ubuntu/qiime_tutorial/, you could run the following command:

check_id_map.py -m qiime_tutorial/Fasting_Map.txt -o mapping_output -v

You will need to modify the path to point to wherever your Fasting_Map.txt file is.
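
To make the idea of a relative path concrete, here is a minimal sketch. It assumes nothing about your real files: it builds a throwaway directory (the "sandbox" name is just for this demo) containing qiime_tutorial/Fasting_Map.txt, then tests which paths resolve from which directory:

```shell
# Build a throwaway stand-in for a home directory (demo only).
sandbox=$(mktemp -d)
mkdir "$sandbox/qiime_tutorial"
touch "$sandbox/qiime_tutorial/Fasting_Map.txt"

cd "$sandbox"
# From the parent directory, the bare file name does not resolve...
if [ -e Fasting_Map.txt ]; then in_parent=found; else in_parent=missing; fi
# ...but the path that includes the subdirectory does.
if [ -e qiime_tutorial/Fasting_Map.txt ]; then with_subdir=found; else with_subdir=missing; fi

cd qiime_tutorial
# After changing into the subdirectory, the bare file name resolves.
if [ -e Fasting_Map.txt ]; then after_cd=found; else after_cd=missing; fi

echo "bare name from parent directory: $in_parent"
echo "subdirectory path from parent:   $with_subdir"
echo "bare name after cd:              $after_cd"

# Clean up the throwaway directory.
cd /
rm -rf "$sandbox"
```

The same rule applies to every file you pass to a QIIME script: the path is interpreted relative to the directory shown by 'pwd', so running 'pwd' and 'ls' first is a quick way to see what the script will be able to find.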

Hope this helps,
Jai

On Thu, Feb 21, 2013 at 1:06 PM, Bassam Abomoelak <babom...@gmail.com> wrote:

Dear Jose
This is what I got with the map check command. Can you tell me whether I'm following the steps correctly? I'm still in the tutorial. Thanks for your great patience with me.
Bassam

qiime@qiime-VirtualBox:~$ check_id_map.py -m Fasting_Map.txt -o mapping_output -v
Usage: check_id_map.py [options] {-m/--mapping_fp MAPPING_FP}

[] indicates optional input (order unimportant)
{} indicates required input (order unimportant)

Example usage:
Print help message and exit

check_id_map.py -h

Example: Check the Fasting_Map.txt mapping file for problems, supplying the required mapping file, and output the results in the check_id_map_output directory
check_id_map.py -m Fasting_Map.txt -o check_id_map_output

check_id_map.py: error: option -m: file does not exist: 'Fasting_Map.txt'
qiime@qiime-VirtualBox:~$

Kyle Bittinger

Apr 16, 2014, 9:59:58 AM

to qiime...@googlegroups.com

pick_otus_through_otu_table.py has been renamed. Depending on your method of OTU picking, use one of these three scripts instead:

http://qiime.org/scripts/pick_closed_reference_otus.html

http://qiime.org/scripts/pick_de_novo_otus.html

http://qiime.org/scripts/pick_open_reference_otus.html

A discussion of each method can be found at:

http://qiime.org/tutorials/otu_picking.html
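
If you are not sure which of these scripts your QIIME installation provides, a quick check along these lines may help. This is only a sketch: the function name check_scripts is made up for the example, and the script names are taken from the links above:

```shell
# Report, for each renamed OTU picking script, whether the shell can find it.
check_scripts() {
    for script in pick_closed_reference_otus.py pick_de_novo_otus.py pick_open_reference_otus.py; do
        if command -v "$script" >/dev/null 2>&1; then
            echo "$script: available"
        else
            echo "$script: not found on PATH"
        fi
    done
}

check_scripts
```

A script reported as "not found on PATH" will give the same "command not found" error as the old script name.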

--Kyle

ghazal

Apr 28, 2014, 10:48:23 AM

to qiime...@googlegroups.com

Hi dear all,

Actually, I am stuck at this point as well.

I am following the tutorial from the article entitled "Using QIIME to analyze 16S rRNA gene sequences from Microbial Communities" and, when I use the old OTU picking command (pick_otus_through_otu_table.py -i split_library_output/seqs.fna -o otus), I get the same error: command not found.

I found the three new commands mentioned here, but I do not know which one to use and, more importantly, I do not know what to supply for the reference:

Both pick_open_reference_otus.py and pick_closed_reference_otus.py require -r, so I used this command:

pick_open_reference_otus.py -i split_library_output/seqs.fna -r o- otus

I do not know what to put after -r, and I am not sure about the other parts either.

I would be really thankful if anybody could help me out.

Kind Regards

Ghazal

Jose Carlos Clemente

Apr 29, 2014, 3:30:48 PM

to Qiime Forum

Hi Ghazal,

The OTU picking tutorial has an explanation of the pros and cons of each method:

http://qiime.org/tutorials/otu_picking.html#description-of-qiime-s-otu-picking-protocols

Also, you can find here what is commonly used as a reference set to pass in the -r option:

http://qiime.org/tutorials/otu_picking.html#conventions-used-in-these-examples
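
To give a rough idea of the shape of the command: -r takes the path to a FASTA file of reference sequences. In the sketch below, the reference path is a placeholder that I made up (it is not a file that ships at that location), and the input and output names are the ones used earlier in this thread. The snippet only assembles and prints the command and reports which inputs are missing; it does not run OTU picking:

```shell
# Placeholder path: replace with the location of your reference FASTA file.
reference_seqs="path/to/reference_seqs.fasta"
# Names taken from the tutorial commands quoted in this thread.
input_seqs="split_library_output/seqs.fna"
output_dir="otus"

# Assemble the command as text so it can be inspected before running it.
cmd="pick_open_reference_otus.py -i $input_seqs -r $reference_seqs -o $output_dir"
echo "Command to run: $cmd"

# Check that both input files exist relative to the current directory.
for f in "$input_seqs" "$reference_seqs"; do
    if [ -e "$f" ]; then
        echo "ok:      $f"
    else
        echo "missing: $f"
    fi
done
```

Once both files are reported as "ok", the printed command can be copied and run as is.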

Jose

Elisa Ramos Sevillano

May 22, 2014, 8:32:09 AM

to qiime...@googlegroups.com

Hi all,

I am following the 454 overview tutorial right now. I have downloaded all the necessary files, but I don't think I am doing it correctly. Is this an example you can practice on before starting, or is it just a visual example? I mean, every time I enter a command I get some extra information about that specific command, but I cannot work on the data.

Thank you so much in advance,

Elisa

Emily TerAvest

May 22, 2014, 10:31:50 AM

to qiime...@googlegroups.com

Hi Elisa,

Can you please copy and paste the first command you are trying to run, along with the output you are getting?

Thank you

Emily
