Beta Diversity through plots error at make emperor plots


Brianiee Albrighton

May 6, 2016, 4:14:33 AM
to Qiime 1 Forum
Hi Guys,

I've run the script:

beta_diversity_through_plots.py -i otu_table_rare.biom -o bdiv_even100/ -t rep_set.tre -m mapping_file_metadata.txt -e 124797

and this error has come up in the log:

Logging started at 16:48:17 on 06 May 2016
QIIME version: 1.8.0

qiime_config values:
blastmat_dir /opt/shared/Qiime/1.8.0/blast-2.2.22-release/data
sc_queue all.q
pynast_template_alignment_fp /opt/shared/Qiime/1.8.0/core_set_aligned.fasta.imputed
cluster_jobs_fp /opt/shared/Qiime/1.8.0/qiime-1.8.0-release/bin/start_parallel_jobs.py
assign_taxonomy_reference_seqs_fp /opt/shared/Qiime/1.8.0/gg_otus-13_8-release/rep_set/97_otus.fasta
torque_queue friendlyq
template_alignment_lanemask_fp /opt/shared/Qiime/1.8.0/lanemask_in_1s_and_0s
jobs_to_start 1
cloud_environment False
qiime_scripts_dir /opt/shared/Qiime/1.8.0/qiime-1.8.0-release/bin
denoiser_min_per_core 50
working_dir /tmp/
python_exe_fp /opt/shared/Qiime/1.8.0/python-2.7.3-release/bin/python
temp_dir /tmp/
blastall_fp /opt/shared/Qiime/1.8.0/blast-2.2.22-release/bin/blastall
seconds_to_sleep 60
assign_taxonomy_id_to_taxonomy_fp /opt/shared/Qiime/1.8.0/gg_otus-13_8-release/taxonomy/97_otu_taxonomy.txt

parameter file values:
parallel:jobs_to_start 1

Input file md5 sums:
otu_table_rare.biom: 2ab3e1b3cd5020623928e52444f50278
mapping_file_metadata.txt: bed2028c6f74ee1c9cae85ffb7042af8
rep_set.tre: 93f464052fcc24586e6a506bdc013a95

Executing commands.

# Sample OTU table at 124797 seqs/sample command 
/opt/shared/Qiime/1.8.0/python-2.7.3-release/bin/python /opt/shared/Qiime/1.8.0/qiime-1.8.0-release/bin/single_rarefaction.py -i otu_table_rare.biom -o bdiv_even100//otu_table_rare_even124797.biom -d 124797

Stdout:

Stderr:

# Beta Diversity (weighted_unifrac) command 
/opt/shared/Qiime/1.8.0/python-2.7.3-release/bin/python /opt/shared/Qiime/1.8.0/qiime-1.8.0-release/bin/beta_diversity.py -i bdiv_even100//otu_table_rare_even124797.biom -o bdiv_even100/ --metrics weighted_unifrac  -t rep_set.tre 

Stdout:

Stderr:

# Rename distance matrix (weighted_unifrac) command 
mv bdiv_even100//weighted_unifrac_otu_table_rare_even124797.txt bdiv_even100//weighted_unifrac_dm.txt

Stdout:

Stderr:

# Principal coordinates (weighted_unifrac) command 
/opt/shared/Qiime/1.8.0/python-2.7.3-release/bin/python /opt/shared/Qiime/1.8.0/qiime-1.8.0-release/bin/principal_coordinates.py -i bdiv_even100//weighted_unifrac_dm.txt -o bdiv_even100//weighted_unifrac_pc.txt 

Stdout:

Stderr:

# Make emperor plots, weighted_unifrac) command 
make_emperor.py -i bdiv_even100//weighted_unifrac_pc.txt -o bdiv_even100//weighted_unifrac_emperor_pcoa_plot/ -m mapping_file_metadata.txt 



*** ERROR RAISED DURING STEP: Make emperor plots, weighted_unifrac)
Command run was:
 make_emperor.py -i bdiv_even100//weighted_unifrac_pc.txt -o bdiv_even100//weighted_unifrac_emperor_pcoa_plot/ -m mapping_file_metadata.txt 
Command returned exit status: 2
Stdout:

Stderr
Usage: make_emperor.py [options] {-i/--input_coords INPUT_COORDS -m/--map_fp MAP_FP}

[] indicates optional input (order unimportant)
{} indicates required input (order unimportant)

This script automates the creation  of three-dimensional PCoA plots to be visualized with Emperor using Google Chrome.

Example usage: 
Print help message and exit
 make_emperor.py -h

Plot PCoA data: Visualize the a PCoA file colored using a corresponding mapping file:
 make_emperor.py -i unweighted_unifrac_pc.txt -m Fasting_Map.txt -o emperor_output

Coloring by metadata mapping file: Additionally, using the supplied mapping file and a specific category or any combination of the available categories. When using the -b option, the user can specify the coloring for multiple header names, where each header is separated by a comma. The user can also combine mapping headers and color by the combined headers that are created by inserting an '&&' between the input header names. Color by 'Treatment' and by the result of concatenating the 'DOB' category and the 'Treatment' category:
 make_emperor.py -i unweighted_unifrac_pc.txt -m Fasting_Map.txt -b 'Treatment&&DOB,Treatment' -o emperor_colored_by

PCoA plot with an explicit axis: Create a PCoA plot with an axis of the plot representing the 'DOB' of the samples. This option is useful when presenting a gradient from your metadata e. g. 'Time' or 'pH':
 make_emperor.py -i unweighted_unifrac_pc.txt -m Fasting_Map.txt -a DOB -o pcoa_dob

PCoA plot with an explicit axis and using --missing_custom_axes_values: Create a PCoA plot with an axis of the plot representing the 'DOB' of the samples and define the position over the gradient of those samples missing a numeric value; in this case we are going to plot the samples in the value 20060000. You can select for each explicit axis which value you want to use for the missing values:
 make_emperor.py -i unweighted_unifrac_pc.txt -m Fasting_Map_modified.txt -a DOB -o pcoa_dob_with_missing_custom_axes_values -x 'DOB:20060000'

PCoA plot with an explicit axis and using --missing_custom_axes_values but setting different values based on another column: Create a PCoA plot with an axis of the plot representing the 'DOB' of the samples and defining the position over the gradient of those samples missing a numeric value but using as reference another column of the mapping file. In this case we are going to plot the samples that are Control on the Treatment column on 20080220 and on 20080240 those that are Fast
 make_emperor.py -i unweighted_unifrac_pc.txt -m Fasting_Map_modified.txt -a DOB -o pcoa_dob_with_missing_custom_axes_with_multiple_values -x 'DOB:Treatment==Control=20080220' -x 'DOB:Treatment==Fast=20080240'

Jackknifed principal coordinates analysis plot: Create a jackknifed PCoA plot (with confidence intervals for each sample) passing as the input a directory of coordinates files (where each file corresponds to a different OTU table) and use the standard deviation method to compute the dimensions of the ellipsoids surrounding each sample:
 make_emperor.py -i unweighted_unifrac_pc -m Fasting_Map.txt -o jackknifed_pcoa -e sdev

Jackknifed PCoA plot with a master coordinates file: Passing a master coordinates file (--master_pcoa) will display the ellipsoids centered by the samples in this file:
 make_emperor.py -i unweighted_unifrac_pc -s unweighted_unifrac_pc/pcoa_unweighted_unifrac_rarefaction_110_5.txt -m Fasting_Map.txt -o jackknifed_with_master

BiPlots: To see which taxa are the ten more prevalent in the different areas of the PCoA plot, you need to pass a summarized taxa file i. e. the output of summarize_taxa.py. Note that if the the '--taxa_fp' has fewer than 10 taxa, the script will default to use all.
 make_emperor.py -i unweighted_unifrac_pc.txt -m Fasting_Map.txt -t otu_table_L3.txt -o biplot

BiPlots with extra options: To see which are the three most prevalent taxa and save the coordinates where these taxa are centered, you can use the -n (number of taxa to keep) and the --biplot_fp (output biplot file path) options.
 make_emperor.py -i unweighted_unifrac_pc.txt -m Fasting_Map.txt -t otu_table_L3.txt -o biplot_options -n 3 --biplot_fp biplot.txt

Drawing connecting lines between samples: To draw lines betwen samples within a category use the '--add_vectors' option. For example to connect the lines by the 'Treatment' category.
 make_emperor.py -i unweighted_unifrac_pc.txt -m Fasting_Map.txt -o vectors --add_vectors Treatment

Drawing connecting lines between samples with an explicit axis: To draw lines between samples within a category of the mapping file and have them sorted by a category that's explicitly represented in the 3D plot use the '--add_vectors' and the '-a' option.
 make_emperor.py -i unweighted_unifrac_pc.txt -m Fasting_Map.txt --add_vectors Treatment,DOB -a DOB -o sorted_by_DOB

Compare two coordinate files: To draw replicates of the same samples like for a procustes plot.
 make_emperor.py -i compare -m Fasting_Map.txt --compare_plots -o comparison

make_emperor.py: error: The sample identifiers in the coordinates file must have at least one match with the data contained in mapping file. Verify you are using a coordinates file and a mapping file that belong to the same dataset.


Logging stopped at 16:51:30 on 06 May 2016


I also seem to have trouble plotting the taxa summaries, but I'm not sure whether the two problems are related. I managed to graph my taxa summary myself, but I'm not sure about a solution here. The input files were my BIOM OTU table after single rarefaction, my mapping file (which worked fine through the other stages), and the tree, which worked fine with alpha_diversity.py.

Any ideas?

Thanks,
Bri.

Colin Brislawn

May 6, 2016, 12:48:30 PM
to Qiime 1 Forum
Hello Bri,

Here is the key line of the error:
make_emperor.py: error: The sample identifiers in the coordinates file must have at least one match with the data contained in mapping file. Verify you are using a coordinates file and a mapping file that belong to the same dataset.

How did you go about making the file 'otu_table_rare.biom'? Maybe some filtering or rarefying step changed it so that it no longer matches your metadata.

Colin 

Brianiee Albrighton

May 9, 2016, 4:36:26 AM
to Qiime 1 Forum
To make otu_table_rare.biom I just used single_rarefaction.py.

I checked the OTU table against my mapping file, and the only difference I found was that the mapping file IDs were capitalized. I changed them to lowercase to match the rare .biom file. However, the IDs didn't seem to affect alpha_diversity.py, only the rarefaction plots and the beta diversity plots stages. Is that expected? Do I need to re-rarefy my BIOM table and run everything again, or can I just re-run the alpha_rarefaction.py and beta_diversity_through_plots.py workflows? I think this may be the issue with my OTU network as well.

Thanks,
Bri

Brianiee Albrighton

May 9, 2016, 4:40:40 AM
to Qiime 1 Forum
Also, I have another quick question: could you explain why it's important to rarefy the samples before doing the diversity analysis? And if I were going to make an OTU network, would I use my original .biom table or the rarefied one?

Thanks,

Bri.

Colin Brislawn

May 9, 2016, 11:48:29 AM
to Qiime 1 Forum
You mentioned:
"the mapping file ids were capitalized"
Everything has to match exactly, including capitalization. After you correct these, you will have to repeat any downstream steps to make sure your new IDs are in all your downstream files. This takes some time, but it's easier than trying to change your IDs in all of your files.
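
A quick way to see which IDs disagree is to compare the two sets directly. The following is only a minimal sketch, not a QIIME script: `compare_sample_ids` is a hypothetical helper, the example IDs are made up, and it assumes you have already pulled the sample IDs out of the mapping file (first column) and out of the OTU table into plain Python lists.

```python
def compare_sample_ids(mapping_ids, table_ids):
    """Report exact and case-only mismatches between two ID collections.

    Hypothetical helper for illustration; not part of QIIME.
    """
    mapping_set, table_set = set(mapping_ids), set(table_ids)
    only_in_mapping = mapping_set - table_set
    only_in_table = table_set - mapping_set

    # IDs that would match if case were ignored
    # (for example 'Acetate.1' versus 'acetate.1').
    lowered_table = dict((t.lower(), t) for t in only_in_table)
    case_only = sorted(
        (m, lowered_table[m.lower()])
        for m in only_in_mapping
        if m.lower() in lowered_table
    )
    return {
        "shared": sorted(mapping_set & table_set),
        "only_in_mapping": sorted(only_in_mapping),
        "only_in_table": sorted(only_in_table),
        "case_only_mismatches": case_only,
    }


if __name__ == "__main__":
    # Made-up IDs, for illustration only.
    report = compare_sample_ids(
        ["Acetate.1", "acetate.2", "control.1"],
        ["acetate.1", "acetate.2", "control.2"],
    )
    for key in sorted(report):
        print("%s: %s" % (key, report[key]))
```

Anything listed under `only_in_mapping` or `only_in_table` is an ID that the comparison-based steps will not be able to pair up.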

Colin 

Colin Brislawn

May 9, 2016, 11:50:50 AM
to Qiime 1 Forum
Hello Bri,

Because different samples have different numbers of reads in them, the number of reads can confound real biological variables. I wrote up an example of this over here: https://groups.google.com/d/msg/qiime-forum/xCK61_VCvMM/EaE9ffALAAAJ
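
To make the idea concrete, here is a rough sketch of what subsampling to an even depth means. It is not how single_rarefaction.py is implemented; the function names and the toy counts are assumptions for illustration. The two toy samples have very different read totals, so their raw observed-OTU counts are not directly comparable until both are subsampled to the same depth.

```python
import random


def rarefy(counts, depth, rng):
    """Subsample per-OTU counts to `depth` reads without replacement."""
    total = sum(counts)
    if depth > total:
        raise ValueError("sample has fewer reads than the requested depth")
    # Expand the counts into one entry per read, labelled by OTU index.
    reads = [otu for otu, n in enumerate(counts) for _ in range(n)]
    rarefied = [0] * len(counts)
    for otu in rng.sample(reads, depth):
        rarefied[otu] += 1
    return rarefied


def observed_otus(counts):
    """Number of OTUs with at least one read."""
    return sum(1 for n in counts if n > 0)


if __name__ == "__main__":
    rng = random.Random(0)
    # Made-up counts over 25 OTUs: one shallow sample, one deep sample.
    shallow = [100] * 5 + [1] * 4 + [0] * 16
    deep = [1000] * 5 + [5] * 20
    depth = min(sum(shallow), sum(deep))
    print("raw observed OTUs: %d vs %d"
          % (observed_otus(shallow), observed_otus(deep)))
    print("rarefied to %d reads: %d vs %d"
          % (depth,
             observed_otus(rarefy(shallow, depth, rng)),
             observed_otus(rarefy(deep, depth, rng))))
```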

I hope that helps,
Colin Brislawn 

Brianiee Albrighton

May 9, 2016, 11:19:56 PM
to qiime...@googlegroups.com
Hi Colin,

I have uploaded my mapping text file and my rarefied BIOM OTU table summary so you can see the ID names. I removed the capitals so that they are the same; however, alpha_rarefaction.py gave me this error:

Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
Working directory is /home/users/balbrighton/merged_fastq_files/slout/pick_de_novos_otus2
Running on host tizard19
Time is Tue May 10 06:45:04 ACST 2016
Traceback (most recent call last):
  File "/opt/shared/Qiime/1.8.0/qiime-1.8.0-release/bin/alpha_rarefaction.py", line 156, in <module>
    main()
  File "/opt/shared/Qiime/1.8.0/qiime-1.8.0-release/bin/alpha_rarefaction.py", line 153, in main
    retain_intermediate_files=retain_intermediate_files)
  File "/opt/shared/Qiime/1.8.0/qiime-1.8.0-release/lib/qiime/workflow/downstream.py", line 351, in run_alpha_rarefaction
    close_logger_on_success=close_logger_on_success)
  File "/opt/shared/Qiime/1.8.0/qiime-1.8.0-release/lib/qiime/workflow/util.py", line 116, in call_commands_serially
    raise WorkflowError, msg
qiime.workflow.util.WorkflowError: 

*** ERROR RAISED DURING STEP: Rarefaction plot: All metrics
Command run was:
 /opt/shared/Qiime/1.8.0/python-2.7.3-release/bin/python /opt/shared/Qiime/1.8.0/qiime-1.8.0-release/bin/make_rarefaction_plots.py -i arare//alpha_div_collated/ -m mapping_file_metadata.txt -o arare//alpha_rarefaction_plots/ 
Command returned exit status: 1
Stdout:

Stderr
Traceback (most recent call last):
  File "/opt/shared/Qiime/1.8.0/qiime-1.8.0-release/bin/make_rarefaction_plots.py", line 216, in <module>
    main()
  File "/opt/shared/Qiime/1.8.0/qiime-1.8.0-release/bin/make_rarefaction_plots.py", line 206, in main
    generate_average_tables = generate_average_tables)
  File "/opt/shared/Qiime/1.8.0/qiime-1.8.0-release/lib/qiime/make_rarefaction_plots.py", line 604, in make_averages
    new_ymax=(max([max(v) for v in rares_data['series'].values()])+\
ValueError: max() arg is an empty sequence




Resources Requested:
=========================================
mem=16gb
neednodes=1:ppn=1
nodes=1:ppn=1
vmem=16gb
walltime=100:00:00
=========================================

Resouces Used:
=========================================
cput=05:06:02
mem=2360720kb
vmem=3226344kb
walltime=05:06:18 
=========================================




Exit Status : [1]



Is this the sample IDs again?

The sample IDs seemed to be picked up fine in alpha diversity and the taxa summaries.
otu_table_summary_rare.txt
mapping_file_metadata.txt

Colin Brislawn

May 10, 2016, 1:07:59 AM
to Qiime 1 Forum
Hello Bri,

Thanks for posting your files for me. I've discovered the issue!

I took a quick look at the two files you posted, and the names still don't match.
From the OTU table: B.Albr.TCE.acetate.1
From the mapping file: acetate.1

All your sample IDs must match exactly in all your files!
The easiest way to verify this is to go back to the demultiplexing step with your proper metadata. After demultiplexing, run validate_demultiplexed_fasta.py to make sure your files are good, then repeat the downstream analysis.

(For this specific error, make_rarefaction_plots.py failed, I think because the input (arare//alpha_div_collated/) was empty, because the IDs do not match.)
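
Python raises a ValueError whenever max() is given an empty sequence, which is consistent with that guess. The sketch below reproduces the failure mode under the assumption that no data series survive because no sample IDs match; the function names and toy data are hypothetical, and this is not the code from make_rarefaction_plots.py.

```python
def series_for_known_samples(series_by_sample, mapping_ids):
    """Keep only the series whose sample ID appears in the mapping file."""
    known = set(mapping_ids)
    return dict(
        (sid, vals) for sid, vals in series_by_sample.items() if sid in known
    )


def overall_max(series):
    """Largest value across all series; raises ValueError if there are none."""
    return max([max(vals) for vals in series.values()])


if __name__ == "__main__":
    # IDs taken from the mismatch described above; the numbers are made up.
    data = {"B.Albr.TCE.acetate.1": [10.0, 25.0, 31.0]}
    matched = series_for_known_samples(data, ["acetate.1"])
    print("matched series: %d" % len(matched))
    try:
        overall_max(matched)
    except ValueError as err:
        print("ValueError: %s" % err)
```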

Colin 

Brianiee Albrighton

May 10, 2016, 8:36:46 AM
to Qiime 1 Forum
Oh okay, so the whole ID needs to be included? I identified another spelling error as well. However, I don't understand why the IDs were fine in all the other steps.

The summarized BIOM file shows that it has picked up the sample IDs without the section you've pointed out as well.

I had another question while on the topic of the rarefaction plots. Do people normally include these in reports? I have already graphed the values from alpha_diversity.py for observed species, Shannon, and PD whole tree. What do the rarefaction plots add?

Thanks,
Bri.

Brianiee Albrighton

May 10, 2016, 8:41:15 AM
to Qiime 1 Forum
I'm actually thinking it may be an issue with the fact that when I demultiplexed I had to list the sample IDs individually. So maybe I should remove the initial section of the IDs in the mapping file so they read e.g. acetate.1, as I specified in the split libraries file?

Colin Brislawn

May 10, 2016, 12:51:31 PM
to Qiime 1 Forum
Hello Bri,

"So maybe I should remove the initial section of the mapping file so they read e.g. acetate.1 as I specified in the split libraries file?"
Great thinking! As long as everything matches, the software will be happy, and it's easier to modify the mapping file. That should work great!
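
If the only difference really is a fixed prefix, the IDs can be reconciled mechanically. This is a small sketch under that assumption: `strip_prefix` is a hypothetical helper, and the prefix `B.Albr.TCE.` is simply read off the one example ID quoted earlier in this thread, so check it against your own files first.

```python
def strip_prefix(sample_ids, prefix):
    """Return {old_id: new_id}, dropping `prefix` from IDs that start with it."""
    renamed = {}
    for sid in sample_ids:
        renamed[sid] = sid[len(prefix):] if sid.startswith(prefix) else sid
    new_ids = list(renamed.values())
    # Refuse to continue if two different IDs would collapse into one.
    if len(set(new_ids)) != len(new_ids):
        raise ValueError("stripping the prefix would create duplicate IDs")
    return renamed


if __name__ == "__main__":
    mapping = strip_prefix(["B.Albr.TCE.acetate.1", "control.1"], "B.Albr.TCE.")
    for old in sorted(mapping):
        print("%s -> %s" % (old, mapping[old]))
```

The duplicate check is there because two IDs that collapse into one after renaming would cause a different, harder to spot problem downstream.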

"However I don't understand why the ids were fine in all other steps?"
Some steps treat all your reads as if they came from one sample. For example, common OTUs are identified across the whole study. Once you look at specific samples or try to compare them, the sample IDs are used, and that is when consistency really matters.

"I had another question while on the topic of the rarefaction plots. Do people normally include these in reports? I have already graphed the values from alpha_diversity.py for observed sp. Shannon and PD whole tree..what do the rarefaction plots add?"
These days, people seem to put those in the supplementary material, if they include the plots at all. The goal of the plots is to see at which sequencing depth you can distinguish the alpha diversity between samples, to establish when your sequencing depth is 'good enough.'
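
As a rough illustration of what those curves show, the sketch below computes the average number of observed OTUs at a series of depths for one sample. It is a toy version using only the standard library, not what alpha_rarefaction.py does internally; the function name, depths, and counts are assumptions. A curve that has flattened out suggests that deeper sequencing would add few new OTUs.

```python
import random


def rarefaction_curve(counts, depths, reps, seed=0):
    """Mean observed OTUs at each depth, averaged over `reps` subsamples."""
    rng = random.Random(seed)
    # One list entry per read, labelled by OTU index.
    reads = [otu for otu, n in enumerate(counts) for _ in range(n)]
    curve = []
    for depth in depths:
        if depth > len(reads):
            break  # cannot draw more reads than the sample contains
        observed = [len(set(rng.sample(reads, depth))) for _ in range(reps)]
        curve.append((depth, sum(observed) / float(reps)))
    return curve


if __name__ == "__main__":
    sample = [200, 120, 60, 30, 10, 5, 2, 1, 1, 1]  # made-up OTU counts
    depths = [10, 50, 100, 200, 400]
    for depth, mean_otus in rarefaction_curve(sample, depths, reps=20):
        print("%4d reads -> %.1f observed OTUs" % (depth, mean_otus))
```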

Great work here Bri! Let me know how else I can help,
Colin

Brianiee Albrighton

May 11, 2016, 12:22:58 AM
to Qiime 1 Forum
Hi Colin,

So my rarefaction and beta diversity scripts both ran without errors. Yay

However, I have two issues with viewing the html file outputs from both these scripts.

The alpha_rarefaction HTML file opens fine in Chrome, and I am able to click the drop-down boxes to select metrics and categories. The data table is also visible at the bottom; however, all the error columns contain "nan", and the image itself is not loading.


For my beta_diversity_through_plots HTML file, the browser says that WebGL is not enabled; however, my machine does support it, and hardware acceleration is listed next to it under the graphics feature status.

In regards to the rarefaction plots, I may include them as supplementary just as evidence to support my alpha_diversity.py indices (I simply column graphed the three metrics for each sample in one graph). The differences are slight in some cases so maybe the rarefaction plots might distinguish these groups a little more?


Thanks for all your help!
Bri.

Colin Brislawn

May 11, 2016, 11:22:45 AM
to Qiime 1 Forum
Good morning Bri,

The alpha_rarefaction.html file must be downloaded with the full folder of images that comes along with it. Make sure you download the full folder, and it should work. 

For the 3D plots in Emperor, you need a newer computer that can run WebGL. Try moving the whole folder to another computer, then viewing the graphs there. (It's also possible that if you download the full folder and then try opening it, you will be able to view the files.)

You can do what you want with the rarefaction plots. They are usually not part of the core narrative, but that depends on the biological question. 
Keep in touch! 
Colin

leena...@student.qau.edu.pk

Dec 17, 2016, 12:49:30 PM
to Qiime 1 Forum
 Hello Everyone,

Can someone please help me as well? I am having a similar problem while running the beta diversity analysis. The error I am receiving is:
 

Stderr

Error in make_emperor.py: Due to the variation explained, Emperor could not plot at least 3 axes, check the input files to ensure that the percent explained is greater than 0.01 in at least three axes.


I am attaching my map file and coordinates file for reference. I will be grateful for assistance.


Thanks a lot

Leena

Colin Brislawn

Dec 17, 2016, 3:13:53 PM
to Qiime 1 Forum
Hello Leena,

Thanks for getting in touch with us. I'll help you in the other thread, so that we don't have to fill up Bri's email inbox with our conversation.

Colin

divyapr...@gmail.com

Sep 26, 2017, 12:47:23 AM
to Qiime 1 Forum
Good morning,
Hi Colin,
Please help me out here. I am facing an error in computing beta diversity. I am running the command  beta_diversity_through_plots.py -i otus/otu_table.biom -m Fasting_Map.txt -t otus/rep_set.tre -e  ....
However, I am getting an error.
I will attach the error that is displayed in the terminal while running the command.
VirtualBox_qiime_25_09_2017_12_58_00.png

divyapr...@gmail.com

Sep 26, 2017, 3:33:27 AM
to Qiime 1 Forum
Hi, is anyone there who can help me?
I have been stuck on this problem for the past couple of days.

Jenya Kopylov

Sep 26, 2017, 4:13:52 AM
to Qiime 1 Forum
Hello divyaprince,

Could you create another thread for this issue and we will help you there?

Thanks,
Jenya