Compare samples from different biom files

May 10, 2017, 9:50:13 AM
to Qiime 1 Forum
Hello everybody,

I am puzzled by the results I get at the end of my analysis, so I hope to get some feedback from the community :)
I followed a tutorial to create a plot with stacked bar charts for gut microbiome analysis. This lets me compare the 6 barcoded samples contained in my fastq file with each other. The commands I used were:

#step 1
.py -m map1.txt -o validate_map
#step 2
.py -c fastq_to_fastaqual -f file.fastq -o fastaqual
#step 3
.py -m map1.txt -f fastaqual/file.fna -q fastaqual/file.qual -o split_library_out/ -b 13 -l 140 -z truncate_only
#step 4
.py -i split_library_out/seqs.fna -o otus/
#step 5
.py -i otus/rep_set/seqs_rep_set.fasta -m rdp -o rdp_assigned_taxonomy
#step 6
.py -i otus/uclust_picked_otus/seqs_otus_txt -t rdp_assigned_taxonomy/seqs_rep_set_tax_assignments.txt -o L7_otu_table.biom
#step 7
.py -i L7_otu_table.biom -o L7_taxonomy_summary/ -L 7
#step 8
.py -i L7_taxonomy_summary/L7_otu_table_L7.txt -o L7_taxonomy_plot/

I used the same procedure to process a second fastq file with 6 other samples. So far, the pipeline works well.
My problem now is that I would like to compare 2 samples from different biom files with each other in a stacked bar chart. I searched around a bit, and these commands looked good to me:

# Split biom file belonging to file.fastq (created at step 6 above) by SampleID
.py -i L7_otu_table.biom -m map1.txt -f SampleID -o split_by_sample
# The same for file2.fastq
.py -i L7_otu_table.biom -m map2.txt -f SampleID -o split_by_sample
# Merge 2 biom files containing the samples I want to compare
.py -i otu_table.file.fastq.biom,otu_table.file2.fastq.biom -o merged_otu_table.biom
# Now I follow the above procedure starting at step 7 to create a bar chart with new biom file
.py -i merged_otu_table.biom -o new_taxonomy_summary/ -L 7
.py -i new_taxonomy_summary/merged_otu_table_L7.txt -o new_taxonomy_plot/

I end up with a new plot containing 2 bar charts, but I am surprised that the percentages in one of the two charts are completely different from those in its original version. As I understand it, I should get a plot with 2 bar charts showing the same percentages as their original versions.

Am I doing something wrong, or am I missing a step here? I really can't imagine why the values change in my new plot.
Any advice is really appreciated :)


Colin Brislawn

May 10, 2017, 1:05:59 PM
to Qiime 1 Forum
Hello Patrick,

Comparing different .biom files could be easy or nearly impossible, depending on how you made the OTUs within those files.

If you used closed-ref OTU picking, then all your OTU ids will match and you can safely combine OTU tables using the command. If you used open-ref or de novo OTU picking methods, then different OTUs will have the exact same IDs, and your merged table will be meaningless. 
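
To make the ID clash concrete, here is a small toy sketch in plain Python. Everything in it is made up for illustration: the OTU IDs, genus names, counts, and the `naive_merge` helper are hypothetical, and the rule "keep the first table's taxonomy" is an assumption of this sketch, not a description of how the QIIME merge script works internally. It only shows why matching on IDs from two independent de novo runs can change the percentages of a sample.

```python
from collections import Counter

# Hypothetical toy data: two independent de novo runs each number their
# OTUs from 0, so the same ID refers to a different organism in each run.
run1_taxonomy = {"denovo0": "Bacteroides", "denovo1": "Lactobacillus"}
run2_taxonomy = {"denovo0": "Escherichia", "denovo1": "Bacteroides"}

run1_counts = {"sampleA": {"denovo0": 80, "denovo1": 20}}
run2_counts = {"sampleB": {"denovo0": 10, "denovo1": 90}}


def naive_merge(counts_a, counts_b, taxonomy_a):
    """Merge two tables by OTU ID only.

    Assumption of this sketch: the merged table keeps the taxonomy
    labels of the first table for every shared OTU ID.
    """
    merged = {}
    merged.update(counts_a)
    merged.update(counts_b)
    return merged, dict(taxonomy_a)


def genus_percentages(sample_counts, taxonomy):
    """Sum counts per genus and convert them to percentages of the sample."""
    totals = Counter()
    for otu_id, count in sample_counts.items():
        totals[taxonomy[otu_id]] += count
    grand_total = sum(totals.values())
    return {genus: 100.0 * n / grand_total for genus, n in totals.items()}


merged_counts, merged_taxonomy = naive_merge(run1_counts, run2_counts, run1_taxonomy)

# sampleB summarized with its own taxonomy versus after the naive merge.
true_b = genus_percentages(run2_counts["sampleB"], run2_taxonomy)
merged_b = genus_percentages(merged_counts["sampleB"], merged_taxonomy)
print("sampleB before merge:", true_b)
print("sampleB after merge: ", merged_b)
```

In this toy case the counts of sampleB are untouched, but its OTUs are read with the other run's labels, so the genus percentages of that one sample change while sampleA stays the same.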

If you want to compare samples with any open-ref or de novo methods, the only easy way to do this is to combine your seqs.fna files, then process them together in OTU picking. 
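
Combining the sequence files can be as simple as concatenating them, as long as no sample ID occurs in both files. Below is a minimal sketch in plain Python, assuming the headers follow the usual `>SampleID_number ...` layout of a demultiplexed seqs.fna. The function names and any file paths are hypothetical, not part of QIIME.

```python
def sample_ids(fasta_path):
    """Collect sample IDs, assuming headers look like '>SampleID_123 ...'."""
    ids = set()
    with open(fasta_path) as handle:
        for line in handle:
            if not line.startswith(">"):
                continue
            parts = line[1:].split()
            if not parts:
                continue  # skip an empty header rather than guessing
            # Drop the trailing '_<number>' to recover the sample ID.
            ids.add(parts[0].rsplit("_", 1)[0])
    return ids


def combine_fasta(paths, out_path):
    """Concatenate FASTA files after checking that no sample ID repeats."""
    seen = set()
    for path in paths:
        ids = sample_ids(path)
        clash = seen & ids
        if clash:
            raise ValueError(
                "Sample IDs appear in more than one file: %s" % sorted(clash)
            )
        seen |= ids
    with open(out_path, "w") as out:
        for path in paths:
            with open(path) as handle:
                for line in handle:
                    # Make sure a missing final newline does not glue
                    # the last sequence to the next file's first header.
                    out.write(line if line.endswith("\n") else line + "\n")
    return seen
```

A call such as `combine_fasta(["run1/seqs.fna", "run2/seqs.fna"], "combined_seqs.fna")` (paths are placeholders) would write one file that can then go through OTU picking as a single run. If the two mapping files reuse sample IDs, the check raises an error so the IDs can be renamed first.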

Does that help answer your question?

Greg Caporaso

May 10, 2017, 3:56:42 PM
to Qiime 1 Forum
Hi Patrick,
Colin is correct - you need to first combine your sequences files if you're going to run You can't merge tables resulting from two different runs of as the OTU identifiers are not consistent across the two runs. I think what's happening is when you're merging your OTU tables, the taxonomy that is associated with each OTU is getting mixed up because of this. has a warning about this in its help text:

$ --help

Requirements: It is also very important that your OTUs are consistent across the different OTU tables. For example, you cannot safely merge OTU tables from two independent de novo OTU picking runs. Finally, either all or none of the OTU tables can contain taxonomic information: you can't merge some OTU tables with taxonomic data and some without taxonomic data.

Once you combine your sequence files and re-run, your workflow will be a lot easier since all of your samples will already be in one BIOM file. If you want to split the data into two BIOM files for separate analyses, you can run If you want to generate a taxonomy barplot for just a couple of samples, you could use to create the table that has the two samples you want, and then pass that to
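
As a rough illustration of why this gives the percentages Patrick expected, here is a toy sketch in plain Python. The table is a made-up dictionary of per-sample taxon counts, not a real BIOM file, and `filter_samples` and `relative_abundance` are hypothetical helpers, not QIIME functions. Because each sample is normalized by its own total, picking two samples out of one consistent table leaves their percentages unchanged.

```python
# Hypothetical taxon counts per sample, all taken from one OTU picking run.
table = {
    "sample1": {"Bacteroides": 60, "Lactobacillus": 40},
    "sample2": {"Bacteroides": 25, "Escherichia": 75},
    "sample7": {"Lactobacillus": 10, "Escherichia": 90},
}


def filter_samples(table, keep):
    """Return a new table that contains only the requested samples."""
    missing = [sample for sample in keep if sample not in table]
    if missing:
        raise KeyError("Samples not in table: %s" % missing)
    return {sample: dict(table[sample]) for sample in keep}


def relative_abundance(table):
    """Convert counts to percentages, normalizing each sample by its own total."""
    result = {}
    for sample, counts in table.items():
        total = sum(counts.values())
        result[sample] = {
            taxon: (100.0 * n / total if total else 0.0)
            for taxon, n in counts.items()
        }
    return result


full = relative_abundance(table)
subset = relative_abundance(filter_samples(table, ["sample1", "sample7"]))

# The two selected samples keep exactly the percentages they had before.
print(subset["sample1"] == full["sample1"])
print(subset["sample7"] == full["sample7"])
```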

Hope this helps! 


May 11, 2017, 4:46:58 AM
to Qiime 1 Forum
Hi Colin and Greg,

thank you very much for your help! Now I get the expected results.

Actually, I had already read the help text, but I missed this important paragraph...

Have a nice day!


Greg Caporaso

May 11, 2017, 1:11:44 PM
to Qiime 1 Forum
Great, glad it's working for you. Thanks for following up! 
