One heat tree showing OTUs from two samples in different colours


juan.a....@gmail.com

Feb 28, 2018, 9:08:36 AM
to metacoder
Hello, 

Many thanks for producing this awesome package! I would like to ask how I can show which OTUs come from which sample in a single heat_tree, for instance by using different colours.

My OTUs have the following format: OTU_ID /tab/ number indicating how many reads are clustered in this OTU /tab/ lineage /tab/ the sampleID

>PG5V6:00748:00973.1.289 17 Bacteria;Firmicutes;Bacilli;Bacillales;Bacillaceae;Pontibacillus; sample1
ACGGCCAGTGGAAGGTGGGGATGACGTCAAATCATCATGCCCCTTATG

>Q9CYJ:00502:00074.1.293 5 Bacteria;Firmicutes;Bacilli;Bacillales;Staphylococcaceae;Staphylococcus; sample2
GACGGCCAGTGGAAGGTGGGGATGACGTCAAATCATCATGCCCCTTATGAC

With the code below I can extract taxonomy but I haven't managed to specify grouping. Any help will be highly appreciated.

library(metacoder)
seqs.Dande <- ape::read.FASTA(fasta.file)
# Parse each header into OTU ID, read count, lineage, and sample ID
data.Dande <- extract_tax_data(names(seqs.Dande), regex = "^(.*)\\t(.*)\\t(.*)\\t(.*)",
                               key = c(otu_id = "info", seq_count = "info", otu_tax = "class", sample = "info"),
                               class_sep = ";")
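
Before worrying about grouping, it can help to confirm that the regular expression really splits each header into the four expected fields. The following is a minimal base R sketch, not metacoder's own parsing: the two header strings are retyped from the examples above on the assumption that the fields are separated by single tab characters, and the object names (`headers`, `parsed`) are made up for illustration.

```r
# Two example headers in the format described above.
# Assumption: the four fields are separated by single tab characters.
headers <- c(
  "PG5V6:00748:00973.1.289\t17\tBacteria;Firmicutes;Bacilli;Bacillales;Bacillaceae;Pontibacillus;\tsample1",
  "Q9CYJ:00502:00074.1.293\t5\tBacteria;Firmicutes;Bacilli;Bacillales;Staphylococcaceae;Staphylococcus;\tsample2"
)

# Same pattern as in the extract_tax_data() call: four tab-separated capture groups.
pattern <- "^(.*)\\t(.*)\\t(.*)\\t(.*)"

# regexec() returns the full match plus one entry per capture group;
# drop the full match and keep the four groups.
matches <- regmatches(headers, regexec(pattern, headers))
fields <- do.call(rbind, lapply(matches, function(m) m[-1]))
colnames(fields) <- c("otu_id", "seq_count", "otu_tax", "sample")

parsed <- data.frame(fields, stringsAsFactors = FALSE)
parsed$seq_count <- as.integer(parsed$seq_count)

print(parsed[, c("otu_id", "seq_count", "sample")])
```

If the `sample` column comes out as expected here, the same field should be available per OTU after parsing, which is the information any per-sample colouring would need.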


Thank you in advance,

Juan

Zachary Foster

Feb 28, 2018, 2:51:37 PM
to metacoder
Hello Juan,

Thanks, I am glad you like it!

So you want to color taxa based on which samples the OTUs assigned to each taxon came from? I expect that might be tricky, since most taxa will have OTUs assigned to them from different samples, especially broader taxa like "Bacteria". How do you color a taxon that is present in three samples? Also, the heat_tree function does not technically support categorical information currently (it is something we want to do), but you can imitate it by setting the color "manually": pass colors to "node_color" instead of numbers (e.g. something like `node_color = ifelse(my_sample == "sample_1", "red", "blue")`). Keep in mind you will not have a meaningful legend if you do it this way.
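
As a rough illustration of the manual coloring idea, here is a minimal base R sketch that builds one color per taxon and falls back to a neutral color for taxa found in more than one sample. Everything in it is hypothetical: the `taxon_samples` list, the color choices, and the rule for shared taxa are assumptions made for the example, not part of metacoder.

```r
# Hypothetical input: for each taxon, the samples whose OTUs were assigned to it.
taxon_samples <- list(
  Bacteria       = c("sample1", "sample2"),
  Pontibacillus  = "sample1",
  Staphylococcus = "sample2"
)

# Assumed color scheme: one color per sample, plus a neutral color
# for taxa that occur in more than one sample.
sample_colors <- c(sample1 = "red", sample2 = "blue")
shared_color <- "grey"

# One color string per taxon, in the same order as taxon_samples.
taxon_color <- vapply(taxon_samples, function(s) {
  s <- unique(s)
  if (length(s) == 1) sample_colors[[s]] else shared_color
}, character(1))

print(taxon_color)
```

A vector like `taxon_color` could then be passed as `node_color`, provided its order matches the order of the taxa in the object being plotted; that ordering is an assumption you would need to check against your own data.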

Currently, I suggest using a differential heat tree to compare two samples/treatments and a differential heat tree matrix for comparing more than two samples/treatments (see the GitHub README). Once we have worked out plotting categorical information and multiple colors per taxon (e.g. each node would be a pie graph), doing what you describe would be pretty cool, but it's not quite there yet.

Hope this helps. Let me know if you need more clarification or have other questions.

Best,

Zach

juan.a....@gmail.com

Mar 2, 2018, 4:05:45 AM
to metacoder
Hi Zach,

Thank you for taking the time to reply. What I'm after is showing differences between two samples in one heat_tree. For instance, I have removed all taxa shared between sample_1 and sample_2. Instead of having two heat_trees showing the "unique" taxa in each sample, I was thinking of having just one heat_tree showing the taxa from sample_1 in one colour and the taxa from sample_2 in another. If you have a quick tip on how to do this, great. If not, no worries, I'll look forward to categorical info being implemented in heat_trees.

Best wishes,

Juan 

Zachary Foster

Mar 2, 2018, 11:16:09 AM
to metacoder
Hi Juan,

You can do that using what we call a differential heat tree. There is an example of how to do that on the GitHub readme:

https://github.com/grunwaldlab/metacoder#comparing-two-treatmentsgroups
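
To give a feel for what a differential heat tree encodes, here is a minimal base R sketch of the underlying idea: one signed number per taxon, which a diverging color scale can then map to one color per sample. It does not reproduce the README's code; the `counts` table, the pseudocount, and the use of a log2 ratio of raw read counts are all assumptions made for illustration, and the README remains the authoritative example.

```r
# Hypothetical read counts per taxon in each of the two samples.
counts <- data.frame(
  taxon   = c("Pontibacillus", "Staphylococcus", "Bacillales"),
  sample1 = c(17, 0, 17),
  sample2 = c(0, 5, 5)
)

# Assumption: a pseudocount of 1 keeps the ratio finite for taxa
# that are absent from one of the samples.
pseudocount <- 1

# Positive values: more reads in sample1; negative values: more reads in sample2.
counts$log2_ratio <- log2((counts$sample1 + pseudocount) /
                          (counts$sample2 + pseudocount))

print(counts)
```

With a diverging color range centered on zero, taxa unique to sample1 end up at one end of the scale and taxa unique to sample2 at the other, which is close to the "one colour per sample" picture described earlier in the thread.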

Will that do what you want?

Best,

Zach