Colors in PCA/Hierarchical Clustering

shanno...@gmail.com

Aug 28, 2013, 8:01:01 AM

to methylkit_...@googlegroups.com

This is a silly question that I should be able to figure out on my own, but I figure asking will speed up the process since I've hit a wall in trying to figure it out.

I've pulled out the SampleClustering/PCA code and have altered it so that I can change the colors used in the plots; however, is there a simple way to change the colors used on the plots within the package?

Thanks!

Altuna Akalin

Aug 28, 2013, 8:09:54 AM

to methylkit_...@googlegroups.com

There is no easy way. Colors are chosen based on the number of groups in the treatment vector and are selected from the rainbow() palette.

It shouldn't be too hard to add an additional argument that takes a color palette as input. If you have such a function that works robustly, we can add it to the next release and cite your contribution.
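As a rough illustration of the behavior described above, the following sketch picks one rainbow() color per treatment group and assigns it to each sample. The treatment vector and sample ids are made up for the example, and it assumes treatment codes are 0, 1, 2, ... so that treatment + 1 is a valid index.

```r
# Hypothetical example data: two control (0) and two treated (1) samples
treatment <- c(0, 0, 1, 1)
sample.ids <- c("ctrl1", "ctrl2", "test1", "test2")

# One color per treatment group, taken from the rainbow() palette
palette.cols <- rainbow(length(unique(treatment)), start = 1, end = 0.6)

# Assumes treatment codes start at 0, so add 1 to index the palette
sample.cols <- palette.cols[treatment + 1]
names(sample.cols) <- sample.ids

print(sample.cols)
```

Samples in the same treatment group end up with the same color, which is why changing the palette is enough to recolor the whole plot.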

Best,

Altuna

--
You received this message because you are subscribed to the Google Groups "methylkit_discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to methylkit_discus...@googlegroups.com.
To post to this group, send email to methylkit_...@googlegroups.com.
Visit this group at http://groups.google.com/group/methylkit_discussion.
For more options, visit https://groups.google.com/groups/opt_out.

shanno...@gmail.com

Aug 29, 2013, 10:02:45 AM

to methylkit_...@googlegroups.com

I don't necessarily have a terribly robust method, but it's working for me. I've simply added my.cols as an argument to the function. The default works as you have it now (using the rainbow palette), but you can set my.cols to whichever colors or palette you'd like.
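One possible way to make that fallback more explicit is sketched below. The helper resolveCols is hypothetical (it is not part of methylKit): it treats my.cols = NULL as "use the rainbow default", checks that enough colors were supplied, and maps treatment codes to colors by matching against the sorted group values rather than assuming the codes start at 0.

```r
# Hypothetical helper, not part of methylKit: return one color per sample.
resolveCols <- function(treatment, my.cols = NULL) {
  groups <- sort(unique(treatment))

  # Fall back to the rainbow palette when no colors are supplied
  if (is.null(my.cols)) {
    my.cols <- rainbow(length(groups), start = 1, end = 0.6)
  }

  # Fail early if there are fewer colors than treatment groups
  if (length(my.cols) < length(groups)) {
    stop("my.cols needs at least ", length(groups),
         " colors, one per treatment group")
  }

  # Map each sample's treatment code to the color of its group
  my.cols[match(treatment, groups)]
}

print(resolveCols(c(0, 0, 1, 1), c("red", "blue")))
```

Matching on the group values means a treatment vector coded as 1 and 2 works the same way as one coded as 0 and 1.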

So, for 'clusterSamples', I call the function (for example):

clusterSamples(Object, dist="correlation", method="ward", plot=TRUE,my.cols=c("red","blue"))

with the underlying code modified as follows (the additions, shown in red in the original post, are the my.cols argument and its uses):

# cluster function on matrix and return hierarchical plot
# x matrix each column is a sample
# dist.method method to get the distance between samples
# hclust.method the agglomeration method to be used
# plot if TRUE, plot the hierarchical clustering
# my.cols vector of colors, one per treatment group (indexed by treatment+1)
.cluster=function(x, dist.method="correlation", hclust.method="ward", plot=TRUE,
                  treatment=treatment, sample.ids=sample.ids, context, my.cols){

  DIST.METHODS <- c("correlation", "euclidean", "maximum", "manhattan", "canberra",
                    "binary", "minkowski")
  dist.method <- pmatch(dist.method, DIST.METHODS)

  HCLUST.METHODS <- c("ward", "single", "complete", "average", "mcquitty",
                      "median", "centroid")
  hclust.method <- pmatch(hclust.method, HCLUST.METHODS)

  if (is.na(hclust.method))
    stop("invalid clustering method")
  if (hclust.method == -1)
    stop("ambiguous clustering method")

  if(DIST.METHODS[dist.method] == "correlation")
    d = .dist.cor(t(x))
  else
    d = dist(scale(t(x)), method=DIST.METHODS[dist.method]);

  hc = hclust(d, HCLUST.METHODS[hclust.method]);

  if(plot){
    #plclust(hc,hang=-1, main=paste("CpG dinucleotide methylation clustering\nDistance method: ",
    #                               DIST.METHODS[dist.method],sep=""), xlab = "Samples");

    # plot
    treatment=treatment
    sample.ids=sample.ids

    # one color per sample, looked up by sample id
    col.list=as.list(my.cols[treatment+1])
    names(col.list)=sample.ids

    # color the label and the point of each leaf
    colLab <- function(n,col.list)
    {
      if(is.leaf(n))
      {
        a <- attributes(n)
        attr(n, "nodePar") <- c(a$nodePar, list(lab.col =
                                col.list[[a$label]], lab.cex=1,
                                col=col.list[[a$label]], cex=1, pch=16 ))
      }
      n
    }

    dend = as.dendrogram(hc)
    dend_colored <- dendrapply(dend, colLab, col.list)
    plot(dend_colored, main = paste(context, "methylation clustering"));
    # end of plot
  }

  return(hc)
}

setGeneric("clusterSamples", function(.Object, dist="correlation", method="ward",
                                      sd.filter=TRUE, sd.threshold=0.5,
                                      filterByQuantile=TRUE, plot=TRUE,
                                      my.cols=rainbow(length(unique(treatment)), start=1, end=0.6))
  standardGeneric("clusterSamples"))

#' @rdname clusterSamples-methods
#' @aliases clusterSamples,methylBase-method
setMethod("clusterSamples", "methylBase",
          function(.Object, dist, method, sd.filter, sd.threshold,
                   filterByQuantile, plot, my.cols)
          {
            mat = getData(.Object)

            # remove rows containing NA values, they might be introduced at unite step
            mat = mat[ rowSums(is.na(mat))==0, ]

            # fraction of methylated Cs per site and sample
            meth.mat = mat[, .Object@numCs.index]/
              (mat[,.Object@numCs.index] + mat[,.Object@numTs.index] )
            names(meth.mat)=.Object@sample.ids

            # if Std. Dev. filter is on remove rows with low variation
            if(sd.filter){
              if(filterByQuantile){
                sds=rowSds(as.matrix(meth.mat))
                cutoff=quantile(sds,sd.threshold)
                meth.mat=meth.mat[sds>cutoff,]
              }else{
                meth.mat=meth.mat[rowSds(as.matrix(meth.mat))>sd.threshold,]
              }
            } # closes if(sd.filter); this brace was missing in the pasted code

            .cluster(meth.mat, dist.method=dist, hclust.method=method,
                     plot=plot, treatment=.Object@treatment,
                     sample.ids=.Object@sample.ids,
                     context=.Object@context, my.cols=my.cols)
          }
)

(I'm sure this is overly simplistic and that you could come up with a better solution. No need to cite any contribution, but figured I'd share how I'm doing it.)
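For anyone who wants to try the leaf-coloring idea outside the package, here is a minimal self-contained sketch using only base R. The data matrix, sample ids, and the leafCols helper are made up for illustration, and it uses Euclidean distance with complete linkage rather than methylKit's correlation distance, so it shows the coloring mechanism and not the package's exact clustering.

```r
# Toy methylation-like matrix: rows are sites, columns are samples (made-up data)
set.seed(1)
sample.ids <- c("ctrl1", "ctrl2", "test1", "test2")
x <- matrix(runif(40), nrow = 10, dimnames = list(NULL, sample.ids))
treatment <- c(0, 0, 1, 1)
my.cols <- c("red", "blue")

# One color per sample, looked up by sample id
col.list <- as.list(my.cols[treatment + 1])
names(col.list) <- sample.ids

# Attach label and point colors to each leaf of the dendrogram
colLab <- function(n, col.list) {
  if (is.leaf(n)) {
    a <- attributes(n)
    attr(n, "nodePar") <- c(a$nodePar,
                            list(lab.col = col.list[[a$label]],
                                 col = col.list[[a$label]], pch = 16))
  }
  n
}

hc <- hclust(dist(t(x)), method = "complete")
dend <- dendrapply(as.dendrogram(hc), colLab, col.list)

# Hypothetical helper: collect the color stored on each leaf, named by label
leafCols <- function(n) {
  if (is.leaf(n)) {
    return(setNames(attr(n, "nodePar")$lab.col, attr(n, "label")))
  }
  unlist(lapply(seq_along(n), function(i) leafCols(n[[i]])))
}
leaf.cols <- leafCols(dend)
print(leaf.cols)

# Draw the colored dendrogram when running interactively
if (interactive()) plot(dend, main = "Toy clustering")
```

The colors live in each leaf's nodePar attribute, so any palette passed through my.cols carries over to the plot without touching the clustering itself.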

Altuna Akalin

Aug 30, 2013, 5:53:07 AM

to methylkit_...@googlegroups.com

Thanks for sharing, Shannon. It looks OK to me at first glance.

Best,

Altuna
