Warning message when plotting a boxplot from a reordered dataframe

trichter

Dec 16, 2014, 8:28:21 AM
to ggp...@googlegroups.com
Hi!

I have a list of numerical vectors:

vector.list3 <- lapply(list3, function(x) {as.numeric(as.vector(unlist(x)))})

I need to melt it into long format:

library(reshape2)
v3m  <- melt(vector.list3)

I want to carry the order of vector.list3 over to the melted object, but reversed:

v3m$L1 <- factor(v3m$L1,levels = rev(levels(v3m$L1)),ordered = TRUE)

I receive:
Warning message:
In `levels<-`(`*tmp*`, value = if (nl == nL) as.character(labels) else paste0(labels,  :
  duplicated levels in factors are deprecated

I can still plot:
ggplot() +
  geom_boxplot(data = v3m, aes(x = L1, y = value)) +
  stat_boxplot(geom ='errorbar') +
  theme_bw() +
  coord_flip()

And the plot looks as it should, but I get:

Warning messages:
1: In `levels<-`(`*tmp*`, value = if (nl == nL) as.character(labels) else paste0(labels,  :
  duplicated levels in factors are deprecated
2: In `levels<-`(`*tmp*`, value = if (nl == nL) as.character(labels) else paste0(labels,  :
  duplicated levels in factors are deprecated

This might be more related to reshape2 or base R, but maybe someone knows what is happening here, and whether I can ignore it?

My factor levels:

names(vector.list3)
 [1] "1"       "3"       "2"       "12"      "13"      "15"      "5"       "11"      "21"     
[10] "18"      "20"      "19"      "out"     "25"      "4"       "GBSL1B0" "6"       "17"     
[19] "11B2"    "9"       "ATs328"  "d142"    "10"      "B276D12" "TPD58"   "23"      "HoloI"  
[28] "7I"      "7II"     "8Holo"   "8Aca"    "BPU1C1g" "22"      "26"


Thank you very much!

Doug Mitarotonda

Dec 16, 2014, 9:30:06 AM
to trichter, ggp...@googlegroups.com
Not meaning to sound rude, but the warning message is telling you exactly what the issue is. This is a base R matter: the `levels` argument should have unique values. Apparently `rev(levels(v3m$L1))` contains duplicated values. I am sure you would see this if you ran `duplicated(rev(levels(v3m$L1)))`.
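
A minimal sketch of the check described above, using a made-up toy vector (`x` is hypothetical and only stands in for a column like `v3m$L1`). It assumes nothing beyond base R: `anyDuplicated()` returns 0 when a vector has no repeats, and `unique()` gives a candidate set of levels in order of first appearance.

```r
# Hypothetical toy column standing in for v3m$L1 (one entry per observation)
x <- c("1", "1", "3", "3", "2")

# The raw column repeats each group name, so it is not a valid `levels` value
n_dup_raw <- anyDuplicated(x)   # non-zero: index of the first repeated entry

# Unique names, in order of first appearance, are safe to use as levels
lv <- unique(x)
n_dup_lv <- anyDuplicated(lv)   # 0: no repeats

# Reversed, duplicate-free levels
f <- factor(x, levels = rev(lv))
levels(f)
```

Passing the whole column as `levels` is what triggers the "duplicated levels" warning in the R version used in this thread; newer R versions may reject duplicated levels outright.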


--
--
You received this message because you are subscribed to the ggplot2 mailing list.
Please provide a reproducible example: https://github.com/hadley/devtools/wiki/Reproducibility
 
To post: email ggp...@googlegroups.com
To unsubscribe: email ggplot2+u...@googlegroups.com
More options: http://groups.google.com/group/ggplot2

---
You received this message because you are subscribed to the Google Groups "ggplot2" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ggplot2+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

trichter

Dec 17, 2014, 7:37:09 AM
to ggp...@googlegroups.com
Hi!


What puzzles me is not the content of the warning, but that I cannot find any duplicated values among my factor levels:

vector3 <- lapply(list3, function(x) {as.numeric(as.vector(unlist(x)))})
names(vector3)
 [1] "1"       "3"       "2"       "12"      "13"      "15"      "5"       "11"      "21"     
[10] "18"      "20"      "19"      "out"     "25"      "4"       "GBSL1B0" "6"       "17"     
[19] "11B2"    "9"       "ATs328"  "d142"    "10"      "B276D12" "TPD58"   "23"      "HoloI"  
[28] "7I"      "7II"     "8Holo"   "8Aca"    "BPU1C1g" "22"      "26"
library(reshape2)
v1m <- melt(vector3)
head(v1m)
 value L1
1  81.9  1
2  82.3  1
3  82.5  1
4  82.7  1
5  81.7  1
6  81.8  1

L1 now holds the names of the former vector3 list as a factor.
There are no duplicated factor names.

I am really puzzled.

Roman Luštrik

Dec 17, 2014, 7:43:12 AM
to trichter, ggplot2
Can you post a reproducible example?

Cheers,
Roman


--
In God we trust, all others bring data.

trichter

Dec 17, 2014, 8:50:06 AM
to ggp...@googlegroups.com
Hi!

Thank you for offering to have a look at it:

   dput(test1)
    structure(list(`1` = c(82.5, 82.2, 81.7, 83.2, 84.5, 82.1, 84,
    81.8, 84.1, 83.5), `3` = c(90.5, 92, 94.7, 92.7, 90.2, 91.2,
    85.7, 92.9, 92.9, 90.3), `2` = c(82.8, 81.7, 82, 81.9, 80.9,
    81.9, 81.7, 82.1, 81.5, 82.5), `12` = c(86, 85.3, 87.7, 87, 84.9,
    84.6, 84.4, 88.1, 86.8, 88.5), `13` = c(83.1, 83.2, 85, 83.9,
    82.6, 82.9, 83.7, 82.6, 82.7, 83.9), `15` = c(86.6, 84.6, 84,
    80.8, 83.6, 84.8, 84.8, 83.2, 85, 85.1), `5` = c(83, 81.5, 83.2,
    83.4, 81.8, 82.6, 83.4, 83.2, 83.9, 83), `11` = c(82.3, 82.2,
    83.1, 81, 81.4, 83.7, 82.1, 82.5, 82.7, 81.7), `21` = c(80.4,
    78.7, 78.7, 80.5, 81, 80.4, 79.9, 79.3, 80.4, 80), `18` = c(80.8,
    82.7, 81.9, 80.2, 81.2, 81.7, 80.5, 81, 81.3, 80.6), `20` = c(80.2,
    81.1, 82.2, 81.7, 81.5, 81.7, 80.1, 82.8, 81.3, 81.2), `19` = c(81.5,
    79.9, 79.7, 81.2, 81.3, 82.2, 81.8, 82.1, 82, 82.9), out = c(81.1,
    81.5, 80.9, 81.1, 80.5, 80.8, 81, 80.9, 80.9, 79.9), `25` = c(79.8,
    79.9, 78.8, 78, 78.6, 80.1, 78.6, 79.3, 78.8, 79.3), `4` = c(79.3,
    81.4, 80.8, 80, 80.4, 79.4, 79, 78.3, 79.1, 79.1), GBSL1B0 = c(76.4,
    75.2, 76.7, 78.6, 76.9, 76.3, 77.8, 79.2, 77.2, 77.1), `6` = c(80.3,
    81.2, 81.5, 81.3, 81.9, 82.6, 81.2, 81.5, 81.6, 81.1), `17` = c(82,
    80.7, 77.8, 81, 82.3, 80.9, 81.4, 80.9, 81.7, 82.6), `11B2` = c(82.2,
    79.9, 80.8, 80.8, 81.3, 82.7, 82, 81.5, 81.2, 82.1), `9` = c(80.8,
    80.6, 82.1, 80.6, 79.4, 82.3, 81.6, 81.4, 81, 79.5), ATs328 = c(81.1,
    80.7, 79.7, 81.9, 80.5, 80, 80.4, 81.2, 80.6, 79), d142 = c(80.2,
    80.8, 79.9, 79.4, 79.4, 80, 79.9, 81.5, 80.6, 80), `10` = c(79.9,
    80.6, 80.3, 79.8, 79, 79.2, 80.9, 80.6, 80, 78.5), B276D12 = c(80.6,
    78.9, 80.2, 79.6, 80, 79.6, 79.8, 79.2, 78.8, 79.6), TPD58 = c(79.2,
    80.3, 81.6, 80.5, 81.7, 81.6, 82.6, 80.4, 82.2, 82.2), `23` = c(78.2,
    80.2, 79.7, 79.8, 79.7, 80.4, 80.2, 77.8, 80, 79.9), HoloI = c(80.4,
    80.7, 80.7, 80.3, 80.3, 81, 79.8, 79, 77.9, 81.4), `7I` = c(80.2,
    81.8, 79.8, 80.7, 81, 78.7, 81.1, 79, 81.7, 81.4), `7II` = c(80.2,
    80.3, 80.7, 79.9, 80.5, 80, 81, 79.2, 81.9, 78), `8Holo` = c(80.2,
    82.5, 80.4, 79.5, 81.2, 79.4, 79, 80.9, 80, 79.6), `8Aca` = c(77.1,
    81.4, 80.7, 81.9, 81, 79.8, 79.9, 80.4, 78.9, 79), BPU1C1g = c(78.3,
    79.2, 76.9, 78.6, 79.1, 77.7, 78, 78.9, 78.5, 78), `22` = c(81.4,
    80.3, 80.1, 78.8, 81.1, 79.8, 81.1, 80.7, 81.8, 81.2), `26` = c(80.6,
    79.8, 80, 79, 80.6, 77.5, 80.6, 81.5, 79.8, 81.3)), .Names = c("1",
    "3", "2", "12", "13", "15", "5", "11", "21", "18", "20", "19",
    "out", "25", "4", "GBSL1B0", "6", "17", "11B2", "9", "ATs328",
    "d142", "10", "B276D12", "TPD58", "23", "HoloI", "7I", "7II",
    "8Holo", "8Aca", "BPU1C1g", "22", "26"))


Here is an interesting one:

vtm <- melt(test1)
head(vtm)
  value L1
1  83.7  1
2  84.2  1
3  84.1  1
4  83.6  1
5  82.2  1
6  83.7  1

vtm$L1 <- factor(vtm$L1,levels = vtm$L1,ordered = TRUE)

Warning message:
In `levels<-`(`*tmp*`, value = if (nl == nL) as.character(labels) else paste0(labels,  :
  duplicated levels in factors are deprecated
head(vtm)
value L1
1  83.7  1
2  84.2  1
3  84.1  1
4  83.6  1
5  82.2  1
6  83.7  1

However, if I go with
vtm$L1 <- factor(vtm$L1,levels = rev(levels(vtm$L1)),ordered = TRUE)
head(vtm)
  value   L1
1  81.9 <NA>
2  84.7 <NA>
3  83.2 <NA>
4  84.9 <NA>
5  82.9 <NA>
6  83.3 <NA>


Then all the names/factor levels are converted to NA.

I guess I am going badly wrong at some point.
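
One plausible explanation for the NA result, sketched with a made-up toy vector: if `L1` is a plain character column rather than a factor (an assumption, not something confirmed in this thread), `levels()` returns `NULL`, so the `levels` argument ends up empty and no value can be matched to a level.

```r
# Hypothetical character column standing in for vtm$L1
x <- c("1", "1", "3")

# A character vector has no levels attribute
no_levels <- is.null(levels(x))

# rev(NULL) is still NULL, so every value fails to match a level
f_bad <- factor(x, levels = rev(levels(x)))
all_na <- all(is.na(f_bad))

# Building the levels from the unique values avoids the problem
f_ok <- factor(x, levels = rev(unique(x)))
levels(f_ok)
```

Checking `class(vtm$L1)` before calling `levels()` would show which case applies.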




Brandon Hurr

Dec 17, 2014, 9:02:33 AM
to trichter, ggplot2
Could you explain why you want to do this? Often the order of the data is ignored, for many reasons, and reordering has little value.

If you simply want the dataset backwards, why not focus on that?
Assuming the rownames are numeric and in order (they should be after melting):
revvtm <- vtm[rev(rownames(vtm)),]

trichter

Dec 17, 2014, 9:50:48 AM
to ggp...@googlegroups.com
Hi!

The order of the data follows the top-to-bottom appearance of the groups in a dendrogram. I definitely don't want them to be numerically ordered.

But your advice to reverse the dataframe by rows worked. Thank you very much.

trichter

Dec 17, 2014, 10:49:22 AM
to ggp...@googlegroups.com
Hm, it seems that ggplot ignores the reversed row order of the data frame after I ran
v3mr <- v2m[rev(rownames(v3m)),]
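
This behaviour is consistent with how a discrete axis is ordered: ggplot2 positions groups by factor levels, not by row order. A small base R sketch with made-up data (`df` and its values are hypothetical) shows that reversing the rows leaves the derived levels unchanged:

```r
# Toy data frame (made-up values); L1 is a character column
df <- data.frame(L1 = c("b", "b", "a", "a"), value = c(1, 2, 3, 4),
                 stringsAsFactors = FALSE)

# Reverse the rows, as suggested earlier in the thread
rev_df <- df[rev(seq_len(nrow(df))), ]

# Converting to a factor sorts the unique values, so both row orders agree
lv_original <- levels(factor(df$L1))
lv_reversed <- levels(factor(rev_df$L1))
identical(lv_original, lv_reversed)
```

So the axis order has to be changed through the factor levels themselves.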


Here is the answer:

I needed to create a vector of all the names as a factor, and use that to order the melted data:

nv2 <- rev(as.factor(names(vector2)))
v2m  <- melt(vector2)
v2m$L1 <- factor(v2m$L1, levels = nv2, ordered = TRUE)

And now everything is perfect. Thank you for your advice.
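
For reference, the same idea as a self-contained sketch in base R. The list `vec_list` is a hypothetical three-group excerpt (the first two values of groups `1`, `3` and `out` from the posted `test1`), and the long data frame is built by hand here instead of with `melt()`; `reshape2::melt()` on a list should give a similar `value`/`L1` layout.

```r
# Hypothetical excerpt of the posted data, kept in dendrogram order
vec_list <- list(`1` = c(82.5, 82.2), `3` = c(90.5, 92.0), out = c(81.1, 81.5))

# Long format: one row per observation, group name in L1
long <- data.frame(
  value = unlist(vec_list, use.names = FALSE),
  L1    = rep(names(vec_list), lengths(vec_list)),
  stringsAsFactors = FALSE
)

# The list names are unique, so reversing them gives valid factor levels
long$L1 <- factor(long$L1, levels = rev(names(vec_list)))
levels(long$L1)
```

Passing `long` to the `geom_boxplot()` call from the first message should then draw the groups in the order of these levels.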


