geom_bar fill order not maintained with stat='identity'


kathyjo...@gmail.com

Apr 28, 2017, 1:00:23 PM
to ggplot2
When plotting with geom_bar(stat = "identity"), the order of the rows is not maintained:
the x-axis gets rearranged in alphabetical order.

Here is a simple example:



library(tidyverse)

## Create a simple df
Region <- c("SouthEast","CentralWest","SouthWest","NorthWest","CentralEast","NorthCentral","NorthEast","Central", NA)
Attendance <- c(63448129,60835672,37126096,32870237,29031014,25861016,18620807,16083103,1729)
SimpleDF <- tibble(Region, Attendance)  # tibble() replaces the deprecated data_frame()



## plot bar chart 
ggplot(SimpleDF,aes(x=Region, y=Attendance))+
        geom_bar(stat ="identity")

## Notice that the variable 'Region' got rearranged in alphabetical order.
## Why isn't the original order maintained, and what is a way around it?
Thanks


Example geom_bar.pdf

Brandon Hurr

Apr 28, 2017, 2:45:46 PM
to kathyjo...@gmail.com, ggplot2
See top answer here: https://stackoverflow.com/questions/20041136/avoid-ggplot-sorting-the-x-axis-while-plotting-geom-bar

Summary: Define Region as an ordered factor and then plot. 

library(tidyverse) #install.packages("tidyverse")

## Create a simple df
Region <- c("SouthEast","CentralWest","SouthWest","NorthWest","CentralEast","NorthCentral","NorthEast","Central", NA)
Attendance <- c(63448129,60835672,37126096,32870237,29031014,25861016,18620807,16083103,1729)
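SimpleDF <-
        tibble(Region, Attendance) %>%  # tibble() replaces the deprecated data_frame()
        ## levels follow the order of first appearance instead of the alphabet
        mutate(Region = factor(Region, levels = unique(Region)))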

## plot bar chart 
ggplot(SimpleDF,aes(x=Region, y=Attendance)) +
        geom_bar(stat="identity")

HTH,
B
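
For anyone who wants the bars sorted by value rather than by row order, here is a minimal sketch using base R's reorder(). The three-row subset of the data and the name by_size are only for illustration; they are not part of the original example.

```r
## A shuffled three-row subset of the data above (illustration only)
Region     <- c("SouthWest", "SouthEast", "CentralWest")
Attendance <- c(37126096, 63448129, 60835672)

## reorder() returns a factor whose levels are sorted by the second argument;
## negating Attendance puts the largest value first
by_size <- reorder(Region, -Attendance)

levels(by_size)
# "SouthEast" "CentralWest" "SouthWest"
```

In a plot this would presumably be used as aes(x = reorder(Region, -Attendance), y = Attendance).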

--
--
You received this message because you are subscribed to the ggplot2 mailing list.
Please provide a reproducible example: https://github.com/hadley/devtools/wiki/Reproducibility
 
To post: email ggp...@googlegroups.com
To unsubscribe: email ggplot2+unsubscribe@googlegroups.com
More options: http://groups.google.com/group/ggplot2

---
You received this message because you are subscribed to the Google Groups "ggplot2" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ggplot2+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Joyce Robbins

Apr 28, 2017, 4:11:41 PM
to Brandon Hurr, kathyjo...@gmail.com, ggplot2
As noted in this comment on the page referenced below, it doesn't have to be an ordered factor:

"It's not making it an ordered factor that is making this work, it's specifying the desired order of the levels of the factor. Taking the ordered=TRUE out should have the same result. – Aaron May 13 '15 at 15:00"

Brandon Hurr

Apr 28, 2017, 4:15:34 PM
to Joyce Robbins, kathyjo...@gmail.com, ggplot2
I could have worded it better. Thanks for clarifying. It is indeed simply setting the order of the factor levels while creating the factor and not actually creating an "ordered" factor.
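
A small check of this point, as a sketch in base R (the names plain and ord are made up for illustration, and the vector is shortened): both factors carry the same level order, and only one of them is an "ordered" factor.

```r
Region <- c("SouthEast", "CentralWest", "SouthWest")

## Same level order in both; only the second is an "ordered" factor
plain <- factor(Region, levels = unique(Region))
ord   <- factor(Region, levels = unique(Region), ordered = TRUE)

identical(levels(plain), levels(ord))  # TRUE
is.ordered(plain)                      # FALSE
is.ordered(ord)                        # TRUE
```

Since the axis order follows the levels, either factor should give the same bar order.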


kathyjo...@gmail.com

Apr 28, 2017, 4:36:16 PM
to ggplot2, joycer...@gmail.com, kathyjo...@gmail.com
The thing is that with stat = "identity" the data is not manipulated in any way, so the bars should appear in the order of the original data frame.

So my question is: why do I have to make it a factor for this to work? Why is it changing the order?

Thanks again

kathy

Brandon Hurr

Apr 29, 2017, 2:54:17 AM
to kathyjo...@gmail.com, ggplot2, joycer...@gmail.com
The raw values are unchanged and are plotted as-is, as opposed to counts per Region, which would have to be summarized from the raw data.
But the Region column is treated as a factor, and the levels of that factor are sorted alphabetically with NA placed last. It has been like this, or similar to this, for as long as I can remember with ggplot2, and it has been one of the most asked questions here and on Stack Overflow.

A lot of people don't like it because they've purposefully ordered their dataset non-programmatically. If you step outside of this box and stop making assumptions about the order of the dataset, you'll see that there is no "best way" to define order for everyone. The best way is to pick a general default (alphabetical) and let the user programmatically assign what they want.

I realize that is not very satisfying, but that is the best answer I have for you. Hadley might have a better perspective on the origins of this behavior since I believe he is the one who wrote it.

HTH
B

kathyjo...@gmail.com

May 1, 2017, 6:25:20 PM
to ggplot2, kathyjo...@gmail.com
Thanks Brandon,
Hadley has asked me to post an example to the ggplot2 mailing list. I'm ashamed to say that I don't know what that is.
I thought I was doing that here (Google group). Do you know what the ggplot2 mailing list is?

Thanks again

Brandon Hurr

May 2, 2017, 7:39:18 AM
to kathyjo...@gmail.com, ggplot2, Hadley Wickham
You are in the right place. If Hadley can spare the time to explain the historical reasoning behind this design decision then hopefully he can do that here.

I scrolled through the NEWS file and wasn't able to find anything. The change might have occurred in a linked package like scales though. 

HTH,
Brandon




Hadley Wickham

May 3, 2017, 11:05:31 AM
to Brandon Hurr, kathyjo...@gmail.com, ggplot2
It's basically because when you create a factor in R it uses alphabetical order, not the order in which the values appear.

Hadley
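
A minimal base R sketch of that default, using a shortened version of the Region vector from the original post (the names default_levels and data_levels are made up for illustration):

```r
Region <- c("SouthEast", "CentralWest", "SouthWest", NA)

## Default: levels are the sorted unique values; NA is not a level
default_levels <- levels(factor(Region))
default_levels
# "CentralWest" "SouthEast" "SouthWest"

## Explicit levels: keep the order in which the values first appear
data_levels <- levels(factor(Region, levels = unique(Region)))
data_levels
# "SouthEast" "CentralWest" "SouthWest"
```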