Histogram of nested data


Valentina Peona

Jul 20, 2015, 6:33:24 AM
to ggp...@googlegroups.com
Hi everyone,

I'm trying to create a histogram from this dataset:

Species   Elements  Order        Pair  Genome
Dog             61  Carnivora       1  blue
Cat            223  Carnivora       1  red
Hedgehog        77  Insectivora     1  blue
Shrew         2565  Insectivora     1  red
KanRat         171  Rodentia        1  blue
Mouse          737  Rodentia        1  red
Sqrel           73  Rodentia        2  blue
Mouse          737  Rodentia        2  red
KanRat         171  Rodentia        3  blue
Rat           2321  Rodentia        3  red
Sqrel           73  Rodentia        4  blue
Rat           2321  Rodentia        4  red

I would like something like the histogram in the image below: the bars of the species belonging to the same pair and order should sit next to one another, and each pair should be spaced apart from the others. The colors are given by the Genome variable.

If possible, I would also like to facet the plot by the Order variable.
I'm quite new to ggplot and I don't understand how to do that.

Can you help me?

Thanks a lot!

Roman Luštrik

Jul 20, 2015, 6:38:05 AM
to Valentina Peona, ggplot2
I think you want a barplot, something akin to this:


Cheers,
Roman



--
In God we trust, all others bring data.

Valentina Peona

Aug 5, 2015, 9:02:37 AM
to ggplot2, valenti...@gmail.com
Thank you Roman, it was useful even though I couldn't solve the entire problem.

Roman Luštrik

Aug 5, 2015, 3:07:39 PM
to Valentina Peona, ggplot2
Which parts are falling short?

Cheers,
Roman

Dennis Murphy

Aug 5, 2015, 4:29:50 PM
to Valentina Peona, ggplot2
Hi:

I assume this has something to do with the other post you made today. Here are a couple of takes on this problem. You should use Pair as the x-aesthetic, which I modified in your data frame (see below). I also used scale_x_continuous() to plot the labels. Since I'm not a fan of angled text, the second take flips the graph so that the labels can be read easily.

You should also learn how to post data to a text-based mailing list - copy and paste from an R console or spreadsheet is not good. Here's what I got when trying to read your data:

DF <- read.table(header = TRUE, text = "
+ SpeciesElementsOrder     PairGenome
+ Dog   61Carnivora  1blue
+ Cat   223Carnivora  1red
+ Hedgehog 77Insectivora     1blue
+ Shrew      2565Insectivora  1red
+ KanRat171Rodentia   1blue
+ Mouse     737Rodentia   1red
+ Sqrel     73Rodentia   2blue
+ Mouse 737Rodentia   2red
+ KanRat171Rodentia   3blue
+ Rat     2321Rodentia   3red
+ Sqrel     73Rodentia   4blue
+ Rat     2321Rodentia   4red")
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
  line 5 did not have 3 elements


Hmmm, hidden tabs. After I edited your data and renumbered the Pair variable to run from 1 to 6, one value per pair across the whole data set rather than restarting within each Order, I got a data frame named DF with Genome converted from factor to character.
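
The renumbering could also be done in code rather than by hand. This is only a minimal sketch: it assumes the posted columns were read in cleanly, and the names `raw` and `PairID` are made up for illustration. It numbers pairs by order of first appearance, so Insectivora comes out as pair 2 rather than 6 as in the DF used below; only the grouping matters for the plot.

```r
# Hypothetical clean copy of the posted data (Pair restarts within each Order)
raw <- data.frame(
  Species  = c("Dog", "Cat", "Hedgehog", "Shrew", "KanRat", "Mouse",
               "Sqrel", "Mouse", "KanRat", "Rat", "Sqrel", "Rat"),
  Elements = c(61, 223, 77, 2565, 171, 737, 73, 737, 171, 2321, 73, 2321),
  Order    = rep(c("Carnivora", "Insectivora", "Rodentia"), times = c(2, 2, 8)),
  Pair     = c(1, 1, 1, 1, 1, 1, 2, 2, 3, 3, 4, 4),
  Genome   = rep(c("blue", "red"), times = 6),
  stringsAsFactors = FALSE
)

# Combine Order and Pair into one key, then number the keys by first appearance
key <- paste(raw$Order, raw$Pair)
raw$PairID <- match(key, unique(key))

raw[, c("Species", "Order", "Pair", "PairID")]
```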

Here is how you should send data:

> dput(DF)       # what you type to the R console

# The output from the above call
structure(list(Species = structure(c(2L, 1L, 3L, 7L, 4L, 5L,
8L, 5L, 4L, 6L, 8L, 6L), .Label = c("Cat", "Dog", "Hedgehog",
"KanRat", "Mouse", "Rat", "Shrew", "Sqrel"), class = "factor"),
    Elements = c(61L, 223L, 77L, 2565L, 171L, 737L, 73L, 737L,
    171L, 2321L, 73L, 2321L), Order = structure(c(1L, 1L, 2L,
    2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("Carnivora",
    "Insectivora", "Rodentia"), class = "factor"), Pair = c(1L,
    1L, 6L, 6L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L), Genome = c("blue",
    "red", "blue", "red", "blue", "red", "blue", "red", "blue",
    "red", "blue", "red")), .Names = c("Species", "Elements",
"Order", "Pair", "Genome"), row.names = c(NA, -12L), class = "data.frame")

The part starting with "structure" is what you should copy/paste into your help message. It is a reproduction of the R object named DF in text form. Anyone who copy/pastes this into their R session will see exactly the same object as you do.
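
To see the round trip in miniature, here is a small sketch with a made-up two-row data frame (`small`, `txt` and `rebuilt` are illustrative names, not part of the original data). It captures the text that dput() would print and evaluates it again, which is what a reader does when pasting that text into their session.

```r
# Small made-up data frame, just to show the round trip
small <- data.frame(Species  = c("Dog", "Cat"),
                    Elements = c(61L, 223L),
                    stringsAsFactors = FALSE)

# dput() writes the object as R code; capture that text instead of printing it
txt <- paste(capture.output(dput(small)), collapse = "\n")
cat(txt, "\n")

# Evaluating the text rebuilds the object, as a reader pasting it would
rebuilt <- eval(parse(text = txt))

# Compare the rebuilt object with the original
all.equal(small, rebuilt)
```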

That said, using the data above from dput() (and assigning it to object DF), the following code gets close to your posted graph:

### ---------- start here
library(ggplot2)

# Overkill to emphasize the point
DF <- structure(list(Species = structure(c(2L, 1L, 3L, 7L, 4L, 5L,
8L, 5L, 4L, 6L, 8L, 6L), .Label = c("Cat", "Dog", "Hedgehog",
"KanRat", "Mouse", "Rat", "Shrew", "Sqrel"), class = "factor"),
    Elements = c(61L, 223L, 77L, 2565L, 171L, 737L, 73L, 737L,
    171L, 2321L, 73L, 2321L), Order = structure(c(1L, 1L, 2L,
    2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("Carnivora",
    "Insectivora", "Rodentia"), class = "factor"), Pair = c(1L,
    1L, 6L, 6L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L), Genome = c("blue",
    "red", "blue", "red", "blue", "red", "blue", "red", "blue",
    "red", "blue", "red")), .Names = c("Species", "Elements",
"Order", "Pair", "Genome"), row.names = c(NA, -12L), class = "data.frame")

ggplot(DF, aes(x = Pair, y = Elements, fill = Genome)) +
     geom_bar(position = "dodge", stat = "identity", width = 0.5) +
     scale_fill_identity() +
     scale_x_continuous(breaks = DF$Pair + c(-0.125, 0.125),
                      labels = DF$Species) +
     theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0),
           panel.grid.minor = element_blank())

# Since I don't like angled text, here's another version that flips the graph:

ggplot(DF, aes(x = Pair, y = Elements, fill = Genome)) +
     geom_bar(position = "dodge", stat = "identity", width = 0.5) +
     scale_fill_identity() +
     scale_x_continuous(breaks = DF$Pair + c(-0.125, 0.125),
                      labels = DF$Species) +
     coord_flip() +
     theme(panel.grid.major.y = element_blank())

### ------ end
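
The original question also asked about faceting by Order. One possible variation on the first plot is sketched below; it is not part of the code above. It assumes that facet_grid() with free x scales is acceptable for this layout, and it rebuilds DF compactly with character columns instead of factors so that the block runs on its own. The object name `p` is arbitrary.

```r
library(ggplot2)

# Same values as the dput() output above, rebuilt compactly
DF <- data.frame(
  Species  = c("Dog", "Cat", "Hedgehog", "Shrew", "KanRat", "Mouse",
               "Sqrel", "Mouse", "KanRat", "Rat", "Sqrel", "Rat"),
  Elements = c(61L, 223L, 77L, 2565L, 171L, 737L,
               73L, 737L, 171L, 2321L, 73L, 2321L),
  Order    = rep(c("Carnivora", "Insectivora", "Rodentia"), times = c(2, 2, 8)),
  Pair     = c(1L, 1L, 6L, 6L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L),
  Genome   = rep(c("blue", "red"), times = 6),
  stringsAsFactors = FALSE
)

p <- ggplot(DF, aes(x = Pair, y = Elements, fill = Genome)) +
  geom_bar(position = "dodge", stat = "identity", width = 0.5) +
  scale_fill_identity() +
  # One tick per bar: each pair's two bars sit at Pair - 0.125 and Pair + 0.125
  scale_x_continuous(breaks = DF$Pair + c(-0.125, 0.125),
                     labels = DF$Species) +
  # One panel per Order; free x scales and widths so each panel
  # only spans the pairs it contains
  facet_grid(. ~ Order, scales = "free_x", space = "free_x") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5),
        panel.grid.minor = element_blank())

print(p)
```

With space = "free_x" the Rodentia panel, which holds four pairs, is drawn wider than the two single-pair panels, so the bars keep the same width across panels.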

HTH,
Dennis

