Plotting multiple lines from a data frame with ggplot2

Nan DENG

May 2, 2014, 1:03:50 AM
to ggp...@googlegroups.com
Hi all,

I am trying to plot multiple lines using ggplot2. My data is stored in a data frame as follows:

    > rs
      time           1           2           3           4
    1  200 17230622635 17280401147 17296993985 17313586822
    2  400 22328386154 22456712709 22499488227 22542263745
    3  600 28958840968 29186097622 29261849840 29337602058
    4  800 40251281810 40650094691 40783032318 40915969945
    5 1000 73705771414 74612829244 74915181854 75217534464

I would like to use the "time" column as the x value. The other columns hold the y values for the different lines. In the data above there are 4 lines, and each line consists of 5 points. More specifically, the first line has points (200, 17230622635), (400, 22328386154), (600, 28958840968), etc. The second line has points (200, 17280401147), (400, 22456712709), etc. (If you need further explanation of the data format, see the P.S. at the end.)

To generate similar data (with three value columns instead of four), you could use the following code:

    rs = data.frame(seq(200, 1000, by=200), runif(5), runif(5), runif(5))
    names(rs)=c("time", 1:3)

I followed some examples on Stack Overflow and tried to use reshape2 and ggplot2 to make this plot.

I first melt the data into "long format":

    library('reshape2')
    library('ggplot2')
    melted = melt(rs, id.vars="time")

Then I plot the data using the following statement:

    ggplot() + geom_line(data=melted, aes(x="time", y="value", group="variable"))

However, I got an empty graph with neither points nor lines.

Can anyone help me to see what's wrong with my procedure? Thank you in advance!

Regards,
-Nan

P.S.

**About the data format**:

Imagine there are many students in a class and we have their scores on several quizzes. The first column is the quiz number, and each of the remaining columns holds one student's scores. For each student, we want to plot a line showing how his/her scores change across quizzes; each point is that student's score on one quiz. Since there are multiple students, we would like to draw multiple lines.

**About the melted data**:

For the data shown above, the ``melt()`` function returns:

    > melted
       time variable       value
    1   200        1 17230622635
    2   400        1 22328386154
    3   600        1 28958840968
    4   800        1 40251281810
    5  1000        1 73705771414
    6   200        2 17280401147
    7   400        2 22456712709
    8   600        2 29186097622
    9   800        2 40650094691
    10 1000        2 74612829244
    11  200        3 17296993985
    12  400        3 22499488227
    13  600        3 29261849840
    14  800        3 40783032318
    15 1000        3 74915181854
    16  200        4 17313586822
    17  400        4 22542263745
    18  600        4 29337602058
    19  800        4 40915969945
    20 1000        4 75217534464

Brandon Hurr

May 2, 2014, 5:55:31 AM
to Nan DENG, ggplot2
I believe your problem is here: 

    ggplot() + geom_line(data=melted, aes(x="time", y="value", group="variable"))

You should not put your variable names in quotes unless you are using aes_string().

    ggplot() + geom_line(data=melted, aes(x=time, y=value, group=variable))
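
For anyone reproducing the fix end to end, here is a minimal sketch that combines the sample-data snippet from the question with the unquoted `aes()` call. The `set.seed()` call, the `colour` mapping, and the object names `p` and `p_string` are additions for illustration and are not part of the original posts. Inside `aes()`, a quoted name such as `"time"` is taken as a constant string rather than a reference to a column, so the quoted version never looks at the data in `melted`. The second plot assumes a ggplot2 version that supports the `.data` pronoun, which is one way to map columns whose names are held as strings.

```r
library(reshape2)
library(ggplot2)

# Made-up stand-in data with the same shape as `rs` in the question
# (set.seed() is an addition so the random values are repeatable)
set.seed(1)
rs <- data.frame(seq(200, 1000, by = 200), runif(5), runif(5), runif(5))
names(rs) <- c("time", 1:3)

# Long format: one row per (time, variable) pair
melted <- melt(rs, id.vars = "time")

# Unquoted names inside aes() refer to columns of `melted`.
# `colour = variable` is an optional addition to tell the lines apart.
p <- ggplot(melted, aes(x = time, y = value,
                        group = variable, colour = variable)) +
  geom_line()
print(p)

# Variant for column names stored as strings (assumes the `.data`
# pronoun is available in the installed ggplot2 version)
x_col <- "time"
y_col <- "value"
g_col <- "variable"
p_string <- ggplot(melted, aes(x = .data[[x_col]], y = .data[[y_col]],
                               group = .data[[g_col]])) +
  geom_line()
print(p_string)
```

Both plots should show three lines of five points each, one line per value column in `rs`.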

HTH, 

B

Monnand

May 4, 2014, 5:46:48 PM
to Brandon Hurr, ggplot2
Thank you so much!

Yes. Dropping the quotes makes it work.

Regards,
-Nan