left_join by range

36 views
Skip to first unread message

Earl Brown

unread,
May 3, 2015, 12:09:22 AM5/3/15
to manip...@googlegroups.com
manipulatr-ers,

I'm trying to join two tables by taking a numeric value in one column and finding it's matching range (based on two columns) in the other data frame. Here's a toy example:

first_df <- 
  data_frame(
    start_time = c(20, 30, 40),
    end_time = c(29, 39, 49)
  )

second_df <- 
  data_frame(
    arrival_time = c(25, 33, 47),
    person = c("Bob", "Robert", "Rob")
  )

# doesn't work; just shows the logic I'm looking for
left_join(first_df, second_df, by = c("start_time" < "arrival_time" & "end_time" > "arrival_time"))

I'd like to end up with a data frame like this:
  start_time end_time person
1         20       29     Bob
2         30       39     Robert
3         40       49     Rob

While I can use several for loops to get the result, I'm wondering if there is a less painful way, possibly with left_join().

Thanks for your any ideas.

Message has been deleted

D Holmes

unread,
May 3, 2015, 9:44:08 AM5/3/15
to manip...@googlegroups.com
Cut works wonders:
> second_df <- 
+     data_frame(
+         arrival_time = c(25, 33, 47,22),
+         person = c("Bob", "Robert", "Rob","bobbert")
+     )
> 
> tmp<-second_df %>% mutate(start_time=as.character(cut(arrival_time,c(first_df$start_time,Inf),labels=first_df$start_time)))
> left_join(mutate(first_df, start_time=as.character(start_time)),tmp,by="start_time")
Source: local data frame [4 x 4]

  start_time end_time arrival_time  person
1         20       29           25     Bob
2         20       29           22 bobbert
3         30       39           33  Robert
4         40       49           47     Rob

>

Earl Brown

unread,
May 4, 2015, 1:59:13 PM5/4/15
to manip...@googlegroups.com
Thank you so much. That works.
Reply all
Reply to author
Forward
0 new messages