Geoff Chan

Mar 27, 2013, 7:35:01 AM
to hk...@googlegroups.com
hey, 

I just introduced the Facebook page to a friend (a social scientist who wants to learn R) and invited her to join, but she said the title page is SOO... UGLY... lol. ROOM TO IMPROVE...

G

Chung-hong Chan

Mar 27, 2013, 9:42:58 AM
to hk...@googlegroups.com
Yes, it is ugly. Replaced with something "nicer". 

Geoff Chan

Mar 27, 2013, 11:59:58 AM
to hk...@googlegroups.com
Hi All, 

Some questions regarding data frame matching by keys
 
I have two data frames: 
 
data frame 1 is                                      

x1 x2
1   NA
2   NA
3   NA
4   NA
5   NA
6   NA
7   NA
8   NA

data frame 2 is 

y1 y2
1   45
3   71
5   83
8   98

I would like to match these two data frames by key into something like this, with 0 wherever a key has no match:

data frame 3 is

x1 y3
1   45
2   0
3   71
4   0
5   83
6   0
7   0
8   98

Currently I write something like:

dataframe1 <- data.frame(1:8, rep(NA, 8))
dataframe2 <- data.frame(c(1, 3, 5, 8), c(45, 71, 83, 98))

# For each key in dataframe1, look up the matching row in dataframe2
for (i in 1:nrow(dataframe1)) {
  row.position <- pmatch(dataframe1[i, 1], dataframe2[, 1])
  dataframe1[i, 2] <- dataframe2[row.position, 2]
}
# Keys with no match are still NA; set them to 0
dataframe1[is.na(dataframe1[, 2]), 2] <- 0

But as you can see, the script relies on a for loop (and seems a bit stupid).

I wonder whether there is a better way to do this?

G



C.H.

Mar 27, 2013, 12:31:37 PM
to hk...@googlegroups.com
I would do it looplessly this way. Actually, the problem is only to
generate the y3 vector in the third data frame using x1 from the first
dataframe.

x1 <- 1:8 # or x1 <- dataframe1[,1]
y3 <- dataframe2[match(x1, dataframe2[,1]),2]
y3[is.na(y3)] <- 0

Then make a data frame using x1 and y3.
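
For completeness, here is a minimal end-to-end sketch of the steps above, including the final data frame. The column names x1, x2, y1 and y2 are assumed from the tables in the question; the original code left the columns unnamed.

```r
# Rebuild the two data frames from the question (column names are assumed).
dataframe1 <- data.frame(x1 = 1:8, x2 = NA)
dataframe2 <- data.frame(y1 = c(1, 3, 5, 8), y2 = c(45, 71, 83, 98))

# match() gives, for each x1, the row of dataframe2 holding the same key
# (NA when the key is absent).
idx <- match(dataframe1$x1, dataframe2$y1)
y3 <- dataframe2$y2[idx]
y3[is.na(y3)] <- 0  # keys with no match become 0

dataframe3 <- data.frame(x1 = dataframe1$x1, y3 = y3)
print(dataframe3)
```

Because match() is vectorised, the whole lookup happens in one call and no loop is needed.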

Melody Lam

Mar 27, 2013, 11:43:00 PM
to hk...@googlegroups.com
Hong's method is great!
Personally, I would think of using the hash package (as I use Perl more frequently than R).
Melody
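
A rough sketch of the hash-style idea, in case it helps. Note this does not use the hash package itself; it is an assumption-level stand-in that uses a plain named vector in base R as the key-to-value table, much like a Perl hash. The column names y1 and y2 are assumed from the question.

```r
dataframe2 <- data.frame(y1 = c(1, 3, 5, 8), y2 = c(45, 71, 83, 98))

# Build the lookup table: values are y2, names are the keys as strings.
lookup <- setNames(dataframe2$y2, dataframe2$y1)

# Look up every key from 1 to 8 by name; missing keys come back as NA.
keys <- as.character(1:8)
y3 <- unname(lookup[keys])
y3[is.na(y3)] <- 0  # keys not in the table become 0

dataframe3 <- data.frame(x1 = 1:8, y3 = y3)
print(dataframe3)
```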

Geoff Chan

Mar 28, 2013, 12:00:06 AM
to hk...@googlegroups.com
Thanks for the suggestions.

CH's method works like a charm, thanks.

I have never explored the hash package, so I shall give it a try. Thanks.

C.H.

Mar 28, 2013, 12:28:23 AM
to hk...@googlegroups.com
Another stupid (or insane) way to do this is to use sqldf. Those who
dream in SQL will like this method. But the colnames for dataframe1
and dataframe2 need to be more descriptive.

library(sqldf)
colnames(dataframe1) <- c("x1", "x2")
colnames(dataframe2) <- c("y1", "y2")
# A left outer join keeps every x1; rows with no match get NA in y3
dataframe3 <- sqldf("select x1, y2 as y3 from dataframe1
                     left outer join dataframe2 on x1 = y1")
dataframe3$y3[is.na(dataframe3$y3)] <- 0
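
The same left outer join can probably be done without sqldf, using merge() from base R. This is only a sketch of an alternative, not the method posted above; it assumes the same column names (x1, x2, y1, y2).

```r
dataframe1 <- data.frame(x1 = 1:8, x2 = NA)
dataframe2 <- data.frame(y1 = c(1, 3, 5, 8), y2 = c(45, 71, 83, 98))

# all.x = TRUE keeps every row of dataframe1, i.e. a left outer join on x1 = y1.
merged <- merge(dataframe1, dataframe2,
                by.x = "x1", by.y = "y1", all.x = TRUE)
merged$y2[is.na(merged$y2)] <- 0  # unmatched keys become 0

dataframe3 <- data.frame(x1 = merged$x1, y3 = merged$y2)
print(dataframe3)
```

merge() sorts the result by the key column by default, so the rows come back in x1 order.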