Newbie question

kris

Mar 25, 2008, 3:25:02 PM

to CorpLing with R

I am new to R. I am trying to match two tables, T1 with 4 columns and
T2 with 2 columns. I want to match the elements of T1[,4] with T2[,2].
I used kk<-match(T1[,4],T2[,2]) but it did not work. Then I checked
using grep and did find that a lot of the data in T1[,4] was indeed in T2[,2].

Any suggestions will be useful.

Stefan Th. Gries

Mar 25, 2008, 3:28:34 PM

to math...@gmail.com, corplin...@googlegroups.com

Can you plz send an example with the kind of data you have and the
kind of output you'd like to get?
Thx,
STG
--
Stefan Th. Gries
-----------------------------------------------
University of California, Santa Barbara
http://www.linguistics.ucsb.edu/faculty/stgries
-----------------------------------------------

kris

Mar 25, 2008, 4:05:15 PM

to CorpLing with R

The data in T2
1   990334.  G000021
2   844047.  G000025
3   281739.  G000037
4   280903.  G000039
5   418546.  G000046
6   416652.  G000047
7   416651.  G000048
8   374000.  G000053
9   396509.  G000059
10  396450.  G000061

The data in T1
V1 V2 V3 V4
1 T00015 AFP1 human G004657
2 T00035 AP-2alphaA human G002615
3 T00036 AP-4 human G003918
4 T00040 AR human G004953
5 T00045 COUP-TF2 human G004656
6 T00048 ATBF1-B human G004657
7 T00052 ATF-a human G004956
8 T00053 ATFa-isoform1 human G004956
9 T00065 B-Myb human G003920
10 T00070 Pax-5 human G003921

I want to compare the T1[,4] with T2[,2] and find the common data that
occurs the columns of the table.

Stefan Th. Gries

Mar 25, 2008, 4:43:54 PM

to corplin...@googlegroups.com

I am not exactly sure what

> I want to compare the T1[,4] with T2[,2] and find the common data that occurs the columns of the table.

means, and in your example there is no overlap.

# However, if these were your data:
set.seed(1)
T1<-data.frame(rnorm(10), rnorm(10), sample(letters[1:10]))
T2<-data.frame(rnorm(10), rnorm(10), sample(letters[5:14]))

# then this may be what you want:
T1[T1[,3] %in% T2[,3],]
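
# A side note on why match() may have looked as if it "did not work":
# match() returns positions (or NA where there is no match) and only counts
# exact matches, whereas grep() also finds partial matches. A minimal sketch,
# assuming hypothetical ID vectors ids1 and ids2; the trailing space in one
# value is invented for illustration and is only a guess at what might be
# going on in the real data:

```r
# Hypothetical ID vectors; the trailing space in "G000039 " is made up
# to illustrate a common reason for failed exact matches.
ids1 <- c("G000046", "G000039 ", "G003918")
ids2 <- c("G000021", "G000039", "G000046")

# match() gives the position of each element of ids1 in ids2, NA if absent
pos <- match(ids1, ids2)

# %in% is the logical counterpart: TRUE where match() found a position
found <- ids1 %in% ids2

# grep() does partial (regular-expression) matching, so it finds "G000039"
# inside "G000039 " even though the two strings are not identical
partial <- grep("G000039", ids1)

# Stripping surrounding whitespace first lets the exact match succeed
found_trimmed <- trimws(ids1) %in% ids2
```

# So a vector of mostly NA from match() means "no exact match at these
# positions", not an error; if grep() disagrees, stray whitespace or
# similar differences between the two columns are worth checking.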

HTH,

kris

Mar 25, 2008, 6:52:34 PM

to CorpLing with R

Sorry about the vagueness. So here is a better example.

T1
1  990334.  G000021
2  844047.  G000025
3  281739.  G000037
4  280903.  G000039
5  418546.  G000046

T2
1  T00015  AFP1        human  G000046
2  T00035  AP-2alphaA  human  G000039
3  T00036  AP-4        human  G003918
4  T00040  AR          human  G004953
5  T00045  COUP-TF2    human  G004656

I want table T3, to be

280903. G000039 T00035 AP-2alphaA human
418546. G000046 T00015 AFP1 human

Here the common elements are those in the column containing G00$$$$. I hope
this explains better what I am trying to do. Thanks for your help.

Stefan Th. Gries

Mar 25, 2008, 7:08:13 PM

to corplin...@googlegroups.com

> sorry about the vagueness [...]
np ...

# If these are your data (note that the data frames have column names):

set.seed(1)
T1<-data.frame(rnorm(10), rnorm(10), sample(letters[1:10]))

names(T1)<-c("T1a", "T1b", "T1c")

T2<-data.frame(rnorm(10), sample(letters[5:14]), rnorm(10))
names(T2)<-c("T2a", "T2b", "T2c")

# then this should do what you want
merge(T1, T2, by.x="T1c", by.y="T2b")

# this checks which rows of T1c and T2b contain the same values and
# merges the data frames, returning all columns for the matching rows
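
# The same idea applied to the example data from the earlier message might
# look roughly like this. It is only a sketch: the column names (count, gene,
# tf_id, tf_name, species) are made up, and the first column is assumed to be
# numeric.

```r
# Example data typed in from the earlier message; the column names
# are hypothetical and chosen only for this sketch.
T1 <- data.frame(
  count = c(990334, 844047, 281739, 280903, 418546),
  gene  = c("G000021", "G000025", "G000037", "G000039", "G000046"),
  stringsAsFactors = FALSE
)
T2 <- data.frame(
  tf_id   = c("T00015", "T00035", "T00036", "T00040", "T00045"),
  tf_name = c("AFP1", "AP-2alphaA", "AP-4", "AR", "COUP-TF2"),
  species = "human",
  gene    = c("G000046", "G000039", "G003918", "G004953", "G004656"),
  stringsAsFactors = FALSE
)

# Both data frames share the key column "gene", so by = "gene" is enough;
# only rows whose gene ID occurs in both are kept (an inner join)
T3 <- merge(T1, T2, by = "gene")

# Reorder the columns to the layout asked for
T3 <- T3[, c("count", "gene", "tf_id", "tf_name", "species")]
T3
```

# With these data, T3 should contain just the two rows for G000039 and
# G000046. merge() sorts the result by the key column by default; use
# all.x=TRUE or all.y=TRUE if unmatched rows should be kept as well.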
