I have a 'long' data frame with id and featureCode columns. The featureCode column contains values of a categorical variable; each record has between 1 and 9 of these. For example:
I'd like to calculate the number of times each feature code is used with the other feature codes (the "pairwise counts" of the title). Ultimately, the result would be a matrix. For example:
     PPLC PCLI PPL
PPLC    0    3   1
PCLI    3    0   1
PPL     1    1   0
However, I suspect that to get this far I first need to use plyr (or similar) to produce an intermediate data frame of the form:
id featureCode1 featureCode2
 5         PPLC         PCLI
 5         PCLI         PPLC
I scoured the web (and the ggplot2 book, which has a section on plyr) for help and came up with the following:
my_func <- function(df) {
  with(df, data.frame(
    for (i in 1:length(featureCode)) {
      for (j in 1:length(featureCode)) {
        if (i != j) {
          featureCode1 = featureCode[i]
          featureCode2 = featureCode[j]
        }
      }
    }
  ))
}
Not surprisingly, it doesn't work (I'm an R beginner and come from a Java background). However, I include it here to give you an idea of what I'm trying to do.
Could anyone suggest where I might be going wrong? Thanks in advance for any help.
After this stage, I'm not sure the best way to count up the pairings. I can
think of some not-very-elegant ways to do it, but maybe someone else will
have better ideas.
-Winston
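One possible way to count the pairings, sketched in base R: cross-tabulate id against featureCode, then take the cross product of that table with itself. The data frame `dat` and the names `incidence` and `pair_counts` are hypothetical (the example data is not shown in this thread); the values are chosen so that the result agrees with the matrix in the question. The sketch assumes that no id lists the same feature code more than once.

```r
# Hypothetical long-format data: one row per (id, featureCode) pair.
dat <- data.frame(
  id          = c(1, 1, 2, 2, 3, 3, 3),
  featureCode = c("PPLC", "PCLI", "PPLC", "PCLI", "PPLC", "PCLI", "PPL")
)

# Incidence table: rows are ids, columns are feature codes.
# Entries are 0/1 as long as no id repeats a code (assumption).
incidence <- table(dat$id, dat$featureCode)

# t(incidence) %*% incidence: entry [a, b] is the number of ids
# that carry both code a and code b.
pair_counts <- crossprod(incidence)

# The diagonal holds each code's own frequency; set it to zero
# to match the layout of the matrix in the question.
diag(pair_counts) <- 0

pair_counts
```

The rows and columns come out in alphabetical order of the codes, so they may need reordering to match a particular layout.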
On Thu, Nov 1, 2012 at 11:16 AM, Iain Dillingham <iain.dilling...@gmail.com> wrote:
> I have a 'long' data frame with id and featureCode columns. The
> featureCode column contains values of a categorical variable; each record
> has between 1 and 9 of these. For example:
> I'd like to calculate the number of times each feature code is used with
> the other feature codes (the "pairwise counts" of the title). Ultimately,
> the result would be a matrix. For example:
> Not surprisingly, it doesn't work (I'm an R beginner and come from a Java
> background). However, I include it here to give you an idea of what I'm
> trying to do.
> Could anyone suggest where I might be going wrong? Thanks in advance for
> any help.
> Iain
> --
> You received this message because you are subscribed to the Google Groups
> "manipulatr" group.
> To view this discussion on the web visit
> https://groups.google.com/d/msg/manipulatr/-/vnt_nNbXcYkJ.
> To post to this group, send email to manipulatr@googlegroups.com.
> To unsubscribe from this group, send email to
> manipulatr+unsubscribe@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/manipulatr?hl=en.
<peter.meilst...@gmail.com> wrote:
> Do a join of the dat
Thanks for your help. Unfortunately the merge isn't quite what I'm looking for, as it double-counts categories. However, if you're interested, I also posted the question on Stack Overflow <http://stackoverflow.com/questions/13176741/how-to-calculate-a-table-...> and received some useful advice.
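For reference, the intermediate data frame described in the original question can be sketched with a self-join in base R. This is an illustration on hypothetical data (`dat`, `pairs` and `pair_table` are made-up names), and not necessarily the join that was suggested earlier in the thread. Joining the data frame to itself on id yields every ordered pair of codes within an id, including each code paired with itself, so the self-pairs are dropped before tabulating.

```r
# Hypothetical long-format data: one row per (id, featureCode) pair.
dat <- data.frame(
  id          = c(1, 1, 2, 2, 3, 3, 3),
  featureCode = c("PPLC", "PCLI", "PPLC", "PCLI", "PPLC", "PCLI", "PPL"),
  stringsAsFactors = FALSE
)

# Self-join on id; the suffixes turn the two featureCode columns
# into featureCode1 and featureCode2.
pairs <- merge(dat, dat, by = "id", suffixes = c("1", "2"))

# Drop rows where a code is paired with itself.
pairs <- pairs[pairs$featureCode1 != pairs$featureCode2, ]

# Cross-tabulate the remaining pairs. Each unordered pair occurs
# once per ordering, which makes the table symmetric.
pair_table <- table(pairs$featureCode1, pairs$featureCode2)

pair_table
```

Because both orderings of a pair are kept, summing the whole table counts every co-occurrence twice; summing only the upper triangle avoids that.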
On Thursday, 1 November 2012 16:16:10 UTC, Iain Dillingham wrote:
> Hello everyone,
> I have a 'long' data frame with id and featureCode columns. The featureCode column contains values of a categorical variable; each record has between 1 and 9 of these. For example:
> I'd like to calculate the number of times each feature code is used with the other feature codes (the "pairwise counts" of the title). Ultimately, the result would be a matrix. For example:
> Not surprisingly, it doesn't work (I'm an R beginner and come from a Java background). However, I include it here to give you an idea of what I'm trying to do.
> Could anyone suggest where I might be going wrong? Thanks in advance for any help.