Ben Bimber
Mar 18, 2010, 10:41:23 AM
to maduser
I have a data frame containing the Id, Mother, Father and Sex for about 10,000 animals in our colony. I am interested in graphing simple family trees for a given subject or a small number of subjects. The basic idea is: start with the data frame for the entire colony and a list of index animals. I need to identify all immediate relatives of these index animals and plot the pedigree for them. We're not trying to do any real analysis, just present a visualization of the family structure. I have used the kinship and pedigree packages to plot the pedigree. My question is about efficiently identifying the animals to include in the pedigree:
Starting with the data frame of ~10,000 records, I want to use a set of index animals to extract their immediate relatives and plot only that small subset in the pedigree. 'Immediate relatives' is a somewhat ambiguous term; I am currently defining it as 3 generations forward and 3 backward. Currently, I have a somewhat ugly approach where I step through each generation forward or backward in a loop and build a new data frame. Is there a better approach, or a package that does this? I realize my code should be rewritten to get rid of the loops, so if anyone has suggestions there I would appreciate that as well. Thanks in advance.
Code to calculate generations forward and backward:
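#build backwards
#ped starts as the rows for the index animals; gens is the number of generations to walk
#queryIds holds the unique Ids for parents of the index animals
#(assumes unknown parents are stored as NA, so they are dropped here)
queryIds <- unique(c(ped$Sire, ped$Dam))
queryIds <- queryIds[!is.na(queryIds)]
for (i in seq_len(gens)) {
  if (length(queryIds) == 0) break
  #allPed is the data frame with Id, Dam, Sire and Sex for animals in our colony
  newRows <- subset(allPed, Id %in% queryIds)
  queryIds <- unique(c(newRows$Sire, newRows$Dam))
  queryIds <- queryIds[!is.na(queryIds)]
  ped <- unique(rbind(newRows, ped))
}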
#build forwards
#when calculating children, queryIds holds the Ids of the previous generation
queryIds <- unique(ped$Id)
for (i in seq_len(gens)) {
  if (length(queryIds) == 0) break
  #allPed is the data frame with Id, Dam, Sire and Sex for animals in our colony
  newRows <- subset(allPed, Sire %in% queryIds | Dam %in% queryIds)
  queryIds <- newRows$Id
  ped <- unique(rbind(newRows, ped))
}
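
For comparison, the two loops above could be folded into one helper that collects Ids first and subsets allPed only once at the end. This is just a minimal sketch, not something tested on the real colony data: the function name getRelatives and its arguments are hypothetical, Ids are assumed to be character vectors, and unknown parents are assumed to be stored as NA. Note that, unlike the loops above, it walks forward only from the index animals, so siblings and cousins reached through the ancestors are not pulled in.

```r
# Hypothetical helper (not from any package): return the rows of allPed for
# the index animals plus their ancestors and descendants within `gens`
# generations. Assumes columns Id, Sire, Dam and NA for unknown parents.
getRelatives <- function(allPed, indexIds, gens = 3) {
  keep <- unique(indexIds)

  # Walk backwards: each pass adds the parents of the current frontier
  frontier <- keep
  for (i in seq_len(gens)) {
    rows <- allPed[allPed$Id %in% frontier, ]
    # Drop unknown parents (NA) and animals that are already collected
    frontier <- setdiff(c(rows$Sire, rows$Dam), c(keep, NA))
    if (length(frontier) == 0) break
    keep <- c(keep, frontier)
  }

  # Walk forwards: each pass adds the offspring of the current frontier
  frontier <- unique(indexIds)
  for (i in seq_len(gens)) {
    rows <- allPed[allPed$Sire %in% frontier | allPed$Dam %in% frontier, ]
    frontier <- setdiff(rows$Id, keep)
    if (length(frontier) == 0) break
    keep <- c(keep, frontier)
  }

  # Subset the colony once, at the end
  allPed[allPed$Id %in% keep, ]
}

# Toy colony, made up purely for illustration
toyPed <- data.frame(
  Id   = c("gs", "gd", "s", "d", "x", "c1", "c2", "u"),
  Sire = c(NA, NA, "gs", NA, "s", "x", "c1", NA),
  Dam  = c(NA, NA, "gd", NA, "d", NA, NA, NA),
  Sex  = c("M", "F", "M", "F", "M", "M", "M", "F"),
  stringsAsFactors = FALSE
)
ped <- getRelatives(toyPed, indexIds = "x", gens = 2)
```

Collecting Ids in a vector and subsetting once avoids the repeated unique(rbind(...)) calls, which is where most of the copying happens in the loop version. The resulting ped can then be passed to the plotting code in the same way as before.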