RE: Digest for forestr@googlegroups.com - 4 Messages in 2 Topics


Gregoire, Timothy

Oct 22, 2011, 5:49:35 AM
to for...@googlegroups.com

Aaron,

 

R has its by() function, too, which is a “wrapper” for tapply(). Or you could write your own function to process data group by group. As Andrew mentions, it is a little difficult to know how to advise without more specific background info about what you are trying to do.
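
To make the by()/tapply() relationship concrete, here is a minimal sketch in base R. The toy data frame, its column names, and the basal-area formula (DBH assumed to be in cm) are assumptions for illustration, not anything from the original question.

```r
# Toy tree list; names and values are made up for illustration.
trees <- data.frame(
  stand = c("A", "A", "B", "B", "B"),
  dbh   = c(10, 14, 8, 12, 16),
  ht    = c(12, 15, 9, 13, 17)
)

# tapply() summarizes a single vector by group.
mean.dbh <- tapply(trees$dbh, trees$stand, mean)

# by() passes each stand's rows as a data frame, so the function
# can use any of the columns. Here: basal area in m^2 from DBH in cm.
ba.by.stand <- by(trees, trees$stand, function(d) sum(pi * (d$dbh / 200)^2))
```

Printing mean.dbh or ba.by.stand shows one value per stand.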

 

Tim

 

Timothy G. Gregoire

J. P. Weyerhaeuser Professor of Forest Management

School of Forestry & Environmental Studies, Yale University

360 Prospect Street, New Haven, CT  06511-2104  U.S.A.

 

office: 1.203.432.9398  mobile: 1.203.508.4014, fax: 1.203.432.3809

timothy....@yale.edu

G&V sampling text: http://crcpress.com/product/isbn/9781584883708

 

From: for...@googlegroups.com [mailto:for...@googlegroups.com]
Sent: Saturday, October 22, 2011 3:00 AM
To: Digest Recipients
Subject: Digest for for...@googlegroups.com - 4 Messages in 2 Topics

 

Group: http://groups.google.com/group/forestr/topics

§  Dealing with Grouped Data [3 Updates]

§  Some Weibull code [1 Update]

Aaron <hol...@gmail.com> Oct 21 06:17AM -0700  

One thing I have yet to migrate from is SAS's all-powerful BY
statement, which made dealing with grouped data very seamless. The
closest thing I could find in R was groupedData() in library(nlme),
which allows use of the gsummary() and gapply() functions. However,
these were quite clunky and I often ran into memory issues with large
data frames. Recently, Dr. Wickham has developed the plyr library,
which I have found very useful. Have others found a more effective
technique for summarizing large grouped datasets?

 

Andrew Robinson <A.Rob...@ms.unimelb.edu.au> Oct 22 08:05AM +1100  

Hi Aaron,
 
can you give an example of the kind of problem that you'd like to
solve, with a little example dataset?
 
Cheers
 
Andrew
 
On Fri, Oct 21, 2011 at 06:17:44AM -0700, Aaron wrote:
> To post to this group, send email to for...@googlegroups.com.
> To unsubscribe from this group, send email to forestr+u...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/forestr?hl=en.
 
--
Andrew Robinson
Deputy Director, ACERA
Department of Mathematics and Statistics Tel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/
 
Forest Analytics with R (Springer, 2011)
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009):
http://www.ms.unimelb.edu.au/spuRs/

 

Dave Larsen <kv0s...@gmail.com> Oct 21 09:36PM -0500  

Aaron
 
Thanks for getting the Forest-R going again. I was wondering if it was
still available.
 
Dave Larsen

 

John Kershaw <jak...@gmail.com> Oct 21 07:26PM -0300  

Folks,
 
Since Aaron has revived the group with his grouped data question, I thought
I'd pass along some code I have posted under my R Applications page. I'll
get a manuscript associated with this code up soonish. There are several
pieces of code in the zip and the Weibull6.R file. There is a set of rdpq
functions to do most of the variants of Weibull - 2P, 3P, reverse, left and
right truncation. There is an MLE parameter estimation function as well as a
parameter recovery function. There is also an interactive parameter finder.
 
Any comments are greatly appreciated.
 
The URL is:
http://ifmlab.for.unb.ca/people/kershaw/index.php/r-applications/
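
For readers who have not seen this kind of fit before, here is a minimal sketch of maximum-likelihood estimation for a two-parameter Weibull in base R. It is not taken from the Weibull6.R file; the function name weibull.nll, the simulated diameters, and the starting values are all assumptions made for illustration.

```r
# Negative log-likelihood of a two-parameter Weibull.
# Parameters are optimized on the log scale so shape and scale stay positive.
weibull.nll <- function(log.par, x) {
  -sum(dweibull(x, shape = exp(log.par[1]), scale = exp(log.par[2]), log = TRUE))
}

set.seed(42)
dbh <- rweibull(500, shape = 2.5, scale = 20)  # simulated diameters, not real data

# Start from shape = 1 and scale = sample mean; optim() uses Nelder-Mead by default.
fit <- optim(par = c(0, log(mean(dbh))), fn = weibull.nll, x = dbh)

# Back-transform the estimates to the original scale.
est <- exp(fit$par)
names(est) <- c("shape", "scale")
```

The estimates in est should land near the simulating values; with real data, the starting values may need more care.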
 
 
--
John Kershaw, RPF, CF
Forest Mensurationist
 
"It is better to have an imprecise answer to the right question, than a
precise answer to the wrong question." - John Tukey

 

You received this message because you are subscribed to the Google Group forestr.
You can post via email.
To unsubscribe from this group, send an empty message.
For more options, visit this group.


Aaron

Oct 23, 2011, 8:46:49 AM
to Forest-R
Thanks for the quick reply everyone! I was just trying to generate
some discussion so thanks for the participation.

I could never master the apply() statements, except gapply(). A
classic example of needing a BY statement is trying to summarize tree
lists from FVS that have a years-within-plots-within-stands hierarchical
structure. Recently, I had to calculate 20 tree-level carbon values
(e.g. foliage, roots, stem, etc.). I wanted to sum all these values up
for each year, plot, and stand combination. The best I could do was:

library(nlme)
FVS.Tree=groupedData(Ht~DBH|StandID/Plot/Year,data=FVS.Tree)
FVS.sum=gsummary(FVS.Tree,sum)

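Alternatively, I could have used plyr. The code would have been:

library(plyr)
# numcolwise(sum) sums every numeric column per group (a bare sum gives one total per group);
# convert numeric grouping columns such as Plot and Year to factors first so they are not summed.
FVS.sum=ddply(FVS.Tree,c('StandID','Plot','Year'),numcolwise(sum))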

The plyr code is much simpler, more stable, and more flexible, and it
can handle larger datasets. It has made my life much easier, but I was
curious to see how others have handled summarizing large, multiple
hierarchy datasets prior to plyr.

Thanks again,
Aaron

Andrew Robinson

Oct 23, 2011, 9:10:53 AM
to for...@googlegroups.com
I'm a fan of aggregate. Is this close to what you want?

FVS.sum <- with(FVS.Tree,
                aggregate(x = list(foliage = foliage,
                                   roots = roots,
                                   stem = stem,
                                   etc = etc),
                          by = list(year = year,
                                    stand = stand,
                                    plot = plot),
                          FUN = sum))

Best wishes

Andrew


Johnson, Greg

Oct 23, 2011, 11:32:40 AM
to for...@googlegroups.com
Aggregate has been my go-to method for group-wise summarization for simple results. When more complex results (data.frames, for example) are required, I have suffered through by() or gone straight to the apply family.

Greg Johnson
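
As a minimal sketch of the apply-family route when each group should return a small data frame rather than a single number: split the data by group, summarize each piece with lapply(), and stack the pieces with rbind(). The toy data, the column names, and the helper summarize.group are assumptions for illustration only.

```r
# Toy tree list with two grouping levels; values are made up.
trees <- data.frame(
  stand = c("A", "A", "B", "B", "B"),
  plot  = c(1, 2, 1, 1, 2),
  dbh   = c(10, 14, 8, 12, 16)
)

# Hypothetical helper: returns a one-row data frame of several statistics.
summarize.group <- function(d) {
  data.frame(stand = d$stand[1], plot = d$plot[1],
             n = nrow(d), mean.dbh = mean(d$dbh), max.dbh = max(d$dbh))
}

# Split by stand and plot, dropping combinations that do not occur.
pieces <- lapply(split(trees, list(trees$stand, trees$plot), drop = TRUE),
                 summarize.group)

# Stack the per-group data frames into one summary table.
plot.summary <- do.call(rbind, pieces)
```

The same pattern works with by() in place of split() and lapply(), since by() also hands each group to the function as a data frame.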

christian salas

Oct 23, 2011, 11:50:52 PM
to Forest-R
hi there!

I use the function summaryBy() of the library 'doBy'

for example

summaryBy(TPH + BA + QMD ~ for.type,
          data = plot.standvar,
          FUN = function(x) {
            c(n = length(x), Mean = mean(x), StdDev = sd(x))
          })

cheers from chile!
c
------------------------------------------------------------------------------------------------------
Christian Salas Eljatib, Ph.D., M.Sc. [Christian Salas, Ph.D.]
Profesor Asistente de Biometría [Assistant Professor of
Biometrics]
Departamento de Ciencias Forestales [Forest Science Department]
Universidad de La Frontera
Temuco, Chile

Email: csa...@ufro.cl | Web: http://dungun.ufro.cl/~csalas

Laboratorio de Análisis Cuantitativo de Recursos Naturales
www.magrecnat.cl/lab
--------------------------------------------------


Johnson, Greg

Oct 24, 2011, 9:09:12 AM
to for...@googlegroups.com
Christian,

It's good to hear from you. I had found and since forgotten the doBy library. I have used it to great success for summary tables and the like. Thanks for refreshing my memory. Hope all is going well in Chile.

Greg Johnson
Weyerhaeuser NR Company
greg.j...@weyerhaeuser.com

541-979-2063 [call first]
253-924-6933

Gould, Peter

Oct 24, 2011, 11:28:57 AM
to for...@googlegroups.com
Hi Everyone,

I'm also a fan of aggregate. I have found that it can become slow with many levels of grouping. One way around it is to combine all the levels into a character string, aggregate the data (now with only one grouping level), and then recombine the results with the original groups. Here's an example:

## assume the data frame is already loaded
FVS.Tree$GROUPS = with(FVS.Tree, paste(year, stand, plot, sep = "_"))
theGroups = unique(subset(FVS.Tree, select = c(year, stand, plot, GROUPS)))

## borrowed from Andrew's example
FVS.sum <- with(FVS.Tree,
                aggregate(x = list(foliage = foliage,
                                   roots = roots,
                                   stem = stem,
                                   etc = etc),
                          by = list(GROUPS = GROUPS),
                          FUN = sum))

## merge back with group info
FVS.sum2 = merge(FVS.sum, theGroups, by = "GROUPS")

Cheers,
Peter

Peter Gould
Research Forester
PNW Research Station
360-753-7677
