[R] Data Manipulation Question

John Filben

Dec 3, 2009, 4:52:09 PM
to r-h...@r-project.org
Can R support data manipulation programming that is available in the SAS datastep?  Specifically, can R support the following:
- Read multiple datasets one record at a time and compare values from each; then, based on if-then logic, write to multiple output files
- Load a lookup table and then process a different file; based on if-then logic, access and look up values in the table
- Support modular “gosub” programming
- Sort files
- Date math and conversions
- Would it be able to support the following type of logic:
  - Start
    - Read Record from File 1
    - Read Record from File 2
    - Match
      - If Key 1 <> Key 2 and Key 1 < Key 2, Write to output file A
      - If Key 1 = Key 2, Write to output file B
      - If Key 1 <> Key 2 and Key 1 > Key 2, Write to output file C
    - Goto Start until File 1 Done
 John Filben
Cell Phone - 773.401.2822
Email - johnf...@yahoo.com



[[alternative HTML version deleted]]

hadley wickham

Dec 3, 2009, 5:02:31 PM
to John Filben, r-h...@r-project.org
On Thu, Dec 3, 2009 at 3:52 PM, John Filben <johnf...@yahoo.com> wrote:
> Can R support data manipulation programming that is available in the SAS datastep?  Specifically, can R support the following:
> - Read multiple datasets one record at a time and compare values from each; then, based on if-then logic, write to multiple output files
> - Load a lookup table and then process a different file; based on if-then logic, access and look up values in the table
> - Support modular “gosub” programming
> - Sort files
> - Date math and conversions
> - Would it be able to support the following type of logic:
>   - Start
>     - Read Record from File 1
>     - Read Record from File 2
>     - Match
>       - If Key 1 <> Key 2 and Key 1 < Key 2, Write to output file A
>       - If Key 1 = Key 2, Write to output file B
>       - If Key 1 <> Key 2 and Key 1 > Key 2, Write to output file C
>     - Goto Start until File 1 Done

Yes.

Hadley


--
http://had.co.nz/

______________________________________________
R-h...@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Jason Morgan

Dec 3, 2009, 5:03:58 PM
to John Filben, r-h...@r-project.org
Please refrain from posting HTML. The results can be incomprehensible:

On 2009.12.03 13:52:09, John Filben wrote:
> Can R support data manipulation programming that is available in the SAS datastep??? Specifically, can R support the following:
> -?????????????????? Read multiple dataset one record at a time and compare values from each; then base on if-then logic write to multiple output files
> -?????????????????? Load a lookup table and then process a different file; based on if-then logic, access and lookup values in the table
> -?????????????????? Support modular ???gosub???programming
> -?????????????????? Sort files
> -?????????????????? Date math and conversions
> -?????????????????? Would it be able to support the following type of logic:
> o???? Start
> ???? Read Record from File 1
> ???? Read Record from File 2
> ???? Match
> ?????????????????? If Key 1 <> Key 2 and Key 1 < Key 2, Write to output file A
> ?????????????????? If Key 1 = Key 2, Write to output file B
> ?????????????????? If Key 1 <> Key 2 and Key 1 > Key 2, Write to output file C???? Goto Start until File 1 Done
> ??John Filben


> Cell Phone - 773.401.2822
> Email - johnf...@yahoo.com
>
> [[alternative HTML version deleted]]

--
Jason W. Morgan
Graduate Student
Department of Political Science
*The Ohio State University*
154 North Oval Mall
Columbus, Ohio 43210

Gray Calhoun

Dec 3, 2009, 10:52:11 PM
to John Filben, r-help
The data import/export manual can elaborate on a lot of these; this is
all straightforward, although many people would prefer to use a
relational database for some of the things you mentioned. I'm not
aware of a "goto" command in R, though (although I could be wrong).
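
To give a rough idea of how the lookup-table and sorting items might look in base R, here is a minimal sketch. The data frames `lookup` and `orders`, and all of their column names, are made up for illustration; in practice they would be read from files with something like `read.csv()`.

```r
# Hypothetical lookup table: product code -> description
lookup <- data.frame(
  code = c("A1", "B2", "C3"),
  description = c("Widget", "Gadget", "Gizmo"),
  stringsAsFactors = FALSE
)

# Hypothetical file to be processed against the lookup table
orders <- data.frame(
  id = c(3, 1, 2, 4),
  code = c("B2", "A1", "Z9", "C3"),
  stringsAsFactors = FALSE
)

# Look up each order's description; match() gives NA where no entry exists
orders$description <- lookup$description[match(orders$code, lookup$code)]

# If-then logic applied to the whole column at once, not record by record
orders$status <- ifelse(is.na(orders$description), "unknown code", "ok")

# Sort the result by id
orders_sorted <- orders[order(orders$id), ]
print(orders_sorted)
```

The lookup and the if-then step both work on whole columns, which is usually how this kind of task is written in R instead of reading one record at a time.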
--Gray

David Winsemius

Dec 4, 2009, 12:14:09 AM
to Gray Calhoun, r-help, John Filben

On Dec 3, 2009, at 10:52 PM, Gray Calhoun wrote:

> The data import/export manual can elaborate on a lot of these; this is
> all straightforward, although many people would prefer to use a
> relational database for some of the things you mentioned.

See Wickham's pithy response to this.

> I'm not
> aware of a "goto" command in R, though (although I could be wrong).

In fairness to the OP, he did not ask if there were a go-to construct,
but rather whether there were a "gosub" construct that supported
"modular programming". My response would have been that calling
modular functions (i.e., subroutines with defined arguments) is
fundamental to R and the key to understanding how to use it with grace
and efficiency. I would say that the concept of functional programming
is to a much greater extent supported by R than by SAS, whose datastep
mechanisms (as I remember them from an earlier incarnation) in no way
supported modular programming. I suspect that S and R arose precisely
because of the mental straightjackets imposed by SAS.
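
As a small sketch of what "subroutines with defined arguments" can look like, the function below plays the role of a "gosub" target. The name `classify_key` and the sample keys are invented for the example; the A/B/C labels mirror the output files in the original question.

```r
# A hypothetical "subroutine": takes two keys, returns an output label
classify_key <- function(key1, key2) {
  if (key1 < key2) {
    "A"
  } else if (key1 == key2) {
    "B"
  } else {
    "C"
  }
}

# Functions are ordinary values, so they can be handed to other functions.
# mapply() calls classify_key() once per pair of keys.
keys1 <- c(1, 5, 9)
keys2 <- c(2, 5, 7)
labels <- mapply(classify_key, keys1, keys2)
print(labels)
```

Because the function only sees its arguments and returns a value, it can be tested and reused on its own, which is the modularity being described.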

--
David.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

Barry Rowlingson

Dec 4, 2009, 3:18:58 AM
to John Filben, r-h...@r-project.org

I'll expand on Hadley Wickham's "Yes", to say "Yes, and it wouldn't be
much of a 'system for statistical computation and graphics' if it
couldn't do that".

Remember R uses the 'S' and C programming languages and is Open
Source. If it _can't_ do something you want it to do, you can write
code that does it. Like the date math and conversions. Originally,
maybe waaaay back in R version 0.something, it didn't have that. But
someone wrote it, and wisely contributed it, and the community saw
that it was good. And now we have date math and conversions. And
nobody has to write any date math or conversion codes ever again.
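
For the record, a minimal sketch of that date math and conversion using only base R functions (`as.Date`, `format`, `seq`); the particular dates are arbitrary examples.

```r
# Parse dates from two different text formats
start <- as.Date("2009-12-03")
end <- as.Date("12/25/2009", format = "%m/%d/%Y")

# Date arithmetic: difference in days, and adding days to a date
days_between <- as.numeric(end - start)
later <- start + 30

# Convert a date back to text in another layout
compact <- format(start, "%Y%m%d")

# A sequence of dates one month apart
monthly <- seq(start, by = "month", length.out = 3)

print(days_between)
print(later)
print(compact)
print(monthly)
```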

Now tell me how to get something into the SAS core code.

Barry

P.S. I see a very obvious optimisation you can do on this line:

If Key 1 <> Key 2 and Key 1 < Key 2, Write to output file A

but maybe that's some kind of weird SASism....
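
For what it's worth, "Key 1 < Key 2" already implies "Key 1 <> Key 2", so the first half of that condition is redundant. Below is a rough sketch of the same three-way split done on whole tables in R. It assumes each file has unique keys, so that "Key 1 < Key 2" in a sorted record-by-record pass means "key present only in file 1". The data frames `file1` and `file2`, their columns, and the output file names are all invented for the example.

```r
# Hypothetical inputs; real data would come from read.csv() or similar
file1 <- data.frame(key = c(1, 2, 4), x = c("a", "b", "d"),
                    stringsAsFactors = FALSE)
file2 <- data.frame(key = c(2, 3, 4), y = c("B", "C", "D"),
                    stringsAsFactors = FALSE)

# Keys only in file 1 (the "Key 1 < Key 2" case in a sorted merge)
out_a <- file1[!file1$key %in% file2$key, ]
# Keys in both files (the "Key 1 = Key 2" case)
out_b <- merge(file1, file2, by = "key")
# Keys only in file 2 (the "Key 1 > Key 2" case)
out_c <- file2[!file2$key %in% file1$key, ]

# Write each result to its own file, here in a temporary directory
out_dir <- tempdir()
write.csv(out_a, file.path(out_dir, "A.csv"), row.names = FALSE)
write.csv(out_b, file.path(out_dir, "B.csv"), row.names = FALSE)
write.csv(out_c, file.path(out_dir, "C.csv"), row.names = FALSE)
```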

Gray Calhoun

Dec 4, 2009, 8:31:32 AM
to David Winsemius, r-help, John Filben
This is probably far more discussion than the question warranted, but...

On Thu, Dec 3, 2009 at 11:14 PM, David Winsemius <dwins...@comcast.net> wrote:
>
> On Dec 3, 2009, at 10:52 PM, Gray Calhoun wrote:
>
>> The data import/export manual can elaborate on a lot of these; this is
>> all straightforward, although many people would prefer to use a
>> relational database for some of the things you mentioned.
>
> See Wickham's pithy response to this.

Sure. My (indirect) point is that representing query results as
separate files is usually not the right approach, regardless of
the statistical language/package one uses.

>
>> I'm not
>> aware of a "goto" command in R, though (although I could be wrong).
>
> In fairness to the OP, he did not ask if there were a go-to construct, but
> rather whether there were a "gosub" construct that supported "modular
> programming". My response would have been that calling modular functions
> (i.e., subroutines with defined arguments) is fundamental to R and the key
> to understanding how to use it with grace and efficiency. I would say that
> the concept of functional programming is to a much greater extent supported
> by R than by SAS, whose datastep mechanisms (as I remember them from an earlier
> incarnation) in no way supported modular programming. I suspect that S and R
> arose precisely because of the mental straightjackets imposed by SAS.

From the original: "Goto Start until File 1 Done." But, yes, probably
unfair and certainly less informative than your response.
