On Wed, May 24, 2017 at 11:37:10PM +0530, Athitya Kumar wrote:
>
> Hello Victor.
> I understand, but aren't formats like RData & RDS still preferred as
> they're binary formats which are lighter and faster as compared to
> formats like csv?
I would disagree with that. Compressed CSV tends to win because of
less IO - IO is usually the bottleneck. We have RData because the R
people have it that way. Reading RData is known to be slow.
It is good to support both formats, of course.
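
To make the compressed CSV option concrete, here is a minimal sketch using only Ruby's standard library (csv and zlib). The helper names rows_to_gzipped_csv and gzipped_csv_to_rows are made up for illustration and are not part of Daru or any of the gems discussed here. It does not measure speed; it only shows the round trip.

```ruby
require "csv"
require "zlib"

# Hypothetical helper: serialize an Array of Hashes (all with the same keys)
# to gzip-compressed CSV bytes.
def rows_to_gzipped_csv(rows)
  csv_text = CSV.generate do |csv|
    csv << rows.first.keys                 # header row
    rows.each { |row| csv << row.values }  # data rows
  end
  Zlib.gzip(csv_text)
end

# Hypothetical helper: decompress and parse back into an Array of Hashes,
# converting numeric-looking fields to Integer or Float.
def gzipped_csv_to_rows(bytes)
  text = Zlib.gunzip(bytes).force_encoding(Encoding::UTF_8)
  CSV.parse(text, headers: true, converters: :numeric).map(&:to_h)
end

rows = [
  { "id" => 1, "name" => "a", "score" => 0.5 },
  { "id" => 2, "name" => "b", "score" => 1.5 }
]

bytes    = rows_to_gzipped_csv(rows)
restored = gzipped_csv_to_rows(bytes)
```

The bytes can be written with File.binwrite and read back with File.binread. The restored value is an Array of Hashes, the same shape mentioned further down in this thread as input for a Daru::DataFrame.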
> I think this makes both import and export equally
> important for a Rubyist to work with a team of R developers.
> Regarding (2), it is sad to note that all the above gems have R as a
> requirement, and importing / exporting RData / RDS data into Daru
> DataFrames might not be possible in an R-less environment.
> Continuing from the discussion today, I think we can go ahead with
> RSRuby for import and Rinruby for export, as both require
> just R, whereas Ruby-rserve-client requires Rserve as well.
I was there at the birth of RSRuby. Alex did a great job :).
I think it is fine to have R as a dependency when dealing with RData.
The only thing is that it will probably rule out direct JRuby support. How
about making the RData transformer a standalone tool, so it can be
called from both MRI and JRuby and even other languages?
People try to avoid dependencies between libraries and tools. But it
is actually a solved problem.
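
One possible shape for such a standalone tool, as a rough sketch only: let Rscript do the conversion and call it as an external process, which works the same way from MRI, JRuby or any other language. The helper names, the choice of CSV as the exchange format and the R one-liner are my assumptions, not an existing tool. It needs R with Rscript on the PATH.

```ruby
require "open3"

# Assumed R code: load an .RData file into a fresh environment and write one
# named object from it as CSV. Arguments: <rdata file> <object name> <csv file>.
R_EXPRESSION = [
  "args <- commandArgs(trailingOnly = TRUE)",
  "env <- new.env()",
  "load(args[1], envir = env)",
  "write.csv(get(args[2], envir = env), file = args[3], row.names = FALSE)"
].join("; ")

# Hypothetical helper: build the command line without running it.
def rdata_to_csv_command(rdata_path, object_name, csv_path)
  ["Rscript", "-e", R_EXPRESSION, rdata_path, object_name, csv_path]
end

# Hypothetical helper: run the conversion; raises if Rscript reports failure.
def rdata_to_csv(rdata_path, object_name, csv_path)
  _stdout, stderr, status =
    Open3.capture3(*rdata_to_csv_command(rdata_path, object_name, csv_path))
  raise "Rscript failed: #{stderr}" unless status.success?
  csv_path
end

command = rdata_to_csv_command("input.RData", "df", "output.csv")
```

Passing the command as an array avoids shell quoting problems with file names. The resulting CSV can then be read by any CSV reader, so R is only needed by the tool itself.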
> I'm considering choosing RSRuby for import, as it's the fastest of
> these 3 gems and is still able to parse RData files to provide R lists
> as an Array of Hashes that can directly be used to create Daru::DataFrame.
> Rinruby has an "assign" method, which makes it possible to directly
> "create" the R variables from Ruby, and write into RData / RDS files -
> making it suitable for using with export. However, the trade-off here
> is that this is the slowest of the 3 gems.
> Do share your opinions regarding this. :)
Shared. +1.
Pj.