Tightly coupling Ruby and R

Rodrigo Botafogo

Sep 26, 2018, 12:20:32 PM
to sciru...@googlegroups.com
Hello SciRubyists...

I've been working on tightly coupling TruffleRuby and FastR on top of GraalVM.  The integration is well under way, and I would love to hear comments from people in this group.  Basically, this work allows Ruby developers to call R transparently.

One of the more relevant results is the ability to use ggplot2 directly from Ruby.  Here, for example, is the code and the resulting scatter plot:

====================================================================
#
require 'cantata'
require 'ggplot'

# set options
R.options(scipen: 999)  # turn-off scientific notation like 1e+48
R.theme_set(R.theme_bw)  # pre-set the bw theme.

# read the R 'midwest' dataset onto the 'midwest' variable
midwest = ~:midwest

# open the AWT graphics device
R.awt

# Scatterplot
gg = midwest.ggplot(E.aes(x: :area, y: :poptotal)) + 
       R.geom_point(E.aes(col: :state, size: :popdensity)) + 
       R.geom_smooth(method: "loess", se: false) + 
       R.xlim(R.c(0, 0.1)) + 
       R.ylim(R.c(0, 500000)) + 
       R.labs(subtitle: "Area Vs Population", 
                   y: "Population", 
                   x: "Area", 
                   title: "Scatterplot", 
                   caption: "Source: midwest")

puts gg

[Image: the resulting scatter plot of area vs. population]

===================================================================
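
To picture how the Ruby calls in the ggplot2 example might line up with R syntax, here is a minimal sketch in plain Ruby. It is not the cantata implementation and does not talk to R at all: the `RSketch` module and its `Expr` class are hypothetical names invented for illustration, and the sketch only builds R source text. It assumes that keyword arguments become R named arguments, that symbols stand for bare R names, and that `true`/`false` map to `TRUE`/`FALSE`.

```ruby
# Hypothetical illustration only: builds R source text from Ruby calls.
# This is NOT how cantata works internally and it never calls R.
module RSketch
  # Wraps already-rendered R code so nested calls are not re-quoted.
  class Expr
    attr_reader :src

    def initialize(src)
      @src = src
    end

    def to_s
      src
    end
  end

  # Render a single Ruby value as R source text.
  def self.to_r(value)
    case value
    when Expr   then value.src
    when true   then "TRUE"
    when false  then "FALSE"
    when nil    then "NULL"
    when Symbol then value.to_s     # assumed: symbols are bare R names
    when String then value.inspect  # simple strings stay quoted
    else value.to_s                 # numbers
    end
  end

  # Any other method name becomes an R call of the same name.
  def self.method_missing(name, *args, **named)
    return super if name.to_s.start_with?("to_") # leave Ruby conversion hooks alone

    positional = args.map { |arg| to_r(arg) }
    keywords   = named.map { |key, value| "#{key} = #{to_r(value)}" }
    Expr.new("#{name}(#{(positional + keywords).join(', ')})")
  end

  def self.respond_to_missing?(name, include_private = false)
    !name.to_s.start_with?("to_") || super
  end
end

puts RSketch.xlim(RSketch.c(0, 0.1))                     # xlim(c(0, 0.1))
puts RSketch.geom_smooth(method: "loess", se: false)     # geom_smooth(method = "loess", se = FALSE)
puts RSketch.aes(x: :area, y: :poptotal)                 # aes(x = area, y = poptotal)
```

The sketch only covers the syntax mapping; the real integration evaluates the calls in FastR and returns R objects rather than strings.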

As another example, here is how to get a polynomial regression using the Boston data from R package ISLR.

require 'cantata'
R.require 'MASS'
R.require 'ISLR'

lm_fit5 = R.lm(R.formula("medv ~ poly(lstat, 5)"), data: :Boston)
puts lm_fit5.summary

[Only the last part of the printed summary is shown here...]

Residuals:
     Min       1Q   Median       3Q      Max
-13.5433  -3.1039  -0.7052   2.0844  27.1153

Coefficients:
                 Estimate Std. Error t value Pr(>|t|)   
(Intercept)       22.5328     0.2318  97.197  < 2e-16 ***
poly(lstat, 5)1 -152.4595     5.2148 -29.236  < 2e-16 ***
poly(lstat, 5)2   64.2272     5.2148  12.316  < 2e-16 ***
poly(lstat, 5)3  -27.0511     5.2148  -5.187 3.10e-07 ***
poly(lstat, 5)4   25.4517     5.2148   4.881 1.42e-06 ***
poly(lstat, 5)5  -19.2524     5.2148  -3.692 0.000247 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 5.215 on 500 degrees of freedom
Multiple R-squared:  0,6817,    Adjusted R-squared:  0,6785
F-statistic: 214,2 on 5 and 500 DF,  p-value: < 2.2e-16
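
As a quick sanity check on how the numbers in this summary relate to each other, the following plain-Ruby sketch recomputes three of them from the others using the standard linear-regression formulas. The method names are made up for this example, and the inputs are the rounded values printed above (the summary shows some of them with decimal commas, e.g. 0,6817), so the results only agree up to rounding.

```ruby
# Adjusted R-squared from R-squared, n observations and a number of predictors.
def adjusted_r_squared(r_squared, n, predictors)
  1 - (1 - r_squared) * (n - 1) / (n - predictors - 1).to_f
end

# Overall F-statistic from R-squared, n observations and a number of predictors.
def f_statistic(r_squared, n, predictors)
  (r_squared / predictors) / ((1 - r_squared) / (n - predictors - 1))
end

predictors = 5                     # poly(lstat, 5) contributes five terms
n = 500 + predictors + 1           # 500 residual degrees of freedom => 506 rows

# t value of the first polynomial term: Estimate / Std. Error
t_value = -152.4595 / 5.2148

puts adjusted_r_squared(0.6817, n, predictors).round(4)  # 0.6785
puts f_statistic(0.6817, n, predictors).round(1)         # 214.2
puts t_value.round(3)                                    # -29.236
```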

===================================================================

To try this out:

* Install GraalVM (rc6)
* Install TruffleRuby (follow the simple instructions)
* Install FastR
* gem install rspec
* Use rake to run the tests ('rake -T' shows all available tasks)

The main rake tasks are:

* rake specs:all -- Runs all the specs.  Reading the specs is a good way to see how to use the language
* rake sthda:all -- Runs a slideshow with over 80 plots
* rake islr:all -- Runs some 'labs' from the Introduction to Statistical Learning book

I think the installation instructions above should work, but I haven't tested them extensively yet, so there might be some missing steps.  Please feel free to e-mail me if anything goes wrong.

Thanks for reading this far in this rather long e-mail!


--
Rodrigo Botafogo
