Interested in "R Interface" (SurveyMan)

pjal...@gmail.com

Mar 15, 2015, 2:08:35 PM
to plasma-u...@googlegroups.com
Hi,

I am Prashant Jalan, a final-year Computer Science undergraduate at IIT Kanpur. I am planning to apply for GSoC '15, and the project "R interface for SurveyMan" interests me.

I am currently in my 8th semester, and over three years of undergraduate studies I have been involved in many research and development projects.

1) I published one of my NLP works, titled "Syllables as Linguistic Units?" (http://goo.gl/Bm3RJ2), at ICON 2014 in Goa, India, one of the leading NLP conferences in India.

2) I worked on parallelising a sequential algorithm (Sequitur) using CUDA on an NVIDIA K40 card. We are submitting this work for publication at IEEE ISPA '15.

3) I worked on several development projects, some of which were appreciated and rewarded. Please find the details at http://www.prashantjalan.com and my detailed resume at http://prashantjalan.com/DetailedResume.pdf

I used R during my internship and have a decent knowledge of it. Could someone please guide me on how to proceed further?

Emma Tosch

Mar 16, 2015, 1:48:25 PM
to plasma-u...@googlegroups.com, pjal...@gmail.com
Hi Prashant,

Thanks for the inquiry! I just updated the SurveyMan starter task page with some suggestions for the R interface. If you run into any issues, please post specific questions here. We are happy to answer them.

Best,
Emma

pjal...@gmail.com

Mar 16, 2015, 1:54:04 PM
to plasma-u...@googlegroups.com, pjal...@gmail.com
Thank you! I'll get started.

Prashant Jalan

Mar 18, 2015, 4:30:51 AM
to plasma-u...@googlegroups.com, PRASHANT JALAN, eto...@gmail.com
Hey,

I have looked into the Python code and now have a basic understanding of SurveyMan. I have a small question regarding the task. Once we import a CSV file into R, can we call the Java CSV parser directly? If we choose the Python code instead, do we need to parse the CSV in R and then call the Python functions?
--
Prashant Jalan

Emma Tosch

Mar 18, 2015, 4:45:30 PM
to plasma-u...@googlegroups.com, pjal...@gmail.com, eto...@gmail.com
Hi Prashant,

The Python code works similarly to the Java code, but does not parse the csv into SurveyMan objects. It's strictly a programmatic interface. HOWEVER, you don't need the csv parsing here, because the objective is to work from data frames and get familiarity with SurveyMan objects. :)

Start by loading an example survey (e.g., https://github.com/SurveyMan/SurveyMan/blob/master/src/test/resources/pick_randomly.csv) into an R data frame. You could even create a very simple survey with two questions each having Yes/No responses and load that into a data frame instead.
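For reference, the "very simple survey" idea can be sketched on the Python side as follows. This is only an illustration: the QUESTION and OPTIONS column names, and the rule that a blank QUESTION cell continues the previous question, are assumptions about the CSV layout, and the survey text itself is made up.

```python
import csv
import io
from collections import OrderedDict

# Hypothetical two-question Yes/No survey. The column names and the
# "blank QUESTION cell continues the previous question" rule are assumptions.
SURVEY_CSV = """QUESTION,OPTIONS
Do you like R?,Yes
,No
Do you like Python?,Yes
,No
"""


def group_options_by_question(csv_text):
    """Return an ordered mapping of question text -> list of option texts."""
    questions = OrderedDict()
    current = None
    for row in csv.DictReader(io.StringIO(csv_text)):
        text = row["QUESTION"].strip()
        if text:
            # A non-blank QUESTION cell starts a new question.
            current = text
            questions[current] = []
        if current is None:
            raise ValueError("the first row must name a question")
        questions[current].append(row["OPTIONS"].strip())
    return questions


survey_rows = group_options_by_question(SURVEY_CSV)
```

The same grouping is what a data-frame-based version would need before any Question or Option objects are built.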

Then try constructing the SurveyMan objects using either rPython or rJava, mapping to either the SMPy survey objects or the SurveyMan Java objects, respectively. You don't have to create the whole survey -- just enough to get familiarity with the Block, Survey, and Component/Option objects. Once you're able to create a survey object, call jsonize on it and validate the resultant json against the output schema.

You can certainly execute this code as a script, but I would prefer reusable functions to a script. That is, you could do something like:

R> python.exec("bi = Block(questions)")
R> python.exec("s = Survey([b1,...,bi,...,bn])")
R> python.exec("print s.jsonize()")

but I would prefer to see something like:

make_question_from_df_selection <- function(slice_of_df) {
  # create a python object and return the unique identifier we will use to access that object.
}

where the idea is to hide the ugly string exec stuff as much as possible.
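One way to picture the "unique identifier" idea is a small registry on the Python side, so that the caller only ever holds string handles. This is a rough sketch under stated assumptions: the Question, Block, and Survey classes below are simplified stand-ins written for this example, not the SMPy classes, and the helper names (make_question, make_block, make_survey, jsonize) are hypothetical.

```python
import itertools
import json


class Question:
    """Stand-in for a survey question; not the SMPy class."""

    def __init__(self, text, options):
        self.text = text
        self.options = list(options)

    def to_dict(self):
        return {"text": self.text, "options": self.options}


class Block:
    """Stand-in for a block of questions; not the SMPy class."""

    def __init__(self, questions):
        self.questions = list(questions)

    def to_dict(self):
        return {"questions": [q.to_dict() for q in self.questions]}


class Survey:
    """Stand-in for a survey; not the SMPy class."""

    def __init__(self, blocks):
        self.blocks = list(blocks)

    def to_dict(self):
        return {"blocks": [b.to_dict() for b in self.blocks]}


# Registry mapping string handles to live Python objects.
_registry = {}
_counter = itertools.count(1)


def _register(obj):
    """Store obj and return the string handle used to refer to it later."""
    handle = "obj_%d" % next(_counter)
    _registry[handle] = obj
    return handle


def make_question(text, options):
    return _register(Question(text, options))


def make_block(question_handles):
    return _register(Block([_registry[h] for h in question_handles]))


def make_survey(block_handles):
    return _register(Survey([_registry[h] for h in block_handles]))


def jsonize(handle):
    """Serialise the object behind a handle to a JSON string."""
    return json.dumps(_registry[handle].to_dict(), sort_keys=True)


q1 = make_question("Do you like R?", ["Yes", "No"])
q2 = make_question("Do you like Python?", ["Yes", "No"])
b1 = make_block([q1, q2])
s = make_survey([b1])
survey_json = jsonize(s)
```

Because every helper takes and returns only strings and lists of strings, an R wrapper could in principle call them through something like rPython's python.call instead of building code strings by hand.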

Prashant Jalan

Mar 21, 2015, 7:01:40 AM
to Emma Tosch, plasma-u...@googlegroups.com
Hi Emma, 

I have written a simple R script that uses the SMPy module to create a survey and save it to a JSON file. For now, this is done using exec statements. The rPython site says:
"
One of the most daunting tasks in building an interface between two languages is provide mechanisms for exchanging different data structures between them. rPython uses json:

1) R/Python objects are dumped into json strings using their respective parsers (RJSONIO and simplejson) respectively.
2) These strings are passed between the two languages using their string exchange mechanisms.
3) json strings are finally reconstructed into language objects on the other side.
"

The Question(), Block(), etc. objects are not JSON-serialisable by rPython's mechanism. This left me no choice but to use exec statements. Please let me know if anything else is needed. Also, should I send a draft proposal to you for review?
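The serialisation limit can be illustrated in plain Python. This is a minimal sketch that uses the standard-library json module rather than simplejson, and a stand-in Question class written for this example (not the SMPy class): a dictionary of plain values survives a JSON round trip, while an arbitrary class instance is rejected.

```python
import json


class Question:
    """Stand-in for a survey question; not the SMPy class."""

    def __init__(self, text, options):
        self.text = text
        self.options = list(options)


def is_json_serialisable(value):
    """Return True if json.dumps accepts the value, False on TypeError."""
    try:
        json.dumps(value)
        return True
    except TypeError:
        return False


plain = {"text": "Do you like R?", "options": ["Yes", "No"]}
question = Question("Do you like R?", ["Yes", "No"])

plain_ok = is_json_serialisable(plain)        # plain dict/list/str values
question_ok = is_json_serialisable(question)  # arbitrary class instance
```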
--
Prashant Jalan
survey.R

Emma Tosch

Mar 23, 2015, 9:45:26 AM
to plasma-u...@googlegroups.com, eto...@gmail.com, pjal...@gmail.com
Hi Prashant,

Looks like a good start! A second pass might store R representations of the survey objects (rather than Python representations) by calling jsonize *first* and then implementing the logic to manipulate the R objects after that. More complex operations that would be better suited for Python could call the Python directly. However, this would require an "unjsonize" Python method, which is not currently implemented. If you have knowledge of Python and want to take a stab at it, please do (since we are working in both Python and R, working knowledge of both would be good, though it is not required). If you don't have knowledge of Python, go ahead and create a github issue for deserialization.
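To make the "unjsonize" idea concrete, here is a rough sketch of a jsonize/unjsonize round trip. Everything in it is an assumption made for illustration: the classes are simplified stand-ins rather than the SMPy classes, and the JSON keys ("blocks", "questions", "text", "options") are made up and do not claim to match the SurveyMan output schema.

```python
import json


class Question:
    """Stand-in for a survey question; not the SMPy class."""

    def __init__(self, text, options):
        self.text = text
        self.options = list(options)

    def to_dict(self):
        return {"text": self.text, "options": self.options}

    @classmethod
    def from_dict(cls, data):
        return cls(data["text"], data["options"])


class Block:
    """Stand-in for a block of questions; not the SMPy class."""

    def __init__(self, questions):
        self.questions = list(questions)

    def to_dict(self):
        return {"questions": [q.to_dict() for q in self.questions]}

    @classmethod
    def from_dict(cls, data):
        return cls([Question.from_dict(q) for q in data["questions"]])


class Survey:
    """Stand-in for a survey; not the SMPy class."""

    def __init__(self, blocks):
        self.blocks = list(blocks)

    def jsonize(self):
        """Serialise the survey to a JSON string (hypothetical schema)."""
        payload = {"blocks": [b.to_dict() for b in self.blocks]}
        return json.dumps(payload, sort_keys=True)

    @classmethod
    def unjsonize(cls, text):
        """Rebuild a Survey from a string produced by jsonize."""
        data = json.loads(text)
        return cls([Block.from_dict(b) for b in data["blocks"]])


original = Survey([Block([Question("Do you like R?", ["Yes", "No"])])])
restored = Survey.unjsonize(original.jsonize())
```

With a pair like this, the R side could keep and edit the JSON representation and hand it back to Python only for operations that need the real objects.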

Best,
Emma

Emma Tosch

Mar 23, 2015, 9:46:19 AM
to plasma-u...@googlegroups.com, eto...@gmail.com, pjal...@gmail.com
Oops, I forgot to mention that yes, if you want to send along a draft for us to review, please do.

Prashant Jalan

Mar 26, 2015, 3:47:39 PM
to Emma Tosch, plasma-u...@googlegroups.com
Hi Emma, 

Thank you for your reply. Yes, I do have knowledge of Python. I'll look into it.
--
Prashant Jalan