[GSoC] Introduction

Ivan Evgrafov

Apr 29, 2015, 3:20:04 PM
to sciru...@googlegroups.com
Hello,

I'm very glad to have such an opportunity both to help the SciRuby community and improve my skills in programming. Thank you for accepting.

This summer I will work on building a new gem for gnuplot. Gnuplot is a cross-platform command-line utility for visualisation. It supports both 2D and 3D plots, can plot in Cartesian, polar and parametric coordinates, and can draw several data sets in one or several coordinate systems. I believe that this work will improve the visualisation capabilities of Ruby and allow scientists to use it more widely. My proposal, with many more project details, can be found on Google Melange (http://www.google-melange.com/gsoc/proposal/public/google/gsoc2015/dilcom/5629499534213120). I have also prepared several examples of things that can be done with gnuplot and that I want to achieve with the gem. They can be found in my GitHub repository (https://github.com/dilcom/pilot-gnuplot/tree/master/samples).

Now let me briefly introduce myself.
I'm a fourth-year Computer Science student at Bauman State Technical University in Moscow, Russia. I have been programming in Ruby for about two years, and I'm going to use my current skills to develop the gem I proposed and to improve them along the way. I also hope to get familiar with TDD and CI tools during GSoC.

Regards,
Ivan

Ivan Evgrafov

May 23, 2015, 3:28:06 PM
to sciru...@googlegroups.com
Hi,

only two days are left before the coding phase starts, so I have summarized my progress so far in a blog post. I shared my ideas about functional style and its use in my project and provided some details about the tools I'm going to use. I also wrote a few words about tests and how I implemented them.

Many more details can be found in the blog: http://dilcom.github.io/pilot-gnuplot/2015/05/before-coding-starts/.

I wish everyone love and passion in working on their projects and success in both evaluations.

Regards,
Ivan

Alexej Gossmann

May 24, 2015, 12:02:39 PM
to sciru...@googlegroups.com
Good luck! I'm sure that I will use the results of your work in the future :)

As I have used the existing gnuplot gem and tried out other Ruby visualization tools, I would be especially happy about exhaustive documentation. For example, in order to use the existing gnuplot gem, I ended up reading its code, because I was not able to find proper documentation, and its collection of examples is far from exhaustive.

Alexej

Carlos Agarie

May 25, 2015, 9:41:42 AM
to sciru...@googlegroups.com
I agree with Alexej. Having good documentation and a *lot* of examples
makes a visualization library usable. When you have some working
examples, send a link to this mailing list and I'll try to get some
feedback from my coworkers and myself.

-----
Carlos Agarie
Data Scientist at Tapps Games (http://tappsgames.com)
+55 11 97320-3878 | @carlos_agarie
> --
> You received this message because you are subscribed to the Google Groups
> "SciRuby Development" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to sciruby-dev...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

Pjotr Prins

May 25, 2015, 10:45:11 AM
to sciru...@googlegroups.com
Hi Ivan,

It is a good idea to program in a functional style.

On Functional Programming (FP):

FP is great. I personally favour it because you get easier to read
code. We think in terms of data transformation where a function is a
transformer of input to output. The same input should give the same
output data. Side-effects are indeed discouraged - which makes code
more predictable. As Joe Armstrong of Erlang says, you should be able
to cut and paste a function elsewhere and it should just work as
expected. Even using OOP methods that is possible - just make sure
classes carry the *minimum* state and, if possible, split data and
methods into objects and/or modules. I program in Ruby almost without
using classes that mix code and data these days. Sometimes OOP is
good: for example, the File class has a file handle that it carries
around. But that is easy to understand, right? The Ruby standard
library is actually a good example of mixing OOP and FP - it is flat
and easy to understand. That is the goal.
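
A minimal sketch of this style, assuming a hypothetical `PlotMath` module (the names are illustrative and do not come from any gem): the data is a plain array, and the functions live in a module, take input, and return new output without touching their arguments.

```ruby
# Hypothetical module for illustration: functions only, no state.
module PlotMath
  module_function

  # Returns a new array of [x, y] pairs scaled by factor.
  # The input array is not modified.
  def scale(points, factor)
    points.map { |x, y| [x * factor, y * factor] }
  end

  # Returns a new array of [x, y] pairs shifted along the x axis.
  def shift_x(points, dx)
    points.map { |x, y| [x + dx, y] }
  end
end

points = [[1, 2], [3, 4]]
result = PlotMath.shift_x(PlotMath.scale(points, 2), 1)

p points  # the original data is unchanged
p result
```

Because neither function keeps state, each one can be moved to another file and called with the same input to get the same output.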

On Immutable data:

Immutable data is actually badly supported in Ruby (I love Tony's
http://tonyarcieri.com/2012-the-year-rubyists-learned-to-stop-worrying-and-love-the-threads),
so I would not focus too much on that. The trick is to copy data, to
make sure you don't change data in lists etc. When you transform an
object/list you first have to deep copy it to guarantee a new object.
I do that consistently in the bio-alignment gem, for example. It does
mean (in Ruby) that we have to copy data all the time. It helps
prevent stupid bugs - that is the upside. I only cheat when there are
real performance considerations.
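
To illustrate the deep copy point, here is a small sketch. The `deep_copy` helper is hypothetical (it is not taken from the bio-alignment gem); it uses a Marshal round trip, which works for plain data such as arrays, hashes, strings and numbers, but not for objects that cannot be marshalled (procs, IO handles).

```ruby
# Hypothetical helper: deep copy through a serialization round trip.
def deep_copy(object)
  Marshal.load(Marshal.dump(object))
end

original = [[1, 2], [3, 4]]

shallow = original.dup         # new outer array, same inner arrays
deep    = deep_copy(original)  # new outer and inner arrays

shallow[0] << 99  # also changes original[0], because the inner array is shared
deep[1] << 42     # does not touch original

p original
p deep
```

The shallow copy made by `dup` is what lets the change leak back into `original`; the deep copy stays independent.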

Conclusion:

I do use FREEZE now and then, but mostly for class variables that
should be constant. There is no deep-freeze for lists etc. For me FP
in Ruby, therefore, mostly means keeping data and functions apart and
avoiding, like you say, side effects. Also avoiding variable
reassignment is good practice - but Ruby offers no safety net there.
Having freeze all over the place won't make the software easy to read.
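
A short sketch of what "no deep-freeze" means in practice. The `deep_freeze` helper below is hypothetical and handles only arrays and hashes; it is meant as an illustration, not as a recommendation to freeze everything.

```ruby
# Hypothetical helper: recursively freeze arrays and hashes.
def deep_freeze(object)
  case object
  when Array
    object.each { |item| deep_freeze(item) }
  when Hash
    object.each do |key, value|
      deep_freeze(key)
      deep_freeze(value)
    end
  end
  object.freeze
end

shallow = [[1, 2], 'label'].freeze
shallow[0] << 3        # allowed: only the outer array is frozen
p shallow.frozen?      # outer array
p shallow[0].frozen?   # inner array

deep = deep_freeze([[1, 2], 'label'])
p deep[0].frozen?
p deep[1].frozen?
```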

Pj.


Pjotr Prins

May 25, 2015, 10:48:40 AM
to sciru...@googlegroups.com
You may want to look at Hamster

https://github.com/hamstergem/hamster

Ivan Evgrafov

May 25, 2015, 12:18:16 PM
to sciru...@googlegroups.com
Hi Alexej,

could you please share some examples of cases where you needed to look into the gnuplot gem's code?

Regards,
Ivan

Ivan Evgrafov

May 25, 2015, 12:23:57 PM
to sciru...@googlegroups.com
Hi Carlos,

sure, as soon as I have working samples, I'll share a link here. Your feedback would be very helpful, and I will try to get it as early as possible.

Regards,
Ivan

Ivan Evgrafov

May 25, 2015, 1:15:19 PM
to sciru...@googlegroups.com
Hi Pjotr,

thanks for such a thorough explanation.

I still have some questions:
Why keep data and methods apart? If I understood correctly, we expect functions to take input and return output without modifying the input. Can't we treat methods as functions that take an additional argument (an object) and simply do not modify it?

I'll definitely look at Hamster, thanks.

Regards,
Ivan

Pjotr Prins

May 25, 2015, 1:21:17 PM
to sciru...@googlegroups.com
On Mon, May 25, 2015 at 10:15:19AM -0700, Ivan Evgrafov wrote:
> Hi Pjotr,
> thanks for such a broad explanation.
> Still have some questions:
> Why to keep data and methods apart?

Most of the OOP madness comes from dependencies within and between
objects. Splitting data and methods encourages fewer dependencies. FP
languages such as Haskell and Erlang don't do OOP (by default). Start
thinking in this way and you end up with easier to read and maintain
code. That, at least, is my experience.

> If I understood right we expect
> functions to take input and return output without modifying input. Can't
> we consider methods like functions that take additional argument (an
> object) and just not modify it?

Yes. But if you return it as a modified value, copy it first (clone or
dup). Don't update in place.

The reason is that if you *read* a function definition it should do
what you expect. Inputs should not change.
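
A sketch of "copy first, don't update in place" for a method on an object. The `Dataset` class here is hypothetical and only illustrates the idea; it is not the class from any gnuplot gem.

```ruby
# Hypothetical class, used only to illustrate copy-then-modify.
class Dataset
  attr_reader :data, :options

  def initialize(data, options = {})
    @data = data
    @options = options
  end

  # Returns a modified copy; the receiver keeps its old options.
  # dup is shallow, so both objects still share the same data array.
  def with_options(new_options)
    copy = dup
    copy.options = options.merge(new_options)  # merge builds a new hash
    copy
  end

  protected

  attr_writer :options
end

lines  = Dataset.new([[1, 1], [2, 4]], with: 'lines')
points = lines.with_options(with: 'points', title: 'Squares')

p lines.options   # unchanged
p points.options
```

Reading `with_options` tells you everything it does: the input object is left alone and the caller gets a new one.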


Alexej Gossmann

May 25, 2015, 2:02:08 PM
to sciru...@googlegroups.com
> couldn't you please share some examples of cases when you needed to look into gnuplot's (gem) code?

For example, plotting a 3D surface from data: the only example of a 3D plot I could find shows how to plot from an analytic function expression, not from data points (https://github.com/rdp/ruby_gnuplot/blob/master/examples/3d_surface_plot.rb). In general, ruby_gnuplot has only six examples, which I think is very few considering its capabilities. I have also looked for which parameter values can be used for ds.with and similar settings in 2D or 3D plots. Of course, the documentation of the actual gnuplot project (not ruby gnuplot) was helpful, but I found that the syntax of the gnuplot Ruby gem was different enough, and that some things were not supported in the gem (unfortunately, I don't remember which).
In general, it was not too hard to figure out, but it took me some hours, and most of my math colleagues would not bother to go through the trouble. Or maybe I was just being stupid.
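
For reference, a rough sketch of the data side of such a surface plot. It does not use the ruby_gnuplot API at all; the `surface_data` helper is hypothetical and only builds the text that gnuplot's `splot` accepts for grid data: one "x y z" row per point, with a blank line between scan lines.

```ruby
# Hypothetical helper: format a grid of z values as gnuplot surface data.
# xs and ys are grid coordinates; zs[i][j] is the value at (xs[i], ys[j]).
def surface_data(xs, ys, zs)
  blocks = xs.each_with_index.map do |x, i|
    ys.each_with_index.map { |y, j| "#{x} #{y} #{zs[i][j]}" }.join("\n")
  end
  blocks.join("\n\n") + "\n"  # a blank line separates scan lines
end

xs = [0, 1]
ys = [0, 1, 2]
zs = [[0, 1, 4], [1, 2, 5]]

text = surface_data(xs, ys, zs)
puts text
# Saved to a file such as 'surface.dat' (name is an assumption), this can be
# plotted in gnuplot with: splot 'surface.dat' with lines
```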

Alexej





Ivan Evgrafov

May 25, 2015, 2:13:15 PM
to sciru...@googlegroups.com
> Most of the OOP madness comes from dependencies within and between
> objects. Splitting data and methods encourages less dependencies. FP
> languages such as Haskell and Erlang don't do OOP (by default). Start
> thinking in this way and you end up with easier to read and maintain
> code. That, at least, is my experience.

Hm, thank you. I'm not sure whether I'll split data and code in the gnuplot gem, but you have given me a lot to think about. I will try to think functionally.

> > If I understood right we expect
> > functions to take input and return output without modifying input. Can't
> > we consider methods like functions that take additional argument (an
> > object) and just not modify it?
>
> Yes. But if you return it as a modified value, copy it first (clone or
> dup). Don't update in place.

Sure, copy and update.

> The reason is that if you *read* a function definition it should do
> what you expect. Inputs should not change.


Ivan Evgrafov

May 25, 2015, 2:28:27 PM
to sciru...@googlegroups.com
Should gnuplot in Ruby have documentation that covers everything that gnuplot's own documentation (a PDF of about 250 pages) does? Or should it rather have many examples covering some plots (2D, 3D, multiple datasets/plots, different terminals etc.) and the rules for how options are translated between Ruby and gnuplot (for example, { xrange: 1..100 } will become 'xrange [1:100]'), without listing all the possible gnuplot options?

I understand how difficult it can be to use tools without docs. Documentation is an important part of a useful library, and I hope to use your feedback to avoid repeating the mistakes of ruby gnuplot. Thanks.
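
To make the idea of translation rules concrete, here is a rough sketch. The helper names and the exact rules (ranges become `[a:b]`, strings are quoted, `true` means a bare flag, and every option is emitted as a `set` command) are assumptions for illustration, not the gem's actual implementation.

```ruby
# Hypothetical translation rules from Ruby values to gnuplot syntax.
def option_value(value)
  case value
  when Range  then "[#{value.begin}:#{value.end}]"
  when String then "'#{value}'"
  when true   then ''
  else value.to_s
  end
end

# Turns a Ruby options hash into a list of gnuplot "set" commands.
def gnuplot_options(options)
  options.map { |key, value| "set #{key} #{option_value(value)}".strip }
end

puts gnuplot_options(xrange: 1..100, title: 'Example plot', grid: true)
```

With a handful of rules like these documented, a user who knows a gnuplot option can usually guess its Ruby form without a full option reference.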

Cristian Messel

May 25, 2015, 6:36:17 PM
to sciru...@googlegroups.com
Hi Ivan,

An idea might be to document the most common cases, option translation rules, basically enough for covering most use cases. In addition to this, add many examples, even for the less common use cases.

The goal of the documentation should be to provide as much value to the reader as possible. You shouldn't provide documentation on how to use gnuplot, as gnuplot is already documented (just like you shouldn't test gnuplot functionality; you should test the integration between Ruby and gnuplot). You should give the users the "building blocks" so they can achieve their goals. By reading the gem documentation and gnuplot documentation, the user should not have to wonder too much how to "translate" gnuplot to the correct gem options. Examples go a long way.

We also need to take into account who the documentation is for. Should it cover teaching people about what gnuplot can do? Blog posts also help with that.

That being said, we need to take into account what the community thinks. If it thinks it would be useful to take gnuplot documentation out of the picture, the Ruby documentation should be all the user has to read (one-stop shop). If this is the case, more extensive documentation is needed.

In addition to this, documentation and examples will help you write blog posts and gather feedback.

Cheers,
Cris

Sameer Deshmukh

May 26, 2015, 4:31:27 AM
to sciru...@googlegroups.com
Dear Ivan,

You might want to use daru (https://github.com/v0dro/daru) for storing data for your plots.

It's still in active development and will mature sufficiently by the end of this GSoC. I'm already changing statsample to use daru (https://github.com/SciRuby/statsample/pull/35 - this is WIP).

This way you won't have to reinvent the wheel for data storage. Plus, daru works with nmatrix.

Carlos Agarie

May 26, 2015, 10:31:32 AM
to sciru...@googlegroups.com
+1 on using Daru for data storage. With that in place, moving data
from Statsample to Daru should be very easy. I'll ship a new version
of Statsample with Daru in a few weeks, or as soon as we get
Statsample#35 worked out. =)

-----
Carlos Agarie
Data Scientist at Tapps Games (http://tappsgames.com)
+55 11 97320-3878 | @carlos_agarie


John Woods

May 28, 2015, 12:13:41 PM
to sciru...@googlegroups.com
Be nice if data could also be passed as a file and not just by loading and piping into gnuplot. But otherwise, yes, I like the idea of using daru — provided we're not preventing JRuby people from using gnuplot.

Ivan Evgrafov

May 28, 2015, 3:06:07 PM
to sciru...@googlegroups.com
Hi,
Carlos, Sameer and John, I'm sorry for the late answer; it took me a little more time than I expected to take a close look at Daru.

I'll try to clarify how the gnuplot gem works now. Data may be passed in several ways:
1) Math function. Example: Plot.new(['sin(x)*exp(-x)', title: 'Math function'], title: 'Example plot')
2) Name of a file with data (columns are interpreted as x, y, z, or in some other way that may be set with gnuplot's 'using' option). Example: Plot.new(['points.data', title: 'Math function', with: 'lines'], title: 'Example plot'). You may update the file any way you like and call Plot#replot to update the graph.
3) Some container with data.

The first two cases are simple and don't require the gem to store data. The third isn't so trivial. There are two possible ways of storing the data you passed:
1) Store data in a temporary file. The gem creates a file and fills it with data converted to gnuplot's format. If you update the data of your Plot object, the new data is appended to the file. This mode isn't the default and is enabled by the {file: true} option. Example:
> graph = Plot.new([[x, y], file: true, title: 'Points', with: 'points'])
> updated_graph = graph.update_dataset([update_for_x, update_for_y])
> updated_graph.plot
2) Store data in memory and pipe it out to gnuplot on the #plot call. On update, new points are added to the end of the existing data. Example:
> graph = Plot.new([[x, y], title: 'Points', with: 'points'])
> updated_graph = graph.update_dataset([update_for_x, update_for_y])
> updated_graph.plot
The difference: when you update data stored in a file, no new file with the updated data is created, so all Plots which use the same Dataset will see a kind of *side effect* of that update. When you update data stored in memory, a copy is created, so the update doesn't affect other Plot objects.
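
The difference can be sketched in plain Ruby without the gem. The variable names below are hypothetical stand-ins for two plots sharing one data source; this only illustrates the side effect and is not how the gem is implemented.

```ruby
require 'tempfile'

# File-backed storage: everyone holding the path sees appended rows.
file = Tempfile.new('points')
file.write("1 1\n2 4\n")
file.flush
plot_a_source = file.path
plot_b_source = file.path  # a second plot sharing the same file

File.open(plot_a_source, 'a') { |f| f.write("3 9\n") }  # "update" plot A
seen_by_b = File.read(plot_b_source)                    # plot B sees the new row too
puts seen_by_b

# In-memory storage: an update builds a new string, the old one is untouched.
plot_c_data = "1 1\n2 4\n".freeze
plot_d_data = plot_c_data + "3 9\n"  # String#+ returns a new string
puts plot_c_data
puts plot_d_data

file.close
file.unlink
```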

The last case requires data storage. Currently the data is stored in a String. So when you pass some data to Plot#new, it's converted to gnuplot's format and stored as a String until you call #plot to pipe it out to gnuplot. This is inefficient, since plots may sometimes share data. I'm going to use something like the stack from the hamster gem, because it claims immutability together with memory efficiency. I think it would be nice to plot data from Daru, but I'm not sure it should be used for storage inside the gem. In any case, gnuplot takes just a string, no matter in which form the data is stored before piping it out.
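
As an illustration of "converted to gnuplot's format and stored as a String", here is a minimal sketch. The `to_gnuplot_data` helper is hypothetical; the only gnuplot facts it relies on are that data rows are whitespace-separated columns and that inline data after `plot '-'` ends with a line containing only `e`.

```ruby
# Hypothetical helper: turn column arrays into gnuplot's plain-text rows.
def to_gnuplot_data(*columns)
  columns.transpose.map { |row| row.join(' ') }.join("\n") + "\n"
end

x = [1, 2, 3]
y = [1, 4, 9]
data = to_gnuplot_data(x, y)

# With inline data, gnuplot reads rows from its input after the plot
# command until it meets a line containing only "e".
script = "plot '-' with points title 'Points'\n#{data}e\n"
puts script
```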

Summary:
1) The gnuplot gem works with data files
2) Currently I'm using a String to store some data, but I'm going to replace it with an efficient immutable container (from hamster)
3) The gnuplot gem will definitely support plotting data from Daru, but I'm not sure Daru should also be used inside

So what do you think about such an implementation of data storage and plotting?

Sameer Deshmukh

May 28, 2015, 3:08:11 PM
to sciru...@googlegroups.com
Daru will work on all interpreters.

About file IO, daru supports loading data from a host of sources (natively or right now with assistance from statsample, though I think I'll mostly port this to daru soon) so saving data to a file and loading data from it should also not be a problem.

Ivan Evgrafov

May 28, 2015, 3:41:32 PM
to sciru...@googlegroups.com
Useful features indeed. I have already read your blog and looked through the Daru repository, and I think Daru is a great thing for data storage and analysis. My skepticism relates only to the gnuplot gem's needs and not to Daru's features.

As soon as the gnuplot gem is ready for me to post some examples here, I'll also implement storage with Daru (it doesn't seem hard), and we'll get feedback. Agreed?

Sameer Deshmukh

May 29, 2015, 2:48:29 AM
to sciru...@googlegroups.com
Sure thing! Keep us posted :)

John Woods

May 29, 2015, 12:41:49 PM
to sciru...@googlegroups.com
Re: 1) I'm not sure if we're talking about the same thing here. It's nice to be able to create a temporary file, but what about a non-temporary file? As in, what if I have one file with tons of data in it, but lots of different ways I might want to plot it — then it wouldn't make a lot of sense to load it into Ruby and pipe it into Gnuplot.

Make sense?

John

Иван Евграфов

May 29, 2015, 1:31:21 PM
to sciru...@googlegroups.com
You can just pass the name of that file with tons of data and the gem will plot it.
Like this:
> graph = Plot.new(['tons_of_data', title: 'Tons of data', with: 'lines'])
> graph.plot
Need to change some dataset options? Ok:
> plot_with_points = graph.update_dataset(with: 'points', title: 'Plot with points')
> plot_with_points.plot
Need to change the whole plot options? Ok:
> plot_interval = graph.options(title: 'Plot on [1..3]', xrange: 1..3)
> plot_interval.plot

The file 'tons_of_data' is *not* read by Ruby and piped out to Gnuplot. Only its *name* is piped out to Gnuplot, and Gnuplot takes care of reading and plotting the data itself. Plotting from data files with the gnuplot gem works as fast as with Gnuplot itself.
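
The point that only the file name crosses the pipe can be sketched like this. The `plot_command` helper is hypothetical and is not the gem's code; it just builds the gnuplot command text from a file name and a couple of options.

```ruby
# Hypothetical helper: build a gnuplot plot command for a data file.
# Only the file name goes into the command; the file itself is never opened.
def plot_command(file_name, options = {})
  parts = ["plot '#{file_name}'"]
  parts << "title '#{options[:title]}'" if options[:title]
  parts << "with #{options[:with]}" if options[:with]
  parts.join(' ')
end

command = plot_command('tons_of_data', title: 'Tons of data', with: 'lines')
puts command

# Sending it to gnuplot could look like this (assumes gnuplot is installed):
# IO.popen('gnuplot -persist', 'w') { |gnuplot| gnuplot.puts command }
```

However large the data file is, the string sent from Ruby stays a few dozen bytes long.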

Actually, the example I wrote does not contain tons of data (only 10,000,000 points in a 272 MB file), but it works the same way with bigger files. Example: https://github.com/dilcom/pilot-gnuplot/tree/master/samples/tons_of_data . I deleted the 'tons_of_data' file before pushing to GitHub (it is much larger than the whole repository), so it has to be generated before running the example. As for timing: it took gnuplot about 10 seconds to plot these 3 graphs (see the example) on my computer.

Are we talking about the same thing now?

Regards,
Ivan

Ivan Evgrafov

Jul 5, 2015, 4:51:02 AM
to sciru...@googlegroups.com
Hi everyone,
half of the summer is gone, and I want to share my progress in developing Ruby bindings for Gnuplot. I wrote a short blog post where I summarized my progress so far and provided links to the gem's repository and examples: http://dilcom.github.io/gnuplotrb/2015/07/midterm/ . I hope you'll find time to take a look and maybe leave feedback. It's very important for me to know your thoughts about it.

Regards,
Ivan.