It would be great if Winston, Gaston, and the web repl can use the same API for common functionality. How do we get there?
-viral
I agree, it looks nice. I ran the demos, and they mostly worked, with nice
results. (Presumably because I have an ancient version of gnuplot, the image
demos didn't run for me. But I'll be upgrading in another month or so.)
I updated the "Graphics" section of the Wiki to advertise this option.
> It would be great if Winston, Gaston, and the web repl can use the same API
> for common functionality. How do we get there?
Agreed, where possible it would be useful to have a similar API. In the long
run, native solutions like Winston will presumably be better able to manage
their "state" than anything that calls out to external programs, so it may not
always be possible to maintain one-for-one compatibility. But for now I think
there's probably more divergence than required.
Still, one could argue that the current state is perhaps appropriate; getting
a good API on the first try is inconceivable, and it's nice to have a couple of
concrete implementations in front of us.
Best,
--Tim
> Hello,
> Thanks for the comments so far!
> On Sunday, April 01, 2012 12:34:46 pm Viral Shah wrote:
>> It would be great if Winston, Gaston, and the web repl can use the same API
>> for common functionality. How do we get there?
> I'll be happy to work on this with the other interested developers,
> and follow their advice. As Tim says, it may be too early to know what
> a good, stable API would look like.
A good start might be to look into the R grid API. The specification is
available at http://www.stat.auckland.ac.nz/~paul/RGraphics/chapter5.pdf
and Paul's latest book is at http://www.stat.auckland.ac.nz/~paul/RG2e/
The API is actually pretty easy. Given that a huge amount of R plotting
functionality is built on grid (ggplot2, lattice, vcd etc.), porting those
would be much easier if the underlying models were similar.
Vitalie.
This seems like a possible occasion to discuss a roadmap for the future. Let
me start by saying that I'm aware that Mike Nolton, the developer of Winston,
may not currently be able to participate if the discussion involves typing.
I'd be happy to set up a teleconference of interested parties. If that is
appealing, please respond to the following poll:
http://whenisgood.net/5gaemn3
Note that the first thing you should do is choose your timezone in the box at
the top.
Until that gets scheduled (if there is any interest, that is), I'll put a few
points out there for discussion/cogitation:
1. It appears that the major examples (Matlab, R, and Python---any others?)
have quite similar overall organization. While I don't use R, I'd say the
"grid" API looks like R's version of Matlab's "handle graphics." Python's
matplotlib is explicitly based on Matlab. Finally, Winston already has a
similar architecture (https://github.com/nolta/winston/wiki/Reference). This
is good news.
2. It should be possible to split out "graphics representation" from
"rendering." I'd propose we create a common tree-based representation of all
graphical objects, independent of how objects are rendered.
To give a little more detail: the idea might be that there is a (global?)
variable that serves as the "root" container for all the figures. Figures
contain other objects: "axes"/"plotwindows" or whatever we want to call them,
external annotation, and someday ui elements like buttons, sliders, etc.
Axes/plotwindows contain the individual graphical elements (lines, surfaces,
etc). These items are just variables: an axis variable knows its "parent"
figure, and has a list of its "children", one of which might be a "line"; in
turn, the "line" is a variable that contains the x- and y-coordinates (z too
so we support 3d plots), attributes to specify its color, etc. This hierarchy
is a common base for all plotting packages.
The key advantage is this: when a user says: "plot(x,y)", it just adds
something to the tree. This step can be performed independently of the
capabilities of any given rendering solution. We can design this tree to have
the properties we want it to have in the long run, and then evolve the
rendering platforms over time.
When you want to render a figure, one could say render(hfig), where hfig is
the "handle" for a given figure (apologies to R users and others who may not be
as familiar with Matlab terminology). Or even render_winston(hfig),
render_gaston(hfig), etc, so that all the solutions could co-exist. If all you
need to do is re-render a particular axis within a complex multiaxis figure,
then render(hax) might suffice for certain rendering platforms.
While I don't know the Winston code well, I have the impression that it
already has gone a long ways towards setting up such a hierarchy. Perhaps that
hierarchy, and all of its utility functions, could be separated out from the
actual details of rendering, to make it easier to re-use by other packages. Of
course, some refactoring might be involved.
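As a rough illustration of the tree described in point 2, here is a minimal
sketch in Julia. All names (GraphicsNode, Root, Figure, Axes, Line, attach!,
countnodes) are hypothetical and not taken from Winston or Gaston; the sketch
only shows the parent/child bookkeeping and does no rendering at all.

```julia
# Hypothetical node types for a renderer-independent graphics tree.
abstract type GraphicsNode end

mutable struct Root <: GraphicsNode
    children::Vector{GraphicsNode}
end

mutable struct Figure <: GraphicsNode
    parent::Root
    children::Vector{GraphicsNode}
end

mutable struct Axes <: GraphicsNode
    parent::Figure
    children::Vector{GraphicsNode}
end

mutable struct Line <: GraphicsNode
    parent::Axes
    x::Vector{Float64}
    y::Vector{Float64}
    z::Vector{Float64}   # left empty for 2-D lines
    color::Symbol
end

# Register a freshly built node in its parent's child list and return it.
function attach!(node::GraphicsNode)
    push!(node.parent.children, node)
    return node
end

# Lines are leaves; every other node exposes its child list.
children(n::GraphicsNode) = n.children
children(::Line) = GraphicsNode[]

# Walk the tree and count nodes; no renderer is involved.
countnodes(n::GraphicsNode) = 1 + mapreduce(countnodes, +, children(n); init=0)

const ROOT = Root(GraphicsNode[])   # the global "root" container

fig   = attach!(Figure(ROOT, GraphicsNode[]))
ax    = attach!(Axes(fig, GraphicsNode[]))
hline = attach!(Line(ax, [0.0, 1.0, 2.0], [0.0, 1.0, 4.0], Float64[], :blue))
```

Building and walking the tree are plain data operations, so they can be
written before any rendering back end exists.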
3. As a long-time Matlab user who has watched about ~20 people learn Matlab
from scratch, I can't resist mentioning there are a couple of things I'd do
differently than Matlab. The most important of these is this: every function
that creates a new object in the graphics hierarchy should have as its first
argument the handle of its parent object. That is, rather than
plot(x,y)
I'm thinking we should insist on
plot(hax,x,y)
where hax is the "handle" to the "axis"/"plotwindow" that the line defined by x
and y "lives in." Over the years, Matlab has been adding more and more
versions of their basic plotting functions with this syntax. The notion of
"current axis" (i.e., "default axis"), while it seems convenient, in my view
is probably the major source of confusion and bugs for people once they start
trying to do anything more sophisticated---invariably they end up with
graphical elements living in different figures/axes/whatever than they intended.
It would also help teach the idea of the graphics hierarchy from the outset,
which I think might help people become more sophisticated more quickly.
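To make point 3 concrete, here is a minimal sketch assuming hypothetical Axes
and Line types and a plot method that takes the parent handle as its first
argument. It is not an existing API, just one way the convention could look.

```julia
# Hypothetical types; a real tree would carry many more properties.
mutable struct Line
    x::Vector{Float64}
    y::Vector{Float64}
end

mutable struct Axes
    children::Vector{Line}
end
Axes() = Axes(Line[])

# The parent handle comes first, so the caller always states where the
# new line should live. Returns the handle of the new line.
function plot(hax::Axes, x, y)
    length(x) == length(y) || throw(ArgumentError("x and y must have the same length"))
    hline = Line(collect(Float64, x), collect(Float64, y))
    push!(hax.children, hline)
    return hline
end

left, right = Axes(), Axes()
x = range(0, stop=2pi, length=50)
plot(left, x, sin.(x))         # goes into `left`, whatever was touched last
plot(right, x, cos.(x))
plot(right, x, 2 .* cos.(x))   # a second line in `right`
```

Because there is no "current axis" in this sketch, a line cannot end up in a
different axis than the one named in the call.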
Best,
--Tim
> On Monday, April 2, 2012 6:11:12 AM UTC-5, Tim wrote:
>>
>> 3. As a long-time Matlab user who has watched about ~20 people learn Matlab
>> from scratch, I can't resist mentioning there are a couple of things I'd do
>> differently than Matlab. The most important of these is this: every function
>> that creates a new object in the graphics hierarchy should have as its first
>> argument the handle of its parent object. That is, rather than
>> plot(x,y)
>> I'm thinking we should insist on
>> plot(hax,x,y)
>> where hax is the "handle" to the "axis"/"plotwindow" that the line defined
>> by x
Plot is a high-level generic which might be specialized on complex
objects -- data tables, densities, statistical analysis output etc. When
Matlab gives you the power to manipulate dots and ticks, it helps you
lose your way among the trees. There should be internal functions to
manipulate those (with a handle, of course). User-level functionality
had better be straightforward and to the point, with good defaults.
> It adds a lot of bookkeeping when working interactively, however. While I
> tend to stash axis handles and use them explicitly when writing plotting
> code, there's overhead when quickly plotting from the command line that I'd
> rather not deal with.
Vitalie.
These issues are separate from the question of whether you want to store your
graphics state in some kind of tree. In other words, whether it's
set(hline,'Selected','on')
or
hline.selected = true
fundamentally you're doing the same thing. The latter would be a more object-
oriented syntax. Personally, I agree that there's no reason to do the former.
But I'd regard these issues as "details to be hammered out" separate from the
question of "how do we get a unified plotting API that lets us build something
awesome, starting today?" To me the central point is my suspicion that
rendering is a task that is going to take some time to mature, and might
require a few independent efforts (something that gives maximal functionality
in a short time span, vs. something we'll be proud of in the long term).
However, if people think that a "data/view" model of the graphics
architecture makes sense, then we can at least get going on the
"data" side of it, and build something there that has real longevity. That
would satisfy Viral's suggestion that we unify the plotting API, while giving
a lot of flexibility on the rendering side of things.
> Is there a way of interacting with a plotting engine that is in some sense
> true to the unifying themes of Julia?
I do think the tree idea can be easily made "juliaesque". Consider:
function render(hobj)
    render_shallow(hobj)
    for child in hobj.children
        render(child)
    end
end
function render_shallow(hfig::Figure)
    # do something for a figure
end
function render_shallow(hax::Axis)
    # do something for an axis
end
function render_shallow(hline::Line)
    # do something for a line
end
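For reference, here is a self-contained variant of the same idea that actually
runs. The node types, the children helper, and the text "rendering" into an IO
stream are placeholders chosen for illustration (Axes is used in place of
Axis), not an existing API.

```julia
abstract type Node end

struct Line <: Node
    label::String
end
struct Axes <: Node
    children::Vector{Node}
end
struct Figure <: Node
    children::Vector{Node}
end

# Leaves have no children; containers return their child list.
children(::Node) = Node[]
children(n::Union{Figure,Axes}) = n.children

# One method per node type; dispatch picks the right one.
render_shallow(io::IO, ::Figure) = println(io, "figure")
render_shallow(io::IO, ::Axes)   = println(io, "  axes")
render_shallow(io::IO, l::Line)  = println(io, "    line ", l.label)

# Generic traversal: draw this node, then recurse into its children.
function render(io::IO, hobj::Node)
    render_shallow(io, hobj)
    for child in children(hobj)
        render(io, child)
    end
end

fig = Figure(Node[Axes(Node[Line("sin"), Line("cos")])])
buf = IOBuffer()
render(buf, fig)
output = String(take!(buf))
```

The traversal is written once; supporting a new node type only requires a new
render_shallow method.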
And so on. There would be a lot of details to work out, however; this is too
simple. (Example: axis lines above or below graphical elements? Might need
"prerender_shallow" and "postrender_shallow".)
--Tim
Yes, but consider even the following elementary example:
plot(x,sin(x))
(user inspects the plot)
plot(x,cos(x))
Now, did I want the plot of cos(x) to overwrite the previous plot, or to be
generated in a new figure window?
Compare
plot([],x,sin(x)) # creates a new window
plot([],x,cos(x)) # creates a second new window
plot(ans,x,cos(x)) # overwrites the previous plot
How much harder is that?
--Tim
I simply meant ans as a shortcut. In particular:
hline = plot([],x,sin(x)) # new plot
hline2 = plot([],x,cos(x)) # another new plot
plot(hline,x,2*sin(x)) # overwrite the first one
> Or if you performed a half-dozen calculations between creating the
> second plot and wanting to overwrite it. An unlimited-depth answer stack is
> not an acceptable workaround, either; I'm not wasting time counting back
> that far. Providing equivalents to gca()/gcf() might be good enough,
> though.
I find that I usually default to having plots appear in a new window.
So in Matlab I'm always typing
figure; plot(x,y)
which in julia could really be much easier as
plot([],x,y)
for (my) most common case, and
plot(gca(),x,y) or, perhaps better, plot(hline,x,y)
when I want to overwrite. (For the non-Matlab programmers out there: in
Matlab, "plot" overwrites the current axis, and "line" appends new lines to an
existing plot. I'm not certain this is a well-conceived dichotomy, either, but
that can be discussed later.)
Anyway, there's plenty of room for choice. I'd be happy replacing the word
"plot" in this discussion with "line," which in Matlab is considered a "low
level" plotting command (but with overall fairly similar syntax). All low-
level commands could require handle(s) as their first argument, and convenience
wrappers like "plot" could be written around them, allowing people to be more
casual. People can choose how to call the library.
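The layering suggested above could look roughly like this. It is only a sketch
under several assumptions: the type names are made up, `nothing` stands in for
the `[]` used in the examples above, gca() simply returns the most recently
created axes, and, as in Matlab, plot replaces existing lines while line
appends.

```julia
abstract type Node end

mutable struct Axes <: Node
    children::Vector{Node}
end

mutable struct Line <: Node
    parent::Axes
    x::Vector{Float64}
    y::Vector{Float64}
end

const OPEN_AXES = Axes[]   # stand-in for the open windows, one axes each

function newaxes()
    hax = Axes(Node[])
    push!(OPEN_AXES, hax)
    return hax
end

# Assumed meaning of "current axes": the most recently created one.
gca() = isempty(OPEN_AXES) ? newaxes() : OPEN_AXES[end]

# Low-level command: always needs an explicit parent, and appends.
function line(hax::Axes, x, y)
    hline = Line(hax, collect(Float64, x), collect(Float64, y))
    push!(hax.children, hline)
    return hline
end

# Convenience wrappers built on top of `line`.
function plot(hax::Axes, x, y)
    empty!(hax.children)               # overwrite whatever was there
    return line(hax, x, y)
end
plot(::Nothing, x, y)   = plot(newaxes(), x, y)      # new window
plot(hline::Line, x, y) = plot(hline.parent, x, y)   # overwrite that plot
plot(x, y)              = plot(gca(), x, y)          # casual form

x  = range(0, stop=2pi, length=20)
h1 = plot(nothing, x, sin.(x))      # new plot
h2 = plot(nothing, x, cos.(x))      # another new plot
h3 = plot(h1, x, 2 .* sin.(x))      # overwrite the first one
line(h2.parent, x, -cos.(x))        # append to the second one
```

Scripts can stick to the explicit-handle methods, while interactive use can
fall back on the two-argument form.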
Best,
--Tim
Oops, I meant Mike Nolta, the developer of Winsta :-). Typo.
Given Miguel's agreement on the tree idea, to me it sounds like there may be a
relatively clear "solution" (only partial) to the current graphics situation
(pending buy-in from Mike). The file names below are just intended to be
evocative; I don't care what we call them.
1. Write graphics_tree.jl, which implements julia's representation of graphics
state: a data structure like that behind Matlab's "handle graphics", R's "grid
graphics", and just about any serious graphics engine:
http://en.wikipedia.org/wiki/Scene_graph
This does _not_ implement any kind of plotting API (that comes below), nor
does it do any kind of rendering (also below). It's just the internal, user-
hidden representation of graphics state. We need to agree on "one true
graphics tree" to be used by everybody, or we'll get a mess, but I doubt this
is something that will generate a great deal of controversy. I don't see any
reason we can't make this good, now, while other components of the graphics
stack take time to mature.
2. Write graphics_api.jl, which provides the plotting functions that people
actually use. Behind the scenes, all these do is manipulate the tree in #1:
add/delete nodes, modify properties of existing nodes, etc. I'd propose we
separate this from #1, because it seems likely there will be less unity here.
Indeed, someone might want to write "rgraphics_api.jl" and
"matlabgraphics_api.jl" if they really just want a carbon-copy of whatever API
they are familiar with from another system. As long as everybody bases these
around graphics_tree.jl, there will be no fundamental incompatibility.
Alternatively, if we (echoing Viral) hope that people unite around a common
API, then there is still a major benefit to the separation from the renderer
(#3 below): to avoid further fragmentation, we probably need the API to mature
relatively quickly. But the renderers themselves will take the most time to
develop. By separating them, we let all components mature at their own rate.
3. Rendering engines (much of the code currently in Winston, web-repl, and to
a lesser extent in Gaston since it leverages gnuplot) will take the data in
the graphics tree and spit it out, on-screen or as an image file. We can indeed
include a "renderer" field in the representation of a figure-node in the
graphics tree, so that users can employ different renderers for different
currently-open figures.
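A minimal sketch of the "renderer" field idea, assuming two made-up back ends
(TextRenderer and SvgRenderer) in place of Winston, Gaston, or the web repl.
The figure holds only data; the render methods are the only code that knows
about output formats.

```julia
abstract type Renderer end
struct TextRenderer <: Renderer end   # hypothetical back end
struct SvgRenderer  <: Renderer end   # hypothetical back end

struct Line
    x::Vector{Float64}
    y::Vector{Float64}
end

mutable struct Figure
    renderer::Renderer      # chosen per figure
    lines::Vector{Line}
end

# Entry point: forward to whichever renderer this figure carries.
render(fig::Figure) = render(fig.renderer, fig)

render(::TextRenderer, fig::Figure) =
    join(["line with $(length(l.x)) points" for l in fig.lines], "\n")

function render(::SvgRenderer, fig::Figure)
    io = IOBuffer()
    println(io, "<svg xmlns=\"http://www.w3.org/2000/svg\">")
    for l in fig.lines
        pts = join(["$(a),$(b)" for (a, b) in zip(l.x, l.y)], " ")
        println(io, "  <polyline fill=\"none\" stroke=\"black\" points=\"$pts\"/>")
    end
    print(io, "</svg>")
    return String(take!(io))
end

fig = Figure(TextRenderer(), [Line([0.0, 1.0], [0.0, 1.0])])
as_text = render(fig)
fig.renderer = SvgRenderer()   # same tree, different back end
as_svg = render(fig)
```

Switching the field changes the output without touching the data, which is
the separation between the API and the renderer argued for above.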
Again, the main point behind this strategy is to prevent "incompatible
fragmentation" of julia's graphics: when the graphics API is tied to the renderer,
you get the unfortunate situation of "I like that R-like graphics API, but
sadly toolbox X doesn't yet support surface plots." By separating API from the
renderer and making everything "pipe through" a single tree representation, we
can let people experiment on the renderer side while providing a good, and
relatively stable, programming interface.
Best,
--Tim
On Tuesday, April 03, 2012 05:20:13 am Tim Hoffmann wrote:
> I would completely agree on the scene graph and representation/
> renderbackend separation. Designing a good scene graph is not that
> easy, though. Having gone through this at least twice, I doubt that
> there is a "universal" solution. On one hand, too much depends on the
> capabilities of the target render backends, on the other hand, much
> depends on the type of graphics you want to be able to handle.
I've not done this even once (well, I guess once I hacked in an alternate
graphics back-end to Matlab, but it was clunky), so I'm glad to have your
perspective. Agreed that it won't be trivial. I was focusing on scientific
graphics (i.e., one step beyond where Matlab is today), and hadn't thought
seriously about creating virtual reality worlds.
I'm wondering whether we can at least make some very basic things
universal, like:
1. The names of the global variables that store the state
2. Rules for traversing the graph
3. Rules for indicating that nodes, or their children, need updating
and so on. The idea might be that a node can contain any type of information,
and if a given renderer doesn't know how to represent that information, it
just punts on that node and all its children (or something).
That way new node types can be added, whose data representation is targeted at
the capabilities of a given renderer, without disrupting other components of
the framework.
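The "punt on unknown nodes" rule might be sketched as follows. Everything here
is hypothetical: the node types, the BasicRenderer, and the supports/draw!
functions are invented for illustration, and a renderer "draws" by appending
labels to a vector.

```julia
abstract type Node end

struct Group <: Node
    children::Vector{Node}
end
struct Line <: Node
    label::String
end
struct VRScene <: Node          # a node type the basic renderer knows nothing about
    children::Vector{Node}
end

abstract type Renderer end
struct BasicRenderer <: Renderer end

children(::Node) = Node[]
children(n::Union{Group,VRScene}) = n.children

# By default a renderer supports nothing; each back end opts in per node type.
supports(::Renderer, ::Node) = false
supports(::BasicRenderer, ::Union{Group,Line}) = true

draw!(out, ::Renderer, ::Node) = nothing
draw!(out, ::BasicRenderer, n::Line) = push!(out, n.label)

function render!(out, r::Renderer, n::Node)
    supports(r, n) || return out    # punt on this node and all its children
    draw!(out, r, n)
    for child in children(n)
        render!(out, r, child)
    end
    return out
end

scene = Group(Node[Line("bar plot"),
                   VRScene(Node[Line("avatar")]),
                   Line("surface")])
drawn = render!(String[], BasicRenderer(), scene)
```

Here the VRScene subtree is skipped entirely, so the "avatar" line never
reaches the basic renderer, while its siblings are drawn as usual.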
Your virtual reality scene might therefore live in one "figure", which is a
window separate from the bar plots and surface plots. Updating the virtual
reality scene might require particular API elements and renderer backends that
are loaded in separately.
I'm not clear whether this is feasible, but it seems worth thinking about.
On another note, Julia seems to attract people with common first name/last
initial combinations. Stefan K., Tim H. Who next?
Best,
--Tim