As some of you probably know, I'm working on refigure2, a plotting
extension for Reinteract. The syntax for plotting is like so [1]:
with figure() as f:
    plot([1,2,3])
f
This has two features I don't like: First, you must name the figure and
then explicitly return it to Reinteract at the end of the block.
Otherwise the figure's custom_result never gets put in the worksheet.
Second, the plotting commands return various objects, which get printed
out if they aren't assigned to a variable. For one command this isn't a
problem, but complicated figures can lead to a screenful of returned
objects that I don't really need to see.
I had previously suggested a modification to the with statement that
might improve these issues [2], but as I was poking around Reinteract
today, I realized that both issues could be fixed by access to
reinteract_output, the function that handles the return values of
statements. Specifically, figures could call reinteract_output(self)
from their __exit__ methods, putting themselves in the worksheet at the
end of their with block. They could also disable reinteract_output
between their __enter__ and __exit__ calls to suppress printing of all
those plotting objects.
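A minimal sketch of that idea, assuming the output function can be switched off and on. OutputSwitch is a made-up stand-in for the worksheet's reinteract_output, and Figure is a toy class, not the refigure2 implementation:

```python
class OutputSwitch(object):
    """Hypothetical stand-in for a worksheet's reinteract_output."""

    def __init__(self):
        self.enabled = True
        self.shown = []              # what would appear in the worksheet

    def __call__(self, value):
        if self.enabled and value is not None:
            self.shown.append(value)


class Figure(object):
    """Toy figure that controls the output function during its with block."""

    def __init__(self, output):
        self.output = output

    def __enter__(self):
        self._was_enabled = self.output.enabled
        self.output.enabled = False  # silence results inside the block
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.output.enabled = self._was_enabled
        if exc_type is None:
            self.output(self)        # the figure puts itself in the worksheet
        return False                 # never swallow exceptions


output = OutputSwitch()
with Figure(output) as f:
    # Stands in for Reinteract handing plot()'s return value to the output
    output(["<Line2D object>"])
```

After the block, output.shown holds only the figure, and the output function is enabled again for the rest of the worksheet.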
While reinteract_output is accessible from the worksheet itself, there's
no easy way (that I could find) to get to the reinteract_output from
within a module [3]. For testing purposes, I added a function
set_output() to refigure2, which takes reinteract_output and saves it in
the module. Internally, refigure2 molests reinteract_output to disable
it during the with block, and then calls it with the figure itself at
the end. This version is available if you'd like to try it out [4], but
it has three significant problems:
1) You need to call set_output(reinteract_output) from the worksheet
before anything happens.
2) It does things to reinteract_output that are not done in polite
company. As such, it is probably rather fragile.
3) It only works for a single worksheet at a time. This is because each
worksheet has its own version of reinteract_output, but the refigure2
module is a singleton and can therefore only hold one at a time.
This got me thinking that maybe Reinteract could provide some mechanism
by which modules could affect the output function. I'm thinking
something like:
from reinteract import ReinteractOutput
ReinteractOutput.output( <some object> )
ReinteractOutput.enable( <bool> )
ReinteractOutput would have to be some sort of magic object that could
determine at run time what worksheet was being executed and adjust the
output function for that one. I don't know how to do this offhand, nor
do I know if it is even possible. But if it is, I think that it could
allow extension authors quite a bit of freedom in choosing a syntax.
refigure2 might overload the 'with' statement, but other extensions
could find a use for 'while' or 'for' loops.
As I was pondering this, I realized that this could offer another
mechanism for wrapping objects in CustomResults:
ReinteractOutput.add_handler( <func> )
or even
ReinteractOutput.set_output_function( <func> )
Granted, this mechanism would produce the same spooky effects as the
automatic _reinteract_wrap would, but ReinteractOutput could at least
provide an easier way for the user to see which handlers are currently
active.
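A rough sketch of what such a handler registry might look like. Everything here is hypothetical: OutputHandlers, active_handlers() and wrap() are invented names, and the convention that a handler returns None to decline a value is just one possible choice:

```python
class OutputHandlers(object):
    """Hypothetical registry; not an existing Reinteract class."""

    def __init__(self):
        self._handlers = []

    def add_handler(self, func):
        # func(value) returns a wrapped object, or None to decline
        self._handlers.append(func)

    def active_handlers(self):
        # Lets the user see which handlers are currently installed
        return [func.__name__ for func in self._handlers]

    def wrap(self, value):
        # The first handler that accepts the value wins
        for func in self._handlers:
            wrapped = func(value)
            if wrapped is not None:
                return wrapped
        return value


def summarize_long_list(value):
    # Example handler: replace long lists with a one-line summary
    if isinstance(value, list) and len(value) > 3:
        return "<list of %d items>" % len(value)
    return None


handlers = OutputHandlers()
handlers.add_handler(summarize_long_list)
```

The active_handlers() method is the part that addresses the "spooky effects" worry: the user can at least ask what is installed.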
I hope this explanation is somewhat lucid. I don't know if this is
possible or desirable. It would certainly allow for all sorts of weird
effects in unscrupulous hands, but I think it does have a few legitimate
uses.
Please let me know your thoughts,
Robert
[1] The single command plotting functions can actually be used without
the 'with' block, but ignore that for now.
[2] http://groups.google.com/group/reinteract/msg/355bd7e50b90ef82
[3] Importing __main__ gets you not the context of the worksheet, but
the context of the Reinteract application. I suppose you could drill
down and find the current worksheet with some trial and error, but this
seemed a bit unsafe even to me.
[4] It's the output branch on github
(http://github.com/rschroll/refigure2/tree/output). But this is meant
for demonstration and testing purposes only. Nice modules don't do the
things this does.
I think access from modules to the "current worksheet context" is
almost certainly part of any way of having more powerful extensions.
I've never sat down and worked out exactly how to do it, but one of
the two following methods should work:
- Using inspect.currentframe(), frame.f_back, frame.f_globals to find
the worksheet frame
- Using threading.local to store the currently executing statement
I think the second is probably the better way to do it ... that's
actually how we handle stdout redirection - see stdout_capture.py.
My general feeling is that any access to the current worksheet should
be done by functions that look like they access the current worksheet.
So, my current thoughts on sidebars are that:
c = recairo.Surface()
open_sidebar(c) # affects the worksheet
cr = cairo.Context(c)
cr.move_to(10, 10)
cr.line_to(20, 20)
I think your example with figure() fits into that general category,
though it's obviously a bit more magic because it's being used to give
a with statement a "return value" - something that with statements
don't have in Python.
> Second, the plotting
> commands return various objects, which get printed out if they aren't
> assigned to a variable. For one command this isn't a problem, but
> complicated figures can lead to a screenful of returned objects that I don't
> really need to see.
Hmm. I understand using reinteract_output from __exit__ to print out
the plot. And it also makes sense to me to have a with statement scope
a "current plot". But I'm not so sure about suppressing return values.
That just strikes me as confusing inconsistency. Why should these
commands have return values in one context, but not in another
context?
On the other hand, maybe it makes sense to only do the magic output
at the toplevel - so:
for i in xrange(0,5):
    i
Doesn't give anything, and you have to use 'print i',
'reinteract_output(i)' or maybe 'output(i)' to get that effect. Not
sure - it would be less consistent with interactive Python that way,
but also less prone to accidentally spamming the output.
- Owen
Thanks for the thoughtful reply. It's taken me a bit to digest it all;
here's my current point of view:
Owen Taylor wrote:
> My general feeling is that any access to the current worksheet should
> be done by functions that look like they access the current worksheet.
> So, my current thoughts on sidebars are that:
>
> c = recairo.Surface()
> open_sidebar(c) # affects the worksheet
>
> cr = cairo.Context(c)
> cr.move_to(10, 10)
> cr.line_to(20, 20)
>
> I think your example with figure() fits into that general category,
> though it's obviously a bit more magic because it's being used to give
> a with statement a "return value" - something that with statements
> don't have in Python.
Let me be clear: I think that figures are a natural fit with sidebars.
I consider everything I'm doing with refigure(2) to be a stop-gap
solution until Reinteract gets sidebars. When that happens, I will
happily change refigure to use sidebars and/or help replot adapt to them.
So, whenever I propose something, if that thing can be better done with
a sidebar, it should be held off for the sidebars. But since I have
neither the interest nor ability to implement something so major, I'll
continue to suggest small tweaks to help the current situation.
> I've never sat down and worked out exactly how to do it, but one of
> the two following methods should work:
>
> - Using inspect.currentframe(), frame.f_back, frame.f_globals to find
> the worksheet frame
Wow - the standard library is full of wondrous things! I've modified
the output branch of refigure2 to use inspect to tweak
reinteract_output, and it seems to work just fine. It's certainly less
of a kludge than messing with functions' func_code. But if we're going
to have an 'official' way for modules to modify the current context, I
don't think this should be it.
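For reference, the frame-walking approach can be sketched like this. It is not the code in the output branch; find_in_caller_globals is a made-up helper, and the worksheet is imitated by a plain dict passed to exec:

```python
import inspect


def find_in_caller_globals(name):
    """Walk up the call stack and return the first global called `name`."""
    frame = inspect.currentframe().f_back   # start with our caller
    try:
        while frame is not None:
            if name in frame.f_globals:
                return frame.f_globals[name]
            frame = frame.f_back
        raise LookupError(name)
    finally:
        del frame                           # avoid a reference cycle


# Imitate a worksheet: code executed with its own globals dictionary
worksheet_globals = {
    "reinteract_output": "the worksheet's output function",
    "find_in_caller_globals": find_in_caller_globals,
}
exec("found = find_in_caller_globals('reinteract_output')", worksheet_globals)
```

The helper never needs to be told which worksheet is running; it finds whatever globals the calling statement was executed with. That is also why it feels fragile: any intermediate frame that happens to define the same name would be picked up first.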
> - Using threading.local to store the currently executing statement
>
> I think the second is probably the better way to do it ... that's
> actually how we handle stdout redirection - see stdout_capture.py.
To check my understanding: You're suggesting replacing the current
reinteract_output function with a class like StdoutCapture. Then,
modules can instantiate an object of this class and, thanks to
threading.local, modify only the current thread's output. Is that right?
If so, I agree that this is the better way to handle it. If this is
something you would like to have in Reinteract, I'd be happy to start
trying to implement it. But I don't know if this will provide anything
that sidebars won't. If you'd rather hold off for sidebars and avoid
duplication of code, I'm perfectly happy to use the inspect method
within refigure2 for now.
>> Second, the plotting
>> commands return various objects, which get printed out if they aren't
>> assigned to a variable. For one command this isn't a problem, but
>> complicated figures can lead to a screenful of returned objects that I don't
>> really need to see.
>
> Hmm. I understand using reinteract_output from __exit__ to print out
> the plot. And it also makes sense to me to have a with statement scope
> a "current plot". But I'm not so sure about suppressing return values.
> That just strikes me as confusing inconsistency. Why should these
> commands have return values in one context, but not in another
> context?
>
> On the other hand, maybe it makes sense to only do the magic output
> at the toplevel - so:
>
> for i in xrange(0,5):
>     i
>
> Doesn't give anything, and you have to use 'print i',
> 'reinteract_output(i)' or maybe 'output(i)' to get that effect. Not
> sure - it would be less consistent with interactive Python that way,
> but also less prone to accidentally spamming the output.
I agree that this is a rather odd thing to do. If I would just learn
matplotlib's OO interface, I could probably avoid these problems. When
I use the pylab interface to make complicated figures, I can end up with
a whole screen full of useless output that I realized I could avoid. I
can't really think of many other places where suppressing this output
would be desired, so I don't see a good reason to make it the default.
But if you're going to give module authors access to the output
function, at least one of them (me) is going to try to figure out how to
disable it. So I think it might be a good idea to provide a supported
way to do this.
In an attempt to make this less magic, I've added a disable_output flag
to figure() to govern whether the output is suppressed in this manner.
But I've set the default to True to save myself typing, thereby
counteracting most of the good this might do. A better way would be for
refigure2 to read the default from a configuration file, but be False
otherwise. Then the user would have to explicitly ask for the
suppression, but only once. This brings up another question: Should
there be a default location for extensions to save their settings?
Thanks,
Robert
References to the sidebars idea are not meant to say that we should
hold off on doing other stuff - I have no idea when or if I'll get
to that major idea. Just trying to figure out what makes sense in my
head.
>> - Using threading.local to store the currently executing statement
>>
>> I think the second is probably the better way to do it ... that's
>> actually how we handle stdout redirection - see stdout_capture.py.
>
> To check my understanding: You're suggesting replacing the current
> reinteract_output function with a class like StdoutCapture. Then, modules
> can instantiate an object of this class and, thanks to threading.local,
> modify only the current thread's output. Is that right?
Actually, I was thinking of keeping the threading.local stuff more
behind the scenes. Have a class method Statement.get_current(), then
add whatever methods are needed on Statement. Patch to implement
Statement.get_current() attached - I tested:
===
from reinteract.statement import Statement
Statement.get_current().result_scope['inserted'] = 3
inserted
3
===
See if that works for you, and what extra you need beyond what
Statement currently exports.
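The behaviour tested above can be imitated with a toy class. This is not the attached patch, only a sketch of the pattern under the assumption that each statement runs in its own thread; this Statement class and its execute() method are simplified inventions:

```python
import threading


class Statement(object):
    """Toy model of a worksheet statement (not Reinteract's real class)."""

    _local = threading.local()      # one "current statement" per thread

    def __init__(self, source, scope):
        self.source = source
        self.result_scope = scope

    @classmethod
    def get_current(cls):
        # None when no statement is executing in this thread
        return getattr(cls._local, "current", None)

    def execute(self):
        previous = Statement.get_current()
        Statement._local.current = self
        try:
            exec(self.source, self.result_scope)
        finally:
            Statement._local.current = previous


def insert_variable():
    # Stands in for code living in an extension module
    Statement.get_current().result_scope["inserted"] = 3


stmt = Statement("insert_variable()", {"insert_variable": insert_variable})
stmt.execute()
```

The extension function never receives the statement as an argument; it looks it up through the class method, and the threading.local storage stays hidden behind that call.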
[snip]
> In an attempt to make this less magic, I've added a disable_output flag to
> figure() to govern whether the output is suppressed in this manner. But I've
> set the default to True to save myself typing, thereby counteracting most of
> the good this might do. A better way would be for refigure2 to read the
> default from a configuration file, but be False otherwise. Then the user
> would have to explicitly ask for the suppression, but only once. This
> brings up another question: Should there be a default location for
> extensions to save their settings?
I actually dislike the whole matplotlib config file thing. Goes
against my feeling that code should be standalone and self-evident.
And especially in this case. If the return values for refigure have
any utility then presumably at some point it makes sense to make a
worksheet that does:
axes = add_axes([0.5, 0.5, 1, 1]);
axes.set_y_label_style('italic')
(Or whatever, I'm just making stuff up.) Then whether things worked or
not would depend on that config file setting. It should be obvious how
to implement behind-the-scenes worksheet state like:
set_return_objects(False);
using the Statement.get_current() patch - so you could have that once
at the top of your worksheet. I don't know if that's a good
alternative.
In terms of config files - and there probably are legitimate uses,
though I'm not sure this is one - I could imagine having:
Statement.get_current().worksheet.notebook.get_config_path()
(Or maybe add Worksheet.get_current() and Notebook.get_current() that
use Statement.get_current() internally). And have the config path be a
list of directories to look in where you might find a config file for
your directories.
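Searching such a path could be as simple as the sketch below. It assumes the config path is an ordered list of directories where earlier entries take priority; find_config_file and the file name refigure2.conf are made up for illustration:

```python
import os
import tempfile


def find_config_file(filename, config_path):
    """Return the first existing config file on the path, or None."""
    for directory in config_path:
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            return candidate
    return None


# Demonstration with two throwaway directories; only the second has a file
first = tempfile.mkdtemp()
second = tempfile.mkdtemp()
target = os.path.join(second, "refigure2.conf")
with open(target, "w") as handle:
    handle.write("disable_output = True\n")

found = find_config_file("refigure2.conf", [first, second])
```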
- Owen
Yep. That does everything I want, and more. Thanks!
>> In an attempt to make this less magic, I've added a disable_output flag to
>> figure() to govern whether the output is suppressed in this manner. But I've
>> set the default to True to save myself typing, thereby counteracting most of
>> the good this might do. A better way would be for refigure2 to read the
>> default from a configuration file, but be False otherwise. Then the user
>> would have to explicitly ask for the suppression, but only once. This
>> brings up another question: Should there be a default location for
>> extensions to save their settings?
>
> I actually dislike the whole matplotlib config file thing. Goes
> against my feeling that code should be standalone and self-evident.
> And especially in this case. If the return values for refigure have
> any utility then presumably at some point it makes sense to make a
> worksheet that does:
>
> axes = add_axes([0.5, 0.5, 1, 1]);
> axes.set_y_label_style('italic')
>
> (Or whatever, I'm just making stuff up.) Then whether things worked or
> not would depend on that config file setting. It should be obvious how
> to implement behind-the-scenes worksheet state like:
>
> set_return_objects(False);
>
> using the Statement.get_current() patch - so you could have that once
> at the top of your worksheet. I don't know if that's a good
> alternative.
I'm afraid you may be misunderstanding my goal here. I think the plot
functions should always return values, because as you note they can be
useful. Instead, I want to suppress the printing of these return values
when they're not assigned to a variable. Right now, you get:
with figure() as f:
    plot([1,2,1])
f
[<matplotlib.lines.Line2D object at 0x971822c>]
<the actual figure here>
What I'm going for is:
with figure(disable_output=True) as f:
    plot([1,2,1])
f
<the actual figure here>
while ensuring that this still works:
with figure(disable_output=True) as f:
    lines = plot([1,2,1])
f
This is why I'm trying to screw with reinteract_output, not with the
output of the plotting functions. (Maybe "disable_output" should really
be "disable_output_printing", but that seems too verbose. Perhaps
"suppress_output"?)
Even so, I think there's a reasonable argument that this should only be
done on a per-worksheet basis. ("Explicit is better than implicit.")
But refigure has always chosen expediency over correctness. I'd rather
not have the output suppression on by default, as I can see it confusing
new users. But I want to have it on all the time for myself without
extra work, which led to the idea of using a config file. Is this
the "right" thing to do? Probably not, but if I were doing the right
thing, I'd just end up rewriting replot :)
>
> In terms of config files - and there probably are legitimate uses,
The real use for a config file with refigure would be to allow the user
to set different defaults for refigure than are used for vanilla
matplotlib. Right now, refigure hard codes a smaller figure size than
the matplotlib default, but the user cannot adjust this through their
matplotlibrc file, as they can with most other settings. Something
better is obviously needed.
> though I'm not sure this is one - I could imagine having:
>
> Statement.get_current().worksheet.notebook.get_config_path()
>
> (Or maybe add Worksheet.get_current() and Notebook.get_current() that
> use Statement.get_current() internally). And have the config path be a
> list of directories to look in where you might find a config file for
> your directories.
If this feature is made available, I'd certainly use it. But I really
just meant to ask if we should set a default directory for config files,
and if so what it should be.
Thanks,
Robert
On 09/04/2010 10:16 PM, Owen Taylor wrote:
>> Second, the plotting
>> commands return various objects, which get printed out if they aren't
>> assigned to a variable. For one command this isn't a problem, but
>> complicated figures can lead to a screenful of returned objects that I don't
>> really need to see.
>
> Hmm. I understand using reinteract_output from __exit__ to print out
> the plot. And it also makes sense to me to have a with statement scope
> a "current plot". But I'm not so sure about suppressing return values.
> That just strikes me as confusing inconsistency. Why should these
> commands have return values in one context, but not in another
> context?
>
> On the other hand, maybe it makes sense to only do the magic output
> at the toplevel - so:
>
> for i in xrange(0,5):
>     i
>
> Doesn't give anything, and you have to use 'print i',
> 'reinteract_output(i)' or maybe 'output(i)' to get that effect. Not
> sure - it would be less consistent with interactive Python that way,
> but also less prone to accidentally spamming the output.
I had previously argued against suppressing output by default as being a
bit surprising. But I wonder if we should revisit this for build
blocks. These exist specifically for the purpose of manipulating
output, so it'd be less surprising for this to happen in a build block
than in any other. I'm not sure if I like this or not, but I thought
I'd throw it out there.
As a reminder, the proposal is to suppress the automatic printing of
objects, while leaving explicit printing unmolested. So we'd have:
>>> build:
...     1+1
...     print 2+2
4
and
>>> if True:
...     1+1
...     print 2+2
2
4
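The difference between the two cases can be modelled with a small sketch. It only handles the statements of a flat block body, it is not how Reinteract executes worksheets, and run_block and the explicit output() function are names invented here (output() plays the role of print or reinteract_output):

```python
import ast


def run_block(source, auto_output=True):
    """Run statements one by one and collect what would be displayed."""
    shown = []
    scope = {"output": shown.append}        # explicit output always works
    for node in ast.parse(source).body:
        if isinstance(node, ast.Expr):
            # A bare expression: evaluate it and maybe display the value
            code = compile(ast.Expression(node.value), "<block>", "eval")
            value = eval(code, scope)
            if auto_output and value is not None:
                shown.append(value)
        else:
            # Any other statement is simply executed
            module = ast.Module(body=[node], type_ignores=[])
            exec(compile(module, "<block>", "exec"), scope)
    return shown


body = "1+1\noutput(2+2)\n"
in_build_block = run_block(body, auto_output=False)   # proposed behaviour
in_if_block = run_block(body, auto_output=True)       # current behaviour
```

With auto_output switched off only the explicit call is displayed; with it on, the bare expression is displayed as well.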
Robert