Iterators and KeyboardInterrupt

74 views
Skip to first unread message

VulK

unread,
Aug 22, 2016, 3:42:26 AM8/22/16
to sage-...@googlegroups.com
Dear All,
in a ticket (#21254) I recently created with Dylan Rupel I need to explore a
(possibly) infinite n-regular tree in a breath-first search. The way it is
implemented right now is via iterators. I am writing to you to ask if there
is a preferred way to deal with user interaction and KeyboardInterrupt.

At the moment the init function of :class:`ClusterAlgebra` calls
:meth:`reset_exploring_iterator` that creates an instance of :meth:`seeds` and
stores it in an internal var ``_sd_iter``. The methods that need to explore
the tree just call ``next`` on this iterator till they get the data they are
looking for. This process is potentially never ending. For this reason the
user may specify a maximum depth at which to stop.

Since this iterator is computationally intense, it easy to imagine that
users will tend to start searching but change their mind in mid computation
and send an interrupt. This, unfortunately, leaves ``_sd_iter`` in a
corrupted state and all future calls to it will result in a StopIteration.

At the moment we have a warning in the docstring of each method accessing
_sd_iter to explain that after sending a KeyboardInterrupt the user needs to
call :meth:`reset_exploring_iterator`. Do you think we should instead catch
the interrupt, reset the iterator and then raise it again like this?


@@ -1999,12 +1999,17 @@ class ClusterAlgebra(Parent):
sage: len(A.g_vectors_so_far())
14
"""
- while self._explored_depth <= depth:
- try:
- seed = next(self._sd_iter)
- self._explored_depth = seed.depth()
- except StopIteration:
- break
+ try:
+ while self._explored_depth <= depth:
+ try:
+ seed = next(self._sd_iter)
+ self._explored_depth = seed.depth()
+ except StopIteration:
+ break
+ except KeyboardInterrupt:
+ print("Got a KeyboardInterrupt, cleaning up before returning.")
+ self.reset_exploring_iterator()
+ raise KeyboardInterrupt

The advantage of this is that the user does not have to remember an extra
(unnatural?) step. The drawback is that he/she looses any customization that
may have been made by a previous call to :meth:`reset_exploring_iterator`.


On a related topic. In the situation just described, the next exploration
will have to begin from the root of the tree resulting in a lot of wasted
effort. Is there any way around this? Sending a node of the tree back to the
iterator does not seem useful because of the breath-first search.

Thanks
S.


leif

unread,
Aug 22, 2016, 11:36:15 AM8/22/16
to sage-...@googlegroups.com
VulK wrote:
> Dear All,
> in a ticket (#21254) I recently created with Dylan Rupel I need to explore a
> (possibly) infinite n-regular tree in a breath-first search.

Breath-first search = Search & pray? ;-)

(Possibly infinite apnoea can't be healthy.)
Well, one usually implements checkpoints for such things (continually
saving state to optionally resume later if interrupted).


-leif


Nils Bruin

unread,
Aug 22, 2016, 12:12:50 PM8/22/16
to sage-devel
On Monday, August 22, 2016 at 12:42:26 AM UTC-7, Salvatore Stella wrote:
At the moment the init function of :class:`ClusterAlgebra` calls
:meth:`reset_exploring_iterator` that creates an instance of :meth:`seeds` and
stores it in an internal var ``_sd_iter``.

Are you really getting benefit from storing the state (i.e., the actual iterator) on the parent itself? (I see you haven't made ClusterAlgebra UniqueRepresentation, so it's not an immediate bug to have it this way)

Perhaps it's cleaner to hand out iterator objects that are kept track of in the relevant loop. That iterator would then just die whenever the frames of a KeyboardInterrupt exception are discarded and the flawed state wouldn't persist.

The iterator objects could even hold a reference to the ClusterAlgebra (they may well have to) and write back interesting finds that may benefit future computations back onto the ClusterAlgebra.

If you really need to store the iterator object on the ClusterAlgebra and you're finding it needs to be resilient in the face of KeyboardInterrupts (that is to say, be able to produce a "next" element after keyboard interruption), you could write your own iterator object and store that. An iterator object is an object that implements "__iter__" returning self and "next" returning the "next" element. See https://docs.python.org/2/library/stdtypes.html#iterator-types .

Using "yield" is a convenient way of writing simple iterators, but it is in no way the only way of doing it. When you explicitly store state yourself it is much easier to define the right behaviour in the face of unexpected interrupts.

 

Volker Braun

unread,
Aug 22, 2016, 1:53:23 PM8/22/16
to sage-devel
On Monday, August 22, 2016 at 6:12:50 PM UTC+2, Nils Bruin wrote:
Perhaps it's cleaner to hand out iterator objects that are kept track of in the relevant loop. That iterator would then just die whenever the frames of a KeyboardInterrupt exception are discarded and the flawed state wouldn't persist.

IMHO iterators must not have global state, which is really just a corollary to "global variables are bad". In particular, iterating twice simultaneously should work. With the exception of input iterators of course, but iterating over a tree doesn't consume it.

Nils Bruin

unread,
Aug 22, 2016, 2:27:28 PM8/22/16
to sage-devel
On Monday, August 22, 2016 at 10:53:23 AM UTC-7, Volker Braun wrote:
IMHO iterators must not have global state, which is really just a corollary to "global variables are bad". In particular, iterating twice simultaneously should work. With the exception of input iterators of course, but iterating over a tree doesn't consume it.

Just to make sure we stay compatible with python terminology: iterators get consumed and have their state altered by calling "next" on them. Iterables are objects that can be iterated over, meaning that calling "iter" on them produces an iterator on which one can call "next".

Iterators themselves are required to be "iterable", but in a strange way: calling "iter" on an iterators gives you back an identical object! In particular, if I is an iterator then calling next(I) and next(iter(I)) should have exactly the same result (in both cases modifying the state of I).

Volker Braun

unread,
Aug 22, 2016, 4:30:13 PM8/22/16
to sage-devel
On Monday, August 22, 2016 at 8:27:28 PM UTC+2, Nils Bruin wrote:
Iterators themselves are required to be "iterable", but in a strange way: calling "iter" on an iterators gives you back an identical object! In particular, if I is an iterator then calling next(I) and next(iter(I)) should have exactly the same result (in both cases modifying the state of I).

Yes, though thats not what I meant. I wanted to say that set(iter(X)) == set(iter(X)). Unless X is somehow being consumed in the iteration, like open(file).readlines(), that is, an input iterator. But a BFS most certainly does not consume the tree. 

Sébastien Labbé

unread,
Aug 22, 2016, 4:42:50 PM8/22/16
to sage-devel, etn4...@gmail.com
Do you know about

sage: RecursivelyEnumeratedSet?

Sébastien Labbé

unread,
Aug 22, 2016, 5:05:30 PM8/22/16
to sage-devel, etn4...@gmail.com


sage: RecursivelyEnumeratedSet?

See also

http://doc.sagemath.org/html/en/reference/structure/sage/sets/recursively_enumerated_set.html

If the structure of your set is a tree or forest, then you may be interested in using parallel computations on your structure provided by

http://doc.sagemath.org/html/en/reference/parallel/sage/parallel/map_reduce.html

which was merged into Sage this year and which is known enough I think.

I let you look whether KeyboardInterrupts works ok for your need, but consider including your tree iterator into these existing modules (I will be willing to review your code).

Sébastien

VulK

unread,
Aug 22, 2016, 8:03:16 PM8/22/16
to sage-devel
Hi All,
thank you very much for all the inputs!

> Breath-first search = Search & pray? ;-)
>
> (Possibly infinite apnoea can't be healthy.)

/me fails
:)

> Well, one usually implements checkpoints for such things (continually
> saving state to optionally resume later if interrupted).

I am not sure what data to store nor if it is a good compromise, I will think
about this a little but I guess Sébastien gave me a way out of this.

> Are you really getting benefit from storing the state (i.e., the actual
> iterator) on the parent itself? (I see you haven't made ClusterAlgebra
> UniqueRepresentation, so it's not an immediate bug to have it this way)
> Perhaps it's cleaner to hand out iterator objects that are kept track
> of in the relevant loop. That iterator would then just die whenever the
> frames of a KeyboardInterrupt exception are discarded and the flawed
> state wouldn't persist.

The main benefit I get from storing the iterator is that, if the user is
careful in calling the various functions with reasonable stopping points, the
code never has to start searching from scratch. For example currently

sage: A = ClusterAlgebra(['A',2,1])
sage: A.explore_to_depth(10)
sage: A.explore_to_depth(11)

effectively only traverses the tree once to depth 11. If I were not to store
the iterator then I would be traversing the tree twice. And unfortunately
this is expensive.

> Using "yield" is a convenient way of writing simple iterators, but it
> is in no way the only way of doing it. When you explicitly store state
> yourself it is much easier to define the right behaviour in the face of
> unexpected interrupts.

I never thought of this before.

> IMHO iterators must not have global state, which is really just a
> corollary to "global variables are bad". In particular, iterating twice
> simultaneously should work. With the exception of input iterators of
> course, but iterating over a tree doesn't consume it.

I agree in principle with the idea that "global variables are bad" but how do
you avoid to perform expensive computations multiple times?

> Do you know about
> sage: RecursivelyEnumeratedSet?

No and it looks extremely promising! Unfortunately the graph I have has
only a symmetric structure so it looks like I will not be able to use any
parallelization (unless there is some other wonder piece of code I know
nothing about!!!).

> I let you look whether KeyboardInterrupts works ok for your need, but
> consider including your tree iterator into these existing modules (I
> will be willing to review your code).

At this point I definitely intend to try RecursivelyEnumeratedSet; I'll keep
you posted on the outcome and hold you to the promise of a review!

Thanks
S.

Sébastien Labbé

unread,
Aug 23, 2016, 3:51:34 AM8/23/16
to sage-devel


> Are you really getting benefit from storing the state (i.e., the actual
> iterator) on the parent itself? (I see you haven't made ClusterAlgebra
> UniqueRepresentation, so it's not an immediate bug to have it this way)
> Perhaps it's cleaner to hand out iterator objects that are kept track
> of in the relevant loop. That iterator would then just die whenever the
> frames of a KeyboardInterrupt exception are discarded and the flawed
> state wouldn't persist.

The main benefit I get from storing the iterator is that, if the user is
careful in calling the various functions with reasonable stopping points, the
code never has to start searching from scratch. For example currently

    sage: A = ClusterAlgebra(['A',2,1])
    sage: A.explore_to_depth(10)
    sage: A.explore_to_depth(11)

effectively only traverses the tree once to depth 11. If I were not to store
the iterator then I would be traversing the tree twice. And unfortunately
this is expensive.

The method graded_component does exactly what you want. See

VulK

unread,
Aug 23, 2016, 4:16:53 AM8/23/16
to sage-...@googlegroups.com
Indeed RecursivelyEnumeratedSet seems to be a good fit for my needs.
The only problem I encountered so far is that it does not handle
KeybordInterrupts nicely either: if you interrupt the computation of
graded_component(n) then graded_component(m) with m<n still produce the
desired input while all the remaining m raise a StopIterator.
S.



* Sébastien Labbé <sla...@gmail.com> [2016-08-23 00:51:34]:

> > Are you really getting benefit from storing the state (i.e., the
> actual
> > iterator) on the parent itself? (I see you haven't made
> ClusterAlgebra
> > UniqueRepresentation, so it's not an immediate bug to have it this
> way)
> > Perhaps it's cleaner to hand out iterator objects that are kept
> track
> > of in the relevant loop. That iterator would then just die
> whenever the
> > frames of a KeyboardInterrupt exception are discarded and the
> flawed
> > state wouldn't persist.
> The main benefit I get from storing the iterator is that, if the
> user is
> careful in calling the various functions with reasonable stopping
> points, the
> code never has to start searching from scratch. For example
> currently
> Â Â sage: A = ClusterAlgebra(['A',2,1])
> Â Â sage: A.explore_to_depth(10)
> Â Â sage: A.explore_to_depth(11)
> effectively only traverses the tree once to depth 11. If I were not
> to store
> the iterator then I would be traversing the tree twice. And
> unfortunately
> this is expensive.
>
> The method graded_component does exactly what you want. See
>
> https://github.com/sagemath/sage/blob/master/src/sage/sets/recursively_
> enumerated_set.pyx#L607
> Â
>
> --
> You received this message because you are subscribed to the Google
> Groups "sage-devel" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to [1]sage-devel+...@googlegroups.com.
> To post to this group, send email to [2]sage-...@googlegroups.com.
> Visit this group at [3]https://groups.google.com/group/sage-devel.
> For more options, visit [4]https://groups.google.com/d/optout.
>
> References
>
> 1. mailto:sage-devel+...@googlegroups.com
> 2. mailto:sage-...@googlegroups.com
> 3. https://groups.google.com/group/sage-devel
> 4. https://groups.google.com/d/optout

Sébastien Labbé

unread,
Aug 23, 2016, 5:01:42 AM8/23/16
to sage-devel

On Tuesday, August 23, 2016 at 10:16:53 AM UTC+2, Salvatore Stella wrote:
Indeed RecursivelyEnumeratedSet seems to be a good fit for my needs.
The only problem I encountered so far is that it does not handle
KeybordInterrupts nicely either: if you interrupt the computation of
graded_component(n) then graded_component(m) with m<n still produce the
desired input while all the remaining m raise a StopIterator.
S.


I see. Using alarm, I managed to create a reprocable bug:

https://trac.sagemath.org/ticket/21312
Reply all
Reply to author
Forward
0 new messages