Alexander Blinne <n...@blinne.net> writes:
>> def gen_s():
>> s = [1]
>> m = skipdups(heapq.merge(*[(lambda j: (k*j for k in s))(n) for n in [2,3,5]]))
>> yield s[0]
>> while True:
>> k = m.next()
>> s.append(k)
>> yield k
Nice. I wouldn't have been sure that "for k in s" worked properly when
s was changing like that.
There is a space complexity problem compared to the Scheme or Haskell
version: all the s[i]'s are saved in the s array, instead of being
discarded once they are yielded. That means generating n elements needs
O(n) space instead of O(n**0.7) or something like that. I guess you can
get around it with collections.deque instead of a list.
> Alexander Blinne <n...@blinne.net> writes:
>>> def gen_s():
>>> s = [1]
>>> m = skipdups(heapq.merge(*[(lambda j: (k*j for k in s))(n) for n in [2,3,5]]))
>>> yield s[0]
>>> while True:
>>> k = m.next()
>>> s.append(k)
>>> yield k
> Nice. I wouldn't have been sure that "for k in s" worked properly when
> s was changing like that.
I just tried it and it worked. Not sure if it is guaranteed.
> There is a space complexity problem compared to the Scheme or Haskell
> version: all the s[i]'s are saved in the s array, instead of being
> discarded once they are yielded. That means generating n elements needs
> O(n) space instead of O(n**0.7) or something like that. I guess you can
> get around it with collections.deque instead of a list.
An Element of s could be discarded, after every one of the three (k*j
for k in s)-generators went over it. I don't think that this is possible
with one deque (at least with the built-in merger of heapq, a
self-written one could be adapted). Storing everything three times (one
deque for every generator) would be a mess as well.
"Manual" garbage collection could be done by discarding all elements
smaller one fifth of the current element of s, because the three
generators already went by them. This could be done with a deque.
How do Haskell or Scheme determine when elements are not longer needed?
On Fri, 15 Jun 2012, Alexander Blinne wrote:
> How do Haskell or Scheme determine when elements are not longer needed?
Just like Python, they use garbage collection - in one sentence, if it can be proved the object (not a OO-object, just a piece of data) will no longer be needed, it can be safely deleted - and the code will work as if nothing happened, because the proof said it won't need this data in the future (so you need a right proving technique).
Now, the difference is, Scheme (and Lisps AFAIK) and Haskell (and those functional langs I heard of) posess one neat data type, linked list. They also allow for tail-call recursion, which - if one organises one's code properly - means infinite recursion, if one needs it. Some problems are expressed in an elegant and natural manner as linked lists (head to be processed now and rest/tail to be processed later). Such linked lists are ideal fit for tail-call recursion - you process a head and recurse with results and tail in place of original list (thus becoming a next level head+tail list). If no other piece of code stores your current head in a variable (simply speaking), it can be proven that head is no longer needed. Once you call your function recursively, head is waiting to be GC-ed. Your code does not need to worry about this.
Last time I checked, Python didn't have linked lists - arrayed lists are nice, but their elements can't be automatically GC-ed (or, this requires very nontrivial GC algorithm), the easiest way I can think would be replacing them with None manually. I'm not sure if del is performance-nice.
Also, around the same time, Python couldn't do tail-call, so the whole point of having linked lists was kind of theoretical.
Even more cool, with lazy evaluation (like in Haskell) one can generate lists on a fly and process them like they were statically allocated. Say, you only have a 2GB of ram but would like to process 128GB of list, generated ad hoc as your program runs? Like, counting all even numbers less than 2**39 - this is trivial, I know (2**38), but you could run such code with 2GB of ram. Your code processes head and when it recurses with tail, the new head (next number) is generated, so it can be processed. And so on. And thanks to lazy evaluation, you don't need to think about it, this is the job of compiler to organize your program in such way.
Yes, you could also run it in a loop or simulate lazy-eval manually (with yield) but the point here is you can have more choice for your algorithm with some languages and in some other languages (Ruby belongs here, too, AFAIK) you don't use recursion (too much) even if you'd like to.
Myself, I love more choice, but of course everybody can have his own preferences.
Regards,
Tomasz Rola
--
** A C programmer asked whether computer had Buddha's nature. **
** As the answer, master did "rm -rif" on the programmer's home **
** directory. And then the C programmer became enlightened... **
** **
** Tomasz Rola mailto:tomasz_r...@bigfoot.com **
Alexander Blinne <n...@blinne.net> writes:
> An Element of s could be discarded, after every one of the three (k*j
> for k in s)-generators went over it. I don't think that this is possible
> with one deque (at least with the built-in merger of heapq, a
> self-written one could be adapted). Storing everything three times (one
> deque for every generator) would be a mess as well.
I think for 3 lists I'd have just merged manually instead of figuring
out the heapq merge. The deque approach sounds straightforward but
often there is subtlety, so I'll have to try it.
> How do Haskell or Scheme determine when elements are not longer needed?
Normal gc, once there is no reference to an elemeent it is released.
Actually again there may be a subtlety, if there is a symbol pointing
to the stream. I'll check into this but I think when I tested it in
Haskell, it did the right thing.
> Last time I checked, Python didn't have linked lists - arrayed lists are
> nice, but their elements can't be automatically GC-ed (or, this requires
> very nontrivial GC algorithm), the easiest way I can think would be
> replacing them with None manually. I'm not sure if del is
> performance-nice.
Python does not come with a linked-list class, but one can easily use tuples or lists with two items as linked-list cells. One can optionally wrap such cell in a linked-list class. However, if array lists do the job, which they usually do, using linked-lists will take more time and space. The problem being discussed may be a case where they are useful and make it easier to save space.
> Also, around the same time, Python couldn't do tail-call,
Nonsense. A tail call is a call temporally followed by a return. In CPython bytecode, it would be a call followed by return. In Python code, it is a call spatially preceded by 'return'. Any "return f(whatever)", a common operation is a tail call.
I presume you actually mean that CPython does not automatically convert tail calls into local assignments and a jump to reuse the existing execution frame instead of a new one. True. A Python interpreter could easily detect all tail calls at either compilation or execution, but such conversions would erase call history, leaving gaps in exception tracebacks and make debugging harder. Depending on your viewpoint, such conversion might be considered a semantic change.
Selectively converting recursive tail calls has specific problems that have been discussed on other threads, and it would *still* erase the call history that one might need to debug. If you do branching recursion, as with a tree, and there is an unexpected exception, you most likely really do want to see the complete call path leading up to the exception. In addition, it is a feature that non-terminating recursions such as "def forever(): return forever()" get stopped.
In any case, a properly written linear tail-recursive function is, usually, easily converted to an explicit while loop. So if you want within-frame looping, write it explicitly. To illustrate one general pattern:
def tail_rec(a, b=start): # use default arg to avoid nesting
if should_loop(a, b):
return tail_rec(A(a,b), B(a,b))
else:
return term(a, b)
def while_equiv(a, b=start):
while should_loop(a, b):
a, b = A(a,b), B(a,b)
else:
return term(a, b)
In practice, should_loop, A, and B will usually be in-line expressions rather than calls. There may be additional statements after if, else, and while headers. In while_equiv, move b=start into the body. Else is typically omitted from the while version, but I prefer it in the recursive version.
One downside of the space saving is that the history of a,b values is invisible unless one add a debug print statement. Another is that the forever function becomes
def forever():
while True: pass
and Python will never stop it without intervention.
> Even more cool, with lazy evaluation (like in Haskell) one can generate
> lists on a fly and process them like they were statically allocated.
Python iterators can do lazy evaluation. All the builtin classes come with a corresponding iterator.
> Yes, you could also run it in a loop or simulate lazy-eval manually (with
> yield)
There is nothing simulated about yield. Python mostly does what you tell it to do. You just have to learn how to tell it to do what you want.
Terry Reedy <tjre...@udel.edu> writes:
> Python iterators can do lazy evaluation. All the builtin classes come
> with a corresponding iterator. ...
I wouldn't say iterators do lazy evaluation in the Scheme or Haskell
sense. Lazy evaluation imho means evaluation is deferred until you
actually try to use the value, and when you use it, it is computed and
kept around for later re-use (until it becomes gc-able). Python
iterators simply generate one value at a time and leave retention of old
values to be managed by the programmer.
> There is nothing simulated about yield. Python mostly does what you
> tell it to do. You just have to learn how to tell it to do what you
> want.
I'd be interested in seeing a clean implementation of that algorithm
using python iterators.
> Terry Reedy<tjre...@udel.edu> writes:
>> Python iterators can do lazy evaluation. All the builtin classes come
>> with a corresponding iterator. ...
> I wouldn't say iterators do lazy evaluation in the Scheme or Haskell
> sense. Lazy evaluation imho means evaluation is deferred until you
> actually try to use the value, and when you use it, it is computed and
> kept around for later re-use (until it becomes gc-able). Python
> iterators simply generate one value at a time and leave retention of old
> values to be managed by the programmer.
Ok, I see the difference. You are talking about something like a memoized __getitem__ that computes and store values as needed.
>> There is nothing simulated about yield. Python mostly does what you
>> tell it to do. You just have to learn how to tell it to do what you
>> want.
> I'd be interested in seeing a clean implementation of that algorithm
> using python iterators.
I already wrote "The problem being discussed may be a case where [linked lists] are useful and make it easier to save space" -- because, from what I understood, a moving buffer of temporarily saved buffers is needed.
On Fri, 15 Jun 2012, Terry Reedy wrote:
> On 6/15/2012 1:03 PM, Tomasz Rola wrote:
> > Last time I checked, Python didn't have linked lists - arrayed lists are
> > nice, but their elements can't be automatically GC-ed (or, this requires
> > very nontrivial GC algorithm), the easiest way I can think would be
> > replacing them with None manually. I'm not sure if del is
> > performance-nice.
> Python does not come with a linked-list class, but one can easily use tuples
> or lists with two items as linked-list cells. One can optionally wrap such
> cell in a linked-list class. However, if array lists do the job, which they
> usually do, using linked-lists will take more time and space. The problem
> being discussed may be a case where they are useful and make it easier to save
> space.
Yes. I made linked lists in Python few years ago, just to test something. At the time Python was my main algorithmic toy, so I couldn't resist.
However, handling the list felt half Pascal-y and half unnatural. Later on, I felt I didn't like cramming everything into arrays and switched to something else.
> > Also, around the same time, Python couldn't do tail-call,
> Nonsense. A tail call is a call temporally followed by a return. In CPython
> bytecode, it would be a call followed by return. In Python code, it is a call
> spatially preceded by 'return'. Any "return f(whatever)", a common operation
> is a tail call.
Now I am intrigued because this code just worked a minute ago, on 2.6.
Given this was written for 2.4, I was wrong.
Definitely something I would like to experiment with a bit. The need for adding decorator takes some joy away but it is interesting.
> In practice, should_loop, A, and B will usually be in-line expressions rather
> than calls. There may be additional statements after if, else, and while
> headers. In while_equiv, move b=start into the body. Else is typically omitted
> from the while version, but I prefer it in the recursive version.
You see, I spend quite a lot of time playing with concepts etc. Code performance is nice to have, but I prefer to maximize my performance as I write something and move on. If I feel (from my own point of view) something would be nicer to write with recursion, so be it - even though
some Common Lisp manual advises to use loops because they are faster. Actually, from what I have tested, this not always is true - both recursion and loops were comparable speed wise in some simple cases I checked. This manual is a bit old, and new compiler had its own say about optimisation, maybe that's the truth behind it.
For cases where I really want speed, proper algorithm (if only I can think
it) and compilation rule. If algorithm codes better with loops, this is ok. But if it codes better with recursion, this should be ok too, because if I feel better while coding it, I make less errors.
> > Even more cool, with lazy evaluation (like in Haskell) one can generate
> > lists on a fly and process them like they were statically allocated.
> Python iterators can do lazy evaluation. All the builtin classes come with a
> corresponding iterator.
This is fine, but I don't mind having more, like the whole language supporting the idea of being lazy.
> > Yes, you could also run it in a loop or simulate lazy-eval manually (with
> > yield)
> There is nothing simulated about yield.
Yes, yield is real and not simulated. And one can use it to do tricks with how/when Python evaluates/generates/performs. It is nicer than if I had to write lazy code in, say, C or Pascal, but it doesn't mean one should only use one language rather than choose language according to the task,
> Python mostly does what you tell it to do. You just have to learn how to > tell it to do what you want.
Well I believe I have already learnt some of it. I am not using Python on a daily basis nowadays, and I am stuck somewhere in 2.x land. To stay in this semicurrent state I read this group and, from time to time, some shorter PEPs. So I think I can tell Python a thing or two, but at the same time I don't want to tell it everything, everytime. :-) I like telling things in a language that sounds better, which depends on what I tell, actually.
Regards,
Tomasz Rola
--
** A C programmer asked whether computer had Buddha's nature. **
** As the answer, master did "rm -rif" on the programmer's home **
** directory. And then the C programmer became enlightened... **
** **
** Tomasz Rola mailto:tomasz_r...@bigfoot.com **
> I'm planning to learn one more language with my python.
> Someone recommended to do Lisp or Clojure, but I don't think it's a
> good idea(do you?)
> So, I consider C# with ironpython or Java with Jython.
> It's a hard choice...I like Visual studio(because my first lang is VB6
> so I'm familiar with that)
> but maybe java would be more useful out of windows.
> what do you think?
Learn C and Python.
They work well together... you can write Python modules in C (using
Cython or CTypes), and the most popular implementation is written in C
(CPython).