links
srepmub  
From: srepmub <mark.duf...@gmail.com>
Date: Sun, 28 Nov 2010 04:43:32 -0800 (PST)
Local: Sun, Nov 28 2010 7:43 am
Subject: links
hi all,

I'd like to have a thread with links to blog postings and such where
shedskin is tested or compared with other implementations. to start
with, here's a very recent blog post, where the author tries some
different approaches to get his code to run faster:

http://geetduggal.wordpress.com/2010/11/25/speed-up-your-python-unlad...

shedskin goes from slowest (C++ doesn't like huge amounts of small
allocations!) to fastest (after manual C), after a few minor tweaks.
and it can be faster with -bw (see comments).

here's a recent comparison of mine against psyco. looks like next time
I will have to compare against pypy.. :)

http://shed-skin.blogspot.com/2010/05/shedskin-versus-psyco-for-131-p...

this comparison is perhaps my favourite though:

http://www.hxa.name/minilight/

while you won't often get a speedup of around 100 times, here's an
even larger one:

http://ianozsvald.com/2008/11/17/making-python-math-196-faster-with-s...

(though the largest measured speedup was around 300 times iirc, for the
Kanoodle example)

so, please add to this thread if you know of or encounter any other
interesting tests or comparisons!

thanks,
mark.


 
John Yeung  
From: John Yeung <gallium.arsen...@gmail.com>
Date: Sun, 28 Nov 2010 10:09:20 -0500
Local: Sun, Nov 28 2010 10:09 am
Subject: Re: links
Here is a very complimentary blog article.  It does have one hard
timing comparison, but it feels more like a general introduction to
Shed Skin for those who don't know about Shed Skin yet than an actual
benchmarking post.  (Which is still good, right?  I think this article
would have helped support the Wikipedia effort, for example; and could
be used for a future attempt to get into Wikipedia.)

I am surprised you didn't list this among your links already, because
the blog author mentions that he was contacted directly by Shed Skin's
author. ;)

http://www.korokithakis.net/node/117

John


 
Mark Dufour  
From: Mark Dufour <mark.duf...@gmail.com>
Date: Sun, 28 Nov 2010 16:16:43 +0100
Local: Sun, Nov 28 2010 10:16 am
Subject: Re: links

On Sun, Nov 28, 2010 at 4:09 PM, John Yeung <gallium.arsen...@gmail.com> wrote:

> Here is a very complimentary blog article.  It does have one hard
> timing comparison, but it feels more like a general introduction to
> Shed Skin for those who don't know about Shed Skin yet than an actual
> benchmarking post.  (Which is still good, right?  I think this article
> would have helped support the Wikipedia effort, for example; and could
> be used for a future attempt to get into Wikipedia.)

ah, yes, I forgot about this one, thanks! and good you mention wikipedia. if
anyone reading this might want to re-create a wikipedia page, here's a link
to the section in lutz's well known "learning python" book that describes
shedskin:

http://books.google.com/books?id=1HxWGezDZcgC&pg=PA31&lpg=PA31&dq=lea...

thanks again!!
mark.
--
http://www.youtube.com/watch?v=E6LsfnBmdnk


 
Jérémie Roquet  
From: Jérémie Roquet <arkano...@gmail.com>
Date: Sun, 28 Nov 2010 19:35:29 +0100
Local: Sun, Nov 28 2010 1:35 pm
Subject: Re: links
Hi,

2010/11/28 srepmub <mark.duf...@gmail.com>:

> I'd like to have a thread with links to blog postings and such where
> shedskin is tested or compared with other implementations.

OK, I'll post some as I get them… I've set a Google alert for shedskin ;-)

> to start
> with, here's a very recent blog post, where the author tries some
> different approaches to get his code to run faster:

> http://geetduggal.wordpress.com/2010/11/25/speed-up-your-python-unlad...

> shedskin goes from slowest (C++ doesn't like huge amounts of small
> allocations!) to fastest (after manual C), after a few minor tweaks.
> and it can be faster with -bw (see comments).

I'm very pleased a few tweaks were enough to make the shedskin version
faster than the others (especially the pypy one :p). I was very
surprised by the first benchmark's results.

I don't know what kind of allocation is responsible for the overhead
there, but I guess escape analysis would help for scalar types… and
maybe some kind of factory would help for more complex objects too (if
these objects are properly garbage collected).

> here's a recent comparison of mine against psyco. looks like next time
> I will have to compare against pypy.. :)

> http://shed-skin.blogspot.com/2010/05/shedskin-versus-psyco-for-131-p...

Would be awesome to have the comparison against pypy :-)

> this comparison is perhaps my favourite though:

> http://www.hxa.name/minilight/

I missed that one, so thanks for the link.
It makes me think that it would be interesting to have some benchmarks
with clang / dragonegg instead of gcc… especially when comparing to
pypy.

Best regards,

--
Jérémie


 
Mark Dufour  
From: Mark Dufour <mark.duf...@gmail.com>
Date: Mon, 29 Nov 2010 18:14:43 +0100
Local: Mon, Nov 29 2010 12:14 pm
Subject: Re: links

> I'm very pleased a few tweaks were enough to make the shedskin version
> faster than the others (especially the pypy one :p). I was very
> surprised by the first benchmark's results.

in the meantime, shedskin GIT has gone from 77 to about 46 seconds for the
original program. it turns out string comparison and slicing were both still
not very optimal.. :P

the strangeness of the benchmarks comes from the fact it's essentially a
single loop:

for x in range(200000000):
    s = somestr[x:x+3]
    if s == 'abc' or s == 'bcd' or s == 'cde':
        blah

so the crucial aspect is how does the memory subsystem handle all these
slices. it looks like pypy uses escape analysis to recycle the slices, or
perhaps it caches 3-length strings (shedskin only caches 256 1-length
strings at the moment). in any case, when I manually 'reuse' the same string
object, I get about the same speed as pypy.

> I don't know what kind of allocation is responsible for the overhead
> there, but I guess escape analysis would help for scalar types… and
> maybe some kind of factory would help for more complex objects too (if
> these objects are properly garbage collected).

yes, I think some scheme to allocate objects N at a time and recycling them
somehow can help a lot. but I think in this and other cases, it could also
help to rewrite the string class. dropping (indirection to) the STL helped a
lot for the set and dict classes as well. unused mutability of std::string
probably also costs us.
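
The "allocate N at a time and recycle" idea can be sketched in plain Python as a free-list pool. This is only an illustration under stated assumptions: `SliceRecord`, `SlicePool`, and the chunk size are hypothetical names, not part of Shed Skin, whose runtime is C++ on top of the Boehm GC.

```python
class SliceRecord:
    """A tiny mutable record standing in for a pooled runtime object (hypothetical)."""
    __slots__ = ("start", "end")

    def __init__(self):
        self.start = 0
        self.end = 0


class SlicePool:
    """Hand out records from a free list that is refilled a chunk at a time."""

    def __init__(self, chunk_size=1024):
        self.chunk_size = chunk_size
        self.allocated = 0      # total number of records ever created
        self._free = []

    def acquire(self, start, end):
        if not self._free:
            # Free list is empty: create a whole chunk of records at once.
            self._free.extend(SliceRecord() for _ in range(self.chunk_size))
            self.allocated += self.chunk_size
        rec = self._free.pop()
        rec.start, rec.end = start, end
        return rec

    def release(self, rec):
        # The caller promises not to touch rec after handing it back.
        self._free.append(rec)


pool = SlicePool(chunk_size=4)
for x in range(1000):
    rec = pool.acquire(x, x + 3)
    pool.release(rec)
print(pool.allocated)
```

In a loop like this one, where each record dies before the next is needed, the pool never grows past its first chunk. The hard part in a real runtime is knowing when `release` is safe, which this sketch leaves to the caller.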

> I missed that one, so thanks for the link.
> It makes me think that it would be interesting to have some benchmarks
> with clang / dragonegg instead of gcc… especially when comparing to
> pypy.

yes, definitely. I think there's quite a bit that can be optimized outside
of shedskin. it feels a bit weird to have to worry about escape analysis and
other memory optimizations, if Java has them out-of-the-box.. I would expect
the Boehm GC to get in the way though, since it overloads 'new', so that
it's impossible to follow what happens there. I also doubt they do much in
this area, because the naive output of shedskin is not that typical (looks
more like Java.. hm :-)).

thanks!
mark.
--
http://www.youtube.com/watch?v=E6LsfnBmdnk


 
Jérémie Roquet  
From: Jérémie Roquet <arkano...@gmail.com>
Date: Mon, 29 Nov 2010 18:52:37 +0100
Local: Mon, Nov 29 2010 12:52 pm
Subject: Re: links
2010/11/29 Mark Dufour <mark.duf...@gmail.com>:

> the strangeness of the benchmarks comes from the fact it's essentially a
> single loop:

> for x in range(200000000):
>     s = somestr[x:x+3]
>     if s == 'abc' or s == 'bcd' or s == 'cde':
>         blah

> so the crucial aspect is how does the memory subsystem handle all these
> slices. it looks like pypy uses escape analysis to recycle the slices, or
> perhaps it caches 3-length strings (shedskin only caches 256 1-length
> strings at the moment). in any case, when I manually 'reuse' the same string
> object, I get about the same speed as pypy.

Since Python strings are immutable, slices should never lead to memory
allocations, am I wrong?
This leads me to the next point:

>> I don't know what kind of allocation is responsible for the overhead
>> there, but I guess escape analysis would help for scalar types… and
>> maybe some kind of factory would help for more complex objects too (if
>> these objects are properly garbage collected).

> yes, I think some scheme to allocate objects N at a time and recycling them
> somehow can help a lot. but I think in this and other cases, it could also
> help to rewrite the string class. dropping (indirection to) the STL helped a
> lot for the set and dict classes as well. unused mutability of std::string
> probably also costs us.

Yes, I entirely agree. That's something I've already thought about
several times… std::string is of little help there, and it has a
serious overhead for several simple tasks:
 - slicing
 - copying
 - resizing to a shorter string
 - (maybe with some additional work) working with simple `char' where
strings are not needed (eg. for expressions like
  x = 'a'
or
  if foo[42] == 'z':
or even
  x = 'a'
  if foo[42] == x:
)

When the ticket for issue 74 (bz2) was opened, I wanted to write a
quick and dirty implementation. Unfortunately, the current internal
representation of strings prevents us from simply returning pointers
to arbitrary locations in the memory. Same problem for the `easy' task
`improve IO speed': being able to mmap files and return pointers on
the memory without any copying would help a lot.
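
To illustrate the mmap idea in plain Python (not in Shed Skin, whose support for this is not stated in the thread), here is a minimal sketch using the standard `mmap` module. The function name `count_in_file` and the demo file contents are assumptions made for the example. The search runs directly over the mapped file, without first reading it into a separate bytes object.

```python
import mmap
import os
import tempfile


def count_in_file(path, needle):
    """Count occurrences of needle (bytes) by searching the mapped file in place."""
    with open(path, 'rb') as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            count = 0
            pos = mm.find(needle)
            while pos != -1:
                count += 1
                pos = mm.find(needle, pos + 1)
            return count


# Small self-contained demo on a temporary file.
fd, demo_path = tempfile.mkstemp()
try:
    with os.fdopen(fd, 'wb') as out:
        out.write(b'abcabcab')
    demo_count = count_in_file(demo_path, b'abc')
finally:
    os.remove(demo_path)
print(demo_count)
```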

As you may already know, the CPython representation for strings is a
pair of pointers: one for the beginning of the string, the other for
the end. I think nothing is more efficient than this.
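
For reference, CPython itself does not appear to work this way: a CPython string object stores its length and its character data inline, and slicing copies the characters into a new object. A quick way to check, assuming CPython (`sys.getsizeof` is implementation-specific and the helper name `slice_is_copy` is made up for this sketch):

```python
import sys


def slice_is_copy(text, start, end):
    """Return True if text[start:end] is a separate object holding its own characters."""
    piece = text[start:end]
    # A view would cost a few dozen bytes however long the slice is;
    # a copy must be at least as large as the characters it holds.
    return piece is not text and sys.getsizeof(piece) >= (end - start)


big = "x" * 1_000_000
print(slice_is_copy(big, 0, 500_000))
```

A pair-of-pointers representation would make slices cheap, but it also keeps the whole original buffer alive for as long as any slice of it survives.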

Best regards,

--
Jérémie


 
Mark Dufour  
From: Mark Dufour <mark.duf...@gmail.com>
Date: Mon, 29 Nov 2010 20:30:08 +0100
Local: Mon, Nov 29 2010 2:30 pm
Subject: Re: links

> Since Python strings are immutable, slices should never lead to memory
> allocations, am I wrong?

well, if we keep everything copy-by-reference, we'd still have to store the
slice start and end somewhere.. but if we use copy-by-value, that would
avoid any allocations. though I never really liked copy-by-value. it's ugly
and complicates code generation in several ways..

> Yes, I entirely agree. That's something I've already thought about
> several times… std::string is of little help there, and it has a
> serious overhead for several simple tasks:
>  - slicing
>  - copying
>  - resizing to a shorter string
>  - (maybe with some additional work) working with simple `char' where
> strings are not needed (eg. for expressions like

for set and dict it turned out to be a very good idea to just do it the
cpython way.. I also like this approach because it gives the same behaviour,
so for example making it faster in cpython makes it faster with shedskin as
well.

>  x = 'a'
> or
>  if foo[42] == 'z':
> or even
>  x = 'a'
>  if foo[42] == x:
> )

shedskin currently caches all 256 possible 1-length strings, so indexing a
string for example doesn't cause an allocation, and working with
"characters" is really fast. that's also how I optimized geet's program for
shedskin, by simply comparing three characters instead of slicing and then
comparing.
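
A rough sketch of that rewrite, for readers who haven't seen the blog post. The original program isn't reproduced in this thread, so the function names, the loop bounds, and the use of a hit counter in place of 'blah' are assumptions. The first version takes a 3-character slice per position; the second only indexes single characters, which hit the 1-length string cache.

```python
def count_with_slices(text):
    """Shape of the original loop: one 3-character slice per position."""
    hits = 0
    for x in range(len(text) - 2):
        s = text[x:x + 3]
        if s == 'abc' or s == 'bcd' or s == 'cde':
            hits += 1
    return hits


def count_with_chars(text):
    """Same test, using only single-character indexing."""
    hits = 0
    for x in range(len(text) - 2):
        a, b, c = text[x], text[x + 1], text[x + 2]
        if ((a == 'a' and b == 'b' and c == 'c') or
                (a == 'b' and b == 'c' and c == 'd') or
                (a == 'c' and b == 'd' and c == 'e')):
            hits += 1
    return hits


sample = 'xabcdeabx' * 1000
assert count_with_slices(sample) == count_with_chars(sample)
```

Both functions return the same count; no timing is claimed here, since the relative speed depends on the implementation running them.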

I played a bit with an alternative 'char' type some time ago, but that ended
up complicating too many things, and because the current approach works
practically just as well in many cases, in the end that clearly wasn't worth
the effort.

> When the ticket for issue 74 (bz2) was opened, I wanted to write a
> quick and dirty implementation. Unfortunately, the current internal
> representation of strings prevents us from simply returning pointers
> to arbitrary locations in the memory. Same problem for the `easy' task
> `improve IO speed': being able to mmap files and return pointers on
> the memory without any copying would help a lot.

yeah, that's something to keep in mind. though the main problem with IO
speed is that currently everything happens per-character.

> As you may already know, the CPython representation for strings is a
> pair of pointers: one for the beginning of the string, the other for
> the end. I think nothing is more efficient than this.

no, but it sounds logical at this point.. ;) I'm guessing though that it
still allocates a new object for each slice, to contain these pointers..
what I'm hoping is that we can somehow efficiently manage such objects,
without needing copy-by-value.. it should at least be possible to allocate
many of them at once. deallocation and recycling may be a bit trickier.. :P
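
The "new object per slice, holding only the pointers" situation can be seen with Python 3's built-in `memoryview` over bytes. This is a stand-in chosen for illustration only; it says nothing about how Shed Skin's C++ string class is or will be built. Slicing a view copies no characters, but every slice is still a small object of its own.

```python
data = b'abcdefgh' * 4
view = memoryview(data)

piece = view[2:5]            # new memoryview object, no bytes copied
other = view[2:5]

print(piece.obj is data)     # the slice still refers to the original buffer
print(piece is not other)    # but each slice is a distinct object
print(piece.tobytes())       # characters are copied only when asked for
```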

thanks,
mark.
--
http://www.youtube.com/watch?v=E6LsfnBmdnk


 
srepmub  
From: srepmub <mark.duf...@gmail.com>
Date: Mon, 6 Dec 2010 08:07:32 -0800 (PST)
Local: Mon, Dec 6 2010 11:07 am
Subject: Re: links

> > here's a recent comparison of mine against psyco. looks like next time
> > I will have to compare against pypy.. :)

> > http://shed-skin.blogspot.com/2010/05/shedskin-versus-psyco-for-131-p...

> Would be awesome to have the comparison against pypy :-)

here's a comparison by greg copeland (haven't looked at the code yet):

https://www.blogger.com/comment.g?blogID=14063458&postID=142596991899...

thanks,
mark.


 
John Yeung  
From: John Yeung <gallium.arsen...@gmail.com>
Date: Sat, 11 Dec 2010 13:06:42 -0500
Local: Sat, Dec 11 2010 1:06 pm
Subject: Re: links
No benchmarks here, but I wanted to note that Shed Skin is listed in
the compiler section of "Python Implementations" on the Python Wiki:

http://wiki.python.org/moin/PythonImplementations

John


 
Mark Dufour  
From: Mark Dufour <mark.duf...@gmail.com>
Date: Tue, 24 May 2011 11:08:43 +0200
Local: Tues, May 24 2011 5:08 am
Subject: Re: links

hi,

an update to the 'links' thread - in addition to 'learning python', shed
skin is now also (very briefly) mentioned in 'python algorithms':

http://books.google.com/books?id=9_AXCmGDiz8C&pg=PA272&dq=python+shed...

thanks,
mark.

--
http://www.youtube.com/watch?v=E6LsfnBmdnk

 
Enzo  
From: Enzo <enzo.erb...@gmail.com>
Date: Tue, 24 May 2011 13:48:19 -0700 (PDT)
Local: Tues, May 24 2011 4:48 pm
Subject: Re: links
My congratulations!

It is very good that SS is gaining popularity, and even better to be
present in the literature!

Best regards.



 