compile error

lesshaste

unread,

Dec 8, 2011, 6:44:16 AM12/8/11

to shedskin-discuss

I am trying to use shedskin version 0.9 for the first time (in a long
time). I have a simple toy python script which I attach at the
bottom. I saved it as test.py. These are the steps I carried out

shedskin test.py
make

it then fails with

"In file included from test.cpp:2:
/usr/local/lib/python2.6/dist-packages/shedskin/lib/itertools.hpp: In
function ‘__itertools__::izipiter<T, U>* __itertools__::izip(int,
__shedskin__::pyiter<T>*, __shedskin__::pyiter<B>*) [with T =
__shedskin__::str*, U = __shedskin__::str*]’:
test.cpp:110: instantiated from here
/usr/local/lib/python2.6/dist-packages/shedskin/lib/itertools.hpp:773:
error: no matching function for call to
‘__itertools__::izipiter<__shedskin__::str*,
__shedskin__::str*>::izipiter(__shedskin__::pyiter<__shedskin__::str*>*&,
__shedskin__::pyiter<__shedskin__::str*>*&)’
/usr/local/lib/python2.6/dist-packages/shedskin/lib/itertools.hpp:743:
note: candidates are: __itertools__::izipiter<T,
T>::izipiter(__shedskin__::pyiter<T>*) [with T = __shedskin__::str*]
/usr/local/lib/python2.6/dist-packages/shedskin/lib/itertools.hpp:740:
note: __itertools__::izipiter<T, T>::izipiter() [with
T = __shedskin__::str*]
/usr/local/lib/python2.6/dist-packages/shedskin/lib/itertools.hpp:687:
note: __itertools__::izipiter<__shedskin__::str*,
__shedskin__::str*>::izipiter(const
__itertools__::izipiter<__shedskin__::str*, __shedskin__::str*>&)
make: *** [test] Error 1
"

I am on ubuntu lucid.

Am I just compiling it wrong?
Raphael

--- full python source ---
#!/usr/bin/python

import math
import string
import random
import itertools

n = 20
l = int(math.floor(math.pow(27,3.0/4.0)))
sigma = int(math.floor(math.sqrt(n)))
assert(sigma < 10)

alphabet = (string.digits)[0:sigma]
#Create two random strings
def string_generator(size=l, chars=alphabet):
return ''.join(random.choice(chars) for x in range(size))

def hamdist(str1, str2):
"""Count the # of differences between equal length strings str1
and str2"""
diffs = 0
for ch1, ch2 in itertools.izip(str1, str2):
if ch1 != ch2:
diffs += 1
return diffs

unknown = string_generator()
pattern = string_generator(2*l-1)

#Now work out what the Hamming distance are
hds = []
for i in xrange(0,l):
hd = hamdist(unknown, pattern[i:l+i])
print hd,
hds.append(hd)

stringsleft =[''.join(y) for y in itertools.product(alphabet, repeat =
l)]

for i in xrange(0,l):
stringsleft = set(stringsleft).intersection(set([y for y in
stringsleft if hamdist(pattern[i:i+l], y) == hds[i]]))
print len(stringsleft)

刘振海

unread,

Dec 8, 2011, 9:12:50 AM12/8/11

to shedskin...@googlegroups.com

I tried it under msvc ,it fails too.

there is something wrong with the itertools.hpp

Regards

Liu Zhenhai

2011/12/8 lesshaste <drr...@gmail.com>

--
You received this message because you are subscribed to the Google Groups "shedskin-discuss" group.
To post to this group, send email to shedskin...@googlegroups.com.
To unsubscribe from this group, send email to shedskin-discu...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/shedskin-discuss?hl=en.

Jérémie Roquet

unread,

Dec 8, 2011, 9:55:53 AM12/8/11

to shedskin...@googlegroups.com

Hi,

2011/12/8 刘振海 <1989...@gmail.com>:

> 2011/12/8 lesshaste <drr...@gmail.com>
>>
>> I am trying to use shedskin version 0.9 for the first time (in a long
>> time). I have a simple toy python script which I attach at the
>> bottom. I saved it as test.py. These are the steps I carried out
>>
>> shedskin test.py
>> make
>>
>> it then fails with
>>

>> <snip>

>>
> I tried it under msvc ,it fails too.
> there is something wrong with the itertools.hpp

Fixed, waiting for the fix to be merged into mainline.

https://gitorious.org/shedskin/mainline/merge_requests/1

Thanks for reporting, best regards,

--
Jérémie

Mark Dufour

unread,

Dec 8, 2011, 3:23:36 PM12/8/11

to shedskin...@googlegroups.com

thanks guys! jeremie, I pushed your fix.

note that the program becomes much faster when using the more optimized 'zip' builtin here, and faster still by using something like the following:

for i in range(len(str1)):
if str1[i] != str2[i]:

still this makes it only marginally faster than the python program on my pc. there are a few things we could improve here and there, but there's no clear bottleneck at this point..

thanks again,
mark.

2011/12/8 Jérémie Roquet <arka...@gmail.com>

--
Jérémie

--
You received this message because you are subscribed to the Google Groups "shedskin-discuss" group.
To post to this group, send email to shedskin...@googlegroups.com.
To unsubscribe from this group, send email to shedskin-discu...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/shedskin-discuss?hl=en.

--
http://www.youtube.com/watch?v=E6LsfnBmdnk

srepmub

unread,

Dec 10, 2011, 5:55:44 AM12/10/11

to shedskin-discuss

hi guys,

I optimized str.__ne__ somewhat in GIT. after that, set(stringsleft)
was now clearly the slowest part of the program (and cpython is really
fast at this of course). but it looks like it is not needed, really..?
see below. now the compiled code (using shedskin -bw from GIT) runs in
about 6.5 seconds here, versus 14 seconds for the original program
using cpython. about 60% of the total time is now spent on the
following line (half in join, half in itertools.product):

stringsleft =[''.join(y) for y in itertools.product(alphabet, repeat =
l)]

so another minor optimization here could really improve things
further. looking into this next..

thanks for triggering!
mark.

srepmub@akemi:~/shedskin$ diff -u orig.py hup.py
--- orig.py 2011-12-10 11:25:22.247936948 +0100
+++ hup.py 2011-12-10 11:46:57.639881674 +0100

@@ -20,8 +20,8 @@

"""Count the # of differences between equal length strings str1
and str2"""
diffs = 0

- for ch1, ch2 in itertools.izip(str1, str2):
- if ch1 != ch2:
+ for i in range(len(str1)):
+ if str1[i] != str2[i]:

diffs += 1
return diffs

@@ -39,6 +39,5 @@
l)]

for i in xrange(0,l):
- stringsleft = set(stringsleft).intersection(set([y for y in
-stringsleft if hamdist(pattern[i:i+l], y) == hds[i]]))
+ stringsleft = set([y for y in stringsleft if hamdist(pattern[i:i

+l], y) == hds[i]])

print len(stringsleft)

On Dec 8, 9:23 pm, Mark Dufour <mark.duf...@gmail.com> wrote:
> thanks guys! jeremie, I pushed your fix.
>
> note that the program becomes much faster when using the more optimized
> 'zip' builtin here, and faster still by using something like the following:
>
> for i in range(len(str1)):
> if str1[i] != str2[i]:
>
> still this makes it only marginally faster than the python program on my
> pc. there are a few things we could improve here and there, but there's no
> clear bottleneck at this point..
>
> thanks again,
> mark.
>

> 2011/12/8 Jérémie Roquet <arkano...@gmail.com>
>
>
>
>
>
>
>
>
>
> > Hi,
>
> > 2011/12/8 刘振海 <1989l...@gmail.com>:

srepmub

unread,

Dec 10, 2011, 6:10:36 AM12/10/11

to shedskin-discuss

in fact, the elements of stringsleft are always unique, so it seems
there's no need for sets at all. just using lists makes the time go to
about 5.8 seconds.

srepmub

unread,

Dec 10, 2011, 7:37:16 AM12/10/11

to shedskin-discuss

after optimizing str.join a bit in GIT, itertools.product now takes
almost half the time.

jeremie, pre-allocating tuples as in the following makes it go a lot
faster. useful I think, because product is probably the most used
function in itertools, but it also seems applicable to more functions
in itertools (and small-object-allocation in shedskin in general
perhaps.. ;-))

example goes from 5.8 to 4.8 seconds with this. the 1000 is probably a
bit too much though. a value of 25 also makes the time go down to
about 5.0 seconds. what do you think?

thanks,
mark.

srepmub

unread,

Dec 10, 2011, 7:46:23 AM12/10/11

to shedskin-discuss

3.5 seconds :-)

if (this->iter.size()) {
- for (int i = 0; i < (int)this->iter.size(); ++i) {
+ int itersize = (int)this->iter.size();
+ tuple->units.resize(itersize);
+ for (int i = 0; i < itersize; ++i) {
int j = this->iter[i];
- tuple->units.push_back(this->values[j][this-
>indices[i]]);
+ tuple->units[i] = this->values[j][this->indices[i]];

srepmub

unread,

Dec 10, 2011, 8:08:22 AM12/10/11

to shedskin-discuss

(sorry, that's in productiter<T,T>::next)

srepmub

unread,

Dec 10, 2011, 2:13:49 PM12/10/11

to shedskin-discuss

it looks like I forgot to paste this in before. here we preallocate
tuples in batches, rather than one-by-one.

--- itertools.hpp 2011-12-10 14:17:56.875495121 +0100
+++ newiter.hpp 2011-12-10 13:20:39.355641798 +0100
@@ -1024,6 +1024,9 @@
std::vector<unsigned int> iter;
std::vector<unsigned int> indices;

+ tuple2<T, T> *__tuple_cache;
+ int __tuple_count;
+
productiter();

void push_iter(pyiter<T> *iterable);
@@ -1037,6 +1040,8 @@
template<class T> inline productiter<T, T>::productiter() {
this->exhausted = false;
this->highest_exhausted = 0;
+ this->__tuple_cache = new tuple2<T,T>[1000];
+ this->__tuple_count = 0;
}

template<class T> void productiter<T, T>::push_iter(pyiter<T>
*iterable) {
@@ -1079,7 +1084,11 @@
throw new StopIteration();
}

- tuple2<T, T> *tuple = new tuple2<T, T>;
+ tuple2<T, T> *tuple = &(this->__tuple_cache[this->__tuple_count+
+]);
+ if(this->__tuple_count == 1000) {
+ this->__tuple_count = 0;
+ this->__tuple_cache = new tuple2<T,T>[1000];
+ }

On Dec 10, 1:37 pm, srepmub <mark.duf...@gmail.com> wrote:

Jérémie Roquet

unread,

Dec 13, 2011, 2:01:43 PM12/13/11

to shedskin...@googlegroups.com

Hi Mark,

Sorry for the late and very short reply.

2011/12/10 srepmub <mark....@gmail.com>:

> jeremie, pre-allocating tuples as in the following makes it go a lot
> faster. useful I think, because product is probably the most used
> function in itertools, but it also seems applicable to more functions
> in itertools (and small-object-allocation in shedskin in general
> perhaps.. ;-))

Yeah, that's a great idea. Not only this is faster, but it also avoids
spending too much memory for the heap-admin, since the tuples are
small.

> example goes from 5.8 to 4.8 seconds with this. the 1000 is probably a
> bit too much though. a value of 25 also makes the time go down to
> about 5.0 seconds. what do you think?

I don't have time to set up any benchmark yet, but I guess the more
the better. Of course, it has to be a reasonable value, otherwise the
whole interest in using iterators is gone, but I don't think 1000 is
too much.
On the other hand, this may lead to some serious overhead if we are
combining small containers, so we should allocate std::min(1000, the
maximum number of tuples we'll actually need) and not 1000 (eg.
product('AB', 'CD') should allocate 4 tuples and not more).

2011/12/10 srepmub <mark....@gmail.com>:

> 3.5 seconds :-)
>
> if (this->iter.size()) {
> - for (int i = 0; i < (int)this->iter.size(); ++i) {
> + int itersize = (int)this->iter.size();
> + tuple->units.resize(itersize);
> + for (int i = 0; i < itersize; ++i) {
> int j = this->iter[i];
> - tuple->units.push_back(this->values[j][this-
>>indices[i]]);
> + tuple->units[i] = this->values[j][this->indices[i]];
> }

I wonder how using .reserve() instead would perform.

Best regards,

--
Jérémie

Mark Dufour

unread,

Dec 14, 2011, 11:06:14 AM12/14/11

to shedskin...@googlegroups.com

hi jeremie,

I don't have time to set up any benchmark yet, but I guess the more
the better. Of course, it has to be a reasonable value, otherwise the
whole interest in using iterators is gone, but I don't think 1000 is
too much.

I'm worried a bit about the effects on GC, since the GC probably has to hold on to all 1000 tuples until none of them is "reachable" anymore. the larger allocated chucks also, the bigger the chance of "bogus" pointers forcing the GC to hold on to them. from this angle, the smaller the better perhaps.. :-)

On the other hand, this may lead to some serious overhead if we are
combining small containers, so we should allocate std::min(1000, the
maximum number of tuples we'll actually need) and not 1000 (eg.
product('AB', 'CD') should allocate 4 tuples and not more).

good idea :-) I committed a fixed value of 25, though, for now. if you or anyone would like to look into this further (also not just for productiter<T,T>, and perhaps even not just for itertools), please be my guest.

I wonder how using .reserve() instead would perform.

reserve (in combination with push_back) seems to perform a tad slower, at least in this case.

thanks!
mark.
--
http://www.youtube.com/watch?v=E6LsfnBmdnk

srepmub

unread,

Dec 19, 2011, 8:05:39 AM12/19/11

to shedskin-discuss

in the meantime, I optimized productiter<T,U> as well.

I _think_ product can be quite a bit faster still, esp. in cases where
the iterator is consumed right away (as in a 'for .. in product(..)'
or list(product(..))), by adding some things to class productiter so
that FOR_IN can specialize things more aggressively (as it does right
now for iteration over most builtins).

this was actually jeremie's idea long ago, but implemented after his
contribution of itertools.. :-)

thanks,
mark.

Reply all

Reply to author

Forward