Results:
C : 0.812 seconds
Python: 1.458 seconds.
difference = 0.646 seconds.
If the file.seek call is removed, the Python loop takes 2 ms, so the loop
overhead is minimal.
Without psyco the loop overhead is 4.6 ms and Python takes 1.866 seconds.
Any ideas what is causing the slowdown relative to the 'C' version?
In general, I have been trying to take a video application written in C++
and make it work in Python.
There seem to be delays in the handoff to Windows system code that
make the Python version just a touch slower in some areas, but these
slowdowns are critically affecting the work. File seek is not a deal
breaker here; it is just the latest thing I have noticed and the
simplest to demonstrate.
Python version:
import time

def main():
    # write temp file
    SIZE = 1000
    f1 = file('video.txt', 'wb')
    f1.write('+' * SIZE)
    f1.close()

    f1 = file('video.txt', 'rb')
    t0 = time.clock()
    for i in xrange(1000000):
        f1.seek(0)
    delta = time.clock() - t0
    print "%.3f" % delta
    f1.close()

if __name__ == '__main__':
    import psyco
    psyco.full()
    main()
// 'C' version
#include <stdio.h>
#include <string.h>   /* for memset */
#include <time.h>

#define SIZE 1000

static void main(int argc, char *argv[])
{
    FILE *f1;
    int i;
    int t0;
    float delta;
    char buffer[SIZE];

    // write temp file
    memset(buffer, (int)'+', SIZE);
    f1 = fopen("video.txt", "wb");
    fwrite(buffer, SIZE, 1, f1);
    fclose(f1);

    f1 = fopen("video.txt", "rb");
    t0 = clock();
    for (i = 0; i < 1000000; i++)
    {
        fseek(f1, 0, SEEK_SET);
    }
    delta = (float)(clock() - t0) / CLOCKS_PER_SEC;
    printf("%.3f\n", delta);
    fclose(f1);
}
> for i in xrange(1000000):
>     f1.seek(0)
But there is still a lot going on, some of which you can lift out of
the loop. The easiest I can think of is the lookup of the 'seek'
attribute on the f1 object. Try this:
    f1_seek = f1.seek
    for i in xrange(1000000):
        f1_seek(0)
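A quick way to compare the two forms is the timeit module (just a sketch --
it assumes the 'video.txt' file from your script already exists):

    import timeit

    # setup runs once per timing; each statement is then timed over 1,000,000 calls
    setup = "f1 = open('video.txt', 'rb'); f1_seek = f1.seek"
    print timeit.Timer('f1.seek(0)', setup).timeit(1000000)   # attribute lookup inside the loop
    print timeit.Timer('f1_seek(0)', setup).timeit(1000000)   # bound method hoisted out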
How does that help your timing?
-- Paul
As a side note, do you realize that this definition is invalid, in two
ways? "main" cannot be declared "static". The whole reason we use the
special name "main" is so that the startup code in the C run-time can link
to it. If "main" is static, it won't be exposed in the object file, and
the linker wouldn't be able to find it. It happens to work here because your C
compiler knows about "main" and discards the "static", but that's not a
good practice.
Further, it's not valid to have "main" return "void". The standards
require that it be declared as returning "int". Again, "void" happens to
work in VC++, but there are architectures where it does not.
--
Tim Roberts, ti...@probo.com
Providenza & Boekelheide, Inc.
> As it turns out each call is only
> 646 nanoseconds slower than 'C'.
> However, that is still 80% of the time to perform a file seek,
> which I would think is a relatively slow operation compared to just
> making a system call.
A seek may not be doing much beyond setting a current offset value.
It is likely that fseek(f1, 0, SEEK_SET) isn't even doing a system call.
An implementation of fseek will often return relatively quickly when
the position is within the current buffer -- from line 192 in
http://www.google.com/codesearch/p?hl=en#XAzRy8oK4zA/libc/stdio/fseek.c&q=fseek&sa=N&cd=1&ct=rc
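You can see the same effect from Python by comparing the buffered seek
against a raw lseek() system call -- a rough sketch along the lines of the
original test (it assumes 'video.txt' already exists; the numbers will vary
by platform and C library):

    import os
    import time

    f = open('video.txt', 'rb')               # buffered, stdio-style file object
    fd = os.open('video.txt', os.O_RDONLY)    # raw file descriptor

    t0 = time.clock()
    for i in xrange(1000000):
        f.seek(0)            # may stay inside the C library's buffer
    print "file.seek: %.3f" % (time.clock() - t0)

    t0 = time.clock()
    for i in xrange(1000000):
        os.lseek(fd, 0, 0)   # 0 == SEEK_SET; this is always a system call
    print "os.lseek : %.3f" % (time.clock() - t0)

    f.close()
    os.close(fd)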
Neil
Exactly. If I replace both calls to fseek with gettimeofday (aka
time.time() on my platform in Python), I get fairly close results:
$ ./testseek
4.120
$ python2.5 testseek.py
4.170
$ ./testseek
4.080
$ python2.5 testseek.py
4.130
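(For reference, the Python side of that variant is essentially the original
loop with the seek swapped out -- a minimal sketch; the iteration count in my
runs may differ:)

    import time

    N = 1000000              # iteration count; adjust to match the C side

    t0 = time.time()
    for i in xrange(N):
        time.time()          # stands in for f1.seek(0) / fseek()
    print "%.3f" % (time.time() - t0)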
FWIW, my results with fseek aren't as bad as those of the OP. This is
python2.5 on a 2.6.9 Linux OS, with psyco:
$ ./testseek
0.560
$ python2.5 testseek.py
0.750
$ ./testseek
0.570
$ python2.5 testseek.py
0.760
Ok, I ran some more tests.
C,      fseek              : 0.812 seconds   // test from original post
Python, f.seek             : 1.458 seconds   // test from original post
C,      time(&tm)          : 0.671 seconds
Python, time.time()        : 0.513 seconds
Python, ctypes msvcrt.time(ctypes.byref(tm)) : 0.971 seconds
        # the ctypes overhead was factored outside the loop, so the
        # timed call was really just func_ptr(ptr)
Perhaps I am just comparing apples to oranges.
I never tested the overhead of ctypes like this before.
Most of my problem timings involve calls through ctypes.
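For reference, the ctypes pattern in question looks roughly like this (a
sketch, Windows-only, assuming msvcrt's 32-bit time_t; the names are just
illustrative):

    import ctypes

    # do the library/function lookup and argument construction once, outside the loop
    libc = ctypes.cdll.msvcrt
    c_time = libc.time
    c_time.argtypes = [ctypes.POINTER(ctypes.c_long)]   # assumes 32-bit time_t
    c_time.restype = ctypes.c_long

    tm = ctypes.c_long(0)
    ptm = ctypes.byref(tm)        # pre-built byref argument, reused every call

    for i in xrange(1000000):
        c_time(ptm)               # per-iteration cost is just the ctypes call dispatch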
It is amazing to me that Cython generates a 'C' file that is 1478
lines.
#Cython code
import time

cdef int SEEK_SET = 0

cdef extern from "stdio.h":
    void* fopen(char* filename, char* mode)
    int fseek(void*, long, int)

def main():
    cdef void* f1 = fopen('video.txt', 'rb')
    cdef int i = 1000000
    t0 = time.clock()
    while i > 0:
        fseek(f1, 0, SEEK_SET)
        i -= 1
    delta = time.clock() - t0
    print "%.3f" % delta

if __name__ == '__main__':
    main()
regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
See PyCon Talks from Atlanta 2010 http://pycon.blip.tv/
Holden Web LLC http://www.holdenweb.com/
UPCOMING EVENTS: http://holdenweb.eventbrite.com/
Except that it's not a language but rather a syntax converter, i.e. it
doesn't really add any features to the C language but rather restricts
Python syntax to C language features (plus a bit of header file
introspection, it seems, but C's preprocessor has a bit of that, too).
Stefan
Well, it generated an optimised Python interface for your module and made
it compilable in CPython 2.3 through 3.2. It doesn't look like your C
module features that. ;)
> #Cython code
>
> import time
>
> cdef int SEEK_SET = 0
>
> cdef extern from "stdio.h":
>     void* fopen(char* filename, char* mode)
>     int fseek(void*, long, int)
Cython ships with a stdio.pxd that you can cimport. It looks like it
doesn't currently define fseek(), but it defines at least fopen() and FILE.
Patches are always welcome.
> def main():
>     cdef void* f1 = fopen('video.txt', 'rb')
>     cdef int i = 1000000
>     t0 = time.clock()
>     while i > 0:
>         fseek(f1, 0, SEEK_SET)
>         i -= 1
>     delta = time.clock() - t0
Note that the call to time.clock() takes some time, too, so it's not
surprising that this is slower than hand-written C code. Did you test how
it scales?
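(E.g. by timing the loop at a few different iteration counts -- a quick
sketch, using the plain Python seek loop as a stand-in and assuming the
video.txt test file exists:)

    import time

    def bench(n):
        # time n repeated seeks to the start of the file
        f = open('video.txt', 'rb')
        t0 = time.clock()
        for i in xrange(n):
            f.seek(0)
        delta = time.clock() - t0
        f.close()
        return delta

    for n in (10**5, 10**6, 10**7):
        print n, "%.3f" % bench(n)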
Also, did you look at the generated C code or the annotated Cython code
(cython -a)? Did you make sure both were compiled with the same CFLAGS?
Also, any reason you're not using a for-in-xrange loop? It shouldn't make a
difference in speed, it's just more common. You even used a for loop in
your C code.
Finally, I'm not sure why you think that these 30% matter at all. In your
original post, you even state that seek-time isn't the "deal breaker", so
maybe you should concentrate on the real issues?
Stefan
This is quite a stupid benchmark to write, since repeatedly seeking to 0
is a no-op. I haven't re-read the file object code recently, but chances
are that the Python file object has its own bookkeeping which adds a bit
of execution time.
But I would suggest measuring the performance of *actual* seeks to
different file offsets, before handwaving about the supposed "slowness"
of file seeks in Python.
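For example, something along these lines (a sketch; the offsets are
pre-generated so the random number generation isn't being timed):

    import random
    import time

    SIZE = 1000
    offsets = [random.randrange(SIZE) for i in xrange(1000000)]

    f = open('video.txt', 'rb')
    t0 = time.clock()
    for off in offsets:
        f.seek(off)              # seek to a different offset each iteration
    delta = time.clock() - t0
    print "%.3f" % delta
    f.close()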
Regards
Antoine.
I have used the profiler about as much as I can to find where my
program is slow, and it appears to me that
the overhead of calling 'C' functions is now my biggest problem.
I have been using ctypes, which has been a great tool so far.
I just discovered Cython, and this looks like it may help me.
I had not heard of pythoid, so I will check it out.
I did not mean to offend anybody in the Cython community.
It just seemed funny to me that 21 lines of Python became 1478 lines
of 'C'.
I wasn't really expecting any response to this.
I don't know enough about this to really assume anything.
Stefan,
I just tested 1e7 loops.
'C': 8.133 seconds
Cython: 10.812 seconds
I can't figure out what Cython is using for CFLAGS, so this could be
important.
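(One way to make the flags explicit is to build the module through a small
distutils script -- a sketch, with an illustrative module name:)

    # setup.py -- build the Cython module with explicit compiler flags
    from distutils.core import setup
    from distutils.extension import Extension
    from Cython.Distutils import build_ext

    setup(
        cmdclass={'build_ext': build_ext},
        ext_modules=[Extension('testseek', ['testseek.pyx'],
                               extra_compile_args=['-O2'])],  # or whatever flags the C test used
    )

Then "python setup.py build_ext --inplace" builds it, and the C version can
be compiled with the same flags for a fairer comparison.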
I used a while loop instead of for-in-xrange because I thought it would be
faster in Cython.
They had roughly the same execution speed.
Thanks all for the suggestions.
I think I will just consider this thread closed.