Python 3 port - all MySQL tests now pass on 2.6.7 and 3.2.2 with the same codebase


Ian Clelland

Dec 8, 2011, 6:39:53 PM
to django-d...@googlegroups.com
I now have Django passing its entire unit test suite with the MySQL and SQLite backends, on Python 2.6.7 and Python 3.2.2

Details:

Common environment:
OS X 10.6.8
MacPorts 2.0.3
MySQL 5.1.60 from MacPorts
SQLite 3.7.9 from MacPorts
Django from https://bitbucket.org/vinay.sajip/django/ hash 6b1413a9901a

Python 2:
Python 2.6.7 from MacPorts
MySQLdb 1.2.3 from MacPorts

Python 3:
Python 3.2.2 from MacPorts
MySQLdb-for-Python-3 from https://github.com/clelland/MySQL-for-Python-3 commit 5b7130f (should be the same as https://github.com/vsajip/MySQL-for-Python-3)

Results:
Python 2, MySQL:
Ran 4478 tests in 2935.590s
OK (skipped=83, expected failures=3)

Python 3, MySQL:
Ran 4432 tests in 3000.029s
OK (skipped=88, expected failures=2, unexpected successes=1)

Python 2, SQLite:
Ran 4478 tests in 449.281s
OK (skipped=91, expected failures=3)

Python 3, SQLite:
Ran 4432 tests in 446.825s
OK (skipped=96, expected failures=2, unexpected successes=1)

Thanks a lot to Vinay for his amazing work pulling this all together!

--
Regards,
Ian Clelland
<clel...@gmail.com>

Vinay Sajip

Dec 8, 2011, 8:56:29 PM
to Django developers

On Dec 8, 11:39 pm, Ian Clelland <clell...@gmail.com> wrote:
> I now have Django passing its entire unit test suite with the MySQL and
> SQLite backends, on Python 2.6.7 and Python 3.2.2

Ian,

Thanks for the comprehensive summary and eliminating those last few
issues on the MySQL backend. One more thing which might be useful to
have for comparison purposes is the test results when run on 2.x with
an unmodified Django (from the mirror at https://bitbucket.org/django/django/),
so that people can see what the potential performance penalty is for
calling u() and b() all over the place. I realise that it's not going
to be a scientific comparison, but it will probably show up if there
is appreciable overhead (I haven't found that to be the case, but it
might vary based on platforms and backends).
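
For readers unfamiliar with the helpers being discussed, here is a minimal sketch of what u(), b() and the sys.exc_info() idiom could look like in a single codebase. It is modelled on the common single-source compatibility pattern and is an assumption, not a copy of the code in the ported branch; describe_error is a hypothetical name used only for illustration.

```python
import sys

PY3 = sys.version_info[0] >= 3

if PY3:
    def u(s):
        """Text literal helper: a plain str is already text on 3.x."""
        return s

    def b(s):
        """Bytes literal helper: encode a str into bytes on 3.x."""
        return s.encode("latin-1")
else:
    def u(s):
        """Text literal helper: decode a byte string into unicode on 2.x."""
        return unicode(s, "unicode_escape")  # noqa: F821 (2.x only)

    def b(s):
        """Bytes literal helper: a plain str is already bytes on 2.x."""
        return s


def describe_error(func):
    """Call func and describe any exception it raises (hypothetical helper)."""
    try:
        func()
    except Exception:
        # "except Exception, e" does not parse on 3.x, and the
        # "except Exception as e" form was only added in Python 2.6,
        # so the active exception is fetched from sys.exc_info() instead.
        e = sys.exc_info()[1]
        return "%s: %s" % (type(e).__name__, e)
    return None
```

Each call adds one extra function invocation per literal, which is the overhead the timing comparison is meant to expose.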

I posted these from the tests (for the sqlite backend) run on my
Ubuntu VM, in response to a question from Armin Ronacher:

Django on 2.x unported: 4482 tests in 368.972s, skipped = 90
Django on 2.x, with u(), b(), sys.exc_info() etc: 4478 tests in
367.635s, skipped = 90
Django on 3.x, with u(), b(), sys.exc_info() etc: 4423 tests in
372.946s, skipped = 96

It would be interesting to see if a similar pattern emerges on OSX.

Regards,

Vinay Sajip

Ian Clelland

Dec 9, 2011, 11:17:13 AM
to django-d...@googlegroups.com
I upgraded to the latest trunk head, and re-ran the tests with the same Python 2.6 virtualenv.

Totally unscientific results:

SQLite:
Ran 4482 tests in 442.362s
OK (skipped=91, expected failures=3)

MySQL:
Ran 4482 tests in 2917.268s
OK (skipped=83, expected failures=3)
 
Unscientifically, trunk without the Python 3 patches runs 1.5% faster w/ SQLite, 0.6% faster w/ MySQL. (based on a sample size of 1 :) )
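
The percentages above can be reproduced from the reported Python 2 run times with a few lines of arithmetic. This sketch assumes the difference is taken relative to the patched (slower) run; the message does not say which denominator was used, and percent_faster is a hypothetical helper name.

```python
def percent_faster(slower, faster):
    """Relative saving of the faster run, as a percentage of the slower run."""
    return 100.0 * (slower - faster) / slower


# Python 2 wall-clock times in seconds: patched branch first, then trunk.
sqlite = percent_faster(449.281, 442.362)
mysql = percent_faster(2935.590, 2917.268)

print("SQLite: %.1f%%  MySQL: %.1f%%" % (sqlite, mysql))
```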


Tom Evans

Dec 9, 2011, 11:36:58 AM
to django-d...@googlegroups.com
On Fri, Dec 9, 2011 at 4:17 PM, Ian Clelland <clel...@gmail.com> wrote:
> Unscientifically, trunk without the Python 3 patches runs 1.5% faster w/
> SQLite, 0.6% faster w/ MySQL. (based on a sample size of 1 :) )
>

I know you put the word 'unscientifically' in there, but you can draw
no conclusions from doing one run of each like that. See my reply
earlier in the week on how to simply and easily do valid statistical
testing.

http://osdir.com/ml/django-developers/2011-12/msg00162.html

Cheers

Tom

Vinay Sajip

Dec 9, 2011, 1:17:19 PM
to Django developers

On Dec 9, 4:36 pm, Tom Evans <tevans...@googlemail.com> wrote:

> I know you put the word 'unscientifically' in there, but you can draw
> no conclusions from doing one run of each like that. See my reply
> earlier in the week on how to simply and easily do valid statistical
> testing.
>
> http://osdir.com/ml/django-developers/2011-12/msg00162.html
>

Hi Tom,

You're right, and you did offer to run some benchmarks if you were
pointed to some code. The Django changeset that Ian pointed to
(https://bitbucket.org/vinay.sajip/django/changeset/6b1413a9901a)
should be runnable without needing to set up a database backend like
MySQL or PostgreSQL. Is there any chance you could run those
benchmarks (on the regression test runs under 2.x and 3.x) and post
results?

Regards,

Vinay Sajip

Ian Clelland

Dec 9, 2011, 1:57:39 PM
to django-d...@googlegroups.com
I posted the results to have at least a single data point, as a "things don't appear to be wildly different" reassurance. Agreed that the confidence interval is probably big enough to drive a bus through, and that any claims of "faster" or "slower" are completely unjustified statistically*.

I'm just scripting my test runner to switch directories to do the a/b testing that you were suggesting earlier; perhaps I'll have some numbers to post in a couple of hours.

* Except of course, if I had phrased it as the true fact that my two test runs with Python 2.6 completed more quickly than my test runs with Python 3.2, which, perhaps, is what I should have done.
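
A runner that switches directories for interleaved a/b testing could be sketched roughly as follows. The checkout paths are hypothetical placeholders, the interleaving pattern is the one quoted later in this thread, and the sketch assumes the suite is started with runtests.py and a --settings option from each tree's tests directory.

```python
import subprocess
import sys
import time

# Interleaving pattern quoted later in the thread: a = trunk, b = patched branch.
PATTERN = "abababaaabbb"

# Hypothetical checkout locations; adjust to wherever the two trees live.
CHECKOUTS = {
    "a": "/path/to/django-trunk/tests",
    "b": "/path/to/django-py3k/tests",
}


def time_one_run(tests_dir, settings="test_sqlite"):
    """Run the suite once from tests_dir and return wall-clock seconds."""
    start = time.time()
    subprocess.call(
        [sys.executable, "runtests.py", "--settings=" + settings],
        cwd=tests_dir,
    )
    return time.time() - start


def run_schedule(pattern=PATTERN, runner=time_one_run):
    """Execute the runs in pattern order and group the timings by label."""
    timings = {label: [] for label in CHECKOUTS}
    for label in pattern:
        timings[label].append(runner(CHECKOUTS[label]))
    return timings
```

Calling run_schedule() would then give six timings per tree. Interleaving the runs spreads any drift in machine load across both trees rather than penalising one of them.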
 

Joe & Anne Tennies

Dec 9, 2011, 2:43:12 PM
to django-d...@googlegroups.com
The thing is, we aren't trying to produce "scientifically correct" statistics. What we're aiming to say is, "This is not so wildly different as to be of any concern." We aren't looking for minor differences, but orders of magnitude difference.

If you are that worried about a <2% difference in speed, you probably shouldn't be using Python... or at least Django. You should be finding the parts that you can optimize for your specific application and optimizing those. Python 3 *IS* the future. There isn't much way around that at this point. I believe the general idea is to make sure the solution at this point does not slow everything down to the point where it would be impossible for someone to switch to Python 3. Don't get me wrong, I don't want to see a 2% increase in timing all the time, but Python 3 support is a bullet that will HAVE to get bitten.

Also, I expect I could make bigger gains by just using that pure Python MySQL driver and running under PyPy, if the major part of the time weren't spent inside the actual database (which is fairly obviously the case, as switching from in-memory SQLite to MySQL is a >5x increase in time).


--
Joe & Anne Tennies
ten...@gmail.com

Ian Clelland

Dec 9, 2011, 9:04:46 PM
to django-d...@googlegroups.com
On Fri, Dec 9, 2011 at 11:43 AM, Joe & Anne Tennies <ten...@gmail.com> wrote:
> The thing is, we aren't trying to produce "scientifically correct" statistics. What we're aiming to say is, "This is not so wildly different as to be of any concern." We aren't looking for minor differences, but orders of magnitude difference.


Agreed.

But in the name of science (science!) I've run the a/b test that Tom suggested (with the abababaaabbb pattern, even), and these were the results:

          Trunk     Patches
Run #1   443.093    448.851
Run #2   440.845    445.338
Run #3   439.795    445.746
Run #4   437.751    462.278
Run #5   439.482    460.737
Run #6   436.606    461.509
                
Mean     439.595    454.077
Std Dev    2.288      8.245


(I won't speak as to whether all of these decimal places are warranted, but unittest reports milliseconds, so I'm sticking with three places all around)

All times are in seconds. This was tested on Python 2.6.7, and SQLite, against Django trunk from this morning (a), and Vinay's 3k-compatible branch from yesterday (b).
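
The summary rows of the table can be checked with the standard library. This is a small sketch assuming the "Std Dev" row is the sample standard deviation (n - 1 in the denominator), which is what statistics.stdev computes.

```python
import statistics

# Wall-clock times in seconds, copied from the table above.
trunk = [443.093, 440.845, 439.795, 437.751, 439.482, 436.606]
patches = [448.851, 445.338, 445.746, 462.278, 460.737, 461.509]

for name, runs in (("Trunk", trunk), ("Patches", patches)):
    print("%-8s mean=%.3f  stdev=%.3f"
          % (name, statistics.mean(runs), statistics.stdev(runs)))
```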

Tom Evans

Dec 12, 2011, 5:40:54 AM
to django-d...@googlegroups.com
On Fri, Dec 9, 2011 at 7:43 PM, Joe & Anne Tennies <ten...@gmail.com> wrote:
> The thing is, we aren't trying to produce "scientifically correct" statistics.
> What we're aiming to say is, "This is not so wildly different as to be of any
> concern." We aren't looking for minor differences, but orders of magnitude
> difference.
>
> If you are that worried about a <2% difference in speed, you probably
> shouldn't be using Python... or at least Django. You should be finding the
> parts that you can optimize for your specific application and optimizing
> those. Python 3 *IS* the future. There isn't much way around that at this
> point. I believe the general idea is to make sure the solution at this point
> does not slow everything down to the point where it would be impossible for
> someone to switch to Python 3. Don't get me wrong, I don't want to see a 2%
> increase in timing all the time, but Python 3 support is a bullet that will
> HAVE to get bitten.

Py3k is the future, but changing the DB adapter from a C library with a
Python wrapper to a pure Python adapter should merit some performance
testing of the new adapter, which is what I really wanted to test.

However, it is bad science (we should all consider ourselves
scientists) to do one run and say that performance hasn't changed.

Regression testing is not just about changes in test results; changes
should also not make the framework unnecessarily slow. The only way to
determine how much effect a changeset has is to benchmark it, and the
only way to benchmark it is scientifically, using statistics and to a
high degree of confidence.

We should only care about large changes in performance, but how do you
determine if something is a large change, without statistically valid
benchmarks?

>
> Also, I expect I could make bigger gains by just using that pure
> Python MySQL driver and running under PyPy, if the major part of the time
> weren't spent inside the actual database (which is fairly obviously
> the case, as switching from in-memory SQLite to MySQL is a >5x increase in
> time).
>

Statistics or shut up :) Only joking :)

How the pure python mysql driver performs compared to one built around
libmysqlclient is particularly interesting. Have you actually tried
this, or are you speculating?


Ian, thanks for running those tests. Running the numbers through
ministat tells us that the patches are slightly slower (3.2% ± 2.5%,
at a 99% confidence).
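
For anyone without ministat to hand, a comparison of this kind can be approximated in a few lines. The sketch below assumes a pooled-variance Student's t interval, which is my understanding of what ministat reports rather than something stated in this thread. The critical value 3.169 is the two-sided 99% value for 10 degrees of freedom from a standard t table, and pooled_difference is a hypothetical helper name.

```python
import math
import statistics

# Wall-clock times in seconds from Ian's a/b runs.
trunk = [443.093, 440.845, 439.795, 437.751, 439.482, 436.606]
patches = [448.851, 445.338, 445.746, 462.278, 460.737, 461.509]

# Two-sided 99% critical value of Student's t with 6 + 6 - 2 = 10 degrees
# of freedom, taken from a standard table (assumption: hard-coded here
# because the standard library has no t distribution).
T_CRIT_99_DF10 = 3.169


def pooled_difference(a, b, t_crit):
    """Return (difference in means, confidence margin) for b versus a."""
    na, nb = len(a), len(b)
    diff = statistics.mean(b) - statistics.mean(a)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    margin = t_crit * math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return diff, margin


diff, margin = pooled_difference(trunk, patches, T_CRIT_99_DF10)
base = statistics.mean(trunk)
print("difference: %.3fs +/- %.3fs (%.1f%% +/- %.1f%% of trunk)"
      % (diff, margin, 100 * diff / base, 100 * margin / base))
```

If the margin is smaller than the difference, the interval excludes zero and the slowdown is distinguishable from noise at that confidence level.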

Cheers

Tom
