lists2.py excercise - suggested solution is premature optimisation?

97 views
Skip to first unread message

Oliver Schönrock

unread,
Dec 22, 2016, 8:03:00 AM12/22/16
to Python GCU Forum
My 12 yr son was doing these exercises:

https://developers.google.com/edu/python/exercises/basic

for his implementation of linear_merge he wrote:

def linear_merge(list1, list2):
  return sorted(list1 + list2)

so I spent quite some time with him explaining that we can use the way the lists are presorted to write a more efficient algorithm which might run in linear time.
That was a great exercise and a lot of fun, BUT....

We finished writing the "manual python" linear_merge. It turns out that our solution (which used 2 list index pointers, rather than list.pop(0) like the suggested solution) is significantly faster that the suggested solution. HOWEVER, it is MUCH MUCH slower than my son's original one-liner using python's sorted() even for very large data sets.

I am not a python person, so I couldn't understand why, until I realised the underlying algorithm behind python's sorted() is timsort:
http://svn.python.org/view/python/trunk/Objects/listsort.txt?revision=69846&view=markup

which actually: "has supernatural performance on many kinds of partially ordered arrays"

So because sorted()'s timsort algorithm is ideally suited to this type of problem with pre-sorted lists and is implemented in efficient C, there is probably no way of writing python code that will beat my sons' one-liner.

This is a classic case of "premature optimisation is the root of all evil": http://wiki.c2.com/?PrematureOptimization

In other words.....write the one-liner (takes 1 minute)...test it with your data...

IF and ONLY IF it is too slow for your needs...try to find a faster way  ( in this case the one-liner, is probably the shortest, simplest and fastest possible solution anyway!)

All good learning.




Reply all
Reply to author
Forward
0 new messages