My 12 yr son was doing these exercises:
https://developers.google.com/edu/python/exercises/basicfor his implementation of linear_merge he wrote:
def linear_merge(list1, list2):
return sorted(list1 + list2)
so I spent quite some time with him explaining that we can use the way the lists are presorted to write a more efficient algorithm which might run in linear time.
That was a great exercise and a lot of fun, BUT....
We finished writing the "manual python" linear_merge. It turns out that our solution (which used 2 list index pointers, rather than list.pop(0) like the suggested solution) is significantly faster that the suggested solution. HOWEVER, it is MUCH MUCH slower than my son's original one-liner using python's sorted() even for very large data sets.
I am not a python person, so I couldn't understand why, until I realised the underlying algorithm behind python's sorted() is timsort:
http://svn.python.org/view/python/trunk/Objects/listsort.txt?revision=69846&view=markupwhich actually: "has supernatural performance on many kinds of partially ordered arrays"
So because sorted()'s timsort algorithm is ideally suited to this type of problem with pre-sorted lists and is implemented in efficient C, there is probably no way of writing python code that will beat my sons' one-liner.
This is a classic case of "
premature optimisation is the root of all evil":
http://wiki.c2.com/?PrematureOptimizationIn other words.....write the one-liner (takes 1 minute)...test it with your data...
IF and ONLY IF it is too slow for your needs...try to find a faster way ( in this case the one-liner, is probably the shortest, simplest and fastest possible solution anyway!)
All good learning.