sort functions in python

t3chn0n3rd

unread,

Feb 8, 2008, 8:00:27 PM2/8/08

to

Do you think it is relatively easy to write sort algorithms such as
the common Bubble sort in Python as compared to other high level
programming langauges

Michael Spencer

unread,

Feb 8, 2008, 8:07:18 PM2/8/08

to pytho...@python.org

yes

Jeff Schwab

unread,

Feb 8, 2008, 8:55:29 PM2/8/08

to t3chn0n3rd

http://www.codecodex.com/wiki/index.php?title=Bubble_sort

Steven D'Aprano

unread,

Feb 8, 2008, 9:34:20 PM2/8/08

to

You realise that bubble sort is one of the worst possible sort algorithms
possible, particularly on modern hardware? It's not the worst: bogo sort
and bozo sort are worse, but bubble sort is still pretty atrocious.

http://en.wikipedia.org/wiki/Bogosort

Frankly, you won't get better performance in Python than the built in
list.sort() method and sorted() function provide. If you want to actually
sort, use them.

But for the sake of the learning exercise, yes, Python makes it easy to
write algorithms including bubble sort and quick sort and merge sort.

Here's a version of bubble sort, from Wikibooks:

def bubblesort(l):
"Sorts l in place and returns it."
for passesLeft in range(len(l)-1, 0, -1):
for index in range(passesLeft):
if l[index] > l[index + 1]:
l[index], l[index + 1] = l[index + 1], l[index]
return l

http://en.wikibooks.org/wiki/Algorithm_implementation/Sorting/Bubble_sort

--
Steven

Jeff Schwab

unread,

Feb 8, 2008, 10:09:06 PM2/8/08

to

Steven D'Aprano wrote:
> On Fri, 08 Feb 2008 17:00:27 -0800, t3chn0n3rd wrote:
>
>> Do you think it is relatively easy to write sort algorithms such as the
>> common Bubble sort in Python as compared to other high level programming
>> langauges
>
>
> You realise that bubble sort is one of the worst possible sort algorithms
> possible, particularly on modern hardware? It's not the worst: bogo sort
> and bozo sort are worse, but bubble sort is still pretty atrocious.

That's the conventional wisdom, mostly because CS professors need a "bad
algorithm" to show undergrads, but it's not really accurate. The truth
is that bubble-sort's performance depends strongly on the input data.
In the worst case, yes, bubble-sort is O(n^2); however, for
almost-sorted data (only a few elements out of place), bubble-sort is
actually one of the fastest known algorithms. For already-sorted data,
bubble-sort is O(n). If you expect your data to be pretty nearly sorted
already, but you just want to make sure (e.g. because a small number of
elements may have been inserted or removed since the last sort),
bubble-sort is a good choice.

Paddy

unread,

Feb 9, 2008, 1:13:05 AM2/9/08

to

Hi,
From a quick google, you could compare the sources here:
http://www.daniweb.com/code/snippet452.html
with those from here:
http://cg.scs.carleton.ca/~morin/misc/sortalg/

(It seems that sort routines are a favourite for showing off Java
applet capability).

- Paddy.
P.S. I also found this absolute gem: The intelligent design sort,
"works outside of time .."

http://www.dangermouse.net/esoteric/intelligentdesignsort.html

:-)

Carl Banks

unread,

Feb 9, 2008, 1:27:02 AM2/9/08

to

On Feb 8, 10:09 pm, Jeff Schwab <j...@schwabcenter.com> wrote:
> If you expect your data to be pretty nearly sorted
> already, but you just want to make sure (e.g. because a small number of
> elements may have been inserted or removed since the last sort),
> bubble-sort is a good choice.

But if you're at that stage you probably were doing something wrong in
the first place.

For a list of any decent size a few insertions using a bisection
algorithm will take fewer comparisons than a single bubblesort pass.

Carl Banks

Steven D'Aprano

unread,

Feb 9, 2008, 3:02:18 AM2/9/08

to

On Fri, 08 Feb 2008 19:09:06 -0800, Jeff Schwab wrote:

> Steven D'Aprano wrote:
>> On Fri, 08 Feb 2008 17:00:27 -0800, t3chn0n3rd wrote:
>>
>>> Do you think it is relatively easy to write sort algorithms such as
>>> the common Bubble sort in Python as compared to other high level
>>> programming langauges
>>
>>
>> You realise that bubble sort is one of the worst possible sort
>> algorithms possible, particularly on modern hardware? It's not the
>> worst: bogo sort and bozo sort are worse, but bubble sort is still
>> pretty atrocious.
>
> That's the conventional wisdom, mostly because CS professors need a "bad
> algorithm" to show undergrads, but it's not really accurate. The truth
> is that bubble-sort's performance depends strongly on the input data. In
> the worst case, yes, bubble-sort is O(n^2); however, for almost-sorted
> data (only a few elements out of place), bubble-sort is actually one of
> the fastest known algorithms.

It depends on what you mean by "bubble sort". There are many different
variations of bubble sort, that are sometimes described by names such as
comb sort, cocktail sort, exchange sort, and sometimes merely referred to
bubble sort. It's rather like any divide-and-conquer sorting algorithm
being called quicksort.

I'm as guilty of that as anyone else -- the example code I posted, raided
from Wikibooks is very different from this bubble sort algorithm from
PlanetMath:

http://planetmath.org/encyclopedia/Bubblesort.html

def bubblesort(L):
done = False
while not done:
done = True
for i in xrange(len(L)-1):
if L[i+1] <= L[i]:
L[i], L[i+1] = L[i+1], L[i]
done = False
return L

This one runs significantly faster than the earlier one I posted.

Another vital factor is, what language are you implementing the sort
routine in? The conventional wisdom for low-level languages like assembly
and C doesn't necessarily hold for high-level languages like Python.
Comparisons are slow in Python and fast in C; in C a good algorithm will
minimize the amount of moves, while in Python a good algorithm will
minimize the number of comparisons.

Finally, what hardware are you running on? There are interactions between
hardware caches and pipes that can ruin the performance of an otherwise
promising algorithm.

But all those factors aside, I fear that your attempt to rehabilitate the
reputation of bubble sort is misplaced. Here's another person who wants
to defend bubble sort:

"A fair number of algorithm purists (which means they've probably never
written software for a living) claim that the bubble sort should never be
used for any reason. Realistically, there isn't a noticeable performance
difference between the various sorts for 100 items or less, and the
simplicity of the bubble sort makes it attractive."

http://linux.wku.edu/~lamonml/algor/sort/bubble.html

But then on the same person's page on insertion sort:

"The insertion sort is a good middle-of-the-road choice for sorting lists
of a few thousand items or less. The algorithm is significantly simpler
than the shell sort, with only a small trade-off in efficiency. At the
same time, the insertion sort is over twice as fast as the bubble sort
and almost 40% faster than the selection sort."

http://linux.wku.edu/~lamonml/algor/sort/insertion.html

He gives sample C code for both, and the bubble sort has eight lines of
code, versus nine for insertion sort (excluding bare braces).

Perhaps what he should have said is:

"Bubble sort is a tiny bit simpler than insertion sort, and almost twice
as slow!"

--
Steven

Jeff Schwab

unread,

Feb 9, 2008, 4:29:13 PM2/9/08

to

So basically, you and I agree that the right sorting algorithm for the
job depends on the job. I have no particular interest in evangelizing
for bubble-sort; however, I hate to see an algorithm (or data structure,
or language, or programming style) taken out of the running before the
race even starts, just because it's out of fashion.

Jeff Schwab

unread,

Feb 9, 2008, 4:37:23 PM2/9/08

to

Carl Banks wrote:
> On Feb 8, 10:09 pm, Jeff Schwab <j...@schwabcenter.com> wrote:
>> If you expect your data to be pretty nearly sorted
>> already, but you just want to make sure (e.g. because a small number of
>> elements may have been inserted or removed since the last sort),
>> bubble-sort is a good choice.
>
> But if you're at that stage you probably were doing something wrong in
> the first place.

How do you figure? You don't always have control over the
insertions/replacements, or where they happen. As a silly example,
assume you have a sorted list of children, by height. Maybe you check
your list once per school semester. The new sort order for any given
semester will likely be very close to the previous order; however, a few
swaps may be in order, according to the different speeds at which
children have grown.

> For a list of any decent size a few insertions using a bisection
> algorithm will take fewer comparisons than a single bubblesort pass.

Do you mean that if newly inserted data are kept separate from the
already-sorted list, then they can be merged into the list in O(log N) time?

Steven D'Aprano

unread,

Feb 9, 2008, 5:10:05 PM2/9/08

to

On Sat, 09 Feb 2008 13:37:23 -0800, Jeff Schwab wrote:

> Carl Banks wrote:
>> On Feb 8, 10:09 pm, Jeff Schwab <j...@schwabcenter.com> wrote:
>>> If you expect your data to be pretty nearly sorted already, but you
>>> just want to make sure (e.g. because a small number of elements may
>>> have been inserted or removed since the last sort), bubble-sort is a
>>> good choice.
>>
>> But if you're at that stage you probably were doing something wrong in
>> the first place.
>
> How do you figure? You don't always have control over the
> insertions/replacements, or where they happen. As a silly example,
> assume you have a sorted list of children, by height. Maybe you check
> your list once per school semester. The new sort order for any given
> semester will likely be very close to the previous order; however, a few
> swaps may be in order, according to the different speeds at which
> children have grown.

You check their heights once per semester, but you're worried about an
extra ten or twenty microseconds to sort the data?

Frankly, if we're talking about sorting at the level of Python
application programming, nothing you write is going to beat the speed of
the built-in list.sort() and sorted() built-ins. Here's that bubble sort
from yesterday again:

def bubble(A):
# http://planetmath.org/encyclopedia/Bubblesort.html
N = len(A)-1

done = False
while not done:
done = True

for i in xrange(N):
if A[i+1] <= A[i]:
A[i], A[i+1] = A[i+1], A[i]
done = False
return A

and some speed tests:

>>> L = [1, 2, 3, 4, 6, 5, 7, 9, 8]
>>> min(timeit.Timer('bubble(L)',
... 'from __main__ import bubble, L').repeat(number=1000))
0.012052059173583984
>>> min(timeit.Timer('sorted(L)',
... 'from __main__ import L').repeat(number=1000))
0.0055558681488037109
>>> min(timeit.Timer('bubble(L)',
... 'from __main__ import bubble; L=range(5)').repeat(number=1000))
0.008541107177734375
>>> min(timeit.Timer('sorted(L)',
... 'L=range(5)').repeat(number=1000))
0.0046191215515136719

If you're writing code in a low-level language, or a high-level language
with no built-in sort (or a crappy one), your mileage may vary.

--
Steven

Hrvoje Niksic

unread,

Feb 9, 2008, 5:07:44 PM2/9/08

to

Steven D'Aprano <st...@REMOVE-THIS-cybersource.com.au> writes:

> It depends on what you mean by "bubble sort". There are many different
> variations of bubble sort, that are sometimes described by names such as
> comb sort, cocktail sort, exchange sort, and sometimes merely referred to
> bubble sort.

I've never seen anything better than bubble sort being called bubble
sort. Comb sort is definitely regarded as a different algorithm, and
cocktail sort is no better than bubble sort anyway.

> It's rather like any divide-and-conquer sorting algorithm being
> called quicksort.

Where have you seen this? I've never heard of non-quicksort
divide-and-conquer algorithms (such as mergesort) being referred to as
quicksort.

Jeff Schwab

unread,

Feb 9, 2008, 5:28:15 PM2/9/08

to

Steven D'Aprano wrote:
> On Sat, 09 Feb 2008 13:37:23 -0800, Jeff Schwab wrote:
>
>> Carl Banks wrote:
>>> On Feb 8, 10:09 pm, Jeff Schwab <j...@schwabcenter.com> wrote:
>>>> If you expect your data to be pretty nearly sorted already, but you
>>>> just want to make sure (e.g. because a small number of elements may
>>>> have been inserted or removed since the last sort), bubble-sort is a
>>>> good choice.
>>> But if you're at that stage you probably were doing something wrong in
>>> the first place.
>> How do you figure? You don't always have control over the
>> insertions/replacements, or where they happen. As a silly example,
>> assume you have a sorted list of children, by height. Maybe you check
>> your list once per school semester. The new sort order for any given
>> semester will likely be very close to the previous order; however, a few
>> swaps may be in order, according to the different speeds at which
>> children have grown.
>
> You check their heights once per semester, but you're worried about an
> extra ten or twenty microseconds to sort the data?

Are you serious?

The fact that you wouldn't use a Python script for this is what makes it
a "silly" example in the first place. The debate is whether Bubble Sort
is so bad that someone should get slapped upside the head for even
mentioning it. I think it's a tool, like any other, and that
occasionally it might be the right tool for the job. Imagine that
instead of children, we're keeping a sorted list of a very large number
of crystals that may grow at varying speeds, and that the sizes of the
individual crystals are polled at a rate limited by the time taken by
the sort algorithm.

> Frankly, if we're talking about sorting at the level of Python
> application programming, nothing you write is going to beat the speed of
> the built-in list.sort() and sorted() built-ins.

That's probably true with five-nines probability, but there may be
exceptions. Do you really not understand what I'm getting at, or do you
just disagree?

Steven D'Aprano

unread,

Feb 9, 2008, 8:42:53 PM2/9/08

to

On Sat, 09 Feb 2008 23:07:44 +0100, Hrvoje Niksic wrote:

> Steven D'Aprano <st...@REMOVE-THIS-cybersource.com.au> writes:
>
>> It depends on what you mean by "bubble sort". There are many different
>> variations of bubble sort, that are sometimes described by names such
>> as comb sort, cocktail sort, exchange sort, and sometimes merely
>> referred to bubble sort.
>
> I've never seen anything better than bubble sort being called bubble
> sort.

What, you didn't read the post you're replying to? There goes your
credibility, right out the window and accelerating fast. The very next

paragraph in my post said:

"I'm as guilty of that as anyone else -- the example code I posted,
raided from Wikibooks is very different from this bubble sort algorithm
from PlanetMath:"

So as you can see, there are at least two quite different sort
algorithms, with very different speeds and behaviours, and both are
called bubble sort by the people who put them on the web.

The Wikibooks version always makes the same number of passes over the
list, regardless of it's initial order. Even if the list is sorted, it
still walks over the list O(N**2) times, doing pointless comparisons over
and over again. It ends up doing a constant 1/2*N*(N-1) comparisons,
regardless of the state of the list, whether it is sorted, reversed or
something in between.

http://en.wikibooks.org/wiki/Algorithm_implementation/Sorting/Bubble_sort

The Planet Math version does not run a fixed number of times. If the list
is already sorted, it only walks it once, doing only (N-1) comparisons.
But if the list is far from sorted, it can end up doing N*(N-1)
comparisons.

http://planetmath.org/encyclopedia/Bubblesort.html

If we're going to call them both bubble sort, we might as well call
virtually any exchange sort (e.g. gnome sort, comb sort, cocktail sort,
insertion sort, etc.) "bubble sort". And that would be stupid.

Sure, they're all *similar* algorithms, with O(N**2) behaviour, but their
differences are significant: they differ in their best, worst and average
behaviour and the coefficient multiplier.

> Comb sort is definitely regarded as a different algorithm,

And so it should be.

> and cocktail sort is no better than bubble sort anyway.

Not according to Wikipedia:

http://en.wikipedia.org/wiki/Cocktail_sort

>> It's rather like any divide-and-conquer sorting algorithm being called
>> quicksort.
>
> Where have you seen this? I've never heard of non-quicksort
> divide-and-conquer algorithms (such as mergesort) being referred to as
> quicksort.

My apologies for not being more clear. What I meant was that lumping all
exchange-sorts (a family of algorithms) as "bubble sort" (a specific
example of the family) is as bogus as lumping all divide-and-conquer
algorithms together as "quicksort" would be, if anyone did it.

--
Steven

Carl Banks

unread,

Feb 9, 2008, 11:02:54 PM2/9/08

to

On Feb 9, 4:37 pm, Jeff Schwab <j...@schwabcenter.com> wrote:
> Carl Banks wrote:
> > On Feb 8, 10:09 pm, Jeff Schwab <j...@schwabcenter.com> wrote:
> >> If you expect your data to be pretty nearly sorted
> >> already, but you just want to make sure (e.g. because a small number of
> >> elements may have been inserted or removed since the last sort),
> >> bubble-sort is a good choice.
>
> > But if you're at that stage you probably were doing something wrong in
> > the first place.
>
> How do you figure?

"Probably."

:)

Carl Banks

Steven D'Aprano

unread,

Feb 10, 2008, 12:06:29 AM2/10/08

to

On Sat, 09 Feb 2008 14:28:15 -0800, Jeff Schwab wrote:

> Steven D'Aprano wrote:
>> On Sat, 09 Feb 2008 13:37:23 -0800, Jeff Schwab wrote:
>>
>>> Carl Banks wrote:
>>>> On Feb 8, 10:09 pm, Jeff Schwab <j...@schwabcenter.com> wrote:
>>>>> If you expect your data to be pretty nearly sorted already, but you
>>>>> just want to make sure (e.g. because a small number of elements may
>>>>> have been inserted or removed since the last sort), bubble-sort is a
>>>>> good choice.
>>>> But if you're at that stage you probably were doing something wrong
>>>> in the first place.
>>> How do you figure? You don't always have control over the
>>> insertions/replacements, or where they happen. As a silly example,
>>> assume you have a sorted list of children, by height. Maybe you check
>>> your list once per school semester. The new sort order for any given
>>> semester will likely be very close to the previous order; however, a
>>> few
>>> swaps may be in order, according to the different speeds at which
>>> children have grown.
>>
>> You check their heights once per semester, but you're worried about an
>> extra ten or twenty microseconds to sort the data?
>
> Are you serious?
>
> The fact that you wouldn't use a Python script for this is what makes it
> a "silly" example in the first place.

Okay, now you've broken my brain. Why on earth do you think Python isn't
suitable for processing a list of a few hundred, or even thousand, items?

> The debate is whether Bubble Sort
> is so bad that someone should get slapped upside the head for even
> mentioning it.

Perhaps we have a different opinion about what "slapped upside the head"
means. But if you care to point out precisely what it was that I -- or
anyone else -- said that could be classified as abusive to the original
poster for mentioning bubble sort, I'll be glad to apologize to him.

If not, perhaps you should chill out and stop accusing folks of bad
behaviour that didn't happen.

> I think it's a tool, like any other, and that
> occasionally it might be the right tool for the job.

Sure its a tool. So is a stone axe, and if I were worried about the
collapse of industrial civilization and the loss of metal-working skills,
then I might advocate that engineering students learn how to make stone
axes.

In the case of bubble sort, I think it's a perfectly fine algorithm to
play around with *as a learning exercise*.

> Imagine that
> instead of children, we're keeping a sorted list of a very large number
> of crystals that may grow at varying speeds, and that the sizes of the
> individual crystals are polled at a rate limited by the time taken by
> the sort algorithm.

Bubble sort doesn't degrade gracefully. On "a very large" set of data, if
your almost-sorted data is actually less sorted than you imagined, you
could see your expected 10 seconds of processing time blow out to 35
minutes.

I'd be more sympathetic to the idea of using bubble sort on tiny data
sets, where if it goes bad, nobody is going to care that 0.01 second
blows out to 3s. Who'd even notice?

But then, for such small amounts of data, why would you bother with
bubble sort when insertion sort and Shell sort exist?

>> Frankly, if we're talking about sorting at the level of Python
>> application programming, nothing you write is going to beat the speed
>> of the built-in list.sort() and sorted() built-ins.
>
> That's probably true with five-nines probability, but there may be
> exceptions. Do you really not understand what I'm getting at, or do you
> just disagree?

It's not that bubble sort has no positives, but that they're so minor,
and the negatives so serious, that I can't think of any situation where
I'd recommend using bubble sort in production code -- or take somebody
seriously if they suggested it, without a *whole* lot of actual timing
results backing them up.

--
Steven

Message has been deleted

Hrvoje Niksic

unread,

Feb 10, 2008, 4:33:53 AM2/10/08

to

Steven D'Aprano <st...@REMOVE-THIS-cybersource.com.au> writes:

>> I've never seen anything better than bubble sort being called
>> bubble sort.
>
> What, you didn't read the post you're replying to?

I responded to a specific point about combsort being called bubble
sort. I agree that generally the line between "bubblesort variants"
and "algorithms based on bubblesort" is blurry and drawn by convention
rather than any strict criterion.

Jeff Schwab

unread,

Feb 10, 2008, 1:43:40 PM2/10/08

to

Either you're joking, or we're definitely not getting anywhere. I think
we're through here. See you else-thread. No hard feelings.