"Lazy"-sort ?!


crippa

Sep 21, 1999

Hi,

I have an open question concerning sorting in general.

Is there a data structure with which it would be possible, or rather
fruitful, to do the sorting in a kind of lazy way?

In pseudo code it could be like:

A = my_lazy_ADT:new([5,2,4,1,3]) --> [1, UnsortedTail_1]
head(A) --> 1
B = tail(A) --> [2, UnsortedTail_2]

...and so on.


If we consider lists, maybe this is one kind of implementation:
(see further at http://www.erlang.org )

-module(lazy_list).

-export([new/1, head/1, tail/1]).

sort([]) -> [];
sort([H|T]) -> sort(H, T, [], [H]).

%% Find the minimum while keeping the rest of the elements around:
%% WithMin always holds the current minimum plus everything else seen.
sort(A, [B|T], WithoutMin, WithMin) when A < B ->
    sort(A, T, [B|WithoutMin], [B|WithMin]);
sort(A, [B|T], _WithoutMin, WithMin) when A >= B ->
    sort(B, T, WithMin, [B|WithMin]);
sort(A, [], WithoutMin, _WithMin) ->
    [A|WithoutMin].

new(L) when is_list(L) -> sort(L).

head([H|_]) -> H.

tail([_|T]) -> sort(T).


This will give O(t) for each tail/1 operation, where t is the number
of elements remaining at the time of the operation. And my guess for the
amortized cost is O(sqrt(n)) (correct me on that one). In other words, this
doesn't look too good, right? But maybe one could extend the implementation
above to perform some bubble sort (or similar) during the list traversal,
and after some number of tail/1 operations we would know that the list tail
is sorted.

Any thoughts about this ?

/Christofer

--------------------------------------------------------------------
Christofer Törnkvist Phone: +46 8 727 57 52
Ericsson Utveckling AB Fax: -
Software Architecture Lab. http://www-sarc.ericsson.se/public
S-125 25 Stockholm, Sweden Email: cri...@erix.ericsson.se.foolu
--------------------------------------------------------------------

Lars Lundgren

Sep 21, 1999

On Tue, 21 Sep 1999, crippa wrote:

> Hi,
>
> I have an open question concerning sorting in general.
>
> Is there a data structure with which it would be possible, or rather
> fruitful, to do the sorting in a kind of lazy way?
>

Yes, of course: try the list data structure '[a]' in Haskell, or any data
structure in Haskell for that matter.

Isn't it fun to write

minimum xs = head (sort xs)

and get an efficient implementation of minimum?
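
For instance, a textbook lazy quicksort makes this concrete (just a sketch;
`lazyMinimum` is an illustrative name, not a standard function):

```haskell
-- Lazy quicksort: the result list is produced incrementally, so
-- demanding only its head forces O(n) comparisons on average,
-- not the full O(n log n) sort.
qsort :: Ord a => [a] -> [a]
qsort []     = []
qsort (p:xs) = qsort [x | x <- xs, x < p] ++ [p] ++ qsort [x | x <- xs, x >= p]

-- "minimum" via sorting; laziness keeps it cheap.
lazyMinimum :: Ord a => [a] -> a
lazyMinimum = head . qsort
```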

/Lars L

George Russell

Sep 21, 1999

crippa wrote:
>
> Hi,
>
> I have an open question concerning sorting in general.
>
> Is there a data structure with which it would be possible, or rather
> fruitful, to do the sorting in a kind of lazy way?
Easy-peasy. You obviously need O(n) time to get the lowest item out,
since you need to check all the items. So the best you can hope for
is O(n) for the first one, and O(log n) for all the others.
So use heap sort or balanced trees, both of which manage that.
Can't be bothered to code them now though . . .
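
Well, one way the heap option could look in a lazy functional language (a
sketch of a standard leftist min-heap, my names throughout): building by
pairwise merging is O(n), and each further element of the sorted stream
then costs O(log n).

```haskell
-- Leftist min-heap sketch. fromList builds in O(n) by merging
-- singleton heaps pairwise; toSorted is a lazy stream, so the first
-- element costs O(n) and each later one O(log n).
data Heap a = E | N Int a (Heap a) (Heap a)

rank :: Heap a -> Int
rank E           = 0
rank (N r _ _ _) = r

-- Attach subtrees so the shorter right spine stays on the right.
make :: a -> Heap a -> Heap a -> Heap a
make v a b
  | rank a >= rank b = N (rank b + 1) v a b
  | otherwise        = N (rank a + 1) v b a

merge :: Ord a => Heap a -> Heap a -> Heap a
merge h E = h
merge E h = h
merge h1@(N _ x a1 b1) h2@(N _ y a2 b2)
  | x <= y    = make x a1 (merge b1 h2)
  | otherwise = make y a2 (merge h1 b2)

fromList :: Ord a => [a] -> Heap a
fromList [] = E
fromList xs = go (map (\x -> N 1 x E E) xs)
  where go [h] = h
        go hs  = go (pairs hs)
        pairs (a:b:rest) = merge a b : pairs rest
        pairs hs         = hs

toSorted :: Ord a => Heap a -> [a]
toSorted E           = []
toSorted (N _ x a b) = x : toSorted (merge a b)

heapSort :: Ord a => [a] -> [a]
heapSort = toSorted . fromList
```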

crippa

Sep 22, 1999

Ok, so there is nothing to gain here as far as complexity is
concerned. Only for small data sets might it be possible to save some
time, and that saving is probably too small to matter.

BTW: Using a heap, the first element is picked in O(n log n) as you
create (sort) the heap, and then the rest in O(log n) each.
The latter bound comes from the deleteMin operation.

Best Regards

Martin Norbäck

Sep 22, 1999

Wed, 22 Sep 1999 13:00:32 +0200, cri...@erix.ericsson.se (crippa) ->

> BTW: Using a heap, the first element is picked in O(n log n) as you
> create (sort) the heap, and then the rest in O(log n) each.
> The latter bound comes from the deleteMin operation.

You don't sort the elements when creating a heap. A heap only has a heap
ordering between the elements. You can actually build a heap in O(n) time.
The build function uses in-place updates, however.

Picking the first element is O(log n), so finding the first element
using a heap is O(n)+O(log n) = O(n).

(btw, finding the first element in a well coded merge sort is also O(n))

n.

--
[ http://www.dtek.chalmers.se/~d95mback/ ] [ PGP: 0x453504F1 ] [ UIN: 4439498 ]
Opinions expressed above are mine, and not those of my future employees.

crippa

Sep 22, 1999

Martin Norbäck wrote:
>
> Wed, 22 Sep 1999 13:00:32 +0200, cri...@erix.ericsson.se (crippa) ->
> > BTW: Using a heap, the first element is picked in O(n log n) as you
> > create (sort) the heap, and then the rest in O(log n) each.
> > The latter bound comes from the deleteMin operation.
>
> You don't sort the elements when creating a heap. A heap only has a heap
> ordering between the elements. You can actually build a heap in O(n) time.
> The build function uses in-place updates, however.

Are you saying that you can sort in O(n) using a heap ?!

> Picking the first element is O(log n), so finding the first element
> using a heap is O(n)+O(log n) = O(n).
>
> (btw, finding the first element in a well coded merge sort is also O(n))
>
> n.
>
> --
> [ http://www.dtek.chalmers.se/~d95mback/ ] [ PGP: 0x453504F1 ] [ UIN: 4439498 ]
> Opinions expressed above are mine, and not those of my future employees.

Best Regards

Martin Norbäck

Sep 22, 1999

Wed, 22 Sep 1999 16:39:48 +0200, cri...@erix.ericsson.se (crippa) ->

> > You don't sort the elements when creating a heap. A heap only has a heap
> > ordering between the elements. You can actually build a heap in O(n) time.
> > The build function uses in-place updates, however.
>
> Are you saying that you can sort in O(n) using a heap ?!

Nope. The heap takes O(n) to build, but to sort you need to remove n
elements at O(log n) a piece, giving you sorting in O(n log n).

Finding the smallest element can be done in O(n) though. Just build and
pop the top element.

John Prevost

Sep 22, 1999

crippa <cri...@erix.ericsson.se> writes:

> Are you saying that you can sort in O(n) using a heap ?!

> > Picking the first element is O(log n), so finding the first element
> > using a heap is O(n)+O(log n) = O(n).

No, he's saying that you can get the first element of a sort in O(n)
time. Compare with searching an entire list sequentially for the
minimum, also O(n) time. In fact, it's only because all subsequent
steps in the heap sort cost only O(log n) that you get O(n log n). If
that O(n) cost recurred at every step, you'd lose and get O(n^2).

The lazy win, of course, is that if you only want the first 5
elements, the cost is O(n) + O(5 log n) = O(n), so getting the first m
items of a sorted list is O(n + m log n) if m and n vary independently,
and O(n) if m is held constant. In any case, that means you didn't do
as much work as you would have to build the whole sorted representation
in memory.
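
This arithmetic can be tried directly with GHC's Data.List.sort, which is
lazy enough that taking only the first m elements does not pay the full
O(n log n) (a small illustration; `firstM` is my name for it):

```haskell
import Data.List (sort)

-- Take the m smallest elements; with a sufficiently lazy sort this
-- does roughly O(n + m log n) work instead of sorting everything.
firstM :: Ord a => Int -> [a] -> [a]
firstM m = take m . sort
```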

John.


Adam P. Jenkins

Sep 22, 1999

crippa <cri...@erix.ericsson.se> writes:

> Martin Norbäck wrote:
> > You don't sort the elements when creating a heap. A heap only has a heap
> > ordering between the elements. You can actually build a heap in O(n) time.
> > The build function uses in-place updates, however.
>
> Are you saying that you can sort in O(n) using a heap ?!

No. A heap doesn't store its elements in sorted order. The only
guarantee you get from a heap is that the greatest element in the heap
is always at the top. Creating the initial heap from an unsorted list
takes O(n) time. Each time you remove the greatest element, the heap
needs to be "reheapified", which takes O(log n) time. A heap-sort
works by first heapifying the list, which takes O(n) time, and then
removing the max element repeatedly until the list is empty, which
takes O(n log n) time. So heap-sort takes
O(n) + O(n log n) == O(n log n) time.

--
Adam P. Jenkins
ajen...@netway.com

crippa

Sep 23, 1999

Martin Norbäck wrote:
>
> Wed, 22 Sep 1999 16:39:48 +0200, cri...@erix.ericsson.se (crippa) ->
> > > You don't sort the elements when creating a heap. A heap only has a heap
> > > ordering between the elements. You can actually build a heap in O(n) time.
> > > The build function uses in-place updates, however.
> >
> > Are you saying that you can sort in O(n) using a heap ?!
>
> Nope. The heap takes O(n) to build, but to sort you need to remove n
> elements at O(log n) a piece, giving you sorting in O(n log n).
>
> Finding the smallest element can be done in O(n) though. Just build and
> pop the top element.
>
> n.
>
> --
> [ http://www.dtek.chalmers.se/~d95mback/ ] [ PGP: 0x453504F1 ] [ UIN: 4439498 ]
> Opinions expressed above are mine, and not those of my future employees.

Ok, I'm with you. Fascinating; I've never reflected upon this fact.
And more surprising is that no literature I have come across pinpoints
this (explicitly enough :-). To select a sorting method we cannot just
look at the O(.) complexity for time and space and the amount of data to
sort. We must also consider how we are going to use the sorted data.
Heap sort now seems to be a good, if not the best, choice for a
lazy-style sorting. Thanks for some good answers!

Jerzy Karczmarczuk

Sep 23, 1999

Christofer Törnkvist, commenting on Martin Norbäck:

> > > Are you saying that you can sort in O(n) using a heap ?!
> >
> > Nope. The heap takes O(n) to build, but to sort you need to remove n
> > elements at O(log n) a piece, giving you sorting in O(n log n).
> >
> > Finding the smallest element can be done in O(n) though. Just build and
> > pop the top element.
> >
> > n.

> Ok, I'm with you. Fascinating; I've never reflected upon this fact.
> And more surprising is that no literature I have come across pinpoints
> this (explicitly enough :-). To select a sorting method we cannot just
> look at the O(.) complexity for time and space and the amount of data to
> sort. We must also consider how we are going to use the sorted data.
> Heap sort now seems to be a good, if not the best, choice for a
> lazy-style sorting. Thanks for some good answers!

> /Christofer

Would you mind *showing* a (lazy functional) implementation of
the heapsort where the heap construction time is O(N)?
The construction plus the evaluation of the first element, of course.
I am probably missing some crucial trick here, but *please, don't
explain, just show the implementation*.


BTW. A week ago I posted some silly benchmarks, forgetting about the
laziness... I apologize. So, my numbers actually measured what some
of you wanted here - the access to the first element, and not the
overall sorting complexity.

The insertion sort is a good candidate: it is O(N), although we have
to pay first for the construction of the comparison/construction thunk
of length N, and then for its evaluation. But it beats quicksort,
and a binary treesort where the insertions proceed down from the
root. Full sorting is another story.

This treesort seems to beat qsort (very slightly, and for large N)
both for first-element access and for full sorting,
but if you don't like the number of reductions in Hugs as a reasonable
criterion, inshallah...

A well-implemented heapsort should be better, because the tree is
almost balanced (complete), and the percolation down for fixing the heap
is O(log N) only for the root and then decreases geometrically,
making the whole fix O(N); this is known theory, see e.g.

http://www.engr.orst.edu/~reichwja/cs261/lecture/priq/sld013.htm


but I still wait to see a good functional implementation, useful
e.g. for priority queues...
///No! Word of honour, this is not my homework!///

Jerzy Karczmarczuk
Caen, France.

crippa

Sep 23, 1999

Jerzy Karczmarczuk wrote:
>
> Christofer Törnkvist commenting Martin Norbäck:
>
> > > > Are you saying that you can sort in O(n) using a heap ?!
> > >
> > > Nope. The heap takes O(n) to build, but to sort you need to remove n
> > > elements at O(log n) a piece, giving you sorting in O(n log n).
> > >
> > > Finding the smallest element can be done in O(n) though. Just build and
> > > pop the top element.
> > >
> > > n.
>
> > Ok I'm with you. Fascinating, I've never reflected upon this fact.
> > And more surprising is that no litterature I come across pin-point this
> > (explicitly enough:-). To select a sorting method we may not just look
> > at the O(.) complexity for time and space and the amount of data to sort.
> > We must also consider how we are going to use the sorted data.
> > The heap sort now seems to be a good, if not the best, choice for a
> > lazy style sorting. Thanks for some good answers!
>
> > /Christofer
>
> Would you mind *showing* a (lazy functional) implementation of
> the heapsort where the heap construction time is O(N)?
> The construction plus the evaluation of the first element, of course.
> I am probably missing some crucial trick here, but *please, don't
> explain, just show the implementation*.

I guess the term "lazy" doesn't fit the heap sorting. Look at my
first posting and maybe you'll understand. By "lazy" I meant that at the
time you start picking the sorted elements you have not already paid
the O(n log n) time complexity. For ideas on how to implement this, see
the bible, "Purely Functional Data Structures" by Okasaki.

...
> Jerzy Karczmarczuk
> Caen, France.

Chris Okasaki

Sep 23, 1999

Jerzy Karczmarczuk <kar...@info.unicaen.fr> wrote:
>Would you mind *showing* a (lazy functional) implementation of
>the heapsort where the heap construction time is O(N)?
>The construction plus the evaluation of the first element, of course.
>I am probably missing some crucial trick here, but *please, don't
>explain, just show the implementation*.

See Richard Bird's "Functional Algorithm Design" in Science of
Computer Programming (Vol 26/Num 1-3), May 1996.

The essential ideas are as follows (I know this is explaining rather
than showing the implementation, contrary to your request, but
oh well...)

1. Take half the elements and create singleton trees.
2. Repeat until only a single tree remains:
   a. Take two trees and an element.
   b. Combine them into a new tree with the element at the root
      and the two trees as subtrees.
   c. While the element is larger than one of its children, sift it
      down to the right location.

If you maintain the trees in a FIFO queue, then at step 2a you will
always be combining the two smallest trees. If there are a total of
2^k-1 elements, then you will end up with a perfectly balanced
heap ordered tree. Otherwise, you will probably have to play a little
bit with the order in which you process the elements to maintain
whatever kind of balance criteria you require for your heap (probably
some form of Braun tree?).


If you use fancier kinds of heaps, then it is even easier to get the O(n)
time bound for heap construction. For example, for binomial heaps or
skew binomial heaps (or any other kind of heap where insertion takes O(1)
time), you can just insert all the elements, one by one. For leftist heaps
or skew heaps (or any other kind of heap where merge takes O(log n)
time), you first place all the elements in singleton heaps, and then
repeatedly merge the heaps in pairs until only a single heap remains.
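
The O(1)-insert recipe above can be sketched with a pairing heap (my own
rendering, not code from this post; deleteMin is O(log n) amortized):

```haskell
-- Pairing heap: insert is O(1), so n inserts build the heap in O(n).
data PHeap a = PE | PN a [PHeap a]

merge :: Ord a => PHeap a -> PHeap a -> PHeap a
merge h PE = h
merge PE h = h
merge h1@(PN x hs1) h2@(PN y hs2)
  | x <= y    = PN x (h2 : hs1)   -- smaller root wins; one comparison
  | otherwise = PN y (h1 : hs2)

insert :: Ord a => a -> PHeap a -> PHeap a
insert x = merge (PN x [])

fromList :: Ord a => [a] -> PHeap a   -- n O(1) inserts: O(n) build
fromList = foldr insert PE

findMin :: PHeap a -> a
findMin (PN x _) = x
findMin PE       = error "empty heap"

deleteMin :: Ord a => PHeap a -> PHeap a  -- O(log n) amortized
deleteMin PE        = PE
deleteMin (PN _ hs) = mergePairs hs
  where mergePairs (a:b:rest) = merge (merge a b) (mergePairs rest)
        mergePairs [h]        = h
        mergePairs []         = PE
```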

Chris


Jerzy Karczmarczuk

Sep 23, 1999

Chris Okasaki on heapsorts:

> See Richard Bird's "Functional Algorithm Design" in Science of
> Computer Programming (Vol 26/Num 1-3), May 1996.
>
> The essential ideas are as follows (I know this is explaining rather
> than showing the implementation, contrary to your request, but
> oh well...)

> 1. Take half the elements and create singleton trees.
> 2. Repeat until only a single tree remains:
>    a. Take two trees and an element.
>    b. Combine them into a new tree with the element at the root
>       and the two trees as subtrees.
>    c. While the element is larger than one of its children, sift it
>       down to the right location.

> If you maintain the trees in a FIFO queue, ...

etc.

Ok, Ok, I believe all of you. I might even add another quotation from
Chris Okasaki, his answer to Jon Fairbairn on the Haskell list,
October 1997:


* But the heapsort you give is nothing like the standard imperative
* heapsort! Yes, it uses a heap, but not the binary heap used by
* standard heapsort. Instead, it uses the functional equivalent
* of multi-pass pairing heaps[1]. Larry Paulson's "ML for the
* Working Programmer" includes a functional heapsort that is
* much closer in spirit to the standard imperative version, and
* so is probably superior for pedagogical purposes. (Although I expect
* that your version will be much faster in practice.)

* Chris

* [1] Fredman, Sedgewick, Sleator, and Tarjan.
* "The pairing heap: A new form of self-adjusting heap"
* Algorithmica 1(1):111-129, 1986.

=======================
Very nice.

...
But I would anyway like to see some nice, practical code in a lazy
functional language. Just for pleasure, without any ulterior motive.
Thank you.

Jerzy Karczmarczuk
Caen, France.

Lennart Augustsson

Oct 3, 1999

crippa <cri...@erix.ericsson.se> writes:

> Ok, I'm with you. Fascinating; I've never reflected upon this fact.
> And more surprising is that no literature I have come across pinpoints
> this (explicitly enough :-). To select a sorting method we cannot just
> look at the O(.) complexity for time and space and the amount of data to
> sort. We must also consider how we are going to use the sorted data.
> Heap sort now seems to be a good, if not the best, choice for a
> lazy-style sorting. Thanks for some good answers!

This fact about heap sort is mentioned in some algorithms books.
When I teach algorithms I always point out this fact about heap sort;
it's quite neat. Too bad heap sort does not perform so well in
absolute numbers.

--

-- Lennart Augustsson
[This non-signature is unintentionally not left unblank.]
