Just how lazy is a Stream?

71 views
Skip to first unread message

Alexandru Nedelcu

unread,
Nov 8, 2012, 9:09:37 AM11/8/12
to scala...@googlegroups.com
So I wonder, what's the complexity of this algorithm:

  def kthElement(stream: Stream[Int], k: Int): Int = 
    stream.sorted.apply(k)


Dennis Haupt

unread,
Nov 8, 2012, 9:32:13 AM11/8/12
to Alexandru Nedelcu, scala...@googlegroups.com
the stream is evaluated during the sorted call

-------- Original-Nachricht --------
> Datum: Thu, 8 Nov 2012 06:09:37 -0800 (PST)
> Von: Alexandru Nedelcu <con...@alexn.org>
> An: scala...@googlegroups.com
> Betreff: [scala-user] Just how lazy is a Stream?

Daniel Sobral

unread,
Nov 8, 2012, 9:50:44 AM11/8/12
to Alexandru Nedelcu, scala-user
Stream's sorted is implemented by SeqLike, which delegates to java.util.Arrays.sort, which is guaranteed O(n*log n). Sorted's apply and various other peripheral operations are linear or constant, so the final complexity is n * log n. 


--
Daniel C. Sobral

I travel to the future all the time.

Johannes Rudolph

unread,
Nov 8, 2012, 9:53:42 AM11/8/12
to Alexandru Nedelcu, scala-user
Interesting question. I don't think `sorted` is not optimized to make
use of laziness. You could do this:

def quicksort(s: Stream[Int]): Stream[Int] = {
if (s.size < 2) s
else {
val (pivot #:: rest) = s
val (less, greater) = rest.partition(_ < pivot)
quicksort(less) append (pivot #:: quicksort(greater))
}
}
def kthElement(stream: Stream[Int], k: Int): Int =
quicksort(stream)(k)

which would try to execute only those parts of the quicksort needed to
answer the question.

However, Dennis is right, of course, that elements of the original
stream have to be evaluated at least once.

--
Johannes

-----------------------------------------------
Johannes Rudolph
http://virtual-void.net

Johannes Rudolph

unread,
Nov 8, 2012, 9:54:17 AM11/8/12
to Alexandru Nedelcu, scala-user
On Thu, Nov 8, 2012 at 3:53 PM, Johannes Rudolph
<johannes...@googlemail.com> wrote:
> I don't think `sorted` is not optimized to make
> use of laziness.

Make that "I think `sorted` is not optimized to make
use of laziness."

Alexandru Nedelcu

unread,
Nov 8, 2012, 9:59:44 AM11/8/12
to Johannes Rudolph, scala-user
Nice quickSort implementation.

So I was expecting it to do something like ... stream.min #::
stream.tail.filter(_ != min))
In which case the complexity would be O(kn). So if K is a small
constant, that would be O(n).

Why isn't the implementation overridden in the Stream class?
And how about views or iterators? Do they have the same behavior as Stream?

nicola...@gmail.com

unread,
Nov 8, 2012, 10:13:04 AM11/8/12
to Alexandru Nedelcu, Johannes Rudolph, scala-user
There are algorithms that can do that in expected O(n), like QuickSelect. 
The lazy quicksort above, though nice and clever,   sorts a bit too much to be as good as that, I think.
--
Sent from an IBM Model M, 15 August 1989.

Johannes Rudolph

unread,
Nov 8, 2012, 10:17:10 AM11/8/12
to nicola...@gmail.com, Alexandru Nedelcu, scala-user
On Thu, Nov 8, 2012 at 4:13 PM, nicola...@gmail.com
<nicola...@gmail.com> wrote:
> There are algorithms that can do that in expected O(n), like QuickSelect.
> The lazy quicksort above, though nice and clever, sorts a bit too much to
> be as good as that, I think.

Yep, the optimized version would be implemented as lazy IndexedSeqs
which would allow to just sort the branches on the path to the k-th
element.

Alexandru Nedelcu

unread,
Nov 8, 2012, 10:32:59 AM11/8/12
to Johannes Rudolph, scala-user
On Thu, Nov 8, 2012 at 5:17 PM, Johannes Rudolph
<johannes...@googlemail.com> wrote:
> Yep, the optimized version would be implemented as lazy IndexedSeqs
> which would allow to just sort the branches on the path to the k-th
> element.

Sorry for the dumb newbie questions :-)

How do you create a lazy IndexedSeq?
Is that like a Vector(something).view?

Would lazyness still help in that case, considering that you need to
evaluate the left side of a partitioning on each step (which
naturally, goes over the whole list)?

Johannes Rudolph

unread,
Nov 8, 2012, 11:10:06 AM11/8/12
to Alexandru Nedelcu, scala-user
On Thu, Nov 8, 2012 at 4:32 PM, Alexandru Nedelcu <con...@alexn.org> wrote:
> How do you create a lazy IndexedSeq?
> Is that like a Vector(something).view?

E.g. like this:

https://gist.github.com/4039558

There may be better ways, that's just a way I use often for creating
lazy indexed seqs.

Btw. the naive quicksort implementation here is not the best (maybe
even wrong on edge cases), so using a proper optimized sort over the
complete list may the still be faster than the naive implementation.
See info at wikipedia [1]

> Would lazyness still help in that case, considering that you need to
> evaluate the left side of a partitioning on each step (which
> naturally, goes over the whole list)?

I think the lazyness should be the same now for any k as with the
stream version for k = 0.

--
Johannes

[1] http://en.wikipedia.org/wiki/Selection_algorithm

Daniel Sobral

unread,
Nov 8, 2012, 11:11:22 AM11/8/12
to Johannes Rudolph, Alexandru Nedelcu, scala-user
This is flawed.

On Thu, Nov 8, 2012 at 12:53 PM, Johannes Rudolph <johannes...@googlemail.com> wrote:
On Thu, Nov 8, 2012 at 3:09 PM, Alexandru Nedelcu <con...@alexn.org> wrote:
> So I wonder, what's the complexity of this algorithm:
>
>   def kthElement(stream: Stream[Int], k: Int): Int =
>     stream.sorted.apply(k)

Interesting question. I don't think `sorted` is not optimized to make
use of laziness. You could do this:

def quicksort(s: Stream[Int]): Stream[Int] = {
           if (s.size < 2) s

Here! You just made s strict to compute is size.

Instead, s.tail.isEmpty should have been used.
 
           else {
           val (pivot #:: rest) = s
           val (less, greater) = rest.partition(_ < pivot)
           quicksort(less) append (pivot #:: quicksort(greater))
           }
     }
def kthElement(stream: Stream[Int], k: Int): Int =
  quicksort(stream)(k)

which would try to execute only those parts of the quicksort needed to
answer the question.

However, Dennis is right, of course, that elements of the original
stream have to be evaluated at least once.

--
Johannes

-----------------------------------------------
Johannes Rudolph
http://virtual-void.net

Alexandru Nedelcu

unread,
Nov 8, 2012, 11:15:34 AM11/8/12
to scala...@googlegroups.com, Alexandru Nedelcu, Johannes Rudolph
On Thursday, November 8, 2012 5:13:08 PM UTC+2, Nicolas Oury wrote:
There are algorithms that can do that in expected O(n), like QuickSelect. 
The lazy quicksort above, though nice and clever,   sorts a bit too much to be as good as that, I think.

Yeah, I know, thanks for the tip, but I was wondering about what can one do with lazy collections in algorithms.

Like, if you think about it, most QuickSort implementations developed without attention to picking the pivot are O(n ^ 2) for common use-cases (e.g. sorting an already sorted collection or one that's almost sorted), as that's what happens when you pick the head as the pivot. This can also happen with QuickSelect if you're not careful, the worst case being O(n ^ 2), like when you pick the head and the collection is in descending order with a small K, or when the collection is in ascending order and K is close to N.

With a lazy sorted() on a lazy list, you'd get simplicity of implementation, O(k) extra storage used and a guaranteed O(kn). And if K is a constant close to the edges (0 or N-1), then you can consider it O(n), because in case it's close to N-1, then you can just sort in descending order and pick (N-K+1).

Alexandru Nedelcu

unread,
Nov 8, 2012, 12:04:51 PM11/8/12
to scala...@googlegroups.com, Alexandru Nedelcu, Johannes Rudolph
Somebody on Haskell's mailing list claims that with a mergeSort as
provided by Data.List, the complexity would be O(n + k log n), while with
a quickSort it would be O(n + k log k). I wonder how that works.

[1] http://www.haskell.org/pipermail/haskell-cafe/2009-July/064111.html
Reply all
Reply to author
Forward
0 new messages