interesting side effect with Stream.continually

844 views
Skip to first unread message

Andy Coolware

unread,
May 8, 2013, 3:51:12 PM5/8/13
to scala-user
Hi,

In order to present the gotcha I run into I have two pieces of code:


var r=new scala.util.Random
val s=Seq(1,2,3)


Now I want to have an infinite stream of the same 3 elem tuples. I Initially wrote that:

Stream.continually(s.map(_*1000+r.nextInt(10))).flatten.toIterator

However, that iterator keeps producing random numbers. In order to stop it I had to break it down in two values

val ss=s.map(_*1000+r.nextInt(10))
Stream.continually(ss).flatten.toIterator


A way to visualize it would be to run:

Stream.continually(ss).flatten.toIterator.take(15).foreach(println)
Stream.continually(s.map(_*1000+r.nextInt(10))).flatten.toIterator.take(15).foreach(println)

I wonder why is that? Any idea.


Thx,
Andy

Ryan LeCompte

unread,
May 8, 2013, 4:00:26 PM5/8/13
to Andy Coolware, scala-user
Hi Andy,

The reason is that Stream.continually takes a by-name parameter. Each invocation of Stream.continually() will re-evaluate the by-name argument, thus re-executing your logic.

Ryan


--
You received this message because you are subscribed to the Google Groups "scala-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to scala-user+...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Andy Coolware

unread,
May 8, 2013, 8:48:45 PM5/8/13
to Ryan LeCompte, scala-user
Hi,

(shame) I did not know that such a thing like by-name parameter existed - now all is crystal clear.

Thx,
Andy

Seth Tisue

unread,
May 8, 2013, 8:53:04 PM5/8/13
to scala-user
If you want a Stream that just repeats the same values over and over again, Stream.continually is not the best way. It always produces a linear structure, never a circular structure. So instead of O(1) memory usage, you get O(n) memory usage.

Stream doesn't have a built-in method that will give you a true circular stream. Here's what I use to get one:

  implicit class RichStream[T](xs: Stream[T]) {
    def circular: Stream[T] = {
      lazy val circle: Stream[T] = xs #::: circle
      circle
    }
  }

To see that this is truly circular, we can insert a println:

scala> Stream.from(1).map{x => println(x); x}.take(3).circular.take(10).force
1
2
3
res2: scala.collection.immutable.Stream[Int] = Stream(1, 2, 3, 1, 2, 3, 1, 2, 3, 1)

By contrast,

scala> Stream.continually(Stream(1, 2, 3).map{x => println(x); x}).flatten.take(10).force
1
2
3
1
2
3
1
2
3
1
res6: scala.collection.immutable.Stream[Int] = Stream(1, 2, 3, 1, 2, 3, 1, 2, 3, 1)

Ryan LeCompte

unread,
May 9, 2013, 1:19:24 AM5/9/13
to Seth Tisue, scala-user
That's a nice solution, Seth. This is what I've typically done to produce a cyclic stream:

val cycle = for {
  _ <- Stream.continually(1)
  x <- List(1,2,3)
} yield x

println(cycle.take(15).mkString(","))
1,2,3,1,2,3,1,2,3,1,2,3,1,2,3

I believe in this case you would also avoid a memory leak since you're essentially not holding on to the head of the Stream, correct? It would be easy to test.

Ryan



Naftoli Gugenheim

unread,
May 9, 2013, 1:41:30 AM5/9/13
to Ryan LeCompte, Seth Tisue, scala-user
How can the head be GC'd when you have it in a val?

Ryan LeCompte

unread,
May 9, 2013, 1:59:46 AM5/9/13
to Naftoli Gugenheim, Seth Tisue, scala-user
You're correct. Here's a version that allows the stream to be GC'd:

def cycle[T](s: Seq[T]): Stream[T] = {
  for {
    _ <- Stream.continually(1)
    x <- s
  } yield x
}

cycle(List(1,2,3)).foreach { i => i / 2 }

Seth Tisue

unread,
May 9, 2013, 8:31:54 AM5/9/13
to scala-user
On Thu, May 9, 2013 at 1:19 AM, Ryan LeCompte <leco...@gmail.com> wrote:
I believe in this case you would also avoid a memory leak [...]

This is equivalent to the original continually/flatten solution. It produces an infinite structure in memory, not a finite circular structure.

If you don't hang on to the head, already-traversed portions of that infinite structure will become eligible for GC, yes. But you'll be constantly allocating and reclaiming memory during traversal.

The point of using a finite circular structure is that *even if you hang on to the head*, you only ever do a finite number of allocations.

Nils Kilden-Pedersen

unread,
May 9, 2013, 9:05:05 AM5/9/13
to Seth Tisue, scala-user
On Wed, May 8, 2013 at 7:53 PM, Seth Tisue <se...@tisue.net> wrote:
If you want a Stream that just repeats the same values over and over again, Stream.continually is not the best way. It always produces a linear structure, never a circular structure. So instead of O(1) memory usage, you get O(n) memory usage.

Stream doesn't have a built-in method that will give you a true circular stream. Here's what I use to get one:

  implicit class RichStream[T](xs: Stream[T]) {
    def circular: Stream[T] = {
      lazy val circle: Stream[T] = xs #::: circle
      circle
    }
  }

I don't think I've ever seen a `lazy val` inside a method. A recursive lazy val even. Does it do what I think it does?

 

To see that this is truly circular, we can insert a println:

scala> Stream.from(1).map{x => println(x); x}.take(3).circular.take(10).force
1
2
3
res2: scala.collection.immutable.Stream[Int] = Stream(1, 2, 3, 1, 2, 3, 1, 2, 3, 1)

By contrast,

scala> Stream.continually(Stream(1, 2, 3).map{x => println(x); x}).flatten.take(10).force
1
2
3
1
2
3
1
2
3
1
res6: scala.collection.immutable.Stream[Int] = Stream(1, 2, 3, 1, 2, 3, 1, 2, 3, 1)

Ryan LeCompte

unread,
May 9, 2013, 9:06:54 AM5/9/13
to Seth Tisue, scala-user
Ah, I finally understand the subtle point you're making. Very nice. Thanks!

Ryan


Seth Tisue

unread,
May 9, 2013, 9:44:32 AM5/9/13
to scala-user
On Thu, May 9, 2013 at 9:05 AM, Nils Kilden-Pedersen <nil...@gmail.com> wrote:

  implicit class RichStream[T](xs: Stream[T]) {
    def circular: Stream[T] = {
      lazy val circle: Stream[T] = xs #::: circle
      circle
    }
  }

I don't think I've ever seen a `lazy val` inside a method. A recursive lazy val even. Does it do what I think it does?

I don't know. Does it? ;-)

If you omit `lazy` here, you get this error: "error: forward reference extends over definition of value circle". The compiler is being overly picky, since the recursive reference is inside #:::'s by-name parameter, so it isn't evaluated until later. So in this case `lazy` just avoids the forward reference check, rather than adding any real meaning.

(see SLS, top of Chapter 4: "there is a restriction on forward references in blocks...")

It occurs to me just now that my code above can actually be shortened to just:

  implicit class RichStream[T](s: Stream[T]) {
    lazy val circular: Stream[T] = s #::: circular
  }

I tested this and it works the same.

Ryan LeCompte

unread,
May 9, 2013, 9:53:13 AM5/9/13
to Seth Tisue, scala-user
Could you shorten it even more to the following?

implicit class RichStream[T](s: Stream[T]) {
  def circular: Stream[T] = s #::: circular
}




Seth Tisue

unread,
May 9, 2013, 9:56:31 AM5/9/13
to scala-user
On Thu, May 9, 2013 at 9:53 AM, Ryan LeCompte <leco...@gmail.com> wrote:
Could you shorten it even more to the following?

implicit class RichStream[T](s: Stream[T]) {
  def circular: Stream[T] = s #::: circular
}

No because with "def" you get a new Stream cell each time, so it's not circular anymore.

Ryan LeCompte

unread,
May 9, 2013, 9:59:00 AM5/9/13
to Seth Tisue, scala-user
Ah interesting, I see. Because re-running your example I only see "1 2 3" printed once like in your original example.


Seth Tisue

unread,
May 9, 2013, 10:02:54 AM5/9/13
to scala-user
A lurking phantom points out that once the `val` is no longer inside a method body, the forward reference restriction from SLS chapter 4 no longer applies, so the `lazy` isn't necessary anymore and you can just write:

implicit class RichStream[T](s: Stream[T]) {
  val circular: Stream[T] = s #::: circular
}

Note however that if you have other methods in your RichStream class (as I do in the real code this is from), you'll want to keep the `lazy`, since you don't want Stream cells allocated every time you call one of those *other* methods...!

Ryan LeCompte

unread,
May 9, 2013, 10:11:10 AM5/9/13
to Seth Tisue, scala-user
Nice! Thanks, Seth. Adding this one to the toolbox. :)


Seth Tisue

unread,
May 9, 2013, 10:27:00 AM5/9/13
to scala-user
On Thu, May 9, 2013 at 9:59 AM, Ryan LeCompte <leco...@gmail.com> wrote:
 re-running your example I only see "1 2 3" printed once like in your original example.

The ".map{x => println(x); x}" trick doesn't detect whether you're wasting memory. It detects whether you're repeating computations.

With `def`, you only compute the values in the original Stream once, as the println experiment shows... but then you replicate those values infinitely throughout memory.

I did test the `def` version just now and it does in fact run out of memory:

scala> val s = Stream.from(1).take(3).circular
s: Stream[Int] = Stream(1, ?)

scala> s.foreach(() => _)
Exception in thread "main" 
Exception: java.lang.OutOfMemoryError

Whereas with `val` or `lazy val`, it runs forever no problem.

Ryan LeCompte

unread,
May 9, 2013, 10:34:29 AM5/9/13
to Seth Tisue, scala-user
I just ran the same experiment and arrived at the same conclusion. Thanks for following up. These are great things to keep in mind when using Scala's Stream. I tend to not have a need for lazy streams, but will definitely keep these subtle points in mind.

Ryan


pagoda_5b

unread,
May 9, 2013, 11:10:20 AM5/9/13
to scala...@googlegroups.com, Seth Tisue
On Thursday, May 9, 2013 3:53:13 PM UTC+2, Ryan LeCompte wrote:
Could you shorten it even more to the following?

implicit class RichStream[T](s: Stream[T]) {
  def circular: Stream[T] = s #::: circular
}


Shouldn't this blow up the stack eventually?

Ivano

Seth Tisue

unread,
May 9, 2013, 11:39:56 AM5/9/13
to scala...@googlegroups.com
No. It blows up the heap, as discussed elsewhere in this thread, but not the stack.

#:::'s parameter is by-name, so the above desugars to something like
  Stream.cons(s, () => circular(s)). 
The result is that the inner call to "circular" doesn't happen right away; it's deferred. A thunk goes on the heap containing the inner call, and somebody might force that thunk later, or they might not. Since the outer call returns before the inner call ever takes place, you never consume more than a constant amount of stack.

Terminology: we say that the recursive calls are "trampolined".

Nils Kilden-Pedersen

unread,
May 9, 2013, 11:43:52 AM5/9/13
to Seth Tisue, scala-user
On Thu, May 9, 2013 at 8:44 AM, Seth Tisue <se...@tisue.net> wrote:
On Thu, May 9, 2013 at 9:05 AM, Nils Kilden-Pedersen <nil...@gmail.com> wrote:

  implicit class RichStream[T](xs: Stream[T]) {
    def circular: Stream[T] = {
      lazy val circle: Stream[T] = xs #::: circle
      circle
    }
  }

I don't think I've ever seen a `lazy val` inside a method. A recursive lazy val even. Does it do what I think it does?

I don't know. Does it? ;-)

It was somewhat of a rhetorical question, because I don't get it, and I don't have time to think about it :-)

It will have to go into that mental box of "New stuff learned everyday with Scala" that I don't quite understand.

Maybe I should stop using the Scala mailing list as my ADD distraction?
 

If you omit `lazy` here, you get this error: "error: forward reference extends over definition of value circle". The compiler is being overly picky, since the recursive reference is inside #:::'s by-name parameter, so it isn't evaluated until later. So in this case `lazy` just avoids the forward reference check, rather than adding any real meaning.

(see SLS, top of Chapter 4: "there is a restriction on forward references in blocks...")

It occurs to me just now that my code above can actually be shortened to just:

  implicit class RichStream[T](s: Stream[T]) {
    lazy val circular: Stream[T] = s #::: circular
  }

I tested this and it works the same.

Som Snytt

unread,
May 9, 2013, 12:20:31 PM5/9/13
to Nils Kilden-Pedersen, Seth Tisue, scala-user
> (shame) I did not know that such a thing like by-name parameter existed

Andy's shame is our daily dose.  Thanks for the interesting thread, all.


> I don't have time to think about it

The point of being a lazy val is that you can think about it later.

The beauty of being a distracted procrastinator is that you'll never blow your stack or your heap, only your work day.

Ryan LeCompte

unread,
May 9, 2013, 12:25:12 PM5/9/13
to Som Snytt, Nils Kilden-Pedersen, Seth Tisue, scala-user
LMFAO!

Naftoli Gugenheim

unread,
May 10, 2013, 1:14:13 AM5/10/13
to Nils Kilden-Pedersen, Seth Tisue, scala-user
It may help to keep in mind that Stream, like List, is an abstract class with two concrete implementations, the empty singleton, and cons, which consists of a head element and a tail collection (in this case Stream.Empty and Stream.Cons).

Here is the source of Cons (with parts deemphasized):

  final class Cons[+A](hd: A, tl: => Stream[A]) extends Stream[A] with Serializable {
    override def isEmpty = false
    override def head = hd
    @volatile private[this] var tlVal: Stream[A] = _
    def tailDefined: Boolean = tlVal ne null
    override def tail: Stream[A] = {
      if (!tailDefined)
        synchronized {
          if (!tailDefined) tlVal = tl
        }
      tlVal
    }
  }

Note that tl is a call-by-name, which is like syntactic sugar (with type-level support) for a function. So we can rewrite it (oversimplified) as:

class Cons[+A](hd: A, tl: () => Stream[A]) {
  private var tlVal: Stream[A] = null
  def tail = {
    if(tlVal eq null) tlVal = tl()
    tlVar
  }
}

Cheap version of circular (since I haven't implemented #:::)

lazy val test = new Cons[Int](0, () => test)

Hardcoded:

class Circle {
  val head = 0
  def tail = {() => this}()
}


pagoda_5b

unread,
May 10, 2013, 4:53:11 AM5/10/13
to scala...@googlegroups.com
Many pieces now fitting together. 
This really clears up some confusion I had regarding calls that unexpectedly (for me) filled the heap.
Thanks a lot.

Ivano 
Reply all
Reply to author
Forward
0 new messages