stack overflow with StreamT


Matthew Pocock

Jun 30, 2015, 9:16:18 PM
to sca...@googlegroups.com
Hi,
I've ported some code over from using Stream to StreamT[Need, A]. The test cases that work are vastly more efficient, and it does appear that the stream is as lazy as I'd hoped -- it really is only computing elements when I ask for them. However, I'm now getting a stack overflow error. It looks like this:

        at scalaz.Need$$anon$2$$anonfun$bind$2.apply(Name.scala:71)
        at scalaz.Need$$anon$4.value0$lzycompute(Name.scala:50)
        at scalaz.Need$$anon$4.value0(Name.scala:50)
        at scalaz.Need$$anon$4.value(Name.scala:51)
        at scalaz.Need$$anon$2$$anonfun$bind$2.apply(Name.scala:71)
        at scalaz.Need$$anon$4.value0$lzycompute(Name.scala:50)
        at scalaz.Need$$anon$4.value0(Name.scala:50)
        at scalaz.Need$$anon$4.value(Name.scala:51)
        at scalaz.Need$$anon$2$$anonfun$bind$2.apply(Name.scala:71)
        at scalaz.Need$$anon$4.value0$lzycompute(Name.scala:50)

This is repeated a silly number of times, occasionally taking a trip through my code. One def that it goes through is this one, but I thought I'd written it to work in constant space:

@tailrec
def lastModel[A, M](str: TrueStream[(A, M)], m0: M): M =
  if (str.isEmpty.value) m0 else lastModel(str.tail, str.head.value._2)
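For reference, this failure mode reproduces in miniature without any of my code. Here's a sketch with a toy `Thunk` type (a hypothetical stand-in for Need, not scalaz's real implementation): the driving loop never recurses, but each deferred `map` nests one more stack frame that is only unwound when the value is finally forced.

```scala
// Toy memoized thunk (hypothetical; not scalaz's Need).
final class Thunk[A](a: => A) {
  lazy val value: A = a
  // Forcing the result of map forces the receiver first, one frame deeper.
  def map[B](f: A => B): Thunk[B] = new Thunk(f(value))
}

// Building a chain of 100,000 deferred increments is cheap and loop-like...
val deep = (1 to 100000).foldLeft(new Thunk(0))((t, _) => t.map(_ + 1))

// ...but forcing it unwinds all the nested frames at once and blows the stack.
val blew =
  try { deep.value; false }
  catch { case _: StackOverflowError => true }
```

The repeated `lzycompute`/`value`/`bind` triplets in the trace above are exactly this unwinding, so a tail-recursive consumer doesn't help: the depth is in the forcing, not the loop.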

The full code is on GitHub at https://github.com/drdozer/consbol in the consbol sub-directory. You can reproduce the error with `sbt test`.

Any ideas?

--
Dr Matthew Pocock
Turing ate my hamster LTD

Integrative Bioinformatics Group, School of Computing Science, Newcastle University

skype: matthew.pocock
tel: (0191) 2566550

Matthew Pocock

Jul 2, 2015, 6:48:49 PM
to sca...@googlegroups.com
Hi,

I'm seeing this again, this time with calls to mappend for StreamT[Need, ...]. It seems to happen whenever I work with long StreamT instances.

M

Tony Morris

Jul 2, 2015, 10:52:42 PM
to sca...@googlegroups.com
What happens when you try to minimise the test case, in particular by making the data smaller? Point being, is this stack overflow due to some finite input that is too much for the stack, or to unbounded, non-terminating recursion?

From a quick perusal, I strongly suspect it is just a finite input that blows the stack. This is a common scenario in Scala, where you try to be neat and tidy (as you have done by exploiting lazy evaluation), only to find a gremlin pops up elsewhere (sorry, your stack is insufficient, have a nice day). If so, it typically comes down to deciding what you are willing to sacrifice to move forward.


Matthew Pocock

Jul 3, 2015, 6:01:53 AM
to sca...@googlegroups.com
Hi Tony,

I don't see these stack traces when the data is small. It only happens when the proof system I'm writing tries to generate very, very large proofs. This is almost always due to a bug where it isn't cutting correctly. However, with the stack being blown, I don't see where the real fault is. The structures should always be finite.

I rewrote the findLast operation using foldLeft, but another stack trace that looks almost the same still passes through findLast.

What I don't understand is why mappend would fail like this. I'd have thought it could be implemented by taking (lhs, rhs) and producing a StreamT that pops the next element off lhs until that's done, and then returns rhs. This shouldn't introduce any stack frames. The naive way to implement it would be to recurse through lhs and then prepend the elements to rhs, I guess.
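Sketching what I mean on a toy lazy stream (a hypothetical `LStream`, far simpler than StreamT over a monad, but the shape of the argument is the same):

```scala
// Toy lazy stream (hypothetical; not scalaz's StreamT).
sealed trait LStream[+A]
case object LNil extends LStream[Nothing]
final case class LCons[A](head: A, tail: () => LStream[A]) extends LStream[A]

def fromList[A](xs: List[A]): LStream[A] =
  xs.foldRight(LNil: LStream[A])((a, s) => LCons(a, () => s))

// Yields one element of lhs per step and defers the rest; once lhs is
// exhausted it hands back rhs directly. No frame is pushed until a
// consumer forces a tail, and each force does O(1) work.
def append[A](lhs: LStream[A], rhs: LStream[A]): LStream[A] = lhs match {
  case LNil        => rhs
  case LCons(h, t) => LCons(h, () => append(t(), rhs))
}

// An iterative consumer keeps the stack flat regardless of length.
def toList[A](s: LStream[A]): List[A] = {
  var cur  = s
  val out  = List.newBuilder[A]
  var done = false
  while (!done) cur match {
    case LNil        => done = true
    case LCons(h, t) => { out += h; cur = t() }
  }
  out.result()
}
```

With this shape, `toList(append(fromList(List(1, 2)), fromList(List(3, 4))))` walks the whole result in constant stack, which is the behaviour I'd have expected from mappend.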

Matthew

Matthew Pocock

Jul 16, 2015, 7:26:11 AM
to sca...@googlegroups.com
Hi,

I'm finding working with StreamT[Need, _] quite frustrating. The way it's implemented seems to build towers of Need instances that effectively chain myriads of point[Need].value calls. This kills performance and blows the stack. Is there any way to re-implement the Need operations so that, in the trivial case of stacked point[Need].value calls (or (_: Need[_]).value.point[Need] if that is easier), the boxing can be avoided?

My intuition is that these are equivalent:

n.value.point[Need].value.point[Need]
n.value.point[Need]

However, in the case that the Need isn't wrapping a pre-computed result, these are not equivalent, as the former forces the calculation:

n.value.point[Need]
n
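To make that concrete, here's a toy memo cell (a hypothetical stand-in for Need, not scalaz's implementation) with a run counter so the forcing behaviour is observable. `point` is deliberately strict in its argument, mirroring the eager forcing I'm describing.

```scala
// Counts how many times the underlying computation actually runs.
var runs = 0

// Toy memoized cell (hypothetical; not scalaz's Need).
final class Cell[A](a: => A) { lazy val value: A = a }

// Strict point: wrapping a value forces it now.
def point[A](a: A): Cell[A] = new Cell(a)

val n = new Cell({ runs += 1; 7 })
assert(runs == 0)                        // still lazy before any .value

val singly = point(n.value)              // n.value.point[Need]
val doubly = point(point(n.value).value) // n.value.point[Need].value.point[Need]
assert(runs == 1)                        // memoized: the thunk ran exactly once
assert(doubly.value == singly.value)     // so the two forms are equivalent

// But point(n.value) is NOT equivalent to plain n while n is unforced:
var ran = false
val m = new Cell({ ran = true; 9 })
point(m.value)                           // eagerly forces m...
assert(ran)                              // ...whereas plain m would stay lazy
```

The double wrap costs nothing beyond boxing, which is exactly why I'd hope the trivial re-wrapping case could be collapsed away.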

Matthew