Concat implementation without stack overflow

Jon Distad

unread,

Aug 15, 2015, 9:03:05 PM8/15/15

to Clojure Dev

After learning of the left-recursive concat stack overflow gotcha, or LeReConStOG as it is more commonly known, I figured I'd see if I could resolve it. I suspected that the process would be similar to one I hacked together to enable arbitrary-depth tail calls in Wombat-J (https://github.com/jondistad/wombat-j).

More information on the original issue is on Stuart Sierra's blog: http://stuartsierra.com/2015/04/26/clojure-donts-concat

I've posted a gist with the good bits here: https://gist.github.com/jondistad/2a4971fe8948ca2f6ba0

And attached a patch, if you'd like to try it out. It applies cleanly to clojure-1.7.0 and master (as of today anyway).

You can think of Concat as a special Cons, only for sequences. Each Concat cell has a head

and a tail. The head is the first sequence in the concatenation, and the tail is

everything that follows. It could be a vanilla sequence, another Concat, or nil (null).

The basic idea is that if a Concat object is nested within another, they both actually

represent the same depth. So, we can flatten the left-recursive concats out, rather than

expand them recursively.

The basic flattening out (extrusion) algorithm is:

Extrude C:
  let Head := C.Head
  let Tail := C.Tail
  while Head is a Concat object:
    let Tail := Concat(Head.Tail, Tail)
    let Head := Head.Head
  return Concat(Head, Tail)

And then attempting to get a sequence from the head of the extruded Concat:

AttemptSeq C:
  if Seq(C.Head) is not Null:
    return Cons(Head.First, Concat(Head.Rest, Tail))
  else:
    return Null

And finally the process which gets the real sequence value:

Seq C:
  let CExt := Extrude(C)
  let S := AttemptSeq(CExt)
  if S is Null: // The head is an empty sequence
    while S is Null AND CExt.Tail is a Concat object:
      let CExt := Extrude(CExt.Tail)
      let S := AttemptSeq(CExt)
    if S is Null: // We're either out of sequences, or CExt.Tail is a normal sequence
      let S := Seq(SExt.Tail)
  return S

That last one is a bit awkward, but the general idea is you start with something like

this:

(concat (concat my-seq nil) another-seq)

and extrusion turns it into this:

(concat my-seq (concat nil another-seq))

And if the new Head is not empty or Null, then AttemptSeq gives you this:

(cons (first my-seq) (concat (rest my-seq) (concat nil another-seq)))

But what if we started with this:

(concat (concat () my-seq) another-seq)

Then extrusion gives us:

(concat () (concat my-seq another-seq))

So when we AttemptSeq, we get nil. But, we can ignore the head anyway because empty lists

are the concat identity. So we run extrude over the tail and get

(concat my-seq another-seq) // There was no change this time

And then run AttemptSeq on that and get

(cons (first my-seq) (concat (rest my-seq) another-seq))

The actual code ends up a bit more complicated because we want to cache the sequence

results rather than build everything from scratch every time, but it's essentially the

same.

TL;DR-

Without patch:

user=> (time (count (reduce #(concat (list %2) %1) nil (range 10000000)))) ; right-recursive, so it works
"Elapsed time: 11658.052987 msecs"
10000000
user=> (time (count (reduce #(concat %1 (list %2)) nil (range 10000000)))) ; BOOM
StackOverflowError   clojure.lang.LazySeq.seq (LazySeq.java:49)

With patch:

user=> (time (count (reduce #(concat (list %2) %1) nil (range 10000000))))
"Elapsed time: 5343.70402 msecs"
10000000
user=> (time (count (reduce #(concat %1 (list %2)) nil (range 10000000)))) ; Ta-da!
"Elapsed time: 7497.264373 msecs"
10000000

I hope some of you found this interesting. I had fun implementing it!

Questions and comments are encouraged!

Jon

0001-Define-clojure.lang.Concat.patch

Stuart Halloway

unread,

Sep 11, 2015, 9:45:50 AM9/11/15

to cloju...@googlegroups.com

Hi Jon,

Are you aware of other languages that do this?

Stu

--
You received this message because you are subscribed to the Google Groups "Clojure Dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to clojure-dev...@googlegroups.com.
To post to this group, send email to cloju...@googlegroups.com.
Visit this group at http://groups.google.com/group/clojure-dev.
For more options, visit https://groups.google.com/d/optout.

Jon Distad

unread,

Sep 11, 2015, 5:34:29 PM9/11/15

to cloju...@googlegroups.com

Hi Stu!

I'm not aware of any other language that does this. There's a paper about building trampolines in scala which is similar (http://blog.higher-order.com/assets/trampolines.pdf), but its not in the core language.

You received this message because you are subscribed to a topic in the Google Groups "Clojure Dev" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/clojure-dev/ewBuyloeiFs/unsubscribe.
To unsubscribe from this group and all its topics, send an email to clojure-dev...@googlegroups.com.

Reply all

Reply to author

Forward