# Concat implementation without stack overflow

2252 views

Aug 15, 2015, 9:03:05 PM8/15/15
to Clojure Dev
After learning of the left-recursive concat stack overflow gotcha, or LeReConStOG as it is more commonly known, I figured I'd see if I could resolve it. I suspected that the process would be similar to one I hacked together to enable arbitrary-depth tail calls in Wombat-J (https://github.com/jondistad/wombat-j).

More information on the original issue is on Stuart Sierra's blog: http://stuartsierra.com/2015/04/26/clojure-donts-concat

I've posted a gist with the good bits here: https://gist.github.com/jondistad/2a4971fe8948ca2f6ba0

And attached a patch, if you'd like to try it out. It applies cleanly to clojure-1.7.0 and master (as of today anyway).

You can think of Concat as a special Cons, only for sequences. Each Concat cell has a head
and a tail. The head is the first sequence in the concatenation, and the tail is
everything that follows. It could be a vanilla sequence, another Concat, or nil (null).

The basic idea is that if a Concat object is nested within another, they both actually
represent the same depth. So, we can flatten the left-recursive concats out, rather than
expand them recursively.

The basic flattening out (extrusion) algorithm is:

`Extrude C:  let Head := C.Head  let Tail := C.Tail  while Head is a Concat object:    let Tail := Concat(Head.Tail, Tail)    let Head := Head.Head  return Concat(Head, Tail)`

And then attempting to get a sequence from the head of the extruded Concat:

`AttemptSeq C:  if Seq(C.Head) is not Null:    return Cons(Head.First, Concat(Head.Rest, Tail))  else:    return Null`

And finally the process which gets the real sequence value:

`Seq C:  let CExt := Extrude(C)  let S := AttemptSeq(CExt)  if S is Null: // The head is an empty sequence    while S is Null AND CExt.Tail is a Concat object:      let CExt := Extrude(CExt.Tail)      let S := AttemptSeq(CExt)    if S is Null: // We're either out of sequences, or CExt.Tail is a normal sequence      let S := Seq(SExt.Tail)  return S`

That last one is a bit awkward, but the general idea is you start with something like
this:

`(concat (concat my-seq nil) another-seq)`

and extrusion turns it into this:

`(concat my-seq (concat nil another-seq))`

And if the new Head is not empty or Null, then AttemptSeq gives you this:

`(cons (first my-seq) (concat (rest my-seq) (concat nil another-seq)))`

But what if we started with this:

`(concat (concat () my-seq) another-seq)`

Then extrusion gives us:

`(concat () (concat my-seq another-seq))`

So when we AttemptSeq, we get nil. But, we can ignore the head anyway because empty lists
are the concat identity. So we run extrude over the tail and get

`(concat my-seq another-seq) // There was no change this time`

And then run AttemptSeq on that and get

`(cons (first my-seq) (concat (rest my-seq) another-seq))`

The actual code ends up a bit more complicated because we want to cache the sequence
results rather than build everything from scratch every time, but it's essentially the
same.

TL;DR-
Without patch:
`user=> (time (count (reduce #(concat (list %2) %1) nil (range 10000000)))) ; right-recursive, so it works"Elapsed time: 11658.052987 msecs"10000000user=> (time (count (reduce #(concat %1 (list %2)) nil (range 10000000)))) ; BOOMStackOverflowError   clojure.lang.LazySeq.seq (LazySeq.java:49)`

With patch:

`user=> (time (count (reduce #(concat (list %2) %1) nil (range 10000000))))"Elapsed time: 5343.70402 msecs"10000000user=> (time (count (reduce #(concat %1 (list %2)) nil (range 10000000)))) ; Ta-da!"Elapsed time: 7497.264373 msecs"10000000`

I hope some of you found this interesting. I had fun implementing it!

Jon
0001-Define-clojure.lang.Concat.patch

### Stuart Halloway

Sep 11, 2015, 9:45:50 AM9/11/15
Hi Jon,

Are you aware of other languages that do this?

Stu

--
You received this message because you are subscribed to the Google Groups "Clojure Dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to clojure-dev...@googlegroups.com.
To post to this group, send email to cloju...@googlegroups.com.