Lazy Sequence Results in Stack Overflow

436 views
Skip to first unread message

Charles Reese

unread,
Sep 23, 2015, 8:14:39 PM9/23/15
to Clojure
I want to compute a lazy sequence of primes.

Here is the interface:

    user=> (take 10 primes)
    (2 3 5 7 11 13 17 19 23 29)

So far, so good.

However, when I take 500 primes, this results in a stack overflow.

                      core.clj:  133  clojure.core/seq
                      core.clj: 2595  clojure.core/filter/fn
                  LazySeq.java:   40  clojure.lang.LazySeq/sval
                  LazySeq.java:   49  clojure.lang.LazySeq/seq
                       RT.java:  484  clojure.lang.RT/seq
                      core.clj:  133  clojure.core/seq
                      core.clj: 2626  clojure.core/take/fn
                  LazySeq.java:   40  clojure.lang.LazySeq/sval
                  LazySeq.java:   49  clojure.lang.LazySeq/seq
                     Cons.java:   39  clojure.lang.Cons/next
                  LazySeq.java:   81  clojure.lang.LazySeq/next
                       RT.java:  598  clojure.lang.RT/next
                      core.clj:   64  clojure.core/next
                      core.clj: 2856  clojure.core/dorun
                      core.clj: 2871  clojure.core/doall
                      core.clj: 2910  clojure.core/partition/fn
                  LazySeq.java:   40  clojure.lang.LazySeq/sval
                  LazySeq.java:   49  clojure.lang.LazySeq/seq
                       RT.java:  484  clojure.lang.RT/seq
                      core.clj:  133  clojure.core/seq
                      core.clj: 2551  clojure.core/map/fn
                  LazySeq.java:   40  clojure.lang.LazySeq/sval
                  LazySeq.java:   49  clojure.lang.LazySeq/seq
                       RT.java:  484  clojure.lang.RT/seq
                      core.clj:  133  clojure.core/seq
                      core.clj: 3973  clojure.core/interleave/fn
                  LazySeq.java:   40  clojure.lang.LazySeq/sval
                  
I'm wondering what is the problem here and, more generally, when working with lazy sequences, how should I approach this class of error?

Here is the code.

    (defn assoc-nth
      "Returns a lazy seq of coll, replacing every nth element by val

      Ex:
      user=> (assoc-nth [3 4 5 6 7 8 9 10] 2 nil)
      (3 nil 5 nil 7 nil 9 nil)
      "
      [coll n val]
      (apply concat
             (interleave
              (map #(take (dec n) %) (partition n coll)) (repeat [val]))))

    (defn sieve
      "Returns a lazy seq of primes by Eratosthenes' method

      Ex:
      user=> (take 4 (sieve (iterate inc 2)))
      (2 3 5 7)

      user=> (take 10 (sieve (iterate inc 2)))
      (2 3 5 7 11 13 17 19 23 29)
      "
      [s]
      (lazy-seq
       (if (seq s)
         (cons (first s) (sieve
                          (drop-while nil? (assoc-nth (rest s) (first s) nil))))
         [])))

    (def primes
      "Returns a lazy seq of primes

      Ex:
      user=> (take 10 primes)
      (2 3 5 7 11 13 17 19 23 29)
      "
      (concat [2] (sieve (filter odd? (iterate inc 3)))))

Colin Yates

unread,
Sep 23, 2015, 8:16:52 PM9/23/15
to clo...@googlegroups.com

This might help: http://stuartsierra.com/2015/08/25/clojure-donts-lazy-effects

--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clo...@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to
clojure+u...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
---
You received this message because you are subscribed to the Google Groups "Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email to clojure+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Gary Verhaegen

unread,
Sep 26, 2015, 3:08:32 AM9/26/15
to clo...@googlegroups.com
Primes are hard, because you essentially need to track all previous primes.

Here's one way to think about what you're doing here. First, you
create a lazy sequence of infinite numbers from 3 and up. This is done
by building a cons cell with 3 as the head and (iterate inc 4) as the
tail.

Then you filter that. From now on, every time you want the next
element, you'll have to run two functons: inc and odd?, plus some
extra processing for iterate and filter which both build up an
intermediate lazy seq.

Then you call sieve, which yet again builds up a new lazy seq that
accumulates more functions on its tail: assoc-nth, drop-while and a
recursive call to sieve. And assoc-nth itself still wraps a bunch of
other functions on your tail.

Now, the crucial part here is that most of these functions never go
away: for each new element that you compute, you add additional
function calls to each subsequent element. So that getting the 5th
element requires more additional elements than getting the 4th.

In some way, you have chosen to encode your list of previous primes as
accumulated functions. By the 500th prime, you have some much
accumulated functions that your call stack itself blows the stack.

With your implementation, on my computer, I only get about 300:

(defn num-primes
[prime-seq]
(try (doall prime-seq)
(catch StackOverflowError e
(count prime-seq))))

=> (time (num-primes primes))
"Elapsed time: 62385.706564 msecs"
325

A slightly more direct approach to the sieve (though arguably not a
true sieve), which has the same problem, in a more obvious way, would
be:

(defn filter-primes
[]
(let [helper (fn helper [n]
(lazy-seq (cons n (remove #(zero? (mod % n))
(helper (inc n))))))]
(helper 2)))

=> (time (max-prime (filter-primes)))
"Elapsed time: 472.078328 msecs"
251

Two things to note:

* I've turned the sequence into a defn instead of a def. This makes it
possible to not hold onto the head. Since we get a stack overflow
around 300 integers here, this is no biggie, but it's a good practice
when defining potentially very long lazy seqs.
* It should be much more clear here that each prime is adding a call
to remove: whereas the 2 gets returned directly, the 3 has to go
through a mod 2 operation, then a remove function has to check that it
returned false. Similarly, the 5 will, in this very naive
implementation, need to check that it is not divisible by either 2, 3,
or 4, so we now have a "stack" of three remove functions to go
through.

A closer implementation to yours would be:

(defn filter-primes2
[]
(let [sieve (fn [ls]
(let [p (first ls)]
(lazy-seq
(cons p (remove #(zero? (mod % p)) ls)))))]
(sieve (iterate inc 2))))

=> (time (num-primes (filter-primes2)))
"Elapsed time: 59021.93554 msecs"
325

However, storing "primes found so far" into a stack of functions is
not very efficient (in either memory or computation), and the sieve
only really shines when it's used on vectors of known length (because
then there's no division), not on lazy seqs of potentially infinite
length. Here's an attempt to combine the advantages of the sieve with
lazy seqs:

(defn array-sieve
[]
(let [sieve (fn [n]
(let [a (long-array n)]
(loop [idx 0]
(when (< idx n)
(aset-long a idx (+ 2 idx))
(recur (inc idx))))
(loop [idx 0]
(when (and (< idx n)
(not= 0 (aget a idx)))
(let [v (aget a idx)]
(loop [idx2 (+ idx v)]
(when (< idx2 n)
(aset-long a idx2 0)
(recur (+ v idx2))))))
(when (< idx n)
(recur (inc idx))))
(vec (remove zero? a))))
generator (fn [[prev-idx known-primes prev-prime]]
(let [next-idx (inc prev-idx)
known-primes (if (< next-idx (count known-primes))
known-primes
(loop [n (* 2 next-idx)]
(let [k (sieve n)]
(if (> (count k) next-idx)
k

(recur (* 2 n))))))
next-prime (get known-primes next-idx)]
[next-idx known-primes next-prime]))]
(->> [0 [2] 2]
(iterate generator)
(map last))))

=> (time (nth (array-sieve) (dec 100000)))
"Elapsed time: 13454.668724 msecs"
1299709

(You can check e.g. here that this is correct:
https://primes.utm.edu/nthprime/index.php#nth; the dec is there
because this website counts from 1, and nth from 0)

This is of course much more complex, but it's also much faster and
much more memory-efficient. There's probably a way to exploit the
current known primes in the computation of the next ones, too, but I
find it fast enough (and complex enough!) as it is.

Note that I do not construct a lazy seq myself anymore, and instead
rely on iterate. If you can keep a relatively small amount of state to
compute the next number in the sequence, I would generally recommend
using a similar approach: iterate a function of [[value state]] ->
[value state] and then map first over it. Here's another example of
this technique, for computing Fibonacci numbers:

(defn fib
[]
(->> [0M 1M]
(iterate (fn [[a b]] [b (+ a b)]))
(map first)))

=> (time (nth (fib) 50000))
"Elapsed time: 283.99525 msecs"
10777734893072974780279...8305364252373553125M

(For Fibonacci numbers, I start out with BigNums because, well, it
overflows pretty quickly otherwise.)

William la Forge

unread,
Sep 26, 2015, 9:12:19 PM9/26/15
to Clojure
Why not turn this problem on its head? Have a vector of lists of prime factors for each i. March through this vector and for each prime factor in the current list, add that prime factor p to list i + p. And if i has no prime factors, then it is a prime itself and add it to the list at 2 * i.

So the only test for prime is if i has no prime factors as determined by propagating the prime factors forward, not through a mod function.

Yoshinori Kohyama

unread,
Sep 28, 2015, 2:14:17 AM9/28/15
to Clojure
Hi, Charles and all.

Here is my definition of prime numbers:
https://gist.github.com/kohyama/8e599b2e765ad4256f32

HTH.

Yoshinori Kohyama
Reply all
Reply to author
Forward
0 new messages