ref strange behavior

187 views
Skip to first unread message

Sergei Koledov

unread,
Aug 23, 2016, 8:22:00 AM8/23/16
to Clojure
Hello,

I had a problem when I run the following code:

(defn get-task [tasks]
  (dosync
    (let [task (first @tasks)]
      (alter tasks rest)
      task)))

(defn worker [& {:keys [tasks]}]
  (agent {:tasks tasks}))

(defn worker-loop [{:keys [tasks] :as state}]
  (loop [last-task nil]
    (if-let [task (get-task tasks)]
      (recur task)
      (locking :out (println "Last task: " last-task))))
  state)

(defn create-workers [count & options]
  (->> (range 0 count)
       (map (fn [_] (apply worker options)))
       (into [])))

(defn start-workers [workers]
  (doseq [worker workers] (send-off worker worker-loop)))

(def tasks (ref (range 1 10000000)))

(def workers (create-workers 100 :tasks tasks))

(start-workers workers)
(apply await workers)

Description: I have several agents (100 in my case). Each agent running in a separate thread. All agents share the one ref with the collection of tasks (range of longs in my case). Each agent get tasks from the collection (in transaction) one by one until the collection becomes empty and then prints the last task which it handle. However, when I run this code it looks like the collection of tasks suddenly becomes empty and workers handle only portion of all tasks (average 25-40% of all number).

This code behave as I expected, when I create only one agent or use explicit locking in get-task function:

(defn get-task [tasks]
  (locking :lock
    (dosync
    (let [task (first @tasks)]
      (alter tasks rest)
      task))))

I run this code on the Clojure 1.8.0
java version "1.8.0_91"
Java(TM) SE Runtime Environment (build 1.8.0_91-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.91-b14, mixed mode)

Can anyone tell me, what am I doing wrong, or it really looks like a bug in the clojure STM?
I already asked this question on stackoverflow.com (http://stackoverflow.com/questions/39054911/strange-behavior-of-clojure-ref), but so far nobody has been able to help me.

P.S. Sorry for my english skill.

hitesh

unread,
Aug 23, 2016, 8:57:58 AM8/23/16
to Clojure
Taking a quick look at this, I *think* the problem crops up in your tasks being a lazy sequence.

Modify this

(def tasks (ref (range 1 10000000)))

to this

(def tasks (ref (doall (range 1 10000000))))

Running that with your large endpoint was taking a lot of time, so I'd suggest running it with 1e5 as that will finish much quicker.

adrian...@mail.yu.edu

unread,
Aug 23, 2016, 9:09:31 AM8/23/16
to Clojure
I haven't run your code yet, but it's bad form to use Clojure's reference types inside other reference types. They should store persistent data structures to really make any sense in a concurrent context. 


On Tuesday, August 23, 2016 at 8:22:00 AM UTC-4, Sergei Koledov wrote:

Timothy Baldridge

unread,
Aug 23, 2016, 9:21:23 AM8/23/16
to clo...@googlegroups.com
This really seems like a good use case for Java's Executors: https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/Executors.html#newFixedThreadPool(int)

You can specify the number of worker threads, then enqueue runnable jobs. But wait! There's more! Clojure functions implement Runnable, so you can just hand a zero argument function to the service once you create it: https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ExecutorService.html#submit(java.lang.Runnable)

--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clo...@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to
clojure+unsubscribe@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
---
You received this message because you are subscribed to the Google Groups "Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
“One of the main causes of the fall of the Roman Empire was that–lacking zero–they had no way to indicate successful termination of their C programs.”
(Robert Firth)

Sergei Koledov

unread,
Aug 23, 2016, 9:29:08 AM8/23/16
to Clojure
Yes, you are absolutely right. After i modify the code as you advised, it worked correctly. Thank you very much!
Does it mean that is necessary to avoid the use of lazy data structures within the STM?

вторник, 23 августа 2016 г., 19:57:58 UTC+7 пользователь hitesh написал:

adrian...@mail.yu.edu

unread,
Aug 23, 2016, 9:32:37 AM8/23/16
to Clojure
Here is working example code demonstrating how you might do this without agents:

(in-ns 'user)

(def tasks (ref (into clojure.lang.PersistentQueue/EMPTY (range 1 1000))))

(defn get-task
  [tasks]
  (dosync
    (let [task (first @tasks)]
      (alter tasks pop)
      task)))

(defn worker-loop
  []
  (loop [completed-tasks []]
    (if-let [task (get-task tasks)]
      (recur (conj completed-tasks task))
      completed-tasks)))

(defn create-workers
  [n & options]
  (vec (repeatedly n (fn []
                       (future
                         (worker-loop))))))

(def workers
  (create-workers 100))

(defn worker-test
  [xs]
  (= (set (mapcat deref workers))
     (set xs)))

(worker-test (range 1 1000))

adrian...@mail.yu.edu

unread,
Aug 23, 2016, 9:40:32 AM8/23/16
to Clojure
The lazy sequence works with the code I provided as well. 

Sergei Koledov

unread,
Aug 23, 2016, 10:52:49 AM8/23/16
to Clojure
Thank you very much! I suspected that I using the agents in wrong way. I ran your code and it worked great.

Also I little changed your example to a worker keep only the last task and run it on very large number of tasks (I had problem only on huge collections):
(def tasks (ref (into [] (range 1 10000000))))

(defn get-task
  [tasks]
  (dosync
    (let [task (first @tasks)]
      (alter tasks rest)
      task)))

(defn worker-loop
  []
  (loop [last-task nil]
    (if-let [task (get-task tasks)]
      (recur task)
      last-task)))

(defn create-workers
  [n & options]
  (vec (repeatedly n (fn []
                       (future
                         (worker-loop))))))

(def workers
  (create-workers 100))

(apply max (map deref workers))

This code worked as expected and returned 9999999.
But when I replace 
(def tasks (ref (into [] (range 1 10000000))))
to
(def tasks (ref (range 1 10000000)))
the problem arose again and this code returnined 1734656.
So I think the problem is the lazy collection in the ref.


вторник, 23 августа 2016 г., 20:32:37 UTC+7 пользователь adrian...@mail.yu.edu написал:

Sergei Koledov

unread,
Aug 23, 2016, 10:59:30 AM8/23/16
to Clojure
Thank you! I know about Java's thread pools, but I want to do my job in clojure-idiomatic way and possibly less using java interop.

вторник, 23 августа 2016 г., 20:21:23 UTC+7 пользователь tbc++ написал:

For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
---
You received this message because you are subscribed to the Google Groups "Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email to clojure+u...@googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

Gary Trakhman

unread,
Aug 23, 2016, 11:10:03 AM8/23/16
to clo...@googlegroups.com
On Tue, Aug 23, 2016 at 10:59 AM Sergei Koledov <grey...@gmail.com> wrote:
Thank you! I know about Java's thread pools, but I want to do my job in clojure-idiomatic way and possibly less using java interop.


I think many of the use-cases for refs/agents are supplanted by core.async.  In this case, the 'pipeline' family of functions can be used much like an executor pool.

I'm not sure how idiomatic STM can be, when I've rarely used it.  I think in most cases where you'd want to coordinate multiple refs within a transaction, a single atom is going to be much simpler and performant enough.

I think clojure uses the 'toolkit' approach, in that it allows many things to be combined and encourages composition, but it doesn't prescribe or force you towards any particular shape of solution.  That's a natural consequence of 'design by decoupling'.

Gary Verhaegen

unread,
Aug 28, 2017, 8:36:05 AM8/28/17
to clo...@googlegroups.com
I wasn't satisfied with the various answers to both this thread and
the StackOverflow question, so I spent a bit more time digging; it
turns out this was a bug in the range implementation that was fixed in
1.9.0-alpha11. I've added a bit more details on StackOverflow [1]; the
full story is on the ticket [2].

[1] https://stackoverflow.com/questions/39054911/strange-behavior-of-clojure-ref/45919119#45919119
[2] https://dev.clojure.org/jira/browse/CLJ-1914
Reply all
Reply to author
Forward
0 new messages