STM in face of heavy IO

ntu...@googlemail.com

Jun 15, 2008, 2:10:31 PM

to Clojure

Dear list members,

I have a problem designing a system which maintains states for many
network clients. The states can change concurrently and the state
transitions also depend on IO. Now I am well aware that side-effects
have to be avoided in transactions but I wonder how I should design
the system. Currently on read events a protocol state machine runs
which uses the client's state, invokes callback-handlers and changes
state depending on the result of these handlers. IO is always possible
and I don't see how I can make client state changes thread-safe
without using locks unless I can get rid of IO. Does anyone have
experience designing such systems with STM? (I guess there is no
simple answer to my question and I might have to re-design big parts
of the system to make it work with STM. But anyway I wonder if this
new design would be hard to understand if I need to queue up IO
actions to run outside of transactions.)

Thanks!

Stephen C. Gilardi

Jun 15, 2008, 6:27:03 PM

to clo...@googlegroups.com

On Jun 15, 2008, at 2:10 PM, ntu...@googlemail.com wrote:

> [...]

> Does anyone have
> experience designing such systems with STM? (I guess there is no
> simple answer to my question and I might have to re-design big parts
> of the system to make it work with STM. But anyway I wonder if this
> new design would be hard to understand if I need to queue up IO
> actions to run outside of transactions.)

I don't have experience with such a design, but along the lines of
your discussion, Clojure Agents may help. They interact well with the
STM in that if you send a message to an Agent within an STM
transaction, it will get sent in order and exactly once after the
transaction completes. This could make the queueing you're talking
about clearly understandable.
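To illustrate the point about sends inside a transaction, here is a minimal sketch. The names `client-state`, `io-log`, and `transition!` are hypothetical and not from the original question, and the agent simply collects messages where a real system would write to a socket.

```clojure
;; Sketch only: client-state, io-log and transition! are made-up names.
(def client-state (ref {:phase :idle}))
(def io-log (agent []))   ; stands in for a socket or other IO sink

(defn transition!
  "Moves the client to new-phase and queues one notification for it."
  [new-phase]
  (dosync
    (alter client-state assoc :phase new-phase)
    ;; The send is held until the transaction commits,
    ;; so a retry does not dispatch it twice.
    (send io-log conj [:notify new-phase])))

(transition! :connected)
(await io-log)            ; block until the queued action has run
```

The state change and the request for IO are decided together inside the transaction, while the IO itself only happens after the commit.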

Can you construct a small example of what you're trying to do that
shows the problems you're anticipating?

--Steve

ntu...@googlemail.com

Jun 16, 2008, 7:38:34 AM

to Clojure

Thanks. To give an example: there is a (ref {}) mapping from IDs to
clients. On read events the request is read, and the client is
retrieved from this map by its ID and handed over to the state machine,
which evaluates the request and takes the client state into account.
It invokes a callback handler depending on the request type, which does
some processing and returns some data. The client state is then updated
and a response is sent back. In parallel there are cleanup threads
running over the map which also access and possibly change clients
(usually some timeout stuff).

Some pseudo-code:

(def clients (ref {}))

(defn handle-request [client req]
  (binding [protocol/client client
            protocol/callback1 callback1
            ...
            protocol/callbackN callbackN]
    ;; updates clients with the new client returned from protocol/process
    (update clients (:id client) (protocol/process req))))

(defn on-read []
  (let [req (read-request)
        id (read-id)
        client (@clients id)] ; omitted client creation if not found
    (on-thread (handle-request client req))))

(defn clean-up [] ; run in separate thread
  (doseq [c (vals @clients)]
    (check-and-change c)))

In protocol:

(defn some-dispatch [t data]
  ;; callback1 does arbitrary processing including IO
  (send-response (callback1 data))
  (update client new-state))

(defn process [req]
  (some-dispatch (determine-request-type req) (extract-info req)))

I thought about agents as well, but since they run asynchronously and I
need synchronous changes, I am not sure they fit in here very well. I
omitted any transaction/locking code, but since the protocol and
possibly the callback code do IO, I can only think of locking the
client to prevent concurrent modifications, e.g. from clean-up. To use
STM I would need to factor the IO code out of protocol and callbacks,
it seems.

Thanks!

Rich Hickey

Jun 16, 2008, 8:34:30 AM

to Clojure

On Jun 16, 7:38 am, "ntu...@googlemail.com" <ntu...@googlemail.com>
wrote:

What you are describing here is quite tangled. With locks, it would be
very difficult to get right, especially if you would be holding locks
across I/O. While possible, it's as undesirable as doing IO during
transactions.

Clojure offers a lot of tools, but you need to check your presumptions,
especially that IO can happen at any time and that changes must be
synchronous.

I won't pretend to be able to jump in here with an architecture, but
here's one to think about:

Use refs and transactions for the shared clients map, but an
agent for each client, i.e. clients is a map of id->agent.

As soon as you start thinking "on-thread/callbacks" agents should come
to mind. An agent action can do IO (use send-off in that case), and
can run transactions (before/after it does IO). And only one action
can be happening for a particular agent at a time.
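A rough sketch of that shape, assuming a client is just a map held in an agent. All names here (`client-agent`, `handle-request`, `on-read`, `expire`) are hypothetical, and the IO is left as a comment.

```clojure
;; Sketch only: every name below is made up for illustration.
(def clients (ref {}))    ; id -> agent holding that client's state

(defn client-agent
  "Returns the agent for id, creating and registering it if needed."
  [id]
  (dosync
    (or (get @clients id)
        (let [a (agent {:id id :requests 0})]
          (alter clients assoc id a)
          a))))

(defn handle-request
  "Agent action: may block on IO, then returns the new client state."
  [state req]
  ;; real IO (running handlers, writing the response) would go here
  (update-in state [:requests] inc))

(defn on-read [id req]
  ;; send-off rather than send, because the action may block on IO
  (send-off (client-agent id) handle-request req))

(defn expire
  "Cleanup action; queued behind any request already pending for the client."
  [state]
  (assoc state :expired true))

(on-read 1 "req-a")
(on-read 1 "req-b")
(send-off (client-agent 1) expire)
(await (client-agent 1))  ; wait for the three actions queued above
```

Because the cleanup work is sent to the same agent as the requests, it never runs concurrently with a request for that client, and no explicit lock is needed.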

If, OTOH, a client event can change the state of multiple other
clients, you'll either have to use asynchrony (agent action sends to
other clients), or go to an all-STM approach, in which case you have
to segregate your IO, which is not a bad idea in any case.
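For the all-STM variant, one way to segregate IO might look like the sketch below: the protocol step is a pure function, the transaction only applies its result, and the write happens after the commit. The names `next-state`, `apply-request!`, and `send-response!` are hypothetical, and `println` stands in for the network write. IO that feeds into the transition would have to be done before entering the transaction.

```clojure
;; Sketch only: all names and the shape of the client map are assumptions.
(def clients (ref {1 {:phase :idle} 2 {:phase :idle}}))

(defn next-state
  "Pure protocol step: returns [new-client response] with no side effects."
  [client req]
  [(assoc client :phase :busy :last-req req)
   {:status :ok :echo req}])

(defn apply-request!
  "Updates the client inside a transaction and returns the response to send."
  [id req]
  (dosync
    (let [[new-client response] (next-state (get @clients id) req)]
      (alter clients assoc id new-client)
      response)))

(defn send-response!
  "Placeholder for the real network write; runs outside any transaction."
  [response]
  (println "sending" response))

(send-response! (apply-request! 1 "hello"))
```

Since the transaction body has no side effects, it can be retried safely, and the same transaction could alter several clients at once.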

Rich
