Functional purity and "globals" in Clojure


Alexandr Kurilin

Sep 10, 2013, 3:19:35 AM
to clo...@googlegroups.com
I'm trying to determine how to best deal with the concept of globals in Clojure. Say I have a map of configuration values for my Ring app, populated at app startup from disk or env, and I need to reference the contents of this map from all over the project. Assuming MVC, models and controllers would all be interested in its contents. I just want to clarify that the question is not so much about "configuration" as it is about dealing with state that many different components in an application might all be interested in. This is an issue that seems to arise very often.

I'm seeing a couple of options here:
  • Pass the configs map along with each Ring request with some middleware, and then down every subsequent function call that might eventually need it. It's a bit of a hassle since now you're passing more data, potentially increasing the # of parameters, or forcing you to use a parameter map everywhere where you might have passed along just 1 primitive. Nonetheless, this is as pure as it gets.
  • Def the configs map in a namespace somewhere and refer to it from wherever, just like global state. The catch is that you're no longer certain what e.g. your model functions expect to be present in the system, and testing becomes trickier. You have to either run lein test with a separate set of configurations (which doesn't actually give you much per-test granularity) or use with-redefs everywhere to make sure the right test config state is being used.
  • Something in the middle where perhaps you agree to never directly reference the configs map from your models (again, thinking MVC here) and instead only ever access it from controllers, and pass the necessary options along down to the model functions. This way you can test both controllers and models in isolation in purity.
Ultimately I want to contain the chaos of having to know internal implementation details of my functions at different layers and want them to be easily testable in isolation. It does seem like the first option is the most straightforward, but I'd be curious to find out what those of you who have dealt with the problem for a while would recommend. Purity all the way, or are there patterns that can give you the best of both worlds? Or what else?
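
For concreteness, here is a minimal sketch of the first two options. It does not use the Ring library itself; it relies only on the fact that a Ring handler is a function from a request map to a response map. The names `wrap-config`, `:app/config`, `greeting-handler`, `global-config` and `:site-name` are made up for illustration.

```clojure
;; Option 1: middleware attaches the config to every request.
(defn wrap-config
  "Returns a handler that adds `config` to the request under :app/config
  (a hypothetical key) before calling `handler`."
  [handler config]
  (fn [request]
    (handler (assoc request :app/config config))))

(defn greeting-handler
  "Reads the one setting it needs from the request map."
  [request]
  {:status 200
   :body   (str "Hello from " (get-in request [:app/config :site-name]))})

(def app (wrap-config greeting-handler {:site-name "prod"}))

;; Option 2: a global var, temporarily replaced in tests with with-redefs.
(def global-config {:site-name "prod"})

(defn global-greeting
  "Depends on the global var; nothing in the signature says so."
  []
  (str "Hello from " (:site-name global-config)))

(defn greeting-under-test-config
  "What a test has to do to control the hidden dependency."
  []
  (with-redefs [global-config {:site-name "test"}]
    (global-greeting)))
```

In the first variant a test can call `greeting-handler` directly with any request map it likes; in the second, every test that cares about the setting has to remember the `with-redefs`.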


Philipp Meier

Sep 10, 2013, 4:05:10 AM
to clo...@googlegroups.com
On Tuesday, September 10, 2013 at 09:19:35 UTC+2, Alexandr Kurilin wrote:
  • Something in the middle where perhaps you agree to never directly reference the configs map from your models (again, thinking MVC here) and instead only ever access it from controllers, and pass the necessary options along down to the model functions. This way you can test both controllers and models in isolation in purity.

I'd do the following: for every layer of your application (persistence, data model, web) define one or more vars that hold the configuration specific to that layer. Testing a layer is then easy, and the layer's documentation tells you which configuration vars it needs.

-billy.

Softaddicts

Sep 10, 2013, 6:58:46 AM
to clo...@googlegroups.com
I agree more or less, I hate having my configuration data spread everywhere.
I prefer to dedicate a name space to maintain/access configuration data.
I usually split access to resources using bundles.
A bundle can reflect anything you want in your design, including a controller...

This allows you to later change where you store and how you manage
your configuration underneath, calls to the configuration are traceable in your code
(conf/...) and you can easily stub it for testing purposes.

Yes you need some discipline to avoid accessing it from everywhere (it could be
tempting) but in general I refer to it at the top level and pass bits of the
configuration data downward.

Further down in your code you work with semantics that have little to do with what the
configuration data represents as a whole. E.g. when creating a socket,
it does not matter where the host/port came from in your configuration;
you just need the values as args.

It helps keep your functions as pure as possible, or at least with a clearly delimited
side-effect scope, and still allows tests to be written.

Attaining 100% purity is in my view a nice but rarely reachable objective.

I would say that 80 to 90% is more realistic :)

Luc P.
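
A rough sketch of the shape described above, under assumptions: a single access point to the configuration, read only at the top level, with plain values passed downward. Everything is kept in one namespace so the snippet runs as-is; in a real project `settings` and `lookup` would live in their own namespace and be called as `(conf/lookup ...)`. All names and values here are hypothetical.

```clojure
(def ^:private settings
  "Loaded once at startup; a literal map stands in for disk or env."
  {:db {:host "db.internal" :port 5432}})

(defn lookup
  "Single, greppable entry point to configuration data."
  [& path]
  (get-in settings path))

;; Lower layers take plain values and know nothing about configuration.
(defn socket-address
  [host port]
  (str host ":" port))

;; Only the top level touches the configuration and hands bits of it down.
(defn start!
  []
  (socket-address (lookup :db :host) (lookup :db :port)))
```

Because every read goes through `lookup`, a test can stub that one function, while `socket-address` needs no stubbing at all.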

> --
> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clo...@googlegroups.com
> Note that posts from new members are moderated - please be patient with your first post.
> To unsubscribe from this group, send email to
> clojure+u...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en
> ---
> You received this message because you are subscribed to the Google Groups "Clojure" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to clojure+u...@googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.
>

Philipp Meier

Sep 10, 2013, 8:35:05 AM
to clo...@googlegroups.com


On Tuesday, September 10, 2013 at 12:58:46 UTC+2, Luc wrote:
I agree more or less, I hate having my configuration data spread everywhere.
I prefer to dedicate a name space to maintain/access configuration data.
I usually split access to resources using bundles.
A bundle can reflect anything you want in your design, including a controller...
 
You can always use a global config system and pass through the configuration information into the sub systems.

Timothy Baldridge

Sep 10, 2013, 9:17:22 AM
to clo...@googlegroups.com
This Clojure/West talk deals with many of these concepts. 

http://www.infoq.com/presentations/Clojure-Large-scale-patterns-techniques

Timothy





--
“One of the main causes of the fall of the Roman Empire was that–lacking zero–they had no way to indicate successful termination of their C programs.”
(Robert Firth)

Softaddicts

Sep 10, 2013, 9:50:57 AM
to clo...@googlegroups.com
This presentation from Stuart is interesting and funny to watch, as usual :)

The "mud ball" issue is not new, and you can face it in languages other than Lisp
that lack a single straitjacket within which everything has to be defined
(like classes).

Freedom comes with a price: you cannot always rely on fixed guard rails to tell
you how to do things.

I guess that a decade of OOP frenzy took its toll and now some folks suffer
from vertigo when they switch to a "muddy" language :)

Luc P

Christopher Allen

Sep 10, 2013, 3:07:34 PM
to clo...@googlegroups.com
I do a hybrid in Bulwark.


Defaults to accepting a closure of a config map for nice testing and hygiene, with a fallback to a global atom map for configuration for muggles.

Works well for me.
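
One way such a hybrid could look, as a sketch only: this is not Bulwark's actual code or API, and `global-config`, `make-limiter` and `:max-attempts` are invented names.

```clojure
(def global-config
  "Fallback configuration for callers that do not pass their own."
  (atom {:max-attempts 5}))

(defn make-limiter
  "Returns a predicate that says whether another attempt is allowed.
  Given a config map, the predicate closes over it; given nothing, it
  falls back to reading the global atom on every call."
  ([] (fn [attempts] (< attempts (:max-attempts @global-config))))
  ([config] (fn [attempts] (< attempts (:max-attempts config)))))
```

Tests can build a limiter from a literal map and never touch the atom, while callers that just want the defaults use the zero-argument arity.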

Laurent PETIT

Sep 11, 2013, 9:15:42 AM
to clo...@googlegroups.com
I have gone the dynamic var route before; it played well with Ring handlers.

Why?
Because I wanted to ensure consistency between "reads" to the
configuration map for the duration of the ring handler call.

So just keeping the configuration in a var and reading its root value at different
times would not suffice.

I just created a little ring middleware which would bind the dynamic
var's value before delegating to the real handler.

This enables me to change the configuration var's root value without
worrying about concurrency conflicts with running handlers, since the
handlers see a consistent bound value.

My 0,02€,

--
Laurent
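
A minimal sketch of the middleware described above, assuming a dynamic var named `*config*` (the name, the `:greeting` key and the demo handler are illustrative, not taken from the original code). The demo handler changes the root value in the middle of a request to show that both reads still agree.

```clojure
(def ^:dynamic *config*
  "Root value; may be replaced at any time with alter-var-root."
  {:greeting "hello"})

(defn wrap-bind-config
  "Snapshots the current value of *config* once per request and binds it
  thread-locally for the duration of the handler call."
  [handler]
  (fn [request]
    (let [snapshot *config*]
      (binding [*config* snapshot]
        (handler request)))))

(defn two-reads-handler
  "Reads the config, simulates a reconfiguration happening mid-request,
  then reads the config again."
  [request]
  (let [before (:greeting *config*)]
    (alter-var-root #'*config* assoc :greeting "bonjour")
    {:before before
     :after  (:greeting *config*)}))

(def wrapped (wrap-bind-config two-reads-handler))
```

Inside `wrapped`, both reads see the snapshot taken when the request started; the new root value is only picked up by the next request. Keep in mind that the binding is thread-local, so code that hops to another thread sees it only if the bindings are conveyed (as `future` does).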


Christophe Grand

Sep 11, 2013, 9:50:58 AM
to clojure
I like the dynamic var approach. It even lets you enforce that submodules won't see some keys (using select-keys or dissoc). It tests nicely too.
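
As an illustration of the select-keys idea (a sketch; `*config*`, `call-with-keys`, `mailer-settings` and the keys are hypothetical):

```clojure
(def ^:dynamic *config* {})

(defn call-with-keys
  "Calls f with *config* narrowed to the given keys."
  [ks f]
  (binding [*config* (select-keys *config* ks)]
    (f)))

(defn mailer-settings
  "Stands in for a submodule; it can only see what the caller left bound."
  []
  *config*)

(defn run-mailer
  "Binds the full config, then hands the mailer only the key it needs."
  [full-config]
  (binding [*config* full-config]
    (call-with-keys [:smtp-host] mailer-settings)))
```

A test can use the same `binding` form to give each test case its own configuration.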

Christophe

--
On Clojure http://clj-me.cgrand.net/
Clojure Programming http://clojurebook.com
Training, Consulting & Contracting http://lambdanext.eu/

Mikera

Sep 11, 2013, 10:32:46 AM
to clo...@googlegroups.com
As far as possible, I think it is best to try and minimise mutable global state (like mutable configuration data) and implicit context (like dynamic vars). My preferred approach is to pass the configuration data (as a value) to a higher order function that constructs the configurable object / function appropriately. I regard this as more of a "functional" style. So you might do something like:

(def my-ring-application (create-application config-map))

Once constructed, the ring application is fully configured and doesn't need any extra parameters, i.e. it works just like a regular ring application. You can treat the ring application as being effectively immutable. If you want to reconfigure, then just create a new one!
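
A possible shape for such a constructor, as a sketch: the body of `create-application` below is an assumption (the post only shows the call), and `:site-name` is a made-up key. The returned value is an ordinary Ring-style handler, i.e. a function from request map to response map.

```clojure
(defn create-application
  "Builds a handler that closes over config-map. Work that depends only on
  the configuration is done once, at construction time."
  [config-map]
  (let [banner (str "Welcome to " (:site-name config-map))] ; computed once
    (fn [request]
      {:status 200
       :body   (str banner " (" (:uri request) ")")})))

;; Differently configured instances can coexist in the same process.
(def my-ring-application (create-application {:site-name "prod"}))
(def test-application    (create-application {:site-name "test"}))
```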

This approach has several advantages:
- It's highly composable. Your components can easily be plugged together, since they aren't bound to any external configuration state.
- You can perform some significant optimisations in the construction phase (depending on the nature of the configuration you might be able to eliminate validation checks, optimise the size of buffers etc.)
- It's highly testable. Just create as many differently-configured instances as you like.

The main downside, of course, is that you need to be thoughtful and do a bit more work in designing the constructor function. But I think that's a worthwhile activity - it can often lead to a better design overall.

Note that this technique can apply to much more than web applications. You can even use it to construct objects that themselves contain mutable state. For example, I use this method to construct the entire GUI + running game instance for my little Clojure roguelike game "Alchemy". There is no mutable global state at all - you can launch several totally independent game instances from the same REPL. If you are interested, see the "launch" code at the bottom of this file: https://github.com/mikera/alchemy/blob/master/src/main/clojure/mikera/alchemy/main.clj

Softaddicts

Sep 11, 2013, 10:58:04 AM
to clo...@googlegroups.com
We load configuration data once from zookeeper and conceal it in a name space.
Changing the context is quite simple, we reload resources in this name space
using a minimal list of properties which says where the configuration data should be
pulled from in zookeeper.

This is doable from the REPL at any time. No other name space keeps
this stuff in vars. Any external resource is pulled at runtime from the
configuration name space where they are cached except for some very short lived
structures (requests to storage, locks, queues, channels, ... for the duration of
the request).

Test configuration data is also contained in zookeeper. Tests
pull from this configuration tree the configuration they need.

Part of the test stubbing is kept in this configuration.

No mutations occur in the configuration data from zookeeper during the "normal"
app life cycle. We allow config changes while the app is running, but only in a special
phase of its life cycle. Namely, the app has to enter a state that allows it to
reconfigure itself. This means that workers have been stopped, ....

We can test a new configuration without mutating the previous one in zookeeper
using a versioning scheme.

This requires some discipline enforced by the configuration name space API.
So far it's been a charm to work with.

Luc P.

Laurent PETIT

Sep 11, 2013, 11:26:24 AM
to clo...@googlegroups.com

Sure, and adding a binding call providing the same config as the one
found in root would allow for the whole thread to consistently read
the same value.

Softaddicts

Sep 11, 2013, 12:12:04 PM
to clo...@googlegroups.com
Euh... our local config would require loading a top level map with more than
twenty keys and a dozen levels deep of maps.

I am not sure I would like to carry a local copy in all workers.

Freezing the app somehow when a config change occurs is mandatory.
Each node in the cluster runs at least a dozen workers and not all the nodes
run the same set of workers. We cannot allow a worker to run with the old
config in place while others pick up the changes. Mismatches in resources
between collaborating workers would create chaos.

Allowing some workers to continue processing would require knowing which ones
are impacted by a change and which ones can safely continue w/o picking up
the change. This requires a tool to analyze dependencies at runtime
when a config change occurs.

This may be a future enhancement but up to now the value of such an
enhancement has been very low. A config change does not stall the app
for a significant amount of time.

Luc P.

Alexandr Kurilin

Sep 12, 2013, 12:13:04 AM
to clo...@googlegroups.com
Thanks for sharing the various options you guys are using, much appreciated.

I'm thinking I'm going to go with a function returning a map of configs (potentially memoized) with the option to overwrite certain keys from a separate map for testing etc. Should be good enough for my use cases.
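
Roughly what that could look like, as a sketch: `load-config` is a stand-in for reading from disk or the environment, and all names and values are made up.

```clojure
(defn- load-config
  "Stand-in for reading configuration from disk or the environment."
  []
  {:db-url "jdbc:postgresql://localhost/app" :page-size 25})

(def base-config
  "Memoized so the (potentially slow) load happens only once."
  (memoize load-config))

(defn config
  "Returns the config map, with any overrides merged on top."
  ([] (base-config))
  ([overrides] (merge (base-config) overrides)))
```

A test would then call something like `(config {:page-size 5})` and pass the result down, leaving the memoized base map untouched.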

Jason Wolfe

Sep 12, 2013, 12:35:20 AM
to clo...@googlegroups.com
At Prismatic, we do this using Graph:


I really try to avoid global vars, but passing through the seven parameters or resources you need through four layers of intermediate functions is also a hassle.  Graph makes it easier to define your service in a functional way, instantiate it with your parameters, and the data automatically flows to the leaves where it's needed.  When we leave the Graph core and enter ordinary function land, resources and parameters are passed through as a nested map, which at least means you only have a single context argument rather than a mess of args.  It's not a complete solution, but I think it's worked pretty well for us thus far.
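
The "single context argument" part can be illustrated in plain Clojure. This sketch does not use Graph or its API; it only shows a nested map of resources and parameters being threaded through an intermediate layer and destructured at the leaf. The names `save-user!`, `register-user`, `:db`, `:params` and `:table` are hypothetical, and an atom stands in for a real database resource.

```clojure
(defn save-user!
  "Leaf function: destructures only what it needs from the context."
  [{{:keys [table]} :params, :keys [db]} user]
  (swap! db update table (fnil conj []) user))

(defn register-user
  "Intermediate layer: passes the context through untouched."
  [context user]
  (save-user! context (assoc user :registered? true)))

(defn new-context
  "Builds a fresh context; an atom stands in for a real database."
  []
  {:db     (atom {})
   :params {:table :users}})
```

The intermediate function keeps a stable two-argument signature no matter how many resources the leaves end up needing.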



Honza Pokorny

Jan 18, 2014, 2:27:09 PM
to clo...@googlegroups.com
>  we do this using Graph

Could you show a simple example of this? I think this might be a good topic for a blog post.