Clojure Spec, performance and workflows


Kjetil Thuen

Jul 15, 2016, 7:46:10 AM
to Clojure
Hi.

I was sent here from the Clojurians Slack after posting a question there about
the performance of clojure.spec (or actually cljs.spec).

After having specced most of my data structures and all of the functions in a
couple of namespaces, I am seeing my test suite take 5 minutes to complete, in
contrast to the 14 seconds it took before I added specs. I really enjoy the extra
confidence the specs give me, but their cost is breaking my familiar workflow.

I was told you are still collecting feedback about clojure.spec, so please
consider this a story of how a user new to spec might approach introducing specs
to an existing application, and potentially a request for pointers to better ways of
doing things:

When clojure.spec was first announced, I was actively looking for ways to make
refactorings and changes to the response formats of external services less
painful. Spec seemed to fit the bill perfectly, so as soon as the ClojureScript port
was out, I dived in.

Since this app is built around a single app-state atom, it seemed natural to
begin there. Starting at the leaves, I identified logical components and wrote
specs for them. In the process I namespaced keywords, and put the spec
definitions into the namespaces that already existed for dealing with that part
of the state.

When doing this, I worked from the REPL, feeding conform and explain the specs
and data from the live app in addition to conforming and non-conforming data
snippets that I typed out manually.
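[For anyone reading along, a REPL session in that style might look like the
following. The namespace and keys are invented for illustration, and the
require uses today's clojure.spec.alpha; at the time of this thread the
namespace was clojure.spec / cljs.spec.]

```clojure
(ns app.state
  (:require [clojure.spec.alpha :as s]))

;; Leaf specs for one logical component of the app-state.
(s/def ::id pos-int?)
(s/def ::name string?)
(s/def ::user (s/keys :req [::id ::name]))

;; Valid data passes; for a plain s/keys spec, conform returns the map.
(s/valid? ::user {::id 1 ::name "Kjetil"})   ;=> true

;; explain prints a human-readable account of what failed.
(s/explain ::user {::id -1 ::name "Kjetil"})
```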

So far, so good.

The next step was speccing some functions. First, I picked a couple of small
namespaces (4-5 functions) to get going. At this point I wanted more automation,
so I added a call to instrument-all in my test runner. This was working fine,
and it helped me identify a couple of corner cases I had not considered. Great
stuff!
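[One way to wire that up, sketched against today's clojure.spec.test.alpha API:
the instrument-all of the early builds was later folded into instrument, which
with no arguments instruments every instrumentable var. The test namespace here
is hypothetical.]

```clojure
(ns app.core-test
  (:require [clojure.test :refer [deftest is use-fixtures]]
            [clojure.spec.test.alpha :as stest]))

;; Instrument every speced fn once, before any test in this namespace
;; runs, and lift the instrumentation again afterwards.
(use-fixtures :once
  (fn [run-tests]
    (stest/instrument)
    (try
      (run-tests)
      (finally
        (stest/unstrument)))))

(deftest sanity-test
  (is (= 4 (+ 2 2))))
```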

Next up was just typing out more specs, and then I could go on fixing bugs and
adding features with a newfound confidence. I picked one of the larger
namespaces (60 functions) next. I typed out specs for all the functions, spent
some time tweaking this and that, and ultimately got everything running without
complaints. But now running the tests takes so much time that I no longer run
them on save. Instead I evaluate single tests of instrumented
functions in CIDER as I go along, but this takes away from the confidence that I
first set out to gain. I am also worried that our CI service will struggle as I add
even more specs.

At this rate, I fear the full test suite will consume close to half an hour to
complete once all the planned specs are in place.

Is calling instrument-all before running tests not a recommended approach?  How
do other people leverage their specs in their workflows?

Rich Hickey

Jul 15, 2016, 10:50:32 AM
to clo...@googlegroups.com
Thanks for your feedback. I've been anticipating this discussion, and my response here is directed to everyone, not just your problem.

Using specs and instrument+generative testing is much different than the example-based testing that has happened thus far, and should deliver substantially greater benefits. But the (test-time) performance profile of such an approach is similarly different, and will require a different approach as well.

Let's deal with the simplest issue - the raw perf of spec validation today. spec has not yet been perf-tuned, and it's quite frustrating to see e.g. shoot-out tables comparing perf vs other libs. If people want to see code being developed (and not just dropped in their laps) then they have to be patient and understand the 'make it right, then make it fast' approach that is being followed. I see no reason spec's validation perf should end up being much different than any other validation perf. But it is not there yet.

That being said, even after we perf-tune spec, comparing running a test suite with instrumented code (and yes, that is a good idea) with the same test suite without (which as people will find, has been missing bugs) is apples vs oranges.

Add in switching to (or adding) generative testing, which is definitely always going to be much more computationally intensive than example-based tests (just by the numbers, each generative test is ~100 tests), and there is no way that test-everything-every-time is going to be sustainable.
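[For what it's worth, the case count is already tunable: stest/check accepts
test.check options, so the default of 100 generated cases per fn can be dialed
down while iterating. A minimal sketch; the fn and its spec are invented for
illustration, and org.clojure/test.check must be on the classpath.]

```clojure
(ns app.gen-demo
  (:require [clojure.spec.alpha :as s]
            [clojure.spec.test.alpha :as stest]))

(defn clamp
  "Clamp n into the range [0, 100]."
  [n]
  (-> n (max 0) (min 100)))

(s/fdef clamp
  :args (s/cat :n int?)
  :ret (s/int-in 0 101))

;; check generates args from the :args spec and validates :ret.
;; :num-tests trades coverage for speed; the default is 100 per fn.
(stest/check `clamp
             {:clojure.spec.test.check/opts {:num-tests 25}})
```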

Should we not use generative testing because we can't run every test each time we save a file?

We have to look at the true nature of the problem. E.g., each time you save a file, do you run the test suites of every library upon which you depend? Of course not. Why not? *Because you know they haven't changed*. Did you just add a comment to a file - then why are you testing anything? Unfortunately, our testing tools don't have a fraction of the brains of decades-old 'make' when it comes to understanding change and dependency. Instead we have testing tools oriented around files and mutable-state programming where, yeah, potentially changing anything could break any and everything else, so let's test everything any time anything has changed.
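[On that note, instrumentation is already addressable at function granularity:
stest/instrument takes fully-qualified symbols, so the overhead can be scoped
to just the fns under active change. A sketch with an invented fn:]

```clojure
(ns app.workflow
  (:require [clojure.spec.alpha :as s]
            [clojure.spec.test.alpha :as stest]))

(defn area [w h] (* w h))

(s/fdef area
  :args (s/cat :w pos-int? :h pos-int?))

;; Instrument only this var, not every speced fn in the project.
(stest/instrument [`area])

;; (area -1 2) would now throw an ExceptionInfo describing the
;; failing :args spec, instead of silently returning -2.

;; Lift the checks again when the change is done.
(stest/unstrument [`area])
```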

This is just another instance of the general set of problems spec (and other work) is targeting - we are suffering from using tools and development approaches (e.g. building, testing, dependency management et al) whose granularity is a mismatch from reality. Having fine-grained (function-level) specs provides important opportunities to do better. While tools could (but currently mostly don't) know when particular functions change (vs files), specs can let us independently talk about whether the *interface* to a fn has changed, vs a change to its implementation. Testing could be radically better and more efficient if it leveraged these two things.

I don't like to talk about undelivered things, but I'll just say that these issues were well known and are not a byproduct of spec but a *target* of it (and other work).

Rich

Colin Yates

Jul 15, 2016, 1:23:56 PM
to clo...@googlegroups.com
Just dipping in to say the pragmatism and maturity around here is
hugely attractive compared to other tech communities. +1.

In terms of reducing the impact of a change, I have found ruthlessly
splitting code out into separate libraries very beneficial here. lein's
'checkout' capability makes it not much more effort than a single
monolithic codebase. I am not talking about deployment models, I am
talking about source-code modules.