Keys spec with :additional-keys boolean option

Daniel Dinnyes

Mar 3, 2019, 8:30:00 PM
to Clojure
Hi Everyone,

I have been using spec for a while (currently the version shipped with Clojure 1.10). This post is intended as a kind of status report, focusing on a particular issue I am facing at the moment. To explain it in detail, please have a look at the following macro I wrote, which is an extended version of spec/keys:

(ns myns
  (:refer-clojure :exclude [keys])
  (:require
    [clojure.set :as set]
    [clojure.core :as core]
    [clojure.spec.alpha :as spec]))

(defmacro keys
  "Same as `clojure.spec.alpha/keys`, but accepts an additional boolean option :additional-keys. Unless :additional-keys is set to true, only the declared keys are allowed, and any additional keys will be invalid."
  [& {:keys [additional-keys] :as args}]
  (let [args (dissoc args :additional-keys)]
    (if additional-keys
      `(spec/keys ~@(mapcat identity args))
      (let [allowed-keys #{}
            allowed-keys
            (reduce
              (fn [acc k] (into acc (k args)))
              allowed-keys [:req :opt])
            allowed-keys
            (reduce
              (fn [acc k] (into acc (map (comp keyword name) (k args))))
              allowed-keys [:req-un :opt-un])]
        `(spec/and
           (spec/keys ~@(mapcat identity args))
           (fn [m#] (set/subset? (set (core/keys m#)) ~allowed-keys)))))))

I would like to explain my use case for writing this, and why I think there is a need for such a feature.
I am writing an import/export library that converts a serialized XML data format into an in-memory representation and back. I need the import and export functions to be invertible:
 * given original external XML data, if I import it and then immediately export it, the re-exported data has to be equal to the original.
 * given original internal in-memory data, if I export it and then immediately import it, the re-imported data has to be equal to the original.

The main issue is that silent data loss is absolutely unacceptable. For example, if a newer version of the XML data format has additional unspecified fields, these fields will fail to be imported into the in-memory model. Likewise, if any additional unspecified in-memory data gets added, it will fail to be exported. All of this would happen silently, without any error or warning.
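To make the failure mode concrete, here is a minimal sketch of a round-trip check. The importer, exporter, and field names are all hypothetical stand-ins (plain maps instead of XML), not part of the actual library:

```clojure
;; Hypothetical importer that only knows about two fields.
(def known-fields #{:id :name})

(defn import-record
  "Keeps only the fields the importer knows about, silently dropping the rest."
  [external]
  (select-keys external known-fields))

(defn export-record
  "Identity in this sketch; a real exporter would serialize back to XML."
  [internal]
  internal)

(defn round-trips?
  "True when exporting an imported record reproduces the original."
  [external]
  (= external (export-record (import-record external))))

(round-trips? {:id 1 :name "a"})            ; true
(round-trips? {:id 1 :name "a" :email "x"}) ; false, :email was dropped without any error
```

The second call is exactly the case I want to be told about at import time rather than discover later.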

I think the problem I am facing is a valid use case, and spec should support such scenarios.

Another similar use case arises when, due to GDPR rules, extreme care must be taken over what information gets stored, and any unverified data potentially violates requirements (e.g. a user might use clear-text comment fields to store credit-card information, N.I. numbers, etc.).

The original keys spec allows for additional data to be present, which is in line with what Rich described as design goals for spec (i.e. "requiring less", or "providing more" is not breakage but growth, and should be welcomed, as per the "Speculation" talk).
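A small sketch of that open behaviour, using made-up `:demo/...` spec names purely for illustration:

```clojure
(require '[clojure.spec.alpha :as s])

;; Illustrative specs; the names are not from any real project.
(s/def :demo/id int?)
(s/def :demo/entity (s/keys :req [:demo/id]))

(s/valid? :demo/entity {:demo/id 1})                       ; true
(s/valid? :demo/entity {:demo/id 1 :extra "stuff"})        ; true, the unknown key is accepted
(s/explain-data :demo/entity {:demo/id 1 :extra "stuff"})  ; nil, nothing is reported
```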

After I started using the above macro, it became obvious that it isn't a versatile enough solution to the problem. I would rather be notified of additional keys and then decide on a case-by-case basis whether, in the given context, they should be considered an error. Instead, the macro only lets me decide this when defining the spec, rather than at the point of data validation.

If I recall correctly, in the "Maybe Not" talk Rich raised the concern that validating required and optional fields should be separated from the shape/schema of a spec, so that it is à la carte and decided at the point of validation, not when defining the specs. It feels to me that this concern is closely related to the one I am writing about now.

I have also noticed that I rarely use spec/valid?; instead I prefer to use something like:

(if-let [explanation (spec/explain-data :my/spec data)]
  (throw (ex-info "Invalid stuff" explanation))
  (do-business-logic-with (spec/conform :my/spec data)))

Due to this usage pattern, I am quite reliant on the fact that spec/valid? returns true iff spec/explain-data returns nil.

Nevertheless, I was thinking that if spec/explain-data returned any unspecified keys found for a keys spec under an :unspecified key in the explain-data map, that would address the issue much better than my barbaric butchery with macros.
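A rough sketch of what I have in mind, written as an ordinary helper rather than a change to spec itself. The name `explain-data+` is hypothetical, and the set of allowed keys is passed in explicitly because this sketch does not try to derive it from the spec:

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.set :as set])

;; Illustrative specs; the names are made up for this example.
(s/def :demo/id int?)
(s/def :demo/name string?)
(s/def :demo/entity (s/keys :req [:demo/id] :opt [:demo/name]))

(defn explain-data+
  "Hypothetical helper: like s/explain-data, but when `m` is a map with keys
  outside `allowed`, those keys are added under :unspecified. Returns nil only
  when the spec passes and no extra keys are present."
  [spec allowed m]
  (let [explanation (s/explain-data spec m)
        extra       (if (map? m)
                      (set/difference (set (keys m)) allowed)
                      #{})]
    (if (seq extra)
      (assoc explanation :unspecified extra) ; assoc on nil yields a fresh map
      explanation)))

(def allowed #{:demo/id :demo/name})

(explain-data+ :demo/entity allowed {:demo/id 1})          ; nil
(explain-data+ :demo/entity allowed {:demo/id 1 :extra 2}) ; {:unspecified #{:extra}}
```

The caller can then decide in each context whether a result containing only :unspecified should be treated as an error or just logged.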

I would be interested to hear what you guys think about all the above, or if there are better workarounds/recommendations for my use-cases.

Cheers,
Daniel

Yuri Govorushchenko

Mar 4, 2019, 7:23:57 AM
to Clojure
I find closed keys specs generally easier to work with (e.g. they protect me from cases where I forget to spec some new data returned from a function). In my current project about 1.5% of the keys specs are open, and they are for entities whose shape is really unknown beforehand. At the moment my rule of thumb is to start with the strictest specs possible and then loosen them based on feedback from the code.

So far I can't recall having had any of the painful breakages described in the Speculation talk. I haven't analyzed why; maybe I'll eventually stumble upon such a case, or it may be because I have control over all the APIs and consumer code in the project.

Here's my take on adding `:disallowed-keys` to the `explain-data` of a closed spec (see the `speced-keys` macro: https://gist.github.com/metametadata/53a847cd3b02056e8e4c124e63d9ae5a). For example:

(s/def ::req-field1 any?)

(let [s (sp/speced-keys :req [::req-field1])
      value {::req-field1 123
             :extra1      100}

      ; act
      actual (s/explain-data s value)]
  ; assert
  (is (= {::s/problems [{:in              []
                         :path            []
                         :pred            'no-disallowed-keys?
                         :disallowed-keys #{:extra1} ; <-------------
                         :val             value
                         :via             []}]
          ::s/spec     s
          ::s/value    value}
         actual)))

As a bonus, `speced-keys` specs can be merged via `merge-keys`.
They also validate at runtime that all fields have specs already registered (related topic: https://groups.google.com/forum/#!topic/clojure/i8Rz-AnCoa8).

somewhat-functional-programmer

Mar 5, 2019, 9:59:59 PM
to clo...@googlegroups.com
I assume you are forced to use XML (if you are choosing the format, I wholeheartedly recommend EDN!).  If you /do/ control the choice of XML/EDN but want to interoperate with other languages, check out: https://github.com/edn-format/edn/wiki/Implementations - maybe you could use EDN anyhow if you have forgiving consumers (hat tip to Alex Miller who pointed this page out again recently).

If you must use XML, have you considered the approach of using a generic XML data structure (which would by definition have the in-memory definition mirror the serialized version)?  You could write a transform function to turn it into something you'd rather use from clojure, or simply define accessor functions into your data.  I've used data.xml (https://github.com/clojure/data.xml) before along with specter (https://github.com/nathanmarz/specter) to transform/modify/access XML data.  Though I have to admit, every time I do, I curse immutable data structures -- they are unwieldy for /modifying/ highly nested structures (XML!).  Specter is the best library I've seen for modifying deeply nested structures, and is worth the learning curve (you'll use it everywhere once you get used to it -- it's really fantastic), but still isn't a perfect fit for XML.

Here's an example of something simpler I wrote to read out the bits of maven pom files that I cared about:

(defn collapse-xml
  "Collapses xml data in the form from clojure.data.xml {:tag :t :content [\"some whitespace\" {:tag :another-element ...}]}
    into {:t {:another-element \"value\"}}

   This works well for many configuration style xml files, but you will lose all textual content surrounding elements.  For example:
    <p>This is some <b>bold</b> text.</p> is a terrible use case for this, as it will return {:p {:b \"bold\"}}

   This does work very well for pom files however."
  [m {:keys [tag content]}]
  (assoc
   m
   tag
   (if (every? string? content)
     (let [s (clojure.string/join content)]
       (if (clojure.string/blank? s) nil s))
     (reduce
      collapse-xml
      {}
      (filter map? content)))))

If you are using mainly "configuration" style small XML files like maven pom files, something simple like this could give you an ergonomic /accessor/ clojure data structure (not one suitable for editing and transforming back to an on-disk representation though).
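As a usage sketch, here is the function applied to a tiny pom-like structure. Plain maps stand in for parsed clojure.data.xml elements, the element values are made up, and the function is repeated (without its docstring) only so the snippet runs on its own:

```clojure
(require '[clojure.string :as str])

;; Copy of collapse-xml from above so this snippet is self-contained.
(defn collapse-xml
  [m {:keys [tag content]}]
  (assoc m tag
         (if (every? string? content)
           (let [s (str/join content)]
             (if (str/blank? s) nil s))
           (reduce collapse-xml {} (filter map? content)))))

;; Made-up input in the {:tag ... :content [...]} shape, including whitespace nodes.
(def pom
  {:tag :project
   :content ["\n  "
             {:tag :groupId :content ["com.example"]}
             "\n  "
             {:tag :artifactId :content ["demo"]}
             "\n  "
             {:tag :dependencies :content ["\n"]}
             "\n"]})

(collapse-xml {} pom)
;; => {:project {:groupId "com.example", :artifactId "demo", :dependencies nil}}

;; Mixed content loses the surrounding text, as the docstring warns.
(collapse-xml {} {:tag :p :content ["This is some " {:tag :b :content ["bold"]} " text."]})
;; => {:p {:b "bold"}}
```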

Every time I have to use XML from clojure I have yet to find a solution that feels clean.  I yearn for xpath and a mutable API (maybe just using a good set of Java APIs via interop is really the best answer here).

Given that I can't find a clean solution for XML in clojure, I find the idea of trying to use clojure.spec for validation purposes for XML data even more difficult to imagine than if the serialized representation were simply EDN to begin with.  In fact, I think that's the only time I've gotten any value out of clojure.spec in my own projects -- I've used it for verifying serialized EDN and found its validation error message output useful (though /so/ verbose for large specs!).

I relentlessly spec'd out my projects' data structures in the beginning when spec was released (I started my clojure journey when 1.9 was in alpha and spec had just been released), not even looking at plumatic schema or other alternatives since clojure.spec was going to be built into core.  However, what I ended up with was pages of attribute specs which mostly looked like int?, or string?, or keyword?, or map key lists (though some regexes and ranges!).  From the reader's perspective it was tough to see what the data was actually going to /look/ like.  And then, where to actually use the specs?  Runtime performance was too slow to use everywhere...  My most egregious use of spec was in multimethods or cond forms, essentially trying to dispatch on the /type/ of a map (not knowing in advance what I would get).  Needless to say, I have stopped using spec, as I couldn't figure out a way forward where its benefits outweighed its costs.  I kept trying to use it as a type system, which I think it is ill-suited for (and even if it were suited to it -- it's too slow to use as such).

I have since settled on using truss, a wonderful little library (https://github.com/ptaoussanis/truss), and have never looked back.  I highly recommend looking at it -- you get great error messages for essentially run-time type assertions only where you need them (or are currently debugging).  That plus typical spy-like macros goes a long way (see another great library by the same author as truss, timbre (https://github.com/ptaoussanis/timbre), which includes a nice spy macro).  I made one that defs the value being spied in the current namespace if an optional condition is met on the data value.

I had high hopes I would grok clojure.spec enough to get more value out of it than time I put into it.  But as of now, simpler solutions have served me far better.
