update-in and get-in why no default?


Timothy Pratley

Dec 10, 2009, 7:20:29 AM
to Clojure
Hi,

update-in is an especially useful function but I find the update
function inevitably requires a check for nil. If I could supply a
not-found value then my code would get better golf scores.

When I reach for update-in, I usually want to pass it a numerical
operator like inc or +, but these don't play nicely with nil. Another
scenario is when I want to pass conj, which is fine if I want to
create lists, but I usually want the data structure to be
something else. I've never come across a scenario where I didn't want
to supply a not-found value; are there any common ones?
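
A minimal sketch of the nil check described above. The helper name inc-or-one is hypothetical (it is not from this thread or from clojure.core); it only illustrates the wrapper that update-in currently forces on the caller:

```clojure
;; Hypothetical helper: the nil check that update-in forces on the caller.
(defn inc-or-one [n]
  (if (nil? n) 1 (inc n)))

;; Counting words needs the wrapper, because (inc nil) throws.
(def word-counts
  (reduce #(update-in %1 [%2] inc-or-one)
          {}
          ["fun" "counting" "words" "fun"]))

;; conj tolerates nil, but a missing key then starts a list, not a vector.
(def conj-on-missing (update-in {} [:a] conj 1))
```

Here word-counts maps "fun" to 2, and conj-on-missing holds a list under :a, which is the choice a not-found argument would let the caller override.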

If others have similar experience perhaps it is a candidate for
change. Ideally I'd like to see a not-found parameter added to
update-in and an extra arity overload for get-in as outlined below:

(defn update-in2
  "'Updates' a value in a nested associative structure, where ks is a
  sequence of keys and f is a function that will take the old value
  and any supplied args and return the new value, and returns a new
  nested structure. If any levels do not exist, hash-maps will be
  created. If there is no value to update, not-found is supplied to f."
  ([m [k & ks] not-found f & args]
   (if ks
     ;; pass not-found down so nested paths keep the same argument order
     (assoc m k (apply update-in2 (get m k) ks not-found f args))
     (assoc m k (apply f (get m k not-found) args)))))

user=> (reduce #(update-in2 %1 [%2] 0 inc) {} ["fun" "counting"
"words" "fun"])
{"words" 1, "counting" 1, "fun" 2}
user=> (reduce #(update-in2 %1 [(first %2)] [] conj (second %2)) {}
[[:a 1] [:a 2] [:b 3]])
{:b [3], :a [1 2]}


(defn get-in2
  "Returns the value in a nested associative structure, where ks is a
  sequence of keys."
  ([m ks]
   (reduce get m ks))
  ([m ks not-found]
   (if-let [v (reduce get m ks)]
     v
     not-found)))

user=> (get-in2 {:a {:b 1}} [:a :b] 0)
1
user=> (get-in2 {:a {:b 1}} [:a :b :c] 0)
0

Changing update-in would be a breaking change unfortunately. To avoid
this you could consider checking the argument type for f to be
function or value (making an assumption here that you would rarely
want a function as the not-found value which is not 100% watertight).
Or you could have a similarly named update-in-or function (which is
less aesthetically pleasing), or maybe there is another even better
way?

Thanks for your consideration.


Regards,
Tim.

Sean Devlin

Dec 10, 2009, 10:27:44 AM
to Clojure
Tim,
I like both of these ideas. I agree, get-in2 seems to make sense as a
drop-in replacement.

With update-in2 I prefer a new name, because I do occasionally write
code that constructs lists of functions. I'd hate to get a weird bug
while doing this.

Sean

ataggart

Dec 10, 2009, 8:33:58 PM
to Clojure
Seconded; I like the intention of both changes, and do something
similar in a lot of my code (e.g., parsing functions that take an
extra arg if the parse is unsuccessful). Also, testing the type of
update-in2's second arg is a bad idea, imo.

As for the breaking change of adding another arg to update-in, I can't
think of a time when nil was actually the default value I wanted
there, though sometimes the function I was using behaved well with
nils (e.g., conj). In every other case, I had to explicitly handle
nils (e.g., inc).

I don't have a lot of production code that would need retrofitting, so
I'm inclined to prefer breaking things for the better (at least while
things are young). I imagine others might feel differently.

Rich Hickey

Dec 12, 2009, 9:24:15 AM
to clo...@googlegroups.com
The get-in function could be enhanced, and would mirror get.

As for update-in, a breaking change and type testing are out of the
question. However, the general case is one of applying functions to
missing/nil arguments that don't expect them, or whose behavior given
nil you'd like to change. If it is just a substitution (and it often
is, as you desire in update-in), something like this could cover many
applying functions, without having to add extra what-to-do-if-nil
arguments:

(defn fnil
  "Takes a function f, and returns a function that calls f, replacing
  a nil first argument to f with the supplied value x. Higher arity
  versions can replace arguments in the second and third
  positions (y, z). Note that the function f can take any number of
  arguments, not just the one(s) being nil-patched."
  ([f x]
   (fn
     ([a] (f (if (nil? a) x a)))
     ([a b] (f (if (nil? a) x a) b))
     ([a b c] (f (if (nil? a) x a) b c))
     ([a b c & ds] (apply f (if (nil? a) x a) b c ds))))
  ([f x y]
   (fn
     ([a b] (f (if (nil? a) x a) (if (nil? b) y b)))
     ([a b c] (f (if (nil? a) x a) (if (nil? b) y b) c))
     ([a b c & ds] (apply f (if (nil? a) x a) (if (nil? b) y b) c ds))))
  ([f x y z]
   (fn
     ([a b] (f (if (nil? a) x a) (if (nil? b) y b)))
     ([a b c] (f (if (nil? a) x a) (if (nil? b) y b) (if (nil? c) z c)))
     ([a b c & ds] (apply f (if (nil? a) x a) (if (nil? b) y b) (if (nil? c) z c) ds)))))

usage:

((fnil + 0) nil 42)
-> 42

((fnil conj []) nil 42)
-> [42]

(reduce #(update-in %1 [%2] (fnil inc 0)) {} ["fun" "counting" "words" "fun"])
->{"words" 1, "counting" 1, "fun" 2}

(reduce #(update-in %1 [(first %2)] (fnil conj []) (second %2))
{} [[:a 1] [:a 2] [:b 3]])
-> {:b [3], :a [1 2]}


fnil seems to me to have greater utility than patching all functions
that apply functions with default-supplying arguments.
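
A small sketch of that wider utility, assuming a Clojure release where fnil is available in clojure.core (1.2 or later) with the function-first argument order shown above. The names hits and totals are made up for illustration:

```clojure
;; Assumes fnil is available in clojure.core (Clojure 1.2 or later).
(def hits (atom {}))

;; The same patched function works through swap! and a nested update-in.
(swap! hits update-in [:home :views] (fnil inc 0))
(swap! hits update-in [:home :views] (fnil inc 0))

;; It also works where there is no update-in at all, e.g. with map.
(def totals (map (fnil + 0) [nil 1 nil] [10 20 30]))
```

After the two swaps the atom holds {:home {:views 2}}, and totals is (10 21 30); neither swap! nor map needed a default-supplying argument of its own.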

Rich

Timothy Pratley

Dec 30, 2009, 6:18:34 AM
to Clojure

On Dec 13, 1:24 am, Rich Hickey <richhic...@gmail.com> wrote:
> fnil seems to me to have greater utility than patching all functions
> that apply functions with default-supplying arguments.

Neat :) I like it.


> The get-in function could be enhanced, and would mirror get.

Should I interpret 'could' as 'patch welcome' or 'let me think about
it'?


Regards,
Tim.

Timothy Pratley

Jan 1, 2010, 6:45:55 PM
to Clojure
On Dec 13, 1:24 am, Rich Hickey <richhic...@gmail.com> wrote:
> fnil seems to me to have greater utility than patching all functions
> that apply functions with default-supplying arguments.


Hi Rich,

To further comment on fnil, after having experimented with it a bit
now, I've come to slightly prefer specifying the 'default' value
before the function just because I think it reads nicer:
(fnil 0 inc) ;; instead of (fnil inc 0)
(fnil [] conj) ;; instead of (fnil conj [])
I read it as "fill nil with 0 for inc"
I suppose "fill nil of inc with 0" makes just as much sense but I find
"inc 0" leads my eye to believe 0 will always be passed to inc,
whereas "0 inc" does not. Putting the function last makes it clearer
to me that the 0 is conditional. It also looks more like if nil 0.
This contrasts with get, which makes perfect sense having the default
last, but I think at this point fnil and get are sufficiently far
apart that a different argument order would not be surprising.

Just a small observation I thought I'd raise for discussion to see
what preferences are out there if this function becomes widespread.


Regards,
Tim.

Sean Devlin

Jan 1, 2010, 11:04:48 PM
to Clojure
Tim,
I don't think your version of the signature supports variadic defaults
well. Also, I'd (initially) implement fnil differently than Rich.
Here's my fnil-2 that I *suspect* has the intended behavior:

(defn fnil-2
  [f & defaults]
  (fn [& args]
    (let [used-args (map (fn [default-value value]
                           (if value value default-value))
                         defaults args)]
      (apply f used-args))))

user=>(def nil+ (fnil-2 + 0 1 2 3 4 5 6))

user=>(nil+ nil)
0

user=>(nil+ 0 nil)
1

user=>(nil+ 0 0 nil)
2

user=>(nil+ 0 0 0 nil)
3

...

user=>(nil+ 0 0 0 0 0 0 nil)
6

;This example exceeds the provided # of nil values
user=>(nil+ 0 0 0 0 0 0 0 nil)
NPE

You get the idea. Also, Rich's style matches the signature of partial
better, which I personally prefer.

Just my $.02

Sean

Timothy Pratley

Jan 2, 2010, 3:21:41 AM
to clo...@googlegroups.com
2010/1/2 Sean Devlin <francoi...@gmail.com>:

> I don't think your version of the signature supports variadic defaults well.

Hi Sean,

Thanks for commenting.

Just by way of clarification, taking the function as the last argument
seems to work fine in my experiments. I'm sure it could be better but
here's what I've been using:

(defn fnil
  "Creates a new function that if passed nil as an argument,
  operates with supplied arguments instead. The function must be
  passed as the last argument."
  [& more]
  (fn [& m]
    (apply (last more)
           (map #(if (nil? %1) %2 %1)
                m
                (concat (butlast more)
                        (drop (count (butlast more)) m))))))

Relating to your examples:
user=> (def nil+ (fnil 0 1 2 3 4 5 6 +))


user=> (nil+ 0 0 0 0 0 0 nil)
6

user=> (nil+ 0 0 0 0 0 0 0 nil)

java.lang.NullPointerException (NO_SOURCE_FILE:0)
user=> ((fnil 1 +) nil 2 3 4 5)
15

; note fnil-2 does not handle the last case, though of course it could
easily if you wanted it to:
user=> ((fnil-2 1 +) nil 2 3 4 5)
java.lang.ClassCastException: java.lang.Integer cannot be cast to
clojure.lang.IFn (NO_SOURCE_FILE:0)

; in principle I don't think one form is any more restrictive than the
other, it just comes down to a matter of preference which is the key
part I wanted to generate discussion about.


> matches the signature of partial better, which I personally prefer.

Yes that is precisely why it catches me out to write (fnil + 1)
because it looks like a partial application. Partial applications are
actually very common even when partial is not explicitly used:
user=> (swap! (atom 1) + 1)
2
So I'm used to seeing arguments to the right of a function get
absorbed so to speak, and used to seeing conditionals occur in the
middle of the form. Again just clarifying that I too like partial
application style but have the opposite reaction in this case of
wanting to contrast that for fnil as it is not a partial application.
This of course is just my preference and I'm glad to hear reasoning
for the other style.
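
To make the contrast with partial concrete, here is a short sketch. It assumes fnil from clojure.core (1.2 or later) with the function-first order; the names add-one and plus-or-1 are illustrative only:

```clojure
;; Assumes fnil from clojure.core (1.2 or later), function-first order.
(def add-one   (partial + 1))  ; 1 is always passed to +
(def plus-or-1 (fnil + 1))     ; 1 is used only when the first argument is nil

(def always-applied (add-one 2))         ; 1 + 2
(def nil-replaced   (plus-or-1 nil 2))   ; nil replaced by 1, so 1 + 2
(def value-kept     (plus-or-1 5 2))     ; 5 kept, so 5 + 2
```

The two definitions look alike on the page, but only partial absorbs its argument unconditionally.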


Regards,
Tim.

Timothy Pratley

Jan 2, 2010, 3:45:37 AM
to clo...@googlegroups.com
2010/1/2 Timothy Pratley <timothy...@gmail.com>:

> user=> ((fnil-2 1 +) nil 2 3 4 5)
> java.lang.ClassCastException: java.lang.Integer cannot be cast to
> clojure.lang.IFn (NO_SOURCE_FILE:0)

Correction, I just realized of course it doesn't work if I specify the
arguments around the wrong way! I should have done:
user=> ((fnil-2 + 1) nil 2 3 4 5)
1
; I think this should result in 15, but that's just an implementation detail.
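
A sketch of how a function-first variadic version could pass the extra arguments through so that the call above yields 15. The name fnil-n is hypothetical and this is not the implementation anyone in the thread proposed; it also tests nil? rather than truthiness, so a stored false is kept:

```clojure
;; Hypothetical variant: patch as many leading nils as there are defaults,
;; then pass any remaining arguments through untouched.
(defn fnil-n
  [f & defaults]
  (fn [& args]
    (let [patched (map (fn [default arg] (if (nil? arg) default arg))
                       defaults args)]
      (apply f (concat patched (drop (count defaults) args))))))

(def sum-with-extras ((fnil-n + 1) nil 2 3 4 5))
(def conj-default    ((fnil-n conj []) nil 42))
(def false-is-kept   ((fnil-n identity :default) false))
```

With this shape, arguments beyond the supplied defaults are neither dropped nor patched.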

Rich Hickey

Jan 29, 2010, 8:48:20 AM
to Clojure

On Dec 30 2009, 6:18 am, Timothy Pratley <timothyprat...@gmail.com>
wrote:

Patches welcome for get-in, and my fnil.

Thanks,


Rich

Sean Devlin

Jan 29, 2010, 9:04:46 AM
to Clojure
Rich,
Your example didn't support a variadic signature. Is that the long
term plan?

Sean

Rich Hickey

Jan 29, 2010, 9:18:37 AM
to Clojure

On Jan 29, 9:04 am, Sean Devlin <francoisdev...@gmail.com> wrote:
> Rich,
> Your example didn't support a variadic signature.  Is that the long
> term plan?
>

It's the short term plan. Let's see if there's any real need for more
than three. I've never needed more than one.

Rich

Timothy Pratley

Jan 29, 2010, 11:12:40 PM
to clo...@googlegroups.com

Daniel Werner

Jan 31, 2010, 12:29:40 PM
to Clojure
> ([m ks not-found]
> (if-let [v (reduce get m ks)]
> v
> not-found)))

If I understand this arity version of get-in correctly, won't the
default also be used if the value stored in the nested data structure
evaluates to something false-y?

Anyway, thanks for creating the patches!
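
The false-y case can be reproduced with a short sketch, using get-in2 as posted at the start of the thread (the def names are made up for illustration):

```clojure
;; get-in2 as posted earlier in the thread (if-let version).
(defn get-in2
  ([m ks] (reduce get m ks))
  ([m ks not-found]
   (if-let [v (reduce get m ks)]
     v
     not-found)))

;; A stored false or nil is indistinguishable from a missing key here.
(def stored-false (get-in2 {:a {:b false}} [:a :b] :missing))
(def stored-nil   (get-in2 {:a {:b nil}} [:a :b] :missing))

;; Plain get, by contrast, returns the stored nil rather than the default.
(def plain-get (get {:b nil} :b :missing))
```

Both stored-false and stored-nil come back as :missing, while plain-get is nil, so this arity would not mirror get.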

Meikel Brandmeyer

Jan 31, 2010, 1:46:35 PM
to clo...@googlegroups.com
Hi,

Am 31.01.2010 um 18:29 schrieb Daniel Werner:

>> ([m ks not-found]
>> (if-let [v (reduce get m ks)]
>> v
>> not-found)))
>
> If I understand this arity version of get-in correctly, won't the
> default also be used if the value stored in the nested data structure
> evaluates to something false-y?

Yes. It should probably read:

(if-let [l (reduce get m (pop ks))]
(get l (peek ks) not-found)
not-found))

Sincerely
Meikel

Timothy Pratley

Feb 1, 2010, 7:57:40 AM
to clo...@googlegroups.com
> Am 31.01.2010 um 18:29 schrieb Daniel Werner:
>> If I understand this arity version of get-in correctly, won't the
>> default also be used if the value stored in the nested data structure
>> evaluates to something false-y?

Thanks for spotting that early!


On 1 February 2010 05:46, Meikel Brandmeyer <m...@kotka.de> wrote:
> (if-let [l (reduce get m (pop ks))]
>  (get l (peek ks) not-found)
>  not-found))

Good idea, but peek and pop work differently on vectors and sequences,
seeing get-in is not constrained to use vectors this could lead to an
unexpected behavior:
user=> (def m {:a 1, :b 2, :c {:d 3, :e 4}, :f nil})
#'user/m
user=> (get-in3 m '(:c :e) 2) ;peek-pop keys applied in unexpected order
2
user=> (get-in2 m '(:c :e) 2) ;expected result
4

I've replaced the patch on assembla with this:
(reduce #(get %1 %2 not-found) m ks)))
And added test cases for the falsey returns and seq args
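
For reference, a small sketch of the peek/pop difference that causes this, next to last/butlast, which do not depend on the collection type:

```clojure
(def as-vector [:c :e])
(def as-list   '(:c :e))

;; peek/pop work at the cheap end: the back of a vector, the front of a list.
(def vector-peek-pop [(peek as-vector) (pop as-vector)])
(def list-peek-pop   [(peek as-list) (pop as-list)])

;; last/butlast always work from the back, whatever the collection type.
(def list-last-butlast [(last as-list) (butlast as-list)])
```

On the list, peek yields :c and pop leaves (:e), so a peek/pop lookup walks the keys in the wrong order.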

Regards,
Tim.

Meikel Brandmeyer

Feb 1, 2010, 8:18:44 AM
to Clojure
Hi,

On Feb 1, 1:57 pm, Timothy Pratley <timothyprat...@gmail.com> wrote:

> Good idea, but peek and pop work differently on vectors and sequences,
> seeing get-in is not constrained to use vectors this could lead to an
> unexpected behavior:
> user=> (def m  {:a 1, :b 2, :c {:d 3, :e 4}, :f nil})
> #'user/m
> user=> (get-in3 m '(:c :e) 2)   ;peek-pop keys applied in unexpected order
> 2
> user=> (get-in2 m '(:c :e) 2)   ;expected result
> 4

o.O Ok... 100% of my use-cases for get-in were with vectors up to now.
Didn't think about allowing lists as well.

> I've replaced the patch on assembla with this:
>    (reduce #(get %1 %2 not-found) m ks)))
> And added test cases for the falsey returns and seq args

Consider this (admittedly constructed) case:

(get-in {:a {:b 1}} [:x :c] {:c :uhoh})

I would not mind only allowing vectors in the interface. I would
expect the length of such a chain sufficiently short for easy
conversion if necessary. Or to use butlast/last instead of peek/pop.
Again the "damage" would be limited by the short keychain length.
(Of course it would have to be considered: is the usual keychain
really short?)
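
A sketch of what goes wrong in the constructed case, assuming the reduce-based patch quoted above. The names get-in-reduce and get-in-butlast are hypothetical stand-ins for the two candidate bodies:

```clojure
;; Candidate 1: supply not-found at every level (the reduce-based patch).
(defn get-in-reduce [m ks not-found]
  (reduce #(get %1 %2 not-found) m ks))

;; Candidate 2: supply not-found only at the final lookup.
(defn get-in-butlast [m ks not-found]
  (get (reduce get m (butlast ks)) (last ks) not-found))

(def leaked  (get-in-reduce  {:a {:b 1}} [:x :c] {:c :uhoh}))
(def correct (get-in-butlast {:a {:b 1}} [:x :c] {:c :uhoh}))
```

With the first candidate the default itself is searched for :c, so leaked is :uhoh; the second returns the default map unchanged.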

Sincerely
Meikel

Timothy Pratley

Feb 1, 2010, 7:19:29 PM
to clo...@googlegroups.com
On 2 February 2010 00:18, Meikel Brandmeyer <m...@kotka.de> wrote:
> Consider this (admittedly constructed) case:
> (get-in {:a {:b 1}} [:x :c] {:c :uhoh})

Excellent point!


> Or to use butlast/last instead of peek/pop.

I think this is the best approach. butlast/last take linear time, but
key sequences are usually short, so the overhead is small.

There are still some sharp edges I'm not sure about:
(A) user=> (get-in {:a 1} [])
{:a 1}
;; this is existing behavior, but I feel the result should be nil

(B) user=> (get-in {:a 1} nil)
{:a 1}
;; this is existing behavior, but I feel the result should be nil (nil
is a seq so not an exception)

(C) user=> (get-in {:a 1} 5)
java.lang.IllegalArgumentException: Don't know how to create ISeq
from: java.lang.Integer (NO_SOURCE_FILE:0)
;; existing behavior, seems sensible to throw an exception here rather
than return nil

(D) user=> (get-in {nil {:a 1}} [] 2)
{:a 1}
;; new behavior
;; using last/butlast only -> this is wrong... need to check the seq size
;; the solution depends upon what is correct for the preceding cases
so need to establish those first

Alternatively one could take the view that [] and nil are illegal key
sequences, in which case should get-in enforce that (via preconditions
or just a check) or should it just be added to the doc-string that
using those values is undefined, or both?
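
A minimal sketch of the precondition option, purely to show its shape. The name get-in-strict is hypothetical, and this is not the patch attached to the ticket:

```clojure
;; Hypothetical name; rejects nil and [] up front via a precondition.
(defn get-in-strict
  [m ks not-found]
  {:pre [(seq ks)]}
  (get (reduce get m (butlast ks)) (last ks) not-found))

(def found (get-in-strict {:a {:b 1}} [:a :b] 0))

;; An empty key sequence fails the precondition with an AssertionError
;; (as long as *assert* is left at its default of true).
(def empty-result
  (try
    (get-in-strict {:a 1} [] 0)
    (catch AssertionError _ :rejected)))
```

A non-seqable ks such as 5 would still throw from seq itself, as in case (C).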

I've marked the ticket back to new in the meantime... will update once resolved.


For reference here is a version that behaves most like existing get-in:

(defn get-in
  "Returns the value in a nested associative structure,
  where ks is a sequence of keys. Returns nil if the key is not present,
  or the not-found value if supplied."
  ([m ks]
   (reduce get m ks))
  ([m ks not-found]
   (if (empty? ks)
     m
     (if-let [l (reduce get m (butlast ks))]
       (get l (last ks) not-found)
       not-found))))

I'm not convinced returning the map when given an empty key sequence
is the right thing to do, I'd prefer it to return nil or throw an
exception in both arity cases.


Regards,
Tim.

Meikel Brandmeyer

Feb 2, 2010, 1:41:29 AM
to Clojure
Hi Timothy,

On Feb 2, 1:19 am, Timothy Pratley <timothyprat...@gmail.com> wrote:

> There are still some sharp edges I'm not sure about:
> (A) user=> (get-in {:a 1} [])
> {:a 1}
> ;; this is existing behavior, but I feel the result should be nil

+1 for nil

> (B) user=> (get-in {:a 1} nil)
> {:a 1}
> ;; this is existing behavior, but I feel the result should be nil (nil
> is a seq so not an exception)

+1 for nil

> (C) user=> (get-in {:a 1} 5)
> java.lang.IllegalArgumentException: Don't know how to create ISeq
> from: java.lang.Integer (NO_SOURCE_FILE:0)
> ;; existing behavior, seems sensible to throw an exception here rather
> than return nil

+1 for exception

> (D) user=> (get-in {nil {:a 1}} [] 2)
> {:a 1}
> ;; new behavior
> ;; using last/butlast only -> this is wrong... need to check the seq size
> ;; the solution depends upon what is correct for the preceding cases
> so need to establish those first
>
> Alternatively one could take the view that [] and nil are illegal key
> sequences, in which case should get-in enforce that (via preconditions
> or just a check) or should it just be added to the doc-string that
> using those values is undefined, or both?

I'm not sure about this either. I lean towards an exception. (get m)
will also throw an exception...

> For reference here is a version that behaves most like existing get-in:
>
> (defn get-in
>   "Returns the value in a nested associative structure,
>   where ks is a sequence of keys. Returns nil if the key is not present,
>   or the not-found value if supplied."
>   ([m ks]
>    (reduce get m ks))
>   ([m ks not-found]
>    (if (empty? ks)
>      m
>      (if-let [l (reduce get m (butlast ks))]
>        (get l (last ks) not-found)
>        not-found))))

I would get rid of the if-let. Together with throwing an exception in
case of an empty key sequence we get:

(defn get-in
  "Returns the value in a nested associative structure,
  where ks is a sequence of keys. Returns nil if the key is not present,
  or the not-found value if supplied."
  ([m ks] (get-in m ks nil))
  ([m ks not-found]
   (if-let [ks (seq ks)]
     (get (reduce get m (butlast ks)) (last ks) not-found)
     (throw (Exception. "key sequence must not be empty")))))

Sincerely
Meikel

Richard Newman

Feb 2, 2010, 1:55:01 AM
to clo...@googlegroups.com
>> There are still some sharp edges I'm not sure about:
>> (A) user=> (get-in {:a 1} [])
>> {:a 1}
>> ;; this is existing behavior, but I feel the result should be nil
>
> +1 for nil

I think I disagree.

If you view 'get-in' as an unwrapping operation, unwrapping by zero
steps should return the existing collection, no?

{:foo {:bar {:baz 1}}}

[] => {:foo ...
[:foo] => {:bar ...
[:foo :bar] => {:baz ...

This maps trivially to a sophisticated user's recursive mental model
of get-in:

(defn get-in [m ks]
  (loop [looking-at m
         remaining-keys (seq ks)]
    ;; stop when the keys run out, not when a key happens to be nil or false
    (if-not remaining-keys
      looking-at
      (recur (get looking-at (first remaining-keys))
             (next remaining-keys)))))

... if there are no keys, it returns m. That's intuitive to me, at
least.

Can you explain why you think the result should be nil?

>> (B) user=> (get-in {:a 1} nil)
>> {:a 1}
>> ;; this is existing behavior, but I feel the result should be nil
>> (nil
>> is a seq so not an exception)
>
> +1 for nil

As above: I equate nil with the empty sequence.
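
The existing two-argument behaviour already follows this unwrapping model, because reduce returns its initial value when there is nothing to reduce over. A small sketch (the def names are illustrative):

```clojure
(def m {:foo {:bar {:baz 1}}})

;; Zero keys (or nil) means zero applications of get: the map itself.
(def zero-steps (reduce get m []))
(def nil-steps  (reduce get m nil))

;; Each extra key unwraps one more level.
(def one-step  (reduce get m [:foo]))
(def two-steps (reduce get m [:foo :bar]))
```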

Meikel Brandmeyer

Feb 2, 2010, 2:10:47 AM
to Clojure
Hi,

On Feb 2, 7:55 am, Richard Newman <holyg...@gmail.com> wrote:

> I think I disagree. Can you explain why you think the result should be nil?

Whoops. I got confused. I didn't mean nil for empty key sequences. I
meant throwing an exception as does get.

Sincerely
Meikel

Timothy Pratley

Feb 2, 2010, 7:48:56 AM
to clo...@googlegroups.com
On 2 February 2010 17:55, Richard Newman <holy...@gmail.com> wrote:
> If you view 'get-in' as an unwrapping operation, unwrapping by zero steps
> should return the existing collection, no?

Thanks for that description I completely agree.


> Can you explain why you think the result should be nil?

I was not thinking very clearly :) Loosely 'oh there is nothing to
look up'. Thanks for setting me straight.


> As above: I equate nil with the empty sequence.

Yup.


Ok patch updated - salient part is:
+ ([m ks not-found]
+ (if (seq ks)
+ (if-let [l (reduce get m (butlast ks))]
+ (get l (last ks) not-found)
+ not-found)
+ m)))

Which preserves all the desired behavior so far :)

http://www.assembla.com/spaces/clojure/tickets/256-get-in-optional-default-value


Regards,
Tim.

Meikel Brandmeyer

Feb 2, 2010, 8:02:36 AM
to Clojure
Hi,

On Feb 2, 1:48 pm, Timothy Pratley <timothyprat...@gmail.com> wrote:

> > If you view 'get-in' as an unwrapping operation, unwrapping by zero steps
> > should return the existing collection, no?
>
> Thanks for that description I completely agree.

Hmm.. I thought of get-in as a recursive application of get. get-in
now diverges from get. Maybe this version should be called "unwrap"
instead?

Sincerely
Meikel

Timothy Pratley

Feb 2, 2010, 8:31:22 AM
to clo...@googlegroups.com
On 2 February 2010 17:41, Meikel Brandmeyer <m...@kotka.de> wrote:
> I would get rid of the if-let.

Ah yes! Ok patch updated to:

+ ([m ks not-found]
+ (if (seq ks)

+ (get (reduce get m (butlast ks)) (last ks) not-found)
+ m)))

Note that (seq ks) will throw an illegal argument exception if ks is 5,
for instance; if ks is nil or empty, the original map is returned.
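
Putting the patch fragment into a complete function gives a sketch like the following. The name get-in* is hypothetical, chosen only to avoid shadowing clojure.core/get-in; the cases exercised are the ones raised earlier in the thread:

```clojure
;; Hypothetical name, to avoid shadowing clojure.core/get-in.
(defn get-in*
  ([m ks] (reduce get m ks))
  ([m ks not-found]
   (if (seq ks)
     (get (reduce get m (butlast ks)) (last ks) not-found)
     m)))

(def m {:a 1, :b 2, :c {:d 3, :e 4}, :f nil})

(def stored-false (get-in* {:a {:b false}} [:a :b] 0))          ; false-y value kept
(def too-deep     (get-in* {:a {:b 1}} [:a :b :c] 0))           ; default used
(def list-keys    (get-in* m '(:c :e) 2))                       ; seq of keys works
(def no-leak      (get-in* {:a {:b 1}} [:x :c] {:c :uhoh}))     ; default not searched
(def empty-keys   (get-in* {:a 1} [] 2))                        ; map returned as is
(def nil-keys     (get-in* {:a 1} nil 2))                       ; same for nil
```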


> Hmm.. I thought of get-in as a recursive application of get. get-in
> now diverges from get. Maybe this version should be called "unwrap"
> instead?

Zero applications of get to a map might be thought of as the map itself.
Are you thinking of a particular scenario where throwing an exception
would be better?


Regards,
Tim.

Meikel Brandmeyer

Feb 2, 2010, 8:50:29 AM
to Clojure
Hi,

On Feb 2, 2:31 pm, Timothy Pratley <timothyprat...@gmail.com> wrote:

> > Hmm.. I thought of get-in as a recursive application of get. get-in
> > now diverges from get. Maybe this version should be called "unwrap"
> > instead?
>
> Zero applications of get to a map might be thought of as the map itself.
> Are you thinking of a particular scenario where throwing an exception
> would be better?

Ok. One could see this like that:

(get-in m [:a :b]) => (get (get m :a) :b)
(get-in m [:a]) => (get m :a)
(get-in m []) => m

So far I understand the picture of what happens. But does it make
sense?

get-in does a lookup of a key sequence in a nested structure of
associative things. I think nil/[] are simply not in the domain of
get-in. However it can be extended to nil/[] as the identity.

So in the end it will probably boil down to some "suitable definition"
argument of the domain of get-in. And I see some applications, where
just returning the original thing might be handy.

I'm persuaded. :)

Sincerely
Meikel
