(keys a-vector)

181 views
Skip to first unread message

Brandon Bloom

unread,
Jun 5, 2013, 1:40:23 PM6/5/13
to cloju...@googlegroups.com
Despite supporting ILookup, clojure.core/keys does not work on vectors. Here's a redefinition that matches my expectations:

(defn keys [x] (if (vector? x) (range (count x)) (clojure.core/keys x)))

Is there a justification for the keys not working on vectors?

Is there any interest in a patch to add support for vectors to the core keys function?

Cheers,
Brandon

Mikera

unread,
Jun 13, 2013, 5:40:50 AM6/13/13
to cloju...@googlegroups.com
I think I'd rather get an exception in this case - on the grounds that I probably done something wrong if keys is getting called on a vector. Most likely I've accidentally passed a vector where I should have passed a map, or perhaps I'm using the wrong data structure entirely for my problem.

In general, people shouldn't expect functions to behave on vectors in the same way as an equivalent map: consider (seq [:a :b]) vs. (seq {0 :a, 1 :b}) for example.

Jean Niklas L'orange

unread,
Jun 13, 2013, 9:35:04 AM6/13/13
to cloju...@googlegroups.com
Actually, it is possible to use keys and vals on anything implementing ISeq:

(let [v (vec {:a 1, :b 2})]
  (keys v))
;; => (:a :b)

(let [s (set {:a 1, :b 2})]
  (keys s))
;; => (:b :a)

(let [s (seq {:a :1, :b 2})]
  ;;; etc

This isn't a property of elements implementing ILookup, it is a property of collections which can be converted to seqs and only contain map entries.

--
Regards,
Jean Niklas L'orange

Brandon Bloom

unread,
Jun 13, 2013, 12:01:16 PM6/13/13
to cloju...@googlegroups.com
> I probably done something wrong if keys is getting called on a vector.

That doesn't seem to be valid motivation, given the haphazard behavior of many core functions on invalid input.

In general, people shouldn't expect functions to behave on vectors in the same way as an equivalent map

Except that I do expect functions to behave uniformly on ILookup instances, where appropriate. I very often have a tree of maps and vectors on which I use update-in or assoc-in. In order to write a generic traversal, I need to explicitly differentiate between vectors and maps, unless I use my keys variant, which lets me treat both of them uniformly for update-in and a subset of assoc-in (those for existing indicies) invocations.

Mikera

unread,
Jun 13, 2013, 1:36:46 PM6/13/13
to cloju...@googlegroups.com
On Thursday, 13 June 2013 17:01:16 UTC+1, Brandon Bloom wrote:
> I probably done something wrong if keys is getting called on a vector.

That doesn't seem to be valid motivation, given the haphazard behavior of many core functions on invalid input.

I'd see that as an argument for tightening up other functions in core rather than anything else :-)

Haven't decades of experience in software taught us that failing fast on invalid / unexpected input is a good idea? 

In general, people shouldn't expect functions to behave on vectors in the same way as an equivalent map

Except that I do expect functions to behave uniformly on ILookup instances, where appropriate. I very often have a tree of maps and vectors on which I use update-in or assoc-in. In order to write a generic traversal, I need to explicitly differentiate between vectors and maps, unless I use my keys variant, which lets me treat both of them uniformly for update-in and a subset of assoc-in (those for existing indicies) invocations.

Sounds like a sensible use case. Though as you demonstrated it's easy enough to write your own functions to support this kind of custom traversal, and I think that is better than cramming extra functionality into clojure.core/keys

I guess I'm just a little suspicious of core functions that try to be too "clever" and support lots of different input types in marginally inconsistent ways - it adds runtime overhead, it potentially masks bugs and it feels too much like "weak typing" to me. If you haven't seen it already then google "JavaScript wat" for some entertaining examples of where that can lead..... 

And where do you propose to stop in terms of considering things to have keys? sequences? sets? Strings? anything that supports nth? arbitrary deftypes implementing java.util.List? Java arrays? It will soon get messy. ILookup isn't really a sound way to define the boundary either, since it doesn't provide any methods that allow you to access the set of keys.

In general I believe functions should be designed to operate on types that conform to one specific, well-defined abstraction. In the case of keys and vals - I'd expect that abstraction to be a map.


Brandon Bloom

unread,
Jun 13, 2013, 2:50:32 PM6/13/13
to cloju...@googlegroups.com
> Haven't decades of experience in software taught us that failing fast on invalid / unexpected input is a good idea? 

You're operating on the assumption that a vector is an invalid or unexpected input for the keys function, which I am arguing, it is not.

> I guess I'm just a little suspicious of core functions that try to be too "clever" and support lots of different input types in marginally inconsistent ways

I think that this is neither clever nor inconsistent.
 
> it adds runtime overhead,

In this case, keys already performs a virtual dispatch on seq. It could easily perform a virtual dispatch on IKeyed or something similar and the performance cost would be identical.
 
it potentially masks bugs and it feels too much like "weak typing" to me. If you haven't seen it already then google "JavaScript wat" for some entertaining examples of where that can lead.....

I'm not proposing arbitrary coercion or anything of the sort.
 
And where do you propose to stop in terms of considering things to have keys?

I'd say approximately at the intersection of the ICounted, ILookup interfaces.
 
sequences? sets? Strings?

sequences: No (doesn't implement ICounted, ILookup, or IFn)
sets and strings, sure? Why not?
sets should implement keys and vals as seq
strings could implement keys and vals as range and seq, just like vectors

user=> (get-in ["abc" "def" "hij"] [1 2])
\f

user=> (get-in #{5} [5])
5

These things already support lots of different input types. That's a defining characteristic of clojure.core: Reliance on abstractions.
 
anything that supports nth? arbitrary deftypes implementing java.util.List? Java arrays? It will soon get messy. ILookup isn't really a sound way to define the boundary either, since it doesn't provide any methods that allow you to access the set of keys.

Vectors, the most important use case to me, could be supported trivially without adding an interface or protocol. However, that doesn't mean you can't add a protocol if it's useful.
 
In general I believe functions should be designed to operate on types that conform to one specific, well-defined abstraction. In the case of keys and vals - I'd expect that abstraction to be a map.

There is a common, well-defined abstraction here: http://en.wikipedia.org/wiki/Surjective_function

Here's a protocol:

(defprotocol ISurjection
  (-domain [_])
  (-codomain [_]))

I'm not actually advocating this protocol, only trying to demonstrate that there is an abstraction that underlies maps, vectors, arrays, strings, and even functions with respect to keys and vals. In theory, I could reify IFn and ISurjection, stick that in a map, and implement a generic traversal over that.

Mikera

unread,
Jun 13, 2013, 8:17:11 PM6/13/13
to cloju...@googlegroups.com
On Thursday, 13 June 2013 19:50:32 UTC+1, Brandon Bloom wrote:
> Haven't decades of experience in software taught us that failing fast on invalid / unexpected input is a good idea? 

You're operating on the assumption that a vector is an invalid or unexpected input for the keys function, which I am arguing, it is not.

> I guess I'm just a little suspicious of core functions that try to be too "clever" and support lots of different input types in marginally inconsistent ways

I think that this is neither clever nor inconsistent.


The problem is: if you try and treat maps and vectors as part of the same abstraction, you'll ultimately run into inconsistencies because they already have very different defined behaviours. 

For example, consider that one is considered as a sequence of key/value pairs and the other isn't: I already highlighted the difference between (seq {0 :a 1 :b}) and (seq [:a :b]) - If you expect 'keys' to behave like '(partial map first)', then it is clearly inconsistent when applied to vectors. 

Then there are also functions like 'dissoc' which fail if you try to remove a key/value mapping from a vector. 

No matter how you break it down, if you attempt to unify maps and vectors and other vaguely associative types under the same abstraction then there will be circumstances where the inconsistencies show through. The most likely outcome is that you don't actually end up with a consistent abstraction, i.e. each function will operate on a slightly different subset of types. This could work, but it would also be very tricky to reason about. Contrast with the sequence abstraction, which I think is much more clean and consistent: functions that operate on a sequence tend to work for any sequence type.

Anyway, none of this is a particularly big deal. Clearly 'keys' is a function which could be defined to work on vectors in a reasonably sane way. I'd still personally prefer an exception. But mostly, I'm just pointing out that maps and vectors are quite different beasts, and trying to urge caution when it comes to stretching an abstraction over disparate types.. 

Brandon Bloom

unread,
Jun 13, 2013, 8:34:07 PM6/13/13
to cloju...@googlegroups.com
> The problem is: if you try and treat maps and vectors as part of the same abstraction

That's not what I'm trying to do at all. I'm simply trying to utilize a property of the logical abstraction that is already defined.

Consider:

user=> (doc contains?)
-------------------------
clojure.core/contains?
([coll key])
  Returns true if key is present in the given collection, otherwise
  returns false.  Note that for numerically indexed collections like
  vectors and Java arrays, this tests if the numeric key is within the
  range of indexes. 'contains?' operates constant or logarithmic time;
  it will not perform a linear search for a value.  See also 'some'.

Vectors have numeric keys as far as clojure.core/contains? is concerned.

Right now, clojure.core/keys throws an exception when applied to a vector. So unless somebody is listening for that exception (and they shouldn't be) expanding the applicability of keys is a backwards compatible change. Clojure is conservative, so this is an enhancement proposal, but from the perspective of clojure.core/contains?, keys is bugged.

clojure.core is loaded with highly polymorphic functions and garbage-in/garbage-out function implementations. All of your points form valid design criteria for a world view that does not appear to match the world view of clojure.core.

Brandon Bloom

unread,
Jun 13, 2013, 8:47:37 PM6/13/13
to cloju...@googlegroups.com
Jean,

Sorry I missed your message.

I had *no idea* this worked. It seems implementation of clojure.core/keys simply calls clojure.core/seq, then attempts to cast each element to java.util.Map$Entry. That behavior is extremely surprising to me. That means that what I'm proposing would, in fact, be a breaking change for anybody who relies on this.

I guess this behavior allows clojure.core/keys to be lazy:

user=> (def kvs (drop 4 (into {} (map (juxt identity identity) (range 10)))))
#'user/kvs
user=> kvs
([4 4] [5 5] [6 6] [7 7] [8 8] [9 9])
user=> (keys kvs)
(4 5 6 7 8 9)

However, the (partial map first) illusion breaks down:

user=> (keys (map (fn [[k v]] [k v]) kvs))
ClassCastException clojure.lang.PersistentVector cannot be cast to java.util.Map$Entry  clojure.lang.APersistentMap$KeySeq.first (APersistentMap.java:152)

Sooo. Now I'm not sure what to suggest...

Cheers,
Brandon

--
You received this message because you are subscribed to a topic in the Google Groups "Clojure Dev" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/clojure-dev/NR3qA2mudZA/unsubscribe?hl=en.
To unsubscribe from this group and all its topics, send an email to clojure-dev...@googlegroups.com.
To post to this group, send email to cloju...@googlegroups.com.
Visit this group at http://groups.google.com/group/clojure-dev?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

David Nolen

unread,
Jun 14, 2013, 2:46:44 AM6/14/13
to cloju...@googlegroups.com
It doesn't make sense to me that just because some object supports ILookup that they should work with `keys`.

(defn keys
  "Returns a sequence of the map's keys."
  {:added "1.0"
   :static true}
  [map] (. clojure.lang.RT (keys map)))

static public ISeq keys(Object coll){
return APersistentMap.KeySeq.create(seq(coll));
}

Far as I can tell `keys` is meant to work only with things which are APersistentMap.

David


--
You received this message because you are subscribed to the Google Groups "Clojure Dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to clojure-dev...@googlegroups.com.

Mikera

unread,
Jun 15, 2013, 6:41:43 AM6/15/13
to cloju...@googlegroups.com
On Friday, 14 June 2013 07:46:44 UTC+1, David Nolen wrote:
It doesn't make sense to me that just because some object supports ILookup that they should work with `keys`.

(defn keys
  "Returns a sequence of the map's keys."
  {:added "1.0"
   :static true}
  [map] (. clojure.lang.RT (keys map)))

static public ISeq keys(Object coll){
return APersistentMap.KeySeq.create(seq(coll));
}

Far as I can tell `keys` is meant to work only with things which are APersistentMap.


It currently works with anything that implements the host interface java.util.Map, since seq converts these to a sequence of map entries:

(let [foo (java.util.HashMap. {:a 1})]
     (keys foo))
=> (:a) 

I believe this is intentional, i.e. keys is supposed to work with any map that is an instance of java.util.Map (which includes APersistentMap of course). I'd expect the equivalent rule to apply for host-defined maps on non-JVM versions of Clojure.

It also currently works with arbitrary sequences of map entries. Since this is not covered in the docstring, I'm guessing this is *not* intentional and is just a by-product of the implementation. Maybe someone is relying on this for something.... it allows stuff like (keys (filter some-function some-map)). Still, it is a bad idea to rely on things that only work because of undocumented implementation quirks and/or weak typing discipline.

This kind of stuff will certainly make your GSoC work with Ambrose on core.typed much more "fun" :-)

My pragmatic suggestion is that we attempt to define and document precisely what abstraction each function is intended to operate on. That will eliminate ambiguity and hopefully reduce nasty surprises in the future. Perhaps core.typed can help with formalising this?
 

Brandon Bloom

unread,
Jun 16, 2013, 10:24:26 PM6/16/13
to cloju...@googlegroups.com
It currently works with anything that implements the host interface java.util.Map, since seq converts these to a sequence of map entries

That's probably why it works the way it does. Especially since this was prior to protocols. There was just already a manual if/else chain for clojure.core/seq, so it was the shortest path to the polymorphic goal.

> It also currently works with arbitrary sequences of map entries.
> ...snip...
> I'm guessing this is *not* intentional

That's also my guess. As you can likely tell from my surprise above.

> Still, it is a bad idea to rely on things that only work because of undocumented implementation quirks and/or weak typing discipline.

I agree completely. However I expect your (keys (filter some-function some-map)) example, or something analogous, is not uncommon in the wild.

Ambrose Bonnaire-Sergeant

unread,
Jun 17, 2013, 3:08:30 AM6/17/13
to cloju...@googlegroups.com
FWIW this is the type I would give keys in core.typed.

(ann clojure.core/keys
     (All [x]
          [(U nil (IPersistentMap x Any)) -> (U nil (I (CountRange 1) (Seqable x)))
           :id 0 :path [Keys]]))

It can take a possibly-nil map, and returns nil, or a non-empty sequence of keys.

The :id and :path is so we can infer better types via every?.

eg. 
(let [mymap (foo)]
  ;; here mymap is (IPersistentMap Any Any)
  (when (every? symbol? (keys mymap))
    ;; here mymap is (IPersistentMap Symbol Any)
    ...))

This doesn't work atm, scheduled work for GSoC.

Thanks,
Ambrose

 

--
You received this message because you are subscribed to the Google Groups "Clojure Dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to clojure-dev...@googlegroups.com.
To post to this group, send email to cloju...@googlegroups.com.
Reply all
Reply to author
Forward
0 new messages