Re: Map Keywords are functions, why not vector elements?

瀏覽次數:212 次
跳到第一則未讀訊息

Stephen Feyrer

未讀,
2017年11月13日 中午12:40:102017/11/13
收件者:clo...@googlegroups.com
Updating subject to make it make more sense, hopefully. 

On 13 November 2017 at 14:15, Alex Miller <al...@puredanger.com> wrote:
Regarding the title, this is incorrect. Map keys are not functions; keywords are functions that take an associative data structure and will look themselves up in it. So, the premise here is not even correct to start.

Usually we prefer to work by starting from a problem and consider alternatives, but as you state below, you don't have a problem in mind.


Normally, I don't offer possible solutions when I enquiring about a problem.  Maybe with forethought context and or constraints...

This is  more of a solution (based on a lack of understanding) without a problem.  If you like the problem I'm hoping to address is to reduce my lack of understanding.

 

On Monday, November 13, 2017 at 3:49:41 AM UTC-6, Stephen Feyrer wrote:
Hi there,

This is possibly a silly question (and may have been asked before but I couldn't find earlier examples) so here it goes.
 

Get the nth element of the vector
user=> (3 [1 "is" the :same])
:same


Note that you can do something similar now with:

([1 "is" the :same] 3)

This works for me :)
 

Whereas in the situation with maps, keywords (a specific common key type) are functions, here you are implying that anything could implicitly be invokable as a lookup function on an associative data structure. This doesn't make a lot of sense to me and would undoubtedly have performance effects. Maybe there is some case like `(map 1 [[:a 0] [:b 1] [:c 2]])` where this would make sense. But this doesn't make sense for all indexed data structures, just those indexed by longs. Layering invocability on longs would require special handling in the compiler (Keywords are a type that embeds this by implementing IFn naturally, so it's not a special case).

Get the nthiness of an element in the vector
user=> (::same [1 "is" the :same])
3

This is something completely different - a linear search for a match. Clojure has considered and rejected making linear searching easier multiple times. It's almost always a sign that you are using the wrong data structure in the first place and should be using a hashed data structure instead. There is no chance we would do something like this.

Perhaps, I'm not understanding vectors properly.  A vector is a set of ordered elements, I think that is right.  Then at some point you might want to get one of those elements picking it out by its index.  Equally, this is probably where I've lost the plot you might want to know where in a vector your element is residing.

Get the nthiness of an element in the vector
user=> (:1 [1 "is" the :same])
(0 2)

I don't even understand what is intended here.


You've already covered this above but I'll explain what I meant (just to be clear). 

user=> (nth [1 "is" the :same] 2)
1

And

user=> ([1 "is" the :same] 0)
1

Therefore, a lookup of value '1' would return the indices for both occurrences in a list.

 
Taking things further into the realms of really you should have stopped there!

Get the these element of the vector
user=> ((1 3) [1 "is" the :same])
("is" :same)

Just for fun I tried:
user=> ((:this :the) {:this "is" :the "same"})                                    
                                                                                 
NullPointerException   user/eval786 (NO_SOURCE_FILE:1)


The thought here was the same as the above only with a list of invocations to return a list of values.

 

I may be barking up the wrong tree or possibly just barking...  I hope what I'm asking makes some sort of sense. 

Not particularly.
 
By way of apology to the reader, when I began writing this question it was ill thought out and seemed a lot simpler.  As a disclaimer, I can't think of direct examples how this would improve readability or such or even necessarily be useful. 

Seems like that should have been a sign. :)

That was one yes, the other was the suggestion that this might be a silly question.  In hindsight, I think the question I'm looking at now maybe quite different to the one I thought I had when I started.

At this point I think I would ask, how do you find the indices of elements within a vector for a certain value?  Or perhaps, should you be needing to obtain the positional component of a value, what would be the correct data structure to use?

For example, you're given a list (in an unspecified form) of horses from a race in their finishing order.

Carrot
Colonel Mustard
Sugar
Binky
Mr. Ned
Go Faster Stripes
Stick
Scary Monster Not Technically A Horse
Binky

Then (using Clojure) you're asked to get the positional value of 'Go Faster Stripes' or 'Binky'?



--
Go forth and apply trigonometry in real world applications

Stephen.

Alex Miller

未讀,
2017年11月13日 下午3:34:162017/11/13
收件者:Clojure


On Monday, November 13, 2017 at 11:40:10 AM UTC-6, Stephen Feyrer wrote:
Updating subject to make it make more sense, hopefully. 

On 13 November 2017 at 14:15, Alex Miller <al...@puredanger.com> wrote:
Regarding the title, this is incorrect. Map keys are not functions; keywords are functions that take an associative data structure and will look themselves up in it. So, the premise here is not even correct to start.

Usually we prefer to work by starting from a problem and consider alternatives, but as you state below, you don't have a problem in mind.


Normally, I don't offer possible solutions when I enquiring about a problem.  Maybe with forethought context and or constraints...

This is  more of a solution (based on a lack of understanding) without a problem.  If you like the problem I'm hoping to address is to reduce my lack of understanding.

 

On Monday, November 13, 2017 at 3:49:41 AM UTC-6, Stephen Feyrer wrote:
Hi there,

This is possibly a silly question (and may have been asked before but I couldn't find earlier examples) so here it goes.
 

Get the nth element of the vector
user=> (3 [1 "is" the :same])
:same


Note that you can do something similar now with:

([1 "is" the :same] 3)

This works for me :)

The important thing here is that thing being invoked is the vector (which is a custom Clojure type that implements the IFn invocation interface). Putting arbitrary types in the operator position of a call is where the issues lie.
 
 

Whereas in the situation with maps, keywords (a specific common key type) are functions, here you are implying that anything could implicitly be invokable as a lookup function on an associative data structure. This doesn't make a lot of sense to me and would undoubtedly have performance effects. Maybe there is some case like `(map 1 [[:a 0] [:b 1] [:c 2]])` where this would make sense. But this doesn't make sense for all indexed data structures, just those indexed by longs. Layering invocability on longs would require special handling in the compiler (Keywords are a type that embeds this by implementing IFn naturally, so it's not a special case).

Get the nthiness of an element in the vector
user=> (::same [1 "is" the :same])
3

This is something completely different - a linear search for a match. Clojure has considered and rejected making linear searching easier multiple times. It's almost always a sign that you are using the wrong data structure in the first place and should be using a hashed data structure instead. There is no chance we would do something like this.

Perhaps, I'm not understanding vectors properly.  A vector is a set of ordered elements, I think that is right. 

More importantly, vectors are indexed (you can look up an element by index) and this property is extended to make vectors associative where you associate a key (the index) with the value (the element at the index). This operation is fast (what we call "effectively constant" time as it's O(log32(n)) which is a function that grows very slowly compared to O(n)).
 
Then at some point you might want to get one of those elements picking it out by its index.  Equally, this is probably where I've lost the plot you might want to know where in a vector your element is residing.

These two sentences are talking about two different kinds of operations, but I think you believe they are the same kind of operation. The first sentence is a fast (effectively constant time) lookup by key (the index). The second sentence is a slow (linear time) search through the vector. Generally, if your program is doing a lot of the latter, you should change it so it does more of the former. :)
 

Get the nthiness of an element in the vector
user=> (:1 [1 "is" the :same])
(0 2)

I don't even understand what is intended here.


You've already covered this above but I'll explain what I meant (just to be clear). 

user=> (nth [1 "is" the :same] 2)
1

And

user=> ([1 "is" the :same] 0)
1

Therefore, a lookup of value '1' would return the indices for both occurrences in a list.

Both cases here are the fast lookup operation - both of these can go pluck out the i-th element without examining any other elements, just like you were instead doing:

(nth {0 1, 1 "is", 2 the, 3 :same} 0) or 
({0 1, 1 "is", 2 the, 3 :same} 0)
 

 
Taking things further into the realms of really you should have stopped there!

Get the these element of the vector
user=> ((1 3) [1 "is" the :same])
("is" :same)

Just for fun I tried:
user=> ((:this :the) {:this "is" :the "same"})                                    
                                                                                 
NullPointerException   user/eval786 (NO_SOURCE_FILE:1)


The thought here was the same as the above only with a list of invocations to return a list of values.

 

I may be barking up the wrong tree or possibly just barking...  I hope what I'm asking makes some sort of sense. 

Not particularly.
 
By way of apology to the reader, when I began writing this question it was ill thought out and seemed a lot simpler.  As a disclaimer, I can't think of direct examples how this would improve readability or such or even necessarily be useful. 

Seems like that should have been a sign. :)

That was one yes, the other was the suggestion that this might be a silly question.  In hindsight, I think the question I'm looking at now maybe quite different to the one I thought I had when I started.

At this point I think I would ask, how do you find the indices of elements within a vector for a certain value? 

In general, this requires examining each element till you find them (a linear-time search).
 
Or perhaps, should you be needing to obtain the positional component of a value, what would be the correct data structure to use?

This is a better question. :) I'd recommending using a data structure that allows you to do a constant time lookup, and avoid linear searches if possible. That is usually a hashed data structure like a map or set (or occasionally a vector if the values happen to be index-like).
 
For example, you're given a list (in an unspecified form) of horses from a race in their finishing order.

Carrot
Colonel Mustard
Sugar
Binky
Mr. Ned
Go Faster Stripes
Stick
Scary Monster Not Technically A Horse
Binky

Then (using Clojure) you're asked to get the positional value of 'Go Faster Stripes' or 'Binky'?

If you don't have control over the input, then you will need to walk in a linear search, either leveraging the natural indexes (probably with an explicit loop/recur), or using your own as you go (using something like keep-indexed). Or if checking for any match at all, something like (some #{"Go Faster Stripes" "Binky"} finishes).

But really, I'd endeavor to instead take as input a map like this:

{"Carrot" 1, "Colonel Mustard" 2, ...} and then just look up the answer without needing to examine every horse. Without more context on use, it's hard to suggest precise replacements.


Didier

未讀,
2017年11月13日 下午5:09:052017/11/13
收件者:Clojure
Yo are looking for indexOf (.indexOf vector value).

(.indexOf ["a" "b"] "b")
=> 1

(.indexOf ["a" "b" "b"] "b")
=> 1

Note how indexOf searches for the index of the first value which matches value.

To do what you ask, is a query over a vector, which requires a search on each element. This will take O(N) time. For a small list like in your example its probably good enough and not an issue.

If you want the name of the horse in a given position, that's a key lookup, which is ~O(1) time. You can use get:

(get ["a" "b"] 1)
=> "b"

If you really needed performance, you would need to combine a LinkedList and a map. Some datastructures do it for you under the hood, like Apache LinkedMap, amalloy ordered, java LinkedHashMap, etc.

Its possible to also just use sorted-map-by in a closure. But this only works if you're not going to add/update things to the datastructure after first creation.

See the example on clojuredocs: https://clojuredocs.org/clojure.core/sorted-map-by#example-542692d5c026201cdc327094

Stephen Feyrer

未讀,
2017年11月13日 晚上8:19:532017/11/13
收件者:clo...@googlegroups.com
Hi Alex, Didier,

Thanks for your patience.

That covers everything which I can think of and a fair bit more :)  I have a bit of reading and thinking to do now.

Again, thank you.


--
Rule of thumb simple question, complicated answer

Stephen.


--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clo...@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to
clojure+unsubscribe@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
---
You received this message because you are subscribed to the Google Groups "Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Sean Corfield

未讀,
2017年11月14日 凌晨12:22:212017/11/14
收件者:clo...@googlegroups.com

I don’t think anyone addressed your question about finding all the indices in a vector where the element matches a given search value?

 

There are a number of ways to tackle that (given it’s going to be a linear search). Since you want the indices, you are either going to need to track them directly or use map-indexed to produce them automatically.

 

(defn indices [x l] ; produces a vector

  (loop [i 0 l l r []]

    (if (seq l)

      (recur (inc i) (rest l) (if (= x (first l)) (conj r i) r))

      r)))

 

(defn indices [x l] ; produces a lazy sequence

  (keep identity (map-indexed (fn [i v] (when (= x v) i)) l)))

 

(defn indices [x l] ; produces a vector

  (transduce (comp (map-indexed vector)

                   (filter (comp (partial = x) second))

                   (map first))

             conj

             []

             l))

 

Hopefully that’ll give you some options to think about…

 

Sean Corfield -- (970) FOR-SEAN -- (904) 302-SEAN
An Architect's View -- http://corfield.org/

"If you're not annoying somebody, you're not really alive."
-- Margaret Atwood

 


From: clo...@googlegroups.com <clo...@googlegroups.com> on behalf of Stephen Feyrer <stephen...@gmail.com>
Sent: Monday, November 13, 2017 5:19:32 PM
To: clo...@googlegroups.com
Subject: Re: Map Keywords are functions, why not vector elements?
 

For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
---
You received this message because you are subscribed to the Google Groups "Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email to clojure+u...@googlegroups.com.

Gary Verhaegen

未讀,
2017年11月14日 清晨5:00:152017/11/14
收件者:clo...@googlegroups.com
If you're going to call that for multiple elements of the same vector, it's worth thinking about doing some work upfront so that each look-up is faster:

(defn indices [vect]
  (->> vect
       (map-indexed vector)
       (reduce (fn [acc [idx el]]
                 (update acc el (fnil conj []) idx))
               {})))

(indices [1 "is" 1 :same])
{1 [0 2], "is" [1], :same [3]}

would walk through the vector only once and create a map of value -> list of indices.

Stephen Feyrer

未讀,
2017年11月19日 下午5:35:372017/11/19
收件者:clo...@googlegroups.com
Hi,

A quick thank you to both Gary and Sean and everyone!  I will come back to this once I have achieved a deeper understanding.

Apologies for the delayed response.


--
Kind regards

Stephen.
回覆所有人
回覆作者
轉寄
0 則新訊息