Proposed Additions to contrib: proj


Sean Devlin

Aug 16, 2009, 10:56:59 PM
to Clojure Dev
Hello everyone,
I work with lists of hash-maps constantly. As such, I have a
series of proposals I would like to add to contrib, based on my
efforts. They are mostly related to data wrangling, and I have found
that they work together nicely. At a high level, the proposals are:

* add the proj function, a utility to apply functions in parallel.
* some additions to map-utils, in order to make manipulating
hash-maps easier
* a new library, pred-utils, for creating complex predicates
* a new library, table-utils, for manipulating lists of hash-maps.
This includes outer (left, right, full) and inner (equi, natural,
cross) join operations, as well as pivoting operations (inspired by
Excel).

I would ask the group for patience as I present the proposals over the
next week. With that in mind, I'd like to present my first proposed
addition to clojure.contrib.core, the proj function.

* proj & comp *

In order to understand proj(ection), I'd like to first talk about comp
(osition). Comp can be defined in terms of reduce like so:

(defn my-comp [& fns]
  (fn [args]
    (reduce
      (fn [accum f] (f accum))
      (conj (reverse (seq fns)) args))))
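
As a quick sanity check, the sketch below compares my-comp with the built-in comp on a single-argument call. The definition is repeated so the snippet runs on its own; the names via-my-comp and via-comp are only for illustration.

```clojure
;; my-comp as defined above, repeated so this snippet is self-contained.
(defn my-comp [& fns]
  (fn [arg]
    (reduce (fn [accum f] (f accum))
            (conj (reverse (seq fns)) arg))))

;; Both apply inc first, then str.
(def via-my-comp ((my-comp str inc) 1))   ; "2"
(def via-comp    ((comp str inc) 1))      ; "2"

;; One difference: the fn returned by comp passes all of its arguments
;; to the rightmost function, so ((comp str +) 1 2) works, while the fn
;; returned by my-comp accepts exactly one argument.
```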

Granted, this isn't 100% equivalent to the Clojure comp function, but
it is very, very close. What it demonstrates is that comp applies a
list of functions in series using reduce. After writing Clojure for a
while, I found frequent need to apply a list of functions in parallel
using map. proj can be defined as follows

(defn proj [& fns]
  (fn [arg] (map #(% arg) fns)))

Notice that proj creates a closure. Initially, I used this as a way
to access values from a map.

user=>(def test-map {:a "1" :b "2" :c "3" :d "4"})

user=>((proj :a :c) test-map)
("1" "3")

However, as I used proj more and more, I found it to be a useful way
to perform many operations on a map at once. For example:

;assume parse-int turns a string to an int appropriately
user=>((proj :a (comp parse-int :c)) test-map)
("1" 3)
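
For anyone who wants to run this outside a REPL session, here is a minimal self-contained sketch. The parse-int below is a hypothetical stand-in built on Integer/parseInt; the post only assumes that some such helper exists.

```clojure
;; Hypothetical helper: the original post does not define parse-int.
(defn parse-int [s] (Integer/parseInt s))

(defn proj [& fns]
  (fn [arg] (map #(% arg) fns)))

(def test-map {:a "1" :b "2" :c "3" :d "4"})

;; :a is looked up as-is; :c is looked up and then parsed.
(def result ((proj :a (comp parse-int :c)) test-map))
;; result is ("1" 3)
```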

Since proj returns a closure, it is very useful in any place I would
use a map operation as well. For example, this made turning a list of
maps into a list of lists very easy. Also, this made it very easy to
determine if a sub-selection of one hash-map is equal to the same
sub-selection of another:

user=>(def test-proj (proj :a :c))

user=>(= (test-proj {:a 1 :b 2 :c 3}) (test-proj {:a 1 :b 34 :c 3}))
true
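
The other use mentioned above, turning a list of maps into a list of lists, might look like the following sketch. The rows data is made up for illustration.

```clojure
(defn proj [& fns]
  (fn [arg] (map #(% arg) fns)))

;; Hypothetical sample data: a "table" as a vector of hash-maps.
(def rows [{:a 1 :b 2 :c 3}
           {:a 4 :b 5 :c 6}])

;; Each map becomes a list holding its :a and :c values, in that order.
(def columns (map (proj :a :c) rows))
;; columns is ((1 3) (4 6))
```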

One thing that is very interesting is that this function allows me to
simulate the behavior of let in a point-free style. This is something
I am still experimenting with.

;This is deliberate overkill for a small example
;Generate a list of squares
;Notice that the proj fn uses the range twice
user=>((partial map (proj identity #(* % %)))
        (range 1 6))
((1 1) (2 4) (3 9) (4 16) (5 25))

I suspect that there are other uses for this type of function out
there that others can see more clearly than me. If this is something
others would be interested in, I'd gladly submit a patch.

For those of you that would like to further evaluate the proposal, you
can find the code in my library here:
http://github.com/francoisdevlin/devlinsf-clojure-utils/

Look in the lib.devlinsf.core namespace

In a few days I'll post some proposed additions to the map-utils
library. For now, I thank you for your time.
Sean Devlin

Timothy Pratley

Aug 18, 2009, 2:56:07 AM
to Clojure Dev
Hi Sean,

Your proj is really about collecting different function results for a
single input. As such I feel a little uneasy about its name.
Projection is to me about selecting (this is likely just my
preconception/misinterpretation). A fixed point alternative might be
(mapval val f1 f2 ...) but this reduces its usefulness. Another
alternative: (multifn f1 f2) but that would confuse with multimethod.

Leaving aside the name for a moment, it seems to me there is already
support for collecting different function results on a value:
user=> (for [x (range 1 6)] (hash-map :id x :sqr (* x x)))
({:sqr 1, :id 1} {:sqr 4, :id 2} {:sqr 9, :id 3} {:sqr 16, :id 4}
{:sqr 25, :id 5})
But the key benefit of proj is the point-free usage.
Hmmm how about uncomp (uncomposed function calls)... arrrrhhhiiii
maybe proj is ok :)


Regards,
Tim.

Sean Devlin

Aug 18, 2009, 6:27:25 AM
to Clojure Dev
Yeah, I'm uneasy about the name too. My first use was in view code,
"projecting" certain keys in a hashmap to an array. It later grew
into the point-free tool.

The only other name I came up with was "tap", as in wiretap (think
electrical circuits or LabVIEW if you've ever used it).

Sean

Jarkko Oranen

Aug 18, 2009, 6:30:36 AM
to Clojure Dev


Timothy Pratley wrote:
> maybe proj is ok :)

I don't like it. Seems like another of those unnecessarily shortened
names :/ Besides, clojure.set already contains a "project" function,
though it's not exactly like this one.

Maybe something like fn-tuple? what the resulting function essentially
does is collect the results of all the component functions in a tuple,
so that would describe it nicely.

--
Jarkko

Jarkko Oranen

Aug 18, 2009, 8:00:43 AM
to Clojure Dev
I forgot one suggestion from my earlier post.

I think the function could be made more general if it used apply
internally, like so:

(defn fn-tuple [& fns]
  (fn [& args] (map #(apply % args) fns)))

Like this, it would work with multiple arguments:

((fn-tuple + - * /) 10 2)

or with map:
(map (fn-tuple * +) [1 2 3] [4 5 6])

and of course, the original use case:

((fn-tuple :a :c) {:a 1 :b 2 :c 3})
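
For reference, a self-contained sketch of what those three calls evaluate to. The def names are only for illustration.

```clojure
(defn fn-tuple [& fns]
  (fn [& args] (map #(apply % args) fns)))

;; Multiple arguments: each function receives both 10 and 2.
(def arithmetic ((fn-tuple + - * /) 10 2))
;; arithmetic is (12 8 20 5)

;; With map over two collections: each pair yields (product sum).
(def pairwise (map (fn-tuple * +) [1 2 3] [4 5 6]))
;; pairwise is ((4 5) (10 7) (18 9))

;; The original single-argument use case still works.
(def lookup ((fn-tuple :a :c) {:a 1 :b 2 :c 3}))
;; lookup is (1 3)
```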

--
Jarkko

Sean Devlin

Aug 19, 2009, 5:27:47 PM
to Clojure Dev
1. None of us like the name. I could go with fn-tuple instead, but
I'd like to keep the discussion open for a better name.

2. I tried a few experiments at the REPL with the apply modification.
It turns out that using apply breaks keyword access

user=> (def abc123 {:a 1 :b 2 :c 3})
#'user/abc123
user=> (apply :a abc123)
java.lang.IllegalArgumentException: Wrong number of args passed to
keyword: :a (NO_SOURCE_FILE:0)
user=> (map #(apply % abc123) [:a :c])
java.lang.IllegalArgumentException: Wrong number of args passed to
keyword: :a

Also, I did some investigating with comp, and it doesn't appear to use
an implicit apply

user=> ((comp str identity) [1 2 3])
"[1 2 3]"
user=> ((comp (partial apply str) identity) [1 2 3])
"123"

Based on this, I think it would be better if the definition stayed the
same.

Any other thoughts?

Sean

Timothy Pratley

Aug 19, 2009, 6:52:25 PM
to Clojure Dev
Just FYI Jarkko's version works fine for me.
Your apply examples are apples to his orange, as in his function args
is a seq (& args -> [arg1 arg2...])
but in your REPL you are using the hashmap itself not [hashmap].
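
The difference can be sketched as follows. The def names and the :wrong-number-of-args marker are only for illustration.

```clojure
(def abc123 {:a 1 :b 2 :c 3})

;; Wrapping the map in a vector passes it as one argument: (:a abc123).
(def wrapped (apply :a [abc123]))
;; wrapped is 1

;; Passing the map itself makes apply spread its three entries as
;; separate arguments, i.e. (:a [:a 1] [:b 2] [:c 3]), and a keyword
;; accepts at most two arguments.
(def spread
  (try
    (apply :a abc123)
    (catch IllegalArgumentException _ :wrong-number-of-args)))
;; spread is :wrong-number-of-args
```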

Sean Devlin

Aug 19, 2009, 7:54:44 PM
to Clojure Dev
Hmm, you're right. I think my bug is in the mapping fn.

I wrote

#(fn[arg] (apply % arg))

And Jarkko's version is

#(fn[& args] (apply % args))

As such, I'm inclined to go with the apply version.

Good catch Jarkko.

Sean

Rich Hickey

Aug 29, 2009, 1:59:48 PM
to Clojure Dev
I've added a version of this to core, called juxt for now (short for
juxtapose, a la comp[ose]). It returns vectors.

Other possible names are:

conjf/adjoin/abut/allfn/appose
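
A short sketch of juxt as it exists in clojure.core, using the examples from earlier in the thread. The def names are only for illustration.

```clojure
;; Same use as (proj :a :c), but the result is a vector.
(def selected ((juxt :a :c) {:a 1 :b 2 :c 3}))
;; selected is [1 3]

;; juxt also accepts multiple arguments, like the apply-based version.
(def arithmetic ((juxt + - * /) 10 2))
;; arithmetic is [12 8 20 5]
```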

Rich

Sean Devlin

Aug 29, 2009, 9:53:45 PM
to Clojure Dev
1. I like juxt(apose). It's short, sounds the same, and describes
exactly what this function does.
2. I like returning a vector, it allows the result to be comparable.
This is important for interaction with group-by, which I just
discovered last night.
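
A rough sketch of the group-by interaction, using made-up rows and the group-by that ships in current clojure.core:

```clojure
;; Hypothetical sample data.
(def rows [{:a 1 :b 2  :c 3}
           {:a 1 :b 34 :c 3}
           {:a 2 :b 5  :c 3}])

;; Rows are grouped under the vector of their :a and :c values.
(def grouped (group-by (juxt :a :c) rows))
;; grouped is {[1 3] [{:a 1 :b 2 :c 3} {:a 1 :b 34 :c 3}]
;;             [2 3] [{:a 2 :b 5 :c 3}]}

;; Vectors of equal length compare element by element, so the keys
;; can also be ordered.
(def key-order (compare [1 3] [2 3]))
;; key-order is negative: [1 3] sorts before [2 3]
```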

Sean