A re look at re-groups


Matt Smith

Aug 30, 2011, 11:18:46
to Clojure
I have been studying patterns or the notion of idiomatic code in
Clojure. The code in clojure.core has some good examples of proper
Clojure code and is well done. For that reason I was a bit surprised
at the definition for re-groups:

(defn re-groups [^java.util.regex.Matcher m]
  (let [gc (. m (groupCount))]
    (if (zero? gc)
      (. m (group))
      (loop [ret [] c 0]
        (if (<= c gc)
          (recur (conj ret (. m (group c))) (inc c))
          ret)))))

It seems like the loop/recur is non-idiomatic for this usage and could
be done with either a map or reduce. Here is my rewrite attempting to
use common patterns:

(defn re-groups [^java.util.regex.Matcher m]
  (let [gc (. m (groupCount))
        [entire & groups :as ret] (map #(.group m %) (range (inc gc)))]
    (if (seq groups)
      (apply vector ret)
      entire)))

The function is complicated by the fact that it either returns a
string or a vector depending on the number of groups. This might be
better handled with a multimethod:

(defmulti re-groups (fn [^java.util.regex.Matcher m] (.groupCount m)))
(defmethod re-groups 0 [m] (.group m))
(defmethod re-groups :default [m]
  (let [idxs (range (inc (.groupCount m)))]
    (reduce #(conj %1 (.group m %2)) [] idxs)))

The final implementation with multi-methods seems cleaner to me in
making it clearer what the intent of the code is while allowing the
disjoint functionality. Too bad the result of the defmulti filter is
not available to the methods.
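
The string-or-vector return described above is easy to see at the REPL. A minimal sketch, using `re-find` (which hands its matcher to `re-groups` when called with a pattern and a string); the var names `no-groups` and `with-groups` are only for illustration:

```clojure
;; No capturing groups: the whole match comes back as a plain string.
(def no-groups (re-find #"[0-9]+/[0-9]+" "3/4"))
;; => "3/4"

;; With capturing groups: a vector holding the whole match,
;; followed by the text of each group.
(def with-groups (re-find #"([0-9]+)/([0-9]+)" "3/4"))
;; => ["3/4" "3" "4"]
```

Any caller that does not control the pattern has to handle both shapes, which is what makes the function awkward to rewrite cleanly.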

Armando Blancas

Aug 30, 2011, 12:16:01
to Clojure
> It seems like the loop/recur is non-idiomatic for this usage and could
> be done with either a map or reduce.

It could be that it's faster that way. How does your code perform?

Matt Smith

Aug 30, 2011, 12:46:50
to Clojure

On Aug 30, 10:16 am, Armando Blancas <armando_blan...@yahoo.com>
wrote:
> > It seems like the loop/recur is non-idiomatic for this usage and could
> > be done with either a map or reduce.
>
> It could be that it's faster that way. How does your code perform?

Timings on this code for all three:
(time (count (map re-matches
                  (cycle [#"[-+]?[0-9]+/([0-9])+"
                          #"([-+]?[0-9]+)/([0-9])+"
                          #"[-+]?[0-9]+/[0-9]+"
                          #"hamster"])
                  (take 200000 (for [x (range) y (range x)] (str x "/" y))))))

clojure.core: "Elapsed time: 668.029589 msecs"
first rewrite: "Elapsed time: 887.169178 msecs"
multi-method: "Elapsed time: 2632.672379 msecs"

Which raises the question: Is writing "idiomatic Clojure" going to be
generally slower? Are there things that can be done to increase the
performance of the other two implementations?
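
One thing that may be worth checking first is reflection. The multimethod version only type-hints `m` in the dispatch function, so the `.group` and `.groupCount` calls inside the methods are likely resolved reflectively; setting `*warn-on-reflection*` to true at the REPL would confirm that. Below is a sketch of a `reduce`-based variant that keeps the hint in scope and coerces the index with `(int i)`. The name `re-groups-reduce` is made up for illustration, and no timing is claimed for it:

```clojure
;; Hypothetical name; not part of clojure.core.
(defn re-groups-reduce [^java.util.regex.Matcher m]
  (let [gc (.groupCount m)]
    (if (zero? gc)
      ;; No capturing groups: return the whole match as a string.
      (.group m)
      ;; Otherwise collect group 0 (the whole match) through group gc.
      ;; (int i) lets the compiler pick Matcher.group(int) directly.
      (reduce (fn [ret i] (conj ret (.group m (int i))))
              []
              (range (inc gc))))))
```

It has to be called on a matcher that has already matched, for example after `(.find m)`, just like the original.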

Ken Wesson

Aug 30, 2011, 13:03:33
to clo...@googlegroups.com
On Tue, Aug 30, 2011 at 11:18 AM, Matt Smith <m0s...@gmail.com> wrote:
> I have been studying patterns or the notion of idiomatic code in
> Clojure.  The code in clojure.core has some good examples of proper
> Clojure code and is well done.  For that reason I was a bit surprised
> at the definition for re-groups:
>
> (defn re-groups [^java.util.regex.Matcher m]
>    (let [gc  (. m (groupCount))]
>      (if (zero? gc)
>        (. m (group))
>        (loop [ret [] c 0]
>          (if (<= c gc)
>            (recur (conj ret (. m (group c))) (inc c))
>            ret)))))
>
> It seems like the loop/recur is non-idiomatic for this usage and could
> be done with either a map or reduce.

Speed is probably the consideration here, assuming this code appears
after the bulk of bootstrap.

> The final implementation with multi-methods seems cleaner to me in
> making it clearer what the intent of the code is while allowing the
> disjoint functionality.  Too bad the result of the defmulti filter is
> not available to the methods.

This suggests that the following might be a useful enhancement to
multimethods: if any particular method is given 1 more argument than
the dispatch function, the last argument will be filled with the
dispatch function's return value if that method is selected.
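
Short of changing multimethods themselves, a similar effect can be approximated by computing the dispatch value in an ordinary wrapper function and passing it along as an explicit argument. This is only a sketch of that workaround, not the enhancement itself; `groups-by-count` and `re-groups-mm` are hypothetical names:

```clojure
;; Hypothetical names throughout; a workaround sketch,
;; not an existing or proposed clojure.core API.

;; The dispatch function just returns the precomputed value.
(defmulti groups-by-count
  (fn [m gc] gc))

(defmethod groups-by-count 0
  [^java.util.regex.Matcher m _]
  (.group m))

(defmethod groups-by-count :default
  [^java.util.regex.Matcher m gc]
  ;; gc is the dispatch value, reused here instead of being recomputed.
  (vec (map #(.group m (int %)) (range (inc gc)))))

;; Ordinary wrapper: computes the dispatch value once and passes it on.
(defn re-groups-mm [^java.util.regex.Matcher m]
  (groups-by-count m (.groupCount m)))
```

The cost is an extra parameter on every method, which is what the proposed enhancement would hide.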

--
Protege: What is this seething mass of parentheses?!
Master: Your father's Lisp REPL. This is the language of a true
hacker. Not as clumsy or random as C++; a language for a more
civilized age.
