Pattern matching macro

77 views
Skip to first unread message

James Reeves

unread,
May 14, 2008, 8:40:47 PM5/14/08
to Clojure
Hi folks,

A recent discussion prompted me to try my hand at creating a pattern
matching macro. It's only 32 lines, so I'll include the source at the
end of this message. It's 1am here, so there might be some bugs, but
hopefully it's sufficiently complete to be of interest.

I guess it's easier to show than tell, so here's a few gratuitous
examples:

(match number
1 "one"
2 "two"
n (str "many (" n ")"))

(match person
{:name n, :email e, :job "programmer"}
(str n " <" e ">"))

(match args
[x] "One argument only"
[x 'true] "Second argument is boolean true"
[x 'false] "Second argument is boolean false")

It can handle vectors, hash maps and lists, and supports variable
destructuring; so I consider it to be a pretty flexible system. I
haven't tested performance yet, so if anyone wants to benchmark it,
I'd be interested in knowing the results.

I guess I've rambled on enough. Here's the code:

(defn has-match [matches]
(if (some (partial = :no-match) matches)
:no-match
(apply concat matches)))

(def matches?)

(defn match-map [x-map y-map]
(has-match
(map (fn [[k v]]
(if-let y (y-map k)
(matches? v y)
:no-match))
x-map)))

(defn match-seq [xs ys]
(if (= (count xs) (count ys))
(has-match (map matches? xs ys))
:no-match))

(defn matches? [x y]
(let [both #(and (% x) (% y))]
(cond
(= x `'~y) '()
(both seq?) (match-seq x y)
(both vector?) (match-seq x y)
(both map?) (match-map x y)
(symbol? x) (list x y)
(= x y) '()
true :no-match)))

(defmacro match [value & clauses]
(when clauses
`(let [~'derefs (matches? '~(first clauses) ~value)]
(if (not= ~'derefs :no-match)
(eval (list 'let (apply vector ~'derefs)
'~(second clauses)))
(match ~value ~@(rrest clauses))))))

--
James Reeves

Randall R Schulz

unread,
May 14, 2008, 9:08:51 PM5/14/08
to clo...@googlegroups.com
On Wednesday 14 May 2008 17:40, James Reeves wrote:
> Hi folks,
>
> A recent discussion prompted me to try my hand at creating a pattern
> matching macro. It's only 32 lines, so I'll include the source at the
> end of this message. It's 1am here, so there might be some bugs, but
> hopefully it's sufficiently complete to be of interest.
>
> I guess it's easier to show than tell, so here's a few gratuitous
> examples:

My question is this: What is the fundamental recognizing power of your
matcher?

I would dearly love to have, and lacking it, will probably write myself,
is a tree-regular expression matcher.


> ...
> --
> James Reeves


Randall Schulz

James Reeves

unread,
May 15, 2008, 5:01:25 AM5/15/08
to Clojure
> My question is this: What is the fundamental recognizing power of your
> matcher?

I'm not quite sure what you mean by that, sorry :)

But, perhaps some further explanation is in order. This is a pattern
matching system similar to that found in other functional languages,
like Haskell or Scala. It tests a value against a series of patterns,
and then evaluates the associated action. The syntax is:

(match <value> <pattern> <action> <pattern> <action> ...)

Symbols, quotes, vectors, sequences and hash-maps are treated
specially. Everything else is matched literally. For example:

user=> (defn foo [x] (match x 1 "one" 2 "two"))
#<Var: user/foo>
user=> (foo 1)
"one"
user=> (foo 2)
"two"
user=> (foo 3)
nil

If you put in a symbol, that matches any value, and binds the value to
the symbol:

user=> (defn foo [x] (match x 1 "one" 2 "two" n (str "many (" n ")")))
#<Var: user/foo>
user=> (foo 3)
"many (3)"
user=> (foo 4)
"many (4)"
user=> (foo 2)
"two"

Vectors and sequences are destructured in a way similar to normal
Clojure functions:

user=> (defn foo [& args] (match args (a) "one arg" (a b) "two args"))
#<Var: user/foo>
user=> (foo 1)
"one arg"
user=> (foo 1 2)
"two args"

user=> (defn foo [v] (match v [a] "one item" [a b] "two
items"))
#<Var: user/foo>
user=> (foo [1])
"one item"
user=> (foo [1 2])
"two items"

But you can also put in literal values inside sequences and vectors,
too:

user=> (defn foo [v] (match v [1] "one item" [1 b] "two items"))
#<Var: user/foo>
user=> (foo [1 2])
"two items"
user=> (foo [2 2])
nil

Hash maps work a little differently, as you don't have to specify the
full hash to get a match, and keys are always matched literally.

user=> (foo {:name "Fred", :job "programmer", :age
30})
"Fred is a coder"
user=> (foo {:name "Joe", :job "cook"})
"Joe isn't a coder"

I'm not sure how good this explanation was, but hopefully my lack of
clarity should be compensated by the example code. Can never have too
many examples :)

--
James Reeves

Randall R Schulz

unread,
May 15, 2008, 1:02:34 PM5/15/08
to clo...@googlegroups.com
On Thursday 15 May 2008 02:01, James Reeves wrote:
> > My question is this: What is the fundamental recognizing power of
> > your matcher?
>
> I'm not quite sure what you mean by that, sorry :)
>
> But, perhaps some further explanation is in order. This is a pattern
> matching system similar to that found in other functional languages,
> like Haskell or Scala. It tests a value against a series of patterns,
> and then evaluates the associated action. ...

The term "matching" / "matcher" covers a lot of territory, doesn't it?
For example, is it equivalent to a regular language recognizer? A
regular tree recognizer? A context-free grammar? Is it a unifier?

Unfortunately, I don't know anything about Haskell and very little about
Scala. I understood Scala's pattern matching to be based on algebraic
types. Is that true? Is this the same? Does your matcher handle
arbitrarily nested constructs, or only flat / linear ones? All your
examples seemed to be flat.


> ...


>
> I'm not sure how good this explanation was, but hopefully my lack of
> clarity should be compensated by the example code. Can never have too
> many examples :)

Actually, I disagree. Good software deserves good and reasonably formal
documentation and characterization. It is my least favorite aspect of
the open-source movement that while the software is often quite good,
the documentation is often quite poor.

Nonetheless, I don't mean to offend, and I certainly don't consider you
obligated to share my views or meet my expectations, but I am curious,
especially in this case, because I have pattern-matching needs, too.

My current feeling is that regular tree expressions are the best match
for some of my requirements. I probably also need a unifier, but for
that I intend to port some Scheme code I found. It's non-trivial enough
to bring me up-to-speed on Clojure (and Scheme, with which I've
scarcely worked before).

James Reeves

unread,
May 15, 2008, 2:35:17 PM5/15/08
to Clojure
> The term "matching" / "matcher" covers a lot of territory, doesn't it?
> For example, is it equivalent to a regular language recognizer? A
> regular tree recognizer? A context-free grammar? Is it a unifier?

It *might* be a regular tree recogniser. It was created to mimic
pattern matching in languages like Scala, OCaml and Haskell, without
any particular grammar structure in mind.

> Unfortunately, I don't know anything about Haskell and very little about
> Scala. I understood Scala's pattern matching to be based on algebraic
> types. Is that true?

As far as I understand, yes. Haskell's certainly is. Clojure doesn't
have algebraic types, but the match macro is somewhat inspired by
them. You could probably mimic algebraic type matching with hash maps.

> Is this the same? Does your matcher handle arbitrarily nested
> constructs, or only flat / linear ones? All your examples seemed to be
> flat.

Vectors, sequences and maps can be arbitrarily nested. Other objects
are just matched via equality.

> My current feeling is that regular tree expressions are the best match
> for some of my requirements. I probably also need a unifier, but for
> that I intend to port some Scheme code I found. It's non-trivial enough
> to bring me up-to-speed on Clojure (and Scheme, with which I've
> scarcely worked before).

This is probably too simple for your needs, though if you give me an
explanation of what you want to do (use small words, speak slowly,
treat me like a dunce ;), I'll see if it can do what you want.

--
James Reeves

James Reeves

unread,
May 15, 2008, 8:58:40 PM5/15/08
to Clojure
I've put the pattern matching code onto github, and added support for
the [x y & z] form.

http://github.com/weavejester/cljmatch/tree/master

I won't spam up the group any further about it, except to answer
questions :)

--
James

Randall R Schulz

unread,
May 15, 2008, 9:24:52 PM5/15/08
to clo...@googlegroups.com
Excess Verbiage Alert!

I apologize for the volume of this post. I spent a lot of time getting
it minimal while complete. The problem is just not simple (as far as I
can tell).

If you're interested in pattern matching, tree-regular expressions,
unification or theorem proving, read on. If this bores you, you can
probably just skip it.

RRS


On Thursday 15 May 2008 11:35, James Reeves wrote:
> ...


>
> This is probably too simple for your needs, though if you give me an
> explanation of what you want to do (use small words, speak slowly,
> treat me like a dunce ;), I'll see if it can do what you want.

I don't know if it's too simple, or just not quite the right thing (or
maybe even that it is _not_ the right thing!). This has been my
problem, for a while now: my lack of sufficient clarity on how to
resolve the current design deficiency in my system.

I'll be as brief as I can (!) in describing what I have and what I think
I need.

The application is a theorem prover. At its heart, it handles
first-order predicate calculus. The primary data structure, the
first-order formula, or FOF, is structurally a tree. FOFs are perfectly
well suited to S-Expression notation and the external, textual language
in which they're notated is very Lisp-like.

We get some trans-first-order capability by pattern-based rewriting of
certain classes of problems into equivalent forms that are strictly
first-order. Other rewriting is done to make problems more tractable.
The rewriting, in general, has both a syntactic / structural aspect and
optional non-syntactic applicability tests. Rewrite rules involve a
kind of "dissection" of the original FOF and the reconstruction of one
ore more replacement sub-proof FOFs.

Currently, all the rewriting rules are hand-written in Java and there's
a separate class that decides which rule to apply (based in part on the
rule's internal applicability criteria). This "strategy" code
recursively rewrites an original problem until no further rewriting is
possible and then submits the rewritten form (often multiple separate
sub-proofs) to the generic prover.

A key requirement is to open up the rewriting rules so one does not have
to muck around creating a lot of low-level Java to create new rules.
It has proven rather tedious, if unsurprisingly so, to write rules this
way.


So I need structural (template-like) pattern matching on trees along
with a sort of selective destructuring binding as a first step. Often
it will be necessary to match variable-arity sub-formulas (nodes with a
particular head and a varying / arbitrary number of children), and then
to be able to get a kind of "indexed" binding (a vector, say) to those
sub-formulas. This is where I suspect the tree-regular expression idea
might work best.

For example, a variable-arity conjunction (which has only structural
applicability criteria), such as:

Prove:
(and SubFormula1 SubFormula2 SuFormula3)

Must be able to be rewritten as:

Prove all:
SubFormula1
SubFormula2
SubFormula3


I'll provide other primitives that rule authors can use to express the
non-structural applicability tests following structure matching.

Coordinating multiple rules (when, say, more than one is applicable to a
given formula or when multiple rules can rewrite to each others' target
structure and hence produce non-terminating rewrites) need not be
addressed here.


That's the problem I'm facing and it's is one of the key things I'd like
to solve in Clojure, though there are other requirements I plan to
address using it, as well.


> --
> James Reeves


Any observations, insights or feedback you (or anyone) has is welcome.


Randall Schulz

Raoul Duke

unread,
May 15, 2008, 9:26:53 PM5/15/08
to clo...@googlegroups.com
dunno if this would be of any use / food for thought...

http://tom.loria.fr/

Randall R Schulz

unread,
May 15, 2008, 9:49:49 PM5/15/08
to clo...@googlegroups.com
On Thursday 15 May 2008 18:26, Raoul Duke wrote:
> dunno if this would be of any use / food for thought...
>
> http://tom.loria.fr/

Thanks.

I'm using Tom for some fixed transformations, but it is static and
requires both a type declaration (not itself a problem) _and_ a Java
source-code translation step.

The latter makes it a non-starter for user-supplied rewrite rules.


Randall Schulz

James Reeves

unread,
May 16, 2008, 1:16:32 PM5/16/08
to Clojure
On May 16, 2:24 am, Randall R Schulz <rsch...@sonic.net> wrote:
> We get some trans-first-order capability by pattern-based rewriting of
> certain classes of problems into equivalent forms that are strictly
> first-order. Other rewriting is done to make problems more tractable.  
> The rewriting, in general, has both a syntactic / structural aspect and
> optional non-syntactic applicability tests. Rewrite rules involve a
> kind of "dissection" of the original FOF and the reconstruction of one
> ore more replacement sub-proof FOFs.

This seems like it could easily get beyond what the match macro can
handle. It seems like what you need is more some manner of parser
generater for s-exps, though my expertise in formal grammar is rather
limited.

The match macro is more designed to be some useful syntax sugar that
combines destructuring let forms and value matching cond forms. Its
brevity limits its scope, but makes it a useful way of matching simple
patterns in a small amount of code.

--
James Reeves

Randall R Schulz

unread,
May 16, 2008, 3:58:46 PM5/16/08
to clo...@googlegroups.com
On Friday 16 May 2008 10:16, James Reeves wrote:
> On May 16, 2:24 am, Randall R Schulz <rsch...@sonic.net> wrote:
> > We get some trans-first-order capability by pattern-based rewriting
> > of certain classes of problems into equivalent forms that are
> > strictly first-order. Other rewriting is done to make problems more
> > tractable. The rewriting, in general, has both a syntactic /
> > structural aspect and optional non-syntactic applicability tests.
> > Rewrite rules involve a kind of "dissection" of the original FOF
> > and the reconstruction of one ore more replacement sub-proof FOFs.
>
> This seems like it could easily get beyond what the match macro can
> handle. It seems like what you need is more some manner of parser
> generater for s-exps, though my expertise in formal grammar is rather
> limited.

Yes. All the background study I've done suggests that regular tree
expressions are the best match to what I want to accomplish. I went
hunting for off-the-shelf implementations, but none of the
implementation projects I could find in the literature have released
their software.

In principal, I think the algorithms or code in James Clark's Jing RELAX
NG validator could be used, but I'm not sure whether it's more
efficient for me to decipher his source code to create an Clojure
S-Expression counterpart or to write it from scratch myself.


> The match macro is more designed to be some useful syntax sugar that
> combines destructuring let forms and value matching cond forms. Its
> brevity limits its scope, but makes it a useful way of matching
> simple patterns in a small amount of code.

It still sounds useful, even if not for every use case I have right.

Reply all
Reply to author
Forward
0 new messages