Simulate Python's multiple dispatch in Julia

281 views
Skip to first unread message

Michael Fox

unread,
Nov 26, 2013, 12:38:30 AM11/26/13
to julia...@googlegroups.com
I'm trying to port a Python library to Julia. It is the PEG parser generator, Parsimonious: https://github.com/erikrose/parsimonious

Parsimonious uses the Visitor Pattern. After a parse you have a node tree. Each node is an object with a name. Each visitor is a class with a collection of methods corresponding to the names.

For example, you might have nodes named "literal" and "quantifier". MyVisitor has methods MyVisitor.visit_literal(node) and MyVisitor.visit_quantifier(node).

The magic of Python is that when visiting the nodes, it's easy to dispatch the visitor methods using getattr(self, 'visit_' + node.name)

But, Julia seems to discourage this sort thing -- maybe can be done with some introspection or parse/eval methods that I haven't explored.

Enough backstory!

Here is what I tried and it seemed -- at-first -- to work:

  4 abstract Node
  5
  6 type NodeA <: Node
  7     children
  8 end
  9
 10 function NodeA()
 11     NodeA([])
 12 end
 13
 14 abstract Visitor
 15
 16 type MyVisitor <: Visitor end
 17
 18 function visit(v::MyVisitor, n::Node)
 19     println("hello node")
 20 end
 21
 22 function visit(v::MyVisitor, n::NodeA)
 23     println("hello A")
 24 end
 25
 26 a = NodeA()
 27
 28 visit(MyVisitor(), a)
 29
 30 typealias NodeB NodeA
 31
 32 function visit(v::MyVisitor, n::NodeB)
 33     println("hello B")
 34 end
 35
 36 b = NodeB()
 37
 38 visit(MyVisitor(), b)


Running it:

julia> reload("dispatch.jl")
Warning: replacing module Dispatch
hello A
hello B


Encouraging! But wait. Here is the same program in a slightly different order:

  4 abstract Node
  5
  6 type NodeA <: Node
  7     children
  8 end
  9
 10 function NodeA()
 11     NodeA([])
 12 end
 13
 14 typealias NodeB NodeA # <-- I moved this line
 15
 16 abstract Visitor
 17
 18 type MyVisitor <: Visitor end
 19
 20 function visit(v::MyVisitor, n::Node)
 21     println("hello node")
 22 end
 23
 24 function visit(v::MyVisitor, n::NodeA)
 25     println("hello A")
 26 end
 27
 28 function visit(v::MyVisitor, n::NodeB)
 29     println("hello B")
 30 end
 31
 32 a = NodeA() # <-- And this one
 33 b = NodeB()
 34
 35 visit(MyVisitor(), a) # <-- And finally this one
 36 visit(MyVisitor(), b)

Running this file:

julia> reload("dispatch2.jl")
Warning: replacing module Dispatch
hello B
hello B

Damn. That sucks. It looks like typealias doesn't do what I hope it would do which is give me the same type but have it dispatched differently. It's worse than that: It's non-deterministic or it's determined by some flaky unknown factors.

So it seems like if I want to use this method then for every node, someone will have to fill out a whole type definition for that node which is just a copy of the type definition for every node. It's a lot of boilerplate.

I would appreciate your suggestions.

Jameson Nash

unread,
Nov 26, 2013, 1:05:53 AM11/26/13
to julia...@googlegroups.com
typealiases are simply ways of renaming types for easier reference.
They have no other special properties, and if you print out the type,
you will typically find that it emits its underlying type. If you want
to dispatch differently, you will need a wrapper type `immutable
Wrapper; X::Node; end` or a parameterized type `type Node{name};
child; end`. these parameterized types are very powerful: there's a
whole section of the manual explaining how they work and how to use
them.

you are getting different answers because the original method call got
cached. this is an old issue and will eventually be fixed so that both
return the second answer. in the meantime it is advisable to try to
define all of your functions and types before calling and functions or
constructing any types.

Jeff Bezanson

unread,
Nov 26, 2013, 3:04:48 AM11/26/13
to julia...@googlegroups.com
No caching-related thing going on --- it's just that in the first case
the method was called before being overwritten by the "B" version.

This reminds me that we've been planning to add abstract types with
fields, so that many types can automatically inherit them.

Michael Fox

unread,
Nov 26, 2013, 12:54:10 PM11/26/13
to julia...@googlegroups.com
The wrapper works. Now the code looks like:

  8 abstract AbsNode
  9
 10 immutable NodeA <: AbsNode node::Node end
 11 function NodeA() NodeA(Node()) end
 12 immutable NodeB <: AbsNode node::Node end
 13 function NodeB() NodeB(Node()) end
 14
 15 # Not helpful:
 16 #function AbsNode()
 17 #    AbsNode(Node())
 18 #end
 19
 20 function Node() Node(Node[]) end
 21
 22 abstract Visitor
 23
 24 type MyVisitor <: Visitor end
 25
 26 function visit(v::MyVisitor, n::Node)
 27     println("hello node")
 28 end
 29
 30 function visit(v::MyVisitor, n::NodeA)
 31     println("hello A")
 32 end
 33
 34 function visit(v::MyVisitor, n::NodeB)
 35     println("hello B")
 36 end
 37
 38 a = NodeA(Node())
 39 b = NodeB(Node())
 40
 41 # Even though immutable, the list can still be push!ed
 42 push!(a.node.children, Node())
 43 println(a)
 44
 45 visit(MyVisitor(), a)
 46 visit(MyVisitor(), b)

And does:

julia> reload("dispatch3.jl")
Warning: replacing module Dispatch
NodeA(Node([Node([])]))
hello A
hello B

I don't see how Parametric Types can be helpful. I read the manual and felt like I was in one of those dreams where you're trying to read a snake before your grandma eats it but every time you look away the words change alphabet.


Stefan Karpinski

unread,
Nov 26, 2013, 12:55:53 PM11/26/13
to Julia Users
On Tue, Nov 26, 2013 at 12:54 PM, Michael Fox <415...@gmail.com> wrote:
I don't see how Parametric Types can be helpful. I read the manual and felt like I was in one of those dreams where you're trying to read a snake before your grandma eats it but every time you look away the words change alphabet.

I dunno, personally I like having C-compatible arrays of ints and floats.

Michael Fox

unread,
Nov 26, 2013, 1:09:20 PM11/26/13
to julia...@googlegroups.com
But I still have my original problem which is making a nice, user-friendly interface to the parsing library.

In the original Python you define the grammar as a string then parse the text which makes a tree. The nodes in the tree have names which fall out from the grammar. To process the tree you define a Visitor class with visit_nodename methods. It really couldn't be more slick:

grammar = Grammar(
    """
    bold_text  = bold_open text bold_close
    text       = ~"[A-Z 0-9]*"i
    bold_open  = "(("
    bold_close = "))"
    """)

class MyVisitor(NodeVisitor):
    def visit_bold_text(self, node, children):
        ...
    def visit_text(self, node, children):
        ...
    ...

tree = parse(text, grammar)
transformed = visit(tree, MyVisitor())

How to match such an interface in Julia?

The best (crapy) idea I have is in place of defining the Visitor class and methods, you define a series of functions and a dictionary to look them up. The dictionary step is decidedly not slick and I'd like to find a way to rid myself of it, possibly using the Type system, possibly using Meta magic.

function myvisit_bold_text(node, children) ... end
function myvisit_text(node, children) ... end
myvisitor = { "bold_text" => myvisit_bold_text, "text" => myvisit_text }

I think I'll go read Metaprogramming now.

Michael Fox

unread,
Nov 26, 2013, 1:11:02 PM11/26/13
to julia...@googlegroups.com
Not dissing the Parametric Type. Just saying I don't understand it or how it applies to my problem.

Stefan Karpinski

unread,
Nov 26, 2013, 1:19:52 PM11/26/13
to Julia Users
To me having visit_foo and visit_bar seems like a code smell. The fundamental verb is visit and the nouns are Foo objects and Bar objects. If you made visit a verb and defined different methods for different types of nodes it might alleviate your problems.

Stefan Karpinski

unread,
Nov 26, 2013, 1:25:23 PM11/26/13
to Julia Users
I see that's what you're doing in your Julia port so I'm a bit unclear on what the problem is. I'm not sure there's enough context here to give decent advice. Can you link to the original Python code? Also, even if pasting method names together in Python works, it seems awful. The fact that it's recommended or even standard is just bad. Even double dispatch would be better than that.

Michael Fox

unread,
Nov 26, 2013, 2:26:16 PM11/26/13
to julia...@googlegroups.com
Thank you for being so responsive here and in the issues and for probably being that guy who answers all the questions on IRC.

What I'd like to do is have the user write visitor methods like:

visit(visitor::MyVisitor, node::FooNode, other)

What I don't want to do is have the user define the type FooNode:

type FooNode <: Node
    children::Node
end
function FooNode() FooNode(Node[]) end


It's a lot of boilerplate and it's the parser generator's job to write the boilerplate for you. If this were an old-school Yacc-style parser generator, you'd simply have it write the boilerplate code and then you'd compile the generated code into your program. But it's a new-school on-the-fly parser generator where you only have a string representing the grammar at "compile time".

So, I guess the question is, how can I create new types -- which all have the same structure but different dispatch -- at runtime. And how can I write methods on those types before they are created?

The library I'm porting is Parimonious:

https://github.com/erikrose/parsimonious

Here's some simple examples of visitors and how they're used:

https://github.com/erikrose/parsimonious/blob/master/parsimonious/tests/test_nodes.py

And I'll paste one visitor here:

class HtmlFormatter(NodeVisitor):
    """Visitor that turns a parse tree into HTML fragments"""

    def visit_bold_open(self, node, visited_children):
        return '<b>'

    def visit_bold_close(self, node, visited_children):
        return '</b>'

    def visit_text(self, node, visited_children):
        """Return the text verbatim."""
        return node.text

    def visit_bold_text(self, node, visited_children):
        return ''.join(visited_children)

--

-
Michael

Michael Fox

unread,
Nov 26, 2013, 5:18:13 PM11/26/13
to julia...@googlegroups.com
Okay. So here's my godless anti-pattern for using strings to declare types. I think it just might work:

  4 abstract AbsNode
  5 immutable Node <: AbsNode children end
  6 function Node() Node(AbsNode[]) end
  7
  8 macro typenode(nodename)
  9     ex = esc((symbol(nodename)))
 10     quote
 11         immutable $ex <: AbsNode children end
 12         function $ex() $ex(AbsNode[]) end
 13     end
 14 end
 15
 16 @typenode "NodeA"
 17 @typenode "NodeB"
 18 @typenode "NodeX"
 19
 20 abstract Visitor
 21
 22 type MyVisitor <: Visitor end
 23
 24 function visit(v::MyVisitor, n::AbsNode)
 25     println("hello some subtype of node: ", n)
 26 end
 27
 28 function visit(v::MyVisitor, n::NodeA)
 29     println("hello A")
 30 end
 31

 32 function visit(v::MyVisitor, n::NodeB)
 33     println("hello B")
 34 end
 35
 36 a = NodeA()
 37 b = NodeB()
 38 c = Node()
 39 x = NodeX()

 40
 41 # Even though immutable, the list can still be push!ed
 42 push!(a.children, b)
 43 push!(a.children, c)
 44 push!(a.children, x)
 45 println("node a: ", a)
 46
 47 visit(MyVisitor(), a)
 48 visit(MyVisitor(), b)
 49 visit(MyVisitor(), c)
 50 visit(MyVisitor(), x)


Output:

julia> reload("dispatch3.jl")

Warning: replacing module Dispatch
node a: NodeA([NodeB([]),Node([]),NodeX([])])
hello A
hello B
hello some subtype of node: Node([])
hello some subtype of node: NodeX([])

Jameson Nash

unread,
Nov 26, 2013, 7:03:15 PM11/26/13
to julia...@googlegroups.com
Macro's are way harder than parameterized types. I would avoid them in
general. All that templating you are doing is simply creating a less
powerful parameterized type with a lot more code. Here's the same
thing, parameterizing Node on some symbol T which describes it.

immutable Node{T}
children::Vector{Node}
Node() = new( Node[] )
Node( children ) = new( children )
end
Node(T::Symbol) = Node{T}()

# I'll throw in a typealias, just 'cause. this isn't necessary
# but helps to show the purpose of typealiases
typealias NodeA Node{:A}

abstract Visitor
type MyVisitor <: Visitor end

function visit{T}(v::MyVisitor, n::Node{T})
println("hello subtype ", T, " of node: ", n)
end

#show off the various constructor patterns:
a = NodeA()
b = Node{:B}()
c = Node(:C)
d = Node{:D}(Node[])
push!(a.children, b)
# a is immutable, so a.children = [b] is invalid
# but the a.children array is still mutable
push!(a.children, c)
push!(a.children, x)
println("node a: ", a)

visit(MyVisitor(), a)

Jameson Nash

unread,
Nov 26, 2013, 7:11:42 PM11/26/13
to julia...@googlegroups.com
I forgot to add the best part of all this: Whereas this python code is
essentially foisting the method dispatch onto the caller, the julia
function that is calling visit doesn't even have to know that any of
this is going on!

function visit(v::MyVisitor, n::Node{:A})
println("hello A")
end

julia> visit(MyVisitor(), a)
hello A

julia> visit(MyVisitor(), b)
hello subtype B of node: Node{:B}([])

Michael Fox

unread,
Nov 26, 2013, 9:07:23 PM11/26/13
to julia...@googlegroups.com
Thanks for your patience, Jameson. When you brought up Parametric Types earlier today I wasn't very receptive -- probably because they remind me of C++, and I've never forgiven C++ for smothering the last remaining hope for the future of all mankind.

Now I see what they are and I agree, Parametric Types are just the ticket. I will use them exactly as you have suggested.

I still may find some use for the godless, beady-eyed macros. While the Types and Nodes are generated within cold confines of the parser generator, the Visitor methods have to be written by someone. That someone is probably already having a bad day because he's been asked to write a parser.

I think a line like:


function visit(v::MyVisitor, n::Node{:A})

is fairly clear once you get what these Parametric Types are all about. But it's a lot to ask of someone who skipped that section of the manual (like me) because it picks a fight with the way he's used to thinking and also because it smacks of C++. So I think I'll make a macro to replace it with a line like:

@visitor My A

Unless you think that's insultingly hand-holdy and destroys more clarity than it creates convenience.

Stefan Karpinski

unread,
Nov 26, 2013, 9:23:53 PM11/26/13
to Julia Users
On Tue, Nov 26, 2013 at 9:07 PM, Michael Fox <415...@gmail.com> wrote:
I've never forgiven C++ for smothering the last remaining hope for the future of all mankind.

LOL

Stefan Karpinski

unread,
Nov 26, 2013, 9:29:38 PM11/26/13
to Julia Users
In general, I think that macros do not help clarity – because they're magical black boxes into which expressions disappear and who knows what emerges. Anyone programming in Julia – writing a parser, no less – is going to encounter some dispatch at some point, so I'm not sure that avoiding it is that helpful. You could typealias NodeA to Node{:A} and then the parser-writer doesn't actually need to know that there's a type parameter at all.
Reply all
Reply to author
Forward
0 new messages