Interdependent types and null pointers

509 views
Skip to first unread message

Cassio Martini M. P.

unread,
Jul 24, 2013, 8:35:01 PM7/24/13
to julia...@googlegroups.com
Hi,

I'd like to define a recursive tree structure, where each node also has a pointer to its parent (thus making it easier to get its siblings).

Consider the code:

module Tree

type Node
    left::Union(Node, Leaf)
    right::Union(Node, Leaf)
    key::Int64
    parent::Union(Node,Nothing)
end

type Leaf
    parent::Node
    key:Int64
end

end

Julia does not allow it:

ERROR: Leaf not defined
 in include_from_node1 at loading.jl:91
at dep.jl:3

I realize that if I define an abstract type ABNode, which both Node and Leaf inherit, and change the Union(Node,Leaf) to ABNode, this will work. But what I want to know is: shouldn't this be possible to do without resorting to a common interface?

A second question: why is there no NULL pointer in julia? To me it would be a lot easier than defining Union(T,Nothing). I think that allowing NULL pointers make programming recursive structures easier, in cases where you may not have all the references you need when you are creating an object.

Thank you,
Cássio

Isaiah Norton

unread,
Jul 24, 2013, 10:28:47 PM7/24/13
to julia...@googlegroups.com
One way is to use an abstract type for Leaf only, something like:

```
abstract _Leaf    

 type Node         
   left::Union(Node, _Leaf)
   right::Union(Node, _Leaf)
   key::Int64
       
  parent::Union(Node, Nothing)
end

type Leaf <: _Leaf
  parent::Node
  key::Int64
end
```

There are some examples and discussion of related issues, here:

The discussion of incomplete initialization on that page may address your second point. If not, can you clarify what you are looking for? Ptr{Void} is a valid thing and can be used (with care) in the context of calling C libraries. Otherwise None and Nothing can fulfill the "undefined" role in pure Julia programs - used in moderation!

Also, you may be interested in the following packages:

Stefan Karpinski

unread,
Jul 24, 2013, 10:37:28 PM7/24/13
to Julia Users
This is really just a missing feature. We should have a way of declaring mutually recursive types, but it's just never been a pressing feature to add, partly because workarounds like this exist.

Isaiah Norton

unread,
Jul 24, 2013, 10:51:41 PM7/24/13
to julia...@googlegroups.com
+1. This would be really useful for reflecting a lot of real-world C structs; I haven't found any workaround that allows isbits(T) == true for a type with recursive members.

Stefan Karpinski

unread,
Jul 24, 2013, 11:03:13 PM7/24/13
to Julia Users
Oh, right, there's the NULL value business too. The answer is that there's no way we're going to add that – not having NULL is a major language feature. Checking for NULLs is one of the worst nightmares of Java and JavaScript programming. It sucks for the programmer and it sucks for the compiler. In particular, it would force us to store everything as pointers, which would destroy our ability to work with integers and floats efficiently. This is one of many reasons why using Integer in Java is so inferior to using int – among other things, because Integers are nullable whereas ints are not. Julia is carefully designed so that all objects have the same semantics (no Integer vs. int distinction), yet we still get the benefits of non-object types in languages like Java.

Tomas Lycken

unread,
Jul 25, 2013, 3:35:28 AM7/25/13
to julia...@googlegroups.com
It seems to me that you could do this tree structure with current language features and partial construction of the nodes. I don't have Julia on this computer, so I can't run it to see how it works, but I'd try something like this:

type Node
    key::Integer

    parent::Node
    left::Node
    right::Node

    Node(key::Integer) = (n = new(key); n.parent = n; n.left = n; n.right = n)
end

isequal(n1::Node, n2::Node) = isequal(n1.key, n2.key)  # assume that the keys are unique - lets us say that a pointer is "empty" if it points to the same node

isroot(n::Node) = isequal(n, n.parent)  # root node has no parent

isleaf(n::Node) = isequal(n, n.left) && isequal(n, n.right)  # leaf nodes has no left or right

This approach assumes that all your nodes have unique keys, but if there is some other (combination of) property(-ies) that identifies unique nodes you can just use them as well in isequal. It also leaves you with the same problems as introducing NULL in the language would; you have to check for isroot and isleaf when traversing the tree (but instead of following pointers to invalid memory, or null pointer exceptions, you get infinite loops...). This could be abstracted away with a few traversing methods, that take e.g. a function argument with an action to apply to each node.

As already mentioned, https://github.com/lindahua/DataStructures.jl is a good source of inspiration (and perhaps there's even a data structure there that you can use out of the box!)

// T

Cassio Martini M. P.

unread,
Jul 25, 2013, 9:58:28 AM7/25/13
to julia...@googlegroups.com
Thanks everyone for their input.

I think that coming from other languages that have some kind of null pointer, such as Java, Python, C, etc, I just expected that Julia would have it too, but I see what Stefan is saying about performance. I wonder if Numpy/Scipy also suffer from this problem.

lindahua is using the same technique of Union(T,Nothing) to account for null possibilities in https://github.com/lindahua/DataStructures.jl/blob/master/src/deque.jl

About the mutually recursive structures: I think it would be nice (or even important) to have, even though there are workarounds.

Best regards,
Cássio

Kevin Squire

unread,
Jul 25, 2013, 2:41:40 PM7/25/13
to julia...@googlegroups.com
You could also create an empty node type.  Starting from Tomas's example:

abstract AbstractNode

type
EmptyNodeType <: AbstractNode end

const EmptyNode = EmptyNodeType()

type
Node <: AbstractNode
    key
::Integer

    parent
::AbstractNode
    left
::AbstractNode
    right
::AbstractNode

end

Node(key::Integer) = Node(key, EmptyNode, EmptyNode, EmptyNode)
 

Kevin

Tomas Lycken

unread,
Jul 26, 2013, 4:32:20 AM7/26/13
to julia...@googlegroups.com
That's actually way better than the self-referential stuff I did. Note though that you still need to check for empty nodes when traversing the tree - instead of infinite loops, you'll get errors that properties like left and right don't exist if you're not careful.

// T

Dahua Lin

unread,
Jul 26, 2013, 2:58:06 PM7/26/13
to julia...@googlegroups.com
I am actually considering an alternative implementation that does not rely on declaring fields as Union(Node, Nothing).

In my experience, using non-concrete types (e.g. union or abstract types) almost always limits performance.

- Dahua

Cássio M. M. Pereira

unread,
Jul 26, 2013, 6:28:44 PM7/26/13
to julia...@googlegroups.com

Interesting. Do you have some idea of what would be the speedup?

Tomas Lycken

unread,
Jul 26, 2013, 7:33:49 PM7/26/13
to julia...@googlegroups.com
In most cases, it's about storage - for example, an Array{Float64} can be stored as a contiguous array of floats, and the compiler knows exactly how much memory will be needed, how to access it, etc, while an Array{Number} is much more ambiguous and leaves the compiler with very few possibilities of optimizing the code.

There's a whole section on this in the manual ;) It mentions this mostly for composite types, but I guess it's a pretty general consideration - if the compiler knows more about how to store your data, it can optimize better. Since no type can inherit from concrete types in Julia, using concrete types guarantees that the compiler has the maximum possible amount of information about what you're trying to do, so it can optimize as well as possible.

// T

Isaiah Norton

unread,
Jul 26, 2013, 9:29:12 PM7/26/13
to julia...@googlegroups.com
Compare the LLVM IR for these two simple functions:

julia> f(x::Int) = x + 1
julia> f(x::Union(Int, Nothing)) = x + 1

julia> disassemble(f, Int)
julia> disassemble(f, (Union(Int, Nothing),))

The former maps to a single instruction, whereas the latter generates a mess of bookkeeping
and error-checking instructions to deal with the ambiguity.

Dahua Lin

unread,
Jul 27, 2013, 11:10:01 AM7/27/13
to julia...@googlegroups.com
The idea is to use self-reference to indicate NULL.

Consider a linked list, you may implement a node as follows:

type Node{T}
    value
::T
    prev
::Node{T}
   
next::Node{T}

   
function Node(x::T)
        a
= new(x)  # this leaves prev and next undefined
        a.prev = a.next = a
        a
   
end
end

has_prev
(a::Node) = is(a.prev, a)
has_next
(a::Node) = is(a.next, a)

In the new implementation (just committed) of Deque, I use this trick and get better performance. 

Kevin Squire

unread,
Jul 27, 2013, 12:13:25 PM7/27/13
to julia...@googlegroups.com
(Just to note that this is similar to Tomas' previous suggestion.)

Dahua Lin

unread,
Jul 27, 2013, 12:17:40 PM7/27/13
to julia...@googlegroups.com
Yes, I think so.

This is not very beautiful though, but it avoids the performance penalty of using non-concrete types as fields.

Tim Holy

unread,
Jul 27, 2013, 12:39:45 PM7/27/13
to julia...@googlegroups.com
Isaiah's demo is brief and compelling. You'll want this, though:
disassemble(f, Int) -> disassemble(f, (Int,))

See a rather more long-winded explanation here:
http://docs.julialang.org/en/latest/manual/faq/#how-do-abstract-or-ambigious-
fields-in-types-interact-with-the-compiler


On Friday, July 26, 2013 09:29:12 PM Isaiah Norton wrote:
> Compare the LLVM IR for these two simple functions:
>
> julia> f(x::Int) = x + 1
> julia> f(x::Union(Int, Nothing)) = x + 1
>
> julia> disassemble(f, Int)
> julia> disassemble(f, (Union(Int, Nothing),))
>
> The former maps to a single instruction, whereas the latter generates a
> mess of bookkeeping
> and error-checking instructions to deal with the ambiguity.
>
> On Fri, Jul 26, 2013 at 7:33 PM, Tomas Lycken <tomas....@gmail.com>wrote:
> > In most cases, it's about storage - for example, an Array{Float64} can be
> > stored as a contiguous array of floats, and the compiler knows exactly how
> > much memory will be needed, how to access it, etc, while an Array{Number}
> > is much more ambiguous and leaves the compiler with very few possibilities
> > of optimizing the code.
> >
> > There's a whole section on this in the
> > manual<http://docs.julialang.org/en/release-0.1/manual/performance-tips/#
> > declare-specific-types-for-fields-of-composite-types>;) It mentions this
> > mostly for composite types, but I guess it's a pretty general
> > consideration - if the compiler knows more about how to store your data,
> > it can optimize better. Since no type can inherit from concrete types in
> > Julia, using concrete types guarantees that the compiler has the maximum
> > possible amount of information about what you're trying to do, so it can
> > optimize as well as possible.
> >
> > // T
> >
> > On Saturday, July 27, 2013 12:28:44 AM UTC+2, Cassio Martini M. P. wrote:
> >> Interesting. Do you have some idea of what would be the speedup?
> >> Em 26/07/2013 15:58, "Dahua Lin" <lind...@gmail.com> escreveu:
> >>
> >> I am actually considering an alternative implementation that does not
> >>
> >>> rely on declaring fields as Union(Node, Nothing).
> >>>
> >>> In my experience, using non-concrete types (e.g. union or abstract
> >>> types) almost always limits performance.
> >>>
> >>> - Dahua
> >>>
> >>> On Thursday, July 25, 2013 8:58:28 AM UTC-5, Cassio Martini M. P. wrote:
> >>>> Thanks everyone for their input.
> >>>>
> >>>> I think that coming from other languages that have some kind of null
> >>>> pointer, such as Java, Python, C, etc, I just expected that Julia would
> >>>> have it too, but I see what Stefan is saying about performance. I
> >>>> wonder if
> >>>> Numpy/Scipy also suffer from this problem.
> >>>>
> >>>> lindahua is using the same technique of Union(T,Nothing) to account for
> >>>> null possibilities in https://github.com/**lindahua**
> >>>> /DataStructures.jl/**blob/**master/src/deque.jl<https://github.com/lind
> >>>> ahua/DataStructures.jl/blob/master/src/deque.jl>
> >>>>> As already mentioned, https://github.com/****
> >>>>> lindahua/DataStructures.jl<https://github.com/lindahua/DataStructures.
> >>>>> jl> is a good source of inspiration (and perhaps there's even a data
> >>>>>>>>> http://docs.julialang.org/en/**l**atest/manual/constructors<http:/
> >>>>>>>>> /docs.julialang.org/en/latest/manual/constructors>
> >>>>>>>>>
> >>>>>>>>> The discussion of incomplete initialization on that page may
> >>>>>>>>> address your second point. If not, can you clarify what you are
> >>>>>>>>> looking
> >>>>>>>>> for? Ptr{Void} is a valid thing and can be used (with care) in the
> >>>>>>>>> context
> >>>>>>>>> of calling C libraries. Otherwise None and Nothing can fulfill the
> >>>>>>>>> "undefined" role in pure Julia programs - used in moderation!
> >>>>>>>>>
> >>>>>>>>> Also, you may be interested in the following packages:
> >>>>>>>>> https://github.com/JuliaLang/**G**raphs.jl<https://github.com/Juli
> >>>>>>>>> aLang/Graphs.jl>
> >>>>>>>>> https://github.com/lindahua/**Da**taStructures.jl<https://github.
> >>>>>>>>> com/lindahua/DataStructures.jl>

Cássio M. M. Pereira

unread,
Jul 27, 2013, 1:30:25 PM7/27/13
to julia...@googlegroups.com
Thanks, that's very instructive. I hadn't seen that FAQ, as
http://docs.julialang.org points directly to
http://docs.julialang.org/en/release-0.1/en/release-0.1/ right now.

Also, there's a typo on that section's name:
http://docs.julialang.org/en/latest/manual/faq/#how-do-abstract-or-ambigious-fields-in-types-interact-with-the-compiler
--> "ambigious".

Best regards,
Cássio
Reply all
Reply to author
Forward
0 new messages