default values and default constructor for Julia variables

591 views
Skip to first unread message

vav...@uwaterloo.ca

unread,
Jun 18, 2014, 7:41:01 AM6/18/14
to julia...@googlegroups.com
Dear Julia colleagues,

Earlier I asked whether there is a datatype similar to "map<K,D>" from C++, and nobody answered, so I decided as my first nontrivial Julia program to write such a type.  This is an associative array (i.e., a dictionary), with the added property that the keys have a sort-order, and it is possible to loop over the items in the array in this order.  I am using a 2-3 tree as the implementation.  My main data structure looks like this:

type Map{K,D}
   ...
    function Map{K,D}()
    ....
    end
end

But I am stuck right at the outset with two very basic Julia questions.  First, what is the syntax for a user of my datatype to declare a variable to be an empty map of a certain type?  In other words, suppose the user wants a map from strings to strings:

function test1()
    m1 = Map{ASCIIString, ASCIIString}()
    m1["hello"] = "jello"
end


This gave the error: ERROR: no method Map{ASCIIString, ASCIIString}().   What is the right syntax to invoke the default constructor for an empty map?  I couldn't figure this out from the manual; in the manual, the constructors either take arguments or else they are constructors for built-ins (like arrays) that have a special syntax like a = Int[].
 
Second, at various points in the code, I need to use a default or blank instance of either K (the key type) or D (the datatype).  In other words, I need to initialize an array element whose type is a composite type that has some K or D fields; these need to be initialized to some arbitrary K or D value that won't ever be used. What is the proper syntax to initialize a field k to have a value of type K (where K is a parameter type in my code), where the value doesn't matter (can be any default).

Thanks,
Steve Vavasis

Mauro

unread,
Jun 18, 2014, 8:04:59 AM6/18/14
to julia...@googlegroups.com
I re-read your original post. There are iterators available for
Associative KeyIterator{T<:Associative}, ValueIterator{T<:Associative}
and the usual iterator which returns key and value as a tuple. These
are all also available for the dicts in DataStructures.jl. So you can
do

for (key, val) in dict
...
end

or

for key in KeyIterator(dict)
...
end

The OrderedDict there keeps to order of insertion. I think there is no
dict which keeps the keys sorted.

On Wed, 2014-06-18 at 12:41, vav...@uwaterloo.ca wrote:
> Dear Julia colleagues,
>
> Earlier I asked whether there is a datatype similar to "map<K,D>" from C++,
> and nobody answered, so I decided as my first nontrivial Julia program to
> write such a type. This is an associative array (i.e., a dictionary), with
> the added property that the keys have a sort-order, and it is possible to
> loop over the items in the array in this order. I am using a 2-3 tree as
> the implementation. My main data structure looks like this:
>
> type Map{K,D}
> ...
> function Map{K,D}()
> ....
> end
> end

Probably better use a name containing `Dict` as that is the convention
in Julia. Also, probably you want to make it a subtype of
Associative(K,V}. Have a look at base/dict.jl.

> But I am stuck right at the outset with two very basic Julia questions.
> First, what is the syntax for a user of my datatype to declare a variable
> to be an empty map of a certain type? In other words, suppose the user
> wants a map from strings to strings:
>
> function test1()
> m1 = Map{ASCIIString, ASCIIString}()
> m1["hello"] = "jello"
> end
>
>
> This gave the error: ERROR: no method Map{ASCIIString, ASCIIString}().
> What is the right syntax to invoke the default constructor for an empty
> map? I couldn't figure this out from the manual; in the manual, the
> constructors either take arguments or else they are constructors for
> built-ins (like arrays) that have a special syntax like a = Int[].

Presumably you've read: http://docs.julialang.org/en/latest/manual/constructors/

Anyway, this is a often encountered problem. Have a look through the
mailing list. Basically, if you define an inner constructor you also
have to define an outer one.

> Second, at various points in the code, I need to use a default or blank
> instance of either K (the key type) or D (the datatype). In other words, I
> need to initialize an array element whose type is a composite type that has
> some K or D fields; these need to be initialized to some arbitrary K or D
> value that won't ever be used. What is the proper syntax to initialize a
> field k to have a value of type K (where K is a parameter type in my code),
> where the value doesn't matter (can be any default).

Array(Int, 10)
or
Array(String, 10)

Mauro

unread,
Jun 18, 2014, 8:08:50 AM6/18/14
to julia...@googlegroups.com
Kevin is working on a sorted dict as we speak:
https://github.com/JuliaLang/DataStructures.jl/pull/43
--

vav...@uwaterloo.ca

unread,
Jun 18, 2014, 8:32:53 AM6/18/14
to julia...@googlegroups.com
Mauro,

Thanks for taking the time to respond.  First, with regard to the existing Dict in Julia, just to be sure I understand, the order of the keys is not the order given by the '<' operation acting on the keys but is rather the order in which they were inserted, is that correct?

With regard to the inner and outer constructor both needed, can you give me a reference?  I have read the manual section you mentioned; the constructor for the Point example in the manual takes arguments and so it doesn't seem to cover my case.

With regard to default key values, I'm still not getting it, so let me ask the question in more detail.  A data node in my structure looks like this:

   immutable TreeLeaf{K,D}
       k::K 
       d::D
       parent::Int
    end

To initialize a map, I need to create an Array{TreeLeaf{K,D},2).  The two initial entries are markers for the start and end of the data.  The k and d fields of these two nodes don't matter, but the parent values do matter.  So apparently I need something like this:

    data = Array{TreeLeaf{K,D}, 2}
    data[1] = TreeLeaf(undefinedK, undefinedD, 1)
    data[2] = TreeLeaf(undefinedK, undefinedD, 1)

What should I put for 'undefinedK' and 'undefinedD' here?  After seeing your post, I tried this:  u = Array{K,1}, and then I used u[1] for undefinedK, but this gave an error ("access to undefined reference").  There are some other places in my code where I have a similar problem.

With regard to Kevin's work on a sorted dict, maybe I should wait for him to finish?  Is there an estimated completion time?

Thanks,
Steve Vavasis

Kevin Squire

unread,
Jun 18, 2014, 10:26:30 AM6/18/14
to julia...@googlegroups.com
Hi Steve, 

I saw your original post, too, and started to respond, but got distracted and it fell off my radar. Sorry about that. 

Mayor gave good answers to your questions. To clarify one point, I'm in the process of adding functions for sorting OrderedDicts, which are normally in insertion order, as you surmised.  The sorting will happen after the fact--any additional items added to the dictionary will be added to the end.

Julia does not currently have a dictionary which maintains sort order. One based on red-black trees was started and submitted as a PR to DataStructures.jl, but needs more review and more work. 

As an aside, DataStructures.jl does contain a DefaultDict and DefaultOrderedDict, which give (and set) default values for keys not in the dictionary whenever they are accessed.  You seemed to be interested in having this as well.

If you come up with a good RB tree dictionary implementation, please consider contributing it!  It might take a little work to wrangle it into the Julia dictionary interface if you don't do so up front, but it shouldn't be too bad.  (I should probably go back and review the RB-tree dictionary PR that's there now.)

Feel free to ask any more questions here.

Cheers,
   Kevin

Mauro

unread,
Jun 18, 2014, 1:19:58 PM6/18/14
to julia...@googlegroups.com
Hi Steve,

sorry the delay, I was out... I think the dictionary questions were
answered by Kevin.

> With regard to the inner and outer constructor both needed, can you give me
> a reference? I have read the manual section you mentioned; the constructor
> for the Point example in the manual takes arguments and so it doesn't seem
> to cover my case.

This is a long thread about this
https://groups.google.com/forum/#!searchin/julia-users/constructor/julia-users/LWgATl9Pd64/50aEPb8UAfUJ
probably the best bit is here by Stefan
https://groups.google.com/d/msg/julia-users/LWgATl9Pd64/Zp2JvfKhK48J

Try this one too:
https://groups.google.com/forum/#!searchin/julia-users/constructor/julia-users/lk3WcbDJOJo/QqkNzsNMUXQJ

> With regard to default key values, I'm still not getting it, so let me ask
> the question in more detail. A data node in my structure looks like this:
>
> immutable TreeLeaf{K,D}
> k::K
> d::D
> parent::Int
> end
>
> To initialize a map, I need to create an Array{TreeLeaf{K,D},2). The two
> initial entries are markers for the start and end of the data. The k and d
> fields of these two nodes don't matter, but the parent values do matter.
> So apparently I need something like this:
>
> data = Array{TreeLeaf{K,D}, 2}

I think here you want to construct an instance
data = Array(TreeLeaf{K,D}, 2)
what you wrote gives you back a datatype:
data = Array{TreeLeaf{K,D}, 2}

> data[1] = TreeLeaf(undefinedK, undefinedD, 1)
> data[2] = TreeLeaf(undefinedK, undefinedD, 1)
>
> What should I put for 'undefinedK' and 'undefinedD' here? After seeing
> your post, I tried this: u = Array{K,1}, and then I used u[1] for
> undefinedK, but this gave an error ("access to undefined reference").
> There are some other places in my code where I have a similar problem.

Hmm, I'm not sure and I don't have quite enough time to ponder it. But
one way could be to leave the fields of TreeLeaf uninitialised:

immutable A
a
A() = new()
A(x) = new(x)
end

a = A()
b = A(5)

Then you can check whether the field is defined:
julia> isdefined(a, :a)
false

julia> isdefined(b, :a)
true

But your probably better of with just leaving the `data`-array entries
undefined. Again, check with `isdefined`, accessing them is an error.

This probably doesn't answer your questions. Maybe have a look at some
other trees:

https://groups.google.com/forum/?fromgroups=#!searchin/julia-dev/AVL$20tree/julia-dev/KPpLattsEpc/PhGu0a8l-RQJ
https://groups.google.com/forum/?fromgroups=#!searchin/julia-dev/tree/julia-dev/dDaUEqHKVCk/zBBIQBAn7jMJ
red-black tree https://gist.github.com/SirVer/3371829

vav...@uwaterloo.ca

unread,
Jun 19, 2014, 8:21:52 AM6/19/14
to julia...@googlegroups.com
I have found a solution to both problems I raised, thanks to some helpful feedback, although I am not completely happy with the second solution.

For the problem of defining a no-argument constructor for my Map class, I made a mistake in the declaration:
   type Map{K,D}
       ...
       function Map{K,D}()
       ....
       end
   end

The mistake is that the inner constructor should be declared "function Map()" because the {K,D} parameters are already presumed.

For the second problem of how to construct records that have K and D fields that are dummy placeholders, I did not find a solution.  So instead I changed my implementation so that the map is initially created in a null state in which even the placeholders are absent.  The first call to "insert!" will have proper K and D values, so I use those K and D arguments also to insert the two dummy placeholders to put the data structure into its normal state (plus the given (K,D) pair is also inserted).

I'm not completely happy with this solution for two reasons.  First, all methods associated with Map{K,D} are now more complicated because they first need to make sure that it is not in the null state.  Second, if the first insertion involves a particular large {K,D} pair, the user of the data structure might be surprised that his/her initial {K,D} pair continues to occupy memory even when it is deleted from the data structure (because it is still being held down by the two placeholders).  So I welcome further suggestions for addressing the issue of finding dummy K and D values to use in placeholder records.

Thanks,
Steve Vavasis

Mauro

unread,
Jun 19, 2014, 2:57:08 PM6/19/14
to julia...@googlegroups.com
Hi Steve,

it would be easier to follow if you post your code somewhere, a gist for
instance:
https://gist.github.com/

M
--

Reply all
Reply to author
Forward
0 new messages