Defining a function in different modules


Michael Turok

Apr 21, 2015, 9:26:01 AM
to julia...@googlegroups.com
Hi,

What is the idiomatic way to create a function value() in different modules, dispatched on different arguments, without getting the warning/error about conflicting with an existing identifier?

There seems to be an order dependency in the example below: the second module loaded defines value(), unless value() had already been used before importing the second module.

Note that if I do the same with get(), a function defined in Base, I don't get an error.

Code and output from julia REPL below.

Any help appreciated,
Michael

# this is mike.jl

# ------------------------------
module Foo

# ------------------------------
importall Base
type FooType end

value(x::FooType) = "Foo::value"
get(x::FooType) = "Foo::get"

export value

end

# ------------------------------
module Bar
# ------------------------------
importall Base

type BarType end

value(x::BarType) = "Bar::value"
get(x::BarType) = "Bar::get"

export value

end

Using this in the REPL: 
julia> workspace() ; include("mike.jl")

julia> using Foo

julia> value(Foo.FooType())
"Foo::value"

julia> using Bar
Warning: using Bar.value in module Main conflicts with an existing identifier.

julia> value(Bar.BarType())
ERROR: `value` has no method matching value(::BarType)

# -----------------------------------------------------

julia> workspace() ; include("mike.jl")

julia> using Foo

julia> using Bar

julia> value(Foo.FooType())
ERROR: `value` has no method matching value(::FooType)

julia> value(Bar.BarType())
"Bar::value"

# -----------------------------------------------------

julia> workspace() ; include("mike.jl")

julia> using Bar

julia> using Foo

julia> value(Foo.FooType())
"Foo::value"

julia> value(Bar.BarType())
ERROR: `value` has no method matching value(::BarType)

julia> 

Michael Turok

Apr 21, 2015, 9:37:49 AM
to julia...@googlegroups.com
Note that this can be made to work by tearing a page from Base: we can create a module (SuperSecretBase) that defines a stub value() function. We then use importall SuperSecretBase in each of Foo and Bar. But this means that any module we create would need to declare its functions in SuperSecretBase.

julia> workspace() ; include("mike.jl")

julia> using Foo

julia> using Bar

julia> value(Bar.BarType())
"Bar::value"

julia> value(Foo.FooType())
"Foo::value"

julia> 

Modified code follows: 

module SuperSecretBase
value() = nothing
export value
end

# ------------------------------

module Foo

importall SuperSecretBase

importall Base
type FooType end

value(x::FooType) = "Foo::value"
get(x::FooType) = "Foo::get"

export value

end

# ------------------------------

module Bar

importall SuperSecretBase

importall Base

type BarType end

value(x::BarType) = "Bar::value"
get(x::BarType) = "Bar::get"

export value

end

Jeff Bezanson

Apr 21, 2015, 1:07:40 PM
to julia...@googlegroups.com
We're planning to do something about this: #4345. When `using` two
modules with conflicting names, we should do something other than pick
one depending on order. Most likely we will print a warning, and
require uses of the name to be qualified.

If the two modules really do want to define different methods for the
same function, then either one has to import the other, or you have to
use your SuperSecretBase approach.
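
A minimal sketch of the first option (one module importing the other), assuming current Julia syntax: `struct` instead of the thread's `type`, and relative `..Foo` paths because both modules are assumed to sit in one file. The module and type names follow Michael's example; the rest is illustrative.

```julia
module Foo
export value, FooType
struct FooType end
value(x::FooType) = "Foo::value"
end

module Bar
# Import Foo's generic function so the definition below adds a method to it
# instead of creating a second, unrelated `value`.
import ..Foo: value
export value, BarType
struct BarType end
value(x::BarType) = "Bar::value"
end

# Both modules export the same function object, so there is no name conflict.
using .Foo, .Bar

println(value(FooType()))
println(value(BarType()))
```

The cost of this approach is that Bar now depends on Foo, which only makes sense if the two modules really share one notion of `value`.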

Michael Francis

Apr 22, 2015, 8:47:55 AM
to julia...@googlegroups.com
I read through the issues / threads ( and some others )


I'm not sure that either SuperSecretBase or the warning is the correct approach. I'd like to propose an alternative, which is a very simple rule.

"You can only export functions from a module where they reference at least one type defined in the module." 

There may have to be a slight tweak for Base, though it is not hard to argue that the primitive types are defined in Base. 

so

module Module1
type Bar end
my( b::Bar ) = 1

export my       # fine exports to the global space
end

module Module2
type Foo end
my() = 1

export my       # ERROR exporting function which does not reference local type
end

module Module3
type Wow end
my( w::Wow ) = 1
my() = 1
export my       # Is an ERROR I can not export a function which does not reference a local type
end

So in the example provided by Mike above, multiple dispatch would do the right thing. If I also want to define a function for value in my module it would work consistently against the types I define. We don't have to perform recursive exports and import usage should be reduced.

If you want to define an empty function you can do so with a default arg:
module Module4
type Zee end
my( ::Type{Zee} = Zee ) = 1
export my       # Works, but I can select against it using multiple dispatch by providing the last arg
end


I can't convince myself that exporting Types in general (nor macros) is a good idea.

A tweak may be to add C#-like module alias syntax, which is just syntactic sugar over what we have (except that we would likely want the definition of MY to be const in the scope).

MY = using Foo.Bar.ReallyLongModuleName
t = MY.Type()
my( t )
 

Thoughts ? 


I'm sure there is something I have missed, but this simple rule would seem to encourage multiple dispatch and support the development of modules. 

Jeff Bezanson

Apr 22, 2015, 4:40:03 PM
to julia...@googlegroups.com
That rule seems extremely restrictive to me. It would be very common,
for example, to create a library of functions that operate on standard
data types like numbers and arrays. I don't see that we can exclude
that kind of use.

Also, printing a warning is not the key part of #4345. The important
part is that you'd have to qualify names in that case, which is the
same thing that would happen if `export`ing the names were disallowed.

Michael Francis

Apr 22, 2015, 5:58:33 PM
to julia...@googlegroups.com
You are correct that it is restrictive, though I will take some convincing that this is a bad thing. As systems get larger in Julia, it is going to be increasingly important to manage code reuse and prevent accidental masking of types. Multiple dispatch is a wonderful tool for supporting these goals. Unfortunately, allowing people to export get(<string>) et al. to the user's scope seems like a bad idea, and this is already happening in modules today. Perhaps the middle ground is to force an explicit import, so that `using` only imports functions which have types defined in the module. The person defining the module exports all the functions they want, but only those that are 'safe' (i.e. follow my original rule) are implicitly imported. Hence you would have something like the following code. This is not far different from importall today, except that the exports are automatically restricted.

using MyMath         # Imports only those functions which include types defined in MyMath
import MyMath.*      # Imports all other  functions  defined in MyMath
import MyMath.afunc  # Imports  one  function
import MyOther.afunc # Fails collides with MyMath.afunc
 

Jeff Bezanson

Apr 22, 2015, 6:19:57 PM
to julia...@googlegroups.com
I think it's reasonable to adopt a convention in some code of not using `using`.

Another way to look at this is that a library author could affect name
visibility in somebody else's code by adjusting the signature of a
method. That doesn't seem like a desirable interaction to me. Often
somebody might initially define foo(::Image), and then later realize
it's actually applicable to any array, and change it to
foo(::AbstractArray). Doing that shouldn't cause any major fuss.

Stefan Karpinski

Apr 24, 2015, 2:56:58 PM
to Julia Users
For anyone who isn't following changes to Julia master closely, Jeff closed #4345 yesterday, which addresses one major concern of "programming in the large".

I think the other concern about preventing people from intentionally or accidentally monkey-patching is very legitimate as well, but it's way less clear what to do about it. I've contemplated the idea of not allowing a module to add methods to a generic function unless it "owns" the function or one of the argument types, but that feels like such a fussy rule, I don't think it's the right solution. But I haven't come up with anything better either.

ele...@gmail.com

Apr 24, 2015, 8:51:37 PM
to julia...@googlegroups.com


On Saturday, April 25, 2015 at 4:56:58 AM UTC+10, Stefan Karpinski wrote:
> For anyone who isn't following changes to Julia master closely, Jeff closed #4345 yesterday, which addresses one major concern of "programming in the large".
>
> I think the other concern about preventing people from intentionally or accidentally monkey-patching is very legitimate as well, but it's way less clear what to do about it. I've contemplated the idea of not allowing a module to add methods to a generic function unless it "owns" the function or one of the argument types, but that feels like such a fussy rule, I don't think it's the right solution. But I haven't come up with anything better either.

I would have thought stopping intentional behaviour is non-Julian, but accidental errors should indeed be limited. Perhaps adding methods to other modules' functions needs to be explicit.

Michael Francis

Apr 24, 2015, 10:55:39 PM
to julia...@googlegroups.com
The resolution of that issue seems odd. Say I have two completely unrelated libraries, DataFrames and one of my own. I export value( ::MyType) and I'm happily using it. Some time later I Pkg.update() and, unbeknownst to me, the DataFrames dev team have added an export of value( ::DataFrame, ...). Suddenly all my code which imports both breaks, and I have to go through the entire stack qualifying the calls, as do other users of my module? That doesn't seem right: there is no ambiguity that I can see, and multiple dispatch should continue to work correctly.

Fundamentally I want the two value() functions to collapse and not have to qualify them. If there is a dispatch ambiguity then game over, but if there isn't I don't see any advantage (and lots of negatives) to preventing the import.

I'd argue the same is true with overloading methods in Base. Why would we locally mask get if there is no dispatch ambiguity, even if I don't importall Base?

Qualifying names seems like an anti-pattern in a multiple dispatch world, except for those edge cases where there is an ambiguity of dispatch.

Am I missing something? Perhaps I don't understand multiple dispatch well enough?

ele...@gmail.com

Apr 24, 2015, 11:36:18 PM
to julia...@googlegroups.com
IIUC the problem is not where you have two distinct and totally separate types to dispatch on, it's when one module also defines methods on a common parent type (think about ::Any). That module is expecting the concrete types it defines methods for to dispatch to these methods and all other types to dispatch to the method defined for the abstract parent type.

But if methods from another module were combined, then that behaviour would be changed silently for some types, which would dispatch to the second module's methods.

But nothing says these two functions do the same thing, just because they have the same name. The example I usually use is that the `bark()` function from the `Tree` module is likely to be different to the `bark()` function in the `Dog` module. So if functions are combined on name, mixing modules can silently change behaviour of existing code, and that's "not a good thing".

To extend an existing function in another module you can explicitly do so, i.e. name Module1.funct(), so that the behaviour of the function can be extended by methods for other types. By explicitly naming the function, you are guaranteeing that the methods have acceptable semantics to combine.
 

Stefan Karpinski

Apr 25, 2015, 9:46:00 AM
to Julia Users
On Fri, Apr 24, 2015 at 8:51 PM, <ele...@gmail.com> wrote:
> I would have thought stopping intentional behaviour is non-Julian, but accidental errors should indeed be limited. Perhaps adding methods to other modules' functions needs to be explicit.

I think that John Myles White was the first to start advocating for using explicit qualification every time you extend methods from Base or some module other than the one you're currently in. At first this felt a little annoying to me, but I've grown to like it and I do think this may be a good, low-tech solution. Forcing the programmer to be aware of the fact that they're extending someone else's generic function already helps a lot. If we provided some tooling for finding cases where people are monkey patching and made it widely available, then that might really solve the whole issue.
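
A short sketch of what explicit qualification looks like in practice, assuming current Julia syntax. The `Packed` module and `PackedDB` type are hypothetical names chosen for illustration; only `Base.length` and `Base.getindex` are real functions being extended.

```julia
module Packed
export PackedDB

struct PackedDB
    items::Vector{Int}
end

# The `Base.` prefix makes it obvious at the definition site that these
# methods extend someone else's generic functions.
Base.length(db::PackedDB) = length(db.items)
Base.getindex(db::PackedDB, i::Int) = db.items[i]
end

using .Packed

db = PackedDB([10, 20, 30])
println(length(db))
println(db[2])
```

Because every extension carries its owner's name, a search for `Base.` in the module lists everything it adds to Base.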

Scott Jones

Apr 25, 2015, 10:52:22 AM
to julia...@googlegroups.com
Yes... I also think that explicit qualification would be a very good thing.

(I also wish Julia had explicit exception handling, a la CLU... exceptions in CLU didn't cost more than a normal return because of that rule, and
it really helped when trying to figure out a cluster's correctness)

Scott

Stefan Karpinski

Apr 25, 2015, 11:20:14 AM
to Julia Users
I think you're probably being overly optimistic about how infrequently there will be dispatch ambiguities between unrelated functions that happen to have the same name. I would guess that if you try to merge two unrelated generic functions, ambiguities will exist more often than not. If you were to automatically merge generic functions from different modules, there are two sane ways you could handle ambiguities:
  • warn about ambiguities when merging happens;
  • raise an error when ambiguous calls actually occur.
Warning when the ambiguity is caused is how we currently deal with ambiguities in individual generic functions. This seems like a good idea, but it turns out to be extremely annoying. In practice, there are fairly legitimate cases where you can have ambiguous intersections between very generic definitions and you just don't care because the ambiguous case makes no sense. This is especially true when loosely related modules extend shared generic functions. As a result, #6190 has gained a lot of support.
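
For readers unfamiliar with dispatch ambiguities, here is a minimal sketch of one inside a single generic function. The function name `overlap` is made up for illustration. As far as I know, current Julia releases no longer warn at definition time and instead raise a MethodError when the ambiguous call is actually made, which is what the sketch assumes.

```julia
# Two very generic methods whose signatures intersect at (Int, Int),
# with neither one more specific than the other.
overlap(x::Int, y) = "first argument is an Int"
overlap(x, y::Int) = "second argument is an Int"

println(overlap(1, "a"))   # only the first method applies
println(overlap("a", 2))   # only the second method applies

# The call overlap(1, 2) matches both methods equally well.
ambiguous = try
    overlap(1, 2)
    false
catch err
    err isa MethodError
end
println(ambiguous)
```

Defining `overlap(x::Int, y::Int)` would resolve the ambiguity, but as described above, there are cases where nobody cares about the intersection.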

If warning about ambiguities in a single generic function is annoying, warning about ambiguities when merging different generic functions that happen to share a name would be a nightmare. Imagine popular packages A and B both export a function `foo`. Initially there are no ambiguities, so things are fine. Then B adds some methods to its `foo` that introduce ambiguities with A's `foo`. In isolation A and B are both fine – so neither package author sees any warnings or problems. But suddenly every package in the ecosystem that uses both A and B – which is a lot since they're both very popular – is spewing warnings upon loading. Who is responsible? Package A didn't even change anything. Package B just added some methods to its own function and has no issues in isolation. How would someone using both A and B avoid getting these warnings? They would have to stop writing `using A` or `using B` and instead explicitly import all the names they need from either A or B. To avoid inflicting this on their users, A and B would have to carefully coordinate to avoid any ambiguities between all of their generic functions. Except that it's not just A and B – it's all packages. At that point, why have namespaces with exports at all?

What if we only raise an error when making calls to `foo` that are ambiguous between `A.foo` and `B.foo`? This eliminates the warning annoyance, which is nice. But it makes code that uses A and B that calls `foo` brittle in dangerous ways. Suppose, for example, you call `foo(x,y)` somewhere and initially this can only mean `A.foo` so things are fine. But then you upgrade B, which adds a method to `B.foo` that also matches the call to `foo(x,y)`. Now your code that used to work will fail at run time – and only when invoked with ambiguous arguments. This case may be possible but rare and not covered by your tests. It's a ticking time bomb introduced into your code just by upgrading dependencies.

The way this issue has actually been resolved, if you are using A and B and call `foo`, which is initially exported only by A, then as soon as package B starts exporting `foo`, you'll get an error and be forced to explicitly disambiguate `foo`. This is a bit annoying, but after you've done that, your code will no longer be affected by any changes to `A.foo` or `B.foo` – it's safe and permanently unambiguous. This still isn't 100% bulletproof. When `B.foo` is initially introduced, your code that used `foo`, expecting to call `A.foo`, will break when `foo` is called – but you may not have tests to catch this, so it could happen at an inconvenient time. But introducing new exports is far less common than adding methods to existing exports, and you are much more likely to have tests that use `foo` in some way than you are to have tests that exercise a specific ambiguous case. In particular, it would be fairly straightforward to check if the tests use every name that is referred to anywhere in some code – this would be a simple coverage measure. It is completely intractable, on the other hand, to determine whether your tests cover all possible ambiguities between functions with the same name in all your dependencies.

Anyway, I hope that's somewhat convincing. I think that the way this has been resolved is a good balance between convenient usage and "programming in the large".

Kevin Squire

Apr 25, 2015, 11:25:58 AM
to julia...@googlegroups.com
(#1255 would be icing on the cake here.)

Scott Jones

Apr 25, 2015, 11:53:42 AM
to julia...@googlegroups.com
A problem I'm running into is the following (maybe the best practice for this is documented, and I'm just too stupid to find it!):
I have created a set of functions, which use my own type, so they should never be ambiguous.
I would like to export them all, but I have to import any names that already exist...
Then tomorrow, somebody adds that name to Base, and my code no longer works...
I dislike having to explicitly import names to extend something, how am I supposed to know in advance all the other names that could be used?

What am I doing wrong?

Michael Francis

Apr 25, 2015, 12:30:08 PM
to julia...@googlegroups.com
Stefan, my takeaways from what you are saying are as follows.

1) dynamic dispatch doesn't work without the potential for surprising ambiguities in a mixed namespace environment. The more modules included the worse this gets.
2) A good practice would be to import no functions from modules I don't own into modules I do and explicitly qualify all access to external modules from day one. I can't afford to simply break one day.
3) inside my own namespace, modules continue to use exports but I have to implement SuperSecretBase modules managing my function collapses. (Or minimize the use of modules)
4) throwaway scripts can continue to work as before but risk breakage as new functions are exported.

Is that fair?

Stefan Karpinski

Apr 25, 2015, 12:31:47 PM
to Julia Users
Scott, I'm not really understanding your problem. Can you give an example?

Stefan Karpinski

Apr 25, 2015, 12:47:12 PM
to Julia Users
On Sat, Apr 25, 2015 at 12:30 PM, Michael Francis <mdcfr...@gmail.com> wrote:
> Stefan, my takeaways from what you are saying are as follows.
>
> 1) dynamic dispatch doesn't work without the potential for surprising ambiguities in a mixed namespace environment. The more modules included the worse this gets.

The issue here is not dispatch – it's deciding what global bindings refer to. Only if you merge generic functions upon import does naming get mixed up with dispatch, which we haven't done and I've argued would lead to various problems. The way things now work, naming and dispatch are completely orthogonal.

> 2) A good practice would be to import no functions from modules I don't own into modules I do and explicitly qualify all access to external modules from day one. I can't afford to simply break one day.

You can completely prevent any surprises if you have tests that exercise every global binding that occurs anywhere in your code. That's a fairly straightforward coverage metric and one that's not unreasonable to insist on keeping at 100% in production-quality code. It is much easier to satisfy than perfect test coverage: if you have many functions that use `foo` you only need one test that calls `foo` somehow to be sure that `foo` is an unambiguous reference.

We should implement this metric: https://github.com/JuliaLang/julia/issues/11006

> 3) inside my own namespace, modules continue to use exports but I have to implement SuperSecretBase modules managing my function collapses. (Or minimize the use of modules)

I don't follow. Is this still about ambiguity of global bindings?

> 4) throwaway scripts can continue to work as before but risk breakage as new functions are exported.

Yes – anything that gets bindings via `using`, doesn't fully qualify them, and isn't tested for unambiguous global name resolution could potentially break.

Scott Jones

Apr 25, 2015, 1:06:28 PM
to julia...@googlegroups.com
Like I said, this is likely just a newbie mistake... however, if I have the module:
module Foo
export my_new_function, length
import Base.my_new_function
import Base.length
type Bar ; x::Int ; end
length(y::Bar) = 42
my_new_function(y::Bar) = "Thanks for all the fish!"
end

It complains about not being able to import Base.my_new_function.
However, if I don't have that import, then if somebody (like you or Jeff) adds their nifty new function into Base, I will get an error trying to do using Foo when I try to use the new version...

Scott

Mauro

Apr 25, 2015, 1:43:36 PM
to julia...@googlegroups.com
>> 3) inside my own namespace, modules continue to use exports but I have to
>> implement SuperSecretBase modules managing my function collapses. (Or
>> minimize the use of modules)
>>

However, it probably makes sense to define the meaning of your generic
functions somewhere centrally. And this is also where the generic
documentation should go, as opposed to having to hunt down the general
documentation in different modules. I once opened a feature request
that generic functions can be created without methods:
https://github.com/JuliaLang/julia/issues/8283
This would fit with that approach.

Michael Francis

Apr 25, 2015, 1:49:31 PM
to julia...@googlegroups.com
Mauro,

I like the idea that there is a central place. It is perhaps something akin to an interface. Being able to define and document an external interface for a module makes a lot of sense.

I'd love to see interfaces in general, especially for things like iteration. If I implement a base interface for my type I'd like to be able to assert that it is fully implemented.

Mauro

Apr 25, 2015, 1:57:18 PM
to julia...@googlegroups.com
> I'd love to see interfaces in general, especially for things like
> iteration. If I implement a base interface for my type I'd like to be
> able to assert that it is fully implemented.

Check out Traits.jl (is getting reasonably stable, although I haven't
updated it to post #10380 Julia).

Jeff Bezanson

Apr 25, 2015, 1:57:23 PM
to julia...@googlegroups.com
Michael, that's not a bad summary. I would make a couple edits. You
don't really need to qualify *all* uses. If you want to use `foo` from
module `A`, you can put `import A.foo` at the top and then use `foo`
in your code. That will have no surprises and no breakage.
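
A sketch of that suggestion in current Julia syntax, with made-up modules `A`, `B`, and `Client` defined in one file (hence the relative `..A` paths). Both `A` and `B` export `foo`; the explicit import at the top of `Client` pins the unqualified name to `A.foo`.

```julia
module A
export foo
foo(x::Int) = "A.foo"
end

module B
export foo
foo(x::String) = "B.foo"
end

module Client
import ..A: foo     # explicit import: `foo` in this module always means A.foo
using ..A, ..B      # other exported names are still available via `using`

from_a = foo(1)         # unqualified, resolved by the explicit import
from_b = B.foo("x")     # B's unrelated function needs qualification
end

println(Client.from_a)
println(Client.from_b)
```

Only the names actually imported this way need to be listed; everything else can keep coming from `using`.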

Also I think calling it "SuperSecretBase" makes it sound worse than it
is. You can have modules that describe a certain named interface, and
then other modules extend it. Which reminds me that I need to
implement #8283, so you can introduce functions without adding methods
yet.
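
A sketch of such an interface module, assuming current Julia, where a generic function can be declared with zero methods using `function name end`. The module name `ValueInterface` is hypothetical; `Foo`, `Bar`, and `value` follow Michael's original example.

```julia
module ValueInterface
export value

"""
    value(x)

Return the value held by `x`. Implementing modules add their own methods.
"""
function value end   # declares the generic function without any methods
end

module Foo
import ..ValueInterface: value
export FooType
struct FooType end
value(::FooType) = "Foo::value"
end

module Bar
import ..ValueInterface: value
export BarType
struct BarType end
value(::BarType) = "Bar::value"
end

using .ValueInterface, .Foo, .Bar

println(value(FooType()))
println(value(BarType()))
```

Compared with the `value() = nothing` stub, the declaration adds no callable method, and the interface module gives the shared documentation a single home.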

Jeff Bezanson

Apr 25, 2015, 2:10:25 PM
to julia...@googlegroups.com
Scott, the behavior you're trying to get sounds to me like "IF this
function exists in Base then I want to extend it, otherwise just make
my own version of the function". That strikes me as a hack. What we've
tended to do is let everybody define whatever they want. Then if we
see the same name appearing in multiple packages, we decide if there
is indeed a common interface, and if so move the packages to using it,
e.g. by creating something like StatsBase or maybe adding something to
Base. But we don't want Base to grow much more, if at all.

Getting an error for using both Base and your package seems annoying,
but alternatives that involve doing "something" silently surely must
be considered worse. If a colliding name gets added to Base, the
default behavior should not be to assume that you meant to interfere
with its behavior.

Scott Jones

Apr 25, 2015, 3:27:18 PM
to julia...@googlegroups.com
My point is, if I have been careful, and export methods that always reference at least one type defined locally in my module, so that they
should always be unambiguous, I should NOT have to know about any other module (or Base) that a user of my module might also be using that has a function with the
same name, and should NOT have to do an import.

For methods where I *am* trying to extend some type defined in another module/package or Base, then yes, I believe you should do something explicitly to indicate that.

I don't think there is any real conflict here... right now it is too restrictive when the module's programmer has clearly signaled their intent by always using their own, unambiguous
signatures for their functions.

Have I got something fundamentally wrong here?

Thanks,
Scott

Jeff Bezanson

Apr 25, 2015, 3:58:16 PM
to julia...@googlegroups.com
I think this is just a different mindset than the one we've adopted.
In the mindset you describe, there really *ought* to be only one
function with each name, in other words a single global namespace. As
long as all new definitions for a function have disjoint signatures,
there are no conflicts. To deal with conflicts, each module has its
own "view" of a function that resolves conflicts in favor of its
definitions.

This approach has a lot in common with class-based OO. For example in
Python when you say `x.sin()`, the `sin` name belongs to a single
method namespace. Sure there are different namespaces for *top level*
definitions, but not for method names. If you want a different `sin`
method, you need to make a new class, so the `x` part is different.
This corresponds to the requirement you describe of methods
referencing some new type from the same julia module.

Well, that's not how we do things. For us, if two functions have the
same name it's just a cosmetic coincidence, at least initially. In
julia two functions can have the same name but refer to totally
different concepts. For example you can have Base.sin, which computes
the sine of a number, and Transgressions.sin, which implements all
sorts of fun behavior. Say Base only defines sin(x::Float64), and
Transgressions only defines sin(x::String). They're disjoint. However,
if you say

map(sin, [1.0, "sloth", 2pi, "gluttony"])

you can't get both behaviors. You'll get a method error on either the
1.0 or the string. You have to decide which notion of `sin` you mean.
We're not going to automatically merge the two functions.
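
The point can be sketched in current Julia as follows. `Transgressions` is the hypothetical module from the example above; note that `String` is a concrete type in current releases, which differs from the Julia version of this thread.

```julia
module Transgressions
# There is no `import Base: sin` here, so this defines a brand-new function
# that merely shares its name with Base.sin.
sin(x::String) = "transgression: " * x
end

println(Transgressions.sin === Base.sin)   # two distinct function objects
println(Transgressions.sin("sloth"))
println(Base.sin(0.0))

# Passing one of the two functions to map cannot cover both element types.
mixed = Any[1.0, "sloth"]
threw = try
    map(Transgressions.sin, mixed)
    false
catch err
    err isa MethodError
end
println(threw)
```

Getting both behaviors from one name would require `Transgressions` to extend `Base.sin` explicitly, which is a deliberate decision by its author and not something the language infers.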

Scott Jones

Apr 25, 2015, 4:24:39 PM
to julia...@googlegroups.com


On Saturday, April 25, 2015 at 3:58:16 PM UTC-4, Jeff Bezanson wrote:
> I think this is just a different mindset than the one we've adopted.
> In the mindset you describe, there really *ought* to be only one
> function with each name, in other words a single global namespace. As
> long as all new definitions for a function have disjoint signatures,
> there are no conflicts. To deal with conflicts, each module has its
> own "view" of a function that resolves conflicts in favor of its
> definitions.

As a practical point, *why* should I have to know about every other package or module that users of my package might possibly want to use at the same time?
With the way it is now, it seems I have to force everybody to not use `using`, and use fully specified names, which seems utterly against the extensibility of Julia,
because if I try to export a function, I must know the intentions of every user, which packages they might load, etc. that might possibly have the same name.

I have a module that defines a packed database format, and I want to define length, push!, and getindex methods...
Then (for example's sake) I also want to define a foobar method that people can use, and be able to call it on objects from my module with just
foobar(db,arg1,arg2) (where db is from my class).
All is well and good, but then some user complains that they can't use my package together with package newdb, because coincidentally newdb also defines a function
called foobar, which does have a different signature.

I believe they should be able to use both, as long as there aren't any real conflicts, *without* spurious warnings...
 
> This approach has a lot in common with class-based OO. For example in
> Python when you say `x.sin()`, the `sin` name belongs to a single
> method namespace. Sure there are different namespaces for *top level*
> definitions, but not for method names. If you want a different `sin`
> method, you need to make a new class, so the `x` part is different.
> This corresponds to the requirement you describe of methods
> referencing some new type from the same julia module.
>
> Well, that's not how we do things. For us, if two functions have the
> same name it's just a cosmetic coincidence, at least initially. In
> julia two functions can have the same name but refer to totally
> different concepts. For example you can have Base.sin, which computes
> the sine of a number, and Transgressions.sin, which implements all
> sorts of fun behavior. Say Base only defines sin(x::Float64), and
> Transgressions only defines sin(x::String). They're disjoint. However,
> if you say
>
> map(sin, [1.0, "sloth", 2pi, "gluttony"])
>
> you can't get both behaviors. You'll get a method error on either the
> 1.0 or the string. You have to decide which notion of `sin` you mean.
> We're not going to automatically merge the two functions.



I'm not saying you should... on the other hand, if I have two functions from different packages, developed independently,
that happen to have a name in common (but with different signatures), the users should not have to somehow get the developers
together (who may not even be around anymore) to resolve the conflict (which would probably adversely affect other users
of both packages if some names had to be changed).

> Then if we
> see the same name appearing in multiple packages, we decide if there
> is indeed a common interface, and if so move the packages to using it,
> e.g. by creating something like StatsBase or maybe adding something to
> Base. But we don't want Base to grow much more, if at all.

I'm sorry, but that just seems like a recipe for disaster... you are saying that *after* users finally
decide they want to use two packages together, that then somehow you will force the
developers of the packages to agree on a common interface, or change the names of conflicting functions,
or make everybody use names qualified with the module name(s)...

As for your map example...
If instead I have map(sin, [1.0, myslothdays, 2pi, mygluttonydays] ),
where myslothdays and mygluttonydays both have the type MySinDiary, and there is a Transgressions.sin(x::Transgressions.MySinDiary) method...
that should work, right?

What is a good reason for it not to work?

Scott

Jeff Bezanson

Apr 25, 2015, 5:06:04 PM
to julia...@googlegroups.com
The reason for it not to work is that we have two different concepts
that happen to be spelled the same.

To me you're just describing inherent problems with name conflicts and
agreeing on interfaces. Having a single method namespace is hardly a
magic bullet for that. It seems to require just as much coordination.
In Python again, two people might develop socket libraries that
implement connect, but one uses foo.connect(address, port) and the
other uses bar.connect(port, address). At that point, you absolutely
have to get the two developers to agree on one interface so that
people can use both, and say x.connect() where x might be either foo
or bar. If the libraries can't be changed, you can write shims to make
them compatible. But you can do the same thing in julia. In julia the
disaster is no bigger than usual.

Quite rightly, you are focusing on how code changes over time, and
what problems that might cause. But your design focuses on adding
functions, and assumes signatures don't change as much and have a
particular structure (i.e. typically referring directly to a type
defined in the same module). If those assumptions hold, I agree that
it could work very well. But it "breaks" as signatures change, while
our design "breaks" as export lists change. I prefer our tradeoff
because method signatures are far more subtle. Comparing method
signatures is computationally difficult (exponential worst case!),
while looking for a symbol in a list is trivial. Warnings for name
conflicts may be annoying, but at least it's dead obvious what's
happening. If a subtle adjustment to a signature affects visibility
elsewhere, I'd think that would be much harder to track down.

Mauro

Apr 25, 2015, 5:27:10 PM4/25/15
to julia...@googlegroups.com
I don't think it is realistic to expect to be able to willy-nilly be
'using' any number of packages and it just works. The way you propose
may work most of the time, however, there were some solid arguments made
in this thread on how that can lead to hard to catch failures.

And maybe more importantly, from a programmer's sanity perspective, I
think it is imperative that one generic function does just one
conceptual thing. Otherwise it gets really hard to figure out what a
piece of code does.

Scott Jones

Apr 25, 2015, 5:47:49 PM4/25/15
to julia...@googlegroups.com
I think lindahua had it right:
Generally, conflicting extensions of methods are a natural consequence of allowing packages to evolve independently (which we should anyway). It is unavoidable as the eco-system grows (even if we address the Images + DataArrays problem by other means). If this coupling over packages cannot be addressed in a scalable way, it would severely influence the future prospect of Julia to become a mainstream language.

Right now, Julia already has a big mess of overloaded operators and function names that aren't really exactly the same interface... (* and ^ on strings, for example, or ~ in DataFrames :-) ).

I do have a lot of experience of how code changes over time ;-)
But no, I did not assume that signatures wouldn't change within a package, not at all.

My suggestion doesn't break as signatures change... as long as you, as the package/module creator, have used one of "your" types, so that you know that things are unambiguous,
then things don't break no matter how the signatures of my functions change...

Your method means that users are forced to 1) not use using on packages with coincidentally conflicting names and specify everything with the package/module name, or
2) force one of the package developers (if they are even still around) to change their package to avoid conflicting with somebody else's package's names,
which will require users to remember to always use package A before package B (B being the one that had to change their package)...

You also seem to think that the only dynamic use of the language will be in the REPL...
If I write a system that can dynamically load Julia code from a database and execute it, where are all these warning messages going to be going?

Here's another thought:
Developer A makes a nice database binding package, with 20 different functions.

Developer B totally independently makes a new database binding package, that, because people in the same area are likely to pick similar names,
has 3 names in common with A (with totally different signatures).

Developer C comes along (me), and needs to use both packages... say A for the data sources and B for the backing store...
Why should I not be able to use "using A using B" or "using B using A", without worrying that they coincidentally used a few names that conflict,
and not have to try to get A & B together to come up with some common interface...

Remember, I am talking about something that I think would be pretty easy to determine, which somebody else had suggested somewhere
here (i.e. using the fact that a type local to the module was in the signature),  to say that you don't need the import <name> if you are exporting a function.
I am not talking about never having to do the import <name>, nor ever having the warnings in the case where you are creating a method on types
outside of the module.

To me, the fact that you have had to go to this "SuperSecretBase" points out the problems with the current design...

Scott

Scott Jones

Apr 25, 2015, 5:55:33 PM4/25/15
to julia...@googlegroups.com
The problem is, in practice, people *will* have names that collide, and will not mean the same thing.
It seems that people here are trying to say, if you have a particular name you'd like to use,
you'd better get together with all other developers past and future and hammer out who
"owns" the name, and what concept it can be used for... (like mathematical sin and fun sin,
or tree bark and dogs bark... it gets even worse when you consider other languages...
[Say I'm in Spain, and I write a robotics package that has a function "coger"..., and somebody in Argentina
writes a function "coger" that does something, well, XXX...])

I just don't see this as working for any length of time (and I think it is already breaking down with Julia...
to me, the fact that DataFrames picked using ~ as a binary operator, when that might have been
something that somebody wanted to use in the core language, shows how fragile things
are now...)

Scott

Kevin Squire

Apr 25, 2015, 6:10:28 PM4/25/15
to julia...@googlegroups.com
I don't really see anything wrong with prefixing package names to functions to distinguish usage. That's the general route that python has been going (for similar reasons), and it tends to work pretty well (especially if one can alias the package name to something short).

For Julia, this would mean discouraging use of "using MyPackage", and making module aliasing simpler. 

Cheers,
   Kevin

Jeff Bezanson

Apr 25, 2015, 6:13:11 PM4/25/15
to julia...@googlegroups.com
When two people write packages independently, I claim there are only
two options: (1) they implement a common interface, (2) they don't. To
pick option (1), there has to be some kind of centralization or
agreement. For option (2), which is effectively the default, each
package just gets its own totally separate function, and you have to
say which one you want. We're not saying "you'd better get together
with all other developers", because with option (2) you don't need to.

IIUC, you're proposing option (3), automatically merge everybody's
methods, assuming they don't conflict. But I don't see how this can
work. We could have:

module A
type AConnectionManager
end

function connect(cm::AConnectionManager, port, address)
end
end

module B
type BConnectionManager
end

function connect(cm::BConnectionManager, address, port)
end
end

Obviously, you cannot do `using A; using B`, and then freely use
`connect` and have everything work. The fact that the type of the
first argument distinguishes methods doesn't help. The rest of the
arguments don't match, and even if they did the behaviors might not
implement compatible semantics. The only options I see are my options
1 and 2: (1) move to a common interface, or (2) specify A.connect or
B.connect in client code, because the interfaces aren't compatible.
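
A minimal sketch of option (2), written in current Julia syntax (`struct` rather than the `type` keyword used above). The module bodies and return strings are made up for illustration; only the argument orders follow the example. Both modules export `connect`, and the client says which one it means by qualifying the call:

```julia
module A
export connect
struct AConnectionManager end
# Hypothetical body: port first, then address, as in the example above
connect(cm::AConnectionManager, port::Integer, address::AbstractString) =
    "A: $address:$port"
end

module B
export connect
struct BConnectionManager end
# Hypothetical body: address first, then port
connect(cm::BConnectionManager, address::AbstractString, port::Integer) =
    "B: $address:$port"
end

using .A, .B

# Qualified calls are always unambiguous, whatever the two export lists contain
a = A.connect(A.AConnectionManager(), 3000, "127.0.0.1")
b = B.connect(B.BConnectionManager(), "127.0.0.1", 3000)
```

As far as I know, an unqualified `connect` after `using .A, .B` is reported as a conflict rather than being merged, which is the behavior this thread is about.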

Scott Jones

Apr 25, 2015, 6:25:25 PM4/25/15
to julia...@googlegroups.com
No, not at all.

I have a bunch of code written using package A.
It knows what the correct arguments are to the connect function for type AConnectionManager.
I did using A, because having to specify the package all the time (however short the name is) is tedious,
and it would mean that I wouldn't be able to use other packages which were designed to extend package A
 (and did use the import A to extend the functions with new methods).

Then I need to connect to another database package, that also used the name connect, with other arguments,
but no conflicting signatures at all.
It has a connect(BConnectionManager, namespacename, globalname) function (with a set of methods).
There might also be further packages which extend module B's functions, explicitly doing import B export extendedfunction
I think that should work just fine, but you are saying that I *must* specify B.connect, which then also means (AFAIK),
that I won't get a C.connect that extends B.connect (and intended to).

Why do you want to restrict what can be done, because you have this view that, like the Highlander, there can only be *one* true interface for a name?

Scott

Kevin Squire

Apr 25, 2015, 6:36:55 PM4/25/15
to julia...@googlegroups.com
Hi Scott,

While the current system in Julia may not be perfect, I'm finding it hard to follow some of your thoughts right now.  Perhaps you could come up with some trivial code which explains your concerns?  In particular, I'm not sure what you mean by "wouldn't be able to use other packages which were designed to extend package A".  I've been using Julia for a couple of years and haven't run into such problems.

Cheers!

  Kevin

Jeff Bezanson

Apr 25, 2015, 6:56:55 PM4/25/15
to julia...@googlegroups.com
In general connect(x, y, z) is dynamically dispatched: you don't
always know the type of `x`. So you wouldn't be able to write
*generic* code that uses connect. In generic code, there really can be
only one interface: if I write code that's supposed to work for any
`x`, and I say `connect(x, address, port)`, like it or not my code
only works for one of A and B, not both.
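
To make that concrete, here is a small sketch that assumes the two `connect` methods had already been merged into one function. The type names `AManager` and `BManager` and the helper `open_default` are hypothetical. The generic helper fixes one argument order, so it can only serve one of the two:

```julia
struct AManager end
struct BManager end

# Pretend these came from two independent packages and were merged
connect(cm::AManager, port::Integer, address::AbstractString) = "A $address:$port"
connect(cm::BManager, address::AbstractString, port::Integer) = "B $address:$port"

# Generic code: written once, for "any x", with one argument order
open_default(x) = connect(x, "127.0.0.1", 3000)

ok = open_default(BManager())

# The same generic call has no matching method for AManager
failed = try
    open_default(AManager())
    false
catch err
    err isa MethodError
end
```

Dispatch on the first argument picks the right package, but it does nothing about the remaining arguments.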

ele...@gmail.com

Apr 25, 2015, 7:49:33 PM4/25/15
to julia...@googlegroups.com
I think the key issue is your:
 
I believe they should be able to use both, as long as there aren't any real conflicts, *without* spurious warnings...

As Jeff said, the problem is "aren't any real conflicts" is not possible to determine in all cases, and can be costly in others. And IIUC it's not possible to know ahead of time if it can be determined.

So because it's not practically possible to protect against problematic situations, Julia plays it safe and complains about all situations.

Cheers
Lex

Scott Jones

Apr 25, 2015, 8:02:56 PM4/25/15
to julia...@googlegroups.com
I think you are again misunderstanding... I am *not* writing "generic" code.
I am writing code that accesses database A, with names like connect, set_record, get_record, disconnect
Its connect takes an argument of type A.DBManager, with some parameters like address, port, user, password, and returns an A.DBConnection object.
There is absolutely no ambiguity with the Base.connect, nor is the interface necessarily even the same, *however*, it is what the users of database A
would *expect* as far as names (possibly identical names to database A's C bindings).
I also use some other code that extends that package, say BetterA, adding useful stuff like being able to serialize / deserialize Julia objects, and use set_record and get_record.
That code was written with A in mind, and *explicitly* imports set_record and get_record, extends them, and exports them.
My code does using A, using BetterA, and then does things like:
myconn = connect(aManager, "127.0.0.1", 3000, "scott", "") ; set_record(myconn, myjuliaobject)
That set_record is actually handled nicely by multiple dispatch, the set_record in BetterA takes the Julia object, builds a string, and then calls A's set_record.

If I understand correctly (and this is why I said at the very beginning, part of this may be my newness to Julia), then if I have to explicitly reference
A.set_record, it will not work, because it will *not* dispatch to BetterA.set_record...

Is that correct or not?

Note, another reason I *don't* want to have to specify the module/package, is what happens if I want to use another package, that implements the same interface?
For example, I started out using MongoDB.connect, MongoDB.set_document!, etc., but then my old classmate, friend and great 6.111 partner Brad Kuszmaul comes along and convinces me that
TokuMX is the greatest thing since sliced bread, so now I want to simply do:
using TokuMX instead of using MongoDB, and everything is hunky-dory, but I'll be very sad if I had to go in and edit all the code to say TokuMX.set_document! instead of MongoDB.set_document!.

Now, after I've gotten my code working for multiple data sources I realize I need to connect to a KVS to use as the backend... So, I want to use a package GTM, that has a
connect(gtm_manager, cluster, namespace), and a set_node!(gtmconnect, global, value, subscripts...).  Later, I discover that there is another package GlobalsDB, that implements the same interface,
and so I have the same issue, I don't want to have been forced to not do using, when there are absolutely no ambiguities, and I really don't want to have to use module names on my calls!

The important point is that there can be different, unambiguous, perfectly valid sets of functions, which do not implement the same interface, but which may indeed implement different interfaces
(MongoDB & TokuMX may implement some sort of Document DB interface, while GTM & GlobalsDB implement an ANSI M interface, and MySQL & Postgres & SQL_Server & ODBC all implement a SQL interface...
and *all* of them are going to want to call things by the names that make sense to users of their systems... and those users are also going to want to be able to use multiple interfaces without hassle)

This is all very real world...

@kevin I hope this answers your question as well

Scott Jones

Apr 25, 2015, 8:12:51 PM4/25/15
to julia...@googlegroups.com
The compiler can't determine that there are no conflicts in the case where the method uses a type that is local to the module?
That is the *only* case where I am saying that it should not be necessary to have an "import Base.bar" or "import Foo.bar" if I
want to export a function and have it available at a higher level, and not have to worry that later on somebody adding bar to Base
or Foo will cause my module to stop working?
What if I had a getsockname function? ;-)
That name was just added to Base tonight apparently... so my code would break...
Not good!

ele...@gmail.com

Apr 25, 2015, 8:59:44 PM4/25/15
to julia...@googlegroups.com


On Sunday, April 26, 2015 at 10:12:51 AM UTC+10, Scott Jones wrote:
The compiler can't determine that there are no conflicts in the case where the method uses a type that is local to the module?

That is not a sufficient condition, a function of the same name which uses ::Any in the same parameter position can conflict IIUC.

 
That is the *only* case where I am saying that it should not be necessary to have an "import Base.bar" or "import Foo.bar" if I
want to export a function and have it available at a higher level, and not have to worry that later on somebody adding bar to Base
or Foo will cause my module to stop working?

As the module writer you have no control over how your module is used, you shouldn't be trying to enforce that your functions don't conflict since that implies a knowledge of the uses of your module.  The user of your module is the only one that knows which function they mean, they have to tell the compiler, not you.
 
What if I had a getsockname function? ;-)
That name was just added to Base tonight apparently... so my code would break...
Not good!

It should break, the compiler cannot in general know if there are no conflicts, and IIUC *cannot even know if it can determine it efficiently*.  Yes you can give examples where it would safely work, but if the compiler cannot check that without potentially entering an expensive computation, then it simply cannot check at all.

Michael Francis

Apr 25, 2015, 10:00:50 PM4/25/15
to julia...@googlegroups.com
I don't think Any in the same position is a conflict. This would be more of an issue if Julia did not support strong typing, but it does and is a requirement of dynamic dispatch. Consider

function foo( x::Any )

Will never be chosen over

type Foo end

function foo( x::Foo )

As such I don't get the argument that if I define functions against types I define they cause conflicts.

Being in the position of having implemented a good number of modules and being bitten in this way both by my own dev and by changes to other modules I'm very concerned with the direction being taken.

I do think formalization of interfaces to modules ( and behaviors) would go a long way but expecting a diverse group of people to coordinate is not going to happen without strong and enforced constructs.

As an example I have implemented a document store database interface, this is well represented by an associative collection. It also has a few specific methods which would apply to many databases. It would be nice to be able to share the definition of these common interfaces. I don't advocate adding these to base so how should it be done?

Stefan Karpinski

Apr 25, 2015, 10:24:54 PM4/25/15
to julia...@googlegroups.com
I think there's some confusion here.

If BetterA extends a function from A then those are the same function object and calling A's function is the same as calling BetterA's function by that name.

If several modules implement a common interface, they shouldn't simply happen to use the same names – they should implement different methods of the same function object, inherited from a common namespace. These methods should dispatch on a connection object that determines whose method to use.
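
A rough sketch of the first point, in current Julia syntax (`struct` instead of `type`). The modules `A` and `BetterA`, the `DBConnection` type and its `store` field are stand-ins modelled on the description earlier in the thread, not real packages. Because `BetterA` imports `set_record` before adding a method, both modules name the same function object:

```julia
module A
export set_record
struct DBConnection
    store::Vector{String}   # hypothetical in-memory stand-in for a database
end
set_record(conn::DBConnection, value::String) = (push!(conn.store, value); conn)
end

module BetterA
import ..A: set_record, DBConnection   # explicit import: extend A's function
export set_record
# New method for arbitrary Julia objects: build a string, then call A's method
set_record(conn::DBConnection, value) = set_record(conn, repr(value))
end

using .A, .BetterA

conn = A.DBConnection(String[])
set_record(conn, "plain string")   # unqualified: no conflict, same function
A.set_record(conn, [1, 2, 3])      # qualified with A: still reaches BetterA's method

same = A.set_record === BetterA.set_record
```

So qualifying the call with `A.` does not bypass the methods that `BetterA` added.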

Scott Jones

Apr 25, 2015, 10:41:16 PM4/25/15
to julia...@googlegroups.com
That's why I had asked the following:

If I understand correctly (and this is why I said at the very beginning, part of this may be my newness to Julia), then if I have to explicitly reference
A.set_record, it will not work, because it will *not* dispatch to BetterA.set_record...
Is that correct or not?

So you are saying that that is not so, that once I've extended a function in A in my module BetterA, it doesn't matter if I call that function as A.func or BetterA.func?

Scott
...

ele...@gmail.com

Apr 26, 2015, 12:25:28 AM4/26/15
to julia...@googlegroups.com
The situation I was describing is that there is:

module A
type Foo end
f(a::Any)  ...
f(a::Foo) ...

which expects f(a) to dispatch to its ::Any version for all calls where a is not a Foo, and there is:

module B
type Bar end
f(a::Bar) ...

so a user program (assuming the f() functions combined):

using A
using B

b = Bar()
f(b)

now module A is written expecting this to dispatch to A.f(::Any) and module B is written expecting this to dispatch to B.f(::Bar) so there is an ambiguity which only the user can resolve, nothing tells the compiler which the user meant.
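
A hand-simulated version of that situation, in current Julia syntax. The merge rule is hypothetical (Julia does not do this automatically), and the `describe` helper and return strings are made up for illustration. Adding B's method to A's function changes what A's own code sees:

```julia
module A
struct Foo end
f(a) = "A.f(::Any)"
f(a::Foo) = "A.f(::Foo)"
# A's internal helper counts on the ::Any fallback for every non-Foo argument
describe(x) = f(x)
end

module B
struct Bar end
end

before = A.describe(B.Bar())

# Simulate an automatic merge by adding B's method to A's function
A.f(a::B.Bar) = "B.f(::Bar)"

after = A.describe(B.Bar())
```

Dispatch itself is not ambiguous here (the `::Bar` method is simply more specific), but A's author and B's author each expected a different result for the same call.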

Cheers
Lex

Jeff Bezanson

Apr 26, 2015, 1:18:40 AM4/26/15
to julia...@googlegroups.com
Scott-- yes! If BetterA imports and extends a function from A, it is
exactly the same function object. A.func and BetterA.func will be
identical. Problems only enter if A and BetterA were developed in
total isolation, and the authors just happened to pick the same name.

The way to go here would be to have the DocumentDB module define an
interface, and then both MongoDB and TokuMX extend it. What I don't
get is that elsewhere in your argument, you seem to say that
coordinating and agreeing on an interface is a non-starter. But I just
don't see how you could talk to both MongoDB and TokuMX with the same
code unless they agreed on an interface.
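
A sketch of what that could look like, in current Julia syntax. The module names below are stand-ins taken from this discussion, not the real packages, and the `Manager` types and return strings are invented for illustration:

```julia
module DocumentDB
export connect
# Generic function with no methods yet; each backend adds its own
function connect end
end

module MongoDB
import ..DocumentDB: connect
struct Manager end
connect(m::Manager, host::AbstractString, port::Integer) = "mongo://$host:$port"
end

module TokuMX
import ..DocumentDB: connect
struct Manager end
connect(m::Manager, host::AbstractString, port::Integer) = "toku://$host:$port"
end

using .DocumentDB

# Client code written once against the shared interface
open_local(manager) = connect(manager, "127.0.0.1", 3000)

mongo = open_local(MongoDB.Manager())
toku = open_local(TokuMX.Manager())
```

Swapping backends then means passing a different manager object; the client code never names the backend module in its calls.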

Now consider `connect`. This might be common to both DocumentDB and
SQLDB. Is it similar enough that it should go in an even more abstract
interface GeneralDB? I'm not sure, but I think this is exactly the
kind of interface design process that happens in any OO language. In
Java, you can have a `connect` method whether or not you implement a
`Connectable` interface. But only if you explicitly refer to the
Connectable interface will you be able to work with code that requires
it. I think the same kind of thing is happening here. You can just
write a `connect` function, or you can say `import
Connectable.connect` first, in which case you will be extending the
"public" `connect` function.

Scott Jones

Apr 26, 2015, 5:20:30 AM4/26/15
to julia...@googlegroups.com


On Sunday, April 26, 2015 at 1:18:40 AM UTC-4, Jeff Bezanson wrote:
Scott-- yes! If BetterA imports and extends a function from A, it is
exactly the same function object. A.func and BetterA.func will be
identical. Problems only enter if A and BetterA were developed in
total isolation, and the authors just happened to pick the same name.

The way to go here would be to have the DocumentDB module define an
interface, and then both MongoDB and TokuMX extend it. What I don't
get is that elsewhere in your argument, you seem to say that
coordinating and agreeing on an interface is a non-starter. But I just
don't see how you could talk to both MongoDB and TokuMX with the same
code unless they agreed on an interface.

You are missing a couple points here:
1) Why would MongoDB and TokuMX coordinate and agree on an interface?
    They are competitors.  TokuMX has made their code compatible with MongoDB's interface,
     and MongoDB may very well decide to add things that will break TokuMX in the future.
2) While they might have compatible interfaces, because TokuMX decided to copy it,
     they are not likely to do anything together.

Now consider `connect`. This might be common to both DocumentDB and
SQLDB. Is it similar enough that it should go in an even more abstract
interface GeneralDB? I'm not sure, but I think this is exactly the
kind of interface design process that happens in any OO language. In
Java, you can have a `connect` method whether or not you implement a
`Connectable` interface. But only if you explicitly refer to the
Connectable interface will you be able to work with code that requires
it. I think the same kind of thing is happening here. You can just
write a `connect` function, or you can say `import
Connectable.connect` first, in which case you will be extending the
"public" `connect` function.

Here is the big stumbling block - you assume that the writers of these packages, whose names have collided because they
are doing similar, but not identical things, will be able to go back and come up with another interface,
and that there should be one true meaning for a particular name.
Since there is already a "connect" in Base, you are saying that nobody else can have a different interface that uses that name?
Also, let's say I had a "getsockname" in my DB access code, to get the name of the socket used by a connection...
Yesterday, people using my package would suddenly get errors, because #11012 added their own getsockname to Base.

Not good at all, IMO.

Scott Jones

Apr 26, 2015, 5:23:02 AM4/26/15
to julia...@googlegroups.com
Ah, but that is NOT the situation I've been talking about... If the writer of a module wants to have a function that takes ::Any, and is not using any other types that are local to that package, then, from the rules I'd like to see, they *would* have to explicitly import from Base (or whichever module they intended to extend).

Scott

ele...@gmail.com

Apr 26, 2015, 5:46:15 AM4/26/15
to julia...@googlegroups.com


On Sunday, April 26, 2015 at 7:23:02 PM UTC+10, Scott Jones wrote:
Ah, but that is NOT the situation I've been talking about... If the writer of a module wants to have a function that takes ::Any, and is not using any other types that are local to that package, then, from the rules I'd like to see, they *would* have to explicitly import from Base (or whichever module they intended to extend).

Neither module is extending anything, they are separate.  It's the user that wants to use them both in the same program, and it's the user that Julia is protecting.

Cheers
Lex

Scott Jones

Apr 26, 2015, 6:24:15 AM4/26/15
to julia...@googlegroups.com
Yes, precisely... and I *do* want Julia to protect the user *in that case*.
If a module has functions that are potentially ambiguous, then 1) if the module writer intends to extend something, they should do it *explicitly*, exactly as now, and 2) Julia *should* warn when you have "using" package, not just at run-time, IMO.
I have *only* been talking about the case where you have functions that the compiler can tell in advance, just by looking locally at your module, by a very simple rule, that they cannot be ambiguous.

Scott

Toivo Henningsson

Apr 26, 2015, 8:16:24 AM4/26/15
to julia...@googlegroups.com
Regarding automatic merging of functions: It's hard enough as it is when you are reading code with several using statements to figure out which function actually gets called at a given point. Writing a piece of code, it might be quite obvious to you which function of a given name that you wanted to call, but consider the additional mental steps for someone reading it, including yourself at some point in the future.

Scott Jones

Apr 26, 2015, 9:53:08 AM4/26/15
to julia...@googlegroups.com
That is up to the user to make their code readable.
I don't think this makes that necessarily any harder.

My example:

using GlobalDB

globalDBManager = GlobalDB.Manage()
myGlobal = connect(globalDBManager, "mynamespace", "myglobal")

myGlobal["12345.566"] = "A numeric subscript"

....

using SQLDB
sqlDBManager = SQLDB.Manage()
sourceSQL = connect(sqlDBManager, "127.0.0.1", 3000)

Please tell me, just where is that hard to read?

I think it is perfectly readable - much better than having to try to figure out what:

a *= b

means, when it could be: multiplying two numbers (scalar, rational, complex), or matrix multiplication of two vectors (I would rather have to explicitly use the Unicode . symbol for dot product in that case), or multiplying a vector by a scalar... or the really confusing case, especially for people who think of strings as being vectors of characters, appending b to a...

Scott

David Gold

Apr 26, 2015, 12:10:41 PM4/26/15
to julia...@googlegroups.com
You, Jeff and Stefan seem to be concerned with different kinds of "ambiguity." Suppose I import `foo(T1, T2)` from module `A` and `foo(T2, T1)` from module `B`. I take you to claim that if I call `foo(x, y)` then, as long as there is no ambiguity which method is appropriate, the compiler should just choose which of `A.foo()` and `B.foo()` has the proper signature. I take you to be concerned with potential ambiguity about which method should apply to a given argument.

On the other hand, I take Jeff and Stefan to be concerned with the ambiguity of to which object the name '`foo()`' refers (though they should chime in if this claim or any of the following are wayward). Suppose we're the compiler and we come across a use of the name '`foo()`' while running through the code. Our first instinct when we see a name like '`foo()`' is to look up the object to which the name refers. But what object does this name refer to? You (Scott) seem to want the compiler to be able to say to itself before looking up the referent, "If the argument types in this use of '`foo()`' match the signature of `A.foo()`, then this instance of '`foo()`' refers to `A.foo()`. If the argument types match the signature of '`B.foo()`', then this instance of '`foo()`' refers to '`B.foo()`'." But then '`foo()`' isn't really a name, for the referent of a name cannot be a disjunction of objects. Indeed, the notion of a "disjunctive name" is an oxymoron. If you use my name, but your reference could be to either me or some other object depending on context, then you haven't really used my name, which refers to me regardless of context.

I suspect that many problems await if you try to adopt this "disjunctive reference" scheme. In particular, you'd need to develop a general way for the compiler to recognize not only that your use of '`foo()`' isn't intended as a name but rather as a disjunctive reference but also that every other name-like object is actually a name. I strongly suspect that there is no general way to do this. After all, what sorts of contexts could possibly determine with sufficient generality whether or not a given name-like object is actually a name or instead a disjunctive reference. The most obvious approach seems to be to let the compiler try to determine the referent of '`foo()`' and, if there is no referent, then to see whether or not there exist imported functions with the same name. If such imported functions exist, then perhaps the compiler decides that '`foo()`' is a disjunctive reference and tries to find an unambiguously matching method. But do we really want the compiler to assume, just because you used a "name" that has no referent but matches the name of one or more imported functions, that you really intend to use that "name" as a disjunctive reference? This is not even to mention the performance costs associated with trying to look up a referent-less "name," deciding that the "name" is actually intended as a disjunctive reference, and then trying to find a matching method and determine a "true" reference.

Now, if you're not going to try to implement '`foo()`' as a disjunctive reference, then it must be a name. But to which object does the name refer? The obvious choice would be some "merged" function that aggregates all the unambiguously distinct methods of `A.foo()` and `B.foo()`. But where does this object live? Do we make a new object, i.e. a function also named `foo()` whose methods are copies of all the unambiguously distinct methods of `A.foo()` and `B.foo()` and which belongs to the scope in which `A.foo()` and `B.foo()` were imported? But then if one imports `A.foo()` and `B.foo()` in a global scope one thereby creates a global object as the referent of '`foo()`'. I suppose this isn't so bad for functions, but it seems perverse that just importing a name from a module should create a global object.

Okay, so maybe we shouldn't create a new object. The alternative is to have our compiler decide upon import to let '`foo()`' refer once and for all either to `A.foo()` or `B.foo()`, and whichever one it does refer to will "absorb" all unambiguously distinct methods from the other. The first problem here is that it is entirely arbitrary which of the two functions should be the "true" referent, and which should hand over its methods. That this decision would be entirely arbitrary suggests (to me, at least) that it is not the best one to make. And regardless of which function is chosen to be the "true" referent, we still end up with the similarly perverse byproduct that just importing a name from module `A` results in a change to an object from module `B`. This seems to defeat the purpose of a module.

Again, this is just what I understand Stefan and Jeff to be arguing. I hope that either/both will point out any misinterpretations or errors on my part.

Jeff Bezanson

Apr 26, 2015, 1:41:24 PM4/26/15
to julia...@googlegroups.com
I keep getting accused of insisting that every name have only one
meaning. Not at all. When you extend a function there are no
restrictions. The `connect` methods for GlobalDB and SQLDB could
absolutely belong to the same generic function. From there, it's
*nice* if they implement compatible interfaces, but nobody will force
them to.

Scott, I think you're overstating the damage done by a name collision
error. You can't expect to change package versions underneath your
code and have everything keep working. A clear, fairly early error is
one of the *better* outcomes in that case.

In your design, there are *also* situations where an update to a
package causes an error or warning in client code. I'll grant you that
those situations might be rarer, but they're also subtler. The user
might see

Warning: modules A and B conflict over method foo( #= some huge signature =# )

What are you supposed to do about that?

It's worth pointing out that merging functions is currently very
possible; we just don't do it automatically. You can do it manually:

using GlobalDB
using SQLDB

connect(c::GlobalDBMgr, args...) = GlobalDB.connect(c, args...)
connect(c::SQLDBMgr, args...) = SQLDB.connect(c, args...)

This will perform well since such small definitions will usually be inlined.
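A minimal, self-contained sketch of the manual merge described above. It uses current Julia syntax (`struct` rather than the `type` keyword seen elsewhere in this thread), and the two modules are hypothetical stand-ins whose `connect` methods just return strings; they are not the real packages' APIs.

```julia
# Hypothetical stand-ins for the two database packages discussed above.
module GlobalDB
    export GlobalDBMgr, connect
    struct GlobalDBMgr end
    connect(c::GlobalDBMgr, args...) = "GlobalDB connection"
end

module SQLDB
    export SQLDBMgr, connect
    struct SQLDBMgr end
    connect(c::SQLDBMgr, args...) = "SQLDB connection"
end

using .GlobalDB, .SQLDB

# Both modules export `connect`, so the bare name cannot be used as-is.
# Defining our own `connect` before any unqualified use creates a new
# function in this module that forwards based on the manager type.
connect(c::GlobalDBMgr, args...) = GlobalDB.connect(c, args...)
connect(c::SQLDBMgr, args...) = SQLDB.connect(c, args...)

println(connect(GlobalDBMgr()))
println(connect(SQLDBMgr()))
```

The forwarding methods have to be defined before `connect` is used unqualified in the importing module; otherwise the name conflict is reported first.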

If people want to experiment, I'd encourage somebody to implement a
function merger using reflection. You could write

const connect = merge(GlobalDB.connect, SQLDB.connect,
conflicts_favor=SQLDB.connect)
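As a rough illustration only: the sketch below is not the reflection-based merger suggested above. It is a simpler call-time forwarder built on `applicable`, and the name `merge_functions` and the stand-in functions are hypothetical. Listing one function first plays the role of `conflicts_favor`.

```julia
# Hypothetical helper, not part of Base: returns one callable that forwards
# to the first function in `fs` that has a method for the given arguments.
# Earlier functions win when more than one is applicable.
function merge_functions(fs::Function...)
    function merged(args...)
        for f in fs
            if applicable(f, args...)
                return f(args...)
            end
        end
        throw(MethodError(merged, args))
    end
    return merged
end

# Stand-ins playing the roles of GlobalDB.connect and SQLDB.connect.
connect_global(c::Symbol) = "GlobalDB connection $c"
connect_global(c::Int) = "GlobalDB connection number $c"
connect_sql(c::String) = "SQLDB connection $c"
connect_sql(c::Int) = "SQLDB connection number $c"

# connect_sql is listed first, so it is favored where both apply (the Int case).
const merged_connect = merge_functions(connect_sql, connect_global)

println(merged_connect(:main))
println(merged_connect("orders"))
println(merged_connect(1))
```

Because the check happens on every call, this trades performance for simplicity; a merger that copies method definitions would avoid that cost but needs real reflection.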

Michael Francis

Apr 26, 2015, 1:46:58 PM
to julia...@googlegroups.com
Quick question, is there a safe way to get the list of exported methods from a module? I assume names(module) will retrieve all methods and types ?

Jameson Nash

Apr 26, 2015, 2:02:18 PM
to julia...@googlegroups.com
> Quick question, is there a safe way to get the list of exported methods from a module? I assume names(module) will retrieve all methods and types?

# `module` is a reserved word, so the module is bound to `m` here
filter!(names(m)) do name
  isa(getfield(m, name), Function)
end

MA Laforge

Apr 26, 2015, 7:52:06 PM
to julia...@googlegroups.com
Hi,

I cannot speak about limitations during the compile process, but at a high level, I don't see why (with a few small tweaks) Julia's multiple dispatch system cannot deal with most of the issues discussed here.

From what I see, there appear to be 3 issues that cause most of the headaches:

1) A module seems to "own" a function.
2) "using" seems to pull names/symbols into the global namespace/scope (Main?) instead of the local one (whichever module the code resides).
3) Module developers "export" functions that cannot be resolved by multiple dispatch.

I believe items 2 & 3 are well covered in "Using ModX: Can scope avoid collisions & improve readability?":
https://groups.google.com/d/msg/julia-users/O8IpPdYBkLw/XzpFj9qiGzUJ

As for item 1, the next section should provide clarification:

*****A module seems to "own" a function*****
Well, my biggest problem is that if you want to reuse the word "open" for your particular application, you need to re-implement Base.open(...). This requirement seems a bit awkward.

Should you not just be able to implement your own "MyModule.open" as long as it can uniquely be resolved through multiple dispatch?

In that case applying "using Base & using MyModule" should be able to merge the two "open" functions without ambiguity... as long as Base.open was defined with an interface that is unique:

#Sample Base module:
module Base
abstract Stream
abstract FileType <: Stream
type BinaryFile <: FileType ...; end

function open(::Type{BinaryFile}, ...)
...
end

export Stream, FileType, BinaryFile
export open
end #Base

#Implement my own socket communication:
module MySocketMod
type MySocket <: Base.Stream ...; end

function open(::Type{MySocket}, ...)
...
end

export MySocket
export open
end #MySocketMod

#Try out the code:
using Base
using MySocketMod #No problem... "open" ambiguities are covered by multi-dispatch.

#Great! No ambiguities:
myfile = open(BinaryFile, ...)
mysocket = open(MySocket, ...)

Sure, the extra argument in "open" *seems* a bit verbose compared to other languages, but it is a bit easier to read... and the argument is *not* really redundant.
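For comparison, here is a minimal sketch of how dispatching on a `Type{...}` first argument works in Julia as it stands, where both methods must belong to one generic function. The common module `Streams`, the function name `open_stream`, and the struct fields are all hypothetical, and current syntax (`abstract type`, `struct`) is used instead of the `abstract`/`type` keywords above.

```julia
module Streams                      # hypothetical common module
    export Stream, open_stream
    abstract type Stream end
    function open_stream end        # generic function declared with no methods
end

module FileMod
    using ..Streams
    export BinaryFile
    struct BinaryFile <: Stream
        path::String
    end
    # Extend the shared function, dispatching on the type itself.
    Streams.open_stream(::Type{BinaryFile}, path::AbstractString) =
        BinaryFile(String(path))
end

module MySocketMod
    using ..Streams
    export MySocket
    struct MySocket <: Stream
        host::String
        port::Int
    end
    Streams.open_stream(::Type{MySocket}, host::AbstractString, port::Integer) =
        MySocket(String(host), Int(port))
end

using .Streams, .FileMod, .MySocketMod

myfile = open_stream(BinaryFile, "data.bin")
mysocket = open_stream(MySocket, "localhost", 8080)
println(myfile isa BinaryFile, " ", mysocket isa MySocket)
```

The difference from the proposal above is that `FileMod` and `MySocketMod` both have to know about `Streams`; they cannot each define an independent `open_stream` and have the two merged on `using`.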

Indeed, I can see that this solution would require extra compiler complexity - depending on which items are being addressed:

A) Merging function specializations from different modules might have to be done at run time - not ideal.
B) "using"/importing module functions in a local scope makes the symbol tables much larger - not great either.

Having said that, I don't see actual ambiguities here, only additional complexity in the implementation.

Scott Jones

Apr 27, 2015, 8:05:55 AM
to julia...@googlegroups.com


On Sunday, April 26, 2015 at 12:10:41 PM UTC-4, David Gold wrote:
> You, Jeff and Stefan seem to be concerned with different kinds of "ambiguity." Suppose I import `foo(T1, T2)` from module `A` and `foo(T2, T1)` from module `B`. I take you to claim that if I call `foo(x, y)` then, as long as there is no ambiguity which method is appropriate, the compiler should just choose which of `A.foo()` and `B.foo()` has the proper signature. I take you to be concerned with potential ambiguity about which method should apply to a given argument.

No, that is not at all what I am saying.

I am saying that if I have foo(A.type, T1, T2), in module A, and foo(B.type, T2, T1), I should be able to call foo(myA, myT1, myT2), and foo(myB, myT2, myT1) without any
problems.

Everybody here seems to think I am talking about a quite different case than what I really am... which is the limited case where all of the functions that I export from my module
have at least one parameter that is a type local to the module, so that the compiler can know right away that it is unambiguous... (since the rule is that you dispatch to the most specific type).

One should be able to write modules in isolation, and as long as one conforms to that simple rule of always having a local type used in all exported functions, that module
should be able to be used with `using` anywhere, without warnings, without worrying that Base or one of the packages that the module uses have later on added a conflicting name...

Scott

Scott Jones

Apr 27, 2015, 8:19:35 AM
to julia...@googlegroups.com


On Sunday, April 26, 2015 at 1:41:24 PM UTC-4, Jeff Bezanson wrote:
> I keep getting accused of insisting that every name have only one
> meaning. Not at all. When you extend a function there are no
> restrictions. The `connect` methods for GlobalDB and SQLDB could
> absolutely belong to the same generic function. From there, it's
> *nice* if they implement compatible interfaces, but nobody will force
> them to.
>
> Scott, I think you're overstating the damage done by a name collision
> error. You can't expect to change package versions underneath your
> code and have everything keep working. A clear, fairly early error is
> one of the *better* outcomes in that case.

I'm talking about cases where somebody has purposefully designed a package as a drop-in replacement for another (like TokuMX for MongoDB),
so yes, I would expect to change things under my code and expect things to keep working.

> In your design, there are *also* situations where an update to a
> package causes an error or warning in client code. I'll grant you that
> those situations might be rarer, but they're also subtler. The user
> might see
>
> Warning: modules A and B conflict over method foo( #= some huge signature =# )

Again, I am only talking about methods where the signature includes a type from the module it is defined in...
the compiler should be able to detect that when compiling the module, and provide that information so that NO
extra checking needs to be done when `using` the module... wouldn't that speed things up?

> What are you supposed to do about that?
>
> It's worth pointing out that merging functions is currently very
> possible; we just don't do it automatically. You can do it manually:
>
> using GlobalDB
> using SQLDB
>
> connect(c::GlobalDBMgr, args...) = GlobalDB.connect(c, args...)
> connect(c::SQLDBMgr, args...) = SQLDB.connect(c, args...)

Why should I have to write a bunch of extra code, just to handle something the compiler should have been smart enough to figure out?
(that it is impossible for the methods to conflict)...

> This will perform well since such small definitions will usually be inlined.

That does not perform well from the viewpoint of the cost of the programmer's time...

> If people want to experiment, I'd encourage somebody to implement a
> function merger using reflection. You could write
>
> const connect = merge(GlobalDB.connect, SQLDB.connect,
> conflicts_favor=SQLDB.connect)

Scott 

MA Laforge

Apr 27, 2015, 8:36:29 AM
to julia...@googlegroups.com
I am not sure I understand Scott's last comments, but I first want to correct something from my last entry.


Sorry about the last entry:


2) "using" seems to pull names/symbols into the global namespace/scope (Main?) instead of the local one (whichever module the code resides).
3) Module developers "export" functions that cannot be resolved by multiple dispatch.

Statement 2 is wrong (I have not thought of this in a while).  I would like to re-state it:

2) "using" seems to apply at the module-level.  I think dealing with name collision would be much easier if "using" applied to arbitrary scopes.

Also, I should point out that I only know how to deal with item 3 through good programming practices, again, as outlined in:
https://groups.google.com/d/msg/julia-users/O8IpPdYBkLw/XzpFj9qiGzUJ

Again, my apologies.

David Gold

Apr 27, 2015, 1:44:37 PM
to julia...@googlegroups.com
> I am saying that if I have foo(A.type, T1, T2), in module A, and foo(B.type, T2, T1), I should be able to call foo(myA, myT1, myT2), and foo(myB, myT2, myT1) without any
> problems.

@Scott: Ah, I see. I apologize for this oversight. I can understand how working under this assumption (that the arguments have types specific to modules) would simplify the search for the relevant method. But I don't see how it comes to bear on the problem of 'foo()''s referent.

> I keep getting accused of insisting that every name have only one
> meaning. Not at all. When you extend a function there are no
> restrictions. The `connect` methods for GlobalDB and SQLDB could
> absolutely belong to the same generic function. From there, it's
> *nice* if they implement compatible interfaces, but nobody will force
> them to.

@Jeff: I didn't mean to impute to you that position, though I apologize if my wording was unclear.
 

ele...@gmail.com

Apr 27, 2015, 6:40:50 PM
to julia...@googlegroups.com


On Sunday, April 26, 2015 at 8:24:15 PM UTC+10, Scott Jones wrote:
> Yes, precisely... and I *do* want Julia to protect the user *in that case*.
> If a module has functions that are potentially ambiguous, then 1) if the module writer intends to extend something, they should do it *explicitly*, exactly as now, and 2) Julia *should* warn when you have "using" package, not just at run-time, IMO.
> I have *only* been talking about the case where you have functions that the compiler can tell in advance, just by looking locally at your module, by a very simple rule, that they cannot be ambiguous.

The issue is that, in the example I gave, the compiler can't tell, just by looking at your module, if that case exists.  It has to look at everything else imported and defined in the user's program, and IIUC with macros, staged functions and lots of other ways of defining functions, that can become an expensive computation, and may need to be delayed to runtime.

Cheers
Lex
 

Scott


Scott Jones

Apr 28, 2015, 9:38:34 AM
to julia...@googlegroups.com
To me, that case is not as interesting... if you, the writer of the module, want to use some type that is not defined in your module, then the burden should be on you, to explicitly import from the module you wish to extend... (Base, or whatever package/module the types you are using for that function are defined)...

I'm only concerned about having a way that somebody can write a module, and guarantee (by always using a specific type from the module) that the names it wants to export cannot be ambiguous with other methods with the same name.


David Gold

Apr 28, 2015, 10:39:22 AM
to julia...@googlegroups.com
Re: implementing such a merge function.

My first instinct would be to create a list of methods from each function, find the intersection, then return a function with methods determined by the methods from each input function, with methods in the intersection going to the value of "conflicts_favor". My question is, would it be okay to create a function f within the scope of merge, add methods to f by iterating through a list, and then return f? In particular, if one assigns the value of a global constant 'connect' to the returned function of merge, is a copy of the returned function created and then bound to 'connect', or would 'connect' be bound to something else that would cause trouble down the line?

Thank you!
D


On Sunday, April 26, 2015 at 1:41:24 PM UTC-4, Jeff Bezanson wrote:

ele...@gmail.com

Apr 28, 2015, 6:20:48 PM
to julia...@googlegroups.com


On Tuesday, April 28, 2015 at 11:38:34 PM UTC+10, Scott Jones wrote:
> On Monday, April 27, 2015 at 6:40:50 PM UTC-4, ele...@gmail.com wrote:
>> On Sunday, April 26, 2015 at 8:24:15 PM UTC+10, Scott Jones wrote:
>>> Yes, precisely... and I *do* want Julia to protect the user *in that case*.
>>> If a module has functions that are potentially ambiguous, then 1) if the module writer intends to extend something, they should do it *explicitly*, exactly as now, and 2) Julia *should* warn when you have "using" package, not just at run-time, IMO.
>>> I have *only* been talking about the case where you have functions that the compiler can tell in advance, just by looking locally at your module, by a very simple rule, that they cannot be ambiguous.
>>
>> The issue is that, in the example I gave, the compiler can't tell, just by looking at your module, if that case exists.  It has to look at everything else imported and defined in the user's program, and IIUC with macros, staged functions and lots of other ways of defining functions, that can become an expensive computation, and may need to be delayed to runtime.
>>
>> Cheers
>> Lex
>
> To me, that case is not as interesting... if you, the writer of the module, want to use some type that is not defined in your module, then the burden should be on you, to explicitly import from the module you wish to extend... (Base, or whatever package/module the types you are using for that function are defined)...

I was only talking about the *user* of your module, not you the writer.  The user is the one that lives in the big world where other packages use the same name, not the package writer in their insulated little module :)

> I'm only concerned about having a way that somebody can write a module, and guarantee (by always using a specific type from the module) that the names it wants to export cannot be ambiguous with other methods with the same name.

If your methods were joined with any external methods of the same name whenever the user used your module with another using the same name, your methods would be dispatched to, since they use a concrete type, as you want.  But that would "steal" dispatches from methods with more general signatures defined in the other module.  That may break the other module, so the compiler doesn't do it by default to protect the user.  But it's what you want if you are extending a function, and you can say so explicitly as you mentioned above.
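A small sketch of the "stealing" effect, in current Julia syntax. The modules `Shapes` and `Widgets` and their functions are hypothetical; here the extension is done explicitly, which is the case Julia already allows, but an automatic merge would have the same effect on `Shapes`.

```julia
module Shapes                          # hypothetical "other module"
    describe(x) = "generic thing"      # general fallback, accepts any value
    # Internal code that relies on `describe`:
    report(xs) = [describe(x) for x in xs]
end

module Widgets                         # hypothetical module written later
    import ..Shapes
    struct Widget end
    # A more specific method joins Shapes.describe and takes over
    # dispatch for Widget arguments, including calls made inside Shapes.
    Shapes.describe(::Widget) = "a Widget"
end

println(Shapes.report([1, Widgets.Widget()]))
```

`Shapes.report` never mentions `Widget`, yet its result for a `Widget` element now comes from the method defined in `Widgets` rather than from the fallback. That is harmless when the author of `Widgets` meant to extend `describe`, and surprising when the two names only coincide.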
 

MA Laforge

Apr 28, 2015, 11:50:54 PM
to julia...@googlegroups.com
I can see that this issue is convoluted.  There appear to be competing requirements, and getting things to start humming is nontrivial.

Instead of dealing with "what if-s"... I want to start with more concrete "what does"...

Transgressions.sin
First, I don't fully understand Jeff's talk about "Transgressions.sin".  I disagree that "you can't get both behaviors" with map(sin, [1.0, "sloth", 2pi, "gluttony"]).

I tried the following code in Julia, and everything works fine:
module Transgressions
    Base.sin(x::String) = "Sin in progress: $x"
end

using Transgressions #Doesn't really do anything in this example...

map(sin, [1.0, "sloth", 2pi, "gluttony"])

This tells me that when one uses map on an Array{Any}, Julia dynamically checks the object type, and applies multi-dispatch to execute the expected code.

I admit that one could argue this is not how "object oriented design" usually deals with this... but that's duck typing for you!

Ok... so what is the *real* problem (as I see it)?  Well, the problem is that Julia essentially decides that Base "owns" sin... simply because it was defined first.

The workaround here was to "extend" Base.sin from module "Transgressions".  This works reasonably well when one *knows* that Base defines the sin "family of methods"... but not very well when one wants to appropriate a new verb (enable, trigger, paint, draw, ...).

Why should any one module "own" such a verb (family of methods)?  This makes little sense to me.

As for Michael Turok's idea of the "SuperSecretBase"
As some people have pointed out, SuperSecretBase is a relatively elegant way to define a common interface for multiple implementations (Ex: A/BConnectionManager).  However, this is not really appropriate in the case when modules want to use the same verb for two completely different domains (ex: draw(x::Canvas, ...) vs draw(x::SixShooter, ...)).

And, as others have also pointed out: the SuperSecretBase solution is not even that great for modules that *do* want to implement a common interface.  If company A needs to convince standards committee X to settle on an interface of accepted verbs... that will surely impede on product deployment.  And even then... Why should standards committee X "own" that verb in the first place???  Why not standards committee Y?

Regarding the comment about not using "using"
Well, that just seems silly to me... by not using "using"... you completely under-utilize the multi-dispatch engine & its ability to author crisp, succinct code.

==>And I would like to point out: The reason that multi-dispatch works so well at the moment is because (almost) everything in Julia is "owned" by Base... so there are no problems extending methods >>>>In base Julia<<<<

Some improvements on Transgressions.sin
FYI: I don't really like my previous example of Transgressions.sin.  The reason: The implementation does not make sufficient use of what I would call "hard types" (user-defined types).  Instead, it uses "soft types" (int/char/float/string).

Hard types are very explicit, and they take advantage of multiple dispatch.  On the other hand, a method that takes *only* soft types is more likely to collide with others & fail to be resolved by multiple dispatch.

I feel the following example is a *much* better implementation to resolve the sin paradox:
module Religion
    #Name "Transgressions" has a high likelihood of name collisions - don't "export":
    type Transgressions; name::String; end

    #Personally, I find this "Transgressions" example shows that base should *not* "own" sin.
    #Multi-dispatch *should* be able to deal with resolving ambiguities...
    #In any case, this is my workaround for the moment:
    Base.sin(x::Transgressions) = "Sin in progress: $x"

    #Let's hope no other module wants to "own" method "absolve"...
    absolve(x::Transgressions) = "Sin absolved: $x"

    export absolve #Logically should have sin here too... but does not work with Julia model.
end

using Religion
Xgress = Religion.Transgressions #Shorthand... "export"-ing Transgressions susceptible to collisions.

map(sin, [1.0, Xgress("sloth"), 2pi, Xgress("gluttony")])

Initially, creating a type "Transgressions" seems to be overdoing things a bit.  However, I have not noticed a performance hit.  I also find it has very little impact on readability.  In fact I find it *helps* with readability in most cases.

Best of all: Despite requiring a little more infrastructure in the module definition itself, there is negligible overhead in the code that *uses* module Religion.

ele...@gmail.com

Apr 29, 2015, 12:25:17 AM
to julia...@googlegroups.com
Nice summary, it shows that, in the case where the module developer knows about an existing module and intends to extend its functions, Julia "just works".

But it misses the actual problem case, where two modules are developed in isolation and each exports an original sin (sorry couldn't resist).  In this case the user of both modules has to distinguish which sin they are committing.

Although in some situations it might be possible for Julia to determine that there is no overlap between the methods simply, it is my understanding that in general it could be an expensive whole program computation.  So at the moment Julia just objects to overlapping names where they are both original sin.

Cheers
Lex

MA Laforge

Apr 29, 2015, 9:27:47 AM
to julia...@googlegroups.com
Hi Lex,

I think we agree here.  I also got the same impression as you regarding your statement "it might be possible for Julia to determine that there is no overlap between the methods simply, it is my understanding that in general it could be an expensive whole program computation".

For me, I think this impression is derived from one of Jeff's posts:

"
Comparing method signatures is computationally difficult (exponential worst case!), while looking for a symbol in a list is trivial. Warnings for name conflicts may be annoying, but at least it's dead obvious what's happening. If a subtle adjustment to a signature affects visibility elsewhere, I'd think that would be much harder to track down.
"

...But I am not absolutely certain I am following some of this discussion correctly....

Michael Francis

Apr 29, 2015, 1:01:01 PM
to julia...@googlegroups.com
Lex, MA Laforge,

It sounds like many of us agree. I'm not convinced by the "it's hard" argument though:

I made the point at the outset that it isn't hard (or expensive) if the exported functions from a module must reference types defined in that module. Hence the suggestion that module developers should only be able to export functions which reference owned/hard/contained/user types. This was shot down as being too restrictive, personally for most modules I am likely to write / define this rule would be sufficient. 

Where a module exports functions which only reference base types, I'd argue that the user of the module should import those functions explicitly, not implicitly. For example, if I wanted to define some new functions which work on Arrays/Vectors, I can certainly do so, but they should either be qualified in usage

Transgressions.sort( [ 1, 2, 7, 3 ] )

or explicitly imported 

import Transgressions: sort 
^ the above should fail as it conflicts with sort in Base

I fail to see why the burden of name spacing should be borne by every user. I also fail to see why these simple rules are not Julian or are too restrictive, quite the opposite in fact.

In the database example given previously there is significant negative value in having namespace qualification - it implies that if I wanted to implement a database driver to the same interface I either have to extend the methods in the other persons module ( seems like the wrong thing to do ) or define a parallel impl which can not be used at the same time in the same module without qualification of both, even though there is no ambiguity since all the methods reference a type that I have defined in the module. This just seems wrong.

To me the import/using/module rules that exist currently break the promise of multiple dispatch and are at best confusing and at worst make programming in the large fragile and prone to breaks with every new export / method addition.

Stefan Karpinski

Apr 29, 2015, 1:55:14 PM
to Julia Users
On Wed, Apr 29, 2015 at 1:01 PM, Michael Francis <mdcfr...@gmail.com> wrote:
> I made the point at the outset that it isn't hard (or expensive) if the exported functions from a module must reference types defined in that module. Hence the suggestion that module developers should only be able to export functions which reference owned/hard/contained/user types.

Unless I'm misunderstanding, this is a very limiting restriction. It would mean, for example, that you can't define and export a generic square(::Number) function. That's a silly example, but it's completely standard for packages to export new functions that operate on pre-existing types that don't dispatch on any type that "belongs" to the exporting module.

Another way of looking at this is that such a restriction would prevent solving half of the expression problem. In object-oriented languages, extending existing operations to new types is easily done via subtyping, but adding new operations to existing types is awkward or impossible. In functional languages, adding new operations to existing types is easy, but extending existing operations to new types is awkward or impossible. Multiple dispatch lets you do both easily and intuitively – so much so that people can easily forget why the expression problem was a problem in the first place. Preventing the export of new functions operating only on existing types would hobble the language, making it no more expressive than traditional object-oriented languages.
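To make the two directions concrete, here is a minimal sketch in current Julia syntax. The modules `Squares` and `Temperatures` and the `Celsius` type are hypothetical illustrations: the first exports a new function whose signature mentions no type of its own, and the second extends existing `Base` operations to a new type.

```julia
# New operation on existing types: `square` dispatches only on Number,
# which does not "belong" to the exporting module.
module Squares
    export square
    square(x::Number) = x * x
end

# Existing operations extended to a new type.
module Temperatures
    export Celsius
    struct Celsius
        degrees::Float64
    end
    Base.:+(a::Celsius, b::Celsius) = Celsius(a.degrees + b.degrees)
    Base.show(io::IO, c::Celsius) = print(io, c.degrees, " C")
end

using .Squares, .Temperatures

println(square(3))
println(square(1.5))
println(Celsius(20.0) + Celsius(1.5))
```

Under a rule that only functions referencing module-owned types may be exported, `Squares` could not export `square` at all, while `Temperatures` would be unaffected.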

Michael Francis

Apr 29, 2015, 2:06:45 PM
to julia...@googlegroups.com
I would expect the user to explicitly import those methods; I did not preclude their existence. And it would be quite reasonable to support the existing importall syntax, hence

using MyModule
^ imports only those functions which explicitly reference user types defined in the module 
importall MyModule.Extensions
^imports the additional functionality on base types

if I subsequently import another function which conflicts then we throw an error. This would mean that the vast majority of non conflicting functions can be trivially exported and used without a namespace qualifier and extensions to base types would also work, but with the name collision check in place. 

I don't believe this violates the expression problem ? 

Scott Jones

Apr 29, 2015, 2:14:50 PM
to julia...@googlegroups.com
This is similar to what I want to see, but I didn’t have the restriction that developers should only be able to export functions that reference owned/hard/contained/user types, but rather,
that they don’t require an explicit “import” to be able to extend that name (which basically makes it impossible to develop code in isolation…)

I thought the compiler could tell that fact rather easily (i.e., if the function being exported referenced at least one “owned/contained/user type”), but if that’s not so, it could still be
solved by having the developer explicitly indicate that that is the case…

Maybe instead of export, use a different keyword, or an option on export, to indicate that.

If that keyword or option is used, then the system should go ahead and do an automatic merge if somebody does “using”.


Preventing the export of new functions because they accidentally happen to have a name conflict (not because of an actual signature conflict) is also a very significant hobble to the language…

I think the language needs better tools to be able to develop packages independently, while still allowing all the goodness of extending base, or other packages, as well as ways
of keeping the interface and implementation separate… (this goes back to what I learned from CLU… how important it was to be able to keep those separate, so that people can’t abuse your internal data structures…)   Julia is *way* too lax in that… I can look at the string types internal information, for example, and that ends up tempting people to bypass the abstraction, which
then means major breakage when you want to change things to improve something...

Scott


Scott Jones

Apr 29, 2015, 2:16:27 PM
to julia...@googlegroups.com
Yes! Precisely what I’ve been advocating all along!

Scott

Stefan Karpinski

Apr 29, 2015, 2:49:34 PM
to Julia Users
This scheme seems overly focused on object-oriented programming styles at the cost of making other programming styles much more inconvenient. In an o-o language that might be fine, but non-o-o styles are quite common in Julia. The `connect` example keeps coming up because it is one of those cases where o-o works well. That is not the norm in numerical packages, however. This proposal would, for example, make using Distributions a nightmare – you'd have to explicitly import almost everything that it exports to use. That includes type constructors for distributions etc., since those are themselves (basically) generic functions and their arguments are just built-ins. Instead of doing this:

using Distributions
X = Normal(0.0, 1.0)
p = pdf(X, 0.1)

you'd have to do this:

using Distributions
X = Distributions.Normal(0.0, 1.0)
p = Distributions.pdf(X, 0.1)

Or you'd have to explicitly import every Distributions type and generic stats function that you want to use. Instead of being able to write `using Distributions` and suddenly having all of the stats stuff you might want available easily, you'd have to keep qualifying everything or explicitly importing it. Both suck for interactive usage – and frankly even for non-interactive usage, qualifying or explicitly importing nearly every name you use is a massive and unnecessary hassle.

Michael Francis

Apr 29, 2015, 3:06:31 PM
to julia...@googlegroups.com
Imagine you want to do some math and you want to output a PDF using the Pdf package, which exports pdf( ... ).
Now you have to qualify every call to pdf in Distributions, so you still end up with your worst case.

I am proposing that we make these conflicts explicit. You would write

using Distributions
importall Distributions.Extensions 

Then your code would continue to work, it would not conflict with the pdf package as that is defined against its own types. If I then included MyDistributions which defined a pdf type against base classes then we have the conflict again, but this one is to be expected. 

I really don't think this is a discussion about OO vs non OO.

Stefan Karpinski

Apr 29, 2015, 3:28:20 PM
to Julia Users
On Wed, Apr 29, 2015 at 3:06 PM, Michael Francis <mdcfr...@gmail.com> wrote:
> Imagine you want to do some math and you want to output a pdf from the Pdf, it exports pdf( ... )
> now you have to qualify every call to pdf in distributions, so you still end up with your worst case.

using Distributions, Pdf
const pdf = Distributions.pdf
# still have access to Pdf.pdf

Stefan Karpinski

Apr 29, 2015, 3:29:30 PM
to Julia Users
There's a higher level issue here: orthogonality. In the current design, naming and dispatch are orthogonal. Deciding what a name refers to is completely independent of what kind of thing it refers to. The proposed "automatic function merging" inextricably complects naming and dispatch. How does this extend to exported names that aren't functions? What happens when one module exports something that's a function while another module exports something by the same name that is an integer or a string? These questions don't come up in the current design since the rules for deciding what a name refers to are completely independent of what kind of object it refers to – it can be a function, integer, strings – everything works the same way. The proposed scheme makes what names you can or can't export / import linked to what kinds of objects those names refer to and various complicated properties of those objects.

Tom Breloff

Apr 29, 2015, 3:36:24 PM
to julia...@googlegroups.com
Stefan:  I agree that typing Distributions.Normal(0,1) in the REPL when you're just doing data exploration is really frustrating.  However, when you're building a package or a larger system, it can be really important to either explicitly qualify your types/functions or to explicitly "import Distributions: Normal".  Frequently I'll be looking at some code within a large package with a call to "Normal", but I don't know what function will be called here.  If I do some type of explicit importing, then I can do a simple find command on my directory to figure out what package it may have been defined in.  If I did a "using", then I have a lot more searching to figure out the correct function.  Not to mention the silent overwriting of functions by "using" different modules can cause unexpected problems.  For that reason, I try to keep "using" to an absolute minimum in most of my code.

Stefan Karpinski

unread,
Apr 29, 2015, 4:41:38 PM4/29/15
to Julia Users
Tom, this is a very legitimate concern. A simple solution is to have a coding standard not to use `using` when writing packages. Google has created coding standards for both C++ and Python, which are now widely used beyond the company.

Automatic function merging goes in the opposite direction: with this feature it becomes impossible to even say which package a function comes from – it's not even a meaningful question anymore. That is the point of Jeff's Base.sin versus Transgression.sin example – map(sin, [1.0, "greed", 2pi, "sloth"]). There is no answer to the question of which of the functions Base.sin and Transgression.sin the `sin` function refers to – it can only refer to some new `sin` that exists only in the current module and calls either Base.sin or Transgression.sin depending on the runtime values of its arguments. Perhaps this can be made clearer with an even nastier example, assuming hypothetical code with function merging:

module Foo
    export f
    immutable F end
    f(::F) = "this is Foo"
end

module Bar
    export f
    immutable B end
    f(::B) = "this is Bar"
end

julia> using Foo, Bar

julia> f(rand(Bool) ? Foo.F() : Bar.B()) # which `f` is this?

Which `f` is intended to be called here? It cannot be statically determined – it's not well-defined since it depends on the value of rand(Bool). Some dynamic languages are Ok with this kind of thing, but in Julia, the *meaning* of code should be decidable statically even if some of the behavior may be dynamic. Compare with this slightly different version of the above code (works on 0.4-dev):

module Sup
    export f
    f(::Void) = nothing # declare generic function without methods, a la #8283
end

module Foo
    export f
    import Sup: f
    immutable F end
    f(::F) = "this is Foo"
end

module Bar
    export f
    import Sup: f
    immutable B end
    f(::B) = "this is Bar"
end

julia> using Foo, Bar

julia> f(rand(Bool) ? Foo.F() : Bar.B())

Why is this ok, while the previous code was problematic? Here you can say which `f` is called: Foo and Bar share `f` so the answer is well-defined – `f` is always `Sup.f`.
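
For readers on newer versions, here is a minimal sketch of the same shared-function pattern. It assumes Julia 1.x syntax, which postdates this thread: `struct` replaces `immutable`, `function f end` declares a generic function with no methods, and modules defined at top level are referenced with a leading dot. The check at the end shows that both exports are the same function object, which is why the name is unambiguous.

```julia
# Sketch in Julia 1.x syntax (an assumption; the thread itself targets 0.3/0.4).
module Sup
    export f
    function f end            # generic function with no methods yet
end

module Foo
    export f
    import ..Sup: f           # extend Sup.f instead of creating a new f
    struct F end
    f(::F) = "this is Foo"
end

module Bar
    export f
    import ..Sup: f
    struct B end
    f(::B) = "this is Bar"
end

using .Foo, .Bar

# Both modules export a binding to the very same function object.
@assert Foo.f === Bar.f === Sup.f

# The method chosen depends on the runtime value, but `f` itself is always Sup.f.
println(f(rand(Bool) ? Foo.F() : Bar.B()))
```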

Stefan Karpinski

unread,
Apr 29, 2015, 5:05:47 PM4/29/15
to Julia Users

On Wed, Apr 29, 2015 at 4:40 PM, Stefan Karpinski <ste...@karpinski.org> wrote:
module Sup
    export f
    f(::Void) = nothing # declare generic function without methods, a la #8283
end

Sorry, I meant to write this:

f(::Union()) = nothing

I.e. a method that can never apply to any call. Just a hack until #8283 is done.

David Gold

unread,
Apr 29, 2015, 5:53:12 PM4/29/15
to julia...@googlegroups.com
@Stefan: Out of curiosity, do you see any inherent problems in having Julia automatically create such an "empty" function f in the module that imports both Foo.f and Bar.f and then merge the (unambiguous) methods into the automatically created f? (Or just automatically merge unambiguous methods to whatever function already has the name f in the importing module.) Or is it just not in the style of Julia (or in the vision of its creators) to do something like that?

Again, I'm just asking out of curiosity, as I'm finding this conversation an interesting vehicle for learning about issues of scope and naming in Julia.

Stefan Karpinski

unread,
Apr 29, 2015, 6:05:39 PM4/29/15
to Julia Users
On Wed, Apr 29, 2015 at 5:53 PM, David Gold <david....@gmail.com> wrote:
@Stefan: Out of curiosity, do you see any inherent problems in having Julia automatically create such an "empty" function f in the module that imports both Foo.f and Bar.f and then merge the (unambiguous) methods into the automatically created f? (Or just automatically merge unambiguous methods to whatever function already has the name f in the importing module.) Or is it just not in the style of Julia (or in the vision of its creators) to do something like that?

Again, I'm just asking out of curiosity, as I'm finding this conversation an interesting vehicle for learning about issues of scope and naming in Julia.

Yes, all my arguments are against doing precisely that.

Scott Jones

unread,
Apr 29, 2015, 6:07:49 PM4/29/15
to julia...@googlegroups.com
On Wednesday, April 29, 2015 at 4:41:38 PM UTC-4, Stefan Karpinski wrote:
Tom, this is a very legitimate concern. A simple solution is to have a coding standard not to use `using` when writing packages. Google has created coding standards for both C++ and Python, which are now widely used beyond the company.

Automatic function merging goes in the opposite direction: with this feature it becomes impossible to even say which package a function comes from – it's not even a meaningful question anymore. That is the point of Jeff's Base.sin versus Transgression.sin example – map(sin, [1.0, "greed", 2pi, "sloth"]). There is no answer to the question of which of the functions Base.sin and Transgression.sin the `sin` function refers to – it can only refer to some new `sin` that exists only in the current module and calls either Base.sin or Transgression.sin depending on the runtime values of its arguments. Perhaps this can be made clearer with an even nastier example, assuming hypothetical code with function merging:

module Foo
    export f
    immutable F end
    f(::F) = "this is Foo"
end

module Bar
    export f
    immutable B end
    f(::B) = "this is Bar"
end

julia> using Foo, Bar

julia> f(rand(Bool) ? Foo.F() : Bar.B()) # which `f` is this?

Which `f` is intended to be called here? It cannot be statically determined – it's not well-defined since it depends on the value of rand(Bool). Some dynamic languages are Ok with this kind of thing, but in Julia, the *meaning* of code should be decidable statically even if some of the behavior may be dynamic. Compare with this slightly different version of the above code (works on 0.4-dev):

I think this is easier to solve than you think.
The compiler should see this call as having a signature of Union(F,B), which *neither* Foo.f nor Bar.f have...
So, it can either, give an error, or be even smarter, and see that it should be equivalent to:
rand(Bool) ? f(Foo.F()) : f(Bar.B()),
which is *totally* determined statically!

What's the real problem here?

Stefan Karpinski

unread,
Apr 29, 2015, 6:19:53 PM4/29/15
to Julia Users
On Wed, Apr 29, 2015 at 6:07 PM, Scott Jones <scott.pa...@gmail.com> wrote:

I think this is easier to solve than you think.
The compiler should see this call as having a signature of Union(F,B), which *neither* Foo.f nor Bar.f have...

What does this mean?
 
So, it can either, give an error,

What error? Why? You want a feature which, when used, is an error? Why have the feature in the first place then?
 
or be even smarter, and see that it should be equivalent to:
rand(Bool) ? f(Foo.F()) : f(Bar.B()),
which is *totally* determined statically!

That trick happens to work in this particular example, but you can easily construct cases where that transformation can't be done. If the argument to f is a function argument, for example. But it still has the exact same problem.
 
What's the real problem here?

Function merging has these problems:
  1. It complects name resolution with dispatch – they are no longer orthogonal.
  2. It makes all bindings from `using` semantically ambiguous – you have no idea what a name means without actually doing a call.

Scott Jones

unread,
Apr 29, 2015, 9:09:03 PM4/29/15
to julia...@googlegroups.com
On Apr 29, 2015, at 6:19 PM, Stefan Karpinski <ste...@karpinski.org> wrote:

On Wed, Apr 29, 2015 at 6:07 PM, Scott Jones <scott.pa...@gmail.com> wrote:

I think this is easier to solve than you think.
The compiler should see this call as having a signature of Union(F,B), which *neither* Foo.f nor Bar.f have...

What does this mean?


Maybe I’m misunderstanding the way the compiler is working… I thought it figured out the signature of the call, first with the most specific types
for each argument, and then tried to find the method that matched that signature…

So, it can either, give an error,

What error? Why? You want a feature which, when used, is an error? Why have the feature in the first place then?

If the compiler can’t figure out which one it is, then it *could* give an error… (and at compile time, not run-time)…

Why should the case that really matters be prohibited, i.e.:

f(Foo.F())
f(Bar.B())

or, more realistically:

myRec = fetch(mySQLtable,rowId)
myGlo = fetch(myMNode,subscript)

The two things are totally independent… they are not generic functions, these are two totally separate APIs.
They are unambiguous, and the intent is clear to the programmer.

Your restrictions are making it very hard to develop easy to use APIs that make sense for the people using them…

That’s why so many people have been bringing this issue up…

or be even smarter, and see that it should be equivalent to:
rand(Bool) ? f(Foo.F()) : f(Bar.B()),
which is *totally* determined statically!

That trick happens to work in this particular example, but you can easily construct cases where that transformation can't be done. If the argument to f is a function argument, for example. But it still has the exact same problem.

You can *construct* cases… however, in the interest of general programming utility, why can’t there be some way to let programmers have some isolation, while still being able to make best use of multiple dispatch?

Having to decorate everything with the module name does not seem to fit in with using multiple dispatch…

Scott

MA Laforge

unread,
Apr 29, 2015, 11:36:15 PM4/29/15
to julia...@googlegroups.com
Scott and Michael:
I am pretty certain I understand what you are saying, but I find your examples/descriptions a bit confusing.  I think (hope) I know why Stefan is confused.

Stefan:
I think Scott has a valid point but I disagree that "exported functions from a module must reference types defined in that module".  It is my strong belief that Scott is merely focusing on the symptom instead of the cause.

Fortunately, I am certain that this is not the crux of the problem...  And I agree completely with Stefan: Limiting exports to this particular case is extremely restrictive (and unnecessary).  I also agree that, this restriction *would* keep developers from developing very useful monkey patches (among other things).  So let's look at the problem differently...

Problem 1: A module "owns" its verbs.
See the discussion above (https://groups.google.com/d/msg/julia-users/sk8Gxq7ws3w/ASFlqZmVwYsJ) if you are not familiar with the idea.

Problem 2: Julia has 2 symbol types (for objects/methods/...)
Well, this is not really a problem... It is terrific!  Julia's multi-dispatch engine allows us to overload methods in a way I have never seen before!

In fact, the multi-dispatch system allows programmers to DO AWAY WITH namespaces altogether for the new type of symbol (at least to a first order).

So what are the symbol types?
1) Conventional Type: Used in most other imperative languages
2) Multi-Dispatch Type: Symbols of methods whose signature can be uniquely identified.

By this definition, if a symbol is not associated with a unique signature, it is simply a conventional symbol.

And, as defined, conventional symbols run a high potential for "signature collisions"... because the signature is insufficient to uniquely identify whether we are referring to Foo.CommonSymbol(???) or Bar.CommonSymbol(???).

So why do we need namespaces?
Namespaces were created to let programmers use succinct symbol names without the problem of running into never-ending name collisions.  Instead of dealing with symbol bloat (Foo.FoosSpecialSymbolThatCannotCollideWithAnybody) - we use scopes to give symbols a nice hierarchy (namespaces are basically named scopes).  When writing code in the native scope, all symbols are nice and short... and you can even *import* the symbol names to your own scope to interact with the module.  Now the user gets to use short names - not just the module developer!

So what about the multi-dispatch-ed symbols (type 2)?
Technically, they could *all* be located at the global scope.  You don't really need to say Foo.run... because the call signature is unique (by definition) - so there is no ambiguity.

Ok, then where is the problem for module developers?
Simply put: Julia is trying to cram those beautiful multi-dispatchable methods into a construct (namespaces) that *is not technically needed* for type 2 symbols.  As a consequence, the current Julia implementation actually makes it *difficult* for module developers to use multi-dispatch.  Of course, everything is just peachy when you work from within Base :).

Unfortunately, I believe this has caused a little more collateral damage: Since it is easier to work within Base, it has become this sort of "god module".  I believe Jeff has mentioned developers are getting reluctant to make it grow any further.




The Distributions module: Can it remain elegant?
The short answer is yes :)!

Stefan has a valid concern: you either have to keep using full symbol paths (Distributions.Normal), or
"[...]
you'd have to explicitly import every Distributions type and generic stats function that you want to use. Instead of being able to write `using Distributions` and suddenly having all of the stats stuff you might want available easily, you'd have to keep qualifying everything or explicitly importing it. Both suck for interactive usage - and frankly even for non-interactive usage, qualifying or explicitly importing nearly every name you use is a massive and unnecessary hassle.
"

I agree.  We don't want this... what a pain!  Let's not go there.

So where do we start?
In this case, "Normal" is a relatively common name, and I cannot say its argument list is sufficiently unique to qualify it as a type 2 method.  The signature only involves two "::Number" arguments.

Good rule of thumb: If an argument list is made up solely of base types (number/char/string/...), it is likely to collide with a function in another module (say, a Geometry module).

That was easy!: That means "Normal" is a type 1 symbol.  As such, Normal *should* be "export"-ed by module Distributions... Nothing changes in the implementation!

Now how do we generate numbers with a normal distribution?
The Distributions package includes a rand() function.  Interestingly enough, the signature for rand makes it a type 2 method - and so it has a very low probability of signature collisions.

Here is a simplified definition of the rand function (Not using UnivariateDistribution, ...):
import Base: [...] rand
rand(x::Normal, n::Int) = ...
Basically: Distributions is extending Base.rand

With this infrastructure, you can generate random distributions with very nice code:
using Distributions
X = Normal(0.0, 1.0)

values = rand(X, 3)

So what is the problem here?
Again, it is quite subtle: In Julia, the first module to define a method essentially becomes the "owner" (in this case "Base").  This is awkward.
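
To make the ownership point concrete, here is a minimal sketch in Julia 1.x syntax (an assumption, since the thread predates it). `MyDistributions` and `MyNormal` are hypothetical stand-ins for the real package and type, used only to keep the example self-contained. The module adds a method to the function that Base owns and never creates a `rand` of its own.

```julia
# Hypothetical stand-in for Distributions.Normal; not the real package.
module MyDistributions
    export MyNormal

    struct MyNormal
        mu::Float64
        sigma::Float64
    end

    # Add a method to the function owned by Base rather than defining a new `rand`.
    Base.rand(d::MyNormal, n::Int) = [d.mu + d.sigma * randn() for _ in 1:n]
end

using .MyDistributions

X = MyNormal(0.0, 1.0)
samples = rand(X, 3)

# The name `rand` still refers to the single function owned by Base.
@assert rand === Base.rand
@assert length(samples) == 3
```

Because the new method's signature mentions `MyNormal`, it cannot clash with the methods Base already defines, which is the property the "type 2" label above is describing.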

Then what could be done differently?
Well, one possible solution would be simply declare type 2 methods as "global":
global rand(x::Normal, n::Int) = ...
(tentative syntax)

Due to its unique signature, this type 2 method is unlikely to collide with the definition from another module anyways... so why not make it global?

NOTE: I don't think this is ideal, but it is much better than the current solution where a module "owns" a given symbol (or verb, if you will).

Ok, but does this actually help?
Surprisingly yes!  By not using the same hammer for type 1 & type 2 symbols, *module developers* get a means to push the safe type 2 symbols to the global namespace... and *module users* get to decide when/where to import type 1 symbols from a given library.

Consequence 1:
It will be very unlikely that two modules will trigger signature collisions.  For example: draw(PlotPackage1.Canvas, ...) does not collide with draw(PlotPackage2.Canvas, ...).

Consequence 2:
As in any other programming language, you expect type 1 symbol collisions to be fatal (because they *are* ambiguous).  As usual, you can only import symbols ("using") from *a single* conflicting module to the current namespace at a time.  The other import *must* be a true "import".

...But that's ok... because you still get to use Julia's awesome multi-dispatch engine on all the type 2 symbols!!!

>>>And that's the *biggest* problem with Julia's current behavior.  You don't even get to use multi-dispatch on type 2 symbols when you do an "import".  You only get to reap the benefits when "using" succeeds - or if you import individual type 2 methods.<<<

MA Laforge

unread,
Apr 30, 2015, 12:15:04 AM4/30/15
to julia...@googlegroups.com
Stefan,

I am sorry, but my experience leads me to disagree with your statement that Julia is unable to dispatch a function dynamically (@ runtime).  Quote included:

"
module Foo
    export f
    immutable F end
    f(::F) = "this is Foo"
end

module Bar
    export f
    immutable B end
    f(::B) = "this is Bar"
end

julia> using Foo, Bar

julia> f(rand(Bool) ? Foo.F() : Bar.B()) # which `f` is this?

Which `f` is intended to be called here? It cannot be statically determined – it's not well-defined since it depends on the value of rand(Bool). Some dynamic languages are Ok with this kind of thing, but in Julia, the *meaning* of code should be decidable statically even if some of the behavior may be dynamic. Compare with this slightly different version of the above code (works on 0.4-dev):
"

Indeed, running this code, I get a misleading error message.  Whenever the code tries to run f(Foo.F()), I get the following message:
"
ERROR: `f` has no method matching f(::F)
"

Here is the problem:
The error actually happened when Bar tried to declare f() - which was already exported by Foo, then "using"-d by the module user.

So, in order to play nice, Bar would have to extend Foo.f()... even though (in an ideal world) Bar should never need to know that Foo "owned" f().

To make matters worse (on v.0.3.6), Bar then takes control of f() - and "steals" it from Foo... as a warning - not an error.

To make my case, I submit a workaround for this example:
#Sorry: My version of Julia does not have rand(Bool)...
Base.rand(::Type{Bool}) = randbool()

module Foo
    export f
    immutable F end
    f(::F) = "this is Foo"
end

module Bar
    #export f #Nope... cannot do this... Foo defined first: it "owns" f
    immutable B end

    #Sad but true: Foo "owns" f... so we must adhere to this reality:
    import Foo
    Foo.f(::B) = "this is Bar"
end

using Foo, Bar

#No problem... Julia has an algorithm to dispatch functions
#even if the compiler cannot resolve the call statically.
#Of course, a statically resolved dispatch would be faster than a dynamic one...
for i in 1:10
    println(f(rand(Bool) ? Foo.F() : Bar.B()))
end


...So your statement confuses me a little...


Tamas Papp

unread,
Apr 30, 2015, 2:15:51 AM4/30/15
to julia...@googlegroups.com

On Thu, Apr 30 2015, Stefan Karpinski <ste...@karpinski.org> wrote:

> Function merging has these problems:
>
> 1. It complects name resolution with dispatch – they are no longer
> orthogonal.
> 2. It makes all bindings from `using` semantically ambiguous – you have
> no idea what a name means without actually doing a call.

IMO orthogonality of name resolution and dispatch should be preserved --
it is a nice property of the language and makes reasoning about code
much easier. Many languages have this property, and it has stood the
test of time, also in combination with multiple dispatch (Common
Lisp). Giving it up would be a huge price to pay for some DWIM feature.

Best,

Tamas

Michael Francis

unread,
Apr 30, 2015, 12:19:07 PM4/30/15
to julia...@googlegroups.com
My goal is not to remove namespaces, quite the opposite, for types a namespace is an elegant solution to resolving the ambiguity between different types of the same name. What I do object to is that functions (which are defined against user defined types) are relegated to being second class citizens in the Julia world unless you are developing in Base. For people in Base the world is great, it all just works. For everybody else you either shoe horn your behavior into one of the Base methods by extending it, or you are forced into qualifying names when you don't need to. 

1) Didn't that horse already bolt with Base? If Base were subdivided into strict namespaces of functionality then I see this argument, but that isn't the case and everybody would be complaining that they need to type strings.find("string") 
2) To me that is what multiple dispatch is all about. I am calling a function and I want the run time to decide which implementation to use, if I wanted to have to qualify all calls to a method I'd be far better off in an OO language. 

Tom Breloff

unread,
Apr 30, 2015, 1:11:27 PM4/30/15
to julia...@googlegroups.com
Can anyone point me in the right direction of the files/functions in the core library where dispatch is handled?  I'd like to explore a little so I can make comments that account for the relative ease at implementing some of the changes suggested.  

I agree that it would be really nice, in some cases, to auto-merge function definitions between namespaces (database connects are a very simple OO example).   However, if 2 different modules define foo(x::Float64, y::Int), then there should be an error if they're both exported (or if not an error, then at least force qualified access??)   Now in my mind, the tricky part comes when a package writer defines:

module MyModule
export foo
type MyType end
foo(x::MyType) = ...
foo(x) = ...
end


... and then writes other parts of the package depending on foo(5) to do something very specific.  This may work perfectly until the user calls "using SomeOtherModule" which in turn exported foo(x::Int).  If there's an auto-merge between the modules, then foo(x::MyModule.MyType), foo(x::Any), and foo(Int) all exist and are valid calls.

If the auto-merge changes the calling behavior for foo within module MyModule, then we have a big problem.  I.e. we have something like:

internalmethod() = foo(5)


that is defined within MyModule... If internalmethod() now maps to SomeOtherModule.foo(x::Int)... all the related internal code within MyModule will likely break.  However, if internalmethod() still maps to MyModule.foo(), then I think we're safe.


So I guess the question: can we auto-merge the foo methods in user space only (i.e. in the REPL where someone called "using MyModule, SomeOtherModule"), and keep calls within modules un-merged (unless of course you call "using SomeOtherPackage" within MyModule... after which it's my responsibility to know about and call the correct foo).

Are there other pitfalls to auto-merging in user-space-only?  I can't comment on how hard this is to implement, but I don't foresee how it breaks dispatch or any of the other powerful concepts.
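
As a point of comparison, here is a minimal sketch of how the unmerged case behaves, written in Julia 1.x syntax (an assumption; the thread predates it). It does not implement the auto-merge proposal. It only shows that a call written inside a module resolves against that module's own bindings, so `internalmethod()` keeps calling `MyModule.foo` no matter what the user loads. The return strings are illustrative.

```julia
module MyModule
    export foo, internalmethod
    struct MyType end
    foo(x::MyType) = "MyModule.foo(::MyType)"
    foo(x) = "MyModule.foo(::Any)"
    # `foo` here is looked up in MyModule, not in whoever loads MyModule.
    internalmethod() = foo(5)
end

module SomeOtherModule
    export foo
    foo(x::Int) = "SomeOtherModule.foo(::Int)"
end

using .MyModule, .SomeOtherModule

# The internal call is unaffected by SomeOtherModule's more specific method.
@assert internalmethod() == "MyModule.foo(::Any)"

# The two functions stay separate; qualified calls pick one explicitly.
@assert MyModule.foo(5) == "MyModule.foo(::Any)"
@assert SomeOtherModule.foo(5) == "SomeOtherModule.foo(::Int)"
@assert MyModule.foo !== SomeOtherModule.foo
```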

Isaiah Norton

unread,
Apr 30, 2015, 1:20:36 PM4/30/15
to julia...@googlegroups.com
Can anyone point me in the right direction of the files/functions in the core library where dispatch is handled?  I'd like to explore a little so I can make comments that account for the relative ease at implementing some of the changes suggested.  

Mauro

unread,
Apr 30, 2015, 1:39:54 PM4/30/15
to julia...@googlegroups.com
>> Can anyone point me in the right direction of the files/functions in the
>> core library where dispatch is handled? I'd like to explore a little so I
>> can make comments that account for the relative ease at implementing some
>> of the changes suggested.
>>
>
> Start here:
> https://github.com/JuliaLang/julia/blob/cd455af0e26370a8899c1d7b3d194aacd8c87e9e/src/gf.c#L1655

and this
https://www.youtube.com/watch?v=osdeT-tWjzk

> On Thu, Apr 30, 2015 at 1:11 PM, Tom Breloff <t...@breloff.com> wrote:
>
>> Can anyone point me in the right direction of the files/functions in the
>> core library where dispatch is handled? I'd like to explore a little so I
>> can make comments that account for the relative ease at implementing some
>> of the changes suggested.
>>
>> I agree that it would be really nice, in some cases, to auto-merge
>> function definitions between namespaces (database connects are very simple
>> OO example). However, if 2 different modules define foo(x::Float64,
>> y::Int), then there should be an error if they're both exported (or if not
>> an error, then at least force qualified access??) Now in my mind, the
>> tricky part comes when a package writer defines:
>>
>> module MyModule
>> export foo
>> type MyType end
>> foo(x::MyType) = ...
>> foo(x) = ...
>> end
>>
>>
>> ... and then writes other parts of the package depending on foo(5) to do
>> something very specific. This may work perfectly until the user calls
>> "using SomeOtherModule" which in turn exported foo(x::Int). If there's an
>> auto-merge between the modules, then foo(x::MyModule.MyType), foo(x::Any),
>> and foo(Int) all exist and are valid calls.
>>
>> *If* the auto-merge changes the calling behavior for foo *within module
>> MyModule*, then we have a big problem. I.e. we have something like:
>>
>> internalmethod() = foo(5)
>>
>>
>> that is defined within MyModule... If internalmethod() now maps to
>> SomeOtherModule.foo(x::Int)... all the related internal code within
>> MyModule will likely break. However, if internalmethod() still maps to
>> MyModule.foo(), then I think we're safe.
>>
>>
>> So I guess the question: can we auto-merge the foo methods *in user space
>> only *(i.e. in the REPL where someone called "using MyModule,

Tom Breloff

unread,
Apr 30, 2015, 1:47:41 PM4/30/15
to julia...@googlegroups.com
Bookmarked and watching.  Thanks  :)

Matt Bauman

unread,
Apr 30, 2015, 2:18:03 PM4/30/15
to julia...@googlegroups.com
On Thursday, April 30, 2015 at 1:11:27 PM UTC-4, Tom Breloff wrote:
I agree that it would be really nice, in some cases, to auto-merge function definitions between namespaces (database connects are a very simple OO example).   However, if 2 different modules define foo(x::Float64, y::Int), then there should be an error if they're both exported (or if not an error, then at least force qualified access??)   Now in my mind, the tricky part comes when a package writer defines:

module MyModule
export foo
type MyType end
foo(x::MyType) = ...
foo(x) = ...
end

I think this is a very interesting discussion, but it all seems to come back to a human communication issue.  Each package author must *somehow* communicate to both users and other package authors that they mean the same thing when they define a function that's intended to be used interchangeably.  We can either do this explicitly (e.g., by joining or forming an organization like JuliaStats/StatsBase.jl, JuliaDB/DBI.jl, JuliaIO/FileIO.jl, etc.), or we can try to write code in Julia to help mediate this discussion.

The heuristics you're proposing sound interesting (and may even work, especially when combined with delaying ambiguity warnings and making them errors at an ambiguous call), but I have a hunch that it will take a lot of work to implement.  And I'm not sure that it really makes things better.  Bad actors can still define their interfaces to prevent others from using the same names with multiple dispatch (e.g., by only defining `connect(::String)`). Doing the sort of automatic filetype dispatch (like FileIO is working towards) still needs *one* place where `load("data.jld")` is interpreted and re-dispatched to `load(::FileType{:jld})` that the HDF5/JLD package can define its dispatch on.  Finally, one currently unsolved area is plotting.  None of the `plot` methods defined in any of the various packages are combined into the same function, nor could they feasibly do so without massive coordination between the package authors (for no real functional gain).  This proposal doesn't really solve that, either.  It'll be just as impossible to do `using Gadfly, Winston` and have the `plot` function just work.
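
The "one place" re-dispatch idea can be sketched roughly as follows, in Julia 1.x syntax (an assumption). This is not FileIO's actual API: the `FileType` tag, the `load` entry point and the returned strings are all hypothetical, and they only illustrate how a single shared function can hand off to per-format methods.

```julia
# Hypothetical tag type; the type parameter carries the file extension as a Symbol.
struct FileType{ext}
    path::String
end

# The single shared entry point: inspect the extension once, then re-dispatch on the tag.
function load(path::AbstractString)
    ext = Symbol(lstrip(splitext(path)[2], '.'))
    return load(FileType{ext}(path))
end

# A format package would add its own method for its own tag.
load(f::FileType{:jld}) = "JLD loader called for $(f.path)"

# Fallback for formats nobody has registered.
load(f::FileType) = error("no loader registered for $(f.path)")

println(load("data.jld"))
```

The coordination cost does not go away: every format package still has to agree on which module owns `load` and `FileType` before it can add its method.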

I hope this doesn't read as overly negative.  I think it's great that folks are pushing the edges here and proposing new ideas.  But I'm afraid that this won't replace the collaboration needed to get these sorts of interfaces working well and interchangeably.

Tom Breloff

unread,
Apr 30, 2015, 4:00:46 PM4/30/15
to julia...@googlegroups.com
I think it's a good thing to spell out precise scenarios where "using" multiple modules at the same time is good and unambiguous, and when you can get in trouble.  If anything, it gives developers an idea of the edge cases that need to be handled and can help in thinking about design changes and/or workarounds, and may help to more clearly define best practices.  I think the crux of the issue is this:

Name clash between similar/competing modules, which likely don't know or care about the other, and which may or may not define functionality common to Base

Good examples mentioned are databases and plotting packages.   FileSystemDB and CloudDB might both want to export a connect(s::String) method, just like Winston and Gadfly might want to export a plot(v::Vector{Float64}) method.  I don't think this is a bad thing, but for it to work with "using FileSystemDB, CloudDB" we have a couple requirements:
  • Within both FileSystemDB and CloudDB, they must call their respective connect methods internally.  If this doesn't hold then every package writer must know about every other package in existence now and in the future to ensure nothing breaks.  This requirement could be relaxed a little if the package writer had some control over what/how its internal methods could be overwritten.  (Comparing to C++, a class can have protected methods which can effectively be redefined by another class, but also private methods which cannot.  It's up to the class writer to decide which parts can be changed without breaking internals.)  In effect, a module could have "private" methods that are never monkey-patched, and "public" methods that could be.  Some languages do this with naming conventions (underscores, etc).  The decision would then rest with the package developer as to whether it would break their code to allow a different module to override their methods.  Here's an example where there's one function that you really don't want someone else to overwrite, and another that doesn't really matter.  (Idea: could possibly achieve this by automatically converting calls to "_cleanup()" within this module into "MyModule._cleanup()" during parsing?)
    module MyModule

    const somethingImportant = loadMassiveDatabaseIntoMemory()
    _cleanup() = cleanup(somethingImportant)

    type MyType end
    string(x::MyType) = "MyType{}"

    ...

    end
  • In the scope where the "using" call occurred, ambiguous calls should require explicit calling
    • The pain of this could certainly be lessened with syntax like
      using Gadfly as G, Winston
      which could use Winston's plot method by default... forcing you to call G.plot() otherwise.  This might be close to how it works now, anyways.
      or potentially harder to implement properly (but maybe under the hood just re-sorts the module priorities?):
      import Gadfly, Winston

      with Gadfly
        plot(x)
      end

      with Winston
        plot(y)
      end
      Both are very reasonable syntax from my point of view.  At some point the user has to tell us what they want, right?  You can't use multiple packages defining the exact same method and expect something to "just work"
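
For what it is worth, the first of these two ideas can be approximated in later Julia versions. The sketch below assumes Julia 1.6 or newer, where `import ... as` exists, and uses two hypothetical modules, `PlotA` and `PlotB`, in place of Gadfly and Winston so that it runs without either package installed.

```julia
# Two hypothetical plotting modules that export the exact same method signature.
module PlotA
    export plot
    plot(v::Vector{Float64}) = "PlotA.plot"
end

module PlotB
    export plot
    plot(v::Vector{Float64}) = "PlotB.plot"
end

import .PlotA as A    # alias only; brings no exported names into scope
using .PlotB          # unqualified `plot` now means PlotB.plot

v = [1.0, 2.0]
@assert plot(v) == "PlotB.plot"
@assert A.plot(v) == "PlotA.plot"
```

Only one of the two modules supplies the unqualified name, so the user still states explicitly which `plot` is the default.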


Note: I think monkey patching is ok in some circumstances, but usually dangerous in packages (i.e. redefining
Base.sin(x::Real) = fly("Vegas")
in a package will lead to problems that the package maintainer just couldn't foresee.)  Monkey patching by an end user is a much different story, as they usually have a better idea on how all the components interact.  This kind of thing should end up in a best practices guide, though... not forced by the language.  

Maybe we just need better tooling to identify package interop... i.e. a test system that will do "using X, Y, Z, ... " in a systematic way before testing a package, thus letting tests happen with random subsets of other packages polluting the Main module to identify how fragile that package may be in the wild (and also whether using that package leads to breakages elsewhere).

Thoughts?  Does any of this exist already and I just don't know about it?

Scott Jones

Apr 30, 2015, 4:47:03 PM4/30/15
to julia...@googlegroups.com


On Wednesday, April 29, 2015 at 11:36:15 PM UTC-4, MA Laforge wrote:
Scott and Michael:
I am pretty certain I understand what you are saying, but I find your examples/descriptions a bit confusing.  I think (hope) I know why Stefan is confused.

Stefan:
I think Scott has a valid point but I disagree that "exported functions from a module must reference types defined in that module".  It is my strong belief that Scott is merely focusing on the symptom instead of the cause.

I've never advocated that!  What I've said, a number of times now, is that *if* the compiler determines that a method in a module is using a type defined in that module, which should mean
that it is unambiguous, and it is exported, that if somebody does "using xxx", it should be merged.
I actually now think that it really needs to be explicit... it is not very clear in Julia what the intent of the programmer was.

You have me confused with Michael here... I disagree with that part of what he was saying, and agree totally with Stefan that it would be way too limiting.
I also think that the approach of just postponing the check for ambiguity to run-time would be a very bad thing... I think it's hard enough now to know
if your module is correct, it seems a lot of stuff isn't caught until run-time.

I also think that your description of the problem is very good, but I don't think your solution does enough to make things better...

I would propose adding some new syntax to indicate if a function name in a module is meant to extend a particular function, instead of that happening implicitly because
it was previously imported, as well as syntax to indicate that a function name is meant to be a new "concept", that others can extend, and can unambiguously be used
in the global namespace (i.e. it uses one (or more) of its own type(s) as part of its signature)...
module baz
function foo(args) export extends Base        # i.e. extends Base.foo, is exported
function silly(args) export extends Database  # i.e. extends Database.silly, is exported
function bar(abc::MyType) export generic      # creates a new concept "bar", which is exported, and can be extended by other modules explicitly with the extends syntax
function ugh(args) extends Base               # extends Base.ugh in the module, but is not visible outside the module (have to use baz.ugh to call)
function myfunc(args)                         # makes a local definition, will hide other definitions from Base or used in the parent module, but can be used as baz.myfunc
function myownfunc(args) private              # makes a local definition, as above, but is not visible at all in parent module

I'm not sure if this is possible, but, what if you wanted to extend a particular function, but *only* in the context of your module?
Currently, the extension is made public, even if you didn't export the function, which seems a bit wrong to me...
To me, it seems that whether a function is meant to be a new generic, or extend something else, is orthogonal to whether you want it to be able to be used directly
in code that does "using".  It would also be good to be able to make methods (even ones that are generic within the module) (or types, for that matter) private from the rest of the world...
so that outside the module, they cannot be used...
I see a big problem with Julia, in that it seems that one cannot prevent users from directly accessing your implementation, as opposed to being limited to just the abstract interface
you made.   This was a wonderful thing in CLU... I suppose they no longer teach CLU at MIT? :-(
[it would also be wonderful to be able to specify which exceptions a method can throw, and have that strict... if a method specifies its exceptions, then any unhandled exceptions
in the method get converted to a special "Unhandled Exception" exception...]
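
To illustrate the point above that an extension is public even when nothing is exported, here is a minimal sketch. It assumes Julia 1.x syntax (`struct` where this thread's code uses `type`), and the names `Inner` and `Wrapper` are made up for the example.

```julia
module Inner

# Hypothetical container type; nothing in this module is exported.
struct Wrapper
    items::Vector{Int}
end

# Adds a method to the existing generic function Base.length.
Base.length(w::Wrapper) = length(w.items)

end

# No `using` and no export, yet the new method is reachable through `length`
# from anywhere, because it was added to Base's function itself.
println(length(Inner.Wrapper([1, 2, 3])))
```

The method lives in the function it extends, not in the module that defined it, which is why hiding it behind the module boundary is not possible with this mechanism.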

Scott

Stefan Karpinski

Apr 30, 2015, 5:03:52 PM4/30/15
to Julia Users
On Wed, Apr 29, 2015 at 9:08 PM, Scott Jones <scott.pa...@gmail.com> wrote:
Your restrictions are making it very hard to develop easy to use APIs that make sense for the people using them…

That’s why so many people have been bringing this issue up…

Not a single person who maintains a major Julia package has complained about this. Which doesn't mean that there can't possibly be an issue here, but it seems to strongly suggest that this is one of those concerns that initially appears dire, when coming from a particular programming background, but which dissipates once one acclimatizes to the multiple dispatch mindset – in particular the idea that "one generic function" = "one verb concept".

Stefan Karpinski

Apr 30, 2015, 5:18:23 PM4/30/15
to Julia Users
On Thu, Apr 30, 2015 at 12:19 PM, Michael Francis <mdcfr...@gmail.com> wrote:
My goal is not to remove namespaces, quite the opposite, for types a namespace is an elegant solution to resolving the ambiguity between different types of the same name. What I do object to is that functions (which are defined against user defined types) are relegated to being second class citizens in the Julia world unless you are developing in Base. For people in Base the world is great, it all just works. For everybody else you either shoe horn your behavior into one of the Base methods by extending it, or you are forced into qualifying names when you don't need to.

There's nothing privileged about Base except that `using Base` is automatically done in other (non-bare) modules. If you want to extend a function from Base, you have to do `Base.foo(args...) = whatever`. The same applies to functions from other modules. The Distributions and StatsBase packages are good examples of this being done in the Julia ecosystem. What is wrong with having shared concepts defined in a shared module?
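
A minimal sketch of the shared-module approach, applied to the `value` example from the start of this thread. It assumes Julia 1.x syntax (`struct` where the original code uses `type`, and `function value end` to declare a function with no methods); the module name `Shared` is made up for the example.

```julia
# The shared module owns the single generic function `value`.
module Shared
export value
function value end   # declares the function; methods are added elsewhere
end

module Foo
import ..Shared
export FooType
struct FooType end
# Extend the shared function with an explicitly qualified name.
Shared.value(::FooType) = "Foo::value"
end

module Bar
import ..Shared
export BarType
struct BarType end
Shared.value(::BarType) = "Bar::value"
end

using .Shared, .Foo, .Bar

# Only one `value` exists, so load order no longer matters.
println(value(FooType()))
println(value(BarType()))
```

Because Foo and Bar both add methods to the same function rather than each exporting their own `value`, there is no conflicting identifier in Main.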
 
1) Didn't that horse already bolt with Base? If Base were subdivided into strict namespaces of functionality then I see this argument, but that isn't the case and everybody would be complaining that they need to type strings.find("string")

I don't understand how any horses have bolted – Base is not particularly special, it just provides a default set of names that are available to use. It in no way precludes defining your own meanings for any names at all or sharing them among a set of modules.

2) To me that is what multiple dispatch is all about. I am calling a function and I want the run time to decide which implementation to use, if I wanted to have to qualify all calls to a method I'd be far better off in an OO language. 

This is exactly what happens, but you have to first be clear about which function you mean. The whole point of namespaces is that different modules can have different meanings for the same name. If two modules don't have the same meaning for `foo` then you have to clarify which sense of `foo` you intended. Once you've picked a meaning of `foo`, if it is a generic function, then the appropriate method will be used when you call it on a set of arguments.
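
As a rough sketch of the two steps described above (first pick the function, then let dispatch pick the method), assuming two made-up modules `Geometry` and `Maps` that each give `area` a different meaning:

```julia
module Geometry
# `area` here means the area of a shape.
area(r::Real) = pi * r^2          # circle of radius r
area(w::Real, h::Real) = w * h    # rectangle of width w and height h
end

module Maps
# `area` here means a named region, an unrelated concept.
area(name::AbstractString) = "region:" * name
end

# Step 1: the qualified name selects which function is meant.
# Step 2: dispatch selects the method from the argument types.
println(Geometry.area(2, 3))
println(Geometry.area(1.0))
println(Maps.area("north"))
```

Neither module exports `area`, so the caller states the meaning explicitly; within `Geometry.area`, the one- and two-argument methods are still chosen at run time.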

Stefan Karpinski

Apr 30, 2015, 5:24:11 PM4/30/15
to Julia Users
On Thu, Apr 30, 2015 at 12:15 AM, MA Laforge <ma.laf...@gmail.com> wrote:
Stefan,

I am sorry, but my experience leads me to disagree with your statement that Julia is unable to dispatch a function dynamically (@ runtime).

I didn't say this – all function calls are dispatched dynamically.
 
...So your statement confuses me a little...

That's probably because you missed the statement right before that code where I said "assuming hypothetical code with function merging" – in other words that example is not how things work currently, but how it has been proposed that they could work – a behavior which I'm arguing against. If you run that code on 0.3 it will always call Bar.f; if you run it in 0.4-dev, it will give an error (as I believe it should).

Scott Jones

Apr 30, 2015, 5:26:19 PM4/30/15
to julia...@googlegroups.com
Maybe because it seems that a lot of the major packages have been put into Base, so it isn't a problem, as MA Laforge pointed out, leading to Base being incredibly large,
with stuff that means Julia's MIT license doesn't mean all that much, because it includes GPL code by default...

Scott