namespaces and overloading


Geoffrey Irving

Apr 24, 2011, 3:51:04 AM
to Dylan Simon, duck...@googlegroups.com
Hello,

I'm vaguely toying with the idea of tinkering with duck again. Two
main thoughts:

1. Being purely functional is hard, so I'm waffling about that.
Shifting to impure functional might make it easier to proceed. There
are no issues on the type system side, thanks to our principle that
types are independent of context. The main downside is trouble for
optimizations down the line.

2. I don't have a satisfactory way to combine overloading with modules
/ namespaces. It feels like this is important. Among other things,
it'd be nice if the duck environment could be constructed piecemeal,
with full interleaving of execution and type inference. This would
need to happen anyways in the context of runtime optimization. It's
frustrating how natural namespaces are for object oriented languages
like Python, where a namespace is just a big object with a bunch of
fields. I'm just jealous, I suppose.

Not sure if this will go anywhere, but I'm curious if it triggers any
thoughts or further discussions.

Thanks,
Geoffrey

Dylan Simon

Apr 24, 2011, 10:52:52 AM
to Geoffrey Irving, duck...@googlegroups.com
From Geoffrey Irving <irv...@naml.us>, Sun, Apr 24, 2011 at 12:51:04AM -0700:

> Hello,
>
> I'm vaguely toying with the idea of tinkering with duck again. Two
> main thoughts:
>
> 1. Being purely functional is hard, so I'm waffling about that.
> Shifting to impure functional might make it easier to proceed. There
> are no issues on the type system side, thanks to our principle that
> types are independent of context. The main downside is trouble for
> optimizations down the line.

Hard in what sense? For the programmer, because I might disagree with that.
I thought effect typing provided a reasonable solution to that, or do you mean
that effect typing itself is hard? I suppose a compromise would be impure
functional with optimization annotations (all arrows are interchangeable to
the type system, though possibly fully inferred, but meaningful to the
optimizer). I'd like to understand which part is hard first, though.

>
> 2. I don't have a satisfactory way to combine overloading with modules
> / namespaces. It feels like this is important. Among other things,
> it'd be nice if the duck environment could be constructed piecemeal,
> with full interleaving of execution and type inference. This would
> need to happen anyways in the context of runtime optimization. It's
> frustrating how natural namespaces are for object oriented languages
> like Python, where a namespace is just a big object with a bunch of
> fields. I'm just jealous, I suppose.

I don't think we have to do too much on this early on. Each module has a
single flat namespace, which can be imported (unioned) into other modules.
Full qualification and import restrictions can still be allowed, with
overloading only within that namespace. But possibly this is more complicated
than I think and I haven't thought about the issues for a while.

Geoffrey Irving

Apr 24, 2011, 1:40:46 PM
to duck...@googlegroups.com, Dylan Simon
On Sun, Apr 24, 2011 at 7:52 AM, Dylan Simon <dylan-...@dylex.net> wrote:
> From Geoffrey Irving <irv...@naml.us>, Sun, Apr 24, 2011 at 12:51:04AM -0700:
>> Hello,
>>
>> I'm vaguely toying with the idea of tinkering with duck again.  Two
>> main thoughts:
>>
>> 1. Being purely functional is hard, so I'm waffling about that.
>> Shifting to impure functional might make it easier to proceed.  There
>> are no issues on the type system side, thanks to our principle that
>> types are independent of context.  The main downside is trouble for
>> optimizations down the line.
>
> Hard in what sense?  For the programmer, because I might disagree with that.
> I thought effect typing provided a reasonable solution to that, or do you mean
> that effect typing itself is hard?  I suppose a compromise would be impure
> functional with optimization annotations (all arrows are interchangeable to
> the type system, though possibly fully inferred, but meaningful to the
> optimizer).  I'd like to understand which part is hard first, though.

It may just be fear of the unknown, but I think the hard part is
"local" effects such as temporary arrays, where the effect types have
to be polymorphic. The canonical example is a function which
allocates a mutable array, constructs it via fundamentally impure
methods (quicksort or random access, for example), and returns it.
The caller wants to be able to either continue modifying the result or
decide that the array should now be frozen.
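As a concrete (if effect-blind) rendering of this pattern, here is roughly what such a "create" function looks like in C++, where nothing in the type records that the construction was impure; the function name and contents are illustrative, not from duck:

```cpp
#include <algorithm>
#include <numeric>
#include <random>
#include <vector>

// The "create" pattern under discussion: build the result with
// fundamentally impure operations (random access, in-place sorting),
// then let the finished value escape as an ordinary vector.  C++ gets
// this for free because effects are untracked; the problem for duck is
// giving create() a type that proves the effects stayed local.
std::vector<int> create(int n) {
  std::vector<int> a(n);
  std::iota(a.begin(), a.end(), 0);       // fill with 0..n-1
  std::mt19937 gen(12345);                // fixed seed, deterministic
  std::shuffle(a.begin(), a.end(), gen);  // impure: random access writes
  std::sort(a.begin(), a.end());          // impure: in-place sort
  return a;
}

// The caller then chooses: keep mutating the result, or "freeze" it by
// binding to const:
//   std::vector<int> v = create(5);      // still mutable
//   const std::vector<int> f = create(5);  // frozen from here on
```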

In the rest of the type system, we've managed to avoid universally
quantified types (and associated inference complexity) through the use
of Void. However, I don't see any way to type this sort of "create"
function without universally quantifying over the array region. In
fact, it's worse than that, since you have to express the fact that
"create" doesn't store a reference to the array in some other
structure. Disciple handles this with closure types and other further
complexity.

Of course, "create" is also exactly the place where you most want
effect typing to express exactly what you're doing.

>> 2. I don't have a satisfactory way to combine overloading with modules
>> / namespaces.  It feels like this is important.  Among other things,
>> it'd be nice if the duck environment could be constructed piecemeal,
>> with full interleaving of execution and type inference.  This would
>> need to happen anyways in the context of runtime optimization.  It's
>> frustrating how natural namespaces are for object oriented languages
>> like Python, where a namespace is just a big object with a bunch of
>> fields.  I'm just jealous, I suppose.
>
> I don't think we have to do too much on this early on.  Each module has a
> single flat namespace, which can be imported (unioned) into other modules.
> Full qualification and import restrictions can still be allowed, with
> overloading only within that namespace.  But possibly this is more complicated
> than I think and I haven't thought about the issues for a while.

Unfortunately, that simple method isn't enough, since you want closed
modules to be able to call overloads declared by their users. For
example, you can have a function in module A that evaluates an array
as a polynomial, a module B that declares a type of complex numbers,
and a module C that creates an array of complex numbers and evaluates
it as a polynomial. A has to be able to call arithmetic functions from
module B even though A and B have no relationship to each other.

C++ solves this via Koenig lookup, so that's one option.
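For concreteness, here is a minimal single-file C++ sketch of that A/B/C scenario, with namespaces standing in for modules; argument-dependent (Koenig) lookup is what lets A's unqualified operators find B's overloads. All names here are illustrative:

```cpp
#include <cstddef>
#include <vector>

// "Module" A: evaluates a coefficient vector as a polynomial via
// Horner's rule.  It calls + and * unqualified; for user-defined types,
// ADL resolves them in the argument type's own namespace.
namespace A {
  template <typename T>
  T eval_poly(const std::vector<T>& coeffs, T x) {
    T acc = T();  // zero-initialized accumulator
    for (std::size_t i = coeffs.size(); i-- > 0;)
      acc = coeffs[i] + x * acc;  // unqualified: found by ADL
    return acc;
  }
}

// "Module" B: a complex number type with arithmetic overloads in its
// own namespace.  A knows nothing about B.
namespace B {
  struct Complex { double re, im; };
  Complex operator+(Complex a, Complex b) {
    return {a.re + b.re, a.im + b.im};
  }
  Complex operator*(Complex a, Complex b) {
    return {a.re * b.re - a.im * b.im, a.re * b.im + a.im * b.re};
  }
}

// "Module" C combines them even though A and B never import each other:
//   A::eval_poly(std::vector<B::Complex>{{1, 0}, {0, 1}}, B::Complex{0, 1})
```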

One reason the namespace issue came up is that I was thinking of how
to arrange the data structures for the duck runtime. It'd be nice if
everything was wired together via pointers in such a way that pieces
of code / functions / modules / etc. would naturally disappear once no
longer in use. That may be entirely impractical, though.

Geoffrey

Dylan Simon

Apr 24, 2011, 2:44:39 PM
to Geoffrey Irving, duck...@googlegroups.com
> >> 1. Being purely functional is hard, so I'm waffling about that.
> >> Shifting to impure functional might make it easier to proceed.  There
> >> are no issues on the type system side, thanks to our principle that
> >> types are independent of context.  The main downside is trouble for
> >> optimizations down the line.
> >
> > Hard in what sense?  For the programmer, because I might disagree with that.
> > I thought effect typing provided a reasonable solution to that, or do you mean
> > that effect typing itself is hard?  I suppose a compromise would be impure
> > functional with optimization annotations (all arrows are interchangeable to
> > the type system, though possibly fully inferred, but meaningful to the
> > optimizer).  I'd like to understand which part is hard first, though.
>
> It may just be fear of the unknown, but I think the hard part is
> "local" effects such as temporary arrays, where the effect types have
> to be polymorphic.  The canonical example is a function which
> allocates a mutable array, constructs it via fundamentally impure
> methods (quicksort or random access, for example), and returns it.
> The caller wants to be able to either continue modifying the result or
> decide that the array should now be frozen.
>
> In the rest of the type system, we've managed to avoid universally
> quantified types (and associated inference complexity) through the use
> of Void.  However, I don't see any way to type this sort of "create"
> function without universally quantifying over the array region.  In
> fact, it's worse than that, since you have to express the fact that
> "create" doesn't store a reference to the array in some other
> structure.  Disciple handles this with closure types and other further
> complexity.
>
> Of course, "create" is also exactly the place where you most want
> effect typing to express exactly what you're doing.

So, something like what the ST monad is for in Haskell? Is this less of an
issue if we don't allow overloading of effects, so that overloaded functions
have to have matching effects? I guess it does seem an okay starting point to
make effects somewhat less specific or even binary (pure or not).

> >> 2. I don't have a satisfactory way to combine overloading with modules
> >> / namespaces.  It feels like this is important.  Among other things,
> >> it'd be nice if the duck environment could be constructed piecemeal,
> >> with full interleaving of execution and type inference.  This would
> >> need to happen anyways in the context of runtime optimization.  It's
> >> frustrating how natural namespaces are for object oriented languages
> >> like Python, where a namespace is just a big object with a bunch of
> >> fields.  I'm just jealous, I suppose.
> >
> > I don't think we have to do too much on this early on.  Each module has a
> > single flat namespace, which can be imported (unioned) into other modules.
> > Full qualification and import restrictions can still be allowed, with
> > overloading only within that namespace.  But possibly this is more complicated
> > than I think and I haven't thought about the issues for a while.
>
> Unfortunately, that simple method isn't enough, since you want closed
> modules to be able to call overloads declared by their users.  For
> example, you can have a function in module A that evaluates an array
> as a polynomial, a module B that declares a type of complex numbers,
> and a module C that creates an array of complex numbers and evaluates
> it as a polynomial.  A has to be able to call arithmetic functions from
> module B even though A and B have no relationship to each other.

You could just not allow that, right? So, if I want module A to use my
overloads for (+), I have to pass it in, i.e., overloads are frozen at
symbolic reference time. No, I guess that's far too limiting, isn't it? This
is why we had thought to start with a single, load-everything-at-once
approach, isn't it? Essentially, importing a module has to share the
namespace both ways.

> C++ solves this via Koenig lookup, so that's one option.

Doesn't this allow two independent modules that use a common module to
have conflicting overloads, so that you could get conflicts by importing two
different modules that each work fine on their own? Not that that's the end
of the world.

> One reason the namespace issue came up is that I was thinking of how
> to arrange the data structures for the duck runtime. It'd be nice if
> everything was wired together via pointers in such a way that pieces
> of code / functions / modules / etc. would naturally disappear once no
> longer in use. That may be entirely impractical, though.

So loading a module establishes all the pointers into all the overloads? It
does seem hard to determine exactly which overloads you could get rid of, but
this seems reasonable for entire symbols you no longer need.

Geoffrey Irving

Apr 24, 2011, 5:06:22 PM
to Geoffrey Irving, duck...@googlegroups.com
On Sun, Apr 24, 2011 at 11:44 AM, Dylan Simon <dylan-...@dylex.net> wrote:
>> It may just be fear of the unknown, but I think the hard part is
>> "local" effects such as temporary arrays, where the effect types have
>> to be polymorphic.  The canonical example is a function which
>> allocates a mutable array, constructs it via fundamentally impure
>> methods (quicksort or random access, for example), and returns it.
>> The caller wants to be able to either continue modifying the result or
>> decide that the array should now be frozen.
>>
>> In the rest of the type system, we've managed to avoid universally
>> quantified types (and associated inference complexity) through the use
>> of Void.  However, I don't see any way to type this sort of "create"
>> function without universally quantifying over the array region.  In
>> fact, it's worse than that, since you have to express the fact that
>> "create" doesn't store a reference to the array in some other
>> structure.  Disciple handles this with closure types and other further
>> complexity.
>>
>> Of course, "create" is also exactly the place where you most want
>> effect typing to express exactly what you're doing.
>
> So, something like what the ST monad is for in Haskell?  Is this less of an
> issue if we don't allow overloading of effects, so that overloaded functions
> have to have matching effects?  I guess it does seem an okay starting point to
> make effects somewhat less specific or even binary (pure or not).

The hardness doesn't change in the absence of overloads. Starting
with pure or not is a possibility, but it doesn't seem that different
from starting with completely impure and inferring what we need for
optimization.

Yeah, disallowing that is too limiting.

>> C++ solves this via Koenig lookup, so that's one option.
>
> Doesn't this prevent two independent modules that use a common module from
> having conflicting overloads, so that you could get conflicts by importing two
> different modules that each work on their own?  Not that that's the end of the
> world.

Not the end of the world, but certainly unpalatable. I guess one
option is for each module to declare what symbols it exports to be
callable from other namespaces. This is probably enough in practice.
After all, C programmers survive with no namespaces at all. A certain
amount of discipline is required to not pollute the overload table
with overly general overloads, but this is required whatever happens.
Maybe Koenig lookup is the right way to go.

I was about to say that maybe there should be three levels of
exportedness: fully exported, exported only for access via Koenig
lookup, and completely private. However, that's unnecessary for the
same reason that overloading works at all: exported only for access
via Koenig lookup just means fully exported with a type that ensures it
can only get called if you pass in the right type.
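Incidentally, C++ has a direct analogue of that middle level in the "hidden friend" idiom: a friend function defined inside a class body is invisible to qualified and ordinary unqualified lookup, and is reachable only through ADL on an argument of the class type. A sketch, with hypothetical names:

```cpp
#include <string>

// "Exported only for access via Koenig lookup" in C++: a hidden friend.
namespace units {
  struct Meters {
    double v;
    // Defined inside the class, so it is NOT visible as units::describe
    // by qualified lookup; only ADL on a Meters argument finds it.
    friend std::string describe(Meters m) {
      return std::to_string(m.v) + " m";
    }
  };
}

// describe(units::Meters{2});  // OK: found via ADL
// units::describe(...);        // error: name not found by qualified lookup
```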

Importantly, Koenig lookup also has the property that the set of
resolved overloads can't change after a module is loaded, so each
module can be type inferred before the next is even parsed. The only
way a new overload could be detected is if a function is called with a
type from a new module, and no such call could have happened before.

>> One reason the namespace issue came up is that I was thinking of how
>> to arrange the data structures for the duck runtime.  It'd be nice if
>> everything was wired together via pointers in such a way that pieces
>> of code / functions / modules / etc. would naturally disappear once no
>> longer in use.  That may be entirely impractical, though.
>
> So loading a module establishes all the pointers into all the overloads?  It
> does seem hard to determine exactly which overloads you could get rid of, but
> this seems reasonable for entire symbols you no longer need.

I think it even works for the particular overloads. All we have to do
is make the resolved overload table store weak pointers. Each type
will have a reference to the module that contains it for Koenig lookup
purposes. If there are no references to a module and no references to
its types, the module and all its overloads will vanish. Hmm. On
the other hand, I suppose types having references to their namespaces
would keep too much stuff alive.
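The weak-pointer scheme can be sketched with std::weak_ptr standing in for whatever the duck runtime would actually use; the Overload type and helper names here are hypothetical:

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// A resolved overload, owned by its defining "module" (a shared_ptr).
struct Overload { std::string name; };

// The resolved-overload table holds only weak references, so the table
// by itself keeps nothing alive: an overload lives exactly as long as
// some module still references it.
using OverloadTable = std::vector<std::weak_ptr<Overload>>;

// Dead entries show up as expired weak pointers and can be swept.
// Returns how many entries were removed.
std::size_t sweep(OverloadTable& table) {
  std::size_t removed = 0;
  for (auto it = table.begin(); it != table.end();) {
    if (it->expired()) { it = table.erase(it); ++removed; }
    else ++it;
  }
  return removed;
}
```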

In any case, this seems workable, so maybe Koenig lookup is a
sufficient solution. If so, it's probably something we don't need to
worry about now.

Geoffrey
