macro/static/const


Dylan Simon

May 15, 2011, 6:49:27 PM
to duck...@googlegroups.com
I have an incomplete prototype of this working (it's in my github
https://github.com/dylex/duck if you want to take a look). So far I have
called everything "static", though "const" may be a better name.

After going round and round trying to figure out how to do a macro expression
expansion step before inference that in turn depends on an overload mechanism,
I came up with something I'm actually quite happy with. It's a fairly small
amount of actual code change (though I did some reorganization and
optimization along the way, too) that affects inference without really
impacting any other phase.

The basic idea is simply to keep values in types:

        TyStatic TypeVal Value

Whenever something is evaluated in static context, type inference creates
TyStatic types containing the actual values, and is in effect a mini,
dynamically-typed interpreter.

Static context is created during a static argument:

        static :: static a -> a
        static x = x

Since TyStatic is a type, we can create separate overloads for separate
values without needing to create new functions or even to change any
expressions. (Though, without constant folding it's a little excessive.)
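To make the idea concrete, here is a toy model (plain Haskell with illustrative names, not duck's actual Lir definitions) of a type language whose types can carry values:

```haskell
-- Toy model of a type language whose types can carry values.
-- All names here are illustrative, not duck's real constructors.
data Value = ValInt Int | ValBool Bool | ValChar Char
  deriving (Eq, Show)

data TypeVal
  = TyInt
  | TyBool
  | TyChar
  | TyStatic TypeVal Value   -- a type together with a known value
  deriving (Eq, Show)

-- Strip the static wrapper to get the ordinary runtime type.
erase :: TypeVal -> TypeVal
erase (TyStatic t _) = t
erase t              = t

main :: IO ()
main = do
  let t1 = TyStatic TyBool (ValBool True)
      t2 = TyStatic TyBool (ValBool False)
  -- Two TyStatics with different values are different types,
  -- so they can select different overloads...
  print (t1 == t2)              -- False
  -- ...but erase to the same runtime type.
  print (erase t1 == erase t2)  -- True
```

The point is that distinct static values behave as distinct types during overload resolution, while the value component disappears once erased.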

In effect, this allows type inference to accept programs based on static
values that it wouldn't otherwise accept, which really is the motivation for
all this. Thus we can use the same exact code at runtime and still guarantee
type safety (even though branches that don't happen may have different types).

Example:

        s :: static Bool -> a
        s True = 0
        s False = 'x'
        put (s False)
        exit (s True)

This does what you expect, but without the static, it fails.
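A toy sketch of how inference can accept this program (hypothetical Haskell, not the actual Infer code): each static value of the argument selects its own overload, so the branches are free to have different types.

```haskell
-- Illustrative model of resolving the overloads of
--   s :: static Bool -> a
--   s True = 0 ; s False = 'x'
-- using the static value to pick the result type.
data Value = ValBool Bool deriving (Eq, Show)
data TypeVal = TyInt | TyChar | TyBool | TyStatic TypeVal Value
  deriving (Eq, Show)

-- One overload per static argument value.
inferS :: TypeVal -> Maybe TypeVal
inferS (TyStatic TyBool (ValBool True))  = Just TyInt   -- s True = 0
inferS (TyStatic TyBool (ValBool False)) = Just TyChar  -- s False = 'x'
inferS _ = Nothing  -- without static, the branch types conflict

main :: IO ()
main = do
  print (inferS (TyStatic TyBool (ValBool True)))   -- Just TyInt
  print (inferS (TyStatic TyBool (ValBool False)))  -- Just TyChar
  print (inferS TyBool)                             -- Nothing
```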

I think this is pretty cool, but I'm not entirely sure how it plays with other
future pieces. It adds another variable to compile-time overload-resolution,
but hopefully not a significant one.

Things not yet done:

- LIR conversion needs to appropriately mark lifted arguments as static. This
may need a little code but I think the logic is obvious.

- Some or all primitives should be evaluated statically as requested. As it
is, I think this would result in IO primitives happening some unexpectedly
large number of times.

- I did a little trick that makes constructors like "True" have static types
without needing an extra assignment. This may or may not be
desirable. On the other hand "Just True", which perhaps could be static, is
not. Both of these relate to determining the exact syntax and semantics for
top-level definitions. Like functions, they will have to be re-evaluated
during static evaluation. I believe this is a minor issue, though.

- Performance. I seem to have made performance worse. I'm doing a number of
extra and unnecessary overload resolutions to determine static annotations,
but with appropriate caching I think these should all go away.

- printf. Once all the above is done, printf might "just work."

Geoffrey Irving

May 17, 2011, 1:02:01 PM
to duck...@googlegroups.com
On Sun, May 15, 2011 at 3:49 PM, Dylan Simon <dylan-...@dylex.net> wrote:
> I have an incomplete prototype of this working (it's in my github
> https://github.com/dylex/duck if you want to take a look).  So far I have
> called everything "static", though "const" may be a better name.

Somehow static seems better to me, but either one is fine. I'll look
through the branch tomorrow night.

> After going round and round trying to figure out how to do a macro expression
> expansion step before inference that in turn depends on an overload mechanism,
> I came up with something I'm actually quite happy with.  It's a fairly small
> amount of actual code change (though I did some reorganization and
> optimization along the way, too) that affects inference without really
> impacting any other phase.
>
> The basic idea is simply to keep values in types:
>
>        TyStatic TypeVal Value
>
> Whenever something is evaluated in static context, type inference creates
> TyStatic types containing the actual values, and is in effect a mini,
> dynamically-typed interpreter.
>
> Static context is created during a static argument:
>
>        static :: static a -> a
>        static x = x
>
> Since TyStatic is a type, we can create separate overloads for separate
> values without needing to create new functions or even to change any
> expressions.  (Though, without constant folding it's a little excessive.)

TyStatic seems like the perfect way to do it. The proliferation of
overloads does sound like an issue, but I suppose it's something to
resolve later.

How do you implement comparison on TyStatic values? Eventually we'd
want to compare and hash them based on the actual == and hash
overloads defined by the user, but that doesn't seem feasible just
yet.

> In effect, this allows type inference to accept programs based on static
> values that it wouldn't otherwise accept, which really is the motivation for
> all this.  Thus we can use the same exact code at runtime and still guarantee
> type safety (even though branches that don't happen may have different types).
>
> Example:
>
>        s :: static Bool -> a
>        s True = 0
>        s False = 'x'
>        put (s False)
>        exit (s True)
>
> This does what you expect, but without the static, it fails.
>
> I think this is pretty cool, but I'm not entirely sure how it plays with other
> future pieces.  It adds another variable to compile-time overload-resolution,
> but hopefully not a significant one.

Agreed about the coolness.

> Things not yet done:
>
> - LIR conversion needs to appropriately mark lifted arguments as static.  This
>  may need a little code but I think the logic is obvious.

Yep, that sounds straightforward. As you're going through and
collecting the free variables, also collect whether they're used in a
static context or not...oh. Actually that seems a bit tricky. I
guess how easy it is depends on how non-first-class functions with
static arguments are. In the current structure, can you determine
whether something is in a static context without knowing the types of
things?

> - Some or all primitives should be evaluated statically as requested.  As it
>  is, I think this would result in IO primitives happening some unexpectedly
>  large number of times.
>
> - I did a little trick that makes constructors like "True" have static types
>  without needing an extra assignment.  This may or may not be
>  desirable.  On the other hand "Just True", which perhaps could be static, is
>  not.  Both of these relate to determining the exact syntax and semantics for
>  top-level definitions.  Like functions, they will have to be re-evaluated
>  during static evaluation.  I believe this is a minor issue, though.

I think we normally want to default to nonstatic to avoid blowup in
the number of overloads, but I may not understand what the trick is
for. Can you give an example?

> - Performance.  I seem to have made performance worse.  I'm doing a number of
>  extra and unnecessary overload resolutions to determine static annotations,
>  but with appropriate caching I think these should all go away.

Is staticness independent of the types of variables, so that the extra
logic could be performed before type inference / specialization? This
is related to the difficulty of marking lifted function arguments as
static.

> - printf.  Once all the above is done, printf might "just work."

Very cool.

Geoffrey

Dylan Simon

May 17, 2011, 8:29:21 PM
to duck...@googlegroups.com
> > The basic idea is simply to keep values in types:
> >
> >        TyStatic TypeVal Value

> >
> > Whenever something is evaluated in static context, type inference creates
> > TyStatic types containing the actual values, and is in effect a mini,
> > dynamically-typed interpreter.
> >
> > Static context is created during a static argument:
> >
> >        static :: static a -> a
> >        static x = x

> >
> > Since TyStatic is a type, we can create separate overloads for separate
> > values without needing to create new functions or even to change any
> > expressions.  (Though, without constant folding it's a little excessive.)

>
> TyStatic seems like the perfect way to do it. The proliferation of
> overloads does sound like an issue, but I suppose it's something to
> resolve later.
>
> How do you implement comparison on TyStatic values? Eventually we'd
> want to compare and hash them based on the actual == and hash
> overloads defined by the user, but that doesn't seem feasible just
> yet.

I actually hadn't dealt with that at first, and was just going by pointers.
I've now started a TypedValue type with Ord instances (and might move all
the Prettyval stuff there, too).
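The TypedValue idea might look something like the following (an illustrative Haskell sketch, not the actual duck source): pairing a type with a value and deriving structural Eq/Ord instead of comparing by pointer.

```haskell
-- Hypothetical sketch of TypedValue: a type paired with a value,
-- compared structurally rather than by pointer.
data Value = ValInt Int | ValCons String [Value]
  deriving (Eq, Ord, Show)
data TypeVal = TyInt | TyCons String [TypeVal]
  deriving (Eq, Ord, Show)

data TypedValue = TypedValue TypeVal Value
  deriving (Eq, Ord, Show)

main :: IO ()
main = do
  let just v = TypedValue (TyCons "Maybe" [TyInt])
                          (ValCons "Just" [ValInt v])
  print (just 1 == just 1)           -- True: structural equality
  print (compare (just 1) (just 2))  -- LT
```

Deriving Ord this way is only a stopgap; as noted above, the eventual goal is to use the user-defined (==) and hash overloads.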

Speaking of users' (==), we probably want some sort of "automatic" feature,
analogous to "deriving". I also think it should be easy to implement SYB-like
features, and I keep being tempted to change our code to use Data.Generics
already, but that's a separate issue.

> > Things not yet done:
> >
> > - LIR conversion needs to appropriately mark lifted arguments as static.  This
> >  may need a little code but I think the logic is obvious.


>
> Yep, that sounds straightforward. As you're going through and
> collecting the free variables, also collect whether they're used in a
> static context or not...oh. Actually that seems a bit tricky. I
> guess how easy it is depends on how non-first-class functions with
> static arguments are. In the current structure, can you determine
> whether something is in a static context without knowing the types of
> things?

I was thinking about it the other way around (outside-in: static arguments
are passed on as static), but now I'm not sure. I'll have to think about it
some more. Right now functions can return static values, but that may be a
bad idea without explicit annotation.

> > - Some or all primitives should be evaluated statically as requested.  As it
> >  is, I think this would result in IO primitives happening some unexpectedly
> >  large number of times.


> >
> > - I did a little trick that makes constructors like "True" have static types

> >  without needing an extra assignment.  This may or may not be
> >  desirable.  On the other hand "Just True", which perhaps could be static, is
> >  not.  Both of these relate to determining the exact syntax and semantics for
> >  top-level definitions.  Like functions, they will have to be re-evaluated
> >  during static evaluation.  I believe this is a minor issue, though.


>
> I think we normally want to default to nonstatic to avoid blowup in
> the number of overloads, but I may not understand what the trick is
> for. Can you give an example?

It was more a hack because I don't have static globals yet. I agree making it
explicit sounds better and this should go away.

> > - Performance.  I seem to have made performance worse.  I'm doing a number of
> >  extra and unnecessary overload resolutions to determine static annotations,
> >  but with appropriate caching I think these should all go away.


>
> Is staticness independent of the types of variables, so that the extra
> logic could be performed before type inference / specialization? This
> is related to the difficulty of marking lifted function arguments as
> static.

Maybe. You do need to know types to resolve overloads, which affect static
context, but not static variables. I might end up coming up with some
restrictions that make this feasible. Mainly I wanted to make sure this was a
reasonable direction.

Geoffrey Irving

May 21, 2011, 2:51:43 AM
to duck...@googlegroups.com
On Tue, May 17, 2011 at 5:29 PM, Dylan Simon <dylan-...@dylex.net> wrote:
>> > The basic idea is simply to keep values in types:
>> >
>> >        TyStatic TypeVal Value
>> >
>> > Whenever something is evaluated in static context, type inference creates
>> > TyStatic types containing the actual values, and is in effect a mini,
>> > dynamically-typed interpreter.
>> >
>> > Static context is created during a static argument:
>> >
>> >        static :: static a -> a
>> >        static x = x
>> >
>> > Since TyStatic is a type, we can create separate overloads for separate
>> > values without needing to create new functions or even to change any
>> > expressions.  (Though, without constant folding it's a little excessive.)
>>
>> TyStatic seems like the perfect way to do it.  The proliferation of
>> overloads does sound like an issue, but I suppose it's something to
>> resolve later.
>>
>> How do you implement comparison on TyStatic values?  Eventually we'd
>> want to compare and hash them based on the actual == and hash
>> overloads defined by the user, but that doesn't seem feasible just
>> yet.
>
> I actually hadn't dealt with that at first, and was just going by pointers.
> I've now started a TypedValue type with Ord instances (and might move all
> the Prettyval stuff there, too).

Sounds good.

> Speaking of users' (==), we probably want some sort of "automatic" feature,
> analogous to "deriving".  I also think it should be easy to implement SYB-like
> features, and I keep being tempted to change our code to use Data.Generics
> already, but that's a separate issue.

Definitely. The most general way to do that is to have a factory
function which generates arbitrary code and stuffs it into the
program. Something like

genericEq :: a -> a -> Bool
genericEq x y =
  static f = compile genericEqCode (Type x)
  f x y

where compile is a primitive that converts Lir into a value. I
imagine we'll want less extreme versions as well. Actually the static
bit isn't really necessary, since compile can be a macro.

I'm finally looking through the code now. Here are miscellaneous comments:

1. It's interesting that all the type set functions ignore the value
of TyStatic. I suppose it makes sense, though. There are a couple
places (e.g., in printing) where we may want to pay attention to the
value at some point.
2. At least in the intermediate commit I'm looking at now
(afe22c38bb), it seems like you're not using context to determine
static evaluation. The way I'd imagined it working is that before
inferring an expression, we'd check to see whether it's being
evaluated in static context. If so, we'd enforce that all free
variables were static (I suppose that means checking whether they have
static types).
3. Relatedly, "case" should always enforce that its branches have the
same type, regardless of whether the argument is static. If you want
a case which requires a static argument and doesn't enforce matching
types, one should use scase.
4. I suppose there's an issue with static expressions being evaluated
more than once. I'm not sure I see any way to avoid that other than
generating specialized Lir during type inference, which seems
heavy-handed. On the other hand, we might have to do that anyway for
backend and optimization purposes. It might end up cleaner; the
specialized IR would be completely free of static specifiers, so Infer
wouldn't have to worry about static at all.
5. TypedValue should just be called Any. We'll want it for other
purposes as well (e.g., heterogeneous data structures in normal duck
code).
6. I think shifting to a context-centric approach to statics will also
help performance. For example, the new primType has to check whether
its arguments are static in every single call. In contrast, if we
determined staticness based on context, we'd check the free variables
for staticness once for each static context, and switch briefly from
Infer to Interp to do the evaluation.
7. At a higher level, it seems like the right approach involves Infer
calling directly into Interp. Currently Infer is still independent of
Interp, right?
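The case/scase distinction from point 3 could be sketched as follows (toy Haskell with made-up names): case must unify its branch types regardless of staticness, while scase uses a static scrutinee to type only the branch actually taken.

```haskell
-- Illustrative typing rules for case vs scase.
data TypeVal = TyInt | TyChar | TyBool | TyStatic TypeVal Bool
  deriving (Eq, Show)

unify :: TypeVal -> TypeVal -> Maybe TypeVal
unify a b | a == b    = Just a
          | otherwise = Nothing

-- case: branch types must match regardless of the scrutinee.
inferCase :: TypeVal -> TypeVal -> Maybe TypeVal
inferCase tTrue tFalse = unify tTrue tFalse

-- scase: a static Bool scrutinee selects a single branch type.
inferSCase :: TypeVal -> TypeVal -> TypeVal -> Maybe TypeVal
inferSCase (TyStatic _ b) tTrue tFalse = Just (if b then tTrue else tFalse)
inferSCase _ _ _ = Nothing  -- scase demands a static argument

main :: IO ()
main = do
  print (inferCase TyInt TyChar)                         -- Nothing
  print (inferSCase (TyStatic TyBool True) TyInt TyChar) -- Just TyInt
```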

Geoffrey

Dylan Simon

May 21, 2011, 11:57:39 AM
to duck...@googlegroups.com
> I'm finally looking through the code now. Here are miscellaneous comments:
>
> 1. It's interesting that all the type set functions ignore the value
> of TyStatic. I suppose it makes sense, though. There are a couple
> places (e.g., in printing), where we may want to pay attention to the
> value at some point.

Yes, TyStatic prints with the value now. There is a sort of strange situation
with unify, where two TyStatics with different values really shouldn't unify,
though really we should not end up in that situation anyway.

> 2. At least in the intermediate commit I'm looking at now
> (afe22c38bb), it seems like you're not using context to determine
> static evaluation. The way I'd imagined it working is that before
> inferring an expression, we'd check to see whether it's being
> evaluated in static context. If so, we'd enforce that all free
> variables were static (I suppose that means checking whether they have
> static types).
> 3. Relatedly, "case" should always enforce that its branches have the
> same type, regardless of whether the argument is static. If you want
> a case which requires a static argument and doesn't enforce matching
> types, one should use scase.

I was trying to be smart in determining what should be static. Last night I
finally came around and now agree that to have 100% coherent semantics, it
does need to be explicit. In Lir, I'm changing ExpLet to "ExpLet Trans Var Exp Exp"
(which can also be used to create delays, why not), and might do the same with
case or maybe add a bool.

I realized I only really had two objections to "scase". One was that I wanted
function cases to work, but I've realized this can be done anyway just by
using "scase" when expanding static argument matches. The second was that I
don't like the name "scase", but that's obviously a minor issue. Maybe
"#case" or something.

> 4. I suppose there's an issue with static expressions being evaluated
> more than once. I'm not sure I see any way to avoid that other than
> generating specialized Lir during type inference, which seems
> heavy-handed. On the other hand, we might have to do that anyway for
> backend and optimization purposes. It might end up cleaner; the
> specialized IR would be completely free of static specifiers, so Infer
> wouldn't have to worry about static at all.

Yeah, I was trying to figure out something like this at first, and realized
that, if we're replacing Lir Exps, it can only be done as part of Infer, which
seemed too messy. If we do go that way with this, though, I think we can
replace a lot of TyStatic with AtomVal (though we still need it for closures
and locals).

> 5. TypedValue should just be called Any. We'll want it for other
> purposes as well (e.g., heterogeneous data structures in normal duck
> code).

I'm all for a shorter name, but I have to admit I don't understand "Any".

> 6. I think shifting to a context-centric approach to statics will also
> help performance. For example, the new primType has to check whether
> its arguments are static in every single call. In contrast, if we
> determined staticness based on context, we'd check the free variables
> for staticness once for each static context, and switch briefly from
> Infer to Interp to do the evaluation.

I think the major performance issues right now are that there's not enough
information in the Ptrie (so the staticFunction/overload call is slow). In
the best of worlds, the overload tree, for an incomplete application, could
include the list of valid overloads in the tree, so that all of the already
applied arguments don't need to be re-inferred and staticFunction can be a
single lookup. Alternatively, it could go in the closure itself.

> 7. At a higher level, it seems like the right approach involves Infer
> calling directly into Interp. Currently Infer is still independent of
> Interp, right?

Interp depends on Infer heavily, of course. But aside from the complication
of cyclic dependencies, Interp doesn't really do what we need at the moment.
For one, it depends on the global environment already being populated, which
is not and cannot be the case in static evaluation. It also returns Values
instead of TypedValues. But the critical issue is that it doesn't do any
inference -- it calls into Infer. What we actually need is a true stand-alone
dynamically-typed interpreter that does inference and evaluation in parallel.
I started writing an Eval module to do that, but realized that it would be
quite a large amount of code which would largely duplicate Infer. On the
other hand, the biggest chunks of code added to Infer now ("scase" and static
argument handling) would still have to be done in Infer, since it wants to
statically evaluate the atom and the matches, but then just (non-statically)
infer the branches.

I think, however, that if we disallowed overloaded static functions, all of
this complexity goes away, and we could do simple replacements pre-inference
without caring about types (true expression macros, much like Template
Haskell).

Geoffrey Irving

May 21, 2011, 1:54:22 PM
to duck...@googlegroups.com, duck...@googlegroups.com

I don't like #case since it looks like a comment. Let's go with scase unless we think of a better single letter suffix or prefix (something that works on "if" as well).

>
>> 4. I suppose there's an issue with static expressions being evaluated
>> more than once. I'm not sure I see any way to avoid that other than
>> generating specialized Lir during type inference, which seems
>> heavy-handed. On the other hand, we might have to do that anyway for
>> backend and optimization purposes. It might end up cleaner; the
>> specialized IR would be completely free of static specifiers, so Infer
>> wouldn't have to worry about static at all.
>
> Yeah, I was trying to figure out something like this at first, and realized
> that, if we're replacing Lir Exps, it can only be done as part of Infer, which
> seemed too messy. If we do go that way with this, though, I think we can
> replace a lot of TyStatic with AtomVal (though we still need it for closures
> and locals).
>
>> 5. TypedValue should just be called Any. We'll want it for other
>> purposes as well (e.g., heterogeneous data structures in normal duck
>> code).
>
> I'm all for a shorter name, but I have to admit I don't understand "Any".

Any as in "can have any type" or "any value whatsoever". The name is stolen from boost::any, though I think it's a fairly standard choice. Pixar used the same name, for example.

>
>> 6. I think shifting to a context-centric approach to statics will also
>> help performance. For example, the new primType has to check whether
>> its arguments are static in every single call. In contrast, if we
>> determined staticness based on context, we'd check the free variables
>> for staticness once for each static context, and switch briefly from
>> Infer to Interp to do the evaluation.
>
> I think the major performance issues right now are that there's not enough
> information in the Ptrie (so the staticFunction/overload call is slow). In
> the best of worlds, the overload tree, for an incomplete application, could
> include the list of valid overloads in the tree, so that all of the already
> applied arguments don't need to be re-inferred and staticFunction can be a
> single lookup. Alternatively, it could go in the closure itself.

Agreed.

>
>> 7. At a higher level, it seems like the right approach involves Infer
>> calling directly into Interp. Currently Infer is still independent of
>> Interp, right?
>
> Interp depends on Infer heavily, of course. But aside from the complication
> of cyclic dependencies, Interp doesn't really do what we need at the moment.
> For one, it depends on the global environment already being populated, which
> is not and cannot be the case in static evaluation. It also returns Values
> instead of TypedValues. But the critical issue is that it doesn't do any
> inference -- it calls into Infer. What we actually need is a true stand-alone
> dynamically-typed interpreter that does inference and evaluation in parallel.
> I started writing an Eval module to do that, but realized that it would be
> quite a large amount of code which would largely duplicate Infer. On the
> other hand, the biggest chunks of code added to Infer now ("scase" and static
> argument handling) would still have to be done in Infer, since it wants to
> statically evaluate the atom and the matches, but then just (non-statically)
> infer the branches.

The critical issue isn't an issue at all. If you need to evaluate an expression statically, just call Infer first to get the type and Interp second to get the value.

> I think, however, that if we disallowed overloaded static functions, all of
> this complexity goes away, and we could do simple replacements pre-inference
> without caring about types (true expression macros, much like template
> haskell).

Would you still need to be running inference and some form of evaluation as you go?

Geoffrey

Geoffrey Irving

May 21, 2011, 2:03:13 PM
to duck...@googlegroups.com, duck...@googlegroups.com
A note about Infer / Interp interoperability: what if, instead of storing static values in types (or maybe in addition to), Infer maintained global and local environments of static variables? That way you'd have everything ready for a call from Infer into Interp whenever you hit a static context.

On the other hand, you'd still want to do a preliminary check of the free variables, so maybe building the environment would naturally happen during that check. So probably never mind.

Geoffrey


Geoffrey Irving

May 21, 2011, 3:19:14 PM
to duck...@googlegroups.com
On Sat, May 21, 2011 at 8:57 AM, Dylan Simon <dylan-...@dylex.net> wrote:
> I think, however, that if we disallowed overloaded static functions, all of
> this complexity goes away, and we could do simple replacements pre-inference
> without caring about types (true expression macros, much like template
> haskell).

Could you elaborate further on this? In particular, would it avoid
the need for TyStatic? I'm a little concerned that the TyStatic way
leads to an extreme cache size blowup. On the other hand, I think
overloaded macros are quite convenient, so it would be a shame to lose
them.

Geoffrey

Dylan Simon

May 21, 2011, 6:40:05 PM
to duck...@googlegroups.com
> The critical issue isn't an issue at all. If you need to evaluate an
> expression statically, just call Infer first to get the type and Interp
> second to get the value.

The problem is that the types in a static expression depend on the values
(and, because of overloading, vice-versa). I'm pretty sure you really need to
compute both simultaneously, and in the current scheme they'd just end up
calling back and forth continuously.

> > I think, however, that if we disallowed overloaded static functions, all of
> > this complexity goes away, and we could do simple replacements pre-inference
> > without caring about types (true expression macros, much like template
> > haskell).
>
> Could you elaborate further on this? In particular, would it avoid
> the need for TyStatic? I'm a little concerned that the TyStatic way
> leads to an extreme cache size blowup. On the other hand, I think
> overloaded macros are quite convenient, so it would be a shame to lose
> them.

One issue is that functions have both static and non-static arguments, and
there is really no way around this, so staticness may depend on the types of
earlier arguments. Making sure static arguments are always first may be
sufficient to side-step this issue.

My original design for all this was to do exactly that, including overloading
on Macro functions. I went so far as to write a generic Overload resolver and
started writing Eval. But after struggling with this I realized I would have
to reimplement all of Infer. This led me to try a two-stage inference
approach, where the first stage did static and the second did non-static.
This created TyStatic, but made it clear that first stage would basically have
to do a lot of the second stage anyway, and there was no reason to do them
separately.

Sorry, this is sort of unsatisfactory I know, and in some sense the answer is
that I just couldn't figure out how to do it any other way. Maybe with my
slightly better understanding I can try again.

I think the case of recursive functions that take both static and non-static
arguments encompasses all of the complication. When you're in pure static
evaluation mode, things are easy enough, Eval or Interp are reasonable.
However, in a function with static arguments, you are not in static mode, but
are carrying around values that may be used in static mode. Furthermore, even
when you are in static mode, you have to do both static and non-static
function overloading (and the non-static resolutions should work the same as
in non-static mode).

The resulting observation that "static arg" exactly means "overload this
argument on its value too" led me here.


The only real issue is, as you say, blowup in the size of the overload table.
One thing I've always been worried about is how normal overload resolutions
really will happen at runtime. The current approach of calling back into
Infer continuously clearly won't work, and neither does simply carrying around
types. Whether these static values are really a problem seems to depend on
the solution to overloads in general.

Geoffrey Irving

May 22, 2011, 7:03:28 PM
to duck...@googlegroups.com
On Sat, May 21, 2011 at 3:40 PM, Dylan Simon <dylan-...@dylex.net> wrote:
>> The critical issue isn't an issue at all.  If you need to evaluate an
>> expression statically, just call Infer first to get the type and Interp
>> second to get the value.
>
> The problem is that the types in a static expression depend on the values
> (and, because of overloading, vice-versa).  I'm pretty sure you really need to
> compute both simultaneously, and in the current scheme they'd just end up
> calling back and forth continuously.

I'm still convinced it would work fine. Here's one way to look at it:

1. There are three varieties of environment, which map variable names
to (1) types, (2) static values, and (3) dynamic values. Infer knows
about only the first two.
2. When Infer sees an expression in a static context, it first checks
that all free variables in the expression have static values. It then
(1) infers the type of the expression normally, and (2) calls Interp
with the dynamic value environment equal to the static value
environment.
3. Since Infer on an expression never calls Interp on the same
expression, there are no loops in the call chain.
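A minimal model of this scheme (illustrative Haskell; names like inferStatic are made up, and the environments are simplified to a single map):

```haskell
import qualified Data.Map as Map

-- Toy model of the proposal above: Infer keeps a static-value
-- environment; on a static context it first checks that every free
-- variable is static, then hands those values to a mini Interp as the
-- dynamic environment.
type Name = String
data Value = ValInt Int deriving (Eq, Show)
data Exp = Var Name | Lit Int | Add Exp Exp deriving Show

type StaticEnv = Map.Map Name Value

-- Free variables of an expression.
free :: Exp -> [Name]
free (Var x)   = [x]
free (Lit _)   = []
free (Add a b) = free a ++ free b

-- The mini interpreter, evaluating against a value environment.
interp :: StaticEnv -> Exp -> Value
interp env (Var x)   = env Map.! x
interp _   (Lit n)   = ValInt n
interp env (Add a b) = let ValInt x = interp env a
                           ValInt y = interp env b
                       in ValInt (x + y)

-- "Infer" on a static context: check free variables first, then call
-- Interp once; Infer never loops back into Interp on the same expression.
inferStatic :: StaticEnv -> Exp -> Either String Value
inferStatic env e
  | all (`Map.member` env) (free e) = Right (interp env e)
  | otherwise                       = Left "free variable is not static"

main :: IO ()
main = do
  print (inferStatic (Map.fromList [("x", ValInt 2)]) (Add (Var "x") (Lit 3)))
  print (inferStatic Map.empty (Var "y"))
```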

One note: a slightly unsatisfactory aspect of TyStatic is that it
doesn't encode the fact that TyStatic only occurs in the top level of
types. I.e., only variables can have TyStatic types, not subparts of
types.

Geoffrey

Dylan Simon

May 22, 2011, 7:17:48 PM
to duck...@googlegroups.com
From Geoffrey Irving <irv...@naml.us>, Sun, May 22, 2011 at 04:03:28PM -0700:

> On Sat, May 21, 2011 at 3:40 PM, Dylan Simon <dylan-...@dylex.net> wrote:
> >> The critical issue isn't an issue at all.  If you need to evaluate an

> >> expression statically, just call Infer first to get the type and Interp
> >> second to get the value.
> >
> > The problem is that the types in a static expression depend on the values
> > (and, because of overloading, vice-versa).  I'm pretty sure you really need to

> > compute both simultaneously, and in the current scheme they'd just end up
> > calling back and forth continuously.
>
> I'm still convinced it would work fine. Here's one way to look at it:
>
> 1. There are three varieties of environment, which map variable names
> to (1) types, (2) static values, and (3) dynamic values. Infer knows
> about only the first two.
> 2. When Infer sees an expression in a static context, it first checks
> that all free variables in the expression have static values. It then
> (1) infers the type of the expression normally, and (2) calls Interp
> with the dynamic value environment equal to the static value
> environment.
> 3. Since Infer on an expression never calls Interp on the same
> expression, there are no loops in the call chain.

Maybe...

One immediate problem with this is that Interp would need to evaluate global
definitions on demand, instead of relying on their already having been
evaluated.
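The on-demand evaluation in question would amount to memoizing globals, roughly like this toy Python sketch (hypothetical names; not duck's code):

```python
# Sketch: global definitions evaluated lazily and cached, so Interp
# can ask for a global's value before any whole-program evaluation
# pass has run.  Illustrative only.

class Globals:
    def __init__(self, definitions):
        self.definitions = definitions  # name -> thunk taking this env
        self.cache = {}
        self.in_progress = set()

    def value(self, name):
        if name in self.cache:
            return self.cache[name]
        if name in self.in_progress:
            raise RuntimeError("cyclic global definition: " + name)
        self.in_progress.add(name)
        v = self.definitions[name](self)   # evaluate on demand
        self.in_progress.discard(name)
        self.cache[name] = v               # never evaluated twice
        return v

g = Globals({
    "x": lambda env: 1,
    "y": lambda env: env.value("x") + 1,
})
```

Here asking for `y` first forces `x` transparently, and the cache ensures each definition is evaluated at most once.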

This would also cause redundant evaluation of nested static environments.

I'll think about it some more.

> One note: a slightly unsatisfactory aspect of TyStatic is that it
> doesn't encode the fact that TyStatic only occurs in the top level of
> types. I.e., only variables can have TyStatic types, not subparts of
> types.

I agree, and I've thought about making infer instead return something like
(TypeVal, Maybe Value). Currently TyStatic appears in closures too, but that
could be solved by exchanging the order.
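The suggested shape, where staticness lives in Infer's result rather than inside the type, might look like this as a toy Python model (illustrative names, not duck's Haskell):

```python
# Sketch of "infer returns (TypeVal, Maybe Value)": the static value
# is a property of the inference result, so nothing like TyStatic
# can ever nest inside a subpart of a type.  Illustrative only.

from typing import Optional, Tuple

def infer_expr(expr, type_env, static_env) -> Tuple[str, Optional[int]]:
    if expr[0] == "lit":
        return "int", expr[1]  # literals are always static
    if expr[0] == "var":
        # static only if the variable has a known static value
        return type_env[expr[1]], static_env.get(expr[1])
    if expr[0] == "add":
        t1, v1 = infer_expr(expr[1], type_env, static_env)
        t2, v2 = infer_expr(expr[2], type_env, static_env)
        assert t1 == t2 == "int"
        # the result is static only if both operands are
        v = v1 + v2 if v1 is not None and v2 is not None else None
        return "int", v
    raise ValueError("unknown expression")
```

An expression mixing a static literal with a dynamic variable then infers to `("int", None)`: well typed, but not static.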

Geoffrey Irving

unread,
May 22, 2011, 8:05:12 PM5/22/11
to duck...@googlegroups.com

The redundant evaluation is the same problem as the redundant
evaluation between Interp and Infer. I'm not sure there's any way of
solving it other than having Infer spit out type specialized code.

Geoffrey

Geoffrey Irving

unread,
May 22, 2011, 8:22:29 PM5/22/11
to duck...@googlegroups.com

Yeah, thinking about it a bit further, we should probably bite the
bullet and spit out specialized code from Infer. Slir, perhaps?
Optimization passes will mostly want to operate on specialized IR
anyways. The design is pretty easy; just take the Interp module and
add a type anywhere we call back into Infer. Once Interp no longer
needs to call Infer, we'll have the definition of Slir. I hate the
name Slir, though, so we'll have to think of something better.

Geoffrey

Geoffrey Irving

unread,
May 22, 2011, 8:26:13 PM5/22/11
to duck...@googlegroups.com

By the way: let me know if you want me to take a crack at any of this.
I've been very lazy for the last week or so, and now I'm somewhat
leery of ripping anything apart before the macro stuff stabilizes.

Geoffrey

Geoffrey Irving

unread,
May 22, 2011, 8:31:51 PM5/22/11
to duck...@googlegroups.com

A happy thought: spitting out specialized IR from Infer would serve as
a constructive proof that there are no loops in the Infer/Interp
dance.

Geoffrey

Dylan Simon

unread,
May 22, 2011, 8:32:30 PM5/22/11
to duck...@googlegroups.com
> >> This would also cause redundant evaluation of nested static environments.
> >
> > The redundant evaluation is the same problem as the redundant
> > evaluation between Interp and Infer. I'm not sure there's any way of

> > solving it other than having Infer spit out type specialized code.
>
> Yeah, thinking about it a bit further, we should probably bite the
> bullet and spit out specialized code from Infer. Slir, perhaps?
> Optimization passes will mostly want to operate on specialized IR
> anyways. The design is pretty easy; just take the Interp module and
> add a type anywhere we call back into Infer. Once Interp no longer
> needs to call Infer, we'll have the definition of Slir. I hate the
> name Slir, though, so we'll have to think of something better.

It seems like this means creating specialized versions of each function for
each set of argument types, right? Otherwise there is no single type. If we
did that, we wouldn't actually need types in Interp/Slir at all, since they're
only needed for overloads, and then you could just put the right version of
the function in place.
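The "put the right version in place" idea can be sketched in toy Python form (hypothetical names; duck's real overload table is more involved):

```python
# Sketch of specialization: each (function, argument-type tuple) pair
# gets its own version, resolved once at "Infer" time, so the
# specialized IR needs no runtime type dispatch.  Illustrative only.

overloads = {
    ("add", ("int", "int")): lambda a, b: a + b,
    ("add", ("str", "str")): lambda a, b: a + b,
}

specialized = {}

def specialize(name, arg_types):
    """Resolve an overload once; the result is a direct function
    reference usable without any types at all."""
    key = (name, arg_types)
    if key not in specialized:
        specialized[key] = overloads[key]
    return specialized[key]

# Call sites in specialized code point straight at their version:
add_ii = specialize("add", ("int", "int"))
add_ss = specialize("add", ("str", "str"))
```

Each call site then jumps directly to its own version, and repeated specialization of the same key returns the same function object.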

> By the way: let me know if you want me to take a crack at any of this.
> I've been very lazy for the last week or so, and now I'm somewhat
> leery of ripping anything apart before the macro stuff stabilizes.

Hopefully I'm just a few more steps away from static being working, and then
we'll figure out what comes next. In general getting inference fully sorted
out first seems to make sense.

As it is I've tried to make sure everything I push up (to my repo) still
passes all the non-static test cases, though, so unrelated changes can still
be made and tested.

I also just changed the way the overload Ptrie works to cache partial
resolution options. It ended up not actually helping performance that much.
You're likely right that a mutable overload table is the way to go.

Geoffrey Irving

unread,
May 22, 2011, 8:40:19 PM5/22/11
to duck...@googlegroups.com
On Sun, May 22, 2011 at 5:32 PM, Dylan Simon <dylan-...@dylex.net> wrote:
>> >> This would also cause redundant evaluation of nested static environments.
>> >
>> > The redundant evaluation is the same problem as the redundant
>> > evaluation between Interp and Infer. I'm not sure there's any way of
>> > solving it other than having Infer spit out type specialized code.
>>
>> Yeah, thinking about it a bit further, we should probably bite the
>> bullet and spit out specialized code from Infer.  Slir, perhaps?
>> Optimization passes will mostly want to operate on specialized IR
>> anyways.  The design is pretty easy; just take the Interp module and
>> add a type anywhere we call back into Infer.  Once Interp no longer
>> needs to call Infer, we'll have the definition of Slir.  I hate the
>> name Slir, though, so we'll have to think of something better.
>
> It seems like this means creating specialized versions of each function for
> each set of argument types, right? Otherwise there is no single type. If we
> did that, we wouldn't actually need types in Interp/Slir at all, since they're
> only needed for overloads, and then you could just put the right version of
> the function in place.

Unfortunately, in the current setup you need some notion of types even
in the backend, since a single closure object can refer to all sorts
of overloads. E.g., you could do

f = (+)
f (f 1 2) (f 3. 4.)

Eliminating this sort of nonsense will be one of the most important
optimizations. As it is, function calls basically have to be
dynamically typed.

I've pondered trying to eliminate this by making overloaded functions
not first class. The only argument I have against it is the purity
argument that "types shouldn't depend on context", and the related
increase in type-inference complexity.

One compromise solution is to make monomorphic functions a separate
type from overloaded functions, with implicit conversion from
overloaded to monomorphic. That way functions like map which take
monomorphic arguments will perform as one would like: the overload
resolution happens once when the function gets passed into map, and
from then on everything is clean.
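The compromise can be modeled in toy Python form (illustrative names, not duck's actual design):

```python
# Sketch of the monomorphic/overloaded split: an Overloaded value
# carries all alternatives; converting it to a monomorphic function
# resolves the overload exactly once, so a higher-order function
# like map never dispatches in its inner loop.  Illustrative only.

class Overloaded:
    def __init__(self, alternatives):
        self.alternatives = alternatives  # arg-type tuple -> function

    def monomorphize(self, arg_types):
        """The implicit overloaded -> monomorphic conversion: pick
        the single matching alternative up front."""
        return self.alternatives[arg_types]

def mono_map(f, xs):
    # f is already monomorphic: a plain function, no dispatch here
    return [f(x) for x in xs]

plus1 = Overloaded({
    ("int",): lambda x: x + 1,
    ("float",): lambda x: x + 1.0,
})

# Resolution happens once, at the point plus1 is passed in:
ints = mono_map(plus1.monomorphize(("int",)), [1, 2, 3])
```

From the call onward, `mono_map` holds a plain function; the overload table is consulted only at the conversion point.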

>> By the way: let me know if you want me to take a crack at any of this.
>>  I've been very lazy for the last week or so, and now I'm somewhat
>> leery of ripping anything apart before the macro stuff stabilizes.
>
> Hopefully I'm just a few more steps away from static being working, and then
> we'll figure out what comes next.  In general getting inference fully sorted
> out first seems to make sense.

Cool.

> As it is I've tried to make sure everything I push up (to my repo) still
> passes all the non-static test cases, though, so unrelated changes can still
> be made and tested.
>
> I also just changed the way the overload Ptrie works to cache partial
> resolution options.  It ended up not actually helping performance that much.
> You're likely right that a mutable overload table is the way to go.

Geoffrey
