optimizing inference

3 views
Skip to first unread message

Dylan Simon

unread,
Nov 9, 2011, 10:40:30 PM11/9/11
to duck...@googlegroups.com
Vaguely looking at some duck stuff again. First thing I'd like to do is make
inference faster.

Right now, we do this:

s <- get -- fork state to do speculative fixpoint
r <- fix TyVoid -- recursive function calls are initially assumed to be void
put s -- restore state
r' <- fix True r -- and finalize with correct type
when (r /= r') $ inferError noLoc "internal error: type convergence failed"
return (or r)

However, if we just don't bother with the speculation:

r <- fix TyVoid
return (or r)

It seems like things still work fine, and much much faster. The only downside
I see is that we end up with lots of useless stuff in the overload table.
However, maybe there was some counterexample that requires the rollback
(something to do with inferring functions return void that will not on the
final pass? it'd be nice if we had a test-case for this if there is one).

As for the extra stuff in the overload table, there are some options to get
rid of it again, like by marking at each pass, or doing a final top-level
restricted inference pass (similar to what Interp does) that just marks/copies
used overloads. On the other hand, they don't actually hurt anything...

There are further optimizations we can make, but I just wanted to put this on
the table first to make sure I wasn't missing anything.


Longer term, I'd like to try producing specialized, type-free code
(C++-template style). I know we've talked about this and the exponential
blow-up and linking issues, but I'd still like to try it.

Geoffrey Irving

unread,
Nov 10, 2011, 12:36:34 AM11/10/11
to duck...@googlegroups.com
On Wed, Nov 9, 2011 at 7:40 PM, Dylan Simon <dylan-...@dylex.net> wrote:
> Vaguely looking at some duck stuff again.  First thing I'd like to do is make
> inference faster.
>
> Right now, we do this:
>
>  s <- get -- fork state to do speculative fixpoint
>  r <- fix TyVoid -- recursive function calls are initially assumed to be void
>  put s -- restore state
>  r' <- fix True r -- and finalize with correct type
>  when (r /= r') $ inferError noLoc "internal error: type convergence failed"
>  return (or r)
>
> However, if we just don't bother with the speculation:
>
>  r <- fix TyVoid
>  return (or r)
>
> It seems like things still work fine, and much much faster.  The only downside
> I see is that we end up with lots of useless stuff in the overload table.
> However, maybe there was some counterexample that requires the rollback
> (something to do with inferring functions return void that will not on the
> final pass? it'd be nice if we had a test-case for this if there is one).

The simplest example I could come up with (after a half hour of confusion) is

f () = h (Type (f ()))

h :: Type Void -> Int
h Void = g ()

h :: Type Int -> Int
h Int = 0

g () = f ()

_ = Type (f ()) :: Type Int
_ = Type (g ()) :: Type Int

This succeeds with the current code but fails if you turn off the
rollback. The first time f is inferred, it gets an Int return type,
and g gets the preliminary return type Void. The next time f is
inferred, it still gets Int, but g is never called, so its type
doesn't get updated to Int. If we leave Void as the return type for
g, later references to g will be invalid.

This stuff is pretty subtle, so its entirely possible the current code
has holes as well.

A while back I started writing a generic framework for fixpoint
computations in order to make type inference near optimal (e.g.,
inferring non-recursive functions exactly once instead of twice), but
I got mired in details and then never finished it. I'd like to finish
that, but I don't want to hold you up. Unfortunately, without doing
the rather invasive rewrite or something similar, I don't see a way to
make it both quadratic time and correct.

Therefore, I propose dropping the rollback for now, and filing a bug
with the above test case. I'll fix the bug and add the test case to
our list of tests once I finish the rewrite (hopefully in not too
long). With the rollback dropped type inference should be pleasantly
linear time, and you won't have to deal with as much extreme slowness.
Do you want to file the bug and make the intentional breaking change?

> As for the extra stuff in the overload table, there are some options to get
> rid of it again, like by marking at each pass, or doing a final top-level
> restricted inference pass (similar to what Interp does) that just marks/copies
> used overloads.  On the other hand, they don't actually hurt anything...

Actually I think we want all that extra stuff to stay in the overload
table; I only deleted it because of the potential incorrectness. Once
the rewrite is complete it'll definitely all stay there.

> There are further optimizations we can make, but I just wanted to put this on
> the table first to make sure I wasn't missing anything.
>
>
> Longer term, I'd like to try producing specialized, type-free code
> (C++-template style).  I know we've talked about this and the exponential
> blow-up and linking issues, but I'd still like to try it.

Cool.

Geoffrey

Dylan Simon

unread,
Nov 10, 2011, 12:00:29 PM11/10/11
to duck...@googlegroups.com
From Geoffrey Irving <irv...@naml.us>, Wed, Nov 09, 2011 at 09:36:34PM -0800:

> On Wed, Nov 9, 2011 at 7:40 PM, Dylan Simon <dylan-...@dylex.net> wrote:
> > Vaguely looking at some duck stuff again. ?First thing I'd like to do is make

> > inference faster.
> >
> > Right now, we do this:
> >
> > ?s <- get -- fork state to do speculative fixpoint
> > ?r <- fix TyVoid -- recursive function calls are initially assumed to be void
> > ?put s -- restore state
> > ?r' <- fix True r -- and finalize with correct type
> > ?when (r /= r') $ inferError noLoc "internal error: type convergence failed"
> > ?return (or r)

> >
> > However, if we just don't bother with the speculation:
> >
> > ?r <- fix TyVoid
> > ?return (or r)
> >
> > It seems like things still work fine, and much much faster. ?The only downside

Ah, right. I remember now we talked about that. If you already have some
code written, then that sounds good.

In the mean time, I might add type aliases and work on some new test cases.
(There's a commented failing test case in fac.duck that seems to cause a loop
somewhere.)

I've also pushed updates for ghc 7.2 and some small cleanups to my github
repo, but I should test on ghc 6.12 before pushing to yours.

Reply all
Reply to author
Forward
0 new messages