avoiding types at runtime

22 views

Skip to first unread message

Geoffrey Irving

unread,

Mar 3, 2012, 1:08:58 AM3/3/12

to duck...@googlegroups.com, Jonathan Bachrach

Hello,

Dylan: for background, Jack is starting on a lispish language on top
of LLVM, and might potentially use a type system similar to duck.

The main nastiness in the duck type system is that, as a consequence
of the rule that types do not depend on context, first class functions
must be passed around as tables mapping argument types to function
instances. This is really annoying. The purpose of this email is to
ponder how difficult it would be to introduce enough context
dependence to avoid this problem. The answer may well be very
unpleasant, but it's worth checking.

The canonical example is map, which looks like this

map :: (a -> b) -> List a -> List b
map :: (a -> b) -> Maybe a -> Maybe b

f :: a -> List a
f x = [x]

print \ map f [1,2,3]

Here's what happens during type inference of the printed expression:

1. We encounter map and f, and give them types "Type map" and "Type
f". That is, their types say nothing more or less than which
functions they are.
2. [1,2,3] :: List Int
3. We apply (Type map) to (Type f), giving (Type map \ Type f) since
we don't have enough arguments for overload resolution.
4. We apply the previous type to (List Int). Overload resolution now
succeeds, the first map is chosen, and the map expression is inferred
to have type List \ List Int. This involves the following dance
4a. For each overload of map, we try to apply it to the argument list
[Type f, List Int]. I'll consider only what happens for the correct
overload.
4b. We try to find a substitution so that (a -> b) >= Type f. This is
impossible since we don't yet know what "a" is.
4b. We try to find a substitution so that List a >= List Int. This is
achieved by a = Int.
4c. We go back to (a -> b) >= Type f, which now looks like (Int -> b)
>= Type f. Inferring f with an Int argument gives b = Int, and we're
done.

The key points are

1. As we traverse through the various expressions, information only
propagates upwards from variables to their containing expressions and
so on. The types that propagate along are always TypeVals, that is,
concrete types corresponding to concrete sets of possible runtime
values.
2. Some cyclic behavior is required, but it's confined to the type apply code.

In contrast, in Hindley–Milner type inference information propagates
in all directions, mediated by type judgements that contain as yet
unspecified type variables.

If we wanted a runtime system where all first class runtime functions
are nonoverloaded, to avoid the need to look up instances in a type
map during function application, we'd need something like this:

1. The type inference algorithm would have to start passing around
nonconcrete types with variables similar to Hindley-Milner. In the
code, this would mean a switch from TypeVal to TypePat.
2. When we encounter f, instead of simply inferring (f :: Type f),
we'd generate a fresh type variable 'a, and infer that this particular
instance of f has type 'a, with a link back from 'a to the particular
instance of f.
3. These type variables would propagate upwards, possibly unifying
with other type variables (e.g., the expression (if c then f else g)).
4. When we reach the second argument of map, the overload resolution
dance would be
4a. Try to find a substitution so that (a -> b) >= 'a. Again, since
'a is a type variable representing a function, we can't do anything
with this.
4b. List a >= List Int gives a = Int.
4c. Going back to (Int -> b) >= 'a, we now know that 'a is applied to
Int. Linking back to f, we see that that instance of f should use the
Int overload, and that 'a = (Int -> List Int).

A few notes:

1. If the introduced type variables escape too far, it would probably
be an error. For example, the function

g () = f

would be an error, since its return type is unknown. Adding a
type signature to g such as

g :: () -> Int -> a

would fix this, since then we unify Type f with (Int -> a) and
figure out which overload was meant. Similarly

g () = f :: Int -> a

also works.

2. If the introduced type variables vanish without being filled in, it
means it doesn't matter what overload is used, and we can safely
substitute a null pointer.
3. All types would have to become concrete in the passage from
function call to overload resolution to function body, in order to
maintain the reasonableness of the whole algorithm.
4. The last point might be incompatible with many natural uses of
nested functions, where the instantiations wouldn't be known until
several different function bodies are jointly inferred.

That last point seems potentially quite unpleasant.

Geoffrey

Reply all

Reply to author

Forward

0 new messages