Data types and type inference

Athanasios Anastasiou

unread,

Jan 3, 2018, 10:00:06 AM1/3/18

to antlr-discussion

Hello

This is not strictly an Antlr question but rather a question that was motivated by my contact with Antlr in building parsers for various languages.

In one phrase, what I am looking for is references (both academic and "practical") towards issues surrounding practical data type implementation
and type inference implementation.

The specific situation I am dealing with is this:

I have created a language that operates over a single data type. This case is "simple" because the building blocks of expressions are either literals or
identifiers and each one of them can be of only one type. This also means that things like assignments don't require any particular treatment.

However, in making the transition to a language that now deals with more than one data types we are immediately faced with type inference.

Some of this type inference can be resolved (or constrained) when defining the language via structurally constraining expressions to be able to distinguish between expressions.

For example numeric expressions involve +,-,/,* over numeric identifiers and numeric literals and string expressions only involve + over string literals and string identifiers.

But, it is this last example where things can easily "cross-over" because addition between string identifiers is indistinguishable from addition between numeric identifiers (except of course if you resort to things like "myVar$" or "$myVar" or representing concatenation with a different symbol....which I would not like to do so that the language is readable and the user doesn't have to remember 20 different ways of applying the same concept.)

So, in this case, we are left with the option of modeling addition between identifiers and deciding at execution time if the operation being expressed is permissible or not.

And at this point, I am not sure how to handle this. Given two data types now, say strings and numbers, how do I make type inference generic?

In a simplistic way, I would "catch" a generic additionOperation and then have a series of if-then-else to decide if the operation should be handled as a string addition or as a numeric addition or if the user is now trying to express an action that is not permissible by the language.

Given that I may extend the repertoire of data types supported by this language, is there a more elegant / generic / appropriate solution for this kind of type inference I am after or is it indeed a matter of a well thought out "tree" of if-then-elses?

Should I strive to be more restrictive at the structural definition or be more general at the language design and then handle things during execution time?

Ideally, I would like the language to mean what it says exactly. But even if I differentiate operators, I am still left with assignment cases like "a=b" when both "a" and "b" exist. So now I still have to decide if "a" and "b" are of the same type, do type coercion if possible and finally throw an error if the operation ends up not being permissible. How can I model this efficiently?

All the best

AA

Eric Vergnaud

unread,

Jan 4, 2018, 7:33:30 AM1/4/18

to antlr-discussion

Well that's definitely not an Antlr question i.e. when using variables, antlr would have no way to decide whether a + b is a valid operation.

Athanasios Anastasiou

unread,

Jan 4, 2018, 7:40:54 AM1/4/18

to antlr-di...@googlegroups.com

Thank you for your response Eric. I am looking for references or terminology that could help me narrow my search and I thought that the antlr4 community would be the best place to ask, I appreciate this is not strictly about the software.

I come across a lot of theoretical work on type inference which is useful to an extent but it would be nice to get a more practical aspect too or any relevant insight into modeling (?)

All the best

Athanasios Anastasiou

--
You received this message because you are subscribed to a topic in the Google Groups "antlr-discussion" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/antlr-discussion/xRCOh27X7Z4/unsubscribe.
To unsubscribe from this group and all its topics, send an email to antlr-discussion+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Eric Vergnaud

unread,

Jan 4, 2018, 10:16:30 AM1/4/18

to antlr-discussion

Having implemented type inference in a language, I found that Java's type architecture is pretty effective - apart from Generics which are crap.

Basically you have an inheritance tree with java.lang.reflect.Type at its root.

Just below you have Class, and the magic isAssignableFrom method.

That method can answer pretty much all type related questions that an interpreter may ask.

The tough part about type inference is how deep you want to drill.

Inferring variable types from literals, typed parameters and method return types is pretty straightforward using a mechanism such as the above.

As you iterate over the statements, you collect assigned types, then you check whether they are compatible and pick the most generic one.

If however your method parameters are untyped, you're in for a very very tough journey!

To unsubscribe from this group and all its topics, send an email to antlr-discussi...@googlegroups.com.

Athanasios Anastasiou

unread,

Jan 5, 2018, 11:24:51 AM1/5/18

to antlr-di...@googlegroups.com

Thank you very much Eric. I think that this is a great start.

One of the things that i got caught up with was how do you decide the result of an operation.

This isAsignableFrom is a good start, possibly aided by the operator functions themselves (functions that every datatype object has, to carry out the modelled operations).

All the best

AA

To unsubscribe from this group and all its topics, send an email to antlr-discussion+unsubscribe@googlegroups.com.

Reply all

Reply to author

Forward