Hello
This is not strictly an Antlr question but rather a question that was motivated by my contact with Antlr in building parsers for various languages.
In one phrase, what I am looking for are references (both academic and "practical") on the issues surrounding the practical implementation of data types and type inference.
The specific situation I am dealing with is this:
I have created a language that operates over a single data type. This case is "simple" because the building blocks of expressions are either literals or
identifiers and each one of them can be of only one type. This also means that things like assignments don't require any particular treatment.
However, in making the transition to a language that deals with more than one data type, we are immediately faced with type inference.
Some of this type inference can be resolved (or at least constrained) in the language definition itself, by structuring the grammar so that different kinds of expressions can be told apart.
For example, numeric expressions involve +, -, / and * over numeric identifiers and numeric literals, while string expressions involve only + over string literals and string identifiers.
But it is in this last example that things can easily cross over, because addition between string identifiers is indistinguishable from addition between numeric identifiers (unless, of course, you resort to things like "myVar$" or "$myVar", or represent concatenation with a different symbol, which I would rather not do, so that the language stays readable and the user doesn't have to remember 20 different ways of applying the same concept).
So, in this case, we are left with the option of modeling addition between identifiers and deciding at execution time if the operation being expressed is permissible or not.
And at this point, I am not sure how to handle this. Given two data types now, say strings and numbers, how do I make type inference generic?
In a simplistic way, I would "catch" a generic additionOperation and then have a series of if-then-else checks to decide whether the operation should be handled as a string addition or as a numeric addition, or whether the user is trying to express an action that the language does not permit.
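To make the simplistic approach concrete, here is a minimal Python sketch of what I have in mind. The function name add_values and the use of plain Python values to stand in for the language's runtime values are assumptions made only for illustration.

```python
def add_values(left, right):
    """Handle a generic addition by inspecting operand types at execution time."""
    if isinstance(left, str) and isinstance(right, str):
        # String addition: concatenation.
        return left + right
    elif isinstance(left, (int, float)) and isinstance(right, (int, float)):
        # Numeric addition.
        return left + right
    else:
        # Any other combination is not permitted by the language.
        raise TypeError(
            f"cannot add {type(left).__name__} and {type(right).__name__}"
        )
```

Every new data type adds more branches to this chain, which is exactly what worries me.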
Given that I may extend the repertoire of data types supported by this language, is there a more elegant / generic / appropriate solution for the kind of type inference I am after, or is it indeed a matter of a well-thought-out "tree" of if-then-elses?
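One alternative I can imagine is a lookup table keyed by operator and operand types, so that adding a data type means adding table entries rather than new branches. The sketch below is only an assumption about how this might look; the names Type, RULES, result_type and apply_operator are hypothetical and not taken from Antlr or any other library.

```python
from enum import Enum


class Type(Enum):
    NUMBER = "number"
    STRING = "string"


# (operator, left type, right type) -> (result type, implementation)
RULES = {
    ("+", Type.NUMBER, Type.NUMBER): (Type.NUMBER, lambda a, b: a + b),
    ("-", Type.NUMBER, Type.NUMBER): (Type.NUMBER, lambda a, b: a - b),
    ("*", Type.NUMBER, Type.NUMBER): (Type.NUMBER, lambda a, b: a * b),
    ("/", Type.NUMBER, Type.NUMBER): (Type.NUMBER, lambda a, b: a / b),
    ("+", Type.STRING, Type.STRING): (Type.STRING, lambda a, b: a + b),
}


def result_type(op, left, right):
    """Return the type an operation produces, or raise if it is not permitted."""
    try:
        return RULES[(op, left, right)][0]
    except KeyError:
        raise TypeError(
            f"'{op}' is not defined for {left.value} and {right.value}"
        ) from None


def apply_operator(op, left_type, left_value, right_type, right_value):
    """Check the operation against the table, then evaluate it."""
    kind = result_type(op, left_type, right_type)
    implementation = RULES[(op, left_type, right_type)][1]
    return kind, implementation(left_value, right_value)
```

The same table could be consulted before execution (to reject a bad expression once every identifier's type is known) or during execution, which is part of what I am trying to decide.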
Should I strive to be more restrictive at the structural definition or be more general at the language design and then handle things during execution time?
Ideally, I would like the language to mean exactly what it says. But even if I differentiate operators, I am still left with assignment cases like "a=b" when both "a" and "b" already exist. So I still have to decide whether "a" and "b" are of the same type, apply type coercion if possible, and finally throw an error if the operation ends up not being permissible. How can I model this efficiently?
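For the "a=b" case, this is roughly the check I picture, written as a minimal Python sketch. The symbol table layout, the COERCIONS table, and the rule that a number may be coerced to a string (but not the reverse) are all assumptions chosen for illustration, not decisions I have made for the language.

```python
# (source type, target type) -> conversion function; assumed rule set.
COERCIONS = {
    ("number", "string"): str,
}


def assign(symbols, target, source):
    """Perform `target = source`, where symbols maps name -> (type name, value)."""
    target_type, _ = symbols[target]
    source_type, value = symbols[source]
    if source_type == target_type:
        # Same type: plain copy.
        symbols[target] = (target_type, value)
    elif (source_type, target_type) in COERCIONS:
        # Different types, but a coercion is registered for this direction.
        convert = COERCIONS[(source_type, target_type)]
        symbols[target] = (target_type, convert(value))
    else:
        raise TypeError(
            f"cannot assign {source_type} '{source}' to {target_type} '{target}'"
        )
```

With "a" declared as a string and "b" as a number, "a=b" would store the text form of the number in "a", while "b=a" would be rejected.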
All the best
AA