I apologize for the length of this message! There are three concrete questions embedded in it, numbered 1)-3). The rest is context for these questions.
I had the impression, perhaps an incorrect one, that one should be able to annotate the type of any variable so as to provide manual assistance to type inference. Inference may not be up to some particular task, but if I can figure out the type on my own, I should be able to tell it to Julia to mitigate type instability, and she'll carry on from there in her inference. In some situations, however, it seems that I can't effectively inform type inference that a variable has a concrete type. Consider the following.
type T1{N<:Number} end
type T2{T<:T1} end
type T3{T<:T2} end
type T4{T<:T3} end
type T5{T<:T4} end
T5{N<:Number}(::Type{N}) = T5{T4{T3{T2{T1{N}}}}}()
It may seem like an unreasonably complicated hierarchy, but I could describe a use case where something along these lines arises naturally and maybe doesn't seem so unreasonable. This is a long message already, however, so let me skip over that. For now, just observe that any realization of type `T5` violates `MAX_TYPE_DEPTH` in `inference.jl`. Let me call such a type baroque. The main issue is that there doesn't seem to be a way to tell Julia that a variable will be a concrete type of T5. Maybe it's a bad idea to have such a crazy type hierarchy. But if I can figure out what flavor of T5 a variable will be, shouldn't I be able to inform Julia? I try, but she plugs her ears and goes "La, la, la,..."
Specifically, if I write
function foo()
x = T5{T4{T3{T2{T1{Int}}}}}()
x
end
then the type of `x` will be inferred by `apply_type_tfunc` in `inference.jl`. The heavy lifting is done by a call to the real-live `apply_type`, which correctly deduces that the resulting type is a `T5{T4{T3{T2{T1{Int}}}}}`. But there is a subsequent call at the end of `apply_type_tfunc` to `type_too_complex` which sees that the type is baroque and hence `apply_type_tfunc` returns the unspecialized `T5{T4{T3{T2{T1{N<:Number}}}}}` instead. This is the ear-plugging part, and the moral is that a type coming out of `apply_type_tfunc` can't be baroque. The outer function `foo` is a bit of a red herring here; the return type of the constructor `T5{T4{T3{T2{T1{Int}}}}}()` is unknown because it relies on the type decided on by `apply_type_tfunc`.
What if I try an explicit type assert?
function foo()
x = T5{T4{T3{T2{T1{Int}}}}}()
x::T5{T4{T3{T2{T1{Int}}}}}
end
The type of the the argument T5{T4{T3{T2{T1{Int}}}}} to the type assert will again be determined by `apply_type_tfunc`. So it will be an unspecialized `T5{T4{T3{T2{T1{N<:Number}}}}}`, and the concrete return type of foo() is not known.
Question 1:
Is there a way to write the function `foo` and annotate the type of variable `x` so that it is known to be a T5{T4{T3{T2{T1{Int}}}}}, and such that the type of the return value of `foo` is known?
Although types coming out of `apply_type_tfunc` cannot be baroque, it's easy enough to inject baroque types into type inference. For example, arguments to functions are not subject to `type_too_complex`. Nor are the types of return values. For example, consider
first_one{N<:Number}( a::T5{T4{T3{T2{T1{N}}}}}, b::T5{T4{T3{T2{T1{N}}}}} ) = a
The typed code for `first_one( T5(Int), T5(Int) )` knows that `a` is a `T5{T4{T3{T2{T1{Int}}}}}`, and so is `b`, and so is the return type. Calls to `getfield` also seem to not be bothered much by baroque types. To be fair, I see that there is some code in `getfield_tfunc` that imposes `limit_type_depth`, but I haven't looked at it closely, and I haven't run into that restriction in practice. So in effect I can send existing concrete baroque types down the call stack and have them be reasoned with, but I can't annotate the type of a new one created at my current level, and consequently I can't send its type up the call stack either.
Question 2:
Is there a reason for the asymmetry? In particular, is there a reason why baroque types coming out of `apply_type` are problematic, but baroque types as arguments, as fields in at least some cases, and as return values are not? Why is it ok to know that that variable `x` is of type `T5{T4{T3{T2{T1{Int}}}}}` because I passed it in as an argument, but problematic to know it is of type `T5{T4{T3{T2{T1{Int}}}}}` when I create one with `new`?
Question 3:
As a conservative remedy, would it be problematic if the call to `type_too_complex` in `apply_type_tfunc` was omitted for for leaf types? That is, if `apply_type` can determine we have a leaf type, then let that leak into type inference, even if it is baroque, just as it can as a function argument or a return value. This would allow the return type of explicit constructors and type asserts to be known. To be clear, I don't pretend to understand more that the broad strokes of what is happening in `inference.jl`. There are surely a lot of subtleties in that code. So maybe what I'm suggesting is a bad idea. But it struck me that it may be the case that concrete baroque types coming out of `apply_type_tfunc` may not be any worse that the baroque types that can already find their way into the inference system.
Many thanks for having read this far...
-David