Cyclic types

212 views
Skip to first unread message

Jan Mercl

unread,
May 9, 2018, 4:21:47 AM5/9/18
to golang-nuts
Robert Griesemer wrote in https://github.com/golang/go/issues/25305#issuecomment-387624488 at May 9th:

I'm probably using incorrect assumptions. Let me summarize them here:


1) A type is cyclical iff its size is not computable.


I'm really not sure if this is what the specification really means. If not then I wonder why not, because


2) Determining computability of the size of a type is trivial (wrt "we go through great lengths to detect such cycles").


AFAICT, there are two classes of types.


In the first (scalar) class the size of T is a constant fully determined by the kind of T: bool, integers, real and complex types, slices, interfaces, pointers, maps, channels, functions. (The last three being just a special case of a pointer.)


In the second (non-scalar) class a type T has size dependent (transitively) on other types (T_1, ... T_n), possibly including T itself. Scalar T_i brings no problem in computing the size of T.


For non-scalar T_i, all we need is a sentinel provided by knowing if the size of a type is a) not yet determined, b) being determined, c) determined/valid. When the size of T is needed, but not yet determined, it's first marked as "being determined". If, during computation of the size of T, we run into the sentinel, ie. we need to know the size of T_i marked "size being determined", we have proved the size of T is not computable. Otherwise the size of T is computed, stored and T is marked as "size determined/valid".


Wrt "even if they are "obviously" not cyclic to a human reader."


I think there's no difference between cyclic type determined by a program or by a human reader except for a bit higher error rate in the later case ;-)


--

-j

Robert Griesemer

unread,
May 9, 2018, 1:01:29 PM5/9/18
to Jan Mercl, golang-nuts
This sounds all good.

I am not disputing at all what you are saying, but a) the spec doesn't actually state any of this explicitly; and b) I agree that size computation is straight-forward once a type data structure is all constructed. The caveat is the 2nd part of this sentence: We're not doing always a correct job of setting up a type completely before it's used. Hence we have issues like #25305. The compiler does a better job than go/types most of the time; but sometimes it's the other way around.

I think we also have been hesitant to simply disallow "cyclical types" (per your definition of cyclical) in the spec because we (or at least I) don't have a good understanding that our code actually detects exactly those. We have plenty of examples of code where we could determine the type's size but we still exclude the type. For instance

type T = *T

T has clearly the size of a pointer, yet we disallow (in the compiler) such types. In this case it's by design (of the type alias proposal), but it would be nice if we could relax it. But I'm not sure we (or I) understand all the consequences fully, quite yet. And I think we have other situations (not involving alias types) where we run into problems, even though we can compute the type's size.

(FWIW, I don't think everybody equates "cyclic type" with "type size is not computable". People tend to use "cyclic" and "recursive" interchangeably for types. I was definitively using "cyclic" as "recursive" in #25305).

More generally, I think it would be great if we could state exactly what you said in the spec:

1) Types for which their sizes cannot be computed (see 2) are invalid.
2) The size of a type is computable if ... (and then we give essentially the rules you outlined already).

As said above, 2) requires all involved types to be set up sufficiently such that we can determine the relevant size information. Sometimes that's not the case. Hence my comment in the issue #25305.

Finally, I agree that there shouldn't be a difference between cycle detection by a human and a computer. But the problem is that the computer may be using an algorithm that may be conservative, or incorrect, or not very general (for the sake of speed in the common case).

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Robert Griesemer

unread,
May 9, 2018, 2:00:34 PM5/9/18
to Jan Mercl, golang-nuts
PS: Here's an example where we (humans) can obviously compute the size of a type, yet neither cmd/compile, gccgo, nor go/types have any success in doing so:

type T struct {
a [10]int
b [len(T{}.a)]int
}

The problem is that all implementations have an "eager" (depth-first) approach somewhere leading to requiring all of T to be set up before we can determine the size of T.a. Specifically, when determining the size of T.b we must be careful to not look at the size of T (which is in the process of being determined), and only at the size of T.a (which we can obviously tell). Furthermore, we must use an algorithm that computes the size of T.a "on demand", not in the order as the fields appear (otherwise it wouldn't work if b was before a). And so forth. All these things make size computation more complicated and expensive. That question is: Is it worth the extra cost? Or are these cases esoteric and don't show up in real code? And if we use simpler algorithms, is there an easy way to describe which types are accepted and which aren't?

matthe...@gmail.com

unread,
May 9, 2018, 2:56:53 PM5/9/18
to golang-nuts
type T struct {
    a
[10]int
    b
[len(T{}.a)]int
}

could appear about as maintainably like this:

const size = 10

type T
struct {
    a
[size]int
   
// many other fields and comments
    b
[size]int
}

This case doesn’t justify more complexity to me.

Matt
Message has been deleted

Bakul Shah

unread,
May 9, 2018, 5:05:34 PM5/9/18
to Robert Griesemer, Jan Mercl, golang-nuts
I think it is more than a matter of "eager" vs "on-demand"
computation. T{} can't be constructed until its size is known.
Semantically this is a recursive definition.

However, a human may falsely assume that T{} is constructable
and given that they may think len(T{}.a) can be computed.

If on the other hand you had

type T struct { a [10]int;b *[len(T{}.a)]int }

there would be no problem.

Jan Mercl

unread,
May 10, 2018, 4:13:14 AM5/10/18
to Robert Griesemer, golang-nuts
On Wed, May 9, 2018 at 7:01 PM Robert Griesemer <g...@golang.org> wrote:

> I think we also have been hesitant to simply disallow "cyclical types" (per your definition of cyclical) in the spec because we (or at least I) don't have a good understanding that our code actually detects exactly those. We have plenty of examples of code where we could determine the type's size but we still exclude the type. For instance
>
> type T = *T
>
> T has clearly the size of a pointer, yet we disallow (in the compiler) such types. In this case it's by design (of the type alias proposal), but it would be nice if we could relax it.

In this particular case it seems to me that we don't need to forbid it apriori. The contexts in which this definition can appear

1) In a scope where T is not defined:

        type T = *T

Covered already, results in 'T undefined'.

2) In a scope where the name T is already declared:

        type T int
        type T = *T

Covered already by the specs: "No identifier may be declared twice in the same block, ..."

3) Otherwise T must have been declared in any of the parent scopes:

        type T int

        func foo() {
                var v T
                type T = *T
                var w T
                w = &v // Okay
                v = *w  // Okay
                fmt.Printf("%T %T\n", v, w) // Not nice, but the same outcome as w/o aliases: https://play.golang.org/p/jgRxFN8qqAL
        }

It seems there's no problem in accepting it in this case. sizeof(v) == sizeof(int) and sizeof(w) == sizeof(unsafe.Pointer).

> (FWIW, I don't think everybody equates "cyclic type" with "type size is not computable". People tend to use "cyclic" and "recursive" interchangeably for types. I was definitively using "cyclic" as "recursive" in #25305).

Agreed. Using the word 'invalid' - as you do later in your post - is much better and unambiguous.

--

-j

Jan Mercl

unread,
May 10, 2018, 4:14:17 AM5/10/18
to Robert Griesemer, golang-nuts
On Wed, May 9, 2018 at 7:45 PM Robert Griesemer <g...@google.com> wrote:

> PS: Here's an example where we (humans) can obviously compute the size of a type, yet neither cmd/compile, gccgo, nor go/types have any success in doing so:
>
> type T struct {
>         a [10]int
>         b [len(T{}.a)]int
> }

> The problem is that all implementations have an "eager" (depth-first) approach somewhere leading to requiring all of T to be set up before we can determine the size of T.a. Specifically, when determining
> the size of T.b we must be careful to not look at the size of T (which is in the process of being determined), and only at the size of T.a (which we can obviously tell). Furthermore, we must use an
> algorithm that computes the size of T.a "on demand", not in the order as the fields appear (otherwise it wouldn't work if b was before a).

Sure, as you wrote, size of T.b does not depend on size of T. It depends on len([10]int) only - or the type of T.a in a more general view.

I agree with the quoted above completely. Do not compute the size of a struct until needed - that's the only way I can think of how to avoid the (existing) problems of correctly detecting invalid transitively self-referential types. Later on, any possibly unused types will get their sizes determined only after type checking is completed - to ensure they're valid. Exported or not.

> All these things make size computation more complicated and expensive. That question is: Is it worth the extra cost? Or are these cases esoteric and don't show up in real code? And if we use simpler
> algorithms, is there an easy way to describe which types are accepted and which aren't?

It does not sound hard to me to implement it this way. Performance-wise, it'd surprise me to have noticeable higher run-time cost. Actually, not having to construct and maintain the type stack can possibly improve performance (also b/c of less allocations). But I guess it's not trivial to change the existing implementation to the different approach.

--

-j

Reply all
Reply to author
Forward
0 new messages