Schema Refactoring and Unique IDs


Matt Stern

Nov 25, 2020, 2:39:01 PM
to Cap'n Proto
Hi all,

Suppose I have a simple schema file:

@0xabbeabbeabbeabbe;

struct Foo {
  val @0 :UInt32;
}

struct Bar {
  val @0 :UInt32;
}

I would like to move Bar into a separate schema file. If I understand the docs (https://capnproto.org/language.html#unique-ids) correctly, this will change Bar's unique ID.

I have two questions about that:
  1. Will changing Bar's unique ID cause backwards incompatibility with old messages that are serialized with the old ID?
  2. If so, what can I do to prevent this? I would like my change to have no side-effects (a pure no-op).
Thanks!

Erin Shepherd

Nov 25, 2020, 2:52:14 PM
to 'Kenton Varda' via Cap'n Proto
1. It depends. The only places where the IDs get used "behind the scenes" are (a) in RPC calls, where the ID of an interface identifies that interface, and (b) when encoding schema annotations.

On the other hand, someone might be explicitly reading the ID from the schema file or the constant from the generated code.

2. capnpc can be asked to generate capnp-format output. In that case, it emits a copy of the schema with comments stripped and all automatically generated IDs inserted. You can grab the ID (and the syntax) from there.

- Erin
--
You received this message because you are subscribed to the Google Groups "Cap'n Proto" group.
To unsubscribe from this group and stop receiving emails from it, send an email to capnproto+...@googlegroups.com.

Kenton Varda

Nov 25, 2020, 3:15:57 PM
to Matt Stern, Erin Shepherd, Cap'n Proto
Erin is correct.

For #2, the command-line syntax to have the compiler echo back capnp format (with all type IDs defined explicitly) is:

    capnp compile -ocapnp foo.capnp

This of course requires that `capnp` and the generator plugin `capnpc-capnp` are in your $PATH, which they should be after installing Cap'n Proto globally. If you haven't installed it globally, you can do:

    path/to/capnp compile -opath/to/capnpc-capnp foo.capnp

In any case, the output will go to the terminal (no files are generated).
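
A minimal sketch of how the echoed schema could be collected from a script rather than read off the terminal. It assumes `capnp` is on the $PATH, that the echoed schema arrives on standard output, and that each declaration is printed in the form `struct Name @0x... {`; the helper names `echo_schema` and `list_type_ids` and the regular expression are illustrative, not part of Cap'n Proto.

```python
import re
import subprocess

# Matches declarations such as "struct Bar @0x... {" in the echoed schema.
# The exact output layout is an assumption; adjust the pattern if it differs.
DECL_RE = re.compile(
    r"^\s*(struct|enum|interface|const|annotation)\s+(\w+)\s+@(0x[0-9a-fA-F]+)",
    re.MULTILINE,
)


def echo_schema(path, import_dirs=()):
    """Run `capnp compile -ocapnp` on path and return the text it prints."""
    cmd = ["capnp", "compile"]
    cmd += [f"-I{d}" for d in import_dirs]  # same import paths as a normal build
    cmd += ["-ocapnp", path]
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


def list_type_ids(schema_text):
    """Return (kind, name, id) tuples for every declaration with an explicit ID."""
    return [
        (kind, name, int(type_id, 16))
        for kind, name, type_id in DECL_RE.findall(schema_text)
    ]
```

Calling `list_type_ids(echo_schema("foo.capnp"))` would then give the IDs to copy into the schema. Nested declarations are reported by their short name only, so two nested types with the same name are not told apart.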

-Kenton

Matt Stern

Nov 25, 2020, 4:43:05 PM
to Cap'n Proto
Thanks for the quick responses and clarifications!

My actual schema file is compiled for both C++ and Java. As such, it has the following preamble:

using Cxx = import "/capnp/c++.capnp";
$Cxx.namespace("MyNamespace");

using Java = import "/capnp/java.capnp";
$Java.package("com.myorg");
$Java.outerClassname("MyOuterClass");

When I try to run capnp compile, I get the following:

error: Import failed: /capnp/java.capnp

I have tried some of the workarounds in this thread but had no luck:

If I comment out the Java bits and just compile in C++ (which works for me), will this have any effect on the unique IDs for the structs in my schema file?

Thanks!

Ian Denhardt

Nov 25, 2020, 5:26:52 PM
to Cap'n Proto, Matt Stern
The ID doesn't affect the encoding itself, so the basic things will
still work.


You can avoid changing the id by specifying it explicitly, e.g.

struct Bar @0xfeefefffefeefefe {
  val @0 :UInt32;
}

You can discover the current id by running:

capnp compile -ocapnp myschema.capnp

Which will output a version of the schema including the ids, as well as
some other information.

-Ian


Matt Stern

Nov 25, 2020, 8:11:57 PM
to Ian Denhardt, Cap'n Proto
Great, thanks everyone. I will get the current unique ID from the capnp tool and specify it explicitly for my struct before I do my refactoring.

Kenton Varda

Nov 27, 2020, 10:27:09 AM
to Matt Stern, Cap'n Proto
On Wed, Nov 25, 2020 at 3:43 PM Matt Stern <mjst...@gmail.com> wrote:
> When I try to run capnp compile, I get the following:
>
> error: Import failed: /capnp/java.capnp

You will need to specify the same -I flags (import path) that you normally specify to `capnp compile` when running the Java code generator.

> If I comment out the Java bits and just compile in C++ (which works for me), will this have any effect on the unique IDs for the structs in my schema file?

No, the auto-generated IDs do not in any way depend on the contents of other files. Auto-generated IDs are constructed by concatenating the parent scope ID and the type name, and then taking a hash of that.
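
The derivation described above can be sketched in Python. The thread only states that the parent scope ID and the type name are concatenated and hashed. The use of MD5, the little-endian encoding of the parent ID, reading the first eight digest bytes as a big-endian number, and forcing the top bit on are assumptions about the compiler's implementation, so treat the printed values as illustrative and rely on `capnp compile -ocapnp` for the authoritative IDs. `NEW_FILE_ID` is a hypothetical value.

```python
import hashlib


def child_type_id(parent_id: int, name: str) -> int:
    """Derive a default ID from the parent scope's ID and the declaration name."""
    # Assumed details: parent ID as 8 little-endian bytes, MD5 as the hash,
    # first 8 digest bytes read big-endian, top bit forced on.
    data = parent_id.to_bytes(8, "little") + name.encode("utf-8")
    digest = hashlib.md5(data).digest()
    return int.from_bytes(digest[:8], "big") | (1 << 63)


OLD_FILE_ID = 0xABBEABBEABBEABBE  # file ID from the example schema
NEW_FILE_ID = 0xC0FFEEC0FFEEC0FF  # hypothetical ID for the new file

# Only the parent ID and the name feed the hash, so nothing else in the
# file (imports, annotations, other types) can influence the result.
bar_before = child_type_id(OLD_FILE_ID, "Bar")
bar_after = child_type_id(NEW_FILE_ID, "Bar")

print(f"Bar under old file ID: {bar_before:#018x}")
print(f"Bar under new file ID: {bar_after:#018x}")
```

Because the inputs are just the parent ID and the name, commenting out the Java imports leaves every ID unchanged, while moving `Bar` under a different file ID gives it a different one unless its ID is written out explicitly.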

-Kenton

Ian Wilson

Aug 1, 2023, 4:00:56 PM
to Cap'n Proto
Hi all, apologies for jumping on this thread a few years later.

I have a similar question: suppose I have 2 different capnp files with the same file ID.

# file foo.capnp
@0xabbeabbeabbeabbe;

struct Foo {
  val @0 :UInt32;
}

# file bar.capnp
@0xabbeabbeabbeabbe;

struct Bar {
  val @0 :UInt32;
}

In my project, someone copy/pasted a file, leading to two different schemas with the same file ID, 0xabbeabbeabbeabbe.
I now have a use case where I want to import both of these capnp files and their types into a new one. Of course, this runs into a compilation error because of the duplicate IDs.

I'd like to generate a new file ID to replace one of these, but I'm not sure if this would impact the autogenerated type IDs and break other systems.

Would merely changing the file ID affect the types that already exist in these files?

Kenton Varda

Aug 2, 2023, 6:37:12 PM
to Ian Wilson, Cap'n Proto
Yes, changing the file ID will change the IDs of all types declared within, unless they have explicitly declared their own IDs.

This only really matters if you are using RPC. Type IDs of interfaces are part of the wire protocol. Type IDs of structs and enums are not really used for anything, unless your application itself is using them. So it's usually fine to change the type IDs of structs and enums.

So if you are only using serialization, not RPC, just go ahead and change one of the file IDs. If you are using RPC, you will need to manually override the IDs of the interface types in the file to keep them consistent with what they were originally, and then you can change the file ID.

Of course, if both files happened to declare an interface with the same name, those interfaces will have the same ID. In this case you have big problems. You will have to make a breaking change to one of those two interfaces.
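
One way to check that the interface IDs really stayed put after overriding them and changing the file ID is to compare the output of `capnp compile -ocapnp` from before and after the edit. This is a minimal sketch that assumes interfaces are echoed as `interface Name @0x... {`; the helper names are illustrative.

```python
import re

# Assumed layout of the echoed schema: "interface Name @0x... {".
INTERFACE_RE = re.compile(
    r"^\s*interface\s+(\w+)\s+@(0x[0-9a-fA-F]+)", re.MULTILINE
)


def interface_ids(schema_text):
    """Map each interface name to its ID (nested names are not qualified)."""
    return {
        name: int(type_id, 16)
        for name, type_id in INTERFACE_RE.findall(schema_text)
    }


def changed_interfaces(before_text, after_text):
    """Return names of interfaces whose ID differs or that disappeared."""
    before = interface_ids(before_text)
    after = interface_ids(after_text)
    return sorted(
        name for name, type_id in before.items() if after.get(name) != type_id
    )
```

An empty list from `changed_interfaces` suggests the change is safe for RPC peers built against the old schema. Struct and enum IDs are deliberately ignored here, since they are not part of the wire protocol.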

-Kenton

Jonathan Shapiro

Aug 3, 2023, 12:53:26 AM
to Kenton Varda, Ian Wilson, Cap'n Proto
Kenton: Are the type IDs likely to have collided?

Ian: If not, you might be able to hand-annotate the impacted types with the already generated IDs and then change the file ID. This offers the possibility of a clean changeover at the next breaking version change.

Kenton: what other IDs will change if the file ID is changed that Ian might need to consider here?


Jonathan


Kenton Varda

Aug 3, 2023, 2:57:21 PM
to Jonathan Shapiro, Ian Wilson, Cap'n Proto
On Wed, Aug 2, 2023 at 11:51 PM Jonathan Shapiro <sh...@buttonsmith.com> wrote:
> Kenton: Are the type IDs likely to have collided?

If the two files contain types with identical names, they will have identical IDs, as the default ID for a type is generated by taking a hash of the parent scope's ID together with the type's name.
 
> Ian: If not, you might be able to hand-annotate the impacted types with the already generated IDs and then change the file ID. This offers the possibility of a clean changeover at the next breaking version change.
>
> Kenton: what other IDs will change if the file ID is changed that Ian might need to consider here?

64-bit high-entropy IDs are assigned to all "static" declarations, that is:
- files (top level)
- struct types
- enum types
- interface types
- constants
- annotations

Usually, in practice, only interface IDs really matter for compatibility with other programs.

The other IDs might matter if programs at different versions of the schema are dynamically exchanging their schemas. SchemaLoader decides that two schemas are versions of the same type if they have the same type ID. But in practice I've never seen a use case that simultaneously uses dynamic schemas and is trying to reconcile multiple versions of those schemas.

-Kenton