Encoding null, false, and true


Dale Schumacher

Sep 30, 2013, 1:43:58 PM
to capn...@googlegroups.com
I'm considering the encoding of three simple messages.  Since each message must start with a struct pointer to the root structure, it seems that the following encodings would represent null, false, and true.

null =  0x0001000000000000 0x0000000000000000  -- a struct of zero data words and one pointer, where the pointer is null
false = 0x0000000100000000 0x0000000000000000  -- a struct of one data word and zero pointers, where the data is a single bool (0)
true =  0x0000000100000000 0x0000000000000001  -- a struct of one data word and zero pointers, where the data is a single bool (1)
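
To check the first word of each line above, a struct pointer can be unpacked by hand. This is only a minimal Python sketch, assuming the struct pointer layout described in the Cap'n Proto encoding documentation (low 2 bits are the pointer kind, the next 30 bits a signed offset in words, then a 16-bit data-section size and a 16-bit pointer-section size). `decode_struct_pointer` is a hypothetical helper name, not part of any Cap'n Proto library, and the words are treated as plain 64-bit integers, as written in this post.

```python
def decode_struct_pointer(word):
    """Split a 64-bit struct pointer (given as an integer) into its fields."""
    kind = word & 0x3                      # 0 means "struct pointer"
    offset = (word >> 2) & 0x3FFFFFFF      # 30-bit offset, counted in words
    if offset & 0x20000000:                # sign-extend the 30-bit field
        offset -= 1 << 30
    data_words = (word >> 32) & 0xFFFF     # size of the data section, in words
    pointer_words = (word >> 48) & 0xFFFF  # size of the pointer section, in words
    return kind, offset, data_words, pointer_words

print(decode_struct_pointer(0x0001000000000000))  # (0, 0, 0, 1)
print(decode_struct_pointer(0x0000000100000000))  # (0, 0, 1, 0)
```

The first result reads as "zero data words, one pointer" and the second as "one data word, zero pointers", matching the descriptions given for each line.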

Alternatively, it seems like a null root pointer may also encode a null message, but it's not clear to me if that would be legal.

Andrew Lutomirski

Sep 30, 2013, 1:52:38 PM
to Dale Schumacher, capnproto
So this is something like:

struct Foo {
  a @0 :Bool;
  b @1 :Object;
}

This seems strange, at least in the context of your description.

--Andy


Dale Schumacher

Sep 30, 2013, 2:04:25 PM
to Andrew Lutomirski, capnproto
I think what I want is more like:

struct Answer {
  ok @0: Bool;
}

Answer can be true, false, or null.  Maybe that means I need the root pointer to be null.  Otherwise, I would need a union type to carry the null-pointer value.

Kenton Varda

Sep 30, 2013, 2:13:48 PM
to Dale Schumacher, Andrew Lutomirski, capnproto
Hi Dale,

If what you want is a "nullable boolean", you probably shouldn't be using the concept of a null pointer to represent it.  Instead, you could use an enum with three enumerants, or you could use a union of a Bool and a Void.

While Cap'n Proto technically has null pointers on the wire, they generally should be treated as equivalent to the default value for that field.  In fact, if you call getFoo() for a pointer field foo when the value is null, what you'll get back is actually foo's default value.  If foo does not declare a default, the "default default" is empty (for lists or blobs) or a struct with all fields set to their defaults (for structs).  There are a few motivations for this, with one of the main ones being security:  unexpected null pointers tend to lead to exceptions or (in C++) crashes.
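
The fallback rule described above can be pictured with a toy getter. This is a rough Python sketch of the semantics only, not of any real Cap'n Proto API: `get_pointer_field` and its parameters are hypothetical names, and a null pointer on the wire is modeled as `None`.

```python
def get_pointer_field(wire_value, declared_default=None, empty=""):
    """Return what a getter would hand back for a pointer field.

    wire_value is None when the pointer is null on the wire.
    """
    if wire_value is not None:
        return wire_value        # pointer is set: return the real value
    if declared_default is not None:
        return declared_default  # null pointer: use the field's declared default
    return empty                 # no declared default: the "default default"

print(get_pointer_field("hello"))                      # hello
print(get_pointer_field(None, declared_default="x"))   # x
print(repr(get_pointer_field(None)))                   # ''
```

The caller never sees a null, which is why a null pointer is a poor way to carry a third "no answer" state.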

-Kenton

Dale Schumacher

Sep 30, 2013, 4:03:05 PM
to Kenton Varda, Andrew Lutomirski, capnproto
Ok.  For comparison purposes, how would each value be encoded as an enum, and as a Bool/Void union?  Also, would they be encoded differently as part of a message versus as the entire message?  In other words, is their encoding in the root struct the same as their encoding in any other struct?  I'm looking to anchor my understanding in the actual bit-patterns on the wire, so pictures (or hex values) would be helpful.

Kenton Varda

Sep 30, 2013, 4:31:41 PM
to Dale Schumacher, Andrew Lutomirski, capnproto
On Mon, Sep 30, 2013 at 1:03 PM, Dale Schumacher <dale.sc...@gmail.com> wrote:
> Ok.  For comparison purposes, how would each value be encoded as an enum, and as a Bool/Void union?

Keep in mind that in order to have a wire encoding, you need to have a schema.  It doesn't make sense to ask how a primitive value is encoded as a message on the wire without talking about its representation in a particular struct or list; primitive values cannot be encoded independently of a larger object.

So we have these two possible definitions:

    struct NullableBool1 {
      trit @0 :Trit;
      enum Trit {
        null @0;
        false @1;
        true @2;
      }
    }

    struct NullableBool2 {
      union {
        null @0 :Void;
        bool @1 :Bool;
      }
    }

Message with NullableBool1 as root:
    null = 0x0000000100000000 0x0000000000000000
    false = 0x0000000100000000 0x0000000000000001
    true = 0x0000000100000000 0x0000000000000002

Message with NullableBool2 as root:
    null = 0x0000000100000000 0x0000000000000000
    false = 0x0000000100000000 0x0000000000000001
    true = 0x0000000100000000 0x0000000000010001

(The 16-bit union tag is encoded in bits 0-15, and the bool value (when active) is in bit 16.)
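
The six encodings above can be reproduced with a few lines of Python. This is a sketch under the assumptions stated in this post (one data word, enum or union tag in bits 0-15, Bool in bit 16); the function names are made up for illustration, and stream framing such as a segment table is left out, as in the hex listings.

```python
import struct

ROOT_POINTER = 0x0000000100000000  # struct pointer: offset 0, 1 data word, 0 pointers

def message_words(data_word):
    """Return the root pointer followed by the single data word of the root struct."""
    return [ROOT_POINTER, data_word]

def nullable_bool1(value):
    """Enum layout: the 16-bit enumerant index sits in bits 0-15."""
    trit = {None: 0, False: 1, True: 2}[value]
    return message_words(trit)

def nullable_bool2(value):
    """Union layout: 16-bit tag in bits 0-15, Bool in bit 16 when the tag is 1."""
    if value is None:
        return message_words(0)                   # tag 0 selects the Void member
    return message_words(1 | (int(value) << 16))  # tag 1 selects the Bool member

def to_wire(words):
    """Serialize the words as little-endian bytes, the order used on the wire."""
    return struct.pack("<%dQ" % len(words), *words)

for v in (None, False, True):
    print(v, ["0x%016x" % w for w in nullable_bool2(v)])
```

Note that the hex values in this thread are 64-bit integers; `to_wire` shows the byte order a reader would actually see, with the least significant byte first.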
 
> Also, would they be encoded differently as part of a message versus as the entire message?  In other words, is their encoding in the root struct the same as their encoding in any other struct?

No, of course not.  The root struct is encoded the same as any other.  Keep in mind that the examples above are showing the root pointer followed by the root struct value.  Normally, of course, the pointer would live elsewhere, and contain a non-zero offset.
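
To illustrate the non-zero offset case, here is a small Python sketch that builds a struct pointer word. It assumes the layout from the Cap'n Proto encoding documentation, where the offset counts words from the end of the pointer to the start of the struct; `struct_pointer` is a hypothetical helper name.

```python
def struct_pointer(offset_words, data_words, pointer_words):
    """Build a 64-bit struct pointer as an integer."""
    assert -(1 << 29) <= offset_words < (1 << 29)  # must fit in 30 signed bits
    low = (offset_words & 0x3FFFFFFF) << 2         # low 2 bits stay 0 for "struct"
    return low | (data_words << 32) | (pointer_words << 48)

print("0x%016x" % struct_pointer(0, 1, 0))  # 0x0000000100000000
print("0x%016x" % struct_pointer(2, 1, 0))  # 0x0000000100000008
```

The first word is the root pointer used in the examples above, where the struct immediately follows the pointer. The second points at the same kind of struct located two words further on; only the offset bits change, and the struct's own words are identical.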
 
> I'm looking to anchor my understanding in the actual bit-patterns on the wire, so pictures (or hex values) would be helpful.

Hmm.  One troublesome bit with this approach is that the algorithm which chooses field offsets is non-trivial (although usually intuitive).  Note that you can use "capnp compile -ocapnp" to ask the compiler to tell you what field offsets it will choose for a particular schema.

-Kenton

Andreas Stenius

Oct 1, 2013, 12:03:32 AM
to Dale Schumacher, capnproto
2013/9/30 Dale Schumacher <dale.sc...@gmail.com>
> [...]  I'm looking to anchor my understanding in the actual bit-patterns on the wire, so pictures (or hex values) would be helpful.

I've found implementing a reader (and later a writer) for capnp messages very enlightening, well beyond what I got from reading the documentation and trying to figure it out from there. In other words, hands-on has been king! ;)

Also, as Kenton mentioned, `capnpc -ocapnp <schema-file>` helps a lot when working with the bit patterns.

But I agree, a couple of pretty pictures along with the docs could ease the work here, too.. :)
