Mapping CML data to binary CM


Dave Smith

Jan 6, 2022, 12:26:18 PM
to component-framework-dev, fidl-dev
Hi Folks -

I'm going through an exercise to better understand the compiled format of binary (.cm) component manifests. Basically, I'm looking to map the fields from a basic CML like hello_world to the corresponding sections in the binary file.

After looking through the ComponentDecl reference and FIDL wire format spec, I have a couple of questions that I'm hoping the experts can straighten out:
  1. The manifest seems to be the encode_persistent() form of the Component table, which means the first 8 bytes of the file are the metadata, correct? Are there additional bytes here before the header of the table begins?
  2. The wire format docs indicate that a table contains a header followed by a vector of envelopes. Does the vector also have its own header, or are they one and the same?
  3. The wire format spec shows an ellipsized ("...") block between the header and envelopes. What is this meant to represent? Shouldn't the vector always follow the header?
Thanks in advance!
Dave Smith | Developer Relations | smit...@google.com | @devunwired

Dave Smith

Jan 6, 2022, 12:55:54 PM
to Pascal, Mitchell Kember, component-framework-dev, fidl-dev
Thanks Pascal! That's very helpful.

One quick follow-up about ordering: since the manifest seems to be encoded, it looks like all pointers are replaced by 0xFFFFFFFF. Does that mean that the out-of-line contents should follow each pointer? Or are they all just grouped together in a fixed order (i.e. all envelopes first, followed by each content section)?

Cheers,

Dave Smith | Developer Relations | smit...@google.com | @devunwired
On Thu, Jan 6, 2022 at 10:38 AM Pascal <pasca...@google.com> wrote:
  1. The manifest seems to be the encode_persistent() form of the Component table, which means the first 8 bytes of the file are the metadata, correct? Are there additional bytes here before the header of the table begins?
Yes, that seems to be the case. +Mitchell Kember can confirm.

  2. The wire format docs indicate that a table contains a header followed by a vector of envelopes. Does the vector also have its own header, or are they one and the same?
A table is a vector of envelopes, so the header of a table is the header of a vector, i.e. a size and a pointer.

  3. The wire format spec shows an ellipsized ("...") block between the header and envelopes. What is this meant to represent? Shouldn't the vector always follow the header?
The out-of-line data does not directly follow the inline data because of the traversal order: https://fuchsia.dev/fuchsia-src/reference/fidl/language/wire-format#traversal-order

If you have a struct { string; vector<int32>; }, the inline portion will contain:
  • header of string
  • header of vector
Then the out-of-line blocks will be:
  1. data for string
  2. data for vector
Hence the header of the vector is not adjacent to its out-of-line block, but one block away (the string's out-of-line data sits in between).
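
To make the struct example concrete, here is a minimal Python sketch that hand-encodes struct { string; vector<int32>; } and prints it in 8-byte rows. It is an illustration, not FIDL tooling: the function names are made up, and it assumes what the wire format spec describes for strings and vectors (a 16-byte header holding a 64-bit count and a 64-bit pointer slot, with present pointers written as all ones, and out-of-line objects padded to 8 bytes).

```python
import struct

# Assumption: in encoded form a present pointer is written as all ones (64 bits).
PRESENT = 0xFFFFFFFFFFFFFFFF


def pad8(data: bytes) -> bytes:
    """Pad an out-of-line object with zeros up to the next 8-byte boundary."""
    return data + b"\x00" * (-len(data) % 8)


def encode_example(text: str, numbers: list) -> bytes:
    """Hand-encode struct { string; vector<int32>; } (hypothetical helper)."""
    raw = text.encode("utf-8")
    # Inline portion: header of string, then header of vector (16 bytes each).
    inline = struct.pack("<QQ", len(raw), PRESENT)
    inline += struct.pack("<QQ", len(numbers), PRESENT)
    # Out-of-line portion, in traversal order: string data, then vector data.
    string_data = pad8(raw)
    vector_data = pad8(struct.pack(f"<{len(numbers)}i", *numbers))
    return inline + string_data + vector_data


encoded = encode_example("hello", [1, 2, 3])
for offset in range(0, len(encoded), 8):
    print(f"{offset:#04x}: {encoded[offset:offset + 8].hex(' ')}")
```

In the dump, the vector header sits at offset 0x10 while its data only starts at 0x28, after the padded string bytes at 0x20.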

hth

Mitchell Kember

Jan 6, 2022, 1:01:55 PM
to Pascal, Dave Smith, component-framework-dev, fidl-dev
(Confirming that yes, there are only 8 bytes of metadata, and the table header immediately follows that.)
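
A small Python sketch of what that confirmation implies when reading a file: skip 8 bytes, then read the table header as a vector header. The helper name and the synthetic input are hypothetical, the 8 metadata bytes are deliberately left uninterpreted, and the 64-bit count plus 64-bit presence marker layout is my reading of the wire format spec rather than anything cmc-specific.

```python
import struct

# Assumption: a present pointer is encoded as all ones in its 64-bit slot.
PRESENT = 0xFFFFFFFFFFFFFFFF


def split_manifest(blob: bytes):
    """Split a persistently encoded table into metadata, header fields, and body.

    Hypothetical helper: returns (metadata, max_ordinal, is_present, body).
    """
    metadata = blob[:8]  # the 8 bytes of metadata, left opaque here
    max_ordinal, presence = struct.unpack_from("<QQ", blob, 8)
    body = blob[24:]     # envelopes and their contents follow the header
    return metadata, max_ordinal, presence == PRESENT, body


# Synthetic input for illustration only, not a real .cm file.
sample = bytes(8) + struct.pack("<QQ", 3, PRESENT) + b"\xaa" * 16
metadata, max_ordinal, is_present, body = split_manifest(sample)
print(metadata.hex(" "), max_ordinal, is_present, len(body))
```

For a real manifest, the same function could be fed the bytes read from the .cm file in binary mode.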

On Thu, Jan 6, 2022 at 12:59 PM Pascal <pasca...@google.com> wrote:
The out-of-line contents are laid out in traversal order, i.e. a depth-first traversal of the types.

You start with the first type and write its inline portion, then traverse the first element of this type. If it has an out-of-line portion, that gets written next. Then you move on to the next element, and so on.
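
The depth-first ordering is easiest to see with one level of nesting. The Python sketch below hand-encodes a hypothetical struct { vector<string> names; string tail; }. All names are illustrative, and the 16-byte string/vector headers and 8-byte padding are assumptions taken from the wire format spec.

```python
import struct

# Assumption: a present pointer is encoded as all ones in its 64-bit slot.
PRESENT = 0xFFFFFFFFFFFFFFFF


def pad8(data: bytes) -> bytes:
    """Pad an out-of-line object with zeros up to the next 8-byte boundary."""
    return data + b"\x00" * (-len(data) % 8)


def string_header(raw: bytes) -> bytes:
    """Inline part of a string: byte length plus presence marker."""
    return struct.pack("<QQ", len(raw), PRESENT)


def encode_nested(names: list, tail: str) -> bytes:
    """Hand-encode struct { vector<string> names; string tail; } (hypothetical)."""
    raws = [name.encode("utf-8") for name in names]
    tail_raw = tail.encode("utf-8")
    out = struct.pack("<QQ", len(raws), PRESENT)       # inline: header of names
    out += string_header(tail_raw)                     # inline: header of tail
    # Traverse names first: its body (one string header per element) ...
    out += b"".join(string_header(raw) for raw in raws)
    # ... then everything reachable from that body, in element order.
    out += b"".join(pad8(raw) for raw in raws)
    # Only after names is fully written does the traversal reach tail.
    out += pad8(tail_raw)
    return out


encoded = encode_nested(["ab", "cde"], "z")
for offset in range(0, len(encoded), 8):
    print(f"{offset:#04x}: {encoded[offset:offset + 8].hex(' ')}")
```

The bytes of tail come last even though its header is the second inline field, because the whole subtree under names is written before the traversal moves on.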

Pascal

Jan 6, 2022, 3:17:57 PM
to Dave Smith, Mitchell Kember, component-framework-dev, fidl-dev
For those watching at home, Dave and I had a quick chat to clarify something.

A table is essentially a vector<envelope>, and each envelope has an inline portion and an out-of-line portion.

As a result, when looking at a table, say table { 1: int64 }, you will have:
  • header of vector, i.e. size + data ptr
  • envelopes, here one envelope, only the inline portion
  • content of envelope 1: here the int64
So the actual content is two hops away from the table's inline portion, not one hop away.
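
The three pieces can be sketched in Python as follows. This assumes the older 16-byte envelope layout (32-bit byte count, 32-bit handle count, 64-bit presence marker), which is my understanding of the v1 format; the function name is hypothetical and the sizes should be checked against the spec for the format version actually in use.

```python
import struct

# Assumption: a present pointer is encoded as all ones in its 64-bit slot.
PRESENT = 0xFFFFFFFFFFFFFFFF


def encode_table_v1(value: int) -> bytes:
    """Hand-encode table { 1: int64 } with an assumed v1 (16-byte) envelope."""
    # Table inline portion: vector header (max ordinal 1, presence marker).
    header = struct.pack("<QQ", 1, PRESENT)
    # First hop: the envelope for ordinal 1 (num_bytes, num_handles, presence).
    envelope = struct.pack("<IIQ", 8, 0, PRESENT)
    # Second hop: the envelope content, here the int64 itself.
    content = struct.pack("<q", value)
    return header + envelope + content


encoded = encode_table_v1(42)
for offset in range(0, len(encoded), 8):
    print(f"{offset:#04x}: {encoded[offset:offset + 8].hex(' ')}")
```

The value only appears at offset 0x20, after both the table header (0x00) and the envelope (0x10).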

Dave Smith

Jan 6, 2022, 4:59:47 PM
to Pascal, Mitchell Kember, component-framework-dev, fidl-dev
Thanks for all the help, everyone. I was successful in manually mapping the files for a couple of different examples.

Ultimately, the main blocker was that the persistent encoding used by cmc seems to use the v1 FIDL wire format, while the wire format spec was recently updated (fxrev.dev/573901) to describe only the v2 format. This led to confusion about things like envelope sizes while parsing the file.

Is manifest compilation the only case we have that uses this older wire format?
If so, perhaps we should update cmc?
If not, should we investigate updating the reference docs to describe both formats?


Cheers,
Dave Smith | Developer Relations | smit...@google.com | @devunwired

Pascal

Jan 6, 2022, 5:17:11 PM
to Dave Smith, Benjamin Prosnitz, Mitchell Kember, component-framework-dev, fidl-dev
Is manifest compilation the only case we have that uses this older wire format?
If so, perhaps we should update cmc?

Yes, we should update cmc to use the updated format, which has 8-byte envelopes. I will ping you on chat to coordinate.
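
For anyone comparing the two envelope sizes while reading a file by hand, here is a rough Python sketch. The field layouts are assumptions based on my reading of the wire format documentation (v1: 32-bit byte count, 32-bit handle count, 64-bit presence marker; v2: 32-bit byte count, 16-bit handle count, 16-bit flags), the helper names are hypothetical, and only an envelope whose payload is stored out of line is covered.

```python
import struct

# Assumption: a present pointer is encoded as all ones in its 64-bit slot.
PRESENT = 0xFFFFFFFFFFFFFFFF


def envelope_v1(num_bytes: int, num_handles: int = 0) -> bytes:
    """Assumed v1 envelope: u32 num_bytes, u32 num_handles, u64 presence."""
    return struct.pack("<IIQ", num_bytes, num_handles, PRESENT)


def envelope_v2(num_bytes: int, num_handles: int = 0) -> bytes:
    """Assumed v2 envelope (out-of-line payload): u32 num_bytes, u16 num_handles, u16 flags."""
    flags = 0  # assumed: 0 means the payload is stored out of line
    return struct.pack("<IHH", num_bytes, num_handles, flags)


# Envelope for an 8-byte payload such as an int64, in each format.
print("v1:", len(envelope_v1(8)), "bytes:", envelope_v1(8).hex(" "))
print("v2:", len(envelope_v2(8)), "bytes:", envelope_v2(8).hex(" "))
```

Under these assumptions, a parser that expects one size and meets the other ends up misaligned by 8 bytes for every envelope.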