fuchsia.io2 migration plan

Adam Barth

Sep 16, 2021, 2:21:08 AM
to eng-council-discuss, discuss
# Summary

We are doing a soft, in-place transition of Fuchsia's core IO protocols from fuchsia.io to fuchsia.io2 (see Issue 77623).  Unless you work directly with the IO protocol, no action is required on your part.  This message is purely informational and hopefully useful as context in case you run into some dust from the transition.

# Approach

The general approach is as follows:

(1) Teach all the fuchsia.io servers to respond to both the io1 and io2 messages.
(2) Migrate all the fuchsia.io clients to always send io2 messages.
(3) Remove the ability for fuchsia.io servers to respond to io1 messages.
(4) Cleanup.

One complication to this plan is that every Fuchsia component is both a fuchsia.io client and a fuchsia.io server because fuchsia.io is the protocol components use to access services from other components and to publish their own services.  In particular, the transition plan needs to handle the full matrix of (in-tree, out-of-tree) x (client, server).

Another complication is that some services *forward* messages from their clients to other fuchsia.io servers.  When forwarding a message, we do not want to translate between io1 and io2 because we do not want to risk encountering semantic gaps in the translation.

To resolve the first complication, we will need to wait six weeks between step (1) and step (2) as well as another six weeks between step (2) and step (3).  Waiting six weeks between step (1) and step (2) is necessary to ensure that all the prebuilt binaries know how to respond to io2 messages before we attempt to send them any io2 messages.  Waiting six weeks between step (2) and step (3) is necessary to ensure that none of the prebuilt binaries are sending io1 messages to servers that no longer know how to respond to them.
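
As a rough illustration of why the waiting periods matter, here is a toy model in Python.  It is only a sketch: the phase numbers, the `SERVER_ACCEPTS` and `CLIENT_SENDS` tables, and both helper functions are made up for this example and are not part of any Fuchsia tooling.  The model assumes a binary built during a given phase keeps behaving that way until it is rebuilt.

```python
# Hypothetical model of the four-step plan.  Phase 0 is the state before the
# migration; phases 1-3 are the states after steps (1)-(3).
SERVER_ACCEPTS = {
    0: {"io1"},
    1: {"io1", "io2"},  # step (1): servers answer both generations
    2: {"io1", "io2"},  # step (2): only the clients change
    3: {"io2"},         # step (3): servers drop io1
}
CLIENT_SENDS = {0: "io1", 1: "io1", 2: "io2", 3: "io2"}


def compatible(client_phase, server_phase):
    """True if a client built in client_phase is understood by a server
    built in server_phase."""
    return CLIENT_SENDS[client_phase] in SERVER_ACCEPTS[server_phase]


def skew_is_safe(max_skew):
    """True if every client/server pair whose build phases differ by at most
    max_skew can still talk to each other."""
    phases = sorted(SERVER_ACCEPTS)
    return all(
        compatible(c, s)
        for c in phases
        for s in phases
        if abs(c - s) <= max_skew
    )
```

In this model, `skew_is_safe(1)` holds but `skew_is_safe(2)` does not: a client that already sends io2 cannot talk to a prebuilt server from before step (1), and a client that still sends io1 cannot talk to a server from after step (3).  The waiting periods are there to keep prebuilt binaries within one phase of everything else.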

To resolve the second complication, servers that receive io1 messages will forward them as io1 messages.  Servers that receive io2 messages will forward them as io2 messages.  The io2 forwarding code should remain dormant until the first client begins sending io2 messages.  At that point, all the servers, including the server receiving the forwarded message, should know how to respond to io2 messages.
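
The forwarding rule can be sketched in a few lines of Python.  This is a minimal model under stated assumptions, not the real server code: `Backend`, `Forwarder`, and the `handle` signature are hypothetical, and messages are reduced to a method name plus a byte count.

```python
class Backend:
    """Hypothetical server that has completed step (1): it answers both
    generations of Read."""

    def handle(self, method, count):
        data = b"x" * count
        if method == "Read":
            # io1 shape: the status travels next to the payload
            # (0 stands in for ZX_OK here).
            return (0, data)
        if method == "Read2":
            # io2 shape: the payload alone; failures are reported separately.
            return data
        raise ValueError("unknown method: " + method)


class Forwarder:
    """Hypothetical forwarding service.  It passes the method through
    unchanged and never translates between io1 and io2."""

    def __init__(self, backend):
        self.backend = backend
        self.forwarded = []  # record of what was sent downstream

    def handle(self, method, count):
        self.forwarded.append(method)
        return self.backend.handle(method, count)
```

Because the forwarder never rewrites the method, its io2 path is only exercised once some client actually sends an io2 message, which matches the "dormant until the first client" behavior described above.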

To reduce the size of the transition, we will likely transition the File and Directory protocols in separate waves.  These waves can overlap because clients and servers won't confuse File messages for Directory messages (or vice versa).  All told, the transition period will take a number of calendar months.

# Current status

We have executed this complete lifecycle for fuchsia.io/Directory.Unlink, proving the concept behind the migration.  We are now scaling up the migration to the rest of the messages.  In particular, the FIDL definitions of fuchsia.io/Node and fuchsia.io/File now declare both the io1 and io2 messages.  We are in the process of teaching the servers how to respond to these messages.

This intermediate state is delicate because not all the servers know how to respond to all of the declared messages.  For example, if you are writing directly to the fuchsia.io interfaces, you might wonder whether you should send a Read or a Read2 message.  If you find yourself in this situation, please reach out to me for advice.  Hopefully, very few people will be in this situation because most people interact with fuchsia.io through libraries.

# History

This section provides historical context for the io2 migration.  Feel free to stop reading this email now if you're not interested in this history.

Originally, Fuchsia used a suite of protocols called RemoteIO (RIO) to perform IO.  This protocol was defined manually using a C structure.  All the messages in the protocol conformed to the same C structure, whose fields were overloaded with different semantics for different protocol messages.

At the time, FIDL used a much less efficient wire format derived from the Mojo IPC system used by Chromium.  In order to unify RIO and FIDL, we completely redesigned FIDL to be performance-competitive with manually defined C structures.  This project was called FIDL2 and is the basis for the FIDL we use today in the system.

The very first protocol defined in FIDL2 was fuchsia.io.  We used this protocol definition to compare directly against RIO, particularly for performance.  After we migrated the system from FIDL1 to FIDL2, we migrated the IO system from RIO to FIDL2.

In the intervening time, FIDL has matured significantly.  Unfortunately, because the current IO protocol was designed at the inception of FIDL2, the protocol does not take advantage of that maturity.  Specifically, io1 has two major shortcomings:

(1) Methods return error values directly rather than using the "error" feature in FIDL, violating the FIDL rubric:

    Read(struct {
        count uint64;
    }) -> (struct {
        s zx.status;
        data vector<uint8>:MAX_BUF;
    });

For example, the Rust bindings for these messages do not use the Result type to allow for idiomatic error handling.

(2) The reflection mechanism (fuchsia.io/Node.Describe) is built around structs rather than tables.  Using structs means this mechanism is difficult to evolve.  For example, the FileObject struct is used to describe a File.  At some point, we added a zx.handle:<STREAM> to the FileObject struct to facilitate faster IO.  This change was possible only because the FileObject struct happened to have a "gap" that we could abuse to hold the new handle.  Now that we have filled that gap, we can never add any other information to the description of a File.

To resolve these issues (and clean up some other details in the protocol), we defined the fuchsia.io2 protocol.  To resolve (1), we started using the error syntax in the normal way:

    Read2(struct {
        count uint64;
    }) -> (struct {
        data Transfer;
    }) error zx.status;

To resolve (2), we define FileInfo, the io2 equivalent of FileObject, as a table.  In FIDL, tables are extensible, which means we can continue to evolve the description of a File in Fuchsia without breaking the ABI.
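
To make the struct-versus-table point concrete, here is a toy Python comparison.  It is deliberately not the FIDL wire format: the fixed layout, the `(ordinal, value)` pairs, and the field names in `KNOWN_FIELDS` are placeholders invented for this sketch, not the real members of FileObject or FileInfo.

```python
import struct

# Toy fixed layout standing in for a struct: exactly two little-endian u32s.
OLD_LAYOUT = struct.Struct("<II")


def decode_struct(payload):
    """Decode a fixed-layout payload.  struct.Struct.unpack raises
    struct.error unless the payload is exactly OLD_LAYOUT.size bytes."""
    return OLD_LAYOUT.unpack(payload)


# Toy table: each field is identified by an ordinal.  Names are placeholders.
KNOWN_FIELDS = {1: "is_append", 2: "observer"}


def decode_table(entries):
    """Decode a list of (ordinal, value) pairs, skipping ordinals this
    decoder does not know about."""
    return {KNOWN_FIELDS[o]: v for o, v in entries if o in KNOWN_FIELDS}
```

In this sketch, an old decoder that receives a three-field payload fails in `decode_struct` because the size no longer matches, while `decode_table` simply ignores the ordinal it does not recognize.  That is the property that lets a table-based description gain fields without breaking existing binaries.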

# Conclusion

If all goes well, just like FIDL2 is just called FIDL, fuchsia.io2 will just be called fuchsia.io.  We have been planning this transition for long enough that the io1 messages actually use binary ordinals derived from the io1 name.  For example, the binary name for the Read method today is actually hash("fuchsia.io1/File.Read").  When the transition is complete, the binary name for the io2 Read method will be hash("fuchsia.io/File.Read").
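
For readers curious what hash(...) might look like, here is a small Python sketch.  The message above does not say which hash is used, so this assumes the scheme FIDL documents for method ordinals: SHA-256 of the fully qualified method name, the first 8 bytes read as a little-endian integer, with the top bit cleared.  The function name `method_ordinal` is made up for this example.

```python
import hashlib


def method_ordinal(fully_qualified_name):
    """Derive a 63-bit ordinal from a name such as 'fuchsia.io/File.Read'.

    Assumption: SHA-256, first 8 bytes little-endian, top bit cleared.
    """
    digest = hashlib.sha256(fully_qualified_name.encode("ascii")).digest()
    return int.from_bytes(digest[:8], "little") & 0x7FFFFFFFFFFFFFFF


io1_read = method_ordinal("fuchsia.io1/File.Read")
io2_read = method_ordinal("fuchsia.io/File.Read")
```

Because the ordinal depends only on the name string, the io1 and io2 Read methods get different ordinals, which is what lets both coexist on the wire during the transition and lets io2 take over the plain fuchsia.io name afterwards.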

In the meantime, we will be in a slightly fragile state for a few months while we complete all the soft transitions.  Hopefully, once we've migrated to io2, we should be able to use the modern extensibility features of FIDL to evolve the core IO protocols smoothly for a long time, avoiding the need for an io3.

Please do not hesitate to reach out to me if you have any questions or concerns.

Adam

Adam Barth

Jan 12, 2022, 4:05:42 PM
to eng-council-discuss, discuss
We've begun step (2) below.  Last week, clients started sending Close2.  On Monday, clients started sending Sync2.  Hopefully by the end of the day, clients will start sending Read2, ReadAt2, Write2, and WriteAt2.

In principle, these changes should not cause any disruption.  If you run into any issues, please let me know.

Adam