Re: Change in dart/sdk[master]: Experiment that adds Uint8List.transferrable constructor.

16 views
Skip to first unread message

Vyacheslav Egorov

unread,
Jan 10, 2019, 12:55:57 PM1/10/19
to Alexander Aprelev, dart-vm-compil...@google.com, vm-...@dartlang.org, Dart Reviews, Alexander Markov, Martin Kustermann, Ryan Macnak, Siva Annamalai
Hi,

I have been chatting with Martin about this and we think that it might be much safer if we stop neutering the array that is being transferred, and instead just keep it attached on the sending side. 

A lot of code has been written pervasively with the assumption that the length does not change - so we have hard time predicting whether or not we fix all of them. This introduces potential for hard to debug bugs and security issues. 

It seems much safer to simply introduce a reference count into the backing store of the transferrable array (which should probably be called shareble) and increment this reference count once we share it with another isolate.  

We can simply specify that concurrent accesses to the backing store have undefined behavior. 

To me this seems like a much safer approach: we get benefits of cheap memory transfer and at the same time we reduce possibility of bugs considerably, because old invariants still continue to hold.

Note that embedders were always free to create such "shared" arrays pointing to the same memory - so we don't really introduce something new that was not possible before.

What do you think?

// Vyacheslav Egorov


On Thu, Jan 10, 2019 at 6:11 PM Alexander Aprelev (Gerrit) <noreply-gerritcoderevie...@google.com> wrote:

This change is ready for review.

View Change

    To view, visit change 86640. To unsubscribe, or for help writing mail filters, visit settings.

    Gerrit-Project: sdk
    Gerrit-Branch: master
    Gerrit-Change-Id: I5f1b7f981afcd480839f46d0b081f26393d66629
    Gerrit-Change-Number: 86640
    Gerrit-PatchSet: 17
    Gerrit-Owner: Alexander Aprelev <a...@google.com>
    Gerrit-Reviewer: Alexander Aprelev <a...@google.com>
    Gerrit-Reviewer: Ryan Macnak <rma...@google.com>
    Gerrit-Reviewer: Siva Annamalai <as...@google.com>
    Gerrit-Reviewer: Vyacheslav Egorov <veg...@google.com>
    Gerrit-CC: Alexander Markov <alexm...@google.com>
    Gerrit-CC: Martin Kustermann <kuste...@google.com>
    Gerrit-Comment-Date: Thu, 10 Jan 2019 17:11:42 +0000
    Gerrit-HasComments: No
    Gerrit-Has-Labels: No
    Gerrit-MessageType: comment

    Alexander Aprelev

    unread,
    Jan 10, 2019, 1:17:27 PM1/10/19
    to Vyacheslav Egorov, dart-vm-compil...@google.com, vm-...@dartlang.org, Dart Reviews, Alexander Markov, Martin Kustermann, Ryan Macnak, Siva Annamalai
    Hi,

      this sounds like a hard choice: leaving user with undefined behavior for concurrent access versus chasing (but hopefully ultimately fixing) bugs related to assumption that Uint8List length remains constant.

      Thanks for sharing your opinion on this. I will try to prototype your proposal and see what it looks like.

    Alex.

    Alexander Aprelev

    unread,
    Jan 10, 2019, 2:00:34 PM1/10/19
    to Vyacheslav Egorov, dart-vm-compil...@google.com, vm-...@dartlang.org, Dart Reviews, Alexander Markov, Martin Kustermann, Ryan Macnak, Siva Annamalai
    Briefly chatted with Siva and Ryan about this too, they think this undefined behavior is really undesirable.
    Siva's counterproposal is to keep length of the sender's object intact, yes, but then re-point source object to fake(newly allocated, for example) buffer. This solves undefined behavior, but downside is that for large transferrable data(think multi-megabyte images), we would have to allocate buffers on the sender's side for them. Potentially we could reuse single (large enough) buffer on a sender's side for all transferred out arrays. We would reference-count it and keep allocating larger buffers if existing buffer is not large enough.

    Alex.

    Vyacheslav Egorov

    unread,
    Jan 11, 2019, 4:35:47 AM1/11/19
    to Ryan Macnak, Alexander Aprelev, dart-vm-compil...@google.com, vm-...@dartlang.org, Dart Reviews, Alexander Markov, Martin Kustermann, Siva Annamalai
    I invite everybody to think about this as problem where we 
    need to strike balance between "purity" and "practicality". 

    From the purity perspective - yes, it would certainly be nice not to 
    create a shared memory situation between two Dart isolates. 

    From the practicality perspective - this is already something that can 
    be created by an embedder and makes our implementation simpler.

    Thus I think we should really concentrate on practical side to see pros 
    and cons, instead of concentrating on whether or not this adheres 
    to the letter of the spec which we can influence.

    I have already listed benefits. (Note that sharing semantics is even 
    implementable on the Web platform to a certain degree using SharedArrayBuffers - 
    though they are currently disabled everywhere except Chrome 
    configurations that use site-isolation).

    One drawback I became aware of is that receiving side might need to be 
    on guard against concurrent modifications. This might lead to a issue 
    (depending on how these buffers are used) if receiving end does
    not expect shared memory access. 

    I don't think there are any other practical drawbacks. 

    Now we have several solutions to choose from:

    (A) transferrable that moves ownership of the backing store to receiver and 
        neuters length in the sender (has potential to introduce bugs in VM itself)
    (B) transferrable that shares backing store (has potential to introduce bugs in 
        user and embedder code)
    (C) transferrable that moves ownership of the backing store to receiver and 
        replaces backing store with a dummy backing store in the sender. (seems 
        in general safe, but might temporary double the amount of memory).

    I also realized that we have a 4th variant which is a variant of (A) with additional
    safety check:

    (D) transferrable that moves ownership of the backing store to receiver and 
        neuters pointer to the backing store in the sender - and we change [] and []= 
        operations to throw when backing store is null. (seems safer than neutering just 
        length, but incurs additional compare-and-branch and through per access).

    // Vyacheslav Egorov



    On Fri, Jan 11, 2019 at 2:26 AM Ryan Macnak <rma...@google.com> wrote:
    On Thu, Jan 10, 2019 at 9:55 AM Vyacheslav Egorov <veg...@google.com> wrote:
    Hi,

    I have been chatting with Martin about this and we think that it might be much safer if we stop neutering the array that is being transferred, and instead just keep it attached on the sending side. 

    A lot of code has been written pervasively with the assumption that the length does not change - so we have hard time predicting whether or not we fix all of them. This introduces potential for hard to debug bugs and security issues. 

    It seems much safer to simply introduce a reference count into the backing store of the transferrable array (which should probably be called shareble) and increment this reference count once we share it with another isolate.  

    We can simply specify that concurrent accesses to the backing store have undefined behavior.

    Isolates are specified as communicating only via message passing. An implementation of typed data that creates a shared memory communication channel is not permitted.
     
    To me this seems like a much safer approach: we get benefits of cheap memory transfer and at the same time we reduce possibility of bugs considerably, because old invariants still continue to hold.

    Note that embedders were always free to create such "shared" arrays pointing to the same memory - so we don't really introduce something new that was not possible before.

    While it has been possible for an embedder to create a shared array because the VM does not check, I maintain this a bug in the embedder. The embedder is also responsible for maintaining the actor model, both for typed data like this and not exposing global state through native functions.

    Vyacheslav Egorov

    unread,
    Jan 15, 2019, 1:27:28 PM1/15/19
    to change...@dart-review.googlesource.com, Alexander Aprelev, dart-vm-compil...@google.com, vm-...@dartlang.org, Dart Reviews, Alexander Markov, Martin Kustermann, Ryan Macnak, Siva Annamalai, commi...@chromium.org
    Hi Alex, 

    Yes and you would also need to make similar change to LoadUntaggedInstr code pattern so that optimized code handles it correctly as well. 

    // Vyacheslav Egorov


    On Tue, Jan 15, 2019 at 6:01 AM Alexander Aprelev (Gerrit) <noreply-gerritcoderevie...@google.com> wrote:

    Slava,

    is https://dart-review.googlesource.com/c/sdk/+/86640/1..25/runtime/lib/typed_data.cc#206 what you had in mind regarding adding a check for null-pointer typed data?

    View Change

      To view, visit change 86640. To unsubscribe, or for help writing mail filters, visit settings.

      Gerrit-Project: sdk
      Gerrit-Branch: master
      Gerrit-Change-Id: I5f1b7f981afcd480839f46d0b081f26393d66629
      Gerrit-Change-Number: 86640
      Gerrit-PatchSet: 26
      Gerrit-Owner: Alexander Aprelev <a...@google.com>
      Gerrit-Reviewer: Alexander Aprelev <a...@google.com>
      Gerrit-Reviewer: Ryan Macnak <rma...@google.com>
      Gerrit-Reviewer: Siva Annamalai <as...@google.com>
      Gerrit-Reviewer: Vyacheslav Egorov <veg...@google.com>
      Gerrit-CC: Alexander Markov <alexm...@google.com>
      Gerrit-CC: Martin Kustermann <kuste...@google.com>
      Gerrit-Comment-Date: Tue, 15 Jan 2019 05:01:02 +0000

      Alexander Aprelev

      unread,
      Jan 17, 2019, 9:42:46 AM1/17/19
      to Vyacheslav Egorov, change...@dart-review.googlesource.com, dart-vm-compil...@google.com, vm-...@dartlang.org, Dart Reviews, Alexander Markov, Martin Kustermann, Ryan Macnak, Siva Annamalai, commi...@chromium.org
      Hi Slava,

      On Tue, Jan 15, 2019 at 10:27 AM Vyacheslav Egorov <veg...@google.com> wrote:
      Hi Alex, 

      Yes and you would also need to make similar change to LoadUntaggedInstr code pattern so that optimized code handles it correctly as well. 

      Okay, got you. Thanks. https://dart-review.googlesource.com/c/sdk/+/86640/26..29 has those changes. 
      I don't see anything outstanding on golem run for those changes.

      Alex.

      Alex Tatumizer

      unread,
      Jun 2, 2019, 10:07:08 PM6/2/19
      to Dart VM Developers, veg...@google.com, change...@dart-review.googlesource.com, dart-vm-compil...@google.com, rev...@dartlang.org, alexm...@google.com, kuste...@google.com, rma...@google.com, as...@google.com, commi...@chromium.org
      Reply all
      Reply to author
      Forward
      0 new messages