Hi Slava, wanted to share with you what I have so far to check whether we are on the same page regarding the approach for dart implementation of isolate api.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
// TODO(aam): Ensure isolate group doesn't disappear after we exit,It should not disappear because we have a thread stack which uses it. That would be a serious bug if that could happen. We should block any sort of graceful shutdown for a group if there are threads like this.
Thread::ExitIsolateGroupAsMutator(/*bypass_safepoint=*/false);It's a bit strange that we need to exit isolate group and then re-enter - especially given that we are creating a new isolate within the same isolate group.
I think we should not require exiting the whole group to create an isolate in it - though we should exit the current isolate.
class BorrowScope {nit: `: public ValueObject`.
nit: consider `IsolateBorrowScope`.
Also this class needs a comment explaining what it does and what expectations we should have about its behavior.
Put this class into anonymous namespace if it is only used in this file.
DEFINE_NATIVE_ENTRY(Isolate_shutdown_, 0, 1) {I wonder if we should call this `shutdownSync`?
DEFINE_NATIVE_ENTRY(Isolate_runEventLoopSync_, 0, 1) {nit: We should make sure that we have tests covering situations where mutator slots are exhausted - the code to acquire target isolate should still continue to work correctly and we should not deadlock
String::Handle(String::New("Failed to enter isolate in time"));Should this be "Reached timeout while trying to enter isolate"?
Dart_Handle result_dart_handle = Dart_RunLoop();`Dart_RunLoop` actually runs loop on thread pool, while `runEventLoopSync` is intentionally specified to run on the current thread.
(Please also add a test to cover this requirement).
case Isolate::kYieldMutatorMsg: {If this is an OOB message then it can be processed in the middle of the other normal message because we schedule an interrupt. I think this is problematic because it allows "unnatural" interleaving of Dart code. I would have expected that we send a normal priority message instead.
We probably want a test which checks that this does not happen.
Thread* thread = nullptr;Should this just be
```
auto thread = suspended_thread;
if (thread != nullptr) {
// ...
}
```
/// Execute the given function in the context of the given isolate.nit: please take doc comments from the latest version off the proposal, I made updates based on Lasse's comments.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
thank you for taking a look, Slava!
// TODO(aam): Ensure isolate group doesn't disappear after we exit,It should not disappear because we have a thread stack which uses it. That would be a serious bug if that could happen. We should block any sort of graceful shutdown for a group if there are threads like this.
Yes, makes sense.
Thread::ExitIsolateGroupAsMutator(/*bypass_safepoint=*/false);It's a bit strange that we need to exit isolate group and then re-enter - especially given that we are creating a new isolate within the same isolate group.
I think we should not require exiting the whole group to create an isolate in it - though we should exit the current isolate.
We are exiting isolate group as mutator. I don't think we can be expected to enter isolate (as mutator), which we do when we create new isolate, while still staying as mutator in isolate group, even if isolate is in the same group.
nit: `: public ValueObject`.
nit: consider `IsolateBorrowScope`.Also this class needs a comment explaining what it does and what expectations we should have about its behavior.
Put this class into anonymous namespace if it is only used in this file.
Done
I wonder if we should call this `shutdownSync`?
Done
nit: We should make sure that we have tests covering situations where mutator slots are exhausted - the code to acquire target isolate should still continue to work correctly and we should not deadlock
Done
String::Handle(String::New("Failed to enter isolate in time"));Should this be "Reached timeout while trying to enter isolate"?
Done
`Dart_RunLoop` actually runs loop on thread pool, while `runEventLoopSync` is intentionally specified to run on the current thread.
(Please also add a test to cover this requirement).
Ah, I see. Let me work on it a separate cl.
If this is an OOB message then it can be processed in the middle of the other normal message because we schedule an interrupt. I think this is problematic because it allows "unnatural" interleaving of Dart code. I would have expected that we send a normal priority message instead.
We probably want a test which checks that this does not happen.
Done
Should this just be
```
auto thread = suspended_thread;
if (thread != nullptr) {
// ...
}
```
Done
/// Execute the given function in the context of the given isolate.nit: please take doc comments from the latest version off the proposal, I made updates based on Lasse's comments.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
DEFINE_NATIVE_ENTRY(Isolate_runSync_, 1, 3) {I think this needs to handle the following cases better:
All of this needs correponding tests.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
DEFINE_NATIVE_ENTRY(Isolate_runSync_, 1, 3) {I think this needs to handle the following cases better:
- We are doing `runSync` on the current isolate - should just happen.
- We are dong `runSync` on an isolate which is pinned to the current thread - should just happen.
- We are doing `runSync` on the isolate which is pinned to some other thread - should error
- We are doing `runSync` on an isolate which does not have event loop running - we need to decide what this means, but probably it means we need to acquire its ownership, run the code and then release ownership.
All of this needs correponding tests.
Is there something particular you have in mind regarding handling those cases better?
Pinned-functionality is not implemented yet, it's not part of this CL.
Otherwise case 1 is tested by testRunRunOnCurrentIsolate, case 4 - by testRunOnCooperatingIsolate, testRunOnNonCooperatingIsolate in threading_test.dart.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
I would like somebody else (in addition to me) to look at this - because it is very subtle code.
I have kinda persuaded myself that this implementation looks reasonable, but maybe we need more eyes.
So I am adding @rma...@google.com as a reviewer.
DEFINE_NATIVE_ENTRY(Isolate_runSync_, 1, 3) {Alexander AprelevI think this needs to handle the following cases better:
- We are doing `runSync` on the current isolate - should just happen.
- We are dong `runSync` on an isolate which is pinned to the current thread - should just happen.
- We are doing `runSync` on the isolate which is pinned to some other thread - should error
- We are doing `runSync` on an isolate which does not have event loop running - we need to decide what this means, but probably it means we need to acquire its ownership, run the code and then release ownership.
All of this needs correponding tests.
Is there something particular you have in mind regarding handling those cases better?
Pinned-functionality is not implemented yet, it's not part of this CL.
Otherwise case 1 is tested by testRunRunOnCurrentIsolate, case 4 - by testRunOnCooperatingIsolate, testRunOnNonCooperatingIsolate in threading_test.dart.
I would like `Isolate.current.runSync` to be short-ciricuited so that we don't do any dances with entering/exiting/borrowing.
Ack regarding pinning functionality. Though note that some notion of pinning already exists (see `Dart_SetCurrentThreadOwnsIsolate` and its implementation plus my comment about it above) so we need to respect it here.
Ignore my comment about event loop - I realized that I misread the code. The message we are sending serves as an "interruption" for event loop, but we are not required to run an event loop to borrow the isolate.
OSThread* owner_tag_ = 0;This needs to be unified with `owner_thread_` / `Isolate::SetOwnerThread`, etc. We can't have multiple different notions of ownership.
bool EnsureOwnership(intptr_t timeout_ms = Monitor::kNoTimeout);Better name `TryAcquireOwnership` (because it can fail).
Maybe document that it can only fail if timeout is supplied.
ASSERT(priority == Isolate::kAsEventAction);Why do we need this dance? Can be just post message like that from the very beginning? It feels strange that we need to post message as OOB first, which then reposts itself as normal message?
(I can see that this is copied from other places - but I think other places are also basically wrong).
void Isolate::HandleBorrowRequests() {It's a bit concerning that worker thread is completely blocked while doing this. I would have imagined that we stop handling messages and return worker thread back to the pool.
{I wonder if this should have a CompareExchange based fast path for fast acquisition of ownership?
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
OSThread* owner_tag_ = 0;This needs to be unified with `owner_thread_` / `Isolate::SetOwnerThread`, etc. We can't have multiple different notions of ownership.
`owner_thread_`/`SetOwnerThread` enforces constraint that only one thread can even own isolate through isolate's lifetime. This ownership does not have such constraint as different unrelated threads can borrow isolate and run dart code on it. It might be that this infrastructure should stay away from "owner" terminology.
bool EnsureOwnership(intptr_t timeout_ms = Monitor::kNoTimeout);Better name `TryAcquireOwnership` (because it can fail).
Maybe document that it can only fail if timeout is supplied.
Done
Why do we need this dance? Can be just post message like that from the very beginning? It feels strange that we need to post message as OOB first, which then reposts itself as normal message?
(I can see that this is copied from other places - but I think other places are also basically wrong).
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Commit-Queue | +1 |
DEFINE_NATIVE_ENTRY(Isolate_runSync_, 1, 3) {Alexander AprelevI think this needs to handle the following cases better:
- We are doing `runSync` on the current isolate - should just happen.
- We are dong `runSync` on an isolate which is pinned to the current thread - should just happen.
- We are doing `runSync` on the isolate which is pinned to some other thread - should error
- We are doing `runSync` on an isolate which does not have event loop running - we need to decide what this means, but probably it means we need to acquire its ownership, run the code and then release ownership.
All of this needs correponding tests.
Slava EgorovIs there something particular you have in mind regarding handling those cases better?
Pinned-functionality is not implemented yet, it's not part of this CL.
Otherwise case 1 is tested by testRunRunOnCurrentIsolate, case 4 - by testRunOnCooperatingIsolate, testRunOnNonCooperatingIsolate in threading_test.dart.
I would like `Isolate.current.runSync` to be short-ciricuited so that we don't do any dances with entering/exiting/borrowing.
Ack regarding pinning functionality. Though note that some notion of pinning already exists (see `Dart_SetCurrentThreadOwnsIsolate` and its implementation plus my comment about it above) so we need to respect it here.
Ignore my comment about event loop - I realized that I misread the code. The message we are sending serves as an "interruption" for event loop, but we are not required to run an event loop to borrow the isolate.
Provided fast-path for `Isolate.current.runSync`.
{Alexander AprelevI wonder if this should have a CompareExchange based fast path for fast acquisition of ownership?
Done
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Commit-Queue | +1 |
It's a bit concerning that worker thread is completely blocked while doing this. I would have imagined that we stop handling messages and return worker thread back to the pool.
Done
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
send_port = SendPort::New(created_isolate->main_port());also provide origin_id
StackZone stack_zone(current_thread);
HandleScope hs(current_thread);Unneeded
result = DartEntry::InvokeClosure(thread, args, args_desc);current_thread
if (result.IsUnwindError()) {If we did the current-isolate shortcut, maybe we should just propagate the unwind error. Otherwise Dart code has the power to veto VM shutdown, which it didn't have before.
static Isolate create({String? debugName}) {spawnFunction also has
```
bool paused = false,
bool errorsAreFatal = true,
SendPort? onExit,
SendPort? onError,
```
should those also be available here?
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Commit-Queue | +1 |
send_port = SendPort::New(created_isolate->main_port());Alexander Aprelevalso provide origin_id
Done
StackZone stack_zone(current_thread);
HandleScope hs(current_thread);Alexander AprelevUnneeded
Done
result = DartEntry::InvokeClosure(thread, args, args_desc);Alexander Aprelevcurrent_thread
Done
If we did the current-isolate shortcut, maybe we should just propagate the unwind error. Otherwise Dart code has the power to veto VM shutdown, which it didn't have before.
Done
static Isolate create({String? debugName}) {spawnFunction also has
```
bool paused = false,
bool errorsAreFatal = true,
SendPort? onExit,
SendPort? onError,
```should those also be available here?
These seems to be properties of isolates message handler, which might not exist for this isolate - user might just have created this isolate with intent of only running some sequential synchronous code via Isolate::runSync, without running event loop handler.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
There are flaky reload-bot failures https://logs.chromium.org/logs/dart/buildbucket/cr-buildbucket/8682610591257958737/+/u/find_ignored_flaky_test_failure_logs/raw_io.output_text due to failing (GC) stack walking assert that `fp_` is in the expected range due to the fact that thread might get switched during kernel compile request as part of IsolateLeaveScope that happens as part of native port creation and closing. `fp_` is remembered when runtime call to `DN_Isolate_runSync` is made. So it's done before isolate/osthread are released by `IsolateLeaveScope`, and if we end up with a new osthread after kernel compilation, we will get this assert failing. I'm thinking we should remove IsolateLeaveScope from Dart_NewConcurrentNativePort/Dart_CloseNativePort, Dart_InvokeVMServiceMethod, remove check `Isolate::Current() == nullptr` from `NativeMessageHandler::CheckAccess`
There are flaky reload-bot failures https://logs.chromium.org/logs/dart/buildbucket/cr-buildbucket/8682610591257958737/+/u/find_ignored_flaky_test_failure_logs/raw_io.output_text due to failing (GC) stack walking assert that `fp_` is in the expected range due to the fact that thread might get switched during kernel compile request as part of IsolateLeaveScope that happens as part of native port creation and closing. `fp_` is remembered when runtime call to `DN_Isolate_runSync` is made. So it's done before isolate/osthread are released by `IsolateLeaveScope`, and if we end up with a new osthread after kernel compilation, we will get this assert failing. I'm thinking we should remove IsolateLeaveScope from Dart_NewConcurrentNativePort/Dart_CloseNativePort, Dart_InvokeVMServiceMethod, remove check `Isolate::Current() == nullptr` from `NativeMessageHandler::CheckAccess`
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
DEFINE_NATIVE_ENTRY(Isolate_runSync_, 1, 3) {Alexander AprelevI think this needs to handle the following cases better:
- We are doing `runSync` on the current isolate - should just happen.
- We are dong `runSync` on an isolate which is pinned to the current thread - should just happen.
- We are doing `runSync` on the isolate which is pinned to some other thread - should error
- We are doing `runSync` on an isolate which does not have event loop running - we need to decide what this means, but probably it means we need to acquire its ownership, run the code and then release ownership.
All of this needs correponding tests.
Slava EgorovIs there something particular you have in mind regarding handling those cases better?
Pinned-functionality is not implemented yet, it's not part of this CL.
Otherwise case 1 is tested by testRunRunOnCurrentIsolate, case 4 - by testRunOnCooperatingIsolate, testRunOnNonCooperatingIsolate in threading_test.dart.
Alexander AprelevI would like `Isolate.current.runSync` to be short-ciricuited so that we don't do any dances with entering/exiting/borrowing.
Ack regarding pinning functionality. Though note that some notion of pinning already exists (see `Dart_SetCurrentThreadOwnsIsolate` and its implementation plus my comment about it above) so we need to respect it here.
Ignore my comment about event loop - I realized that I misread the code. The message we are sending serves as an "interruption" for event loop, but we are not required to run an event loop to borrow the isolate.
Provided fast-path for `Isolate.current.runSync`.
Done
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Are we going to ship this as stable/supported in v3.13? If so we should document these additions in the changelog, too!
external R runSync<R>(R Function() f, {Duration? timeout});We should add @Since annotations here, right?
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Are we going to ship this as stable/supported in v3.13? If so we should document these additions in the changelog, too!
This is not decided so not CHANGELOG.md updates as of yet.
OSThread* owner_tag_ = 0;Alexander AprelevThis needs to be unified with `owner_thread_` / `Isolate::SetOwnerThread`, etc. We can't have multiple different notions of ownership.
`owner_thread_`/`SetOwnerThread` enforces constraint that only one thread can even own isolate through isolate's lifetime. This ownership does not have such constraint as different unrelated threads can borrow isolate and run dart code on it. It might be that this infrastructure should stay away from "owner" terminology.
I have spend more time thinking about this, and I have arrived to the conclusion that we should unify it and only allow one owner (and no borrowing for such isolates) in the newly added APIs as well, e.g. we enforce the following invariants:
I think, not only this makes it easier to reason about things, but this also makes implementation much simpler overall - this will allow you to delete a lot of code that causes complexity in this CL.
WDYT?
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
OSThread* owner_tag_ = 0;Alexander AprelevThis needs to be unified with `owner_thread_` / `Isolate::SetOwnerThread`, etc. We can't have multiple different notions of ownership.
Slava Egorov`owner_thread_`/`SetOwnerThread` enforces constraint that only one thread can even own isolate through isolate's lifetime. This ownership does not have such constraint as different unrelated threads can borrow isolate and run dart code on it. It might be that this infrastructure should stay away from "owner" terminology.
I have spend more time thinking about this, and I have arrived to the conclusion that we should unify it and only allow one owner (and no borrowing for such isolates) in the newly added APIs as well, e.g. we enforce the following invariants:
- If isolate has an event loop running or has message related callbacks installed - you can't borrow it. Can only communicate with it asynchronously.
- If isolate has owner thread - you can't borrow it from another thread. You can only message it asynchronously.
- Otherwise - you can temporary borrow it and enter from different threads. These are like mutexes - a thread needs to fully release a mutex for another thread to be able to enter that isolate.
I think, not only this makes it easier to reason about things, but this also makes implementation much simpler overall - this will allow you to delete a lot of code that causes complexity in this CL.
WDYT?
It will simplify the implementation, yes. The most interesting case I can think of that will become impossible is doing `runSync` on regular `Isolate.spawn`-isolates since they always have an event loop running.
Whether this use case is important for the potential users - not sure.
Borrowing the way how it's implemented is rather general, but yes, its interaction with existing vm infrastructure (the way how vm exits/enters isolate at various points) makes implementation complicated. Holding mutex for duration of `runSync` sounds much simpler.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
StackZone stack_zone(current_thread);A stack zone was just created for this native function and is mostly empty at this point, so there's no benefit in starting a new stack zone here.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |