`void*` pointer support for fast calls

Aapo Alasuutari

Jul 26, 2022, 3:05:14 PM
to v8-dev
Hello,

I'm interested in implementing `void*` pointer support for Fast API calls. My thinking was that V8's `External` objects are appropriate to stand in for external `void*` pointers coming in from external code and going back out, since that's what they're (presumably) meant for.

Unfortunately this seems to be a complex endeavour, a bit more than I can start hacking together directly. I'm also not sure if the `Sandboxify JSExternalObject external pointer` PR will complicate this plan of mine.

The origin of my interest is Deno FFI support, that is, calling native libraries from the Deno JS runtime, which uses the V8 engine. Recent changes to the FFI have added V8 Fast API support and made the FFI a lot faster, but unfortunately we're bound to using plain numbers as pointers. This means both that creating a pointer is as easy as writing a number and that (Fast API compatible) pointers are limited to 53-bit numbers, which will not be enough for e.g. pointer authentication on ARMv8.3.
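
The 53-bit limit can be checked without any V8 code, since a JS Number is an IEEE 754 double. The following is a minimal sketch, not Deno or V8 code; `SurvivesNumberRoundTrip` is a hypothetical helper name, and the sample addresses used with it are made up for illustration.

```cpp
#include <cstdint>

// Round-trips a 64-bit address through a double, the representation of a
// JS Number. Returns true only if the address comes back unchanged.
bool SurvivesNumberRoundTrip(std::uint64_t address) {
  const double as_number = static_cast<double>(address);
  // 2^64 as a double. An address that rounds up to it cannot be converted
  // back to a 64-bit integer, so it is reported as lost.
  if (as_number >= 18446744073709551616.0) return false;
  return static_cast<std::uint64_t>(as_number) == address;
}
```

A typical 47-bit user-space address survives the round trip, while an odd value above 2^53 (for example an address with extra bits stored in its upper byte) does not.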

I believe it would be preferable if Deno could use `External` objects to stand for pointers, but this would negate the current Fast API performance benefits. Thus, `void*` pointer support for fast calls.


Any comments? Suggestions on how I might best proceed with this to implement it? Or is this perhaps not a reasonable idea?

Side note: I was sad to find that getting the pointer value out of a `Local<External>` is measurably slower than getting the pointer number value out of a `Local<Number>`. This is presumably due to the `External` internally saving the pointer in the `ExternalMap`. The slower performance is still a bit sad, given that I expected `External` to be the main public API meant to handle external pointers.

Leszek Swirski

Jul 29, 2022, 5:02:40 AM
to v8-...@googlegroups.com, msle...@chromium.org
+Maya, you're probably the best person to answer this.

Andreas Haas

Jul 29, 2022, 5:08:53 AM
to v8-dev, msle...@chromium.org
Maya is on leave over the summer, unfortunately.

Aapo Alasuutari

Aug 23, 2022, 1:07:04 AM
to v8-dev
Has Maya possibly returned from vacation? Or is their leave still continuing?

Camillo Bruni

Aug 23, 2022, 4:32:30 AM
to v8-...@googlegroups.com
Hi, yes, Maya is out until mid-September.
Cheers, Camillo

Camillo Bruni | Software Engineer, V8 | Google Germany GmbH | Erika-Mann Str. 33, 80636 München 

Aapo Alasuutari

Sep 23, 2022, 8:15:10 AM
to v8-dev
I presume Maya might now be back at the office?

Would it be possible to get some guidance regarding implementing void pointer support, either here on Groups or possibly by organizing an online meeting of some sort?

-Aapo

Aapo Alasuutari

Sep 30, 2022, 1:36:21 AM
to v8-dev
Still hoping to get some guidance with this.

I'm also interested in support, even if limited, for string value parameters (or even return values) and returning of TypedArray buffers. Though, I expect those to be much harder to implement than returning External objects for void pointers. I guess a somewhat related option is to return external pointers as zero-sized TypedArrays / ArrayBuffers, but that sounds quite wrong compared to External objects.

Marja Hölttä

Sep 30, 2022, 3:43:21 AM
to v8-...@googlegroups.com, Maya Armyanova


msle...@chromium.org

Sep 30, 2022, 4:23:56 AM
to v8-dev
Hi,

First of all I'm really sorry for the late reply, I didn't see Leszek's ping in time.

External sounds like the right type to represent embedder pointers, though the poor performance you report sounds unfortunate. Tbh I'm not aware of particular efforts to optimize it, but it might be indeed due to the ExternalMap. I'll check with colleagues if it's possible to do something about the performance there.

On the main topic, adding C callbacks that accept an argument of type External<JSExternalObject> should be doable, given that memory-wise External has the same representation as v8::Value (which we support for passing regular v8::Objects). It should mostly be an addition to the public interface file, which I can guide you in implementing, if you're interested.

Regarding the other two points:
 - Strings - we decided for now to leave them out of the API, due to the large number of string types in V8, which would make the implementation annoyingly complex. We talked about possibly adding limited support for return string types only, as the C++ -> JS direction would need support only for plain C strings. Still, I don't have any particular plan to implement it in the near future, but would be happy to support you if it's an important feature for Deno.
 - Returning TypedArrays - this is again somewhat cumbersome, as the TypedArray object would need to be allocated as a placeholder from the generated code before calling out to the C++ callback, as the callback itself is not allowed to allocate. It should be generally doable, but we didn't have a use case until now.
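
The placeholder approach for TypedArrays described above can be modeled in plain C++, with no V8 types involved. This is only a sketch under assumptions: `FillFn`, `CallWithPlaceholder`, and `FillCounting` are hypothetical names, and a `std::vector` stands in for the TypedArray that generated code would allocate before the call.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical callback shape: it may not allocate, so it only writes into
// storage handed to it and reports how many bytes it produced.
using FillFn = std::size_t (*)(std::uint8_t* out, std::size_t capacity);

// Models the caller side: allocate the placeholder first, then call out.
std::vector<std::uint8_t> CallWithPlaceholder(FillFn fill,
                                              std::size_t capacity) {
  std::vector<std::uint8_t> placeholder(capacity);
  const std::size_t written = fill(placeholder.data(), placeholder.size());
  // Clamp the reported length to the capacity that was actually allocated.
  placeholder.resize(written < capacity ? written : capacity);
  return placeholder;
}

// Example callback: writes 0, 1, 2 into the buffer, or fewer bytes if the
// buffer is smaller than three bytes.
std::size_t FillCounting(std::uint8_t* out, std::size_t capacity) {
  const std::size_t count = capacity < 3 ? capacity : 3;
  for (std::size_t i = 0; i < count; ++i) {
    out[i] = static_cast<std::uint8_t>(i);
  }
  return count;
}
```

The cost of this design is visible in the sketch: the buffer is allocated at full capacity even when the callback ends up producing fewer bytes.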

Hope this helps, will let you know once I learn more about v8::External performance.

All the best,
Maya

Aapo Alasuutari

Sep 30, 2022, 6:01:12 AM
to v8-dev
Hey,

Thank you for getting back to me!

I'm definitely interested in implementing Externals for C callbacks, both as parameters and as return values. Returning void pointers should prove to be more difficult, I guess. I can see two ways to go about it:
1. Take the same route as you mention for returning TypedArrays, where V8 will allocate a placeholder before calling the callback.
2. Simply have the C callback return the pointer and have V8 do the allocation after calling it. I presume this should be doable since the return value CType can (or perhaps "should") be trusted to speak the truth, and returning a pointer does not cause any issues calling convention wise.

My personal preference is definitely on #2, as it feels more "natural" and contains less indirection. It also has the slight benefit of not doing any unnecessary allocations for the External when the fast callback signals a need to deopt using the fallback flag.
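
Option #2 and its fallback benefit can be sketched as a standalone model. Everything here is hypothetical and none of it is V8 API: `ModelCallOptions` stands in for the options object carrying the fallback flag, `ModelExternal` for the External that V8 would allocate, and `LookUpResource` for an embedder callback.

```cpp
// Hypothetical stand-ins for illustration; none of these are V8 types.
struct ModelCallOptions {
  bool fallback = false;  // set by the callback to request the slow path
};
struct ModelExternal {
  void* value;  // the wrapped native pointer
};
struct WrapResult {
  bool wrapped;            // false when the callback asked for fallback
  ModelExternal external;  // only meaningful when wrapped is true
};

using PointerCallback = void* (*)(int id, ModelCallOptions& options);

// Models option #2: run the callback first and build the wrapper afterwards,
// so a call that requests fallback never pays for a wrapper.
WrapResult CallThenWrap(PointerCallback callback, int id) {
  ModelCallOptions options;
  void* raw = callback(id, options);
  if (options.fallback) return WrapResult{false, ModelExternal{nullptr}};
  return WrapResult{true, ModelExternal{raw}};
}

int g_resource = 42;  // example native object

// Example callback: negative ids are handed off to the slow path.
void* LookUpResource(int id, ModelCallOptions& options) {
  if (id < 0) {
    options.fallback = true;
    return nullptr;
  }
  return &g_resource;
}
```

With option #1 the wrapper would have to exist before `callback` runs, so the fallback path would discard an allocation that was already made.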

I wonder if #2 could likewise be used to implement TypedArray returning? I can't exactly remember the System V ABI for C++ structures, but I seem to recall that a structure with a size of up to two pointers' worth can be returned through RAX and an extra register (that is, as long as the class does not have a non-trivial copy constructor or destructor). Other ABIs might of course differ on that. Still, if it happened to be so that all ABIs allowed returning two pointers, it would mean that a C callback could return the same TypedArray struct that is used to pass them in as parameters. (I'm skipping considerations of ownership, lifetime, copying, memory management and all that because it gets hard and ruins my idea pretty well :D )
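
The language-level preconditions for such a two-word return value can be written down as compile-time checks. This is a sketch under assumptions: `ReturnedBuffer` and `MakeView` are hypothetical names, not the struct V8 uses for TypedArray parameters, and the assertions only check size and triviality. Whether a given ABI then actually uses two registers still has to be verified per target.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical return type: a data pointer plus a length, two machine words.
struct ReturnedBuffer {
  std::uint8_t* data;
  std::size_t length;
};

// A non-trivial copy constructor or destructor would force the struct to be
// returned through memory, so both must be trivial.
static_assert(std::is_trivially_copyable<ReturnedBuffer>::value,
              "must be trivially copyable");
static_assert(std::is_trivially_destructible<ReturnedBuffer>::value,
              "must be trivially destructible");
// The struct must not be larger than two pointers.
static_assert(sizeof(ReturnedBuffer) == 2 * sizeof(void*),
              "must fit in two machine words");

// Example function returning the struct by value.
ReturnedBuffer MakeView(std::uint8_t* data, std::size_t length) {
  return ReturnedBuffer{data, length};
}
```

Compiling `MakeView` for each target of interest and reading the generated code is one way to run the per-ABI check.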

On Strings: It turns out that as long as one keeps a pointer to the Isolate somewhere, it's already possible to support String parameters in fast calls, at least to a limited and possibly unstable degree. See this PR of mine: https://github.com/denoland/deno/pull/16014
Essentially, if a parameter is declared as v8::Value then it will happily accept a String as well, and with the Isolate pointer it is then possible to write the string data out. I'm unsure of the safety of this. I expect it should panic on roped strings, as V8 flattens them, but so far I've not seen clear evidence of that happening.

I personally think that limited string support covering only C string return values would not be a good idea. As an example, I expect that the Chrome / Blink team would find good use for returning UTF8 strings in atob / btoa and TextDecoder APIs. (And so would Deno.) Here again I wonder about the possibility of option #2 above.

About Deno's interest in Fast API in general: I'm not part of the Deno team, and am only contributing to the FFI and a little bit on the core ops (JS <-> Rust binding) layer so I cannot truly speak for what the team considers important and am just speaking for myself. That being said:
1. Deno's FFI API relies heavily on Fast API. Every foreign library's symbol (C function) that a user wants to use will by default use the Fast API. Only symbols that call back into V8 need to / should opt out of this using a "callback" boolean option.
Adding more supported types to Fast API directly adds to wider and better support for Deno FFI. As an example, currently returning of 64 bit integers (eg. pointers) is done via a TypedArray out pointer, where the pointer is written into. If returning of External objects was possible, this out pointer system and its (slight) performance overhead could be removed. (And most importantly, numbers-as-pointers insecurity could be eliminated.)
Returning of C strings would allow Deno FFI to have "native" support of those (currently C string extraction is done via a separate method).

2. Deno's ops layer has recently moved to using Fast API by default where possible. Deno's binding functions are written as normal Rust functions and an ops macro takes care of writing the binding logic to V8's FunctionTemplate.
Due to the near-universality of the ops macro, any Fast API binding logic needs to be written only once and the macro will take care of setting up the bindings for all ops that are bindable. Thus, here even more than with FFI, having more supported types leads near-automatically to a faster binding layer in Deno, which is very much of interest to the Deno team.
Some examples:
* FFI might not benefit from Strings as parameters that much, since foreign APIs would only expect C strings. Deno ops however very much would like to get arbitrary (UTF8) strings in fast calls. They would also love to return arbitrary UTF8 strings.
* FFI only cares about returning pointers in some form, External being the most logical. Deno ops would very much want to return TypedArrays of varying sizes, and they would not mind being explicit about memory management either.
* ops have cases where eg. a String or TypedArray parameter might be optional. Overloads are already supported to a degree, but eg. null parameters in the middle currently are not supported directly (except as v8::Values which I'm not sure if it would ruin the "better typed" overload matching)
* (Completely impossible stretch goal): Some ops take objects of some given shape. If V8 were to match its JS object shapes to a declared parameter struct shape, now that would be impossibly cool. Also, probably too hard to feasibly do but a man can dream.


This has become a massive, meandering writeup. Sorry about that.

Back on topic: If you can give me some pointers on where I should look to add the External<JSExternalObject> stuff, I would much appreciate it. I would personally also prefer to write the code such that the C callback receives not the v8::External object but is directly called with the pointer that the External represents. This I expect to require some changes in the lowering code.

Thank you for your time
-Aapo Alasuutari

Maya Armyanova

Sep 30, 2022, 6:56:08 AM
to v8-dev
Hi again,

Regarding the void pointers, idea #2 sounds good to me too. I guess there's no real need to pre-allocate anything.

Regarding returning TypedArrays as a pair of pointers - this sounds like an interesting idea indeed. Still, two questions come to my mind:
1) this seems really platform specific and we should really carefully study all calling conventions we care about (and we have quite a few);
2) "considerations of ownership, lifetime, copying, memory management" - yeah, this is what bothers me too. The fast API isn't really so much about safety, but it shouldn't open any obvious security holes. And returning a random address from C++ and providing that as a TypedArray elements store seems pretty fishy to me. I can imagine all kinds of dangers such as out-of-bounds reads or writes.

Regarding External - small correction to what I wrote above, we can use a Local<External>, similar to Local<Value>. And a possible reason why it is slow (thanks to verwaest@) is that External is a full-blown JSObject, having its own elements and property backing store, which is unused for C++ objects (which it is supposed to represent).

Re: Strings passed around as Values - wow, now this seems risky indeed. The worst we could stumble upon is again unexpected memory writes. Not sure how possible in reality that is, but I'll need to ping someone more familiar with security concerns.

A noob question - what is the Deno ops layer and what would an engineer use it for?

Regarding overload resolution with null parameters in the middle - yeah, the purpose of not supporting the full overload resolution logic that Web APIs have was to keep this code simpler. Otherwise at runtime we'll need to repeat much of what Blink already does, possibly making the fast dispatch slower. Regarding the JSObject shapes, not sure how relevant that is, but we had an idea to provide the embedder with means of enumerating their C++ types and representing their hierarchy in V8 using those assigned numerical IDs. This would be super useful for Web APIs such as accessing e.g. Node.nodeType from various successors of Node (such as Div).

Regarding implementing External support - again a correction, you could have a Local<External> on the C++ side. And you could already try passing the External* or Local<External> as an argument and use the (obsolete) kApiObject parameter type. Add a mapping here https://source.chromium.org/chromium/chromium/src/+/main:v8/include/v8-fast-api-calls.h;l=666;drc=ca79bd5301566d1a3fc573c6e6858b5880c00fbd from Local<External> to kApiObject, the low level machinery for it is still there. And how the C++ function takes the argument - as a raw pointer or as a Local - is actually the same for the generated code that calls it, so it's your call. If it works (or it doesn't), please feel free to upload a CL on Gerrit, happy to take a look.

Good luck,
Maya

Aapo Alasuutari

Sep 30, 2022, 8:20:10 AM
to v8-dev
Re: Returning TypedArrays
1. Yeah, this definitely needs to be carefully considered. Is there any easy listing of supported V8 compilation targets? A simple preliminary study would be to just check good old Godbolt compiler against the list :)
2. This is indeed fraught with both potential user errors and plain bad ideas. An example of what I've already implemented for Deno FFI for normal binding functions is for users to get an ArrayBuffer out of a pointer with a given byte length. This is useful, or even necessary, for some C APIs where mutating memory through a pointer is needed. These are created with a BackingStore using a noop deleter callback, so effectively the BackingStore is not taking ownership of the data, just referencing it. However, lifetime becomes an issue as of course the BackingStore does not know how long the pointer is valid. Thus, a user error may lead to a use-after-free error. I guess that's FFI for you. Generally though, from a V8 perspective, you should be able to trust the fast call to return a proper length with the pointer to be turned into a TypedArray. The only real issue, I think, is how to deal with the three different options of:
a) Reference TypedArray: V8 does not control the lifetime (dangerous since now JS-side user error creates use-after-free)
b) Copied TypedArray: V8 should copy byteLength bytes from the pointer.
c) Owned TypedArray: The pointer is actually already owned by V8, ie. somehow a fast call is returning a pointer it received from V8 in the first place, or (if such an API is provided in the future) the fast call allocated a buffer into V8 heap and is now returning it as a TypedArray.
The return type CType might be used to tell V8 what it should do with such a TypedArray but it's still fraught with danger. No easy answers here.
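
Option b) is the easiest of the three to model outside V8. In the sketch below, `ReturnedBytes` and `CopyReturnedBytes` are hypothetical names and a `std::vector` stands in for the TypedArray's backing store. It is an illustration of the copying semantics only, not a proposal for the actual API.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical description of what a callback hands back.
struct ReturnedBytes {
  const std::uint8_t* data;
  std::size_t byte_length;
};

// Models option b): copy byte_length bytes so the result no longer depends
// on how long the native side keeps `data` alive.
std::vector<std::uint8_t> CopyReturnedBytes(const ReturnedBytes& returned) {
  if (returned.data == nullptr || returned.byte_length == 0) return {};
  return std::vector<std::uint8_t>(returned.data,
                                   returned.data + returned.byte_length);
}
```

After the copy, the native side can free or overwrite its buffer without affecting the result, which is exactly what option a) cannot guarantee. The copy is only as safe as the reported `byte_length`, though: a wrong length still reads out of bounds.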

Re: External support
I think you might've misunderstood my meaning with passing External pointers as parameters. I wasn't referring to an External* but instead the void* that one would receive by calling the Local<External>::Value() method. My original thinking was that lowered V8 code might even turn Local<External> internally into the void* through the Value() method, but thinking on it now it may not make sense (how would one get back to the Local<External> from the void*? So not a good idea). So, in the end it would be that a fast call with a declared void* parameter would expect to find a Local<External> in that parameter slot, and V8 would call the fast call with the Local<External>::Value() return value in that parameter slot. So, the C++ side will never even see the Local<External> but will instead simply receive a pointer to whatever the External is pointing to. This is why I expected this to be a bit harder than just working with the public API file, as I expect this will need at least some work on the lowering code.
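
The intended calling behavior can be sketched as a standalone model. `ModelExternal` is a hypothetical stand-in for the External wrapper (only its `Value()` accessor mirrors the real method name), and `CallWithUnwrappedPointer` plays the role of the lowered call site. None of this is V8 code.

```cpp
// Hypothetical stand-in for the External wrapper object.
struct ModelExternal {
  void* value;
  void* Value() const { return value; }
};

// The native function is declared with a plain void* parameter.
using NativeFn = int (*)(void* handle);

// Models the lowered call site: the JS side passes the wrapper, the call
// site unwraps it, and the native function only ever sees the raw pointer.
int CallWithUnwrappedPointer(NativeFn fn, const ModelExternal& wrapper) {
  return fn(wrapper.Value());
}

// Example native function that reads an int through the pointer.
int ReadInt(void* handle) { return *static_cast<int*>(handle); }
```

The unwrapping happens in `CallWithUnwrappedPointer`, not in `ReadInt`, which is why this needs support at the call site rather than only a new type mapping.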

Re: ops layer
The ops layer is how Deno binds the JS world to native code. Ops are called from JS through eg. Deno.core.ops.op_print("foo"). This function is a V8 FunctionTemplate instance, which will then call into the Rust code that actually implements Deno's own console printing. And as said, each op's V8 FunctionTemplate binding code is generated automatically and if the parameters and return value of the Rust function match what Fast API is capable of, the op FunctionTemplate will be created with a fast call.

-Aapo

Maya Armyanova

Oct 3, 2022, 5:58:58 AM
to v8-dev
Re: TypedArrays:
1. Supported list of OSes can be found here: https://source.chromium.org/chromium/chromium/src/+/main:v8/include/v8config.h;l=65;drc=56816d76c121c8dd5b406dc6019350eee05f4abd, the platforms are basically the subfolders of this one: https://source.chromium.org/chromium/chromium/src/+/main:v8/src/codegen/
2. I think only options b) or c) (copying or owning) are viable and safe, tbh. Option c) can be done as "pre-allocating" the TypedArray backing store before doing the call.

Re: External support - I see, I got confused that the External* is itself the C++ pointer we care about. Well, then similar to before, you could use a void* -> kExternalObject (or similar, which would be a new value in the `CTypeInfo::Type` enum) mapping in the public header, then handle this kExternalObject similar to kV8Value. From the machine point of view, it's still only a machine word-sized pointer. And then we'll need tests that use it and some code in Turbofan to read out the External::Value out of the wrapper object and pass it as the void* param. Maybe we can set up a chat or pair-coding session in the coming days, I'm based in the CET timezone.
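
The shape of such a type mapping can be illustrated with a standalone model. This is not the contents of the V8 header: `ModelCType`, `ModelCTypeFor`, `ModelValue`, and `IsWordSizedPointer` are hypothetical names, the enum is reduced to three values, and `kExternalObject` is the proposed value from the paragraph above rather than an existing one.

```cpp
#include <cstdint>

// Standalone model of a type tag enum. kExternalObject is the hypothetical
// new value discussed above; the real enum in V8 has more members.
enum class ModelCType { kInt32, kV8Value, kExternalObject };

struct ModelValue {};  // stands in for a V8 value handle

// Maps a C++ parameter type to its tag, one specialization per type.
template <typename T>
struct ModelCTypeFor;

template <>
struct ModelCTypeFor<std::int32_t> {
  static constexpr ModelCType Type() { return ModelCType::kInt32; }
};
template <>
struct ModelCTypeFor<ModelValue*> {
  static constexpr ModelCType Type() { return ModelCType::kV8Value; }
};
template <>
struct ModelCTypeFor<void*> {
  static constexpr ModelCType Type() { return ModelCType::kExternalObject; }
};

// Both pointer-like tags occupy one machine word in the call.
constexpr bool IsWordSizedPointer(ModelCType type) {
  return type == ModelCType::kV8Value || type == ModelCType::kExternalObject;
}
```

A callback signature containing `void*` would then resolve to the new tag at compile time, and an unsupported parameter type fails to compile because `ModelCTypeFor` has no matching specialization.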

Re: ops - thanks for the explanation, sounds really cool indeed.

Please let me know how I can further support you!
Maya

Aapo Alasuutari

Oct 5, 2022, 1:30:29 AM
to v8-dev
Hey,

Sorry for the late reply, I had some work stuff blocking my calendar. A pair-coding session would be ideal if at all possible. Would Friday afternoon work for you? E.g. at 12 or 1 PM on Friday.

-Aapo

msle...@chromium.org

Oct 5, 2022, 6:03:21 AM
to v8-dev
Hi, not sure yet about a good time slot, so I pinged you over Gmail chat. 