Questions about ArrayBuffer and ArrayBuffer::Contents


Roman Shtylman

Dec 8, 2013, 10:56:14 PM
to v8-u...@googlegroups.com
A few things are unclear to me after trying to use the ArrayBuffer api from v8 3.23.10

The first inconsistency I find is the disconnect between an initialized ArrayBuffer allocated via the globally set allocator and one that you have then externalized. It seems there is no good way to get this global allocator and thus properly know how to clean up the ArrayBuffer::Contents once you have them.

The other difficult aspect is the inability to get the externalized contents again once you have gotten them once. This makes sense in the context of "externalization" but what I would find very useful is just access to the data pointer for the array buffer and let it continue to manage the lifetime of the memory.

Allowing the array buffer to manage the lifetime has the nice benefit of using the correct allocator and re-using the memory if I have calls that want to do so; right now it is not possible to easily re-use the memory without some clever hacks. We already have a system for increasing the lifetime of a handle (persistents) so externalizing just to access the data and ensure it is not deleted too soon doesn't seem like a relevant api.

tl;dr;
1. ArrayBuffer::Contents could be inconsistent with global ArrayBuffer allocator
2. No way to reuse array buffer memory since externalize can only happen once

Justin King

Dec 9, 2013, 1:01:57 AM
to v8-u...@googlegroups.com
I believe I can shine some light on this. There are a few things that are not explained well by the current implementation. Feel free to correct me if I am wrong.

The only difference between a non-externalized ArrayBuffer and an externalized ArrayBuffer is who is responsible for managing the underlying buffer. The underlying buffer of a non-externalized ArrayBuffer is guaranteed to be allocated by the specified ArrayBuffer::Allocator set prior to execution using V8::SetArrayBufferAllocator. What Google Chrome does (last I checked) is use the default internal fields created with each ArrayBuffer and ArrayBufferView controlled by V8_ARRAY_BUFFER_INTERNAL_FIELD_COUNT and V8_ARRAY_BUFFER_VIEW_INTERNAL_FIELD_COUNT macros respectively. Once externalized you may store the pointer to the underlying storage in one field and another pointer to the persistent storage cell within V8 to be notified when the ArrayBuffer becomes "weak". Every time you come across an ArrayBuffer you may use the shadowed method ArrayBuffer::IsExternal to check if ArrayBuffer::Externalize has been called, which implies you can access the pointer. The ArrayBuffer continues to use the same underlying storage even after calling ArrayBuffer::Externalize until ArrayBuffer::Neuter is called to remove all references to the underlying storage among the ArrayBuffer and its ArrayBufferViews.
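
To make the ownership hand-off described above concrete, here is a minimal self-contained model in C++. It does not use V8 at all: ModelArrayBuffer and ModelContents are hypothetical stand-ins that only mimic the behaviour in this paragraph (Externalize hands the pointer to the caller once, the buffer keeps using the same storage afterwards, and Neuter drops the reference). Treating a second Externalize call as an exception is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <stdexcept>

// Hypothetical stand-in for ArrayBuffer::Contents: a pointer plus a length.
struct ModelContents {
  void* data;
  std::size_t byte_length;
};

// Hypothetical stand-in for an ArrayBuffer. This is a model, not V8 code.
class ModelArrayBuffer {
 public:
  explicit ModelArrayBuffer(std::size_t n)
      : data_(std::calloc(n ? n : 1, 1)), length_(n), external_(false) {
    if (!data_) throw std::bad_alloc();
  }
  // A non-externalized buffer frees its own storage; an externalized one
  // leaves that to whoever called Externalize().
  ~ModelArrayBuffer() {
    if (!external_) std::free(data_);
  }
  ModelArrayBuffer(const ModelArrayBuffer&) = delete;
  ModelArrayBuffer& operator=(const ModelArrayBuffer&) = delete;

  bool IsExternal() const { return external_; }
  std::size_t ByteLength() const { return length_; }

  // Ownership passes to the caller. The model rejects a second call.
  ModelContents Externalize() {
    if (external_) throw std::logic_error("already externalized");
    external_ = true;
    return ModelContents{data_, length_};
  }

  // Drops the buffer's reference to the storage without freeing it.
  void Neuter() {
    if (!external_) throw std::logic_error("buffer is not externalized");
    data_ = nullptr;
    length_ = 0;
  }

  // The buffer keeps reading and writing the same storage after Externalize().
  unsigned char* bytes() { return static_cast<unsigned char*>(data_); }

 private:
  void* data_;
  std::size_t length_;
  bool external_;
};
```

In this model the caller of Externalize() must eventually release ModelContents::data itself (here with std::free), which is the responsibility the thread is about.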

Since the embedder (you) is responsible for implementing the ArrayBuffer::Allocator, V8 assumes you have a reference. The easiest way to replicate this is to use a singleton class with a public method returning the singleton (I believe this is the approach used by Google Chrome). V8 also assumes that if you create a new ArrayBuffer via ArrayBuffer::New(Isolate*, void*, size_t) you know how you allocated it, and thus we come back full circle to the responsibility of the implementer. There is no possible way for V8 or JavaScript to avoid using the ArrayBuffer::Allocator. Only the embedder may create a pre-externalized ArrayBuffer via ArrayBuffer::New(Isolate*, void*, size_t).
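
A rough sketch of the singleton approach mentioned above, kept self-contained so it compiles without V8. AllocatorInterface is a local stand-in for the allocator base class (a real embedder would derive from v8::ArrayBuffer::Allocator and should check the method signatures against the v8.h of the version in use); MallocAllocator and ReleaseExternalized are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Local stand-in with the general shape of the allocator interface discussed
// in this thread. Assumption: a real embedder derives from
// v8::ArrayBuffer::Allocator instead of this class.
class AllocatorInterface {
 public:
  virtual ~AllocatorInterface() {}
  virtual void* Allocate(std::size_t length) = 0;  // zero-filled memory
  virtual void* AllocateUninitialized(std::size_t length) = 0;
  virtual void Free(void* data, std::size_t length) = 0;
};

// Hypothetical malloc-backed allocator exposed as a singleton.
class MallocAllocator : public AllocatorInterface {
 public:
  // Function-local static: one shared instance, created on first use.
  static MallocAllocator& Instance() {
    static MallocAllocator instance;
    return instance;
  }
  MallocAllocator(const MallocAllocator&) = delete;
  MallocAllocator& operator=(const MallocAllocator&) = delete;

  void* Allocate(std::size_t length) override {
    return std::calloc(length ? length : 1, 1);
  }
  void* AllocateUninitialized(std::size_t length) override {
    return std::malloc(length ? length : 1);
  }
  void Free(void* data, std::size_t /*length*/) override { std::free(data); }

 private:
  MallocAllocator() {}
};

// Code that ends up owning externalized memory releases it through the same
// singleton it was allocated from.
inline void ReleaseExternalized(void* data, std::size_t length) {
  MallocAllocator::Instance().Free(data, length);
}
```

The same Instance() would be the object handed to V8 at startup, so any part of the embedder that can include this header can free externalized memory consistently. It does not help code that cannot see the embedder's headers, which is the plugin case raised in this thread.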

Hope this helps,
Justin

--
--
v8-users mailing list
v8-u...@googlegroups.com
http://groups.google.com/group/v8-users
---
You received this message because you are subscribed to the Google Groups "v8-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to v8-users+u...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.



--
Michigan Technological University
Computer Network and Systems Administration
Information Technology Operations

Dmitry Lomov

Dec 9, 2013, 6:26:32 AM
to v8-u...@googlegroups.com
Note that ArrayBuffer API is experimental and we might change it in the future. The below reflects what we ship currently.


On Mon, Dec 9, 2013 at 4:56 AM, Roman Shtylman <shty...@gmail.com> wrote:
A few things are unclear to me after trying to use the ArrayBuffer api from v8 3.23.10

The first inconsistency I find is the disconnect between an initialized ArrayBuffer allocated via the globally set allocator and one that you have then externalized. It seems there is no good way to get this global allocator and thus properly know how to clean up the ArrayBuffer::Contents once you have them.

The embedder should provide the allocator to V8 prior to V8's initialization (using v8::V8::SetArrayBufferAllocator). So if you embed V8, you should know your allocator. There is no default allocator; there can be only one allocator in the system; any time you get an ArrayBuffer::Contents (via an ArrayBuffer::Externalize call) the memory returned by ArrayBuffer::Contents::Data() is guaranteed to have been allocated by an ArrayBuffer::Allocator::Allocate call.
 

The other difficult aspect is the inability to get the externalized contents again once you have gotten them once. This makes sense in the context of "externalization" but what I would find very useful is just access to the data pointer for the array buffer and let it continue to manage the lifetime of the memory.

This is by design: once the memory is externalized, the embedder assumes ownership of ArrayBuffer's memory. We push the burden of keeping relationship between externalized data pointers and array buffer objects to the embedder. 
Here is an example that is probably simpler than the full-blown Blink bindings: https://code.google.com/p/chromium/codesearch#chromium/src/gin/array_buffer.h

So the ArrayBuffer API is restricted by design, driven mainly by the desire to carefully control who owns memory when. We could consider loosening it up a bit later, but we have been bitten by embedder bugs too many times in the past, so with this new design we started in a very controlled state.

Hope this helps,
Dmitry


--

Roman Shtylman

Dec 9, 2013, 9:21:54 AM
to Justin King, v8-u...@googlegroups.com
This is spot on with how I understand the behavior; however, I think this behavior exposes the problems I outlined. This is especially evident if you are writing a binding in a “plugin”-like environment where you are not the one who sets up the Allocator for array buffers.

Your use of the internal fields to store values is exactly the “hacks” I was referring to. This seems unsafe given that if you pass around the array buffer and something else decides to use these internal fields in a different way then you will be ruined.

Yes, I agree if you create an array buffer using already externalized constructor then you most likely know where the memory came from and can free it, but if you received an array buffer as an argument or used the non externalized constructor, then it would be very apt to just have access to the memory if the handle is still alive.
You received this message because you are subscribed to a topic in the Google Groups "v8-users" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/v8-users/whHPAIhfMu8/unsubscribe.
To unsubscribe from this group and all its topics, send an email to v8-users+u...@googlegroups.com.

Roman Shtylman

Dec 9, 2013, 9:22:51 AM
to Dmitry Lomov, v8-u...@googlegroups.com
Ok, and how would one influence the fixing or changing of the API to suit a use case which seems currently not possible without additional hacks?

Dmitry Lomov

Dec 9, 2013, 10:03:25 AM
to Roman Shtylman, v8-u...@googlegroups.com
On Mon, Dec 9, 2013 at 3:22 PM, Roman Shtylman <shty...@gmail.com> wrote:
Ok, and how would one influence the fixing or changing of the API to suit a use case which seems currently not possible without additional hacks?

Clearly stating the use case claimed to be impossible, with justification as to why it is important might help.
Note that Blink uses this API for ArrayBuffer implementation, so that must be a use case not covered by the usage in Blink - which might be a tough sell ;)

Dmitry

Roman Shtylman

Dec 9, 2013, 2:52:54 PM
to v8-u...@googlegroups.com
Adding a note here about an API example which I thought was interesting but communicated off the list.

ArrayBuffer::Contents::Collect();

This would call the "Free" method on the Allocator used to create the data in the first place. This exists because passing the allocator to a plugin or library may be unfeasible and kinda pointless since the Contents knows what Allocator it was allocated with. This would avoid inconsistencies. Possibly also having an IsCollected() method.

Would also be good to have a way to re-get externalized Contents from the ArrayBuffer after they have been externalized but not Neutered. This can't easily be done now because determining which internal fields to use for what purpose is not standardized and would thus present problems across plugins or disjoint libraries. This can be resolved by the ArrayBuffer class since it is the mediator. This is the major pain point with externalization at the c++ level right now; not being able to have a known way to get back at this externalized information even if it is still valid.

Not saying that ArrayBuffer needs to expose the data*, but allowing multiple access to the Contents would solve this as long as the Contents have not been collected. And the safe way to collect Contents would be via  Collect() call which would make the contents invalid as well as use the appropriate Allocator without guessing or other additional setup routines.
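
The proposal above could be modelled roughly as follows. This is only an illustration of the idea: no such Collect() or IsCollected() API exists in V8, and SketchAllocator, CountingAllocator and SketchContents are hypothetical names. Copies of a SketchContents share one state, so collecting through any copy is visible to all of them and the allocator's Free runs at most once.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>

// Hypothetical allocator interface for this sketch.
struct SketchAllocator {
  virtual ~SketchAllocator() {}
  virtual void* Allocate(std::size_t length) = 0;
  virtual void Free(void* data, std::size_t length) = 0;
};

// Counts Free calls so the single-free property can be observed.
struct CountingAllocator : SketchAllocator {
  int frees = 0;
  void* Allocate(std::size_t length) override {
    return std::calloc(length ? length : 1, 1);
  }
  void Free(void* data, std::size_t /*length*/) override {
    std::free(data);
    ++frees;
  }
};

// Hypothetical Contents that remembers which allocator produced its memory.
// Copies share state, standing in for "getting the Contents again".
class SketchContents {
 private:
  struct State {
    SketchAllocator* allocator;
    void* data;
    std::size_t length;
  };

 public:
  SketchContents(SketchAllocator* allocator, std::size_t length)
      : state_(std::make_shared<State>()) {
    state_->allocator = allocator;
    state_->length = length;
    state_->data = allocator->Allocate(length);
  }

  void* Data() const { return state_->data; }  // nullptr once collected
  std::size_t ByteLength() const { return state_->length; }
  bool IsCollected() const { return state_->data == nullptr; }

  // Frees through the remembered allocator; later calls are no-ops.
  void Collect() {
    if (IsCollected()) return;
    state_->allocator->Free(state_->data, state_->length);
    state_->data = nullptr;
    state_->length = 0;
  }

 private:
  std::shared_ptr<State> state_;
};
```

In this sketch any holder of a copy can call Collect(), so every holder has to check IsCollected() before touching Data(), and memory that is never collected is leaked.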

Dmitry Lomov

Dec 9, 2013, 3:07:04 PM
to v8-u...@googlegroups.com
Repeating my off-list replies on-list now (sigh... please don't do this)

The idea behind the ArrayBuffer API is that once an AB is externalized, the embedder manages the book-keeping for that data (maybe using shared_ptr or some other mechanism).

Therefore the API that allows "re-externalization" of ArrayBuffer contents will create a side-channel for the embedder (if you have a v8::AB in your hand, you can get at memory the embedder manages without the rest of the embedder knowing - so embedder might have carefully managed the access to memory with smart pointers or similar, but a rogue plugin just thwarted that architecture by freeing data willy-nilly)



On Mon, Dec 9, 2013 at 8:52 PM, Roman Shtylman <shty...@gmail.com> wrote:
Adding a note here about an API example which I thought was interesting but communicated off the list.

ArrayBuffer::Contents::Collect();

This would call the "Free" method on the Allocator used to create the data in the first place. This exists because passing the allocator to a plugin or library may be unfeasible and kinda pointless since the Contents knows what Allocator it was allocated with. This would avoid inconsistencies. Possibly also having an IsCollected() method.

This is a bad idea for the reasons described above. 

 

Would also be good to have a way to re-get externalized Contents from the ArrayBuffer after they have been externalized but not Neutered. This can't easily be done now because determining which internal fields to use for what purpose is not standardized and would thus present problems across plugins or disjoint libraries. This can be resolved by the ArrayBuffer class since it is the mediator. This is the major pain point with externalization at the c++ level right now; not being able to have a known way to get back at this externalized information even if it is still valid.

Key point here is that ArrayBuffer is not a mediator of access to its backing store. ArrayBuffer participates in managing that memory with the embedder. The real mediator of access to that memory is the embedder. 
So it is with the embedder that any libraries or plugins need to negotiate the patterns of memory access. 
V8 is not in the business of providing a plugin system for its embedders.

Hope this helps,
Dmitry


--

Dmitry Lomov

Dec 9, 2013, 3:38:43 PM
to v8-u...@googlegroups.com
[+v8-users]


On Mon, Dec 9, 2013 at 9:37 PM, Dmitry Lomov <dsl...@chromium.org> wrote:
I think you assume what is (possibly) true for one embedder and extrapolate to other embedders. Node.js is not our only embedder :)


On Mon, Dec 9, 2013 at 9:20 PM, Roman Shtylman <shty...@gmail.com> wrote:




On December 9, 2013 at 3:07:10 PM, Dmitry Lomov (dsl...@chromium.org) wrote:

Repeating my off-list replies on-list now (sigh... please don't do this)


The idea behind the ArrayBuffer API is that once an AB is externalized, the embedder manages the book-keeping for that data (maybe using shared_ptr or some other mechanism).

Therefore the API that allows "re-externalization" of ArrayBuffer contents will create a side-channel for the embedder (if you have a v8::AB in your hand, you can get at memory the embedder manages without the rest of the embedder knowing - so embedder might have carefully managed the access to memory with smart pointers or similar, but a rogue plugin just thwarted that architecture by freeing data willy-nilly)

That’s the point of the Weak stuff. The person who first externalizes must be the one who understands how to clean up, so there is no “rogue” nonsense. Additional users who want to access the data must get a Contents from the AB and check IsCollected() or IsEmpty() or something; it doesn’t matter which. Yes, I realize they *could* store the data pointer and not get it from the Contents every time, but shit, this is C++; they could do that if they wanted to anyway! What you are preventing is actually doing the memory management in a clean way, instead of preventing some “potential” data* storage concern. If it was easy to get the Contents multiple times from the AB then a person would be more likely to do that.

Here is an actual example where I encounter this problem:

https://github.com/defunctzombie/libuv.js/blob/master/src/uvjs_fs.h#L416

If the read call is called with some buffer that was already externalized I would like to re-use it but I cannot.

These bindings don’t have any state setup routines currently because all they do is return an ObjectTemplate with these bound methods which means they are for others to embed in their shells or environments. With the right APIs they would be able to use the externalized pointers just by asking the AB for them and if they were already externalized querying if they are safe to use (similar to a NULL check if you will).


What I suggest to do here is to implement the API in Node that will let you get at externalized pointer. After that, it is just:

shared_ptr<ABBackingStore> data = nodeApi->GetBackingStore(arr);

Everything else is the same. You can use your favorite or std:: variation of smart pointers to manage it.

One thing that is nice about this is that you don't have to worry about each and every function that tries to externalize memory from ArrayBuffer about how to create weak callbacks &c. This is done once and for all by Node API.

I have no idea how libuv plugs into Node, but I can suggest ways of accessing Node API even from stateless functions if you can't already (you can hang it off v8::Isolate)
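
A minimal sketch of what such an embedder-side API might look like, written without V8 so it stands alone. Everything here is an assumption for illustration: EmbedderApi, BufferKey and OnBufferCollected are hypothetical names, ABBackingStore is borrowed from the one-liner above, and the store allocates its own memory where a real implementation would adopt the pointer obtained from ArrayBuffer::Externalize.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <map>
#include <memory>

// Hypothetical backing store: owns one block and frees it when the last
// shared_ptr to it goes away.
class ABBackingStore {
 public:
  explicit ABBackingStore(std::size_t length)
      : data_(std::calloc(length ? length : 1, 1)), length_(length) {}
  ~ABBackingStore() { std::free(data_); }
  ABBackingStore(const ABBackingStore&) = delete;
  ABBackingStore& operator=(const ABBackingStore&) = delete;

  void* data() const { return data_; }
  std::size_t byte_length() const { return length_; }

 private:
  void* data_;
  std::size_t length_;
};

// Hypothetical embedder API. BufferKey stands in for whatever identifies an
// ArrayBuffer to the embedder.
class EmbedderApi {
 public:
  typedef int BufferKey;

  // The first call for a key stands in for externalizing that buffer; later
  // calls return the same store. `length` is only used on the first call.
  std::shared_ptr<ABBackingStore> GetBackingStore(BufferKey key,
                                                  std::size_t length) {
    auto it = stores_.find(key);
    if (it != stores_.end()) return it->second;
    auto store = std::make_shared<ABBackingStore>(length);
    stores_[key] = store;
    return store;
  }

  // Stand-in for a weak callback: the buffer object is gone, so the API
  // drops its own reference. Memory lives on until all users release theirs.
  void OnBufferCollected(BufferKey key) { stores_.erase(key); }

 private:
  std::map<BufferKey, std::shared_ptr<ABBackingStore>> stores_;
};
```

With this shape, a binding function asks the API for the store each time instead of externalizing on its own, and the weak-callback bookkeeping lives in one place.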


Hope this helps,
Dmitry


