Binary data in embedded V8

Bret Taylor

Sep 3, 2008, 4:31:16 AM
to v8-users
I am exploring the V8 code base for the first time and making a few
embedding demos to become more familiar. I was in the process of
wrapping some system calls and calling them from JavaScript, and I was
curious what the best way to pass binary strings to and from
JavaScript is (and if it is possible).

For the sake of getting help, let's say I want a ReadFile(path) call
that is more or less like this, but capable of reading binary files:

Handle<Value> ReadFile(const Arguments& args) {
  if (args.Length() != 1) {
    return ThrowException(String::New("One argument required"));
  }
  ifstream input(*String::AsciiValue(args[0]), ios_base::binary);
  if (input.fail()) {
    return ThrowException(String::New("Could not read file"));
  }
  string contents;
  while (input.good()) {
    char buffer[1024];
    input.read(buffer, sizeof(buffer));
    contents.append(buffer, input.gcount());
  }
  return String::New(contents.c_str(), contents.size());
}

...
HandleScope handle_scope;
Handle<ObjectTemplate> global = ObjectTemplate::New();
global->Set(String::New("readFile"), FunctionTemplate::New(ReadFile));
Handle<Context> context = Context::New(NULL, global);
...


The documentation for String::New implies the contents of the file are
UTF-8-decoded at "return String::New(contents.c_str(),
contents.size())", which would obviously mangle binary data.
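
To make the concern concrete, here is a minimal standalone sketch (no V8 involved) of what a lossy UTF-8 decode does to arbitrary bytes. The function name DecodeUtf8Lossy is hypothetical, the decoder is deliberately simplified to 1- and 2-byte sequences, and the U+FFFD substitution is an assumption about how malformed input is handled, not a statement about V8's exact behavior.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Simplified UTF-8 decoder, for illustration only. It handles 1- and 2-byte
// sequences and substitutes U+FFFD for everything else (assumption: a real
// decoder may treat malformed input differently).
std::u16string DecodeUtf8Lossy(const std::vector<uint8_t>& bytes) {
  std::u16string out;
  for (std::size_t i = 0; i < bytes.size(); ++i) {
    uint8_t b = bytes[i];
    if (b < 0x80) {
      // Plain ASCII survives unchanged.
      out.push_back(static_cast<char16_t>(b));
    } else if ((b & 0xE0) == 0xC0 && i + 1 < bytes.size() &&
               (bytes[i + 1] & 0xC0) == 0x80) {
      // Two input bytes collapse into a single code unit.
      out.push_back(
          static_cast<char16_t>(((b & 0x1F) << 6) | (bytes[i + 1] & 0x3F)));
      ++i;
    } else {
      // The original byte value is lost here.
      out.push_back(static_cast<char16_t>(0xFFFD));
    }
  }
  return out;
}
```

With input bytes 0x41 0xC3 0xA9 0xFF 0xFE, this sketch yields four code units rather than five, and the last two bytes can no longer be told apart, so the original file contents cannot be recovered from the string.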

I realize binary data is not that important to client-side JavaScript,
but it is useful for embedded scenarios, and I am curious how V8/other
engines deal with it, if at all.

Bret

Erik Corry

Sep 3, 2008, 6:38:49 AM
to v8-users
On Sep 3, 10:31 am, Bret Taylor <btay...@gmail.com> wrote:
> I am exploring the V8 code base for the first time and making a few
> embedding demos to become more familiar. I was in the process of
> wrapping some system calls and calling them from JavaScript, and I was
> curious what the best way to pass binary strings to and from
> JavaScript is (and if it is possible).

Right now there's no simple way to do that for 8-bit data.

The String::New(const char* data, int length) method will 'mangle'
bytes over 127 (i.e., interpret them as UTF-8 if possible).

The String::New(const uint16_t* data, int length) method doesn't
mangle anything, but it only takes an even number of bytes, and if the
binary input is really 8-bit data then each character in the string is
built from two input bytes and bears little relation to the original data.
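
A small standalone sketch of that second problem, assuming the byte buffer is simply reinterpreted as 16-bit units on a little-endian host. The helper name PairBytesLittleEndian is hypothetical and the code does not call V8.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Combines each pair of bytes into one 16-bit code unit, low byte first,
// which is what reinterpreting the buffer as uint16_t yields on a
// little-endian host. A trailing odd byte has no partner and is dropped.
std::vector<uint16_t> PairBytesLittleEndian(const std::vector<uint8_t>& bytes) {
  std::vector<uint16_t> units;
  for (std::size_t i = 0; i + 1 < bytes.size(); i += 2) {
    units.push_back(static_cast<uint16_t>(bytes[i] | (bytes[i + 1] << 8)));
  }
  return units;
}
```

For the three bytes 0x41 0x42 0xFF this produces the single unit 0x4241: the characters 'A' and 'B' have merged into one unrelated character and the odd final byte is gone.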

If your data is binary and not really a string, then perhaps you should
store it in an array instead of in a string. If you do that, note that
integers that fit in 31 bits or fewer (8, 16 or 24, for example) are more
efficient in V8 than full-blown signed 32-bit integers: values in the half
of the 32-bit range that does not fit in 31 bits each require a separate
Number object.
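
As a rough standalone illustration of the array approach, the sketch below packs three bytes into each array element so that every value stays well inside 31 bits. The names FitsIn31Bits and PackBytes24 are hypothetical, and the range [-2^30, 2^30 - 1] is an assumption about what "31 bits" means here (a signed 31-bit value).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption: the efficient "small integer" range is a signed 31-bit value,
// i.e. [-2^30, 2^30 - 1].
bool FitsIn31Bits(int64_t v) {
  return v >= -(int64_t{1} << 30) && v <= (int64_t{1} << 30) - 1;
}

// Packs up to three bytes per element, most significant byte first, so every
// element is at most 0xFFFFFF. The caller must keep the original byte count
// separately, because the last element may hold fewer than three bytes.
std::vector<int32_t> PackBytes24(const std::vector<uint8_t>& bytes) {
  std::vector<int32_t> out;
  for (std::size_t i = 0; i < bytes.size(); i += 3) {
    int32_t v = 0;
    for (std::size_t j = i; j < i + 3 && j < bytes.size(); ++j) {
      v = (v << 8) | bytes[j];
    }
    out.push_back(v);
  }
  return out;
}
```

Storing one byte per element is simpler and also stays in range; packing three per element only trades some decoding work for a shorter array.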

Feng Qian

Sep 3, 2008, 10:35:34 AM
to v8-u...@googlegroups.com
You can use a JavaScript Array to hold binary data.

Arseniy Pavlenko

Jul 4, 2013, 6:03:09 PM
to v8-u...@googlegroups.com
Any updates/workarounds for 8 bit binary strings?

Stephan Beal

Jul 4, 2013, 6:08:13 PM
to v8-u...@googlegroups.com
On Fri, Jul 5, 2013 at 12:03 AM, Arseniy Pavlenko <h0x...@gmail.com> wrote:
> Any updates/workarounds for 8 bit binary strings?

JS doesn't support them, so v8 doesn't either. You cannot legally use Strings for arbitrarily-encoded data. You _can_ wrap your own Buffer types which point to such memory, but it is illegal to convert their contents to a JS string.

--
----- stephan beal
http://wanderinghorse.net/home/stephan/

Ben Noordhuis

Jul 4, 2013, 7:11:57 PM
to v8-u...@googlegroups.com
On Fri, Jul 5, 2013 at 12:08 AM, Stephan Beal <sgb...@googlemail.com> wrote:
> On Fri, Jul 5, 2013 at 12:03 AM, Arseniy Pavlenko <h0x...@gmail.com> wrote:
>>
>> Any updates/workarounds for 8 bit binary strings?
>
>
> JS doesn't support them, so v8 doesn't either. You cannot legally use
> Strings for arbitrarily-encoded data. You _can_ wrap your own Buffer types
> which point to such memory, but it is illegal to convert their contents to a
> JS string.

There's a trick though:

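Handle<String> BinaryString(const uint8_t* data, size_t size) {
  uint16_t* out = new uint16_t[size];
  // Widen each byte to one 16-bit code unit.
  for (size_t n = 0; n < size; ++n) out[n] = data[n];
  // Pass the length explicitly: the buffer is not NUL-terminated and
  // binary data may contain embedded zeros.
  Local<String> s = String::NewFromTwoByte(
      Isolate::GetCurrent(), out, String::kNormalString,
      static_cast<int>(size));
  delete[] out;
  return s;
}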

There's room for optimization of course, but the basic concept is the
same: convert the input to 16 bits and create a two-byte string from
that.
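
The widening step and its inverse can be sketched without V8, which makes the round trip easy to check in isolation. WidenBytes and NarrowToBytes are hypothetical helper names, and std::u16string stands in for the two-byte string contents.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Widen: each byte becomes one 16-bit code unit with the same value (0..255).
std::u16string WidenBytes(const std::vector<uint8_t>& bytes) {
  std::u16string out;
  out.reserve(bytes.size());
  for (uint8_t b : bytes) out.push_back(static_cast<char16_t>(b));
  return out;
}

// Narrow: the inverse of WidenBytes. Returns false if any code unit is above
// 0xFF, meaning the string did not come from widened bytes.
bool NarrowToBytes(const std::u16string& s, std::vector<uint8_t>* out) {
  out->clear();
  out->reserve(s.size());
  for (char16_t c : s) {
    if (c > 0xFF) return false;
    out->push_back(static_cast<uint8_t>(c));
  }
  return true;
}
```

Because every byte maps to exactly one code unit, embedded zeros and values above 127 survive the round trip, at the cost of using two bytes of string storage per byte of data.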

Dan Carney

Jul 6, 2013, 3:29:05 AM
to v8-u...@googlegroups.com
On Friday, July 5, 2013 12:03:09 AM UTC+2, Arseniy Pavlenko wrote:
> Any updates/workarounds for 8 bit binary strings?

V8 supports 8-bit strings now. It has for some months. Use String::NewFromOneByte. For arbitrary binary data, though, you probably want an array.

Arseniy Pavlenko

Jul 6, 2013, 4:00:32 AM
to v8-u...@googlegroups.com
Thanks!!!
