Chris Angelico
I'm not at the computer where I do that coding right now, but Google
pointed me to this page which has the prototypes:
http://bespin.cz/~ondras/html/classv8_1_1String.html
V8EXPORT int WriteAscii (char *buffer, int start=0, int length=-1,
    WriteHints hints=NO_HINTS) const
V8EXPORT int WriteUtf8 (char *buffer, int length=-1,
    int *nchars_ref=NULL, WriteHints hints=NO_HINTS) const
Returns:
The number of characters copied to the buffer excluding the null
terminator. For WriteUtf8: The number of bytes copied to the buffer
including the null terminator.
I don't really care about the number of characters, only the number of
bytes, which I then further process.
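To make the documented return value concrete, here is a minimal sketch. It does not call V8 at all: write_utf8_like and payload_bytes are hypothetical names invented for this illustration, and the stand-in only mimics the "bytes copied including the null terminator" behaviour quoted above.

```cpp
#include <cstring>
#include <string>

// Hypothetical stand-in, NOT the V8 API. Copies `src` plus a null terminator
// into `buffer` and returns the number of bytes written INCLUDING the null,
// mirroring the return value documented for WriteUtf8.
// The caller must provide at least src.size() + 1 bytes of space.
int write_utf8_like(const std::string& src, char* buffer) {
    std::memcpy(buffer, src.data(), src.size());
    buffer[src.size()] = '\0';
    return static_cast<int>(src.size()) + 1;
}

// Hypothetical helper: the payload length for further processing is the
// returned count minus the terminator.
int payload_bytes(int written_including_null) {
    return written_including_null > 0 ? written_including_null - 1 : 0;
}
```

With a three-byte string such as "abc", the stand-in reports four bytes written, and the payload length after adjustment is three.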
Chris Angelico
Well, it's wasting some effort. You just have to decrement the count
that you get back. It's not impossible to deal with, but it just has
that anomalous feeling of 'struct tm' from the C standard library -
day of month is 1-31, but month of year is 0-11. The true anomaly is
the day of month, which ought to be 0-30 to parallel the
hour/minute/second, but civil time is usually displayed starting from
1 in the day and month, and it feels weird to have it come out
differently. WriteUtf8 can logically be explained as returning the
count of bytes written, but 'most every other function that does this
sort of job won't count the null.
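The 'struct tm' comparison can be seen by breaking down the Unix epoch. This is a small sketch with a hypothetical helper name, epoch_fields, and it assumes a platform where a time_t of 0 means 1970-01-01 00:00:00 UTC (true on POSIX systems, not promised by the C standard itself).

```cpp
#include <ctime>

// Hypothetical helper: break time_t 0 into UTC calendar fields.
// Assumes time_t 0 is 1970-01-01 00:00:00 UTC, as on POSIX systems.
std::tm epoch_fields() {
    std::time_t t = 0;
    // gmtime returns a pointer to static storage, so copy the result out.
    return *std::gmtime(&t);
}
```

For the first of January, tm_mday comes back as 1 while tm_mon comes back as 0, alongside zero-based hour, minute, and second fields.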
Chris Angelico
One ASCII character fits in one byte. One Unicode character doesn't,
and encoded as UTF-8, might take between one and three bytes. The null
terminator takes one byte, of course.
ChrisA
> One ASCII character fits in one byte. One Unicode character doesn't,
> and encoded as UTF-8, might take between one and three bytes. The null
> terminator takes one byte, of course.
Sorry, my bad. I don't know why I said three; probably a consequence
of posting at 3AM. Three bytes covers the BMP, four bytes will cover
all currently-defined Unicode codepoints. Not significant at the
moment, though.
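For reference, the byte counts per code point can be sketched as below. The function name utf8_length is made up for this example; the ranges are the standard UTF-8 ones. Note that this sketch does not reject the surrogate range U+D800 to U+DFFF, which a strict encoder would.

```cpp
#include <cstdint>

// Hypothetical helper: number of bytes needed to encode one Unicode code
// point as UTF-8. Returns 0 for values above U+10FFFF.
// Surrogates (U+D800..U+DFFF) are not rejected in this sketch.
int utf8_length(std::uint32_t cp) {
    if (cp < 0x80) return 1;        // ASCII
    if (cp < 0x800) return 2;
    if (cp < 0x10000) return 3;     // rest of the BMP
    if (cp <= 0x10FFFF) return 4;   // supplementary planes
    return 0;                       // outside the Unicode range
}
```

So U+0041 takes one byte, U+20AC (the euro sign) takes three, and anything beyond the BMP, such as U+1F600, takes four.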
On Tue, Jul 5, 2011 at 6:40 AM, Henrik Lindqvist
<henrik.l...@gmail.com> wrote:
> It's more serious than just a little "quirk". Many binary protocols use
> Pascal-type strings where the length is stored explicitly, and there
> String::WriteUtf8 can't be used. V8 should at least skip writing \0
> when HINT_MANY_WRITES_EXPECTED is specified; that would be logical.
The trouble is, any code written now will expect it to include the \0
in the count. Would it suit to add an additional hint, e.g.
HINT_NO_NULL_TERMINATOR, which will then (a) not write the null, and
(b) not include it in the count?
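Roughly what I have in mind, as a standalone sketch rather than actual V8 code. Every name here (WriteOptionSketch, kNoNullTerminator, write_bytes, write_pascal_string) is hypothetical and exists only for this illustration; the point is that the default behaviour stays as it is now, and the new flag changes both the write and the count, which is what a length-prefixed format needs.

```cpp
#include <cstring>
#include <string>

// Hypothetical option flags for illustration only; not V8's declarations.
enum WriteOptionSketch {
    kNoOptions = 0,
    kNoNullTerminator = 1
};

// Hypothetical writer. Copies `src` into `buffer` and returns the number of
// bytes written. By default a null terminator is written and counted; with
// kNoNullTerminator it is neither written nor counted.
// The caller must provide at least src.size() + 1 bytes of space.
int write_bytes(const std::string& src, char* buffer, int options = kNoOptions) {
    std::memcpy(buffer, src.data(), src.size());
    int written = static_cast<int>(src.size());
    if (!(options & kNoNullTerminator)) {
        buffer[written] = '\0';
        ++written;
    }
    return written;
}

// Pascal-style string: one length byte followed by the payload, with no
// terminator. Assumes the payload is at most 255 bytes and that `out` has
// room for src.size() + 1 bytes. Returns the total bytes used in `out`.
int write_pascal_string(const std::string& src, unsigned char* out) {
    int n = write_bytes(src, reinterpret_cast<char*>(out + 1), kNoNullTerminator);
    out[0] = static_cast<unsigned char>(n);
    return n + 1;
}
```

Existing callers that pass no option still get the null and the inflated count, so nothing written today breaks.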
It'll be a fairly simple change. I could make it when I get to work in
an hour or so, and submit a patch. Where are such things handled?
Chris Angelico
On Tue, Jul 5, 2011 at 7:53 AM, Chris Angelico <ros...@gmail.com> wrote:
> It'll be a fairly simple change. I could make it when I get to work in
> an hour or so, and submit a patch. Where are such things handled?
Changes made and being tested. Moving this thread to the v8-dev list,
which is more appropriate.
Thanks for the advice!
Chris Angelico
If by "docs" you mean the Doxygen comments in v8.h, the patch I posted
on the bugtracker does update that. Are there other docs?
ChrisA
The API docs that you can find splattered across the internet are all
generated from the source, and differ only in how up-to-date they are.
Open source software: Where you have the power to read the source, but
also the need to.
ChrisA
I ended up calling it WRITE_NO_NULL_TERMINATOR and renaming 'hints' to
'options'; patch is here:
http://code.google.com/p/v8/issues/detail?id=1537
ChrisA
I'm not at the computer where I do V8 work, but you could fairly
easily implement that yourself. You may wish to apply my patch first,
and call yours an option rather than a hint.
ChrisA