--
v8-users mailing list
v8-u...@googlegroups.com
http://groups.google.com/group/v8-users
--
Node's console.log() uses v8::String::WriteUtf8() internally.
Unfortunately, it supports only the BMP (Basic Multilingual Plane).
http://code.google.com/p/v8/issues/detail?id=761
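
For reference, a character outside the BMP, such as U+2910E from the quoted
example below, is stored in a JS string as two 16-bit code units (a surrogate
pair). The arithmetic can be sketched roughly as follows. The helper names
fromCodePoint and codePointAt are hypothetical and defined here only for
illustration; they are not functions provided by v8 or Node in this thread.

```javascript
// Hypothetical helper: build a string from any Unicode code point,
// emitting a surrogate pair for code points above 0xFFFF.
function fromCodePoint(cp) {
  if (cp < 0 || cp > 0x10FFFF) {
    throw new RangeError('invalid code point: ' + cp);
  }
  if (cp <= 0xFFFF) {
    // Fits in a single 16-bit code unit.
    return String.fromCharCode(cp);
  }
  var offset = cp - 0x10000;             // 20 bits remain
  var high = 0xD800 + (offset >> 10);    // top 10 bits
  var low = 0xDC00 + (offset & 0x3FF);   // bottom 10 bits
  return String.fromCharCode(high, low);
}

// Hypothetical helper: read the code point that starts at index i,
// combining a high and low surrogate when both are present.
function codePointAt(str, i) {
  var high = str.charCodeAt(i);
  if (high >= 0xD800 && high <= 0xDBFF && i + 1 < str.length) {
    var low = str.charCodeAt(i + 1);
    if (low >= 0xDC00 && low <= 0xDFFF) {
      return ((high - 0xD800) << 10) + (low - 0xDC00) + 0x10000;
    }
  }
  return high;
}
```

With these definitions, fromCodePoint(0x2910e) should produce the same
two-unit string as String.fromCharCode(0xd864, 0xdd0e), whereas
String.fromCharCode(0x2910e) alone keeps only the low 16 bits (0x910e).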
On Thu, 11 Aug 2011 12:57:05 -0500, Marcel Laverdet <mar...@laverdet.com> wrote:
> console.log() and document.write() are not part of v8. These are host
> functions and have different implementations in Chrome and in NodeJS. Chrome
> seems to have a very robust implementation of both, which is aware of
> surrogate pairs and the target encoding. NodeJS on the other hand fails to
> respect surrogate pairs.
>
> Your examples don't show much other than the fact that
> String.fromCharCode() will not generate surrogate pairs and therefore can
> only generate characters with a 16-bit code point. You'll see the same
> results in NodeJS.
>
> '??' === String.fromCharCode(0xd864, 0xdd0e)
> true
>
> On Thu, Aug 11, 2011 at 12:05 PM, ~flow <wolfga...@gmail.com> wrote:
>
> > so i went and put the same javascript into an HTML page to be displayed by
> > chrome and into a standalone js snippet to be run using nodejs:
> >
> > var f = function( text ) {
> >   document.write( '<h1>', text, '</h1>' );
> >   document.write( '<div>', text.length, '</div>' );
> >   document.write( '<div>0x', text.charCodeAt(0).toString( 16 ), '</div>' );
> >   document.write( '<div>0x', text.charCodeAt(1).toString( 16 ), '</div>' );
> >   console.log( '<h1>', text, '</h1>' );
> >   console.log( '<div>', text.length, '</div>' );
> >   console.log( '<div>0x', text.charCodeAt(0).toString( 16 ), '</div>' );
> >   console.log( '<div>0x', text.charCodeAt(1).toString( 16 ), '</div>' );
> > };
> >
> > f( '??' );
> > f( String.fromCharCode( 0x2910e ) );
> > f( String.fromCharCode( 0xd864, 0xdd0e ) );
> >
> > in function f(), those document.write() calls are only present in the HTML
> > document, not the standalone.
> >
> > i want to show here that something more fundamental must be different
> > between javascript running inside google chrome and javascript running
> > inside nodejs. because, you see, the output i get inside chrome looks like
> > this:
> >
> > ??
> > 2
> > 0xd864
> > 0xdd0e
> > ?
> > 1
> > 0x910e
> > 0xNaN
> > ??
> > 2
> > 0xd864
> > 0xdd0e
> >
> > the second character is silently truncated (notice how the chr code is
> > reported as 0x910e where it should be 0x2910e) which is sad, but both
> > using a string literal and a numerical surrogate pair works---both in the
> > HTML page and in chrome's console output! conversely, in nodejs, this is
> > what i get:
> >
> > <h1> ? </h1>
> > <div> 1 </div>
> > <div>0x fffd </div>
> > <div>0x NaN </div>
> > <h1> ? </h1>
> > <div> 1 </div>
> > <div>0x 910e </div>
> > <div>0x NaN </div>
> > <h1> ?????</h1>
> > <div> 2 </div>
> > <div>0x d864 </div>
> > <div>0x dd0e </div>
> >
> > the silver lining here is that v8 inside nodejs does preserve the surrogate
> > pair, even though it fails to output it correctly. however, the
> > console.log() method gets it completely wrong. may i add that the analog in
> > python 3.1 does work---since i use a 'narrow' python build, it also reports
> > a string '??' as being two characters long, and manages to print it out
> > correctly, which seems to tell me that my ubuntu gnome terminal knows how to
> > handle surrogate pairs.
> >
> > i could perfectly live with those surrogate pairs---they're a nuisance but
> > i know how to deal with them from years of experience with python. the
> > really sad thing here is that nodejs's v8 seems to fall short on something
> > that v8 can be demonstrated to do correctly when running inside chrome.
> >
> > that said, let me add that i sometimes worry about the unneeded complexity
> > that goes into implementations. why can't people just use a 32-bit-wide
> > character datatype? instead they make users jump through all kinds of
> > gratuitous hoops.
> >
> >
>
--
{
name: "Koichi Kobayashi",
mail: "koi...@improvement.jp",
blog: "http://d.hatena.ne.jp/koichik/",
twitter: "@koichik"
}
I am not familiar with node.js but the actual function may not be
named log. I would search for "log" (the word log in double quotes)
to find where it is mapping a function name to a function pointer in
the console object.
--
Bryan White
console.log is defined in lib/console.js,
but it just passes the string to the stream.
You should look at Buffer::Utf8Write in src/node_buffer.cc.
It converts the JS string (16-bit code units, effectively UCS-2) to a
byte array (UTF-8) using v8::String::WriteUtf8().
https://github.com/joyent/node/blob/v0.4.10/src/node_buffer.cc#L471
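
To illustrate what a surrogate-aware version of that conversion would have to
do, here is a minimal sketch in plain JavaScript. It is not the code in
node_buffer.cc or in v8; utf8Encode is a hypothetical helper written only for
this example, and replacing unpaired surrogates with U+FFFD is an assumption
of the sketch.

```javascript
// Hypothetical helper, not part of Node or v8: encode a JS string as an
// array of UTF-8 byte values, combining each surrogate pair into a single
// 4-byte sequence.
function utf8Encode(str) {
  var bytes = [];
  for (var i = 0; i < str.length; i++) {
    var cp = str.charCodeAt(i);

    // High surrogate followed by a low surrogate: combine into one code point.
    if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < str.length) {
      var low = str.charCodeAt(i + 1);
      if (low >= 0xDC00 && low <= 0xDFFF) {
        cp = ((cp - 0xD800) << 10) + (low - 0xDC00) + 0x10000;
        i++; // the low surrogate has been consumed
      }
    }

    // Assumption of this sketch: unpaired surrogates become U+FFFD.
    if (cp >= 0xD800 && cp <= 0xDFFF) {
      cp = 0xFFFD;
    }

    if (cp < 0x80) {
      bytes.push(cp);
    } else if (cp < 0x800) {
      bytes.push(0xC0 | (cp >> 6), 0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
      bytes.push(0xE0 | (cp >> 12),
                 0x80 | ((cp >> 6) & 0x3F),
                 0x80 | (cp & 0x3F));
    } else {
      bytes.push(0xF0 | (cp >> 18),
                 0x80 | ((cp >> 12) & 0x3F),
                 0x80 | ((cp >> 6) & 0x3F),
                 0x80 | (cp & 0x3F));
    }
  }
  return bytes;
}
```

For the pair 0xd864, 0xdd0e (U+2910E) this yields the four bytes
F0 A9 84 8E, which is the sequence a UTF-8 terminal needs in order to
display the character.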