Chris Clarke wrote:
> ( [_id] => someid [title] => Semiconductor Physics and Devices
> (2011/12 \u2013 Semester 1) )
What else do you expect? The string is probably held as a Unicode
string in PHP (I am not a PHP guru), and print_r displays it using its
*internal* representation. That's what print_r is for. You need to
convert it to UTF-8, of course, to render it in "human-readable" form.
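If the stored value really does contain the six literal characters \u2013 rather than an en dash, one way to turn such escapes into UTF-8 is sketched below. This is only an illustration: decodeUnicodeEscapes is a hypothetical helper, not part of PHP or the driver, and it assumes the escapes follow the JSON \uXXXX form.

```php
<?php
// Hypothetical helper (not part of PHP or the MongoDB driver): replaces
// JSON-style \uXXXX escapes found in plain text with their UTF-8 encoding.
function decodeUnicodeEscapes(string $text): string
{
    return preg_replace_callback(
        // One or more consecutive \uXXXX escapes, so that surrogate pairs
        // are handed to json_decode together.
        '/(?:\\\\u[0-9a-fA-F]{4})+/',
        function (array $m): string {
            // Wrap the escapes in quotes so they form a JSON string literal.
            $decoded = json_decode('"' . $m[0] . '"');
            // Leave the text untouched if the escapes cannot be decoded.
            return is_string($decoded) ? $decoded : $m[0];
        },
        $text
    );
}

// The title as it appeared in the print_r output (single quotes keep the
// backslash literal).
$escaped = 'Semiconductor Physics and Devices (2011/12 \u2013 Semester 1)';
echo decodeUnicodeEscapes($escaped), "\n";
```

This only repairs text that already contains escape sequences; it does not explain how they got into the document in the first place.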
-aj
--
You received this message because you are subscribed to the Google Groups "mongodb-user" group.
To post to this group, send email to mongod...@googlegroups.com.
To unsubscribe from this group, send email to mongodb-user...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/mongodb-user?hl=en.
> I have some data in a mongo document that includes the EN DASH Unicode
> character
> (http://www.fileformat.info/info/unicode/char/2013/index.htm).
>
> When queried via the mongo shell I have something like this:
>
> {
> _id: "someid",
> title: "Semiconductor Physics and Devices (2011/12 – Semester 1)"
> }
>
> However, using the php driver, when I query this string and print_r
> the result I am getting:
>
> (
> [_id] => someid
> [title] => Semiconductor Physics and Devices (2011/12 \u2013 Semester 1)
> )
I can't reproduce this:
derick@whisky:/tmp$ cat unicode.php
<?php
$m = new Mongo();
$m->demo->test->insert( array( '_id' => 'unicode', 'unicode' => '–' ) );
$r = $m->demo->test->findOne( array( '_id' => 'unicode' ) );
var_dump($r);
print_r($r);
?>
derick@whisky:/tmp$ php unicode.php
array(2) {
  ["_id"]=>
  string(7) "unicode"
  ["unicode"]=>
  string(3) "–"
}
Array
(
    [_id] => unicode
    [unicode] => –
)
How are you adding the data? Could you do a var_dump() of your arguments
to insert/update?
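A rough sketch of what to look for when dumping those arguments: the bytes of the string show whether it holds a real en dash (three UTF-8 bytes) or the escape sequence as plain text (six ASCII characters). describeDash is a hypothetical helper, used here only for illustration.

```php
<?php
// Hypothetical helper: reports how a string stores the en dash, if at all.
function describeDash(string $s): string
{
    if (strpos($s, "\xE2\x80\x93") !== false) {
        return 'UTF-8 en dash (bytes e2 80 93)';
    }
    // Single quotes: this is a backslash followed by u2013, not a character.
    if (strpos($s, '\u2013') !== false) {
        return 'literal escape text (backslash, u, 2, 0, 1, 3)';
    }
    return 'no en dash found';
}

$real    = "2011/12 \xE2\x80\x93 Semester 1"; // en dash as UTF-8 bytes
$escaped = '2011/12 \u2013 Semester 1';       // escape sequence as text

echo describeDash($real), "\n";
echo describeDash($escaped), "\n";

// bin2hex() makes the difference visible byte by byte.
echo bin2hex("\xE2\x80\x93"), "\n"; // e28093
echo bin2hex('\u2013'), "\n";       // 5c7532303133
```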
cheers,
Derick
--
http://mongodb.org | http://derickrethans.nl
twitter: @derickr and @mongodb
> Can we see the code that saves the unicode character to DB in PHP?
>
> PHP should not convert your character to either UTF8 (which it isn't) or
> unicode without your express permission.
Just to clarify, PHP does not do any conversion between character sets,
and neither does the driver. The user is responsible for providing the
driver with UTF-8 strings.
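As a minimal sketch of what that responsibility can look like in application code: check each string before handing it to the driver, and convert it only when the source encoding is known. toUtf8 is a hypothetical helper, the Windows-1252 default is purely an assumption for the example, and the iconv extension is assumed to be available.

```php
<?php
// Hypothetical helper: returns $value unchanged if it is already valid UTF-8,
// otherwise converts it from an encoding the caller must know (assumption:
// Windows-1252 here, where the en dash is the single byte 0x96).
function toUtf8(string $value, string $assumedEncoding = 'Windows-1252'): string
{
    // With the /u modifier, preg_match() fails on subjects that are not
    // valid UTF-8, so a result of 1 means the string can be used as-is.
    if (preg_match('//u', $value) === 1) {
        return $value;
    }
    $converted = iconv($assumedEncoding, 'UTF-8', $value);
    if ($converted === false) {
        throw new InvalidArgumentException(
            "Cannot convert value from $assumedEncoding to UTF-8"
        );
    }
    return $converted;
}

$legacy = "2011/12 \x96 Semester 1"; // en dash as one Windows-1252 byte
$doc = array('_id' => 'someid', 'title' => toUtf8($legacy));
echo $doc['title'], "\n";
```

The resulting array is what would then be passed to insert() or update(). Guessing the source encoding is not reliable, so it is better to fix it at the point where the data enters the application.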