Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Oracle 10g / PHP / utf-8

355 views
Skip to first unread message

Nico

unread,
Nov 19, 2008, 1:18:37 PM11/19/08
to
Hi, I have installed the Oracle Database 10g Express Edition Release
10.2.0.1.0 on a Linux workstation for dev purposes, along with Apache
2 and PHP5. The production environment uses Unix for the Oracle
database server, Win2003/IIS for the webserver.

Our PHP application works with UTF-8 encoded strings and actually
sends SQL INSERTS in UTF-8 format to the server. When this data is
retrieved, foreign characters (all characters which are not in the
Basic Latin alphabet) are corrupted. It's not clear to me where I
should search for the characterset related issues. It seems to be a
server-side problem.

Some NLS settings:

National Language Parameter Value
NLS_CHARACTERSET AL32UTF8
NLS_COMP BINARY
NLS_LANGUAGE AMERICAN
NLS_NCHAR_CHARACTERSET AL16UTF16
NLS_NCHAR_CONV_EXCP FALSE
NLS_TERRITORY AMERICA

Questions:
- Is it possible to use ALTER SESSION to change the characterset?
- What do we need to configure in order to have the database server
and the client work with UTF-8?

Thanks in advance for your reply.

Kind regards,
Nico

sybr...@hccnet.nl

unread,
Nov 19, 2008, 6:09:06 PM11/19/08
to
Comments embedded


On Wed, 19 Nov 2008 10:18:37 -0800 (PST), Nico <nicod...@yahoo.com>
wrote:

>Hi, I have installed the Oracle Database 10g Express Edition Release
>10.2.0.1.0 on a Linux workstation for dev purposes, along with Apache
>2 and PHP5. The production environment uses Unix for the Oracle
>database server, Win2003/IIS for the webserver.
>
>Our PHP application works with UTF-8 encoded strings and actually
>sends SQL INSERTS in UTF-8 format to the server.

the server. Which one? The server hosting Express Edition, or the
server using an unknown version of Oracle RDBMS whatever?
Is Apache involved or IIS?


When this data is
>retrieved, foreign characters (all characters which are not in the
>Basic Latin alphabet) are corrupted. It's not clear to me where I
>should search for the characterset related issues.

You need to read the Globalization Manual.


It seems to be a
>server-side problem.

Why? Assuming the database characterset is an UTF8 one, why would it
be a server side problem? Did you check the NLS_LANG environment
variable (Unixes) or the NLS_LANG string variable in the registry on
the IIS server?


>
>Some NLS settings:
>

There are NLS session, instance and database parameters.
To which one does this refer?


>National Language Parameter Value
>NLS_CHARACTERSET AL32UTF8
>NLS_COMP BINARY
>NLS_LANGUAGE AMERICAN
>NLS_NCHAR_CHARACTERSET AL16UTF16
>NLS_NCHAR_CONV_EXCP FALSE
>NLS_TERRITORY AMERICA
>
>Questions:
>- Is it possible to use ALTER SESSION to change the characterset?

No, it isn't

>- What do we need to configure in order to have the database server
>and the client work with UTF-8?
>

You need to set up NLS_LANG on the client side (client can be a
webserver appropiately). Which is in the documentation you didn't
read.


>Thanks in advance for your reply.
>
>Kind regards,
>Nico

Regards,

--
Sybrand Bakker
Senior Oracle DBA

Laurenz Albe

unread,
Nov 20, 2008, 5:09:25 AM11/20/08
to

You'll have to set the environment variable NLS_LANG to
AMERICAN_AMERICA.AL32UTF8
so that the Oracle client knows that the data you feed it is actually UTF-8.

Yours,
Laurenz Albe

Nico

unread,
Nov 20, 2008, 10:28:25 AM11/20/08
to
Hi, thanks for the replies.

@Sybrand: BEFORE POSTING yesterday, I searched the net and read quite
some docs.
On Linux/Apache/PHP5/Oracle Database 10g Express Edition Release
10.2.0.1.0 I checked the script /usr/lib/oracle/xe/app/oracle/product/
10.2.0/server/config/scripts/sqlplus.sh, where the environment var
NLS_LANG is set to AMERICAN.UTF8. So far so good. It means that the
client was properly configured. Since these settings did not have the
desired effect, I decided to post here. It seems I should have
provided this info to avoid some flames. Sorry ;)

@Laurenz: From PHP I connect using oci_connect(["Username"],
["Password"],["Server"],"AL32UTF8"); The last par is the charset. To
make sure we are OK in the PHP client, I set the environment var once
more: putenv("NLS_LANG=_.UTF8");

Today I have done lots of testing again.
Surprisingly we had a good situation, without problems with UTF8.
After recreation of the database with sqlplus in a number of
environments (production, test, development), data became corrupted.
Today the UTF8 problem disappeared miraculously. No idea why. And
exactly that makes me somewhat concerned, since these type of problems
tend to pop up again exactly when it is really inconvenient.

So please, I'm just an application developer, not a DBA.
Could anyone hint me towards a clear understanding of this problem?

Kind regards,
Nico

Laurenz Albe

unread,
Nov 20, 2008, 11:17:54 AM11/20/08
to
Nico <nicod...@yahoo.com> wrote:
> On Linux/Apache/PHP5/Oracle Database 10g Express Edition Release
> 10.2.0.1.0 I checked the script /usr/lib/oracle/xe/app/oracle/product/
> 10.2.0/server/config/scripts/sqlplus.sh, where the environment var
> NLS_LANG is set to AMERICAN.UTF8. So far so good. It means that the
> client was properly configured.

Oracle's UTF8 is in UNICODE Version 3.0
Oracle's AL32UTF8 is UNICODE Version 4.01

The latter is preferred.

> @Laurenz: From PHP I connect using oci_connect(["Username"],
> ["Password"],["Server"],"AL32UTF8"); The last par is the charset. To
> make sure we are OK in the PHP client, I set the environment var once
> more: putenv("NLS_LANG=_.UTF8");

Again, it should be AL32UTF8.

> Today I have done lots of testing again.
> Surprisingly we had a good situation, without problems with UTF8.
> After recreation of the database with sqlplus in a number of
> environments (production, test, development), data became corrupted.
> Today the UTF8 problem disappeared miraculously. No idea why. And
> exactly that makes me somewhat concerned, since these type of problems
> tend to pop up again exactly when it is really inconvenient.
>
> So please, I'm just an application developer, not a DBA.
> Could anyone hint me towards a clear understanding of this problem?

If the Client Encoding is set to AL32UTF8, it should work fine.
It's a simple as that.

It's hard to say from here why it first didn't work on your computer
and now does.

If you change environment variables, you have to restart the process
if it should pick them up, but I guess you know that.

Yours,
Laurenz Albe

Nico

unread,
Nov 20, 2008, 2:02:54 PM11/20/08
to
On Nov 20, 5:17 pm, Laurenz Albe <inv...@spam.to.invalid> wrote:

> Nico <nicodenb...@yahoo.com> wrote:
> > On Linux/Apache/PHP5/Oracle Database 10g Express Edition Release
> > 10.2.0.1.0  I checked the script /usr/lib/oracle/xe/app/oracle/product/
> > 10.2.0/server/config/scripts/sqlplus.sh, where the environment var
> > NLS_LANG is set to AMERICAN.UTF8. So far so good. It means that the
> > client was properly configured.
>
> Oracle's UTF8 is in UNICODE Version 3.0
> Oracle's AL32UTF8 is UNICODE Version 4.01
>
> The latter is preferred.
>

Laurenz; As I wrote, from PHP I do not set the language, but only the
charset - to AL32UTF8. So in PHP we're OK.

If I understand correct, for sqlplus or other client software, I
should set the environment var NLS_LANG to AMERICAN.AL32UTF8 ?

Thank you Laurenz for your feedback!

Kind regards, Nico

The Boss

unread,
Nov 20, 2008, 7:01:05 PM11/20/08
to
Nico wrote:
> On Nov 20, 5:17 pm, Laurenz Albe <inv...@spam.to.invalid> wrote:
>> Nico <nicodenb...@yahoo.com> wrote:
>>> On Linux/Apache/PHP5/Oracle Database 10g Express Edition Release
>>> 10.2.0.1.0 I checked the script
>>> /usr/lib/oracle/xe/app/oracle/product/
>>> 10.2.0/server/config/scripts/sqlplus.sh, where the environment var
>>> NLS_LANG is set to AMERICAN.UTF8. So far so good. It means that the
>>> client was properly configured.
>>
>> Oracle's UTF8 is in UNICODE Version 3.0
>> Oracle's AL32UTF8 is UNICODE Version 4.01
>>
>> The latter is preferred.
>>

That's an understatement, see links below.

>
> Laurenz; As I wrote, from PHP I do not set the language, but only the
> charset - to AL32UTF8. So in PHP we're OK.
>
> If I understand correct, for sqlplus or other client software, I
> should set the environment var NLS_LANG to AMERICAN.AL32UTF8 ?

Make that AMERICAN_AMERICA.AL32UTF8 (as stated by Laurenz).
Also check these links:
http://www.mail-archive.com/uni...@unicode.org/msg17884.html
(+subsequent replies!)
http://forums.oracle.com/forums/thread.jspa?threadID=343349
where you also will find a link to a 14-page Oracle Whitepaper on Oracle's
Unicode support.

HTH

--
Jeroen


Nico

unread,
Nov 21, 2008, 5:08:28 PM11/21/08
to
Hi Jeroen,

On Nov 21, 1:01 am, "The Boss" <use...@No.Spam.Please.invalid> wrote:
> Make that AMERICAN_AMERICA.AL32UTF8 (as stated by Laurenz).

The customer is in Holland, uses the Dutch language only.
Just curious; what's the difference between AMERICA and
AMERICAN_AMERICA?
And, is the country setting really important or even interesting while
using the charset AL32UTF8?

> Also check these links:
> http://www.mail-archive.com/unic...@unicode.org/msg17884.html
> (+subsequent replies!)http://forums.oracle.com/forums/thread.jspa?threadID=343349


> where you also will find a link to a 14-page Oracle Whitepaper on Oracle's
> Unicode support.

Very interesting discussions in your links. It seems clear that
AL32UTF8 is the preferred choice.
I'll read the white paper this weekend.

>
> HTH
>
> --
> Jeroen

Yes, you surely helped. Thanks!

To ALL: In the production environment (Oracle database server on Unix,
webserver on win2003 / IIS) the environment setting NLS_LANG needed to
be fixed. It now works well. For my dev workstation we did not find an
explanation yet why it first worked well, then not any more and later
mysteriously properly.

Kind regards,
Nico

Mladen Gogala

unread,
Nov 22, 2008, 9:05:36 AM11/22/08
to
On Fri, 21 Nov 2008 14:08:28 -0800, Nico wrote:

> Just curious; what's the difference between AMERICA and
> AMERICAN_AMERICA?

Well, there are also states called Canada and Mexico on the same
continent as the United States of America. Mexico has some exquisite food
and great tourist destinations while Canada has lots of forrests, some
gloomy weather and strange dialect, ah? The best thing in Canada is the
Pythian Group, excellent database consulting company.


--
http://mgogala.freehostia.com

0 new messages