Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

[Info-ingres] replacing special characters eg ä in SQL with a unicode script u'ä'

305 views
Skip to first unread message

Allan Biggs

unread,
Apr 14, 2016, 10:24:03 AM4/14/16
to info-...@lists.planetingres.org
Is there a way of replacing characters such as ä  (a umlaut) as part of an SQL query.

We are sometimes fed these characters by another system,

in my case I am producing a python script which for ä needs to be represented by the string u'ä' or substituted for a none unicode character.


clearly I can do something like

update filename_w_asd  set file_name = replace(file_name,'ä','''ä''');
\p\g

but is there a more generic way to do this?

Allan


This communication is for use by the intended recipient and contains
information that may be Privileged, confidential or copyrighted under
applicable law. If you are not the intended recipient, you are hereby
formally notified that any use, copying or distribution of this e-mail,
in whole or in part, is strictly prohibited. Please notify the sender by
return e-mail and delete this e-mail from your system. Unless explicitly
and conspicuously designated as "E-Contract Intended", this e-mail does
not constitute a contract offer, a contract amendment, or an acceptance
of a contract offer. This e-mail does not constitute a consent to the
use of sender's contact information for direct marketing purposes or for
transfers of data to third parties.

Francais Deutsch Italiano  Espanol  Portugues  Japanese  Chinese  Korean

          http://www.DuPont.com/corp/email_disclaimer.html

Chris Clark

unread,
Apr 14, 2016, 11:46:48 AM4/14/16
to
On Thursday, April 14, 2016 at 7:24:03 AM UTC-7, Allan Biggs wrote:
> Is there a way of replacing characters
> such as ä
>  (a umlaut) as part of an
> SQL query.
>
> We are sometimes fed these characters
> by another system,
>
> in my case I am producing a python script
> which for ä needs to
> be represented by the string u'ä'
> or substituted for a none unicode character.

Ingres will translit between varchar and nvarchar. So Ingres considers (using your syntax) 'ä' and u'ä' as comparable (assuming the character is encoded correctly in the declared II_CHARSETxx encoding).

Depending on which version of Python 2.x or 3.x series they may or may not compare the same way (depending on locale, encoding, etc.).


>
> clearly I can do something like
>
> update filename_w_asd  set file_name
> = replace(file_name,'ä','''ä''');
>
> \p\g
>
> but is there a more generic way to do
> this?
>
> Allan

I'm not sure you need to do this at all. Unless you are trying to convert something into 7-bit clean US-ASCII, i.e. 'ä' into 'a' (plain a). There isn't a good way to do this in Ingres (you could use a NFC database and strip all non-ascii characters but again, not a good way). Doing this as part of ETL (e.g. with a Python script) is probably your best bet here.

If you just want unicode strings in Python, then select using an nvarchar cast. E.g.

# Assuming jyjdbc or (py)pyodbc driver using Ingres JDBC or ODBC driver
c.execute("SELECT nvarchar('ä')") # NOTE ensure python encoding set correctly
c.execute("SELECT nvarchar(?)", ('ä',)) # NOTE ensure python encoding set correctly

Allan Biggs

unread,
Apr 14, 2016, 12:04:04 PM4/14/16
to Chris Clark, info-...@lists.planetingres.org
Chris ,

I just realised I only included Ian on this - this is my fuller answer.

I am a python novice I am just using it via http://xlsxwriter.readthedocs.org/ to produce a .xlsx spreadsheet  - the Ä     went directly to the python code which objected to it. when it was run
$python test.py

thanks
Allan


----- Forwarded by Allan Biggs/GB/CONT/DPT on 14/04/2016 16:55 -----

From:        Allan Biggs/GB/CONT/DPT
To:        Ian Kirkham <Ian.K...@actian.com>@DUPONT_MHUB
Date:        14/04/2016 16:30
Subject:        RE: [Info-ingres] replacing special characters  eg ä  in SQL with a unicode script u'ä'



Ian,

I am using report writer to create a python script  (test.py) which is then run by python

$python test.py

to produce an .xlsx spreadsheet. All on linux

see http://xlsxwriter.readthedocs.org/ its a really easy way of generating .xlsx files from pure script.

I needed to put the registered mark  ® into the spreadsheet and found the only way I could do this was

this .....
worksheet1.write(  43,   0,'xxxx' u'®'  'and xxxx' u'®'  ' are registered trademarks of xxxxxxxxx.' ,format0)
worksheet1.write(  44,   0,'Only xxxxxx makesxxxxx' u'®' ' brand and xxxxxx ' u'®' 'xxxxxx',format0)

..... works very well but in final testing we discovered that certain orders did not run and that an A Umlaut in the customer data stopped the python script working

Oh - the python has to have  this as the first line

# _*_ coding: utf-8

..... I tried this quickly
worksheet1.write(  44,   0,'Only xxxxxx makesxxxxx' u'®' ' br u'Ä' nd and xxxxxx ' u'®' 'xxxxxx',format0)


which works.

 Ingres was happy to pass the umlauts on , its the python that was unhappy.

I am half way through modifying the sql like this to replace the umlaut characters but I would like to have found a more generic solution
not sure that this will work when put into a string (there are a lot of '''''s around) :-)

update rxrpy_w_rrp set cons_cust  = replace(cons_cust,'ä','u''ä''');
update rxrpy_w_rrp set cons_town  = replace(cons_town,'ä','u''ä''');
update rxrpy_w_rrp set cons_ctry  = replace(cons_ctry,'ä','u''ä''');
update rxrpy_w_rrp set cust_ref   = replace(cust_ref,'ä','u''ä''');
update rxrpy_w_rrp set cust_ordno = replace(cust_ordno,'ä','u''ä''');

update rxrpy_w_rrp set cons_cust  = replace(cons_cust,'Ä','u''Ä''');
update rxrpy_w_rrp set cons_town  = replace(cons_town,'Ä','u''Ä''');
update rxrpy_w_rrp set cons_ctry  = replace(cons_ctry,'Ä','u''Ä''');
update rxrpy_w_rrp set cust_ref   = replace(cust_ref,'Ä','u''Ä''');
update rxrpy_w_rrp set cust_ordno = replace(cust_ordno,'Ä','u''Ä''');

\p\g

 
forgive the redactions there is customer specific data in here  - I would send you the python script but cannot for the same reason ...

thanks
Allan





From:        Ian Kirkham <Ian.K...@actian.com>
To:        Allan Biggs <Allan...@GBR.dupont.com>
Date:        14/04/2016 15:55
Subject:        RE: [Info-ingres] replacing special characters  eg ä  in SQL with a unicode script u'ä'




This really depends on what you are trying to achieve.
UTF-8 handles these very well so what is the problem that makes you want to embed quotes around them?
Are you talking about the constants in a query or the data in the tables?
Or is it that you really want to transfer the data in ASCII to avoid problems with handling 8-bit data?
 
Regards,
Ian
 
From: info-ingr...@lists.planetingres.org [mailto:info-ingr...@lists.planetingres.org] On Behalf Of Allan Biggs
Sent:
14 April 2016 15:24
To:
info-...@lists.planetingres.org
Subject:
[Info-ingres] replacing special characters eg ä in SQL with a unicode script u'ä'

 
Is there a way of replacing characters such as ä  (a umlaut) as part of an SQL query.

We are sometimes fed these characters by another system,


in my case I am producing a python script which for
ä needs to be represented by the string u'ä' or substituted for a none unicode character.


clearly I can do something like


update filename_w_asd  set file_name = replace(file_name,'
ä','''ä''');
\p\g


but is there a more generic way to do this?


Allan

Chris Clark

unread,
Apr 14, 2016, 12:23:13 PM4/14/16
to
On Thursday, April 14, 2016 at 9:04:04 AM UTC-7, Allan Biggs wrote:
> I am a python novice I am just using
> it via http://xlsxwriter.readthedocs.org/
> to produce a .xlsx spreadsheet  - the Ä
>     went directly to the python code which objected to it. when
> it was run
...

>
> this .....
>
> worksheet1.write(  43,
>   0,'xxxx' u'®'  'and xxxx' u'®'  ' are registered trademarks
> of xxxxxxxxx.' ,format0)


Two options (there are others but these are least effort):
1) you could use Actian Director to generate exports
2) change the python code you emit - which is sounds like this would be your preferred options

For 2) there are two further options (one may not work):

a) Just use a str type (and use Python 2.x) - there is a chance xlsxwriter may check the type and fail
b) Just use a str type and convert to unicode, e.g.:


worksheet1.write(43, 0, 'xxxx ® and xxxx® are registered trademarks of xxxxxxxxx.'.decode('utf8') ,format0)


i.e.:

'xxxx' u'®' 'and xxxx' u'®' ' are registered trademarks of xxxxxxxxx.'

becomes

'xxxx ® and xxxx® are registered trademarks of xxxxxxxxx.'.decode('utf8')

Allan Biggs

unread,
Apr 14, 2016, 2:34:03 PM4/14/16
to Chris Clark, Ingres lists
Thanks I'll give

xxxx ® and xxxx® are registered trademarks of xxxxxxxxx.'.decode('utf8')

a whirl

Allan
--
Sent from my Android phone with mail.com Mail. Please excuse my brevity.

_______________________________________________
Info-ingres mailing list
Info-...@lists.planetingres.org
http://lists.planetingres.org/mailman/listinfo/info-ingres
0 new messages