Regarding the passing of UTF-8 vs. UTF-16 and the differences in WITH_UNICODE mode, you should really email the cx_Oracle list about that, as WITH_UNICODE to my knowledge only has to do with the Python interpretation of arguments, not its interaction with OCI. If you email their list, make sure all code examples are derived purely from cx_Oracle, to eliminate any doubt that each behavior comes from cx_Oracle directly. WITH_UNICODE was not intended for general use in Python 2.x. If your whole application really works with it, then there's no reason not to use it; it's just that you'll always have to ensure that no non-unicode Python strings ever get sent to cx_Oracle, which can be fairly tedious. It sounds like a bug that cx_Oracle would be expanding into a two-byte-per-character stream like that.
It's possible that calling setinputsizes() with cx_Oracle.UNICODE may be the key to cx_Oracle's behavior here. We call setinputsizes() for every Oracle cursor, but we currently exclude the string types from that list as it had some unwanted side effects.
Also note that VARCHAR2(4000) measures its length in bytes, not characters. For true Unicode support Oracle provides the NVARCHAR2 type, where the "length" specifies the number of characters instead of bytes. Recent versions of Oracle also support the form VARCHAR2(4000 CHAR), which similarly measures the column in characters, based on the database's encoding, instead of bytes. It's worth investigating whether using NVARCHAR2 causes cx_Oracle, or OCI, to change its behavior.
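To make the bytes-vs-characters point concrete, here is a rough sketch in plain Python 3 (no cx_Oracle or database involved) that compares the character count of a string with its encoded size in UTF-8 and UTF-16. The helper name encoded_sizes, the constant VARCHAR2_BYTE_LIMIT and the sample strings are made up for illustration; how Oracle or OCI actually counts the bytes depends on the database and client character sets, so treat this only as an approximation of the effect.

```python
# Illustrative only: hypothetical helper and sample data, not cx_Oracle code.
VARCHAR2_BYTE_LIMIT = 4000  # VARCHAR2(4000) with byte length semantics


def encoded_sizes(text):
    """Return the character count and the encoded byte counts of text."""
    return {
        "chars": len(text),
        "utf-8": len(text.encode("utf-8")),
        # "utf-16-le" leaves out the 2-byte BOM that plain "utf-16" prepends
        "utf-16": len(text.encode("utf-16-le")),
    }


ascii_text = "a" * 3000          # 1 byte per character in UTF-8
accented_text = "\u00e7" * 3000  # c-cedilla, 2 bytes per character in UTF-8

for label, text in (("ascii", ascii_text), ("accented", accented_text)):
    sizes = encoded_sizes(text)
    # Does the encoded value fit in a 4000-byte column?
    fits = {enc: sizes[enc] <= VARCHAR2_BYTE_LIMIT for enc in ("utf-8", "utf-16")}
    print(label, sizes, fits)
```

In this sketch both strings are 3000 characters long, but the ASCII one only fits the 4000-byte limit when encoded as UTF-8, and the accented one fits in neither encoding. A column declared in characters (NVARCHAR2, or VARCHAR2(4000 CHAR)) is intended to avoid that kind of surprise.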
>
> Thank you in advance.
>
> Guilherme.
I will report the problem to the cx_Oracle list and see what they have to say.
Regards,
Guilherme.
It seems that cx_Oracle always sends data in UTF-16 if WITH_UNICODE is
unset and unicode() objects are passed.
Regards,
Guilherme.
On Wed, Nov 17, 2010 at 2:16 PM, Michael Bayer <mik...@zzzcomputing.com> wrote:
>