On 24.05.2022 at 12:41, Luciano Rodrigues Nunes Mendes wrote:
> One last question:
> If I always use UTF8 as the client charset wouldn't I unnecessarily
> increase my system's network latency since UTF8 needs 4 times more bytes to
> represent the same character in WIN1252?
It all depends on your client setup, that is: which SDK / database
library you use to talk to Firebird and which string encoding your
client application uses internally.
If your client app is, for example, an "old" win32 app without Unicode
support that only uses single-byte WIN1252 (ANSI) encoding internally,
you can safely specify win1252 as the connection charset. This way the
data you get back from the database will be automatically converted to
win1252, if possible (i.e. if it doesn't contain characters not
representable in win1252).
If, on the other hand, your application uses Unicode internally,
utf-8 would probably be a better option. However, keep in mind that if
your database column encoding is win1252, trying to store in that column
any Unicode characters that can't be translated to win1252 will result
in a database error (Firebird can't store a character outside of win1252
in a win1252 column). In other words, setting utf-8 as the connection
charset doesn't circumvent in any way the single-byte win1252 encoding
of a table column.
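To see on the client side which characters would trigger such an error,
here is a minimal Python sketch. It assumes that Python's "cp1252" codec
is a close enough stand-in for Firebird's WIN1252 transliteration (the
two may differ in edge cases), and the function name is just an
illustration, not part of any Firebird driver.

```python
def unrepresentable_in_win1252(text):
    """Return the characters of `text` that have no win1252 equivalent.

    Assumption: Python's cp1252 codec approximates Firebird's WIN1252
    charset; this is a client-side pre-check, not what the server runs.
    """
    bad = []
    for ch in text:
        try:
            ch.encode("cp1252")
        except UnicodeEncodeError:
            bad.append(ch)  # this character cannot go into a win1252 column
    return bad


# Western European text fits; Polish letters such as "Ł" do not.
print(unrepresentable_in_win1252("café €"))
print(unrepresentable_in_win1252("Łódź"))
```

An empty list means the string should be storable in a win1252 column
regardless of whether the connection charset is win1252 or utf-8.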
In general, you should choose the connection encoding best matching the
internal string representation in your client application.
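As a rough illustration of that rule, the sketch below maps the encoding
a client uses internally to a Firebird charset name. The helper and its
small lookup table are hypothetical and only cover the charsets
mentioned in this thread plus ISO8859_1; extend the table for whatever
your application actually uses.

```python
import codecs

# Hypothetical mapping: normalized Python codec name -> Firebird charset.
_FIREBIRD_CHARSETS = {
    "cp1252": "WIN1252",
    "utf-8": "UTF8",
    "iso8859-1": "ISO8859_1",
}


def firebird_charset_for(client_encoding):
    """Pick the connection charset matching the client's internal encoding."""
    # codecs.lookup() resolves aliases such as "windows-1252" or "utf8".
    normalized = codecs.lookup(client_encoding).name
    try:
        return _FIREBIRD_CHARSETS[normalized]
    except KeyError:
        raise ValueError(f"no Firebird charset known for {client_encoding!r}")


print(firebird_charset_for("windows-1252"))
print(firebird_charset_for("utf8"))
```

The returned name would then be passed as the charset option of
whichever driver you use when opening the connection.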
As a last remark, I wouldn't worry too much about the wire representation
of utf-8. Most characters from win1252 require 1 or 2 bytes in utf-8
(a few, such as the euro sign, take 3, and none take 4). Unless you
store huge amounts of text in a single column, utf-8 shouldn't impact
the network performance significantly (at least that's my general
experience - I've never actually benchmarked it).
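The byte counts are easy to check yourself. The following Python sketch
encodes every character of the code page in utf-8 and tallies the
sizes. It assumes Python's "cp1252" codec as a stand-in for WIN1252 and
only measures the encoded text, not anything the Firebird wire protocol
adds on top.

```python
from collections import Counter


def utf8_size_histogram(codec="cp1252"):
    """Count how many characters of a single-byte codec need N bytes in UTF-8."""
    sizes = Counter()
    for byte_value in range(256):
        try:
            char = bytes([byte_value]).decode(codec)
        except UnicodeDecodeError:
            continue  # this byte is not assigned in the code page
        sizes[len(char.encode("utf-8"))] += 1
    return sizes


histogram = utf8_size_histogram()
for size in sorted(histogram):
    print(f"{size} byte(s) in UTF-8: {histogram[size]} characters")
```

All plain ASCII characters stay at 1 byte, so for typical Western
European text the growth is limited to the accented letters and a few
punctuation marks.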
regards
Tomasz