Is collation PXW_SLOV for Slovakia or Slovenia?


Jan Javůrek

Dec 9, 2020, 8:24:32 AM
to firebird-support
Hello,
I found only one old document, and it describes PXW_SLOV as Slovakian (country: Slovakia):
51  --  --  PXW_SLOV  Slovakian

Is collation PXW_SLOV for Slovakia or Slovenia?
Which collation is correct for Slovakia?

Thanks, Jan Javurek
Czech Republic

Roland Turcan

Dec 9, 2020, 8:26:49 AM
to firebird-support
Hi Honza,

When I tried it in the past, it matched the Slovenian collation order. But that may have changed over time.

RT;

Mark Rotteveel

Dec 9, 2020, 8:54:48 AM
to firebird...@googlegroups.com
The table at
https://firebirdsql.org/file/documentation/html/en/refdocs/fblangref25/firebird-25-language-reference.html#fblangref25-appx06-charsets
is definitely wrong, because it maps Polish to PXW_HUNDC, and Slovak to
PXW_PLK (which I think should be Polish); it looks like some things
shifted. I'll fix that.

Given these collations were originally derived from Paradox, a Google
search for Paradox collations produces (PXW stands for Paradox Windows):

The Firebird sources map PXW_SLOV to pw1250slov.h, which contains a
comment "Language driver ansislov"
ANSISLOV -> Paradox Windows Slovene language driver

In other words, PXW_SLOV is for Slovenia, not Slovakia.

I'm wondering if the collation order for Slovakia might be the same as
some or all of the Czech collation orders (e.g. PXW_CSY, PDOX_CSY, and
DB_CSY). I think CSY refers to Czechoslovakia, and Paradox is so old
that this definition probably predates the split. If not, I guess there
isn't such a collation in Firebird.

Mark
--
Mark Rotteveel

Roland Turcan

Dec 9, 2020, 9:04:31 AM
to firebird-support
Mark, yes, that table is wrong and it needs some corrections.

When I started my biggest project, I looked for the best collation for the Slovak language, and the best I found was WIN1250/PXW_CSY. It is not optimal, but it was the best I could find.

RT

Dimitry Sibiryakov

Dec 9, 2020, 9:24:32 AM
to firebird...@googlegroups.com
09.12.2020 15:04, Roland Turcan wrote:
> When I started the biggest project, I was looking for best collation and the best I found
> was for Slovak language WIN1250/PXW_CSY, which is not optimal but the best I found.

For a new project, there must be a very good reason to use anything but UTF-8 and a Unicode
collation.

--
WBR, SD.

Roland Turcan

Dec 9, 2020, 9:28:19 AM
to firebird-support
Hi Dmitri,

I completely agree with you, but this is a running project that is about 20 years old. I have planned to change over to UTF-8, but customers have databases in the hundreds of gigabytes.

Mark Rotteveel

Dec 9, 2020, 9:31:06 AM
to firebird...@googlegroups.com
Using the Unicode collation by itself is not a good plan if your users
expect a localized sort order. At the very least you need a custom
collation that applies country-specific sort orders. That is hard to do
on Firebird 3 and earlier because, at least on Windows, they lack a
complete ICU library, so collation information is not always available
and you can't create the necessary collations.

Mark
--
Mark Rotteveel
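
To illustrate why a generic ordering may not match what local users expect, here is a minimal Python sketch. It is not Firebird or ICU code. It assumes one well-known rule shared by Czech and Slovak, namely that the digraph "ch" is treated as a single letter sorted after "h", and it ignores accented letters entirely. The names LETTER_ORDER and tailored_key are hypothetical and exist only for this illustration.

```python
# Simplified letter order: plain ASCII letters plus the digraph "ch" after "h".
# Assumption for illustration only; a real Slovak collation has more rules.
LETTER_ORDER = list("abcdefgh") + ["ch"] + list("ijklmnopqrstuvwxyz")
RANK = {letter: index for index, letter in enumerate(LETTER_ORDER)}


def tailored_key(word):
    """Return a sort key that treats "ch" as one letter placed after "h"."""
    text = word.lower()
    ranks = []
    i = 0
    while i < len(text):
        if text.startswith("ch", i):
            ranks.append(RANK["ch"])
            i += 2
        else:
            # Characters outside the simplified alphabet sort last.
            ranks.append(RANK.get(text[i], len(LETTER_ORDER)))
            i += 1
    return (ranks, word)


words = ["chata", "cesta", "hora", "dom"]

# Plain string comparison: "chata" lands among the "c" words.
print(sorted(words))                    # ['cesta', 'chata', 'dom', 'hora']

# Tailored comparison: "chata" moves after "hora".
print(sorted(words, key=tailored_key))  # ['cesta', 'dom', 'hora', 'chata']
```

A locale-specific collation encodes rules of this kind in the database, which is why the collation data for the locale has to be available to the server.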

Mark Rotteveel

Dec 9, 2020, 9:33:58 AM
to firebird...@googlegroups.com
On 09-12-2020 12:06, Jan Javůrek wrote:
> Hello,
> i found only one old document where describe PXW_SLOV is for
> Slovakian(country Slovakia).
> https://firebirdsql.org/en/firebird-1-5-character-sets-collations/
> 51 -- -- PXW_SLOV Slovakian

I have fixed the collations on that page (and in the Firebird 2.5
language reference) and identified them as Slovene instead.

Mark
--
Mark Rotteveel

Roland Turcan

Dec 9, 2020, 9:37:40 AM
to firebird-support
Thanks Mark...

The main reason why I stopped the migration to UTF-8 is the weak ICU library...

But what is the reason to get the full ICU library into Firebird?

Thanks for response.
 

Mark Rotteveel

Dec 9, 2020, 10:07:06 AM
to firebird...@googlegroups.com
On 09-12-2020 15:37, Roland Turcan wrote:
> Thanks Mark...
>
> The main reason why I stopped migration to UTF-8 is weak ICU library...
>
> But what it the reason to get full ICU library to Firebird?

Firebird 4 will include the full library, at least, as far as I'm aware.
The reason it wasn't included in earlier versions was, I believe,
primarily to save space.

Getting the full library for Firebird 3 and earlier is pretty hard, as
the ICU project itself doesn't provide binaries (at least, not for the
ICU version used by Firebird 3, and I'm not actually sure if a newer
version would work), so you would need to build it yourself.

Mark
--
Mark Rotteveel

Jan Javůrek

Dec 11, 2020, 8:47:33 AM
to firebird-support

Thank you for the quick response.
PXW_CSY is for Czech and Slovak (Slovakia).
PXW_SLOV is for Slovenia.
On Wednesday, December 9, 2020 at 15:33:58 UTC+1, ma...@lawinegevaar.nl wrote:

Mark Rotteveel

Dec 11, 2020, 9:47:01 AM
to firebird...@googlegroups.com
On 11-12-2020 14:47, Jan Javůrek wrote:
>
> thank you for the quick response
> PXW_CSY is for Czech and Slovak (Slovakia)
> PXW_SLOV for Slovenia

Looking at another source, it is possible that PXW_CSY misses some
ordering that is common for Slovak (though I'm definitely not an
expert on Slovak ordering, nor really on collations).

I'm basing this on a comparison of
https://collation-charts.org/mssql/mssql.041B.1250.Slovak_CS_AS.html
with https://collation-charts.org/firebird20/fb203.WIN1250.PXW_CSY.html,
where the ordering of, for example, ä and Ä could be wrong.

Mark
--
Mark Rotteveel
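
A comparison like the one described above can be automated once both charts are written down as ordered character lists. The following Python sketch is one way to do it. The function name order_disagreements is hypothetical, and the two sample sequences are made-up placeholders, not data copied from either collation chart.

```python
from itertools import combinations


def order_disagreements(chart_a, chart_b):
    """List character pairs that appear in opposite order in the two charts.

    Both arguments are sequences of characters in collation order.
    Only characters present in both charts are compared.
    """
    pos_b = {ch: index for index, ch in enumerate(chart_b)}
    # Characters common to both charts, kept in chart_a order.
    common = [ch for ch in chart_a if ch in pos_b]
    # Each pair (x, y) has x before y in chart_a; report it if chart_b reverses that.
    return [(x, y) for x, y in combinations(common, 2) if pos_b[x] > pos_b[y]]


# Placeholder sequences for illustration only (NOT the real chart contents).
sample_a = ["a", "A", "ä", "Ä", "b"]
sample_b = ["a", "A", "Ä", "ä", "b"]

print(order_disagreements(sample_a, sample_b))  # [('ä', 'Ä')]
```

An empty result would mean the two orderings agree on every character they share. Any reported pair is a candidate for closer inspection against the expected Slovak order.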