Collation change to UTF8

167 views
Skip to first unread message

Ertan Küçükoglu

unread,
Mar 15, 2021, 6:45:40 AM3/15/21
to firebird...@googlegroups.com
Hello,

We are using FirebirdSQL 2.5.9.latest. Each database is located at a customer site. We do not have direct access to them.

Upon a request, we need to convert all databases at customers from code page WIN1254 (Turkish) to UTF8. That has to be without any data loss and proper conversion of national characters. In the end, there will be multiple language data saved in a converted database like Turkish + Russian will co-exist in it.

I wonder if there is a kind of document in detail explaining such an operation.

This is a script alone converting varchar columns. I am not sure how I should be converting the database wide code page and at what point.

It would be great if it is possible to use an existing database and do UTF8 conversion on it without any need to create and copy over data to a new database file. As above script seems to be doing something like that.

I also appreciate any advice for such a conversion.

Thanks & Regards,
Ertan Küçükoğlu

Dimitry Sibiryakov

unread,
Mar 15, 2021, 6:54:05 AM3/15/21
to firebird...@googlegroups.com
15.03.2021 11:45, Ertan Küçükoglu wrote:
> It would be great if it is possible to use an existing database and do UTF8 conversion on
> it without any need to create and copy over data to a new database file.

Create a new database from scratch and properly copy data is the safest way of
conversion so if I were you I wouldn't look for others.

--
WBR, SD.

Mark Rotteveel

unread,
Mar 15, 2021, 6:59:22 AM3/15/21
to firebird...@googlegroups.com
On 15-03-2021 11:45, Ertan Küçükoglu wrote:
> It would be great if it is possible to use an existing database and do
> UTF8 conversion on it without any need to create and copy over data to a
> new database file. As above script seems to be doing something like that.
>
> I also appreciate any advice for such a conversion.

That linked script performs direct system table manipulation, which is
discouraged in 2.5 and earlier, and no longer possible in Firebird 3.

It is not really possible to change the character set of a column (and
the script shown can result in index issues and potentially conversion
issues).

If you want to change the character set, there are really only two options:

1. Create a new database with the same structure but updated character
set for the columns and pump the data over
2. Create a new column with the right character set, assign values from
old to new column, drop old column, rename new column (and all
associated fixes necessary for dependencies).

Usually option 2 is a lot more work than option 1, unless there are just
a few columns involved with few dependencies.

Mark
--
Mark Rotteveel

Ertan Küçükoglu

unread,
Mar 15, 2021, 7:48:12 AM3/15/21
to firebird...@googlegroups.com
Thanks for all the replies.

I understand that I cannot avoid creating a new database with default character set UTF8 and copying data over to the new database.

In this case, I would like to do a data pump as fast as possible as there are databases of sizes GBs with all these BLOB text and BLOB binary columns and computed columns and everything. The worst part is the need for downtime for such an operation. This unicode request is not that all customers request but some. Though, the decision is to do a complete switch to UTF8 for easier source code handling.

I am thinking of using "for execute statement ... on external data". It will be long SQL statements to prepare for executing. There may be a better way to handle this on SQL level, let me know, please.

It is also possible to use an application for such a purpose, but if above will handle all conversion and everything, I believe it will be faster to prepare SQL statements than developing an application.

What would be suggestions for copying complete database "A" to a new database "B" with proper code page conversions and doing it as fast as possible?

Thanks & Regards,
Ertan Küçükoğlu


Mark Rotteveel <ma...@lawinegevaar.nl>, 15 Mar 2021 Pzt, 13:59 tarihinde şunu yazdı:
--
You received this message because you are subscribed to the Google Groups "firebird-support" group.
To unsubscribe from this group and stop receiving emails from it, send an email to firebird-suppo...@googlegroups.com.
To view this discussion on the web, visit https://groups.google.com/d/msgid/firebird-support/9c0c2975-4acb-e385-090a-600df64cb29d%40lawinegevaar.nl.

Dimitry Sibiryakov

unread,
Mar 15, 2021, 7:52:56 AM3/15/21
to firebird...@googlegroups.com
15.03.2021 12:48, Ertan Küçükoglu wrote:
> I am thinking of using "for execute statement ... on external data". It will be long SQL
> statements to prepare for executing.

The statement is trivial "select field1,field2,field3 from table1", it should not be long.

--
WBR, SD.
Reply all
Reply to author
Forward
0 new messages