UTF8/Unicode implicit conversion for SSIS+

57 views
Skip to first unread message

Josef Jurák

unread,
Nov 6, 2023, 3:20:59 PM11/6/23
to COZYROC
Hello,
I’m using CozyRoc SSIS+ in my company for many years, it’s great toolset. But I want to ask you something.
I need help with converting between UTF-8 and Unicode (UTF-16) in SSIS. Maybe a transform script or a transform component for SSIS+.

I need to simple convert all text columns in SSIS DataFlow Path from UTF-8 (DT_STR 65001) to Unicode (DT_WSTR), and the same in the opposite direction (Unicode to UTF-8). Alternatively, from any CP (such as 1250 or 1252) to UTF-8 or to Unicode. However, it is important that for ALL text columns in transformation on Path in DataFlow, while keeping the column name and the same length in characters, with single setting for whole transformation (not on every column).

Sure, I can do this using the standard "Data Conversion" component, where I can do it individually for each column. But if you have thousands of tables, and each table has hundreds of text columns, that's an extremely busy job.

Why? Some Data Flow sources and destinations are coded differently. For example, we have flatfiles in UTF-8, MSSQL is often in Unicode, Oracle DB in UTF-8, but if I use an OLEDB provider, then it looks like Unicode, Microsoft Fabric/Synapse is in UTF-8, etc. And if I want to transfer a table from local MSSQL (with unicode nvarchars)) to Microsoft Synapse (UTF-8), or load csv (UTF-8) to Oracle OLEDB (Unicode), it looks like a simple job for SSIS, but due to the strict coding (and absence of implicit conversion in SSIS) it is very hard work.

Please, can you help?
Thanks a lot,
Josef

Ivan Peev

unread,
Nov 6, 2023, 9:45:13 PM11/6/23
to COZYROC
Hi Josef,

Have you tried using the reusable script posted here.

Josef Jurák

unread,
Nov 7, 2023, 4:27:08 AM11/7/23
to COZYROC
Hi Ivan,
Thank you for response.
Yes, I know about this script, but this is doing someting else, can be used for another purposes, but for this real-world scenario is not usable.
This script have two limitations, which making unusable for this.

1) It is limited to a single target data type for all columns. We have thousands of dataflows paths, but not one contains exclusively text columns. Real-world dataflow paths contain mixed column types, like text, numbers, dates, ... and encoding only needs to be changed for text columns, the others need to be kept.

2) It is limited to one length of column text. We have thousands of dataflow paths, but not one contains text columns of the same length. Dataflow paths contain texts in different lengths, which needs to be kept, only the encoding needs to be changed.

It is probably only about metadata operation (I'm not sure) on some new simple component, because curently I can solve by two ways, usnig standard "Data Conversion" component, or another way, by setting DataType and CodePage property on output column in advanced editor on (e.g.) source component. But in both cases, only manually column by column for every column. And if have  dataflows with hunders columns, it is almost imposible to do it. This is my issue.

Thank you so much,
Josef
Dne úterý 7. listopadu 2023 v 3:45:13 UTC+1 uživatel Ivan Peev napsal:

Ivan Peev

unread,
Nov 7, 2023, 8:49:36 AM11/7/23
to COZYROC
Hi Josef,

You can only select the input columns for which you want the data type to be changed. Also, you have access to the script's source code. You can enhance the script to not modify the length if you desire.

Josef Jurák

unread,
Nov 8, 2023, 9:36:37 AM11/8/23
to COZYROC
Hi Ivan,
I cannot select columns only to change, I can select input/output columns which is not usable.
1) In realworld, if you have  hunders columns (and you don't see datatype in this dialog) is not possible select column individually.
2) If select columns, all other columns are gone, but in real world, I still need keep other columns
3) all selected columns are set to same length, which is in real-world not usable

Honestly, I've been working on ETL for 20 years (and 15 years with SSIS), and I don't know what this script is good for, I've never needed to reset all columns to the same datatype (not to mention that I can do it exactly for 4 clicks in a standard "advanced editor" on each component).

Yes, I know I can write my own script, or better, code my own component. But I asked if you could help with that. Suffice it to say that: no.
But thank you,
Josef

Dne úterý 7. listopadu 2023 v 14:49:36 UTC+1 uživatel Ivan Peev napsal:

Ivan Peev

unread,
Nov 8, 2023, 10:48:17 AM11/8/23
to COZYROC
Josef,

Changing the column type in the "Advanced Editor" will not do the conversion. Most probably after you change the type, the component will fail. The reusable script I was referring to is doing actual input column conversions.

Before continuing further, please state how good your programming skills are from 1-10 . Thank you!

Reply all
Reply to author
Forward
0 new messages