informix unloads converting the data

Floyd Wellershaus

Dec 19, 2008, 5:47:47 PM

to inform...@iiug.org

Hi,
This is on ids10.0 and aix5.3

We're trying to convert some data over to another database. When unloading certain fields that have encrypted data inserted into them ( not informix encryption ), we noticed that the data is getting converted a bit.
For instance, an \F in the character string in the database, comes out as a \\F in the unload file.
The delimiter doesn't matter. It is still putting some extra escape quotes in certain places.

We can't use the HP load internal format, because the other database wouldn't know what to do with it.

Are there any ideas as to how to prevent this, or why it happens ?

Thank you,
Floyd

Jonathan Leffler

Dec 20, 2008, 2:33:42 AM

to fl...@fwellers.com, inform...@iiug.org

On Fri, Dec 19, 2008 at 2:47 PM, Floyd Wellershaus <fl...@fwellers.com> wrote:
> Hi,
> This is on ids10.0 and aix5.3
>
> We're trying to convert some data over to another database. When unloading
> certain fields that have encrypted data inserted into them ( not informix
> encryption ), we noticed that the data is getting converted a bit.

How are you doing the unload? With the UNLOAD statement? If so, then
it always emits two backslashes for each actual backslash in the data;
it also emits backslash-pipe for any pipe in the data (assuming
DBDELIMITER is unset or is set to pipe; change the delimiter, and any
occurrence of the delimiter in the data is emitted as
backslash-delimiter).
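
A minimal Python sketch of the two rules just described, for anyone who wants to predict what a field will look like in the unload file. The function name unload_escape is made up for illustration, and the sketch covers only backslashes and the delimiter; anything else UNLOAD may do (embedded newlines, for example) is not modelled.

```python
def unload_escape(field: str, delimiter: str = "|") -> str:
    """Escape one field using the two rules described above (illustrative only)."""
    # Double the backslashes first, so that the backslashes added in
    # front of delimiters in the next step are not doubled again.
    escaped = field.replace("\\", "\\\\")
    # Then put a backslash in front of every delimiter in the data.
    return escaped.replace(delimiter, "\\" + delimiter)


print(unload_escape(r"abc\Fdef"))  # abc\\Fdef
print(unload_escape("a|b"))        # a\|b
```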

> For instance, an \F in the character string in the database, comes out as a
> \\F in the unload file.

That would be correct. What did you have in mind?

> The delimiter doesn't matter. It is still putting some extra escape quotes
> in certain places.
>
> We can't use the HP load internal format, because the other database
> wouldn't know what to do with it.
>
> Are there any ideas as to how to prevent this, or why it happens ?

It happens because the input (LOAD) process expects backslash to
escape the following character. If your data file contained just \F,
then a single character, F, would be loaded.
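
The reverse direction can be sketched the same way. This is an approximation of the rule stated above (a backslash escapes whatever follows it), not the actual LOAD implementation; load_unescape is a hypothetical name, and dropping a lone trailing backslash is an assumption of the sketch.

```python
def load_unescape(field: str) -> str:
    """Undo backslash escaping: a backslash means 'take the next character as is'."""
    out = []
    chars = iter(field)
    for ch in chars:
        if ch == "\\":
            # Skip the backslash and keep the character it escapes.
            # A lone trailing backslash is dropped (assumption).
            out.append(next(chars, ""))
        else:
            out.append(ch)
    return "".join(out)


print(load_unescape(r"\F"))   # F
print(load_unescape(r"\\F"))  # \F
```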

If you don't like the Informix UNLOAD format, you are faced with a
variety of options. You can write your own unloader that formats the
data the way you think is correct. You can try using SQLCMD; it has a
variety of output formats and one of those may suit you. However, by
default, it uses backslash as an escape for backslash and the other
delimiters. You can, however, change the delimiter and the escape
(whereas standard Informix tools do not allow you to change the
escape). It also has CSV and XML output formats.
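
If rewriting the unloader is too much work, another route is to post-process an existing unload file into whatever the target database wants. The sketch below turns unload lines into CSV with Python's standard csv module. It assumes one record per line, pipe as the delimiter, and a delimiter after every field including the last; split_unload_line and unload_to_csv are illustrative names, not part of any Informix tool.

```python
import csv
import io


def split_unload_line(line: str, delimiter: str = "|") -> list[str]:
    """Split one unload line into unescaped fields."""
    fields, current = [], []
    chars = iter(line.rstrip("\n"))
    for ch in chars:
        if ch == "\\":
            # Escaped character: take the next one literally.
            current.append(next(chars, ""))
        elif ch == delimiter:
            # Unescaped delimiter: the current field is complete.
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    if current:
        # Text after the last delimiter (no trailing delimiter on this line).
        fields.append("".join(current))
    return fields


def unload_to_csv(lines) -> str:
    """Convert an iterable of unload lines to CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for line in lines:
        writer.writerow(split_unload_line(line))
    return buffer.getvalue()
```

Whether CSV quoting is acceptable to the other database is a separate question; the point is only that the backslash escaping is removed before the data is re-encoded.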

--
Jonathan Leffler #include <disclaimer.h>
Email: jlef...@earthlink.net, jlef...@us.ibm.com
Guardian of DBD::Informix v2008.0513 -- http://dbi.perl.org/
"Blessed are we who can laugh at ourselves, for we shall never cease
to be amused."
NB: Please do not use this email for correspondence.
I don't necessarily read it every week, even.

Ian Goddard

Dec 20, 2008, 6:34:49 AM

Several posts recently have been concerned with the format of Informix
unload files not being quite what was wanted. As a one-off, the
required changes can be made with vi. If the file is too big for vi, or
a scripted solution is required, sed can be used instead.

The general format of the vi command is:

:%s/this/that/g

: - puts vi into a mode to accept ex commands

% - part of the ex command. Tells the editor to apply the command to
all lines in the file, not just the current line

s - substitute the string between the first and second slash with the
string between the second and third

g - global - apply to all occurrences of the first string; without it
only the first occurrence on each line would be substituted

Lines without the target string will be unaffected.

The usual regular expression tricks apply so that, for instance

:%s/^[Aa]/1/

would replace either A or a at the start of a line with 1. It also means
that special characters need to be escaped with a backslash, so that

:%s/\[/{/g

replaces every opening square bracket with an opening curly.

This means that the backslash itself needs to be escaped, so in Floyd's
case the command

:%s/\\\\/\\/g

will substitute a single backslash for each pair.

The other issue which has been raised recently is removal of trailing
spaces before field delimiters without removing non-trailing spaces.
The trick here is to substitute the combination of space and delimiter
by the delimiter e.g.

:%s/ |/|/g

My general approach is to issue a series of commands to remove
decreasing numbers of spaces. For instance if I expect no more than 15
trailing spaces the first command would be:

:%s/        |/|/g

For those reading in proportional spaced fonts that was 8 spaces. The
next command would be for 4 spaces, then 2, then 1. A similar approach
can be used to remove leading spaces.
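
Where a scripting language is available, the series of commands can be collapsed into one pattern that matches any run of spaces in front of the delimiter. A small Python sketch, with strip_trailing_spaces as a made-up name and pipe assumed as the default delimiter:

```python
import re


def strip_trailing_spaces(line: str, delimiter: str = "|") -> str:
    """Remove every run of spaces that sits directly before a delimiter."""
    pattern = " +" + re.escape(delimiter)
    # A function is used for the replacement so the delimiter is inserted
    # literally, whatever character it happens to be.
    return re.sub(pattern, lambda match: delimiter, line)


print(strip_trailing_spaces("abc   |de f      |"))  # abc|de f|
```

Spaces inside a field, such as the one in "de f", are left alone because they are not followed directly by the delimiter.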

Once the series of commands needed to hack a sample file has been
worked out, they can be assembled into a file and handed to sed as a
script using the -f option of sed. As sed doesn't need to be put into
command mode, and as it applies each command in turn to each line,
neither the : nor the % is needed. So a file to apply the fix that Floyd
needs, remove up to 3 trailing spaces, remove a single space at the
start of a line and add another delimiter at the end of each line would be:

s/\\\\/\\/g
s/  |/|/g
s/ |/|/g
s/^ //g
s/$/|/g

One advantage of sed is that it can be used as a filter. I've used it
when converting fixed width files to Informix load files - a C program
adds the delimiters as required and sed takes its standard out and
removes the trailing spaces. Another is that it can handle arbitrarily
large files whilst vi may well fail on large files.
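
For sites without sed, or where the conversion has to sit inside a larger script, the same five substitutions can be run as a line-by-line filter in Python. This is a direct port of the sed script above, offered as a sketch; fix_line and run_filter are hypothetical names.

```python
import re

# The five sed commands above, in the same order.
SUBSTITUTIONS = [
    (r"\\\\", r"\\"),  # s/\\\\/\\/g  two backslashes become one
    (r"  \|", "|"),    # two spaces before the delimiter
    (r" \|", "|"),     # one space before the delimiter
    (r"^ ", ""),       # a single space at the start of the line
    (r"$", "|"),       # extra delimiter at the end of the line
]


def fix_line(line: str) -> str:
    """Apply each substitution in turn to one line (without its newline)."""
    for pattern, replacement in SUBSTITUTIONS:
        line = re.sub(pattern, replacement, line)
    return line


def run_filter(source, target) -> None:
    """Read lines from source and write the fixed lines to target."""
    for raw in source:
        # Strip the newline first so that $ matches only once per line.
        target.write(fix_line(raw.rstrip("\n")) + "\n")
```

Calling run_filter(sys.stdin, sys.stdout) after an import of sys makes it usable in a pipeline in the same way as sed, and because it handles one line at a time it does not need to hold the whole file in memory.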

--
Ian

Hotmail is for spammers. Real mail address is igoddard
at nildram co uk

Ian Michael Gumby

Dec 20, 2008, 8:46:40 PM

to Jonathan Leffler, fl...@fwellers.com, inform...@iiug.org

Ouch!

Don't the LOAD/UNLOAD commands understand or allow for a binary data field?

I guess you could write an ESQL/C program or even a Python/Perl script to do the unload and subsequent load for you.
Or you might be able to modify one of Art's utilities over on the IIUG site. (I think there's something there about loading / unloading the database.)

-G


scottishpoet

Dec 21, 2008, 6:39:57 PM

Is the other database Informix? If so, then the load should convert the
\\F back to \F.

Otherwise you may need to write your own output program, or run a
conversion script on the unloaded file.