CSV and NULL values

3,954 views
Skip to first unread message

Schönfisch, Jörg

unread,
Jul 25, 2012, 9:31:35 AM7/25/12
to jooq...@googlegroups.com

Hi,

 

is it possible to support NULL values in the CSV import and export? For example, MySQL and Postgres escape NULL as \N. Another possibility I can think of is treating “””” differently from “”.

Or maybe allow some “Oracle-like” behavior which converts the empty string to NULL before inserting it?

 

Is it planned to support NULL for the XML import and export?

 

Regards,

Joerg

Lukas Eder

unread,
Jul 25, 2012, 10:03:28 AM7/25/12
to jooq...@googlegroups.com
Hi Jörg,

> is it possible to support NULL values in the CSV import and export? For
> example, MySQL and Postgres escape NULL as \N. Another possibility I can
> think of is treating “””” differently from “”.

I'm not sure if CSV can formally distinguish between NULL and ''. jOOQ
could define a behaviour for exports along the lines you specified,
i.e. ,"", corresponds to an empty string, whereas ,, corresponds to
NULL. For imports it might be a bit more tricky, as this would be
quite a "hard" requirement on CSV files.

> Or maybe allow some “Oracle-like” behavior which converts the empty string
> to NULL before inserting it?

That is weird for non-Oracle users. I'm still amazed by that, myself :-)
On the other hand, for numeric columns, empty strings already
correspond to NULL today, so maybe that's more consistent?

> Is it planned to support NULL for the XML import and export?

I guess for XML export it will be easier to formally handle NULL's as
jOOQ uses its own proprietary XML format.
Could you open two GitHub issues for handling NULL in CSV / XML
imports / exports?

Note, XML imports are currently not supported by jOOQ, but it is on the roadmap:
https://github.com/jOOQ/jOOQ/issues/801

Schönfisch, Jörg

unread,
Jul 26, 2012, 4:44:53 AM7/26/12
to jooq...@googlegroups.com
Hi Lukas,

> I'm not sure if CSV can formally distinguish between NULL and ''. jOOQ
> could define a behaviour for exports along the lines you specified,
> i.e. ,"", corresponds to an empty string, whereas ,, corresponds to
> NULL. For imports it might be a bit more tricky, as this would be
> quite a "hard" requirement on CSV files.

As far as I know, there is no way to distinguish between NULL and '' in plain text. There is something as a NUL character in Unicode, but it is rendered as a normal space most of the time, which might be confusing.

> That is weird for non-Oracle users. I'm still amazed by that, myself :-)
> On the other hand, for numeric columns, empty strings already
> correspond to NULL today, so maybe that's more consistent?

Figuring out why all my empty strings came out to be NULL and thus giving me errors for NOT NULL columns took me some time. Pretty weird behavior if you have only used other databases before. So this approach might also not be ideal for some users. However, it leads to consistent results for all databases. Right now I get NULL in Oracle and '' in every other database after an import.
I've discovered the nullif option in Oracle's SQLLoader which lets you specify a value that is replaced with NULL on insert. I think implementing this solves most issues a user could have with the CSV import/export and NULL handling, and would allow the import from various different sources with different conventions. For the default values, two possibilities come to mind: Leave the export as-is and change the import to treat '' as NULL, which should be most consistent but loses some information; or mimic the behavior of Postgres or MySQL and use \N as NULL.

> Could you open two GitHub issues for handling NULL in CSV / XML
> imports / exports?

Yes, I've added https://github.com/jOOQ/jOOQ/issues/1627 and https://github.com/jOOQ/jOOQ/issues/1628 which reference this discussion. I will add some more information to them later.


Cheers,
Jörg

Lukas Eder

unread,
Jul 27, 2012, 11:05:04 AM7/27/12
to jooq...@googlegroups.com, joerg.sc...@softplant.de, stefa...@gmail.com
Hello,

I have implemented a backwards-compatible solution that involves these rules:
1. The default behaviour of the Loader API is unchanged. If not
specified, empty strings from CSV records are loaded as empty strings
(except for Oracle, which doesn't know empty strings, only NULL)
2. The loader API allows for overriding the "nullString" as shown below.
3. The format API allows for defining a "nullString" as shown below

Details can be seen here:
https://github.com/jOOQ/jOOQ/issues/1627

Handling NULL in XML exports will be dealt with separately

Cheers
Lukas

2012/7/26 <stefa...@gmail.com>:
>
> Hi,
>
>>
>> Figuring out why all my empty strings came out to be NULL and thus giving
>> me errors for NOT NULL columns took me some time. Pretty weird behavior if
>> you have only used other databases before. So this approach might also not
>> be ideal for some users. However, it leads to consistent results for all
>> databases. Right now I get NULL in Oracle and '' in every other database
>> after an import.
>> I've discovered the nullif option in Oracle's SQLLoader which lets you
>> specify a value that is replaced with NULL on insert. I think implementing
>> this solves most issues a user could have with the CSV import/export and
>> NULL handling, and would allow the import from various different sources
>> with different conventions. For the default values, two possibilities come
>> to mind: Leave the export as-is and change the import to treat '' as NULL,
>> which should be most consistent but loses some information; or mimic the
>> behavior of Postgres or MySQL and use \N as NULL.
>>
>
>
> Just to clarify, postgresql COPY FROM CSV format uses an empty non-quoted
> string to represent null values. This is not the same as the TEXT format
> which uses "\N", using "\N" in CSV mode won't work out of the box.
> However, one can specify which value translates to NULL using the NullString
> optional parameter, so perhaps opting for the same approach and letting the
> user decide what gets writed when a NULL value is found would be the best
> solution..
Reply all
Reply to author
Forward
0 new messages