SQL null behavior with DATE columns

Hal

unread,

Jul 13, 2010, 10:34:14 PM7/13/10

to

Hi...I cannot figure this out. The basic problem is this, I want to
select records out of a table that has four columns that when strung
together can form a date.

select * from DB2Test.db2library A

So far so good, right? The four columns are called CC YY MM DD. And I
need to do some logic against these columns, I.E. I need to convert
those four CHAR fields CC YY MM DD into a date column, sorta like this

select a.*, date(a.cc||a.yy||'-'||a.mm||'-'||a.dd) from
db2test.db2library A

That works fine. Not all rows have the CC YY MM DD columns populated,
so on those rows I see (null) as the result for the date function.
Fine and dandy.

But here is where the problem comes in. That (null) result from the
date function isn't a true null, at least it does not behave like one.
Because no matter what I try, I cannot filter the rows out of the
result set where the date returns (null). The end result, what I
really need is a table where I can pick the rows where an employee has
a termination date sometime in the -future-, so that I don't delete
their account -today- when the terminated employee flag is set. So I
thought no problem, I'll just add a where clause. But it does not
work. I even tried wrapping the select around another select to create
an MQT, and put a coalesce function in, sorta like this:
select * from
(
select a.*, coalesce(date(a.cc||a.yy||'-'||a.mm||'-'||a.dd),
date(2099-01-01) as TermDate from db2test.db2library A
)
as Finaltable where finaltable.TermDate is not null

and I still get the null in the result set. The real problem comes in
when I try to put the brains in....the where clause on the end that
says

select * from db2test.db2libraryA where date(a.cc||a.yy||'-'||
a.mm||'-'||a.dd) >= date(current_date)

That select statement will work...it basically says Gimme The Rows
where the termination date is GREATER than or EQUAL to today's date.
And it works right up until it hits a row with an invalid date or no
date, at which point the interactive SQL stops working and gives me an
error. That makes the query useless in my Java program where I
basically step through the result set and decide on a row-by-row basis
if I need to remove an account and/or files from the as/400.

What am I doing wrong? Anyone run into this before?

Chris

iseriesflorida

unread,

Jul 14, 2010, 3:19:18 PM7/14/10

to

Chris, maybe you need to define all the fields that you want to put
together as not null up front prior to putting them together.

dieter...@t-online.de

unread,

Jul 15, 2010, 2:44:01 AM7/15/10

to

Hi

null can't be concated with anything (its something like typeless, the
column could have any type), the coalesce wouldn't help at this time, it
comes too late. the coalesce has to be around the null values of the source
table: coalesce(a.cc, 20) || coalesce(a.yy, 99)...
then you would need some case constructs too to fix this, ore a UDF to put
the logic in cancatting the fields.

Dieter Bender

CRPence

unread,

Jul 17, 2010, 12:53:09 PM7/17/10

to

On 13-Jul-2010 19:34, Hal wrote:
> <<SNIP>> The four columns are called CC YY MM DD. And I need to

> do some logic against these columns, I.E. I need to convert those
> four CHAR fields CC YY MM DD into a date column, sorta like this
>
> select a.*, date(a.cc||a.yy||'-'||a.mm||'-'||a.dd) from
> db2test.db2library A
>
> That works fine. Not all rows have the CC YY MM DD columns
> populated, so on those rows I see (null) as the result for the
> date function. Fine and dandy.
>
> But here is where the problem comes in. That (null) result from
> the date function isn't a true null, at least it does not behave
> like one.

The result of the expression as the NULL value is no different
than any other SQL NULL value.

> Because no matter what I try, I cannot filter the rows
> out of the result set where the date returns (null).

WHERE date_expresssion IS NOT NULL

> The end result, what I really need is a table where I can pick
> the rows where an employee has a termination date sometime in
> the -future-, so that I don't delete their account -today- when
> the terminated employee flag is set. So I thought no problem,
> I'll just add a where clause. But it does not work.

What is "it"?

> I even tried wrapping the select around another select to create
> an MQT, and put a coalesce function in, sorta like this:

Unsure of what MQT mention intends to suggests in the above.
There is a NTE in the following SELECT.

>
> select * from
> (
> select a.*, coalesce(date(a.cc||a.yy||'-'||a.mm||'-'||a.dd),

> date(2099-01-01)) as TermDate from db2test.db2library A

> )
> as Finaltable where finaltable.TermDate is not null
>
> and I still get the null in the result set.

The NULL value will appear for any of the individual columns, but
not the expression, since the result of the expression is coalesced
to what presumably should be a future date. The IS NOT NULL is
[incorrectly] testing the date_expression which is assured by the
COALESCE() in the NTE, to never be the NULL value.

Note: the expression DATE(2009-01-01) evaluates to DATE(2097)
where 2097 is the result of the numeric expression involving two
subtractions, and that DATE(2097) which is the request to provide
the date from an integer value evaluates to the date value
"0006-09-28"; i.e. probably not what is intended nor desired.
Presumably DATE('2099-01-01') is more desirable; though, I would
never suggest that, but possibly DATE('9999-21-31') as a more
appropriate maximum date value.?

> The real problem comes in when I try to put the brains in....the
> where clause on the end that says
>
> select * from db2test.db2libraryA
> where date(a.cc||a.yy||'-'||a.mm||'-'||a.dd)>=date(current_date)

Note: The special register Current_Date is already a date; i.e.
there is no value in using DATE() casting on the register.

> That select statement will work...it basically says Gimme The
> Rows where the termination date is GREATER than or EQUAL to
> today's date.

Either the COALESCE should be used, or some other test added to
handle the NULL value for the date expression.

> And it works right up until it hits a row with an invalid date
> or no date, at which point the interactive SQL stops working
> and gives me an error.

Selection on an invalid date value does not allow the query
engine to determine if the row should be included or omitted, since
its value can not be determined. Thus the query should fail with a
selection error. If invalid values are allowed in the columns that
define the expression to define the date, then some CASE logic or a
UDF should provide a valid\corrected date, or the NULL value. With
that, the expression evaluates to a valid date or NULL, which can be
compared against.

> That makes the query useless in my Java program where I basically
> step through the result set and decide on a row-by-row basis if I
> need to remove an account and/or files from the as/400.

If the program is making the selection anyhow, does the query
really need to do any\the selection? Seems making the query do all
the selection would be more logical than partially in one language
and partially in the other.?

> What am I doing wrong? Anyone run into this before?

At this point I am not sure if the desired outcome is to exclude
the rows with the NULL values or to include them [seems the opening
suggest exclude, but the coalesce with greater than seemed to want
to include them], but I think the following SQL request [perhaps
removing the "NOT"] should give the desired result set.?:

select * from db2test.db2libraryA
where
date(a.cc||a.yy||'-'||a.mm||'-'||a.dd) >= date(current_date)

or date(a.cc||a.yy||'-'||a.mm||'-'||a.dd) IS NOT NULL
/* the DATE() is not required to test for NULL, */
/* nor are the '-'; each of the fields could be tested */
/* for NULL separately, and could improve performance */

Regards, Chuck

Hal

unread,

Jul 21, 2010, 12:13:37 AM7/21/10

to

Hi all...thank you for responding. I opened a PMR with software
support today. I created a much simplified example, put the 3 row
table into a save file, and sent it off to the Holy Land in Rochester.

I created a test table with three rows. Two have valid dates, one has
an invalid date. Here is the query I wrote against this table

SELECT
rownum,
datestring, --Character string in the table
date(datestring) as Test1, --MAY return
null
coalesce(date(datestring),current_date) as Test2, --SHOULD NEVER
return null
current_date as Test3,
date(current_date) as Test4,
coalesce(current_date,date(datestring)) as Test5
FROM
store.datetest

The data set in the table looks like this:

....+....1....+....2....+....3.
ROWNUM DATESTRING
1 2010-06-01
2 2010-06-02
3 0000-01-01
******** End of data ********

And if I run the query above, I get this:

....+....1....+....2....+....3....+....4....+....5....+....6....+....
7....+....8.
ROWNUM DATESTRING TEST1 TEST2 TEST3
TEST4 TEST5
1 2010-06-01 06/01/10 06/01/10 07/20/10
07/20/10 07/20/10
2 2010-06-02 06/02/10 06/02/10 07/20/10
07/20/10 07/20/10
3 0000-01-01 ++++++++ ++++++++ 07/20/10
07/20/10 07/20/10
******** End of data
********

The third row, column 'test2' should be showing today's date. The
interesting thing is, when Rochester ran it...it worked. The support
lady even sent me a screen capture of the green screen STRSQL output.

But when I run it, it doesn't work. Green screen or in DbVisualizer
via JDBC. I'm up to date on all PTF's, so it must be something simple
that is eluding me for the moment...

Chris

dieter...@t-online.de

unread,

Jul 21, 2010, 2:09:12 AM7/21/10

to

looks like date not in valid range, caused by date format

D*B

Hal wrote:

> 0000-01-01

jonathan bailey

unread,

Jul 21, 2010, 5:09:53 AM7/21/10

to

"Hal" <hal...@gmail.com> wrote in message
news:09bd6fd1-6cba-406d...@u36g2000prg.googlegroups.com...

Works fine for me too.
None of your columns will return null, 0001-01-01 is a perfectly good date
in as400.
You cant display dates outside the 1940-2039 range in only 2 digits for the
year.
In SQL press F13 Then 1 then change *MDY to *ISO or *EUR
Then you will get data like this:

ROWNUM DATESTRING TEST1 TEST2 TEST3 TEST4 TEST5

1 2010-06-01 2010-06-01 2010-06-01 2010-07-21 2010-07-21
2010-07-21
2 2010-06-02 2010-06-02 2010-06-02 2010-07-21 2010-07-21
2010-07-21
3 0000-01-01 0000-01-01 0000-01-01 2010-07-21 2010-07-21
2010-07-21

Adding 4, null reveals this record:
4 - - 2010-07-21 2010-07-21 2010-07-21
2010-07-21

HTH
--
Jonathan

Peter H. Coffin

unread,

Jul 21, 2010, 9:55:03 AM7/21/10

to

On Tue, 20 Jul 2010 21:13:37 -0700 (PDT), Hal wrote:
> SELECT
> rownum,
> datestring, --Character string in the table
> date(datestring) as Test1, --MAY return
> null
> coalesce(date(datestring),current_date) as Test2, --SHOULD NEVER
> return null
> current_date as Test3,
> date(current_date) as Test4,
> coalesce(current_date,date(datestring)) as Test5
> FROM
> store.datetest
>
> The data set in the table looks like this:
>
> ....+....1....+....2....+....3.
> ROWNUM DATESTRING
> 1 2010-06-01
> 2 2010-06-02
> 3 0000-01-01
> ******** End of data ********

When was Year 0 again?

--
29. I will dress in bright and cheery colors, and so throw my enemies
into confusion.
--Peter Anspach's list of things to do as an Evil Overlord

jonathan bailey

unread,

Jul 21, 2010, 12:32:02 PM7/21/10

to

"Peter H. Coffin" <hel...@ninehells.com> wrote in message
news:slrni4dur9...@abyss.ninehells.com...

I wasnt paying so much attention earlier but year 0 seems to work OK. I just
typed 0001-01-01 automatically.

--
Jonathan

CRPence

unread,

Jul 21, 2010, 12:44:41 PM7/21/10

to

The correct result for the casting of the string '0000-01-01' to
a [Gregorian] date value should be a data mapping error; i.e. the
/plus signs/ displayed under Test1 & Test2 in the report from the
quoted text are correct. Your results are expected and correct.

I am very surprised that on v5r3, the result is the presentation
of that string as an apparent [valid] date value; i.e. instead of
'++++++++++' appearing on my report [using *ISO date presentation
format], I saw '0000-01-01'. As Peter alludes in his question,
there is [albeit IMO year zero is as legitimate as year one] no year
zero in the Gregorian calendar, so the value is not a valid date and
thus should not be presented as valid. The database should be
issuing a "data mapping error" so the report writer presents the
string of "+" symbols instead of anything that might be
misinterpreted as a valid result [which year 0000 is not]. The
smallest allowed value by the database for casting from a character
string should be '0001-01-01', and the largest allowed value should
be '9999-12-31. The subtraction of one day from the former effects
a date underflow, and the addition of one day to the latter effects
a date overflow; both, as expected.

Regards, Chuck

On 20-Jul-2010 21:13, Hal wrote:
> Hi all...thank you for responding. I opened a PMR with software
> support today. I created a much simplified example, put the 3
> row table into a save file, and sent it off to the Holy Land in
> Rochester.
>
> I created a test table with three rows. Two have valid dates, one
> has an invalid date. Here is the query I wrote against this table
>
> SELECT
> rownum,
> datestring, --Character string in the table
> date(datestring) as Test1,

> -- Test1: MAY return null
> coalesce(date(datestring),current_date) as Test2,
> -- Test2: SHOULD NEVER return null

Hal

unread,

Jul 23, 2010, 12:03:33 PM7/23/10

to

Okay, so I fixed the problem. I figured I would share, and I have to
thank Rochester support in addition to everyone here for nudging me in
the right direction. You cannot use COALESCE to solve this one. I had
to write a user defined function. It wasn't too hard, but I did hit a
few stumbling blocks that I thought I'd share.

Here's the UDF

Create Function QGPL.datecheck
(indate VarChar(10))
Returns Date
Language SQL
Begin
Declare workdate DATE;
DECLARE EXIT HANDLER FOR SQLSTATE '22007'
BEGIN
Set workdate = date('2099-01-01') ;
return workdate;
END;
Set workdate = date(indate);
Return workdate;
END

Basically this tries to date() the input string, if that does not work
the DB2 engine sets an SQLSTATE 22007, so I have an exit handler trap
that and kick back '2099-01-01' if it fails.

Gotcha #1 - The DB2 engine apparently CACHES the UDF's. While I was
writing this thing I started small, just writing a function trying to
kick back a valid date if I called the function with any string as an
input parm. That worked. So then I dropped the function and re-created
it with interactive SQL in one window, and then re-ran my SQL
statement that called the function in another window...and I got the
same results(!). So I dropped the function and verified the *SRVPGM
object was gone in a green screen session. It was. So I tried to run
the SQL statement that calls the function again..and it worked (!!).
The only way I was able to force DB2 to re-load the UDF after I made a
change was to end/reconnect my JDBC connection.

Gotcha #2 - 2099-01-01 will not display if your date format is USA, so
even with the above function, I have to wrap the results of it with a
char(,ISO) otherwise I still see null. That I can live with.

Here's the query.

select em#,
char(qgpl.datecheck(emtmdc||substr(right('000'||emtdat,6),1,2)||'-'||
substr(right('000'||emtdat,6),3,2)||'-'||right(emtdat,2)),ISO) as
testfield2,
emtmdc||substr(right('000'||emtdat,6),1,2)||'-'||substr(right('000'||
emtdat,6),3,2)||'-'||right(emtdat,2)
from store.peis301 where emloc = '99'

It works...if I pass a bad date, or even just a text string, the
function throws back a date object of 2099-01-01, which is what I
wanted. I closed the PMR. And I learned how to write a rudimentary
user-defined function.

Thanks to all for your advice.

Chris

CRPence

unread,

Jul 23, 2010, 5:03:28 PM7/23/10

to

On 23-Jul-2010 09:03, Hal wrote:
> Okay, so I fixed the problem. I figured I would share, and I have
> to thank Rochester support in addition to everyone here for
> nudging me in the right direction. You cannot use COALESCE to
> solve this one. I had to write a user defined function. It wasn't
> too hard, but I did hit a few stumbling blocks that I thought I'd
> share.

If the only "bad date" string was '0000-01-01' [e.g. if like the
value '2099-01-01', the '0001-01-01' was being used as an effective
"special value"], then using NULLIF or CASE along with COALESCE
could easily resolve the original presented case for evaluation of
the expression to the NULL value.

> Here's the UDF
>
> Create Function QGPL.datecheck
> (indate VarChar(10))
> Returns Date
> Language SQL
> Begin
> Declare workdate DATE;
> DECLARE EXIT HANDLER FOR SQLSTATE '22007'
> BEGIN
> Set workdate = date('2099-01-01') ;
> return workdate;
> END;
> Set workdate = date(indate);
> Return workdate;
> END
>
> Basically this tries to date() the input string, if that does not
> work the DB2 engine sets an SQLSTATE 22007, so I have an exit
> handler trap that and kick back '2099-01-01' if it fails.

For lack of any SET OPTION statement in the function to establish
the preference for date format, I am not sure if the UDF will be
dependent on the options established by the user, or if the UDF
established the option from the session in which the function was
created. Probably best to review the PRTSQLINF, or to test with
varying DATSEP() and DATFMT() settings.

FWiW I find the use of 2099 an odd choice as a special value,
given how in the not so distant past, people were fixing sometimes
forty year old software problems; problems of similar origin.

> Gotcha #1 - The DB2 engine apparently CACHES the UDF's. While I
> was writing this thing I started small, just writing a function
> trying to kick back a valid date if I called the function with
> any string as an input parm. That worked. So then I dropped the
> function and re-created it with interactive SQL in one window,
> and then re-ran my SQL statement that called the function in
> another window...and I got the same results(!). So I dropped the
> function and verified the *SRVPGM object was gone in a green
> screen session. It was. So I tried to run the SQL statement that
> calls the function again..and it worked (!!). The only way I was
> able to force DB2 to re-load the UDF after I made a change was to
> end/reconnect my JDBC connection.

Hmmm, I have never had that experience. I do generally create in
the same session where I am testing, so perhaps that is the reason.
However since a SQL UDF service program is created with
ACTGRP(*CALLER), running in a named activation group would effect
such caching; i.e. RCLACTGRP would need to be performed, just as
with any ILE re-create for debug scenario. And that would be the
language environment responsible for the cache, not the database. I
am aware of DETERMINISTIC effecting caching within the database, but
only within the active SQL statement.

> Gotcha #2 - 2099-01-01 will not display if your date format is
> USA, so even with the above function, I have to wrap the results
> of it with a char(,ISO) otherwise I still see null. That I can
> live with.

To be clear, the *USA [like *ISO] is a four-digit year date
presentation format. While the typical date formatting in the USA
is *MDY, that is better noted explicitly versus being described as
"USA" since the CHAR() scalar supports the token USA as its second
element to effect 'MM/DD/YYYY' date presentation format.

Any date presentation format for which there is only a two-digit
year, only then will the valid date values outside 1940 through 2039
fail to display. That is because those presentations will give an
ambiguous presentation; e.g. the date value 10/08/06 could be valid
as any of *YMD, *MDY, or *DMY, so the reader must infer [or be privy
to intent as to] what the actual date value should be. AFaIK the
DATFMT(*JOB) even in IBM i 7.1 still resolves to only one of the
two-digit year date formats.

>
> Here's the query.
>
> select em#,
> char(qgpl.datecheck(

> emtmdc||substr(right('000'||emtdat,6),1,2)||'-'
> ||substr(right('000'||emtdat,6),3,2)||'-'

> ||right(emtdat,2))

> ,ISO) as testfield2,
> emtmdc||substr(right('000'||emtdat,6),1,2)||'-'
> ||substr(right('000'||emtdat,6),3,2)||'-'
> ||right(emtdat,2)
> from store.peis301 where emloc = '99'

The expression to build the date string seems a bit more complex
than I would expect. Creating another function more specific to
common inputs, e.g. specific to those columns, could make the query
much cleaner. I did notice the columns are obviously different from
the opening post, so I am not sure what EMTDAT field is in the above
query. The input to the function is also only eight bytes, probably
for a two-digit year input built with a dash as separator, whereas
the function is defined to accept up to ten. The use of RIGHT()
seems a bit odd if the EMTDAT is a six-digit numeric, since then the
DIGITS() scalar might be more obvious to the reader; seems unlikely
the data is already left justified character string of numeric digits.?

FWiW use of the concatenation symbols and compressing spaces
between a comma and the next numeric literal is discouraged, since
both tend to cause difficulties for use outside of USEnglish
installations. Inserting a space after a comma enables a user with
comma as decimal separator to copy\paste a given statement without
them having to first reformat the query to get the expected effect.

I am not sure how using the UDF [as written, returns a NULL
value] would resolve the [Subject] case of any of the fields in the
expression being the NULL value, since any NULL value in the
expression will cause the result of the expression to be the NULL
value. I suppose perhaps there was never any issue with NULL
values, and instead there was only bad data, overflow, and underflow
issues.?

> It works...if I pass a bad date, or even just a text string, the
> function throws back a date object of 2099-01-01, which is what
> I wanted. I closed the PMR. And I learned how to write a
> rudimentary user-defined function.
>

The SQL user-defined function defaults to CALLED ON NULL INPUT,
so for lack of COALESCE or IFNULL on the return for the evaluated
expression of the WorkDate variable, the function is invoked
needlessly for a[n expression evaluated to a] NULL value.

Regards, Chuck