Re: [sqlite] NaN in, 0.0 out?

399 views
Skip to first unread message

Simon Slavin

unread,
Mar 24, 2012, 6:41:44 AM3/24/12
to General Discussion of SQLite Database

On 24 Mar 2012, at 7:18am, Stephan Buchert <steph...@gmail.com> wrote:

> Occasionally the sensor has a fault or the values are invalid (like
> negative temperatures). To mark them once and for all as invalid, the
> faulty values are replaced (in a C environment) by NaNs. Then the data are
> inserted into the database with a C porgram, in a loop with the
> sqlite3_bind_double function. The date are retrieved in a loop with the
> sqlite3_column_double function. It turns out that SQLITE has quietly turned
> our invalid data (NaNs) into valid 0.0 values! Is this a bug?

Instead of using NaN can you use NULL to mark such values ? That should work perfectly.

Handling of NaN is not always what you might expect. I await an expert to clarify whether it should or shouldn't work.

Simon.
_______________________________________________
sqlite-users mailing list
sqlite...@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Jean-Christophe Deschamps

unread,
Mar 24, 2012, 8:10:15 AM3/24/12
to General Discussion of SQLite Database
Stephan,

> > Occasionally the sensor has a fault or the values are invalid (like
> > negative temperatures). To mark them once and for all as invalid, the
> > faulty values are replaced (in a C environment) by NaNs. Then the
> data are
> > inserted into the database with a C porgram, in a loop with the
> > sqlite3_bind_double function. The date are retrieved in a loop with the
> > sqlite3_column_double function. It turns out that SQLITE has
> quietly turned
> > our invalid data (NaNs) into valid 0.0 values! Is this a bug?
> >
>

>SQLite converts NaN inputs into NULL. And sqlite3_column_double() is
>defined to return 0.0 for any non-numeric value, including NULL.
>
>I recommend that you first check the datatype coming back using
>sqlite3_column_type() first, and only use sqlite3_column_double() if the
>type is SQLITE_INTEGER or SQLITE_REAL and return a NaN if
>sqlite3_column_type() is anything else.

You can also test for NaN before inserting and insert NaNs as BLOBs,
which are guaranteed to be stored and retrieved verbatim. This can be
important to keep the semantic of the NaN payload, should you use it
someday. Convert those BLOBs to C NaNs back after reading, if needed
to use the payload.

This way you have an efficient way to manipulate, select, avoid,
change, compare, sort valid values and NaNs without having to deal with
NULLs.

Jay A. Kreibich

unread,
Mar 24, 2012, 10:48:33 PM3/24/12
to General Discussion of SQLite Database
On Sat, Mar 24, 2012 at 07:32:32AM -0400, Richard Hipp scratched on the wall:

> SQLite converts NaN inputs into NULL.

I think this is the right choice. It is what I would expect.

Well, no... what I would actually expect is that
sqlite3_bind_double() should return an error, since it was provided
with an invalid value. But if it isn't going to do that,
conversion to NULL seems most correct.

NaNs are, by definition, not actual numbers, and represent no
meaningful numeric value or concept (such as infinity). What they
represent is outside the domain of real numbers that float point
values attempt to represent. They are a side-effect of the IEEE 754
standard, and are used to represent invalid or inconsistent results
from floating point operations. They exist to support the operators
defined in the standard, and are not inherent to the binary
representation of floating point numbers.

NaNs and NULLs, as a concept, share a lot. Both are value-less, in
the sense that A != A for both NULLs and NaNs. Both are also
unordered, meaning they cannot be sorted.

In this case, the NaN is being used as a "flag value" to indicate
something outside the normal domain of numbers. In the database
world, this is exactly how NULLs are used. In the context of an
application interfacing with a database, using NULLs for missing
values is the appropriate thing to do. I'd go so far to say that
if your application uses NaNs as flag values, the application itself
should be doing the NULL conversion, and not depending on the
database to do a more subtle conversion.

Databases understand NULLs. Everything in a database environment
already understands three-value-logic, and how to deal with it.
Aggregate functions understand NULLs, and do the right thing. NULLs
are part of SQL and understood by any SQL relational database. IEEE
754, on the other hand, and the NaNs that go with it, is not a
database standard. Many databases don't even use IEEE 754 to represent
numeric values, making it impossible to insert a NaN in any form.

I also think allowing NaN values into an SQLite database would be
dangerous. Never forget that NaNs are used to represent error
conditions. Inserting a signaling-NaN is going to cause floating
point exceptions with almost any operation. Or are we limiting
NaNs to quiet-NaNs? Or converting signaling-NaNs to quite-NaNs?
Should SQLite really have to know that much about IEEE 754?

Proper support would require SQLite being very aware of the values
it is dealing with. Just like NULLs, NaNs tend to propagate and
take over any calculation they enter. They also risk throwing
exceptions. Unless every expression, function, and aggregate is
made aware of NaNs, and how to properly special-case them (test
that!), I suspect a database would largely become unusable, except
as a simple data store. You would also need to do things like
define a sort order. Is a NaN less than -INF? Greater than +INF?
Does that question even make sense?

To me, allowing NaNs to enter the database makes about as much sense
as allowing strings with negative length values. Just because I can
come up with the bits to represent it doesn't make it real or right.

Database systems, including SQLite, already have a well defined, well
understood, and well documented system for representing and dealing
with special-case flag values, especially when it comes to "missing",
"undefined", or "unknown" values. It is best to go with the existing
system which already covers all these cases.

-j

--
Jay A. Kreibich < J A Y @ K R E I B I.C H >

"Intelligence is like underwear: it is important that you have it,
but showing it to the wrong people has the tendency to make them
feel uncomfortable." -- Angela Johnson

Simon Slavin

unread,
Mar 25, 2012, 12:48:01 AM3/25/12
to j...@kreibi.ch, General Discussion of SQLite Database

On 25 Mar 2012, at 3:48am, "Jay A. Kreibich" <j...@kreibi.ch> wrote:

> On Sat, Mar 24, 2012 at 07:32:32AM -0400, Richard Hipp scratched on the wall:
>
>> SQLite converts NaN inputs into NULL.
>
> I think this is the right choice. It is what I would expect.

But it's not what the docs say:

<http://sqlite.org/datatype3.html>

"REAL. The value is a floating point value, stored as an 8-byte IEEE floating point number."

That includes +infinity, -infinity, and NaN. If you boast IEEE you should be storing those. Can those values be passed from a 'double' C variable ? I believe so. So I see no reason why SQLite shouldn't be storing them.

And given that SQLite does calculations internally you also have to get the correct IEEE-specified results if someone does

SELECT a * b FROM myTable

or such things. SQLite should return the right result.

> [snip] Should SQLite really have to know that much about IEEE 754?

A legitimate question. If you removed the reference to IEEE from the documentation, I'd also be satisfied. Presumably SQLite would then generate some sort of error if you passed sqlite3_bind_double() infinite or NaN values since it can stand up to the word REAL without handling them. But saying that you store IEEE and somewhere between accepting and returning it arbitrarily replacing NaN with NULL strikes me as wrong.

SQLite3 works the way it does, and changing that is going to break stuff, as people have noted. But I think that a fault has been identified, and SQLite4 (should it ever materialise) should do this better. In the mean time, removing IEEE from the docs might be an idea.

Simon.

Roger Binns

unread,
Mar 25, 2012, 1:09:26 AM3/25/12
to sqlite...@sqlite.org
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 24/03/12 21:48, Simon Slavin wrote:
> But it's not what the docs say:

NaN support changed in SQLite 3.5.9 (May 2008):

http://www.sqlite.org/changes.html#version_3_5_9

However cvstrac no longer shows code so I couldn't work out if NaN was
converted to NULL during calculations, or also stored that way. As
someone else pointed out, NaNs and NULLs have very similar properties
which is presumably why this approach was chosen.

I have no idea what the standard says, but this also appears relevant:

sqlite> select typeof(1.0/0);
null

Roger
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)

iEYEARECAAYFAk9uqIYACgkQmOOfHg372QQ6PwCgoQjGCemoT4f0G0DAgjmX8Emm
Qg4AoLOFttq8FgAel8w618BECp/+jCTx
=2z8X
-----END PGP SIGNATURE-----

Stephan Buchert

unread,
Mar 25, 2012, 2:44:28 AM3/25/12
to sqlite-users
Thanks for all the answers. The solution to check first the type with
sqlite3_column_type() and, if NULL, return NaN seems the most practicable
one from a user's standpoint. A future sqlite3_column_double_v2() or
another way to control the conversion of non-numeric values to double would
be nice and appreciated perhaps by several other users, too.

/Stephan

Francis J. Monari, Esquire

unread,
Mar 25, 2012, 9:29:58 AM3/25/12
to General Discussion of SQLite Database
All,

How are +infinity and -infinity handled?

Frank.

Jay A. Kreibich

unread,
Mar 25, 2012, 12:10:58 PM3/25/12
to mckerna...@juno.com, monari...@juno.com, General Discussion of SQLite Database
On Sun, Mar 25, 2012 at 09:29:58AM -0400, Francis J. Monari, Esquire scratched on the wall:

> How are +infinity and -infinity handled?

Generally, they are treated like normal values. You can calculate,
store, and retrieve an INF or -INF value using just SQL expressions.

As with all calculations, when the result is a NaN (such as INF - INF)
the result is converted to a NULL.

-j

--
Jay A. Kreibich < J A Y @ K R E I B I.C H >

"Intelligence is like underwear: it is important that you have it,
but showing it to the wrong people has the tendency to make them
feel uncomfortable." -- Angela Johnson

Francis J. Monari, Esquire

unread,
Mar 25, 2012, 2:03:44 PM3/25/12
to j...@kreibi.ch, mckerna...@juno.com, General Discussion of SQLite Database
All,

Not to sound critical, but to be clear: using SQLite for IEEE floating
point will result in data "loss' unless precautions are taken.

Assuming my statement above is correct, then does a standard set of
precautions exist?

Frank.


Francis J. Monari, Esquire
McKernan, McKernan & Godino
113 North Sixth Street
Camden, New Jersey 08102-1269
856-964-7759 (voice)
856-964-9620 (fax)
mckerna...@juno.com ("main", "office", "primary")
monari...@juno.com

Please use mckerna...@juno.com as the ("main", "office", "primary")
email.
If possible please copy emails to monari...@juno.com.

ATTENTION: This e-mail and the materials accompanying it convey and
contain privileged and confidential information belonging to the sender
which is legally privileged. The transmittal sheet, the materials, and
the information they contain are intended only for the use of the
individual(s) or entity(ies) named above. If you are not the intended
recipient, you are hereby notified that any disclosure, copying,
distribution, or any reliance on the e-mail, the materials accompanying
it, or the information they contain is strictly prohibited, and that
review by any individual other than the intended recipient shall not
constitute waiver of the attorney/client privilege. If you have
received the materials in error, please immediately notify us by
telephone at 856-964-7759 or by reply email. We will arrange for the
return of the materials to us. Please delete the copy of this message
on your server. Very few methods of communication are 100% secure,
please exercise appropriate care when using or replying to this
message. Thank you.

I.R.S. CIRCULAR 230 NOTICE: To comply with requirements imposed by the
Internal Revenue Service (I.R.S.) please be informed that the advice
contained in this communication was not intended or written to be used
and cannot be used for the purpose of avoiding penalties that may be
imposed by the I.R.S..

McKernan, McKernan & Godino does not attempt to sweep e-mail and
attachments for viruses. It does not guarantee that either are
virus-free and accepts no liability or any damage sustained as a result
of viruses.

Jay A. Kreibich

unread,
Mar 25, 2012, 2:55:50 PM3/25/12
to mckerna...@juno.com, monari...@juno.com, General Discussion of SQLite Database
On Sun, Mar 25, 2012 at 02:03:44PM -0400, Francis J. Monari, Esquire scratched on the wall:

> All,
>
> Not to sound critical, but to be clear: using SQLite for IEEE
> floating point will result in data "loss' unless precautions are
> taken.

Define "loss". SQLite, in general, is dependent on the underlying
hardware for floating point calculations. These calculations are done
within the limitations of the given hardware. Most modern hardware
is IEEE 754 compliant, but some mobile platforms are not fully
compliant. The SQLite application code takes great care in
converting between strings and IEEE 754 values so that no precision
or accuracy is lost.

What other types of loss are of concern?

Reply all
Reply to author
Forward
0 new messages