Question #1
Why did SQL99 *Not* have a Boolean data type on purpose?
Question #2
Why does Celko call the MS BIT column type, "A Kludge"?
(a) Because BIT Column implementation is stupid and slow?
(b) Because it is proprietary?
Microsoft makes a BIT column, and --CELKO-- disaproves.
-- CELKO wrote:
-- Tuesday, May 09, 2006 10:48 AM
-- RE: SQL 2005 Server Boolean value --
That is a kludge. SQL does not have a BOOLEAN data type on purpose and
BIT is a proprietary Numeric data type.
-- End CELKO
Thanks
Russell Mangel
Las Vegas, NV
SQL supports an unique 3-valued logic approach that supports TRUE, FALSE and
UKNOWN. The truth tables for this system is a bit strange, but are formally
defined by the likes of Jan Lukasiewicz. The reason why SQL cannot implement
a true boolean type and a boolean logic is due to the existance of NULLs.
One of the basic characteristic of an SQL datatype is that it must be
nullable. Thus to have a valid 3VL type, SQL will have to support 3 values
and a NULL. Thus if we have a hypothetical Boolean datatype in SQL, we would
have to violate 3-VL and go for 4-VL to include NULL state for its columns &
variables -- which would be inconsistent and the core concepts of SQL.
2. Better to ask him.
--
Anith
Hmm, interesting. I'm reading ANSI SQL:1999 and I do indeed find a Boolean datatype in there. I'm in
9075-2 (Fundamentals), page 24, section 4.6.
(I agree with Anith's comment on the complexity of supporting a Boolean where you have 3-valued
logic. ANSI seems to ignore this complexity by saying that UNKNOWN and NULL is the same thing.)
--
Tibor Karaszi, SQL Server MVP
http://www.karaszi.com/sqlserver/default.asp
http://www.solidqualitylearning.com/
"Russell Mangel" <rus...@tymer.net> wrote in message news:uNuSIdg...@TK2MSFTNGP02.phx.gbl...
If I read it correctly, it means:
1. Boolean comparisons are supported for IF... ELSE...,etc statements
2. Boolean *Column data types* are not supported.
>> Hmm, interesting. I'm reading ANSI SQL:1999 and I do indeed find a Boolean
>> datatype in there. I'm in 9075-2 (Fundamentals), page 24, section 4.6.
>
> If I read it correctly, it means:
> 1. Boolean comparisons are supported for IF... ELSE...,etc statements
> 2. Boolean *Column data types* are not supported.
A "data type" is also a "column data type"
4.6 Boolean types
The data type boolean comprises the distinct truth values true and
false. Unless prohibited by a NOT NULL constraint, the boolean data type
also supports the unknown truth value as the null value. This
specification does not make a distinction between the null value of the
boolean data type and the unknown truth value that is the result of an
SQL <predicate>, <search condition>, or <boolean value expression>; they
may be used interchangeably to mean exactly the same thing.
The boolean data type is described by the boolean data type descriptor.
The boolean data type descriptor contains:
— The name of the boolean data type (BOOLEAN).
6.1 <data type>
...
<boolean type> ::= BOOLEAN
Dieter
OK, you lost me here... why is UNKNOWN in this hypothetical type not
equivalent with NULL?
--
Adam Machanic
Pro SQL Server 2005, available now
http://www.apress.com/book/bookDisplay.html?bID=457
--
TRUE OR NULL = NULL
TRUE OR UNKNOWN = TRUE
But also TRUE and FALSE are states of being in RDBMS, not values.
SQL-92 has <exp> IS [NOT] [TRUE | FALSE | UNKNOWN] that nobody
implements. We got rid of BOLLEANs because of these fundamental
problems .
That truth table does not seem to work. For instance:
SELECT 1
WHERE 1=1 OR 1=NULL
This returns 1, meaning that the end result was TRUE. But these two
predicates logically simplify to the form:
SELECT 1
WHERE (TRUE) OR (NULL)
So (TRUE OR NULL), at least as implemented in SQL Server, == (TRUE).
You might want to look at the longwinded discussion Louis and I had
about this a couple of years ago:
http://groups.google.com/groups/search?q=louis+orange+apple+sqlserver+kass
The point that I think is hard to grasp is that NULL is not a truth value.
The "third" truth value used in SQL is UNKNOWN. TRUE OR UNKNOWN
should be TRUE, but TRUE OR NULL (if there were a 4-valued boolean
with NULL in addition to TRUE, FALSE, and UNKNOWN) should be NULL.
(See comment inline also)
NULLs propagate "just because", not because it makes sense for some
particular interpretation of what NULL means (a typical one being "not
known")
For example, if @n is an INT that is null, @n*0 is also NULL, even though
@n*0
is 0 for every possible known value of @n.
Steve Kass
Drew University
"Adam Machanic" <amachanic@hotmail._removetoemail_.com> wrote in message
news:e%23YiEEng...@TK2MSFTNGP02.phx.gbl...
> "--CELKO--" <jcel...@earthlink.net> wrote in message
> news:1148830393.4...@y43g2000cwc.googlegroups.com...
>>
>> TRUE OR NULL = NULL
>
> That truth table does not seem to work. For instance:
>
> SELECT 1
> WHERE 1=1 OR 1=NULL
>
> This returns 1, meaning that the end result was TRUE. But these two
> predicates logically simplify to the form:
>
> SELECT 1
> WHERE (TRUE) OR (NULL)
Nope. The value of (1=NULL) is truth value #3: UNKNOWN. SQL has
no NULL truth value, so you can't test Celko's truth table using SQL.
>
> So (TRUE OR NULL), at least as implemented in SQL Server, == (TRUE).
TRUE OR UNKNOWN is what you tested.
>"--CELKO--" <jcel...@earthlink.net> wrote in message
>news:1148830393.4...@y43g2000cwc.googlegroups.com...
>>
>> TRUE OR NULL = NULL
>
> That truth table does not seem to work. For instance:
>
>SELECT 1
>WHERE 1=1 OR 1=NULL
>
> This returns 1, meaning that the end result was TRUE. But these two
>predicates logically simplify to the form:
>
>SELECT 1
>WHERE (TRUE) OR (NULL)
Hi Adam,
No - any comparison with NULL as one of it's opperands evaluates to
UNKNOWN, not to NULL. So the comoparison 1=NULL should be replaced by
UNKNOWN.
SELECT 1
WHERE (TRUE) OR (UNKNOWN)
> So (TRUE OR NULL), at least as implemented in SQL Server, == (TRUE).
(TRUE OR NULL) is not implemented in SQL Server since SQL Server doesn't
support a boolean data type.
Logical expressions can only evaluate to TRUE, FALSE, or UNKNOWN. Not to
NULL. And since OR can only be placed between two logical expressions,
it's inmpossible to have SQL Server evaluate (TRUE OR NULL).
--
Hugo Kornelis, SQL Server MVP
Use the rules for theta operators, not the rules for NULL propagation:
SELECT 1 WHERE (TRUE) OR (UNKNOWN);
and I am back to the same problems.
NULL means we do not know what the value is, but unknown means we know that
the value is undetermined?
Am I paraphrasing that correctly?
This sounds like the same definition to me. What's the difference between
knowing that a value is unknown and not knowing what the value is?
OR is it as simple as Joe Celko put it, and a NULL value is the same as an
UNKNOWN value, but by definition they are handled differently when logic is
concerned. Meaning the values are the same, but we use two different names
to indicate HOW we resolve those values in logical expressions. So, if we
are logically dealing with nulls, the NULL overrides all known values, but
UNKNOWN leaves the other values alone? Or UNKNOWN sometimes leaves other
values alone, but other times overrides them returning a result that is also
UNKNOWN?
If my post makes no sense at all, I apologize. This whole thing doesn't
make a lot of sense to me just yet.
"Steve Kass" <sk...@drew.edu> wrote in message
news:eTVSCcng...@TK2MSFTNGP04.phx.gbl...
Jim Underwood wrote:
>
> OK, I am having trouble getting this...
>
> NULL means we do not know what the value is, but unknown means we know that
> the value is undetermined?
> Am I paraphrasing that correctly?
No. NULL refers to both situation with respect to a column value.
Unknown refers to the value of a predicate (the result of a comparison).
By definition (simply because it was defined that way) NULLs propagate.
UNKNOWNS do not, at least not necessarily. If a boolean value can only
be achieve with predicates, its values will only be TRUE, FALSE or
UNKNOWN. If a data type can be a boolean, NULL would have to be
supported too (resulting in 4VL).
Gert-Jan
NULL refers to an unknown value.
UNKNOWN, in this context, refers to a boolean result.
NULLS propagate by definition, UNKNOWN does not.
It is possible that a boolean result is not known, but not that it is NULL.
The main difference being that NULL propogates and thus would interfere with
the processing of our boolean logic.
Did I get that right?
"Gert-Jan Strik" <so...@toomuchspamalready.nl> wrote in message
news:447C99A4...@toomuchspamalready.nl...
>OK, I am having trouble getting this...
>
>NULL means we do not know what the value is, but unknown means we know that
>the value is undetermined?
>Am I paraphrasing that correctly?
Hi Jim,
No, you are not.
NULL is a marker for a missing value. It says nothing about the reason
for the absence of the value - just that there is no value in this row
for that specific column. The reason MIGHT BE that the value is unknown,
but it might just as well mean "not applicable", "not yet", or "you are
not cleared to access that information or even to know if there is any
information there at all", or who knows what else.
UNKNOWN is the result of evaluating a boolean expression where (probably
due to missing information) it turned out to be impossible to assert
whether the expression is true or false.
(snip)
>If my post makes no sense at all, I apologize. This whole thing doesn't
>make a lot of sense to me just yet.
You're not alone - this subject manages to confuse very many people. The
sheer number of textbooks that have it wrong or mix up the terminology
doesn't help much either :-(
3 + x = y
Obviously we don't know x (it's NULL), so we can't know the result (y). So
the NULL (x) propagates to y in the formula.
UNKNOWN is the result of a comparison with NULL. If you were to say:
IF NULL = 1
The result would be UNKNOWN, since we don't know the value that NULL is
standing in for. Likewise if we had a variable @x that was NULL and
compared it to something, the result is UNKNOWN:
IF @x = 1
Ditto in queries that compare columns where NULLs are present.
Here's a logic table for SQL 3VL (if it looks messed up, copy it to notepad
in a fixed-width font):
+---------------------------------------+
| Three-Valued Logic Table |
+---------+---------+---------+---------+
| p | q | p AND q | p OR q |
| True | True | True | True |
| True | False | False | True |
| True | Unknown | Unknown | True |
| False | True | False | True |
| False | False | False | False |
| False | Unknown | False | Unknown |
| Unknown | True | Unknown | True |
| Unknown | False | False | Unknown |
| Unknown | Unknown | Unknown | Unknown |
+---------+---------+---------+---------+
"Jim Underwood" <james.underwoodATfallonclinic.com> wrote in message
news:uU2fHACh...@TK2MSFTNGP04.phx.gbl...
OK, so if there were hypothetically a Boolean datatype that treated a
value of NULL the same as UNKNOWN (and, likewise, UNKNOWN would be assigned
as NULL), what would be the consequences? I'm still trying to wrap my head
around why it isn't possible to merge these two for the sake of a Boolean
type, even if the type does not behave 100% according to what NULL means.
Basically, it is all rather mathematical, and math has to be consistent. If
we change the rules in one place, the rules cease to be rules and the system
breaks down.
Basically, unknown and null are not equivalent in boolean terms, and
treating them as equivalent would be wrong, mathematically speaking. If we
treat null as an unknown in this case, then we are essentially disallowing a
null, which introduces inconsistency to the model.
At least this is what I can gather from the discussion.
On the one hand I agree with you that it would not be a big deal to have
nulls behave as unknown when dealing with boolean datatypes. However, the
question arises, if we make an exception here, why not over there also.
Next thing you know, SQL has more exceptions than the English language. I,
for one, would be willing to accept one single exception, so long as it
stopped there. I would not be at all happy with a language/standard that
was riddled with exceptions. IMO, it is better to have consistency, because
that contributes to maintainability.
"Adam Machanic" <amachanic@hotmail._removetoemail_.com> wrote in message
news:OcJ4YPF...@TK2MSFTNGP03.phx.gbl...
Celko is the one who should respond to this question. However, I think
one of his reasons (the most important, maybe) is this:
The bit data type in SQL Server is not a BOOLEAN data type (as defined
by the SQL Standard), because you cannot execute:
DECLARE @x bit
SET @x=(7>3)
...and expect @x to be TRUE (or 1). You will get a syntax error,
because the bit data type is not the same as a BOOLEAN.
You might say that the bit data type is just like a tinyint data type,
but instead of allowing the values between 0 and 255, it allows only
the values between 0 and 1.
Razvan
Razvan Socol wrote:
>You might say that the bit data type is just like a tinyint data type,
>but instead of allowing the values between 0 and 255, it allows only
>the values between 0 and 1.
>
>
>
Bit is not as similar to tinyint as you might think. It has some
unusual properties, such as:
1. Any number-type value can be converted to [bit], and the
result is 0 or 1 depending on whether the value was 0 or not.
select cast(pi() as bit)
----
1
2. Arithmetic operands are not valid on [bit]s:
declare @b bit
set @b = 1
set @b = @b + @b
select @b
Server: Msg 8117, Level 16, State 1, Line 3
Operand data type bit is invalid for add operator.
Steve Kass
Drew University
>Razvan
>
>
>
A lot of people feel that SQL already is riddled with exceptions.
Apparently that's a big reason why no one is able to implement anywhere near
the full standard...
"Adam Machanic" <amachanic@hotmail._removetoemail_.com> wrote in message
news:%23OjcN5M...@TK2MSFTNGP05.phx.gbl...
The reason I asked this question, is that I recently
upgraded to MS Sql server 2005 and I wanted
to "fix a badly designed column".
The original Column is called "CompanyType" and
is represented in the GUI as a set of CheckBoxes
(True/False). The original design of this column
used a (multi-value) varchar(6).
Companies (Table)
===================
CompanyName varchar(50)
CompanyCode varchar(5)
CompanyType varchar(6) -- (Multi-Values)
The Six values for CompanyType are:
"A","B","C","D","E","F"
So you will typically see the following in a field:
ABF
AC
D
A
BF
etc... Any Combination
So I thought the best way to fix this column would be
to add a column for *each of the six types* and use the BIT
column data type. But now I am not so sure this is
the best way to fix this column. Because BIT column
is MS proprietary column, and CELKO dis-approves.
>"Hugo Kornelis" <hu...@perFact.REMOVETHIS.info.INVALID> wrote in message
>news:pvip72hbnuiepvj9b...@4ax.com...
>>
>> UNKNOWN is the result of evaluating a boolean expression where (probably
>> due to missing information) it turned out to be impossible to assert
>> whether the expression is true or false.
>
> OK, so if there were hypothetically a Boolean datatype that treated a
>value of NULL the same as UNKNOWN (and, likewise, UNKNOWN would be assigned
>as NULL), what would be the consequences?
Hi Adam,
If you treat NULL the same as UNKNOWN, then you can't have this
combination:
1. TRUE OR UNKNOWN = TRUE
2. TRUE OR NULL = NULL
So you'll have to settle for either of these options:
a) Change the logic tables to have TRUE OR UNKNOWN = UNKNOWN (plus some
other, similar changes) - not *that* would be a breaking change!!!
or
b) Get rid of the rule that NULLS allways propagate ("great - took us
years to teach all our developers to work with CONCAT_NULL_YIELDS_NULL
on, and now they're changing it back")
> I'm still trying to wrap my head
>around why it isn't possible to merge these two for the sake of a Boolean
>type, even if the type does not behave 100% according to what NULL means.
If you read the SQL-99 or SQL-2003 standard, you'll find that you're not
alone. One of the few features from the ANSI standards that I really
hope will *not* be implemented in SQL Server.
Well, what I was suggesting was to just have an exception in this
case -- i.e., ONLY for the sake of booleans, NULL would not propagate and
would therefore be essentially equivalent with UNKNOWN ... in my opinion
(and probably simply because I could have used this feature in a few
different apps I've worked on in recent memory), an exception for this case
would only be beneficial. HOWEVER, I also realize that creating exceptions
of any kind is a very slippery slope to start down, so I understand your
disdain for this feature...
(snip)
>The original Column is called "CompanyType" and
>is represented in the GUI as a set of CheckBoxes
>(True/False). The original design of this column
>used a (multi-value) varchar(6).
(snip)
>So you will typically see the following in a field:
>ABF
>AC
>D
>A
>BF
>etc... Any Combination
Hi Russell,
Since a company can be of more than one type, I'd recommend storing this
in an extra table
CREATE TABLE CompanyTypes
(CompanyCode varchar(5) NOT NULL,
CompanyType char(1) NOT NULL,
PRIMARY KEY (CompanyCode, CompanyType),
FOREIGN KEY (CompanyCode) REFERENCES Companies(CompanyCode),
CHECK (CompanyType IN ('A', 'B', 'C', 'D', 'E', 'F'))
)
With this design, you'll be very happy when (not if!) management decides
to introduce a seventh company type.