difference between two select statements.

SpreadTooThin

unread,

Aug 27, 2012, 1:18:46 PM8/27/12

to

I perform two selects on a database and get a difference of 1 record.

SELECT COUNT(DISTINCT UID) from MyTable WHERE (c1 LIKE 'a%' or c1 LIKE 'b%') and c2='c';

SELECT COUNT(DISTINCT UID) FROM MyTable;

The first select return 199 the second returns 199.

How can i get the UID for the record that is different?

Lennart Jonsson

unread,

Aug 27, 2012, 1:30:44 PM8/27/12

to

On 08/27/2012 07:18 PM, SpreadTooThin wrote:
> I perform two selects on a database and get a difference of 1 record.
>
> SELECT COUNT(DISTINCT UID) from MyTable WHERE (c1 LIKE 'a%' or c1 LIKE 'b%') and c2='c';
>
> SELECT COUNT(DISTINCT UID) FROM MyTable;
>

Not sure if mysql supports except (minus):

SELECT UID FROM MyTable;
except
SELECT UID from MyTable WHERE (c1 LIKE 'a%' or c1 LIKE 'b%') and c2='c'

if not you can use exists:

SELECT DISTINCT UID
FROM MyTable x
WHERE NOT EXISTS (
SELECT 1 from MyTable y

WHERE (c1 LIKE 'a%' or c1 LIKE 'b%') and c2='c'

AND x.uid = y.uid
);

/Lennart

SpreadTooThin

unread,

Aug 27, 2012, 1:39:12 PM8/27/12

to

So when you say MyTable x and MyTable y are these new tables that are created in memory from the result of select statements?

SpreadTooThin

unread,

Aug 27, 2012, 1:48:23 PM8/27/12

to

perform two selects on a database and get a difference of 1 record.

SELECT COUNT(DISTINCT UID) from MyTable WHERE (c1 LIKE 'a%' or c1 LIKE 'b%') and c2='c';

SELECT COUNT(DISTINCT UID) FROM MyTable where c2='c';

The first select return 198 the second returns 199.

How can i get the UID for the record that is different? How can i get the UID for the record that is different?

(My apologies for the re-post.

Lennart Jonsson

unread,

Aug 27, 2012, 3:36:41 PM8/27/12

to

On 08/27/2012 07:39 PM, SpreadTooThin wrote:
[...]

> So when you say MyTable x and MyTable y are these new tables that are created in memory from the result of select statements?

Not sure I understand the question. Is it the aliases x and y that you
are concerned about? Your query looked like:

SELECT UID FROM MyTable

which is the same as

SELECT UID FROM MyTable x

or

SELECT UID FROM MyTable as x

Did that clarify?

/Lennart

SpreadTooThin

unread,

Aug 27, 2012, 3:53:36 PM8/27/12

to

I guess i need to go back to basics here a little.

1) If i have two tables (x and y) with the same schema (and only 1 column) how can I find what is in table a, but not in table b

2) How do I combine the results or two select statements to form the query requested in point 1.

// This select creates a table called x of UIDs.
SELECT DISTINCT UID from MyTable x WHERE (c1 LIKE 'a%' or c1 LIKE 'b%') and c2='c';

// This select creates a table called y of UIDs.
SELECT DISTINCT UID from MyTable y WHERE c2='c';

// my best guess at the syntax for locating the record in question.
select 1 from x where 1 not found in y;

But the tables x and y don't live past the the statement that created them....

Lennart Jonsson

unread,

Aug 27, 2012, 5:46:09 PM8/27/12

to

On 08/27/2012 09:53 PM, SpreadTooThin wrote:
[...]

> I guess i need to go back to basics here a little.
>
> 1) If i have two tables (x and y) with the same schema (and only 1 column) how can I find what is in table a, but not in table b
>

ok, say:

create table x ( val int not null primary key );
create table y ( val int not null primary key );
insert into x (val) values (1),(2),(3);
insert into y (val) values (2),(3),(4);

select val from x
where not exists (
select 1 from y where x.val = y.val
)

>
> 2) How do I combine the results or two select statements to form the query requested in point 1.
>

See above

>
> // This select creates a table called x of UIDs.
> SELECT DISTINCT UID from MyTable x WHERE (c1 LIKE 'a%' or c1 LIKE 'b%') and c2='c';
>

While it is true that the result of a select statement is a table, I get
a feeling that this confuses you. Let's instead say that

SELECT DISTINCT UID from MyTable x WHERE (c1 LIKE 'a%' or c1 LIKE 'b%')
and c2='c'

is a set of rows, let's call it A. Your other statement:

SELECT DISTINCT UID from MyTable y WHERE c2='c';

is also a set of rows, say B.

If A = { 1,2,3 } and B = { 2,3,4 }

A - B = { 1 } and B - A = { 4 }

In your case A - B should be empty { } since A cannot contain any
element that's not in B.

But if I remember correctly mySQL has no support for "set -" (named
except in DB2 and minus in Oracle). We therefore have to rephrase this
expression as:

Any element that belongs to B but does not belong to A

or

Any element that belongs to B but does not exists in A

implemented in sql that becomes:

SELECT DISTINCT UID
FROM MyTable x
WHERE NOT EXISTS (
SELECT 1 from MyTable y

WHERE (c1 LIKE 'a%' or c1 LIKE 'b%') and c2='c'

AND x.uid = y.uid
);

>

Peter H. Coffin

unread,

Aug 27, 2012, 5:51:43 PM8/27/12

to

You reverse the logic, inverting operations and swapping "and" for "or":

(c1 LIKE 'a%' or c1 LIKE 'b%') becomes
(c1 NOT LIKE 'a%' and c1 NOT LIKE ''b%')

That will leave you with the item missing from the first query. You'll
need to keep the c2='c' the same because it's the same between both
queries.

There are situations where reversing logic won't be as simple as this
little rule, but it's a solid place to start from.

--
59. I will never build a sentient computer smarter than I am.
--Peter Anspach's list of things to do as an Evil Overlord

SpreadTooThin

unread,

Aug 27, 2012, 6:15:08 PM8/27/12

to

On Monday, August 27, 2012 3:46:10 PM UTC-6, Lennart Jonsson wrote:
> On 08/27/2012 09:53 PM, SpreadTooThin wrote:
>
> [...]
>
> > I guess i need to go back to basics here a little.
>
> >
>
> > 1) If i have two tables (x and y) with the same schema (and only 1 column) how can I find what is in table a, but not in table b
>
> >
>
>
>
> ok, say:
>
>
>
> create table x ( val int not null primary key );
>
> create table y ( val int not null primary key );
>
> insert into x (val) values (1),(2),(3);
>
> insert into y (val) values (2),(3),(4);
>
>
>
> select val from x
>
> where not exists (
>
> select 1 from y where x.val = y.val
>
> )
>
>
>
> >
>
> > 2) How do I combine the results or two select statements to form the query requested in point 1.
>
> >
>
>
>
> See above
>
>
>
> >
>
> > // This select creates a table called x of UIDs.
>
> > SELECT DISTINCT UID from MyTable x WHERE (c1 LIKE 'a%' or c1 LIKE 'b%') and c2='c';
>
> >
>
>
>
> While it is true that the result of a select statement is a table, I get
>
> a feeling that this confuses you. Let's instead say that
>
>
>
> SELECT DISTINCT UID from MyTable x WHERE (c1 LIKE 'a%' or c1 LIKE 'b%')
>
> and c2='c'
>
>
>
> is a set of rows, let's call it A. Your other statement:
>

Why call it A or B isn't it x or y as per the sql statement?

>
>
> SELECT DISTINCT UID from MyTable y WHERE c2='c';
>
>
>
> is also a set of rows, say B.
>
>
>
> If A = { 1,2,3 } and B = { 2,3,4 }
>
>
>
> A - B = { 1 } and B - A = { 4 }
>
>
>
> In your case A - B should be empty { } since A cannot contain any
>
> element that's not in B.
>
>
>
> But if I remember correctly mySQL has no support for "set -" (named
>
> except in DB2 and minus in Oracle). We therefore have to rephrase this
>
> expression as:
>
>
>
> Any element that belongs to B but does not belong to A
>
>
>
> or
>
>
>
> Any element that belongs to B but does not exists in A
>
>
>
>
>
> implemented in sql that becomes:
>
>
>
> SELECT DISTINCT UID
>
> FROM MyTable x
>
> WHERE NOT EXISTS (
>
> SELECT 1 from MyTable y
>
> WHERE (c1 LIKE 'a%' or c1 LIKE 'b%') and c2='c'
>
> AND x.uid = y.uid
>
> );
>
>

The last thing I don't understand about this is why the first SELECT doesn't have or need the where c2='c' qualifier?

P.S. Many thanks all.

Denis McMahon

unread,

Aug 27, 2012, 6:35:46 PM8/27/12

to

On Mon, 27 Aug 2012 10:18:46 -0700, SpreadTooThin wrote:

> I perform two selects on a database and get a difference of 1 record.
>
> SELECT COUNT(DISTINCT UID) from MyTable WHERE (c1 LIKE 'a%' or c1 LIKE
> 'b%') and c2='c';
>
> SELECT COUNT(DISTINCT UID) FROM MyTable;

I think:

SELECT DISTINCT UID FROM MyTable WHERE UID NOT IN ( SELECT DISTINCT UID
FROM MyTable WHERE (c1 LIKE 'a%' or c1 LIKE 'b%') and c2='c' );

might work, but I haven't tested it.

Rgds

Denis McMahon

SpreadTooThin

unread,

Aug 27, 2012, 8:39:36 PM8/27/12

to

use test;

create table MyTable
( `uid` varchar(8),
`c1` varchar(8),
`c2` varchar(8)
);

insert into MyTable (uid, c1, c2) values('111', 'aaa', 'c');
insert into MyTable (uid, c1, c2) values('111', 'ccc', 'c');
insert into MyTable (uid, c1, c2) values('111', 'ddd', 'c');

insert into MyTable (uid, c1, c2) values('222', 'bbb', 'c');
insert into MyTable (uid, c1, c2) values('222', 'ddd', 'c');
insert into MyTable (uid, c1, c2) values('222', 'eee', 'c');

insert into MyTable (uid, c1, c2) values('333', 'ddd', 'c');
insert into MyTable (uid, c1, c2) values('333', 'eee', 'c');
insert into MyTable (uid, c1, c2) values('333', 'fff', 'c');

insert into MyTable (uid, c1, c2) values('444', 'xxx', 'd');
insert into MyTable (uid, c1, c2) values('444', 'yyy', 'd');
insert into MyTable (uid, c1, c2) values('444', 'zzz', 'd');

select count(distinct uid) from MyTable where (c1 like 'a%' or c1 like 'b%') and c2='c';
select count(distinct uid) from MyTable where c2='c';

mysql> select count(distinct uid) from MyTable where (c1 like 'a%' or c1 like 'b%') and c2='c';
+---------------------+
| count(distinct uid) |
+---------------------+
| 2 |
+---------------------+
1 row in set (0.00 sec)

mysql> select count(distinct uid) from MyTable where c2='c';
+---------------------+
| count(distinct uid) |
+---------------------+
| 3 |
+---------------------+

SELECT DISTINCT UID

FROM MyTable x
WHERE NOT EXISTS (
SELECT 1 from MyTable y

WHERE (c1 LIKE 'a%' or c1 LIKE 'b%') and c2='c'

AND x.uid = y.uid
);

+------+
| UID |
+------+
| 333 |
| 444 |
+------+

This should only return 333

SpreadTooThin

unread,

Aug 27, 2012, 9:30:59 PM8/27/12

to

select DISTINCT uid From MyTable x where c2='c' and NOT EXISTS (select 1 from MyTable y where (c1 like 'a%' or c1 like 'b%') and c2='c' and x.uid = y.uid);

This worked... I hope for the right reasons...

Lennart Jonsson

unread,

Aug 28, 2012, 12:15:36 AM8/28/12

to

On 08/28/2012 03:30 AM, SpreadTooThin wrote:
> select DISTINCT uid From MyTable x where c2='c' and NOT EXISTS (select 1 from MyTable y where (c1 like 'a%' or c1 like 'b%') and c2='c' and x.uid = y.uid);
>
>

That is correct

> This worked... I hope for the right reasons...

Now, don't just use that query blindly. Try understand what it does. For
each row in:

select x.* From MyTable x where c2='c';

convince your self why that row is part of, or missing in:

select x.* From MyTable x

where c2='c'
and NOT EXISTS (
select 1 from MyTable y
where (c1 like 'a%' or c1 like 'b%')
and c2='c' and x.uid = y.uid
);

/Lennart

Lennart Jonsson

unread,

Aug 28, 2012, 12:16:49 AM8/28/12

to

On 08/28/2012 12:15 AM, SpreadTooThin wrote:
[....]

>>
>> SELECT DISTINCT UID
>>
>> FROM MyTable x
>>
>> WHERE NOT EXISTS (
>>
>> SELECT 1 from MyTable y
>>
>> WHERE (c1 LIKE 'a%' or c1 LIKE 'b%') and c2='c'
>>
>> AND x.uid = y.uid
>>
>> );
>>
>>
> The last thing I don't understand about this is why the first SELECT doesn't have or need the where c2='c' qualifier?
>

It was not in your first post from which I copied the example. I did not
notice that you added that predicate, but of course it will have to be
part of the query

[...]

SpreadTooThin

unread,

Aug 28, 2012, 11:29:01 AM8/28/12

to

Why select 1 why not select uid? is this because it is faster or because we simply know that it is the first column or the table?

Lennart Jonsson

unread,

Aug 28, 2012, 1:52:49 PM8/28/12

to

On 08/28/2012 05:29 PM, SpreadTooThin wrote:
[...]

>
> Why select 1 why not select uid? is this because it is faster or because we simply know that it is the first column or the table?

It does not matter what we select, could be *, uid, 'a' or any other
value. It's the existence of the row that matters, i.e. does it exist a
row such that ...

In an ancient history a common recommendation was not to use * in such
situations, because it might cause a lookup in the catalogue for the
meaning of *. Now a days I don't think it matters at all, so it is more
of an old habit that I use 1 in exists predicates.

/Lennart

SpreadTooThin

unread,

Aug 28, 2012, 5:21:35 PM8/28/12

to

On Monday, August 27, 2012 10:15:37 PM UTC-6, Lennart Jonsson wrote:

I'm pretty sure I understand it, but I'm left with questions i can't find answers to in books...

I think like a programmer... I have two record sets.

rs1 = sql("select distinct uid from ImageTable where c2='c';")
rs2 = sql("select distinct uid from ImageTalbe where c1 like 'a5' and c2='c';")

foreach rec1 in rs1
not_found = true
foreach rec2 in rs2
if rec1.uid = rec2.uid then
found = true
break
end if
next
if not_found then
print rec1.uid
endif
next

When I look at your sql statement I don't 'get' how the looping mechanism works... Does that make sence?

I figured out that select 1 just generates a table of 1s for each row of the record set, but I still don't see what use that is...

I'm thick..

Lennart Jonsson

unread,

Aug 30, 2012, 2:40:03 AM8/30/12

to

On 08/28/2012 11:21 PM, SpreadTooThin wrote:

[...]

>
> I'm pretty sure I understand it, but I'm left with questions i can't find answers to in books...
>
> I think like a programmer... I have two record sets.
>
> rs1 = sql("select distinct uid from ImageTable where c2='c';")
> rs2 = sql("select distinct uid from ImageTalbe where c1 like 'a5' and c2='c';")
>
> foreach rec1 in rs1
> not_found = true
> foreach rec2 in rs2
> if rec1.uid = rec2.uid then
> found = true
> break
> end if
> next
> if not_found then
> print rec1.uid
> endif
> next
>
> When I look at your sql statement I don't 'get' how the looping mechanism works... Does that make sence?
>

In some sense. Let's start with a standard recommendation, don't think
in loops, think in sets instead. In my experience this takes some time
to get used to, but turn out to be a very powerful way of thinking. That
said, looking at your algorithm it appears to be correct. Another way of
expressing the algorithm that is closer to the query is if we imagine
rs2 to be a dictionary with rec2 as key.

foreach rec1 in rs1
if rec1.uid not in rs2 then
print rec1.uid
endif
next

Sorry for the latency, I've been quite busy for the past number of days

/Lennart