SQL query

Eduardo

unread,

Mar 29, 2013, 11:30:07 PM3/29/13

to

Suppose I have a table with two rows and filled with these values:

Col1 Col2
1 100
1 101
1 102
1 105
1 108
2 100
2 103
2 110
2 111
3 102
3 105
3 112
4 100
4 103
4 105
4 110

I need to make an SQL query that returns a Col1 values list, but depending
on what values there are on Col2

For the example I need the list of all Col1 values, but only the ones that
have 100 and 103 on some rows.
It would be a recordset of two records, returning 2 and 4.
Is that possible to do that with an SQL query?

Eduardo

unread,

Mar 29, 2013, 11:52:30 PM3/29/13

to

Perhaps I have simplified too much the former example, here I reformulate
it:

> Col1 Col2
> 1 100
> 1 101
> 1 102
> 1 105
> 1 108
> 2 100
> 2 103
> 2 110
> 2 111
> 3 102
> 3 105
> 3 112
> 4 100
> 4 103
> 4 105
> 4 110

I need a list of Col1 records that has rows with Col2 = 100 AND also has
rows with Col2 = 103 OR = 102
It should return three records: 1, 2 and 4.

Carlos Rocha

unread,

Mar 30, 2013, 1:35:16 AM3/30/13

to

Hi Eduardo,

With SQLite you can do this:

select distinct a.Col1 from MyTable a
join MyTable b
where a.[Col2]=100 and (b.[Col2]=102 or b.[Col2]=103)

(I hope Henning is not around, in case you are talking about SQL Server) :P

Eduardo

unread,

Mar 30, 2013, 9:58:13 AM3/30/13

to

"Carlos Rocha" <jazzd...@gmail.com> escribi� en el mensaje
news:e542ba67-7a03-4c58...@googlegroups.com...

Hi Carlos,

I'm talking about an Access 2000 Database used from VB with DAO 3.6

In the case the condition has... let's say up to 50 variables, the 50 JOIN's
could be a problem?
(I'll perform some tests)

Eduardo

unread,

Mar 30, 2013, 10:56:07 AM3/30/13

to

"Carlos Rocha" <jazzd...@gmail.com> escribi� en el mensaje
news:e542ba67-7a03-4c58...@googlegroups.com...

> Hi Eduardo,
>
> With SQLite you can do this:
>
> select distinct a.Col1 from MyTable a
> join MyTable b
> where a.[Col2]=100 and (b.[Col2]=102 or b.[Col2]=103)

For some reason it doesn't work. It returns error 3131 "error in the FROM
clause".

I also tried adding:

select distinct a.Col1 from MyTable a

join MyTable b ON (a.[Col1] = b.[Col1])

where a.[Col2]=100 and (b.[Col2]=102 or b.[Col2]=103)

but got the same error.

If I remove the JOIN clause and leave

select distinct a.Col1 from MyTable a, MyTable b

where a.[Col2]=100 and (b.[Col2]=102 or b.[Col2]=103)

I get no error, but it's a recordset of all records that have any of the two
values on Col2 and not only the ones that have the two Col2 values at the
same time in different rows.

Leonardo Azpurua

unread,

Mar 30, 2013, 11:25:13 AM3/30/13

to

"Eduardo" <m...@mm.com> escribi� en el mensaje
news:kj5nh0$omt$1...@speranza.aioe.org...

Hi,

Try this:

SELECT DISTINCT Tabla1.col1
FROM Tabla1
INNER JOIN (SELECT * FROM Tabla1 WHERE Col2 IN (102, 103)) T
ON Tabla1.col1 = T.col1
WHERE Tabla1.col2 = 100

It's a very weird requirement, though. It feels like something may be wrong
with the design.

Schmidt

unread,

Mar 30, 2013, 12:55:45 PM3/30/13

to

Am 30.03.2013 15:56, schrieb Eduardo:

> If I remove the JOIN clause and leave
>
> select distinct a.Col1 from MyTable a, MyTable b
> where a.[Col2]=100 and (b.[Col2]=102 or b.[Col2]=103)
>
> I get no error, but it's a recordset of all records that
> have any of the two values on Col2 and not only the ones that
> have the two Col2 values at the same time in different rows.

Eduardo, for those kind of "tricky SQL-Queries", it's
always best, when you include a small example - maybe
two (with always the same small input-list) - one
result-list, what you *don't* want - and another small
result-list with the result you desire.

It's easier to come up with a good suggestion then.

Olaf

Carlos Rocha

unread,

Mar 30, 2013, 2:57:49 PM3/30/13

to

>
> > select distinct a.Col1 from MyTable a
> > join MyTable b
> > where a.[Col2]=100 and (b.[Col2]=102 or b.[Col2]=103)
>
> For some reason it doesn't work. It returns error 3131 "error in the FROM
> clause".
>

A fast google search suggests that this error "tends" to be related to the table name. Check if the syntax is accepted by Access and make the query in a single line, to make sure there are all required spaces between tokens (I don't know how access interpret a newline).

I just replicated your list in the first post inside a SQLite table and it works as expected, returning 1,2,4

Eduardo

unread,

Mar 30, 2013, 3:03:07 PM3/30/13

to

"Schmidt" <n...@vbRichClient.com> escribi� en el mensaje
news:kj75aq$3cg$1...@dont-email.me...

Hello Olaf:

What I'm trying to achieve is some "intelligent" text search.

The search term may be a single word, in that case it's not so much complex,
but it also can be a search phrase with several words at the same time.

I have a database with "documents", that have several fields.
One field is the Title, another field is the summary (those are the two more
important ones), and I have also some other text fields where to seach the
phrase.
In total I have 11 fields, one is the Title, another one the Summary, and
for the rest they can be more than one per document (so I have them is
another related table).

I have the search routine half done. I build a list if Documents_IDs (it's
an array) with all the documents IDs that have relevance for the query, also
I have another array with RevelanceValues where I save a number according to
the relevance of that document for the phrase.
First, I search in the title, it's a normal text search, I add the ones that
math and assign a relevance value in RevelanceValues, it's a high value
number because the relevance is strong if the phrase appears in the Title
exactly as written by the user.
After that I perform a normal search in the Summary, also assigning a
relevance value.
I do the same for the other 9 fields.

After that, I go more fine and I want to perform a search by Lemma.

I already have stored in the database all the Spanish words with their
lemmas.
Each word can have more than one lemma.

A lemma is like the entry in a dictionary, the main word without any
inflection.
It's similar to a Stem (I know you have experience with stemming), but it's
not the same thing.
The stem is how the word starts, the "root" (althought "root" is not a
proper word to use because it means something else) or common characters at
the beginning of words that are the same word but with different
inflections.
For example in English "wait", "waited", "waiting" have the stem "wait".
Here there is a more extensive explanation:
http://en.wikipedia.org/wiki/Word_stem

A lemma takes into account the meaning and not only how the word starts.
For example, in English "seen" and "see" has the lemma "see" that in that
case is the same for the stem, also "see", but "saw" has also the lemma
"see", but it has another stem.
In other words, lemmas takes into account that the words can be irregular
and to have another declination or morphology.
Lemma explained in wikipedia:
http://en.wikipedia.org/wiki/Lemma_%28morphology%29

In English this kind of search it's a lesser complex problem, because the
verbs has only three tenses (and only the third person has a difference
adding an "s" or "es" at the end) and the words are very much regular, but
the Spanish have a lot of verbs conjugations, and also suffixes can be added
to the verbs like "informing to us" ("informando a nosostros") can become
"informingus" (inform�ndonos") or "remind it to me" ("recuerda esto a m�")
can be written as "remindmeit" ("recu�rdamelo") , and there are many words
that are not regular so this kind of search becomes more important.

So, first I have all the Spanish words with their lemmas already stored in a
database.

And I've made three tables that are intermediate tables and must be updated
every time that there are editions to the documents or new documents are
added before performing queries. But it's not a problem because it's not
that they will enter new information and to want to perform queries at once.
They will enter all the documents information, prepare the database for the
final user (including updating these intermediate tables) and ship the
database to the final users that will only be able to perfor queries but not
to enter new documents or edit the one it has. So the update of these tables
takes some seconds (before being able to perform queries after editing data)
but it's not a problem.

So the program has two states of working: "data entry" or "final user" -or
"query"-.

The complete list of Spanish words is in a separate database file. This will
not be shipped to the final user and is used only in the data entry
location.

About the three intermediate tables, one of the tables is a list of all the
Spanish words but only the ones used in the documents (I decided not to
include all the Spanish words in the final user database because it would
add 100 Mb to the database file).

So these three tables, one has the Spanish words used by the documents and
with their corresponding lemmas. Each word can have more than one lemma
(adding this to the complexity). For example in English, the word "saw" has
(at least) two lemmas: "to see" (a verb) and "saw" -a tool to cut wood- (a
noun). Most of the words have only one lemma, some has two, to have more
than two is more rare, but there are words that have up to 5 lemmas. So in
this table I have 5 fields for lemmas: Lemma1, Lemma2, Lemma3, Lemma4 and
Lemma5.
Each word in this table is an unique entry, and it has a Word_ID. This is
the primary key.
Also there is another field to point whether the word is a "stop word":
http://en.wikipedia.org/wiki/Stop_words

In the second table I have three fields: Document_ID, Words_ID and a third
field to know how many times that word appears in the document.
The primary key is Document_ID + Words_ID

In the third table I have:
Document_ID
Field_Number
Position
Word_ID

Document_ID Identifies a document. Field_Number is a number from 1 to 11, 1
meaning Title, 2 meaning Summary, and so on. This is were the words were
taken from.
Position is a number with the word position in the text.
Word_ID is the ID of the word, the ID that the word has in the first table.

So, I have all this design (that I'm not so sure any more that it's a good
design), and I'm trying to search for a phrase but taking into account that
it can be with other words inflections. Also, to ignore (or not) stop words,
to threat all the pronouns as same (or not), to be able to find if the words
are, but not exactly in the same order as the user wrote it in the query,
also if they are nearby or far away from each other... and to return a
different relevance number for each case.
Also, to consider the times each word appear in the text, if they are the
exact word or another one with the same lemma.

But I got stuck trying to build a recordset with the list of Document_ID's
that has the three words (but only if they have the three words in the same
Field_Number -in a three words query phrase example: not one word found in
the Title and the other two in the Summary-).

Also, I wanted to return for each phrase all the documents that has not the
exact words but words that have the same lemmas. I already has built the
list of words with the same lemmas for each word (of the query phrase) in an
array. So that's why I said in the example I posted = 100 OR = 102.

I know it's somewhat complex, but I still don't figure how I should have
stored the words in a better proper way to accomplish this.

I still can do what I want not with a full recordset, and with many
FindFirst's searches, but that will be very slow I think.

Eduardo

unread,

Mar 30, 2013, 3:09:03 PM3/30/13

to

"Leonardo Azpurua" <leon...@exmvps.org> escribi� en el mensaje
news:kj6vm6$2j5$1...@dont-email.me...

OK, I'll test it, but it seems that it will get even more complex as more
conditions (more variables) take into acccount.

> It's a very weird requirement, though. It feels like something may be
> wrong with the design.

Humm, may be yes, I'm starting to consider that.
I've explained what I'm trying to achieve in my answer to Olaf's message.

Eduardo

unread,

Mar 30, 2013, 3:11:10 PM3/30/13

to

"Carlos Rocha" <jazzd...@gmail.com> escribi� en el mensaje

news:0034f7ef-e428-480d...@googlegroups.com...

>
> > select distinct a.Col1 from MyTable a
> > join MyTable b
> > where a.[Col2]=100 and (b.[Col2]=102 or b.[Col2]=103)
>
> For some reason it doesn't work. It returns error 3131 "error in the FROM
> clause".
>

> A fast google search suggests that this error "tends" to be related to the
> table name. Check if the syntax is accepted by Access > and make the query
> in a single line, to make sure there are all required spaces between
> tokens (I don't know how access
> interpret a newline).

Carlos, I have all in the same line and with one space separation. The Table
name is right. I looked and checked the query line several times.
I could not find on internet query samples for DAO with JOIN for the same
table.

Carlos Rocha

unread,

Mar 30, 2013, 8:15:12 PM3/30/13

to

> For some reason it doesn't work. It returns error 3131 "error in the FROM
> clause".
>
> I also tried adding:
>
> select distinct a.Col1 from MyTable a
> join MyTable b ON (a.[Col1] = b.[Col1])
> where a.[Col2]=100 and (b.[Col2]=102 or b.[Col2]=103)
>
> but got the same error.
>

As I hate to give up, so I tried it myself with Access2007 and this one works:

select distinct a.Col1 from Table1 as a inner join Table1 as b on a.Col1=b.Col1 where a.[Col2]=100 and (b.[Col2]=102 or b.[Col2]=103);

It needs INNER JOIN in place of a simple JOIN, and also the ON clause

.NET Sales Support

unread,

Mar 31, 2013, 12:51:36 AM3/31/13

to

"Eduardo" <m...@mm.com> wrote in message <news:kj5nh0$omt$1...@speranza.aioe.org>...

select col1 from t1
where col1 in (select col1 from t1 where col2 = 100)
and (col1 in (select col1 from t1 where col2 = 103)
or col1 in (select col1 from t1 where col2 = 102))

Leonardo Azpurua

unread,

Mar 31, 2013, 2:02:39 PM3/31/13

to

"Leonardo Azpurua" <leon...@exmvps.org> escribi� en el mensaje
news:kj6vm6$2j5$1...@dont-email.me...

Well, not necessarily wrong.

A likely real world case could be to get a list of invoices that include
'item 100' and also include 'item 103' or 'item 102'.

I guess there is no other way to get such information.

Eduardo

unread,

Mar 31, 2013, 8:51:30 PM3/31/13

to

"Leonardo Azpurua" <leon...@exmvps.org> escribi� en el mensaje

news:kj9t97$n9n$1...@dont-email.me...

>> It's a very weird requirement, though. It feels like something may be
>> wrong with the design.
>
> Well, not necessarily wrong.
>
> A likely real world case could be to get a list of invoices that include
> 'item 100' and also include 'item 103' or 'item 102'.

Yes, exactly, it doesn't seem too strange after all.

> I guess there is no other way to get such information.

PS: I still didn't get back to work to test the three SQL constructions
possibilities offered here (and their performances).

Eduardo

unread,

Apr 1, 2013, 11:05:09 AM4/1/13

to

"Eduardo" <m...@mm.com> escribi� en el mensaje
news:kj5m71$lln$1...@speranza.aioe.org...

> Col1 Col2
> 1 100
> 1 101

[...]

I tested the performance of the three SQL queries proposed, two worked. I
did it without the last OR.

1) From Carlos Rocha, it worked, reurned 5 records from my real table:
Set iRecordSet = gDataBase.OpenRecordset("SELECT DISTINCT a.Col1 FROM Table1
AS a INNER JOIN Table1 AS b ON a.Col1 = b.Col1 WHERE a.Col2 = 100 AND b.Col2
= 102")

Time: 3.05 milliseconds

2) From Leonardo, it worked, returned 5 records from my real table:
Set iRecordSet = gDataBase.OpenRecordset("SELECT DISTINCT Table1.Col1 FROM
Table1 INNER JOIN (SELECT * FROM Table1 WHERE Col2 IN (102)) T ON
Table1.Col1 = T.Col1 WHERE Table1.Col2 = 100")

Time: 3,37 milliseconds

3) From VS6 Sales Support, it didn't work, because it returned 1 record. It
seems that it just returned one Col1 where the two Col2 were present:
Set iRecordSet = gDataBase.OpenRecordset("SELECT Col1 FROM Table1 WHERE Col1
IN (SELECT Col1 FROM Table1 WHERE Col2 = 100) AND Col1 IN (SELECT Col1 FROM
Table1 WHERE Col2 = 102)")

Time: 4.2 milliseconds.

If I change the AND to OR it goes faster (2 milliseconds) but I get also
just one record.

Thanks guys.

Schmidt

unread,

Apr 1, 2013, 11:53:51 AM4/1/13

to

Am 01.04.2013 17:05, schrieb Eduardo:

> I tested the performance of the three SQL queries proposed, two worked. I
> did it without the last OR.

Another way to skin the cat would be per In Clause against a
SubQuery ... not sure though if that performs faster in the
JET-Engine, than the Join-based approaches.

Select Distinct Col1 From MyTable
Where Col1 In (Select Col1 From MyTable Where Col2=100)
AND Col2 In (102,103)

Olaf

Eduardo

unread,

Apr 1, 2013, 12:22:26 PM4/1/13

to

"Schmidt" <n...@vbRichClient.com> escribi� en el mensaje
news:kjcae6$e1t$1...@dont-email.me...

Olaf, it returned the 5 records right and performed faster: 1,7 milliseconds
Thanks a lot.

Eduardo

unread,

Apr 1, 2013, 12:25:32 PM4/1/13

to

"Eduardo" <m...@mm.com> escribi� en el mensaje
news:kjcc7v$987$1...@speranza.aioe.org...

But now I have to figure out how to generalize it for more than two query
entries.
For example, how it would be for three?

Jeff Johnson

unread,

Apr 1, 2013, 12:29:10 PM4/1/13

to

"Eduardo" <m...@mm.com> wrote in message

news:kj5m71$lln$1...@speranza.aioe.org...

I can't speak to the speed you'll get, but using the EXISTS clause ought to
be more extensible than the joins:

SELECT DISTINCT Col1
FROM Table1 t1
WHERE EXISTS (SELECT * FROM Table1 t2 WHERE t2.Col1 = t1.Col1 AND t2.Col2 =
100) AND EXISTS (SELECT * FROM Table1 t2 WHERE t2.Col1 = t1.Col1 AND t2.Col2
= 103)

Here's the AND/OR one:

SELECT DISTINCT Col1
FROM Table1 t1
WHERE EXISTS (SELECT * FROM Table1 t2 WHERE t2.Col1 = t1.Col1 AND t2.Col2 =
100) AND EXISTS (SELECT * FROM Table1 t2 WHERE t2.Col1 = t1.Col1 AND t2.Col2
IN (102, 103))

Ultimately, however, I think you're probably trying to do something which
would be better handled outside of SQL.

Eduardo

unread,

Apr 1, 2013, 1:03:54 PM4/1/13

to

"Jeff Johnson" <i....@enough.spam> escribi� en el mensaje
news:kjccgb$tu8$1...@dont-email.me...

Hello Jeff, what I tested is:

Set iRecordset = gDataBase.OpenRecordset("SELECT DISTINCT Col1 FROM Table1
T1 WHERE EXIST (SELECT Col1 FROM Table1 T2 WHERE T2.Col1 = T1.Col1 AND
T2.Col2 = 100) AND EXISTS (SELECT * FROM Table1 T2 WHERE T2.Col1 = t1.Col1
AND T2.Col2 IN (102))")

I had to change the * to Col1 on the second SELECT because I got an error
3306.
This returned the 5 recods right, but it was very slow compared to others:
600 milliseconds

>
> Ultimately, however, I think you're probably trying to do something which
> would be better handled outside of SQL.

I didn't even test the FindFirst/FindNext approach because I thought that an
SQL query should be faster.

Eduardo

unread,

Apr 1, 2013, 1:11:38 PM4/1/13

to

"Eduardo" <m...@mm.com> escribi� en el mensaje

news:kjccdp$a2f$1...@speranza.aioe.org...

I think I already figured it:

Set iRecordset = gDataBase.OpenRecordset("SELECT DISTINCT Col1 FROM Table1

WHERE Col1 IN (SELECT Col1 FROM Table1 WHERE Col2 = 100) AND Col1 IN (SELECT

Col1 FROM Table1 WHERE Col2 = 105) AND Col2 IN (103)")

Schmidt

unread,

Apr 1, 2013, 1:33:29 PM4/1/13

to

Am 01.04.2013 18:22, schrieb Eduardo:

>> Another way to skin the cat would be per In Clause against a
>> SubQuery ... not sure though if that performs faster in the
>> JET-Engine, than the Join-based approaches.
>>
>> Select Distinct Col1 From MyTable
>> Where Col1 In (Select Col1 From MyTable Where Col2=100)
>> AND Col2 In (102,103)
>>
>

> ...it returned the 5 records right and performed faster:
> 1,7 milliseconds

Wasn't really expecting that, but one never knows, what
a certain DB-Engine does in its Query-Optimizer.

Tested this with SQLite against 120,000 Entries in a Test-
Table (using Integer-Columns) - and then Carlos, Leonardos -
and my above approach delivered roughly comparable results
(when an Index on Col2 *was* set).

Between 0.21 and 0.25msec for Carlos and my approach - and
between 0.29 and 0.35msec with Leonardos approach.

When *no* Index on Col2 was set, the result came out
(much slower) this way:
Carlos: 77.48msec
Leonardo: 354.27msec
Olaf: 278.47msec

So (at least for SQLite) I'd perhaps go with Carlos' SQL-String,
which performs well in both cases (indexed and non-indexed).

But the JET-Engine has a different Query-Optimizer - only
appropriate Tests can show what performs best, on a given
Indexing-Scheme and on the concrete DB-Engine in question.

And as you see from the huge differences in the indexed
and non-indexed results on SQLite - this would be a very
important part on your end, to place and test those properly.

Now I see, that Jeff has also thrown in his 2cents - and
given what you wrote about the background of the whole thing,
I tend to agree with him, that perhaps other approaches
(e.g. fast Hash-Tables or "Dictionaries", which perform those
"loads of cascading lookups" you want to execute, directly
in memory) would be worthwhile to test too...
I'm pretty sure, he didn't have an Rs.FindFirst in mind...

Olaf

Jeff Johnson

unread,

Apr 1, 2013, 1:51:21 PM4/1/13

to

"Eduardo" <m...@mm.com> wrote in message

news:kjcelm$hkd$1...@speranza.aioe.org...

> Set iRecordset = gDataBase.OpenRecordset("SELECT DISTINCT Col1 FROM Table1
> T1 WHERE EXIST (SELECT Col1 FROM Table1 T2 WHERE T2.Col1 = T1.Col1 AND
> T2.Col2 = 100) AND EXISTS (SELECT * FROM Table1 T2 WHERE T2.Col1 = t1.Col1
> AND T2.Col2 IN (102))")
>
> I had to change the * to Col1 on the second SELECT because I got an error
> 3306.

Actually, it should be the other way around. EXISTS statements should always
use SELECT *, if by nothing other than convention.

Eduardo

unread,

Apr 1, 2013, 1:58:20 PM4/1/13

to

"Schmidt" <n...@vbRichClient.com> escribi� en el mensaje

news:kjcg90$qht$1...@dont-email.me...

I have all the fields indexed in that table.

> Now I see, that Jeff has also thrown in his 2cents - and
> given what you wrote about the background of the whole thing,
> I tend to agree with him, that perhaps other approaches
> (e.g. fast Hash-Tables or "Dictionaries", which perform those
> "loads of cascading lookups" you want to execute, directly
> in memory) would be worthwhile to test too...
> I'm pretty sure, he didn't have an Rs.FindFirst in mind...

I really don't know what he had in mind.
This approach seems to be performing quite fast. I still didn't test it with
all the OR's (that can be up to 20 or 30) but I don't think that this will
slow down it too much.

I already have in memory something like a "dictionary", it's a vector with
all the words and another with all their ID's. I use them for locating a
misspelled word. When I can't find a word in the database, I check for some
other word very similar. If there is one, I assume that it's that word that
the user intended to write (it must have more than 75% of coincidence, I use
the more similar word in the case that there are more than one word with 75%
or more). I'm using a modified version of the word similarity class that you
gave to me.

But, it only has the words, not if they appear or where they appear or the
order how the appear.

Eduardo

unread,

Apr 1, 2013, 2:07:34 PM4/1/13

to

"Jeff Johnson" <i....@enough.spam> escribi� en el mensaje

news:kjchaf$361$1...@dont-email.me...

I don't know Jeff, I'm not an SQL expert, I just say that it threw that
error and what I changed and (seemingly) worked.
I didn't copy the error message description because it was in Spanish.
Perhaps the SQL query is wrong as I got it to work and that's why it took so
long to execute, but with the asterisk it raised that error.

.NET Sales Support

unread,

Apr 2, 2013, 2:00:05 AM4/2/13

to

"Eduardo" <m...@mm.com> wrote in message <news:kjcid2$u9r$1...@speranza.aioe.org>...

You might be better off with a relational algebra library rather than trying
to beat SQL into submission, especially if you'll be trying to build monster
queries. This academic paper outlines the general idea.

http://www.ijikm.org/Volume6/IJIKMv6p073-083McMaster544.pdf

Deanna Earley

unread,

Apr 2, 2013, 6:39:51 AM4/2/13

to

On 30/03/2013 03:30, Eduardo wrote:
> Suppose I have a table with two rows and filled with these values:
>

<SNIP>

>
> I need to make an SQL query that returns a Col1 values list, but depending
> on what values there are on Col2
>
> For the example I need the list of all Col1 values, but only the ones that
> have 100 and 103 on some rows.
> It would be a recordset of two records, returning 2 and 4.
> Is that possible to do that with an SQL query?

You could try a crosstab query.
This would give you something like:
Col1 100 101 102 103 105 108 110 111 112
1 1 1 1 0 1 1 0 0 0
2 1 0 0 1 0 0 1 1 0
3 0 0 1 0 1 0 0 0 1
4 1 0 0 1 1 0 1 0 0

You can then filter this as with any other table/query.

I can't give a sample, but Access has a wizard and designer for them.

--
Deanna Earley (dee.e...@icode.co.uk)
iCatcher Development Team
http://www.icode.co.uk/icatcher/

iCode Systems

(Replies direct to my email address will be ignored. Please reply to the
group.)

Jeff Johnson

unread,

Apr 2, 2013, 2:32:38 PM4/2/13

to

"Eduardo" <m...@mm.com> wrote in message

news:kjcid2$u9r$1...@speranza.aioe.org...

>>> Set iRecordset = gDataBase.OpenRecordset("SELECT DISTINCT Col1 FROM
>>> Table1 T1 WHERE EXIST (SELECT Col1 FROM Table1 T2 WHERE T2.Col1 =
>>> T1.Col1 AND T2.Col2 = 100) AND EXISTS (SELECT * FROM Table1 T2 WHERE
>>> T2.Col1 = t1.Col1 AND T2.Col2 IN (102))")
>>>
>>> I had to change the * to Col1 on the second SELECT because I got an
>>> error 3306.
>>
>> Actually, it should be the other way around. EXISTS statements should
>> always use SELECT *, if by nothing other than convention.
>
> I don't know Jeff, I'm not an SQL expert, I just say that it threw that
> error and what I changed and (seemingly) worked.
> I didn't copy the error message description because it was in Spanish.
> Perhaps the SQL query is wrong as I got it to work and that's why it took
> so long to execute, but with the asterisk it raised that error.

What version of Access? I tested this under 2010, and I used SELECT * in all
my EXISTS subqueries. No errors.

Also, are you copy/pasting your actual SQL? Because there's a typo (EXIST
vs. EXISTS) in what you posted, which makes me think you typed it into the
post by hand. We need to see the actual code you're running.

Eduardo

unread,

Apr 2, 2013, 3:09:51 PM4/2/13

to

"Eduardo" <m...@mm.com> escribi� en el mensaje

news:kjcf47$j6n$1...@speranza.aioe.org...

[...]

I facing another problem.

I simplified the example and I thought that I would be able to figure the
other parts, but I see that perhaps this approach can't handle that.

As I commented somewhere else (where I explained what I was doing), I have
some other columns in the table.

This is the SQL:

SELECT DISTINCT Col1 FROM MyTable WHERE Col1 IN (SELECT Col1 FROM MyTable
WHERE Col2 IN (100)) AND Col1 IN (SELECT Col1 FROM MyTable WHERE Col2 IN
(105)) AND Col2 IN (102) AND Doc_Field = 3

In the more real case:
Col1 from the example becomes Document_ID
Col2 from the example becomes Word_ID

The table Documents_Words is

Document_ID
Doc_Field
Sub_Item_ID
Word_Position
Word_ID

They are all numeric values.
Doc_Field is a number from 1 to 11 to point if the text is 1: from the
Title, 2: from the Summary, 3 to 11 from other fields.
This other fields, 3 to 11, can repeat in a document (so they are in another
table). There can be several of then (of the same "Field number", for
example "4"), so to point to an specific text, the Doc_Field number must be
used together with the Sub_Item_ID.
Sub_Item_ID for the case of 1 (Title) or 2 (Summary) is always zero, but for
the other fields, it has a number.

And my real SQL now is something like:

SELECT DISTINCT Document_ID FROM Documents_Words WHERE Document_ID IN
(SELECT Document_ID FROM Documents_Words WHERE Word_ID IN (100)) AND
Document_ID IN (SELECT Document_ID FROM Documents_Words WHERE Word_ID IN
(105)) AND Word_ID IN (102) AND Doc_Field = 3

But, I need to also to tell that the Sub_Item_ID value must match.

I think I could figure to add that to the JOIN version of the query, (WHERE
a.Sub_Item_ID = b.Sub_Item_ID), but I don't see how I could do that with
this other query.
Is that possible?
(sorry to bother again and thanks)

Eduardo

unread,

Apr 2, 2013, 3:13:42 PM4/2/13

to

"Jeff Johnson" <i....@enough.spam> escribi� en el mensaje

news:kjf83q$4rm$1...@dont-email.me...

> "Eduardo" <m...@mm.com> wrote in message
> news:kjcid2$u9r$1...@speranza.aioe.org...
>
>>>> Set iRecordset = gDataBase.OpenRecordset("SELECT DISTINCT Col1 FROM
>>>> Table1 T1 WHERE EXIST (SELECT Col1 FROM Table1 T2 WHERE T2.Col1 =
>>>> T1.Col1 AND T2.Col2 = 100) AND EXISTS (SELECT * FROM Table1 T2 WHERE
>>>> T2.Col1 = t1.Col1 AND T2.Col2 IN (102))")
>>>>
>>>> I had to change the * to Col1 on the second SELECT because I got an
>>>> error 3306.
>>>
>>> Actually, it should be the other way around. EXISTS statements should
>>> always use SELECT *, if by nothing other than convention.
>>
>> I don't know Jeff, I'm not an SQL expert, I just say that it threw that
>> error and what I changed and (seemingly) worked.
>> I didn't copy the error message description because it was in Spanish.
>> Perhaps the SQL query is wrong as I got it to work and that's why it took
>> so long to execute, but with the asterisk it raised that error.
>
> What version of Access? I tested this under 2010, and I used SELECT * in
> all my EXISTS subqueries. No errors.

2000

> Also, are you copy/pasting your actual SQL? Because there's a typo (EXIST
> vs. EXISTS) in what you posted, which makes me think you typed it into the
> post by hand. We need to see the actual code you're running.

That could be then the reason. Yes, from my code to here I copied and pasted
it (but before I made several text replaces for the names of fields and the
table)
But when I wrote the code example, I wrote it by hand so I could not copy a
letter.

Eduardo

unread,

Apr 2, 2013, 3:35:10 PM4/2/13

to

"Eduardo" <m...@mm.com> escribi� en el mensaje
news:kjfakr$26o$1...@speranza.aioe.org...

Yes, there was a typo with the EXISTS keyword, now I fixed it and it didn't
raise any error, but returned 0 records and lasted 300 milliseconds.

This was the query:

Set iRecordSet = gDataBase.OpenRecordset("SELECT DISTINCT Col1 FROM MyTable
T1 WHERE EXISTS (SELECT * FROM MyTable T2 WHERE T2.Col1 = T1.Col1 AND
T2.Col2 = 100) AND EXISTS (SELECT * FROM MyTable T2 WHERE T2.Col1 = T1.Col1
AND T2.Col2 IN (103))")

I just replaced in notepad with its replace tool the name of the fields and
the table, and manually replaced the "Col2" numbers by 100 and 103 to match
the example posted here.

Eduardo

unread,

Apr 2, 2013, 3:47:15 PM4/2/13

to

"Eduardo" <m...@mm.com> escribi� en el mensaje
news:kjfakr$26o$1...@speranza.aioe.org...

>> What version of Access? I tested this under 2010, and I used SELECT * in
>> all my EXISTS subqueries. No errors.
>
> 2000

The database is Access (2000 or 2003) but I'm not using Access, I use DAO
3.6 from VB6

ralph

unread,

Apr 2, 2013, 4:37:10 PM4/2/13

to

Then that can be a problem when comparing SQL statements.

The version of MSAccess can determine the file format (in this case
Jet 4.0), but the SQL implementation is determined by the
Provider/Driver. When Microsoft updated the Jet file format from 3.5
to 4.0, they also updated the Jet SQL Implementation, to more "ANSI
SQL 1992" compliant. They call this change - "SQL Extensibility". BUT
this extensibiltiy was or is only supported by ADO providers, not by
any version of DAO drivers*.

[* this is as true inside MSAccess as outside, no matter what the
MSAccess SQL 92 Mode is set to.]

In practical use I've never found it to be much of problem - it only
comes up in those rare situations where you are pasting a Query from
an ADODB App to a DAO app. (Like this case <g>). Have never found a
complete list of what "extensions" have been added or differences, but
it simple enough to write a simple ADO/DAO side-by-side comparison
utility to test.

-ralph

Eduardo

unread,

Apr 2, 2013, 4:58:41 PM4/2/13

to

"ralph" <nt_con...@yahoo.com> escribi� en el mensaje
news:lkeml85rbb2ndrscd...@4ax.com...

Thank Ralph for the clarification.

Eduardo

unread,

Apr 2, 2013, 5:06:45 PM4/2/13

to

"Eduardo" <m...@mm.com> escribi� en el mensaje

news:kjfadj$1gk$1...@speranza.aioe.org...

Don't worry, I think I can work with it anyway.

I added the field Sub_Item_ID to the DISTINCT clause, and after that for
every record, I query another recordset for those Document_ID and
Sub_Item_ID and work with that set of words (for the later analysis).

The only thing is that there will be sets that don't have all the words, but
only some of them because they were in the same Doc_Field number but on
another Sub_Item_ID.
It will waste some time analyzing those sets but it still will work OK
because it won't find all the words there and don't add any relevance to
that document.