Sybase SQL Anywhere SLOW using like operator on joined tables.

Andrew Klein

unread,

Sep 14, 1998, 3:00:00 AM9/14/98

to

The following query takes about 1 second and gives a count of 15.

Select count(*) from inventory,styles where inventory.styleid = styles.id
and styles.style ='J00362B' and status = '1'

The statistics show I/O estimate 1969.

BUT his query takes FOREVER and gives a count of 383 which is correct.

Select count(*) from inventory,styles where inventory.styleid = styles.id
and styles.style like 'J%' and status = '1'

The statistics show I/O estimate 16232 and the same indexes that are uses in
the 1st query.

It seems that the 'like operator' takes a long amouny of time.

Note, when I import the data into Micrrosoft Access, the query only takes a
second
which seems normal. The same is true when I import it into Foxpro.

The inventory table has 132,853 records, and has a index on styleid and
status.
The style table has 36,367, and has a index on style.
Sybase SQL Anywhere 5.503

Please, help me speed up my query! This seems like pretty basic stuff.

PS : I tried adding foriegn keys, different kind of joins, etc.

Gus

unread,

Sep 14, 1998, 3:00:00 AM9/14/98

to

The reality is that wherever there is a function accross a column, the
optimiser can not use an index (in the true index sense) to optimise the
search. It will only use an index if the index covers the whole query,
where "covering" means that all the columns referenced in the query also
appear in the index. This means that the query will "scan" the index as
opposed to the table, and will be quicker than scanning the table (less
i/o). The reason's for this are fairly simple. If the value of a column
is modified in any way for the purpose of comparison, the optimiser is
unable to predict what the result will be. It therefore has to test each
value in the column regardless of whether the human eye can see that
there is a better mechanism. Your solution is not easy to find, and
perhaps you have to live with it, but the reality is that any function
on a column will give the optimiser headaches. Even if you say:

Select count(*)
from inventory,styles
where inventory.styleid = styles.id

and rtrim(styles.style) = 'J00362B'
and status = '1'

it will have the same result. Sybase does not know the difference
between functions which may alter the logical set of a query and those
which don't, therefore it has to go back to the base data and run each
value in the table through the function to test whether it passes the
test.

Obviously, if there are other criteria included in the where clause, it
will attempt to optimise in a manner which restricts as much as possible
before it starts it's scan.

It is "well" understood that this can and does happen. Unfortunately,
this is something that one has to live with if you want absolute
integrity of your query.

Admittedly, Sybase maybe should code a little more complexity in to
their optimisation so that it would recognised the situation in a Like
where the initial section of the value is constant, and then at least
obtain a starting poing and end point to search through. This would be
far from simple, and still maybe more often than not result in a table
scan. Maybe they have, but not in System 10 or System 11.0.2 I don't
think.

<humour>On a final note, are you sure about Access doing it that fast,
without being the cynical person I am when it comes to many Microsoft
products, I thought that Microsoft had a ploicy of at least 1.5 seconds
before any response was returned to the user, even if the results were
known beforehand ... ;-) </humour>

gus

Gus

unread,

Sep 14, 1998, 3:00:00 AM9/14/98

to

Gus wrote:
>
> Andrew Klein wrote:
> >
> > The following query takes about 1 second and gives a count of 15.

[ snip ]

> before any response was returned to the user, even if the results were
> known beforehand ... ;-) </humour>
>
> gus

Just realised that the question was for SQL Anywhere, so I don't know
how much of my answer applies, but the logic does still stand,
theoretically. I would like to know how Access does it though, perhaps
you can give us some more info, because the LIKE is a pain. What happens
if you have "... LIKE '%J00362B' ... " Is it still quick?

gus

Andrew Klein

unread,

Sep 14, 1998, 3:00:00 AM9/14/98

to

First of all, I am very upset because you told be there is no solution to my
problem.
It works in MS Access and in MS Foxpro but not in SQL Anywhere. Thats pretty
lame.

My whole app is based on issuing querys like

1. show me eveything thst begins with "red" (ie red blue, red yellow)
2. show me all descriptions beginning with "regular" (ie regular cotton,
regular denim, etc.)

Now I have to rewrite my app using another database. I have been working on
the app for months!
Who would of thought Sybase SQL Anywhere can't optimize the 'like' operator.

My MS Access buddys have always put down Sybase SQL Anywhere but know they
are going to laugh in my face.

Gus, do you know if MS SQL Server has the same behavior?

Also, I would never run the following query like you asked because that
would obviously be too slow.
LIKE '%J00362B' ... "

Thanls alot for the response Gus.

Andy
akl...@embryoninc.com

Gus

unread,

Sep 14, 1998, 3:00:00 AM9/14/98

to

This is where I back down a bit .... ;(

I hope you got the rapidly posted followup to my post which acknowledged
that I had assumed you were using Sybase SQL Server, not SQL Anywhere. I
have little/no experience with SQL anywhere. The newsgroups are not a
place where you can take back what you say, only acknowledge and
apologise for it.

You seem pretty aggressive about this, relax a bit. Just for now anyway.
My test system is actually a system 10 one, I am now heading off to try
a bit on system 11.0.3, I will be back ...

I'm back ...

Firstly, on version "SQL Server/11.0.3.2/P/DEC AXP/OpenVMS V6.2/SWR
7825 - X/OPT/26-FEB-1998 02:20:12.40"

This is the table I am choosing to use:

name rowtotal reserved data index_size unused
XXXXX 218658 85222 KB 54670 KB 30512 KB 40 KB

It's clustered index is on a Char (7) column.

This is what I get with the query ....
select Count (*) from XXXX where id like "8%"

This returned Count (*) = 0

QUERY PLAN FOR STATEMENT 1 (at line 1).
STEP 1
The type of query is SELECT.
Evaluate Ungrouped COUNT AGGREGATE.
FROM TABLE
XXXXX
Nested iteration.
Using Clustered Index.
Index : INDX01
Ascending scan.
Positioning by key.
Keys are:
id_party
Using I/O Size 2 Kbytes.
With LRU Buffer Replacement Strategy.
STEP 2
The type of query is SELECT.

0
Table: XXXXX scan count 1, logical reads: 4, physical reads: 3
Total writes for this command: 0

When I ran:
"select Count (*) from XXXX where id like "0%"

I got:

QUERY PLAN FOR STATEMENT 1 (at line 1).
STEP 1
The type of query is SELECT.
Evaluate Ungrouped COUNT AGGREGATE.
FROM TABLE
XXXXX
Nested iteration.
Index : INDXGFN
Ascending scan.
Positioning at index start.
Index contains all needed columns. Base table will not be read.
Using I/O Size 2 Kbytes.
With LRU Buffer Replacement Strategy.
STEP 2
The type of query is SELECT.

79277
Table: XXXXX scan count 1, logical reads: 2506, physical reads: 2506
Total writes for this command: 0

Obviously what I said before about sybase is *incorrect* with version
11. It is noticable that the index used is different between the query
which I knew would return 0 records, and the query which I knew would
return lots.

On what is my test version, Version "SQL Server/10.0.2.6/P/DEC
AXP/OpenVMS V1.5+/EBF 6202 Rollup/1-MAY-1996 16:41:45.88"

This is the same table in our test database

name rowtotal reserved data index_size unused
XXXXX 225405 92948 KB 59444 KB 33456 KB 48 KB

These are the results for:
select Count (*) from XXXXX where id like "0%"

STEP 1
The type of query is SELECT.
Scalar Aggregate
FROM TABLE
XXXXX
Nested iteration
Index : INDX01
STEP 2
The type of query is SELECT.
Table Scan

79277
Table: XXXXX scan count 1, logical reads: 825, physical reads: 824
Total writes for this command: 0

Interestingly this shows even fewer reads than the System 11 database. I
think I know why this is, and I will check later.

This goes to *disprove* what I said earlier about the LIKE. To return
some credibility to what I say, I tried this as well:

select Count (*) from XXXXX where rtrim(id) = "9876543"

I got:
STEP 1
The type of query is SELECT.
Scalar Aggregate
FROM TABLE
XXXXX
Nested iteration
Index : INDX01
STEP 2
The type of query is SELECT.
Table Scan

0
Table: XXXXX scan count 1, logical reads: 2514, physical reads: 0
Total writes for this command: 0

Note the logical reads ....

On system 11 I got:

QUERY PLAN FOR STATEMENT 1 (at line 1).
STEP 1
The type of query is SELECT.
Evaluate Ungrouped COUNT AGGREGATE.
FROM TABLE
XXXXX
Nested iteration.
Index : INDXGFN
Ascending scan.
Positioning at index start.
Index contains all needed columns. Base table will not be read.
Using I/O Size 2 Kbytes.
With LRU Buffer Replacement Strategy.
STEP 2
The type of query is SELECT.

0
Table: XXXXX scan count 1, logical reads: 2506, physical reads: 0
Total writes for this command: 0

The logical reads correlates to an index scan.

Thus, in summary, LIKE is optimised differently to what I believed, and
it is better than I thought. Further, Sybase has already coded in to
it's optimiser what I suggested it should. My confidence in LIKE is
growing.

Obviously the facilities in SQL Anywhere are different, but you will
have to figure it out for yourself, or get someone else on the group who
has access to SQL Anywhere to test/experiment themselves with SQL
Anywhere to see how it behaves. Sybase SQL Server is an "Enterprise"
database, and thus it has more advanced, detailed, and correspondingly
more expensive tools. This gives us the ability to do the above
showplans and statistics on queries. I don't know if SQL Anywhere has
similar functionality.

If it doesn't, then you are suckered, a bit, <sarcasm> and I would
suggest that you see if Access has them </sarcasm>. MS Sql Server is a
derivative of Sybase SQL Server 4.9. It has developed in a different
direction to Sybase SQL Server, and has it's strengths, and weaknesses.
How it handles the LIKE, IDK.

SQL Anywhere is a descendant of the Watcom database, which is actually
totally *unlike* Sybase. It has different origins totally, and Sybase I
believe have not really modified the philosopy of SQL anywhere to any
great extent, but have rather focused on developing the
communication/replication/synchronisation parts of it.

So, Andrew, you are back to the drawing board. You have no idea of why
the Like behaves the way it does. It is unfortunate that I have mislead
you, but at the same time, don't base your opinions of the product on
your experiences with the users of the product.

What can I say.

I used heresay and old experience to substantiate an argument against
something which was not broken in a system I knew nothing about (only
accidentally;). Won't someone please post that to a.h.b-o-u?

As it happens, Sybase SQL Server is so great that it does all the above
and more better than any other database I have worked with ... (which
other database you ask .... MS Access ...)!!!

gus

P.S. how's that deadline .... ?

courte...@jpmorgan.com

unread,

Sep 15, 1998, 3:00:00 AM9/15/98

to Andrew Klein

Andrew,

There are a couple things that could be happening:

1. How many J% records do you have? If this values is more than about 30% of all
values the index will NOT be used especially because there is an index on status
that the optimizer might be switching to. What happens when you do like 'J0%' or
'J00%'... ?

2. The optimizer might decide to check first the status value and then the style
value but still using the same indexes (just not in the same order...)

A like statement undes MS SQL and Sybase act basically the same way... My first
approach would be to check how many J% you have. You could also try K% instead
and see what happens

Andrew Klein wrote:

> The following query takes about 1 second and gives a count of 15.
>

Jan Rosinowski

unread,

Sep 15, 1998, 3:00:00 AM9/15/98

to

if status is defined in inventory a subquery might improve things.
otherwise use a concatenated index (status,style) (the fixed one first
of course).

if sqlaw constantly uses the "wrong" indices you might want to enforce
your selection by assigning selectivity probabilities or estimates
(see chapter 'how the optimizer works')

Andrew Klein

unread,

Sep 15, 1998, 3:00:00 AM9/15/98

to

Thanks for responding. I can't believe I can't run a optimized query that
shows all inventory records where the color begins with red!! (I.E Red
Gray, Red Foxfire, Redish Tint) . (Obviously the style file is indexed on
color name. Remember, only the styleid is in inventory. Granted, I could
probably redesign an existing huge database to accomodate SQL Anywhere but I
don't have the time and the company won't want to change the way they enter
inventory.)

MS Access, Foxpro and Clipper performance blows SQL Anywhere away for the
above query. But I would like to use SQL Anywhere.

Anyway, to answer you questions:

I have 383 J records with a status of 1.
K% behaves the same way.
When I do JO% and JOO% it is still slow. But when I do J00362% is performs
quickly.

Thanks again,
Andrew

courte...@jpmorgan.com wrote in message
<35FE1888...@jpmorgan.com>...

>Andrew,
>
>There are a couple things that could be happening:
>
>1. How many J% records do you have? If this values is more than about 30%
of all
>values the index will NOT be used especially because there is an index on
status
>that the optimizer might be switching to. What happens when you do like
'J0%' or
>'J00%'... ?

>
>2. The optimizer might decide to check first the status value and then the
style
>value but still using the same indexes (just not in the same order...)
>
>A like statement undes MS SQL and Sybase act basically the same way... My
first
>approach would be to check how many J% you have. You could also try K%
instead
>and see what happens

>
>
>
>Andrew Klein wrote:
>
>> The following query takes about 1 second and gives a count of 15.
>>

>> Select count(*) from inventory,styles where inventory.styleid =
styles.id

>> and styles.style ='J00362B' and status = '1'
>>
>> The statistics show I/O estimate 1969.
>>
>> BUT his query takes FOREVER and gives a count of 383 which is correct.
>>

Andrew Klein

unread,

Sep 15, 1998, 3:00:00 AM9/15/98

to

Yes the count is correct.

Thanks,
Andrew

Anthony Mandic wrote in message
<35FE318A...@agd.nsw.gov.au_hormel.free>...

>Andrew Klein wrote:
>>
>> The following query takes about 1 second and gives a count of 15.
>

> I gather that this count is correct?
>
>...

>> BUT his query takes FOREVER and gives a count of 383 which is correct.
>>
>> Select count(*) from inventory,styles where inventory.styleid = styles.id
>> and styles.style like 'J%' and status = '1'

>...

>> The inventory table has 132,853 records, and has a index on styleid and
>> status.
>> The style table has 36,367, and has a index on style.
>

> The styles table should have an index on id and style (or the
> other way around, depending on the distribution of values).
> [Note: these comments are made on the presumption that SQLAW's
> use of indexes behaves similar to Sybase's Adaptive Server.]
>
>-am

Andrew Klein

unread,

Sep 15, 1998, 3:00:00 AM9/15/98

to

Thanks for your suggestion about the optimizer.
My query was cut from 40 seconds to 10 seconds by changing

styles.style like 'J%'
to
(styles.style like 'J%',1)

Now I hope I can program a Powerbuilder datawindow to recoginze this syntax.

Granted, MS Access still performs this query in 1 or 2 seconds but for now I
will take what can get.

Thanks Again
Andrew

Jan Rosinowski wrote in message <35fe055f.596183857@bigXb>...

>if status is defined in inventory a subquery might improve things.
>otherwise use a concatenated index (status,style) (the fixed one first
>of course).
>
>if sqlaw constantly uses the "wrong" indices you might want to enforce
>your selection by assigning selectivity probabilities or estimates
>(see chapter 'how the optimizer works')
>

>>Select count(*) from inventory,styles where inventory.styleid = styles.id
>>and styles.style like 'J%' and status = '1'
>
>
>>

>>The statistics show I/O estimate 16232 and the same indexes that are uses
in
>>the 1st query.
>>
>>It seems that the 'like operator' takes a long amouny of time.
>>
>>Note, when I import the data into Micrrosoft Access, the query only takes
a
>>second
>>which seems normal. The same is true when I import it into Foxpro.
>>

>>The inventory table has 132,853 records, and has a index on styleid and
>>status.
>>The style table has 36,367, and has a index on style.

Gus

unread,

Sep 15, 1998, 3:00:00 AM9/15/98

to

Having foobarred once, I am intrigued about what it is that you have
done. I understand that you have changed the where clause to read:

where styles.status = 1
and (styles.style like 'J%',1)

This is a syntax I have never encountered. Briefly, what does it do? I
assume it is proprietary to sqlaw, and that it gives an indication of
the distribution of the data. Is this figure a percentage of rows
returned, an index number to use, what? Is it portable to any other
database?

Thanks
gus

Jan Rosinowski

unread,

Sep 15, 1998, 3:00:00 AM9/15/98

to

>I have 383 J records with a status of 1.
>K% behaves the same way.
>When I do JO% and JOO% it is still slow. But when I do J00362% is performs
>quickly.

have a look at the different plans.. provide selectivity-estimates
(see section "how the optimizer works") to force indexusage and it'll
run.

Andrew Klein

unread,

Sep 15, 1998, 3:00:00 AM9/15/98

to

The number 1 represents the % of rows that the expression is estimated to
select.

Paul A. Horan

unread,

Sep 15, 1998, 3:00:00 AM9/15/98

to

Andrew,
You can definitely put these hints in the WHERE clauses of PowerBuilder
datawindows. (Do it all the time).

Paul Horan
VCI
Springfield, MA

Andrew Klein wrote in message <6tlrms$i...@sjx-ixn5.ix.netcom.com>...