Re: ORDER BY and LIMIT

111 views
Skip to first unread message

Noel Grandin

unread,
Nov 2, 2012, 4:29:35 AM11/2/12
to h2-da...@googlegroups.com, Jan Møller
you need

CREATE INDEX index2 ON myTable(x)

in order to make the ORDER BY run fast.

On 2012-11-02 09:50, Jan M�ller wrote:
> CREATE INDEX index2 ON myTable(id, x)

Jan Møller

unread,
Nov 2, 2012, 4:48:08 AM11/2/12
to h2-da...@googlegroups.com, Jan Møller
I tried that initially, but that does not seem to work
In the article: http://www.mysqlperformanceblog.com/2006/09/01/order-by-limit-performance-optimization/
He claims that you should have the index cover both the WHERE clause and the ORDER BY clause to get good performance. This is for MySql, so I don't know if it applies to H2.
 
On Friday, November 2, 2012 9:29:45 AM UTC+1, Noel Grandin wrote:
you need

   CREATE INDEX index2 ON myTable(x)

in order to make the ORDER BY run fast.

Noel Grandin

unread,
Nov 2, 2012, 4:53:46 AM11/2/12
to h2-da...@googlegroups.com, Jan Møller
What is it about people, that they show up here asking for help, and then somebody gives them a suggestion and they tell me they know better, without even bothering to try it out.

If you already know the answer, then why are you asking?
--
You received this message because you are subscribed to the Google Groups "H2 Database" group.
To view this discussion on the web visit https://groups.google.com/d/msg/h2-database/-/lk-D69JsaTQJ.
To post to this group, send email to h2-da...@googlegroups.com.
To unsubscribe from this group, send email to h2-database...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/h2-database?hl=en.

Sergi Vladykin

unread,
Nov 2, 2012, 5:33:09 AM11/2/12
to h2-da...@googlegroups.com
Hi,

You can create index on (id, x) instead of two indexes (id) + (x) and then do select just as SELECT value, height FROM myTable WHERE id =? LIMIT ?
Because you selecting only one id and index already has correct order of x under each id it will return correct result.
It is a still kinda dirty hack since SQL does not guarantee any ordering without ORDER BY clause, but this allows to avoid sorting at query time at all.

Sergi


On Friday, November 2, 2012 11:50:08 AM UTC+4, Jan Møller wrote:
I am new to H2 and this group. So far H2 has been great for my project, but I have hit a show stopper.
I have a table with a very large set of records where I need to do retrieve an ordered set of records with a limit.

Here is my table:
CREATE TABLE myTable(id VARCHAR(36), value VARCHAR(64), x INT)
The table has no primary key as I need to be able to insert rows with duplicate IDs

I have created an index to make selecting by ID fast
CREATE INDEX index1 ON myTable(id)");

I would like to select from this table and order by x while having a limit as the number of records goes into the millions

SELECT value, height FROM myTable WHERE id =? ORDER BY x LIMIT ?
This takes ages, but eventually succeeds.

A select like this is fast:
SELECT value, x FROM myTable WHERE id =? LIMIT ?
But I don't get my ordering

that for MySql you should have an index on the column that you order by. 

So I create this index:
CREATE INDEX index2 ON myTable(id, x)
However this seems not to help.

Is it possible to make this table and query perform well on H2? 
Note: The value of x is increasing monotonously as records are added (there may be rows where id1 != id2 && x1 == x2). Don't know if this can be used for something.

Any help appreciated.

Jan Møller

unread,
Nov 2, 2012, 5:35:42 AM11/2/12
to h2-da...@googlegroups.com, Jan Møller
I am not sure what I said to upset you. I don't know the answer. This is why I am asking.
I have now tried with various indexes
1) CREATE INDEX index1 ON myTable(id)
2) CREATE INDEX index1 ON myTable(x)
3) CREATE INDEX index1 ON myTable(id, x) 

I have made a testbed where I insert 1,000,000 records and do my select with various combinations of indexes. Here are the results sorted by speed:
Index 1, 2, 3: 13516 ms
index 1, 3: 3560 ms
index 1, 2: 3716 ms
index 2, 3: 3285 ms
index 1: 2776 ms
No indexes: 2637 ms

I am really surprised by the results. For some reason I get the fastest query with no indexes at all.

Jan Møller

unread,
Nov 2, 2012, 5:41:29 AM11/2/12
to h2-da...@googlegroups.com
Interesting.
So what you are saying is that because the value of x is a number then the ordering is automatically preserved because of the index on (id, x)?
Sounds like a hack, but I'll try it out.

Sergi Vladykin

unread,
Nov 2, 2012, 5:50:56 AM11/2/12
to h2-da...@googlegroups.com
It is not because x is a number, but because index is a data structure sorted by given fields.

Sergi

Thomas Mueller

unread,
Nov 2, 2012, 5:51:53 AM11/2/12
to h2-da...@googlegroups.com
Hi,

Yes, if you have a condition "id = ?", then you actually need an index on the columns (id, x), and order by (id, x). The problem is that the optimizer of H2 might not understand it should use *this* index if you also have an index on just (id) (because the index on just id seems to be the best one initially). This is a limitation of the current optimizer implementation (patches are welcome of course!). What you need to have is "index sorted" in the explain plan:

drop table if exists test;
create table test(id int, name varchar(255));
-- create index on test(id);
@loop 1000 insert into test values(?, 'hello');
explain select * from test
where id > 1000 order by name limit 1000;
create index on test(id, name);
analyze;
explain select * from test
where id > 1000 and name > '' order by id, name limit 1000;

See also: http://h2database.com/html/performance.html#storage_and_indexes

Regards,
Thomas




If it does't work for you, could you post a simple, standalone test case please (a SQL script would be nice)

On Fri, Nov 2, 2012 at 8:50 AM, Jan Møller <jan.m...@gmail.com> wrote:
I am new to H2 and this group. So far H2 has been great for my project, but I have hit a show stopper.
I have a table with a very large set of records where I need to do retrieve an ordered set of records with a limit.

Here is my table:
CREATE TABLE myTable(id VARCHAR(36), value VARCHAR(64), x INT)
The table has no primary key as I need to be able to insert rows with duplicate IDs

I have created an index to make selecting by ID fast
CREATE INDEX index1 ON myTable(id)");

I would like to select from this table and order by x while having a limit as the number of records goes into the millions

SELECT value, height FROM myTable WHERE id =? ORDER BY x LIMIT ?
This takes ages, but eventually succeeds.

A select like this is fast:
SELECT value, x FROM myTable WHERE id =? LIMIT ?
But I don't get my ordering

that for MySql you should have an index on the column that you order by. 

So I create this index:
CREATE INDEX index2 ON myTable(id, x)
However this seems not to help.

Is it possible to make this table and query perform well on H2? 
Note: The value of x is increasing monotonously as records are added (there may be rows where id1 != id2 && x1 == x2). Don't know if this can be used for something.

Any help appreciated.

--
You received this message because you are subscribed to the Google Groups "H2 Database" group.
To view this discussion on the web visit https://groups.google.com/d/msg/h2-database/-/K3xGgnDPBkYJ.

Jan Møller

unread,
Nov 2, 2012, 6:17:52 AM11/2/12
to h2-da...@googlegroups.com
It may be an ugly hack, but... it works!
I need the result in reverse order, but I can achieve that by making x negative.
I guess this will also work if I have a WHERE clause with multiple IDs, I'll have to investigate that.
Thanks

Jan Møller

unread,
Nov 2, 2012, 6:18:33 AM11/2/12
to h2-da...@googlegroups.com
Thanks for the explanation, I appreciate that.
Reply all
Reply to author
Forward
0 new messages