query efficiency question

79 views
Skip to first unread message

Clint Hyde

unread,
Jul 12, 2012, 8:43:52 PM7/12/12
to h2-da...@googlegroups.com
here's the situation:

h2 v.166

I have a DB with about 20 million total rows across all tables. 16
million are in one table.

the query:

select taskID, taskname, filename from tasks where state=1 and priority=3;

this is a distributed multi-processing task controller (grid engine)
program. I need to be aware of the overall state of things, so I need to
do this query once in a while, for all values of 'priority'

this query is quite expensive, takes 20-24 seconds right now, with 16M
rows. it's going to get worse.

all the fields in the query are indexed. Why is it so slow? (fwiw, this
is on a 3GHz 32GB dual-hex-core Mac Pro--blazing fast hardware)

I sure would like this query to run in 1/10 the time (2 sec, and not
dependent on table size).

------------------

related: similar query:

select taskname, count(*) from tasks where state=1 group by taskname

also about 24 seconds

--------------------

other than this set of queries, H2 is doing good for me, mostly it runs
at about 5% of cpu or less.

-- clint

Noel Grandin

unread,
Jul 13, 2012, 3:40:53 AM7/13/12
to h2-da...@googlegroups.com, Clint Hyde
Please do an explain plan on the queries and post the results.

Thomas Mueller

unread,
Jul 13, 2012, 1:31:32 PM7/13/12
to h2-da...@googlegroups.com
Hi,

I guess you need a multi-column indes. See also
http://www.h2database.com/html/performance.html#storage_and_indexes

Regards,
Thomas
> --
> You received this message because you are subscribed to the Google Groups
> "H2 Database" group.
> To post to this group, send email to h2-da...@googlegroups.com.
> To unsubscribe from this group, send email to
> h2-database...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/h2-database?hl=en.
>

Clint Hyde

unread,
Jul 18, 2012, 12:38:40 AM7/18/12
to h2-da...@googlegroups.com
The suggestion of multi-column index turned out to work quite well. I
seem to have a ~ 10% speed gain for the system as a whole, which is great.

There is still one other query that is quite slow, but I've moved it to
run only occasionally, and reworked my task-selection algorithm to do a
better job, so that the query doesn't run as often.

-- clint

Clint Hyde

unread,
Jul 18, 2012, 1:22:57 AM7/18/12
to h2-da...@googlegroups.com
something different had trouble today:

I need to do something that is called cross-table updating, in which I
select rows in one table by using TWO values from another table, and
then I modify a value in the rows that are returned.

H2 seems to have no such functionality

It would be like this (I think mysql can do this)

update table1,table2 set table1.columnX=some_Value where
table1.columnA=table2.columnA and table2.columnQ=5

what I have is that I know a value in table 2, querying using it gets me
40 rows, those rows contain pairs of values I need to use to look in
another table.

but, no can do.

--------------

related: to solve the above problem, I have to do my own loop, get the
value pairs, and use that in a second query to get the correct rows that
I then update.

I wanted to do this:

query = "select ...";
statement.executeQuery(query);
resultSet = statement.getResultSet(params about allowing positioning);
int ct = 0;
while (resultSet.next()) {ct++;} //so I know how many records I have.
for now, this is a small number
resultSet.absolute(0); //back to the beginning
int g1 = resultSet.getInt(1);




and right there I get an Exception about no data, suggesting that
"absolute()" didn't work, so I'm at the end of the dataset and there is
no next record.

Any suggestions? I realize I could execute the query again, but I should
not have to.

-- clint

Dani

unread,
Jul 20, 2012, 3:31:43 AM7/20/12
to h2-da...@googlegroups.com
Hey there,
 
> update table1,table2 set table1.columnX=some_Value where 
> table1.columnA=table2.columnA and table2.columnQ=5 

maybe something like this does the job:

UPDATE
    table1 A 
SET
    A.columnX = some_Value 
WHERE
    A.columnA = (
        SELECT
            B.columnA 
        FROM
            table2 B 
        WHERE
            B.columnQ=5
    );


Dani

Thomas Mueller

unread,
Jul 20, 2012, 3:35:16 AM7/20/12
to h2-da...@googlegroups.com
Hi,

It might be easier to use:
 
  update table1 t1 set columnX = (select some_Value from table2 t2 where ...) 
  where exists (select * from table2 t2 where ...)

The query is a bit more verbose, but it will work in all databases I believe.

resultSet.absolute(0); //back to the beginning

Use beforeFirst() instead.

Regards,
Thomas
Reply all
Reply to author
Forward
0 new messages