Select the record with the max value.

428 views
Skip to first unread message

JonoPorter

unread,
Jun 9, 2008, 12:04:36 PM6/9/08
to H2 Database
What would be in H2 the best way to select a row with the max value of
a column?

This article describes the problem well:
http://www.xaprb.com/blog/2006/12/07/how-to-select-the-firstleastmax-row-per-group-in-sql/

I was wondering if there is a 'best' way to do this with H2.

gingda

unread,
Jun 9, 2008, 5:29:22 PM6/9/08
to H2 Database
Is this question academic or do you have a particular query you are
trying to implement?

I believe that any of the queries listed down to the point where the
author starts talking about MySQL specific solutions will work.

I think that we have used the sub-query strategy.

On Jun 9, 11:04 am, JonoPorter <Jono.Por...@gmail.com> wrote:
> What would be in H2 the best way to select a row with the max value of
> a column?
>
> This article describes the problem well:http://www.xaprb.com/blog/2006/12/07/how-to-select-the-firstleastmax-...

Thomas Mueller

unread,
Jun 10, 2008, 4:07:18 PM6/10/08
to h2-da...@googlegroups.com
Hi,

Optimizations are not fully documented in H2 yet. I will improve the
documentation. My plan is to add the following code samples in the
next release. You can run the script with the H2 Console. Please tell
me if this doesn't answer your question.

-------------------------------------------------------------------------------
-- Optimize Count Star
-------------------------------------------------------------------------------
-- This code snippet shows how to quickly get the the number of rows in a table.

-- Initialize the data
CREATE TABLE TEST(ID INT PRIMARY KEY);
INSERT INTO TEST SELECT X FROM SYSTEM_RANGE(1, 1000);

-- Query the count
SELECT COUNT(*) FROM TEST;
--> 1000
;

-- Display the query plan - 'direct lookup' means the index is used
EXPLAIN SELECT COUNT(*) FROM TEST;
--> SELECT COUNT(*)
--> FROM PUBLIC.TEST /* PUBLIC.TEST_TABLE_SCAN */
--> /* direct lookup */
;

DROP TABLE TEST;

-------------------------------------------------------------------------------
-- Optimize Distinct
-------------------------------------------------------------------------------
-- This code snippet shows how to quickly get all distinct values
-- of a column for the whole table.

-- Initialize the data
CREATE TABLE TEST(ID INT PRIMARY KEY, TYPE INT);
CALL RAND(0);
--> 0.730967787376657
;
INSERT INTO TEST SELECT X, MOD(X, 10) FROM SYSTEM_RANGE(1, 1000);

-- Create an index on the column TYPE
CREATE INDEX IDX_TEST_TYPE ON TEST(TYPE);

-- Calculate the selectivity - otherwise it will not be optimized
ANALYZE;

-- Query the distinct values
SELECT DISTINCT TYPE FROM TEST ORDER BY TYPE LIMIT 3;
--> 0
--> 1
--> 2
;

-- Display the query plan - 'IDX_TEST_TYPE' means the index is used
EXPLAIN SELECT DISTINCT TYPE FROM TEST ORDER BY TYPE LIMIT 3;
--> SELECT DISTINCT TYPE
--> FROM PUBLIC.TEST /* PUBLIC.IDX_TEST_TYPE */
--> ORDER BY 1
--> LIMIT 3
--> /* distinct */
;

DROP TABLE TEST;

-------------------------------------------------------------------------------
-- Optimize Min Max
-------------------------------------------------------------------------------
-- This code snippet shows how to quickly get the smallest and largest value
-- of a column for each group.

-- Initialize the data
CREATE TABLE TEST(ID INT PRIMARY KEY, VALUE DECIMAL(100, 2));
CALL RAND(0);
--> 0.730967787376657
;
INSERT INTO TEST SELECT X, RAND()*100 FROM SYSTEM_RANGE(1, 1000);

-- Create an index on the column VALUE
CREATE INDEX IDX_TEST_VALUE ON TEST(VALUE);

-- Query the largest and smallest value - this is optimized
SELECT MIN(VALUE), MAX(VALUE) FROM TEST;
--> 0.01 99.89
;

-- Display the query plan - 'direct lookup' means it's optimized
EXPLAIN SELECT MIN(VALUE), MAX(VALUE) FROM TEST;
--> SELECT MIN(VALUE), MAX(VALUE)
--> FROM PUBLIC.TEST /* PUBLIC.TEST_TABLE_SCAN */
--> /* direct lookup */
;

DROP TABLE TEST;

-------------------------------------------------------------------------------
-- Optimize Grouped Min Max
-------------------------------------------------------------------------------
-- This code snippet shows how to quickly get the smallest and largest value
-- of a column for each group.

-- Initialize the data
CREATE TABLE TEST(ID INT PRIMARY KEY, TYPE INT, VALUE DECIMAL(100, 2));
CALL RAND(0);
--> 0.730967787376657
;
INSERT INTO TEST SELECT X, MOD(X, 5), RAND()*100 FROM SYSTEM_RANGE(1, 1000);

-- Create an index on the columns TYPE and VALUE
CREATE INDEX IDX_TEST_TYPE_VALUE ON TEST(TYPE, VALUE);

-- Analyze to optimize the DISTINCT part of the query query
ANALYZE;

-- Query the largest and smallest value - this is optimized
SELECT TYPE, (SELECT VALUE FROM TEST T2 WHERE T.TYPE = T2.TYPE LIMIT 1) MIN
FROM (SELECT DISTINCT TYPE FROM TEST) T ORDER BY TYPE;
--> 0 0.42
--> 1 0.14
--> 2 0.01
--> 3 0.40
--> 4 0.44
;

-- Display the query plan - 'direct lookup' means it's optimized
EXPLAIN SELECT TYPE, (SELECT VALUE FROM TEST T2 WHERE T.TYPE = T2.TYPE
LIMIT 1) MIN
FROM (SELECT DISTINCT TYPE FROM TEST) T ORDER BY TYPE;
--> SELECT TYPE, (SELECT VALUE
--> FROM PUBLIC.TEST T2 /* PUBLIC.IDX_TEST_TYPE_VALUE: TYPE = T.TYPE */
--> WHERE T.TYPE = T2.TYPE
--> LIMIT 1) AS MIN
--> FROM (SELECT DISTINCT TYPE
--> FROM PUBLIC.TEST /* PUBLIC.IDX_TEST_TYPE_VALUE */
--> /* distinct */) T /* SELECT DISTINCT TYPE
--> FROM PUBLIC.TEST /++ PUBLIC.IDX_TEST_TYPE_VALUE ++/
--> /++ distinct ++/ */
--> ORDER BY 1
;

DROP TABLE TEST;

-------------------------------------------------------------------------------
-- Optimize Top N --
-------------------------------------------------------------------------------
-- This code snippet shows how to quickly get the smallest and largest N
-- values of a column for the whole table.

-- Initialize the data
CREATE TABLE TEST(ID INT PRIMARY KEY, TYPE INT, VALUE DECIMAL(100, 2));
CALL RAND(0);
--> 0.730967787376657
;
INSERT INTO TEST SELECT X, MOD(X, 100), RAND()*100 FROM SYSTEM_RANGE(1, 1000);

-- Create an index on the column VALUE
CREATE INDEX IDX_TEST_VALUE ON TEST(VALUE);

-- Query the smallest 10 values
SELECT VALUE FROM TEST ORDER BY VALUE LIMIT 3;
--> 0.01
--> 0.14
--> 0.16
;

-- Display the query plan - 'IDX_TEST_VALUE' means the index is used
EXPLAIN SELECT VALUE FROM TEST ORDER BY VALUE LIMIT 10;
--> SELECT VALUE
--> FROM PUBLIC.TEST /* PUBLIC.IDX_TEST_VALUE */
--> ORDER BY 1
--> LIMIT 10
;

-- To optimize getting the largest values, a new descending index is required
CREATE INDEX IDX_TEST_VALUE_D ON TEST(VALUE DESC);

-- Query the largest 10 values
SELECT VALUE FROM TEST ORDER BY VALUE DESC LIMIT 3;
--> 99.89
--> 99.73
--> 99.68
;

-- Display the query plan - 'IDX_TEST_VALUE_D' means the index is used
EXPLAIN SELECT VALUE FROM TEST ORDER BY VALUE DESC LIMIT 10;
--> SELECT VALUE
--> FROM PUBLIC.TEST /* PUBLIC.IDX_TEST_VALUE_D */
--> ORDER BY 1 DESC
--> LIMIT 10
;

DROP TABLE TEST;

Regards
Thomas

Reply all
Reply to author
Forward
0 new messages