Views and H2

39 views
Skip to first unread message

Cecil Westerhof

unread,
Sep 7, 2013, 7:58:48 AM9/7/13
to h2-da...@googlegroups.com
At the moment I am using a MySQL database where I have the following definitions:

    CREATE TABLE IF NOT EXISTS userProcesses (
      `id`            int(11)       NOT NULL auto_increment,
      `dateTime`      varchar(25)   NOT NULL,
      `user`          varchar(25)   NOT NULL,
      `noOfProcesses` varchar(25)   NOT NULL,
    
      PRIMARY KEY            (`id`),
      UNIQUE  KEY `dateUser` (`dateTime`, `user`),
    );
    
    CREATE OR REPLACE VIEW totalProcesses AS
    SELECT
      dateTime,
      SUM(noOfProcesses) AS total
    FROM
      userProcesses
    GROUP BY
      dateTime
    ;

The problem is that in the view the index from userProcesses is not used because the GROUP BY. (I now do not use the view, makes my query about 40 times faster.) Would H2 not have this problem? That would be a good excuse to migrate from MySQL to H2.

--
Cecil Westerhof

Cecil Westerhof

unread,
Sep 7, 2013, 10:19:20 AM9/7/13
to h2-da...@googlegroups.com
2013/9/7 Cecil Westerhof <cldwes...@gmail.com>
In H2 I can use the view. It is a little less fast as without the view, but it makes the code more readable. The strange thing is that MySQL is a lot faster (when not using the view) and I understood that H2 should be faster as MySQL.

The following query takes 4.5 seconds in MySQL:
    SELECT
      SUBSTR(dateTime, 1, 10)         AS thisDate,
      MIN(total)                      AS min,
      MAX(total)                      AS max,
      ROUND(AVG(total), 2)            AS avg,
      COUNT(*)                        AS total
    FROM (

        SELECT
          dateTime,
          SUM(noOfProcesses) AS total
        FROM
          userProcesses
        WHERE
          SUBSTR(dateTime, 1, 10) >= '2013-09-01' AND
          SUBSTR(dateTime, 1, 10) <= '2013-09-06'
        GROUP BY
          dateTime
    ) totalProcesses
    GROUP BY
      thisDate
    ORDER BY
      thisDate
    ;

In H2 I made the following table and view:

    CREATE TABLE IF NOT EXISTS userProcesses (
      `dateTime`      varchar(25)   NOT NULL,
      `user`          varchar(25)   NOT NULL,
      `noOfProcesses` int(11)       NOT NULL,
   
      PRIMARY  KEY (`dateTime`, `user`)

    );
    CREATE OR REPLACE VIEW totalProcesses AS
    SELECT
      dateTime,
      SUM(noOfProcesses) AS total
    FROM
      userProcesses
    GROUP BY
      dateTime
    ;

I removed id there it is not needed and changed noOfProcesses to an int. But that should not make things slower I would think.

I imported the MySQL data into H2.

The query without the view takes 18,5 seconds:
    SELECT
      SUBSTR(dateTime, 1, 10)         AS thisDate,
      MIN(total)           AS min,
      MAX(total)           AS max,
      ROUND(AVG(total), 2) AS avg,
      COUNT(*)             AS total
    FROM (

        SELECT
          dateTime,
          SUM(noOfProcesses) AS total
        FROM
          userProcesses
        WHERE
          SUBSTR(dateTime, 1, 10) >= '2013-09-01' AND
          SUBSTR(dateTime, 1, 10) <= '2013-09-06'
        GROUP BY
          dateTime
    ) totalProcesses
    GROUP BY
      thisDate
    ORDER BY
      thisDate
    ;

The query with the view takes 38 seconds:
    SELECT
      SUBSTR(dateTime, 1, 10)         AS thisDate,
      MIN(total)           AS min,
      MAX(total)           AS max,
      ROUND(AVG(total), 2) AS avg,
      COUNT(*)             AS total
    FROM totalProcesses
    WHERE
      SUBSTR(dateTime, 1, 10) >= '2013-09-01' AND
      SUBSTR(dateTime, 1, 10) <= '2013-09-06'
    GROUP BY
      thisDate
    ORDER BY
      thisDate
    ;

Am I doing something wrong? The table contains almost 2,5 million records.

--
Cecil Westerhof

Cecil Westerhof

unread,
Sep 10, 2013, 5:47:15 AM9/10/13
to h2-da...@googlegroups.com
I did something wrong. When removing the SUBSTR the time went below half a second:

    SELECT
      SUBSTR(dateTime, 1, 10)         AS thisDate,
      MIN(total)           AS min,
      MAX(total)           AS max,
      ROUND(AVG(total), 2) AS avg,
      COUNT(*)             AS total
    FROM totalProcesses
    WHERE
      dateTime >= '2013-09-01' AND
      dateTime <= '2013-09-06 23:59:59'

    GROUP BY
      thisDate
    ORDER BY
      thisDate
    ;

So that is a significant time-saving. Maybe a good idea to have a warning about this by the documentation?

Removing the SUBSTR by MySQL is also interesting, but less. It went from a little above 6 seconds to below 5 seconds. So a lot less improvement. I think I am going to migrate to H2. Fetching is about 10 times as fast. Just have to check the speed of storing.



2013/9/7 Cecil Westerhof <cldwes...@gmail.com>



--
Cecil Westerhof
Reply all
Reply to author
Forward
0 new messages