Performance of Hive-MR3 2.1


Sungwoo Park

Jun 1, 2025, 8:36:57 PM
to MR3
I fixed a couple of almost trivial yet important bugs in MR3. Combined with a new memory optimization (which stores task output in memory when enough free Java heap is available), these fixes make Hive-MR3 as fast as Trino on the 10TB TPC-DS benchmark.

In our last benchmark, Hive-MR3 2.0: 2873 seconds vs Trino 468: 4441 seconds
This benchmark used Hive-MR3 running in standalone mode.

Now, running Hive-MR3 on Hadoop (which is slightly slower than in standalone mode), we see:
Hive-MR3 2.0: 5218 seconds, vs Hive-MR3 2.1 candidate: 4495 seconds.
So the total running time decreased by about 14 percent (723 seconds); equivalently, Hive-MR3 2.1 is about 16 percent faster than 2.0.
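
The relative change between the two Hadoop runs can be computed directly from the reported times. Here is a minimal Python sketch; the function and constant names are illustrative only and are not part of MR3 or Hive:

```python
def relative_decrease(old_seconds: float, new_seconds: float) -> float:
    """Fraction of the old running time that was saved."""
    return (old_seconds - new_seconds) / old_seconds


def speedup(old_seconds: float, new_seconds: float) -> float:
    """How many times faster the new run is than the old one."""
    return old_seconds / new_seconds


# Total running times on Hadoop, as reported in this message (seconds).
HIVE_MR3_2_0_HADOOP = 5218
HIVE_MR3_2_1_HADOOP = 4495

saved = HIVE_MR3_2_0_HADOOP - HIVE_MR3_2_1_HADOOP
print(f"saved: {saved} s")  # saved: 723 s
print(f"decrease: {relative_decrease(HIVE_MR3_2_0_HADOOP, HIVE_MR3_2_1_HADOOP):.1%}")  # decrease: 13.9%
print(f"speedup: {speedup(HIVE_MR3_2_0_HADOOP, HIVE_MR3_2_1_HADOOP):.3f}x")  # speedup: 1.161x
```

The two figures differ because the decrease is measured against the old running time, while the speedup is the ratio of old time to new time.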

I haven't tested Hive-MR3 2.1 in standalone mode yet, but I am pretty confident that it will beat Trino. In concurrent environments, the gap will be much larger.

Sungwoo Park

Jun 1, 2025, 8:40:58 PM
to MR3
Typo:


In our last benchmark, Hive-MR3 2.0: 2873 seconds vs Trino 468: 4441 seconds
-->
In our last benchmark, Hive-MR3 2.0: 4873 seconds vs Trino 468: 4441 seconds