I have uploaded Hive 4 on MR3, built from the latest code in the master branch of Apache Hive (as of yesterday).
Here are a few comments.
1. I disabled several optimizations implemented after April 2020, which are buggy and do not bring a significant performance improvement. You can check the last section in conf/tpcds/hive4/hive-site.xml to see which optimizations are disabled (e.g., hive.optimize.shared.work.dppunion).
2. I tested with TPC-DS 1TB ORC and all 99 queries return correct results.
3. When tested with TPC-DS 1TB, Hive 4 on MR3 is no faster than Hive 3 on MR3 on average (mostly because of increased query compilation time). This is bad news because when we tested in April 2020, Hive 4 on MR3 was noticeably faster than Hive 3 on MR3. I suspect that several optimizations implemented since then have actually degraded performance.
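For reference, an entry that disables one of the optimizations mentioned in item 1 would look roughly like the fragment below. This is only a sketch in the standard hive-site.xml property format. It shows the one property named above and assumes that disabling means setting it to false. The full list is in the last section of conf/tpcds/hive4/hive-site.xml.

```xml
<!-- Sketch only: disables the single optimization named above. -->
<!-- Assumption: "disabled" means the boolean property is set to false. -->
<property>
  <name>hive.optimize.shared.work.dppunion</name>
  <value>false</value>
</property>
```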
Cheers,
-- Sungwoo