Performance issue with Hive-MR3 on Kubernetes (and S3)

Fiorella Piriz

Nov 22, 2023, 1:17:41 AM

to MR3

Hi,

We have deployed Hive-MR3 on OpenShift (with 3 master and 8 worker nodes), following the instructions at https://mr3docs.datamonad.com/docs/quick/k8s/run-helm-k8s/.

We have configured:

<property>
  <name>mr3.k8s.worker.total.max.memory.gb</name>
  <value>256</value>
</property>

<property>
  <name>mr3.k8s.worker.total.max.cpu.cores</name>
  <value>64</value>
</property>

<property>
  <name>mr3.am.resource.memory.mb</name>
  <value>16384</value>
</property>

<property>
  <name>mr3.am.resource.cpu.cores</name>
  <value>4</value>
</property>

<property>
  <name>mr3.am.local.resourcescheduler.max.memory.mb</name>
  <value>262144</value>
</property>

<property>
  <name>mr3.am.local.resourcescheduler.max.cpu.cores</name>
  <value>64</value>
</property>

For writing to S3 we have kept the default configuration, and the same goes for Tez.

<property>
  <name>fs.s3a.connection.maximum</name>
  <value>4000</value>
</property>

<property>
  <name>fs.s3.maxConnections</name>
  <value>4000</value>
</property>

<property>
  <name>fs.s3a.threads.max</name>
  <value>250</value>
</property>

<property>
  <name>fs.s3a.threads.core</name>
  <value>250</value>
</property>

<!-- S3 write performance -->

<property>
  <name>hive.mv.files.thread</name>
  <value>15</value>
</property>

<property>
  <name>fs.s3a.max.total.tasks</name>
  <value>5</value>
</property>

Our metastore is located in S3 and the transient data goes to an NFS server.

However, our queries take too long and some of them get stuck.

For example, a simple count on an external partitioned table (about 40 million rows per day, in about 457 Avro files of 100Mb each) takes more than 12 minutes in Hive to query one day, but only 4 minutes in Trino.

What we see is that Hive reads with many threads at once and overwhelms the storage system, which cannot keep up with so many I/O requests. This does not happen with Trino, which is able to read up to 1Gb per second, while Hive does not exceed 300Mb per second.

Does anyone have any hints?

Thank you,

Fiorella

Sungwoo Park

Nov 22, 2023, 1:46:55 AM

to MR3

Hello,

First, it's great to see that you have got Hive-MR3 up and running.

As far as I know, Trino is very efficient in accessing S3 because it uses its own custom S3 connector. In contrast, Hive (like other systems that use the Hadoop library, such as Spark) relies on the Hadoop library to access S3, which is less efficient. For example, it makes more S3 API calls than Trino.

Another thing is that Trino is usually faster than Hive on simple queries (like the one you mentioned). For complex queries with heavy joins, you will find that Trino is slower, or sometimes even fails. Please see this blog for the performance comparison: https://www.datamonad.com/post/2023-05-31-trino-spark-hive-performance-1.7/

So, in your experiment, my guess is that Trino is much faster because 1) it uses a more efficient S3 connector, 2) Hive's configuration parameters for S3 are not properly tuned for your S3 environment, and 3) the query is simple. The default configuration parameters for S3 may be far from optimal, so I suggest you experiment with different values (e.g., by decreasing fs.s3a.connection.maximum, fs.s3.maxConnections, fs.s3a.threads.max, fs.s3a.threads.core).
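
A minimal sketch of such an experiment follows. The values are arbitrary illustrative starting points (half of the settings quoted earlier in this thread), not tested recommendations; the right values depend on the S3 backend and have to be found by measuring.

```xml
<!-- Illustrative values only (assumption): halved from the settings quoted
     above to reduce the number of concurrent S3 requests per process. -->
<property>
  <name>fs.s3a.connection.maximum</name>
  <value>2000</value>
</property>

<property>
  <name>fs.s3.maxConnections</name>
  <value>2000</value>
</property>

<property>
  <name>fs.s3a.threads.max</name>
  <value>125</value>
</property>

<property>
  <name>fs.s3a.threads.core</name>
  <value>125</value>
</property>
```

Changing one parameter at a time and re-running the same query while watching IOPS and throughput on the storage side makes it easier to attribute any difference to a specific setting.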

For a simple counting query, Hive is slower, but this does not mean that Hive is always slower. It is a price to pay for choosing the so-called MapReduce architecture to achieve fault tolerance, which requires split computation and so on. If you would like to evaluate Trino vs Hive-MR3, I suggest more complex queries. (David in the MR3 Google group might have some comments because they have been using Hive-MR3-Kubernetes-S3 in production for over a year.)

--- Sungwoo

David Engel

Nov 22, 2023, 11:43:07 AM

to Sungwoo Park, MR3

For reference, we had to bump the default fs.s3a.connection.maximum up
to 20000. On some queries, we have to set it even higher.
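
As a sketch, that setting would look like this, assuming the S3A properties are kept in core-site.xml (the file location is an assumption, not something stated in this thread):

```xml
<!-- Value reported above as the baseline in one production deployment;
     not a general recommendation. Some queries needed an even higher value. -->
<property>
  <name>fs.s3a.connection.maximum</name>
  <value>20000</value>
</property>
```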

David

--
David Engel
da...@istwok.net

Fiorella Piriz

Nov 22, 2023, 12:26:34 PM

to David Engel, Sungwoo Park, MR3

Thank you for answering, David. Interesting. Could you guide me on the following S3 parameters? Did you have to change any other values?

We still haven't managed to decrease IOPS and increase the read throughput.

<property>
  <name>fs.s3a.connection.maximum</name>
  <value>4000</value>
</property>

<property>
  <name>fs.s3.maxConnections</name>
  <value>4000</value>
</property>

<property>
  <name>fs.s3a.threads.max</name>
  <value>250</value>
</property>

<property>
  <name>fs.s3a.threads.core</name>
  <value>250</value>
</property>

<property>
  <name>fs.s3a.block.size</name>
  <value>128M</value>
</property>

Sungwoo Park

Nov 22, 2023, 12:35:53 PM

to MR3

Is there a chance that this slow S3 access is due to the use of Avro files? For example, do you see any difference in throughput when reading ORC files?

--- Sungwoo
