Hive on MR3 on YARN always uses the minimum resources configured for the YARN queue

Carol Chapman

Apr 24, 2022, 11:58:02 AM

to MR3

Hi.

I have found that when Hive on MR3 runs on YARN, it can only use the minimum resources configured for its YARN queue.

For example, I use the YARN queue “crowd”; its configuration is shown in the figure below:

In theory, this queue can use all of the YARN resources. Indeed, when I use Apache Hive, I do have access to all YARN resources. But when I use Hive on MR3, I can only use 20% of the resources, which is the minimum configured for the queue.

Is this problem caused by a key configuration item that I am missing, or by something else?

Sungwoo Park

Apr 25, 2022, 12:32:13 AM

to MR3

Unlike Tez, MR3 honors the resource limit of the queue it belongs to, and sends container requests only within this limit. There are pros and cons to this decision, but it is not a bug in MR3 (in fact, it is almost necessary for running MR3 on Kubernetes with autoscaling).

This resource limit is determined when MR3 starts. You can check the limit in the log:

LOG.info(s"Registering DAGAppMaster $name at $appHostName:$appHostPort $maxContainerResource")
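To locate that line in a saved copy of the DAGAppMaster log, a minimal Python sketch might look like the following. It assumes only that the message contains the literal text `Registering DAGAppMaster`, as in the statement above. The function name and the sample lines (including the placeholder for the resource value) are hypothetical, since this thread does not show how `$maxContainerResource` is rendered.

```python
from typing import Iterable, List

# Literal text taken from the LOG.info statement quoted above.
MARKER = "Registering DAGAppMaster"


def find_registration_lines(lines: Iterable[str]) -> List[str]:
    """Return every log line that contains the registration message."""
    return [line.rstrip("\n") for line in lines if MARKER in line]


# Hypothetical sample; the real host, port, and resource format will differ.
sample_log = [
    "INFO  something unrelated\n",
    "INFO  Registering DAGAppMaster mr3-app at host1:12345 <resource shown here>\n",
]

matches = find_registration_lines(sample_log)
for line in matches:
    print(line)
```

For a real log, the same function can be given an open file handle instead of the sample list.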

So, Hive-MR3 is working as intended. If you would like Hive-MR3 to use 100% of the resources, a simple trick is: 1) set the capacity of the queue 'crowd' to 100%, 2) start Hive-MR3, 3) reset the capacity back to 20%. Check the log to see if the resource limit is set to 100% of the cluster resources.
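Why the trick works can be illustrated with a toy model in Python. This is not MR3 code: the class name, the memory figures, and the idea of expressing the limit as a single memory number are all assumptions made for illustration. It only mirrors the behavior described above, namely that the limit is determined once at start-up and is not recomputed when the queue capacity changes later.

```python
class StartupLimitModel:
    """Toy model: the resource limit is fixed when the application starts."""

    def __init__(self, cluster_memory_gb: int, queue_capacity_percent: int):
        # The limit is captured once, from the queue capacity at start-up.
        self.limit_gb = cluster_memory_gb * queue_capacity_percent // 100

    def requestable(self, wanted_gb: int) -> int:
        # Requests are capped at the limit captured at start-up.
        return min(wanted_gb, self.limit_gb)


CLUSTER_GB = 1000  # hypothetical cluster size

# Started while the queue capacity is 20%.
started_at_20 = StartupLimitModel(CLUSTER_GB, 20)

# Steps 1 and 2 of the trick: started while the queue capacity is 100%.
started_at_100 = StartupLimitModel(CLUSTER_GB, 100)
# Step 3 (resetting the queue to 20%) does not change started_at_100.limit_gb.

print(started_at_20.requestable(800))   # 200
print(started_at_100.requestable(800))  # 800
```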

If you can determine the maximum amount of resources MR3 can consume, I could add a new configuration key in the next release of MR3 so that you can manually specify the maximum amount of resources on Hadoop. This is how MR3 behaves on Kubernetes. Let me know if you want this feature.

Cheers,

--- Sungwoo

Carol Chapman

Apr 25, 2022, 10:41:09 AM

to MR3

Yes, we need this feature. We can determine the maximum amount of resources to use.

Sungwoo Park

May 4, 2022, 11:01:34 AM

to MR3

Could you try Hive-MR3 with autoscaling enabled? With autoscaling, MR3 assumes infinite resources on Yarn, so I think Hive-MR3 will try to consume up to 100% of the cluster resources.

https://mr3docs.datamonad.com/docs/mr3/features/autoscaling/

Cheers,

--- Sungwoo
