query 9


Bart Vandewoestyne

Sep 24, 2014, 5:45:55 AM
to big-...@googlegroups.com
I would like to get back to query 9.  The problem I had with query 9 was that there was not enough memory to do a map join:

2014-09-24 11:30:22     Starting to launch local task to process map join;      maximum memory = 257949696
2014-09-24 11:30:24     Dump the side-table into file: file:/tmp/bart/hive_2014-09-24_11-29-43_078_7743841909616182172-1/-local-10006/HashTable-Stage-5/MapJoin-mapfile21--.hashtable
2014-09-24 11:30:24     Upload 1 File to: file:/tmp/bart/hive_2014-09-24_11-29-43_078_7743841909616182172-1/-local-10006/HashTable-Stage-5/MapJoin-mapfile21--.hashtable
2014-09-24 11:30:24     Dump the side-table into file: file:/tmp/bart/hive_2014-09-24_11-29-43_078_7743841909616182172-1/-local-10006/HashTable-Stage-5/MapJoin-mapfile11--.hashtable
2014-09-24 11:30:24     Upload 1 File to: file:/tmp/bart/hive_2014-09-24_11-29-43_078_7743841909616182172-1/-local-10006/HashTable-Stage-5/MapJoin-mapfile11--.hashtable
2014-09-24 11:30:24     Processing rows:        200000  Hashtable size: 199999  Memory usage:   117970624       percentage:     0.457
2014-09-24 11:30:24     Processing rows:        300000  Hashtable size: 299999  Memory usage:   150920336       percentage:     0.585
Execution failed with exit status: 3
Obtaining error information

Task failed!
Task ID:
  Stage-15

Logs:

/tmp/bart/hive.log
FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask
MapReduce Jobs Launched:
Job 0: Map: 1   Cumulative CPU: 5.78 sec   HDFS Read: 5752704 HDFS Write: 5709249 SUCCESS
Total MapReduce CPU Time Spent: 5 seconds 780 msec
======= q09_hive_RUN_QUERY_0 time =========

real    0m47.995s
user    0m44.539s
sys     0m2.240s
===========================

One solution that overcomes this problem, which I found myself, is to add a hiveLocalSettings.sql file with the following content:

set hive.auto.convert.join=false;

The above solution works: the job runs and I get my query result in about 47 seconds.  However, as Michael pointed out, the preferred solution is to change the 'Client Java Heap Size in Bytes' from 256 MB to >= 1.5 GB in the Hive configuration settings in my Cloudera Manager.  I have tested this with 1.5 GB, 3 GB, 8 GB, 32 GB and 64 GB, always making sure to restart my cluster before re-running the query, and after some minutes I get a 'GC overhead limit exceeded' error:

... and so on ...
2014-09-24 11:09:35,924 Stage-5 map = 0%,  reduce = 0%, Cumulative CPU 95.74 sec
2014-09-24 11:09:36,956 Stage-5 map = 0%,  reduce = 0%, Cumulative CPU 95.74 sec
2014-09-24 11:09:37,988 Stage-5 map = 0%,  reduce = 0%, Cumulative CPU 95.74 sec
2014-09-24 11:09:39,019 Stage-5 map = 0%,  reduce = 0%, Cumulative CPU 95.74 sec
2014-09-24 11:09:40,046 Stage-5 map = 0%,  reduce = 0%, Cumulative CPU 95.74 sec
2014-09-24 11:09:41,076 Stage-5 map = 100%,  reduce = 100%
MapReduce Total cumulative CPU time: 1 minutes 35 seconds 740 msec
Ended Job = job_1411543545275_0006 with errors
Error during job, obtaining debugging information...
Examining task ID: task_1411543545275_0006_m_000000 (and more) from job job_1411543545275_0006

Task with the most failures(4):
-----
Task ID:
  task_1411543545275_0006_m_000000

URL:
  http://sandy-quad-1.sslab.lan:8088/taskdetails.jsp?jobid=job_1411543545275_0006&tipid=task_1411543545275_0006_m_000000
-----
Diagnostic Messages for this Task:
Error: GC overhead limit exceeded

FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Job 0: Map: 1   Cumulative CPU: 5.48 sec   HDFS Read: 5752704 HDFS Write: 5709249 SUCCESS
Job 1: Map: 1  Reduce: 1   Cumulative CPU: 95.74 sec   HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 1 minutes 41 seconds 220 msec
======= q09_hive_RUN_QUERY_0 time =========

real    25m47.102s
user    1m35.230s
sys     0m11.493s
===========================

So it seems this solution is not working for me.

I have 4 nodes in my Hadoop cluster with 64, 40, 32 and 32 GB of memory.  I run the query on the node with 64 GB of memory.

Should I stick with my solution of disabling map joins for query 9, or is there a way I can get Michael's solution up and running and thus benefit from the map join?

Kind regards,
Bart

Michael Frank

Sep 24, 2014, 11:23:46 AM
to big-...@googlegroups.com
Hi Bart,
You have passed the point of your previous problem (the local MapJoin stage) and now face a different issue (previously return code 3, now return code 2).
Both are memory related, but they are affected by different memory-related configuration properties.

Your:

Execution failed with exit status: 3
[..]

FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask

error was caused by not enough memory for the "LOCAL" JVM (the JVM that started your Hive task), whereas "Error, return code 2" indicates a "REMOTE" problem (a JVM started by e.g. YARN on a node to process your Hive MR job stage).

Please increase the memory available to your (YARN) "containers" as well as the memory allowed for Hadoop map and reduce tasks (basic cluster setup).
The default Cloudera installation is very strict in terms of memory per container and per map/reduce job.
You can find some information about this on the main GitHub page https://github.com/intel-hadoop/Big-Bench in the README.md.
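As a rough sketch of that advice: the property names below are the standard Hadoop 2 ones, but the values are illustrative assumptions only and must be tuned to your nodes' RAM, not copied as-is:

```sql
-- Illustrative per-job overrides; adjust to your cluster's RAM.
-- Memory YARN grants to each map/reduce container:
set mapreduce.map.memory.mb=4096;
set mapreduce.reduce.memory.mb=4096;
-- JVM heap inside those containers (leave roughly 20% headroom
-- below the container size for non-heap overhead):
set mapreduce.map.java.opts=-Xmx3276m;
set mapreduce.reduce.java.opts=-Xmx3276m;
```

On the cluster side, yarn.nodemanager.resource.memory-mb and yarn.scheduler.maximum-allocation-mb must be at least as large as the container sizes, otherwise YARN will refuse or cap the containers; in Cloudera Manager these live in the YARN service configuration.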

The GitHub front page for BigBench (and the README.md) contains a FAQ section for your exception and a section concerning basic cluster setup.

    Best regards,
    Michael

    Bart Vandewoestyne

    Sep 24, 2014, 2:07:00 PM
    to big-...@googlegroups.com
    On Wednesday, September 24, 2014 5:23:46 PM UTC+2, Michael Frank wrote:
    [...]

    The GitHub front page for BigBench (and the README.md) contains a FAQ section for your exception and a section concerning basic cluster setup:
    best regards,
    Michael

    Hello Michael,

    Thanks for adding a FAQ like this.  I will take a look at it tomorrow.  It seems like I will have to familiarize myself with more than one Hadoop/Hive/CDH configuration option... ;-)  Interesting stuff!

    Regards,
    Bart

    dwayne lessner

    Jan 21, 2016, 5:59:30 PM
    to Big Data Benchmark for BigBench
    Hi

    I have run into the same problem: I have given more RAM to the client and turned the option to false, but the issue still persists. I am still working on fixing exit status 3. I gave the local Hive Java client 8 GB. How much should it be with a scale factor of 100?

    -DL

    Michael Frank

    Jan 22, 2016, 10:02:16 AM
    to Big Data Benchmark for BigBench
    Hi Dwayne,

    As an example, at SF 3000 (3 TB) I use:
    "Client Java Heap Size in Bytes" = 3340763136 (Starting to launch local task to process map join;    maximum memory = 3340763136)
    mapreduce.input.fileinputformat.split.maxsize=134217728 //128MB
    hive.exec.reducers.bytes.per.reducer=67108864 //64MB

    As you can see, 8 GB should be plenty.
    If you are still experiencing "exit status: 3" despite having a lot of memory dedicated to "Client Java Heap Size in Bytes", you should consider reducing "hive.mapjoin.smalltable.filesize".
    This variable is used by Hive to determine whether a table is processed with a fast local map join or a slow normal join:
    hive.mapjoin.smalltable.filesize=25000000
    hive.mapjoin.localtask.max.memory.usage=0.9
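    For example (illustrative values, assuming the defaults shown above): lowering the threshold makes Hive fall back to a normal join for larger side tables instead of attempting a map join that may not fit in the client heap, and lowering the memory-usage fraction makes the local task abort earlier so the backup common join runs instead of hitting an out-of-memory error:

```sql
-- Illustrative: only tables below this size (bytes) qualify for a local map join.
-- Half the 25 MB default shown above, as a starting point:
set hive.mapjoin.smalltable.filesize=12500000;
-- Abort the local hash-table build once it uses this fraction of the heap,
-- so Hive falls back to the backup common join instead of running out of memory:
set hive.mapjoin.localtask.max.memory.usage=0.55;
```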

    Query 9 is ideal for testing your map-join settings, as it is very sensitive to map-join configuration. If query 9 runs at your desired scale factor and settings, the other queries probably will as well.
     ./bin/bigBench runBenchmark -i "POWER_TEST" -b -q 9  //runs only query 9 with debug prints enabled. Requires that in a previous run you generated the data and populated the hive database.

    The BigBench FAQ covers this topic as well:
    https://github.com/intel-hadoop/Big-Data-Benchmark-for-Big-Bench/blob/master/README.md#execution-failed-with-exit-status-3

    The choice of your cluster settings depends strongly on your cluster and on the data set size you choose to run with. For that reason we do not provide "default settings" for the various SFs: it would be impossible to give you the "right" ones, and doing so might even be misleading and cause more harm than good.
    Now that I have placed my warning: here are some parameters I use at different SFs, running on a 10-node AWS cluster (16 vcores/node). Depending on your cluster, you may have to use totally different settings.
    Note that the settings are rather radical for SF < 1000, which you would never do in a production system! The goal of these settings was to maximize cluster utilization even with small data sizes.
    For any serious benchmarking of a big-data system you should consider running at least SF 1000.

    --sf 1 settings: good values between 4 and 6 MB (won't achieve 100% utilization; jobs don't run long enough, mainly Hive/MR startup overhead)
    --set mapreduce.input.fileinputformat.split.minsize=4194304;
    --set mapreduce.input.fileinputformat.split.maxsize=6291456;
    --set hive.exec.reducers.bytes.per.reducer=6291456;

    --sf 10 settings:
    --set mapreduce.input.fileinputformat.split.minsize=4194304;
    --set mapreduce.input.fileinputformat.split.maxsize=8388608;
    --set hive.exec.reducers.bytes.per.reducer=8388608;

    --sf 100 settings:
    --set mapreduce.input.fileinputformat.split.minsize=4194304;
    --set mapreduce.input.fileinputformat.split.maxsize=16777216;
    --set hive.exec.reducers.bytes.per.reducer=16777216;

    --sf 1000 settings: good values between 32 and 64 MB
    set mapreduce.input.fileinputformat.split.minsize=4194304;
    set mapreduce.input.fileinputformat.split.maxsize=67108864;
    set hive.exec.reducers.bytes.per.reducer=67108864;
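    The pattern behind the lists above: the split floor stays fixed at 4 MB while split.maxsize and bytes.per.reducer grow roughly with the scale factor, so each SF yields a similar number of reasonably long-running tasks. As a hedged back-of-envelope reading (my interpretation, not from the thread): the map-task count for a stage is roughly its input bytes divided by split.maxsize, and the reducer count is roughly the intermediate bytes divided by bytes.per.reducer. An intermediate SF would then interpolate along the same lines:

```sql
-- Illustrative interpolation for an intermediate SF (e.g. SF ~300),
-- following the same pattern: keep the 4 MB floor, scale the per-task
-- byte targets between the SF 100 and SF 1000 values.
set mapreduce.input.fileinputformat.split.minsize=4194304;   -- 4 MB
set mapreduce.input.fileinputformat.split.maxsize=33554432;  -- 32 MB
set hive.exec.reducers.bytes.per.reducer=33554432;           -- 32 MB
```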


    Cheers,
    Michael

    dwayne lessner

    Feb 2, 2016, 6:17:18 PM
    to Big Data Benchmark for BigBench
    If I edit engineSettings.sql, should that do the trick? It doesn't seem to take the settings...

    Michael Frank

    Feb 3, 2016, 1:13:30 PM
    to Big Data Benchmark for BigBench
    Hi Dwayne,

    For further assistance I require the full log file:
     /root/Big-Data-Benchmark-for-Big-Bench/logs/q09_hive_power_test_0.log
     as well as a copy of your engineSettings.sql file.

    Cheers,
    Michael