fail message during test

Ray Lu

Sep 8, 2014, 6:00:05 AM
to big-...@googlegroups.com
Dear everyone,

I'm a Hadoop novice. I want to run BigBench on CDH 5.1.0 as a benchmark test.
I used Cloudera Manager to deploy the CDH environment, and I have run HiBench successfully.
I have modified BIG_BENCH_HADOOP_LIBS_NATIVE and BIG_BENCH_HDFS_NAMENODE in setEnvVars.
Then I switch to the hdfs account and execute the command below to run the benchmark.
#./bigBench runBenchmark -m 3 -f 5 -s 2

But after the test there are some failure messages in the q05/q09 power test and throughput test logs, as shown below. (I also attached the full log.)
 
2014-09-08 04:26:29    Starting to launch local task to process map join;    maximum memory = 257949696
2014-09-08 04:26:30    Processing rows:    200000    Hashtable size:    199999    Memory usage:    94451168    percentage:    0.366
2014-09-08 04:26:30    Processing rows:    300000    Hashtable size:    299999    Memory usage:    127494424    percentage:    0.494
2014-09-08 04:26:30    Processing rows:    400000    Hashtable size:    399999    Memory usage:    156816688    percentage:    0.608
2014-09-08 04:26:30    Processing rows:    500000    Hashtable size:    499999    Memory usage:    167620040    percentage:    0.65
2014-09-08 04:26:31    Processing rows:    600000    Hashtable size:    599999    Memory usage:    203173448    percentage:    0.788
Execution failed with exit status: 3
Obtaining error information

Task failed!
Task ID:
  Stage-17

Logs:

/tmp/hdfs/hive.log
FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask


How can I resolve this failure?
Is there any essential procedure I need to follow before running BigBench?

How do I know that the test finished successfully and that the result is valid?

Thanks for your help!
logs-20140908-075143.zip

Michael Frank

Sep 9, 2014, 11:32:29 AM
to big-...@googlegroups.com
Hi Ray,

Explanation and fix for:


Execution failed with exit status: 3
FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask
 

Hive converted a join into a locally running and faster 'mapjoin', but ran out of memory while doing so.

maximum memory = 257949696
256 MB of memory is not quite enough.

There are two bugs responsible for this.
Bug 1)
------
Hive's metric for deciding when to convert a join miscalculates the required amount of memory. This is especially true for compressed files and ORC files, as Hive uses the file size as its metric, but compressed tables require more memory in their uncompressed in-memory representation.
You could simply decrease 'hive.smalltable.filesize' to tune the metric, or increase 'hive.mapred.local.mem' to allow the allocation of more memory for local map tasks.
The latter option may lead to bug number two if you happen to have an affected Hadoop version.

Bug 2)
------
Hive/Hadoop ignores 'hive.mapred.local.mem'!
(More exactly: a bug in Hadoop 2.2 causes hadoop-env.cmd to set the -Xmx parameter multiple times, effectively overriding the user-set hive.mapred.local.mem setting.
See: https://issues.apache.org/jira/browse/HADOOP-10245)

There are 3 workarounds for this bug:
1) Assign more memory to the local Hadoop JVM client (note: this is NOT mapred.map.memory), because the map-join child JVM inherits the parent JVM's settings.
 + In the Cloudera Manager home, click on the "hive" service,
 + then on the hive service page click on "Configuration",
 + Gateway base group --(expand)--> Resource Management -> Client Java Heap Size in Bytes -> 1 GB
2) Reduce "hive.smalltable.filesize" to ~1 MB or below (depends on your cluster settings for the local JVM).
3) Turn off "hive.auto.convert.join" to prevent Hive from converting the joins to a mapjoin.

The preferred solution is 1). If you cannot increase your memory settings to high enough values, you can additionally employ workaround 2).
Workaround 3) is the last resort: turning off auto join conversion implies a huge performance penalty.
2) & 3) can be set in Big-Bench/hive/hiveSettings.sql (see the sketch below).

Ray Lu

Sep 13, 2014, 9:04:10 AM
to big-...@googlegroups.com
Hi Michael,

Thanks for your help!

After setting the Java heap size to 1 GB and rerunning the benchmark, I still get some failures in the q09 power/throughput tests. (The q05 tests work fine now.)
But the memory usage doesn't reach the configured limit.
How can I resolve this?

2014-09-11 10:55:44    Starting to launch local task to process map join;    maximum memory = 1029701632
2014-09-11 10:55:44    Dump the side-table into file: file:/tmp/hdfs/hive_2014-09-11_22-55-15_968_3321285425869299880-1/-local-10006/HashTable-Stage-5/MapJoin-mapfile21--.hashtable
2014-09-11 10:55:45    Upload 1 File to: file:/tmp/hdfs/hive_2014-09-11_22-55-15_968_3321285425869299880-1/-local-10006/HashTable-Stage-5/MapJoin-mapfile21--.hashtable
2014-09-11 10:55:45    Dump the side-table into file: file:/tmp/hdfs/hive_2014-09-11_22-55-15_968_3321285425869299880-1/-local-10006/HashTable-Stage-5/MapJoin-mapfile11--.hashtable
2014-09-11 10:55:45    Upload 1 File to: file:/tmp/hdfs/hive_2014-09-11_22-55-15_968_3321285425869299880-1/-local-10006/HashTable-Stage-5/MapJoin-mapfile11--.hashtable
2014-09-11 10:55:45    Processing rows:    200000    Hashtable size:    199999    Memory usage:    285015176    percentage:    0.277
2014-09-11 10:55:45    Processing rows:    300000    Hashtable size:    299999    Memory usage:    127199272    percentage:    0.124
2014-09-11 10:55:45    Processing rows:    400000    Hashtable size:    399999    Memory usage:    158164848    percentage:    0.154
2014-09-11 10:55:45    Processing rows:    500000    Hashtable size:    499999    Memory usage:    189130416    percentage:    0.184
2014-09-11 10:55:45    Processing rows:    600000    Hashtable size:    599999    Memory usage:    229451232    percentage:    0.223
2014-09-11 10:55:45    Processing rows:    700000    Hashtable size:    699999    Memory usage:    260416808    percentage:    0.253
2014-09-11 10:55:45    Processing rows:    800000    Hashtable size:    799999    Memory usage:    291382376    percentage:    0.283
2014-09-11 10:55:45    Processing rows:    900000    Hashtable size:    899999    Memory usage:    322347936    percentage:    0.313
2014-09-11 10:55:45    Processing rows:    1000000    Hashtable size:    999999    Memory usage:    353313496    percentage:    0.343
2014-09-11 10:55:45    Processing rows:    1100000    Hashtable size:    1099999    Memory usage:    393836568    percentage:    0.382
2014-09-11 10:55:45    Processing rows:    1200000    Hashtable size:    1199999    Memory usage:    424802160    percentage:    0.413
2014-09-11 10:55:45    Processing rows:    1300000    Hashtable size:    1299999    Memory usage:    455767720    percentage:    0.443
2014-09-11 10:55:45    Processing rows:    1400000    Hashtable size:    1399999    Memory usage:    486733296    percentage:    0.473
2014-09-11 10:55:46    Processing rows:    1500000    Hashtable size:    1499999    Memory usage:    517698880    percentage:    0.503
2014-09-11 10:55:46    Processing rows:    1600000    Hashtable size:    1599999    Memory usage:    548664464    percentage:    0.533
2014-09-11 10:55:46    Processing rows:    1700000    Hashtable size:    1699999    Memory usage:    579630040    percentage:    0.563

Execution failed with exit status: 3
Obtaining error information

Task failed!
Task ID:
  Stage-15

Logs:

/tmp/hdfs/hive.log
logs-20140912-034611.zip

Michael Frank

Sep 15, 2014, 8:59:51 AM
to big-...@googlegroups.com
Hi Ray,

Do not trust the Hive memory consumption log. This log is not "realtime"; it is just an estimate to help you choose the required heap size.
Execution failed with exit status: 3
This error message is in reality a JVM out-of-memory error, caught and masked in the background.
It indicates that your heap size is still too small for the in-memory map join.
If you have available resources, I suggest increasing the allowed heap memory even more.
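
If you are not managing the client configuration through Cloudera Manager, one way to experiment with a larger client heap is the standard HADOOP_CLIENT_OPTS environment variable, which the Hadoop client scripts pass on to locally started JVMs. This is only a sketch: the 4 GB value is an arbitrary example, and on Hadoop versions affected by HADOOP-10245 the setting may still be overridden.

# Sketch: export a larger client-side heap before launching the run (example value only).
export HADOOP_CLIENT_OPTS="-Xmx4g"
./bigBench runBenchmark -m 3 -f 5 -s 2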

By the way, you don't have to re-run the whole benchmark if you are debugging a single query. Just execute the query in question individually with (assuming you already generated the data and populated the Hive tables):

./scripts/bigBench -q <queryNum> runQuery

Other good command lines to use (until everything runs fine) are:

Just generate data (only data generation, skip others):
 ./scripts/bigBench runBenchmark -f  3 -m 5 -s 2  -sl -sp -st

Just load data (only load test, skip others):
 ./scripts/bigBench runBenchmark -f  3 -m 5 -s 2 -sd -sp -st

Just do a single run of all queries (only power test, skip others):
 ./scripts/bigBench runBenchmark -f  3 -m 5 -s 2 -sd -sl -st
 
Excerpt from the command line usage help you get when executing ./scripts/bigBench runBenchmark:

-sd, --skipDataGeneration:  skip the data generation
-se, --skipEnvChecks:       skip all environment checks (implies --pretend)
-sl, --skipLoadTest:        skip the load test (no benchmark result available then)
-sp, --skipPowerTest:       skip the power test (no benchmark result available then) (one stream of sequential queries)
-st, --skipThroughputTest:  skip the throughput test (no benchmark result available then) (n streams of queries in random order)


If your cluster setup runs smoothly, then you may consider doing a full benchmark run like you did before.
It's always a good idea to start with a 'power test' (by skipping the 'throughput test' with the -st option) to see if your cluster runs fine, before doing a complete benchmark including the 'throughput test', because the latter will take a lot of time to run.

Running BigBench with multiple concurrent streams of queries (the -s <streams> option) may expose additional cluster configuration issues, as this test is meant to exhaust all available resources.

best regards,
Michael

Bart Vandewoestyne

Sep 18, 2014, 7:25:21 AM
to big-...@googlegroups.com
On Monday, September 15, 2014 2:59:51 PM UTC+2, Michael Frank wrote:

Other good command lines to use (until everything runs fine) are:

Just generate data (only data generation, skip others):
 ./scripts/bigBench runBenchmark -f  3 -m 5 -s 2  -sl -sp -st

Just a small question: is the above exactly the same as

./scripts/bigBench -m 5 -f 3 hadoopDataGen

?

Bhaskar Gowda

Sep 18, 2014, 11:24:04 AM
to big-...@googlegroups.com
./scripts/bigBench -m 5 -f 3 hadoopDataGen

Bart, the above command only generates data and doesn't run the queries. The runBenchmark module 1. generates data, 2. loads data, and finally runs the queries.

Michael Frank

Sep 18, 2014, 1:00:02 PM
to big-...@googlegroups.com
Hi Bart,
Yes


./scripts/bigBench runBenchmark -f  3 -m 5 -s 2  -sl -sp -st
and

 ./scripts/bigBench -m 5 -f 3 hadoopDataGen

have the same effect.
In fact, the "runBenchmark" module calls the "hadoopDataGen" module as the first step of its execution.
The parameters
-sl -sp -st
tell the runBenchmark module to skip the load test, the power test, and the throughput test.