Testing SWIM on a 5-node cluster: NameNode and ResourceManager are killed


Simba

Sep 1, 2014, 1:30:45 PM
to swimapredu...@googlegroups.com
Hello

I'm trying out SWIM on a 5-node cluster to test some changes I've made to HDFS. When using GenerateReplayScript, I did specify the size of the cluster I would be testing on. However, I think the job arrival rate is too high for my cluster to handle. I moved to a 10-node cluster and the workload ran longer, but the NameNode and ResourceManager processes eventually disappeared. I'm assuming they were killed. In the logs there are no errors.

Has anyone run into such issues?

Simba

Sep 1, 2014, 4:13:24 PM
to swimapredu...@googlegroups.com
I noticed that in GenerateReplayScript.java, on line 118, the code to scale the sleep period has been commented out. I believe jobs are being submitted too quickly. However, "sleep = sleep * clusterSizeRaw / clusterSizeWorkload;" would scale it out too much. Does anyone have experience with how the sleep should be scaled?
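
For concreteness, here is the kind of scaling I'm weighing (a hypothetical sketch, not the actual GenerateReplayScript.java code; I'm assuming clusterSizeRaw is the traced cluster's size and clusterSizeWorkload is the replay cluster's size):

    // Linear scaling, as in the commented-out line 118: a trace from a
    // 600-node cluster replayed on 5 nodes would stretch every gap 120x.
    long scaleSleepLinear(long sleep, int clusterSizeRaw, int clusterSizeWorkload) {
        return sleep * clusterSizeRaw / clusterSizeWorkload;
    }

    // A gentler alternative: square-root scaling sits between
    // "no scaling" and "full linear scaling".
    long scaleSleepSqrt(long sleep, int clusterSizeRaw, int clusterSizeWorkload) {
        return (long) (sleep * Math.sqrt((double) clusterSizeRaw / clusterSizeWorkload));
    }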

Yanpei Chen

Sep 2, 2014, 3:37:28 AM
to swimapredu...@googlegroups.com
This indeed sounds like your cluster is not large enough for the workload!

"NameNode and ResourceManager processes eventually disappeared" is strange. Usually an undersized cluster will look healthy, but many of the submitted jobs will not register at the ResourceManager.

To get some more info: which workload are you running, and which distribution of Hadoop?

Yanpei Chen

Sep 2, 2014, 3:42:51 AM
to swimapredu...@googlegroups.com
Glad you're looking at SWIM code in depth.

Currently the sleep time is not being scaled because the data size per job is. If you scale both the data size and the sleep time, you are running 1/N the number of jobs at 1/N the original size, so your workload is 1/N^2 of the original production workload. You may or may not get helpful info from running a workload scaled down that much; it depends on what you're trying to accomplish with SWIM.

Having large jobs and having many jobs stress different parts of the MapReduce system. The default in SWIM is to scale data size only. The sleep-scaling logic is commented out rather than removed to give you an additional scaling option.
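
To put numbers on it, suppose you replay a trace from a 600-node cluster on 6 nodes, so N = 100 (the node counts are made up for illustration):

    // Illustrative arithmetic only; not SWIM code.
    public class ScaleExample {
        public static void main(String[] args) {
            double n = 600.0 / 6.0; // N = 100

            // SWIM default: same number of jobs, each 1/N the data.
            double dataOnly = 1.0 / n;           // 0.01 of production

            // Data size and sleep time both scaled: 1/N the jobs
            // in the same window, each 1/N the data.
            double dataAndSleep = 1.0 / (n * n); // 0.0001 of production

            System.out.println("data only:      " + dataOnly);
            System.out.println("data and sleep: " + dataAndSleep);
        }
    }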

Hope this helps.

Simba

Sep 2, 2014, 10:08:44 AM
to swimapredu...@googlegroups.com
Thanks. I'm using Hadoop 2.2.0. The NameNode and ResourceManager failed with the FB-2010_samples_24_times_1hr_0.tsv workload. I then tried the FB-2009 workload with waits after every 5 jobs to limit the number of outstanding jobs. Jobs 0-36 completed, but jobs 37-42 hung, and I had to terminate the experiment.
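
The waits were just a crude throttle on submission, in the spirit of this hypothetical sketch (submitJob is a stand-in for whatever the generated replay script launches, not a SWIM API):

    // Hypothetical throttle: pause after every 5 submissions so jobs
    // don't pile up faster than the cluster can absorb them.
    public class ThrottledReplay {
        static void submitJob(int i) { /* stand-in for the real job launch */ }

        public static void main(String[] args) throws InterruptedException {
            int numJobs = 43; // jobs 0-42, as in my run
            for (int i = 0; i < numJobs; i++) {
                submitJob(i);
                if ((i + 1) % 5 == 0) {
                    Thread.sleep(60000L); // e.g. wait 60 s between batches of 5
                }
            }
        }
    }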

In the 200-machine cluster from your IEEE MASCOTS paper, was the NameNode also running on an m1.large instance? I have my NameNode and ResourceManager running on one m1.large instance, and 4 nodes each running only the DataNode and NodeManager.

Yanpei Chen

Oct 14, 2014, 7:40:17 PM
to swimapredu...@googlegroups.com
The NameNode in our MASCOTS experiments was indeed an m1.large instance.

One possible fix: try a larger instance type for the NameNode and ResourceManager. Because your cluster size is small, the jobs' data sizes are also small. Beyond a point, job durations no longer scale down linearly with cluster size and data size; you hit lower bounds on job launch/tear-down overhead, so more jobs are active at once, and the ResourceManager has to keep more state for those active jobs. Hence, a larger instance may help.