How can a system administrator ensure optimal performance and manageability of an Apache Hadoop Cluster?


Durugkar, PriyankX

Feb 26, 2013, 7:44:55 PM
to idh-...@googlegroups.com

The following recommendations help ensure optimal performance and manageability of the Hadoop cluster.

 

* To avoid catastrophic failures, the JobTracker and primary NameNode should be installed on RAID-protected disks, using RAID 1 or RAID 5.

* To add a group of nodes into the cluster simultaneously, you must set the same root password for each node in the group. This significantly speeds up the process of adding nodes into a cluster.

* To reduce network latency, all the nodes in the cluster should be placed in the same subnet with the fewest possible number of switches between all the nodes.

* If the nodes are not using 10 Gb NICs, use NIC bonding to combine multiple NICs and increase network throughput. The bonded NICs should use bonding mode 6 (balance-alb, adaptive load balancing).

* The minimum recommended disk configuration for each node is two 1-terabyte physical disks plus a system drive for the operating system. Ideally, each node has at least 6 terabytes of disk space available for HDFS.

* There should be at least 3 DataNodes in the cluster.

* Each node should use a 10 Gb NIC for internode communication and for any other cluster function that requires network connectivity.

* Use only bare metal machines, not virtual machines. Virtual machines can significantly slow down HDFS I/O.

* It is highly recommended that a DNS server be used to assign hostnames to all the nodes in the cluster.

* No other machines should be on the same subnet or subnets that the Hadoop nodes are on.

* The manager node, where the Intel® Manager is installed, should be on the same subnet as the Hadoop nodes.

* Do not mix bare metal machines and virtual machines in the same cluster.

* To ensure that no machine in the cluster becomes a performance or I/O bottleneck, all the machines should have similar hardware and software configurations, including RAM, CPU, and disk space.

* Each node should have at least 32 GB of RAM.

* Swap should be turned off on all nodes.
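
Several of the recommendations above (32 GB of RAM, swap turned off, all nodes on one subnet, bonding mode 6) can be checked with a short script. The following is a minimal sketch in Python, not part of the Intel® Manager or of Hadoop. The function names, the 5% RAM slack, and the example subnet are assumptions made for illustration. It assumes Linux nodes, where /proc/meminfo reports memory in kB and /sys/class/net/<bond>/bonding/mode reads like "balance-alb 6".

```python
import ipaddress

MIN_RAM_GB = 32    # from the recommendations above
RAM_SLACK = 0.05   # assumption: the kernel reports slightly less than installed RAM


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            values[key.strip()] = int(fields[0])
    return values


def memory_problems(meminfo_text):
    """Return a list of problems with RAM size and swap for one node."""
    info = parse_meminfo(meminfo_text)
    problems = []
    ram_gb = info.get("MemTotal", 0) / (1024 ** 2)  # kB -> GB
    if ram_gb < MIN_RAM_GB * (1 - RAM_SLACK):
        problems.append(f"RAM is {ram_gb:.1f} GB, below {MIN_RAM_GB} GB")
    if info.get("SwapTotal", 0) > 0:
        problems.append("swap is enabled")
    return problems


def nodes_outside_subnet(addresses, subnet):
    """Return the node addresses that do not belong to the given subnet."""
    network = ipaddress.ip_network(subnet)
    return [a for a in addresses if ipaddress.ip_address(a) not in network]


def is_bond_mode_6(sysfs_mode_text):
    """Check the contents of a bonding 'mode' sysfs file, e.g. 'balance-alb 6'."""
    fields = sysfs_mode_text.split()
    return len(fields) == 2 and fields[1] == "6"
```

On a node, the memory check could be run as memory_problems(open("/proc/meminfo").read()). An empty list means the node meets the RAM and swap recommendations. The subnet check takes the node addresses and a subnet in CIDR form, for example nodes_outside_subnet(["10.0.0.5", "10.0.1.7"], "10.0.0.0/24").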
