Hadoop Cluster Concepts Doubts


sridhar babu

Feb 17, 2016, 12:40:44 AM
to Utah Hadoop Users Group - Big Data Utah
What is the difference between homogeneous and heterogeneous Hadoop clusters?

Can anyone provide a perfect, crystal-clear definition that would be easy for beginners to understand?

Christian

Feb 17, 2016, 7:44:46 AM
to Utah Hadoop Users Group - Big Data Utah
When people talk about heterogeneous vs. homogeneous Hadoop clusters, the context is hardware configuration. When all machines in the Hadoop cluster have the same hardware (CPUs and cores, number of HDDs, RAM, etc.), that's called a homogeneous cluster. Homogeneous clusters are easier to configure because every machine can use the same configuration files: each one runs the same number of tasks, dedicates the same amount of memory to each task, and reads from and writes to the same number of drives. A heterogeneous cluster is one where some machines have different hardware specs, which makes configuration more challenging and debugging more complicated. I've seen homogeneous clusters turn into heterogeneous clusters over time, as new hardware becomes available in the org and the demands on an existing Hadoop cluster increase.
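
To make the definition concrete, here is a minimal Python sketch. The `NodeSpec` class, its fields, and the hostnames are hypothetical and only illustrate the idea; nothing here is a Hadoop API. A cluster counts as homogeneous when every node shares one hardware profile, and each distinct profile in a heterogeneous cluster is a group of machines that would likely need its own configuration values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    """Hardware profile of one worker node (hypothetical fields)."""
    cores: int
    ram_gb: int
    disks: int


def is_homogeneous(nodes):
    """True when every node in the {hostname: NodeSpec} map has the same profile."""
    return len(set(nodes.values())) <= 1


def group_by_spec(nodes):
    """Map each distinct hardware profile to the hostnames that share it."""
    groups = {}
    for host, spec in nodes.items():
        groups.setdefault(spec, []).append(host)
    return groups


# Two identical machines, then a newer, larger machine added later.
original = {"node1": NodeSpec(8, 32, 4), "node2": NodeSpec(8, 32, 4)}
grown = dict(original, node3=NodeSpec(16, 64, 8))

print(is_homogeneous(original))
print(is_homogeneous(grown))
print(group_by_spec(grown))
```

Adding `node3` is what turns the homogeneous cluster into a heterogeneous one: there are now two profiles to configure and debug instead of one.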

Hope this helps,
Christian

--
--
Visit us on the net: http://www.uhug.org
 
You received this message because you are subscribed to the Google
Groups "Utah Hadoop Users Group" group.
To manage your subscription, visit this group at
http://groups.google.com/group/uhug?hl=en


Craig Brown

Feb 17, 2016, 10:13:09 AM
to uh...@googlegroups.com
Christian is correct.

Typically a new cluster will be homogeneous, but will turn heterogeneous as it grows.

  - Craig
Message has been deleted

Christian

Feb 19, 2016, 8:50:29 PM
to Utah Hadoop Users Group - Big Data Utah
Are you talking about having multiple VMs per physical machine? If so, how many VMs per machine? Also, to be sure I understand, are you saying each physical machine has 8 GB of RAM?

BTW, have you thought about using Flink or Spark instead?


On Fri, Feb 19, 2016 at 2:47 PM sridhar babu <sridhar...@gmail.com> wrote:
Thank you Christian, it was indeed crystal clear.

One more doubt: is it possible to create a simple heterogeneous Hadoop cluster in a computer lab with 4 machines, each with 8 GB of RAM?

For example, can each VM act as a separate node for simulation purposes? I hope my question is clear.

sridhar babu

Feb 21, 2016, 10:28:09 PM
to uh...@googlegroups.com
Yes, I am talking about having VMs on each physical machine.

How many VMs can be created on a single machine, with each physical machine having 8 GB of RAM?

I have thought of using Apache Spark on each host and guest machine.



Christian

Feb 21, 2016, 11:28:59 PM
to uh...@googlegroups.com
I'd probably not go the VM route then. It's going to be slower in several ways. There are some simple equations out there that will get YARN set up with the optimal number of concurrent tasks and memory per task. If you have any questions there, feel free to ask.
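
The thread does not say which equations are meant. As one possibility, the Python sketch below follows a commonly cited sizing heuristic (as I recall it from the Hortonworks HDP memory-configuration guidance): the number of containers per node is capped by cores, disks, and available memory, and memory per container is the available memory divided evenly. The function name `yarn_sizing`, its parameters, and the example constants are assumptions for illustration; only the three `yarn.*` property names are real YARN settings.

```python
import math


def yarn_sizing(cores, disks, total_ram_mb, reserved_mb, min_container_mb):
    """Estimate YARN container count and size for one worker node.

    All inputs are per node. reserved_mb (memory left for the OS and
    other daemons) and min_container_mb are assumptions you choose.
    """
    available_mb = total_ram_mb - reserved_mb
    if available_mb < min_container_mb:
        raise ValueError("not enough RAM left for even one container")

    # Containers are capped by CPU, by disk count, and by memory.
    containers = min(
        2 * cores,
        math.ceil(1.8 * disks),
        available_mb // min_container_mb,
    )
    containers = max(containers, 1)

    # Spread the available memory evenly, never below the minimum size.
    ram_per_container_mb = max(min_container_mb, available_mb // containers)

    return {
        "containers": containers,
        "ram_per_container_mb": ram_per_container_mb,
        # Real YARN property names; the values are this sketch's estimates.
        "yarn.nodemanager.resource.memory-mb": containers * ram_per_container_mb,
        "yarn.scheduler.minimum-allocation-mb": ram_per_container_mb,
        "yarn.scheduler.maximum-allocation-mb": containers * ram_per_container_mb,
    }


# Hypothetical lab machine: 8 GB RAM, 4 cores, 1 disk, 2 GB reserved.
for key, value in yarn_sizing(4, 1, 8192, 2048, 1024).items():
    print(key, "=", value)
```

For the 8 GB lab machines discussed above, and assuming 4 cores, one disk, and 2 GB reserved for the OS (the thread gives only the RAM figure), this yields 2 containers of 3072 MB each. In a heterogeneous cluster the function would be run once per hardware profile, since each profile gets different values.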