Accelerate MapReduce development using Jumbune, an all-in-one open source tool for Hadoop


Mahesh Nair

Jul 4, 2014, 12:36:19 AM
to chenn...@googlegroups.com

Hi guys,

I am part of the Jumbune team. We have created a tool for debugging MapReduce (MR) code, profiling the various phases of MR execution, validating data on HDFS, and monitoring Hadoop-based clusters.

This is entirely an open source initiative, so we are looking for more traction; community interest is what motivates us to keep devoting time to it.

Here is a brief overview of Jumbune's modules.

Debugger

Provides code-level control flow statistics for a MapReduce job. Users may apply regex validations or their own user-defined validation classes. Based on the validations applied, the Flow Debugger checks the flow of data through the mapper and the reducer and gives a detailed view of where your MapReduce application is going wrong.
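
To make the idea of regex validation on mapper or reducer output more concrete, here is a minimal sketch in Python. It is not Jumbune's API or implementation; the function name `validate_pairs`, the counter labels, and the sample data are all made up for illustration. It only shows the general technique of checking each key/value pair against a pattern and counting how many pass or fail.

```python
import re
from collections import Counter


def validate_pairs(pairs, key_pattern, value_pattern):
    """Count key/value pairs that pass or fail the given regex checks.

    Hypothetical helper for illustration only; not part of Jumbune.
    """
    key_re = re.compile(key_pattern)
    value_re = re.compile(value_pattern)
    stats = Counter()
    for key, value in pairs:
        if not key_re.fullmatch(key):
            stats["bad_key"] += 1
        elif not value_re.fullmatch(value):
            stats["bad_value"] += 1
        else:
            stats["ok"] += 1
    return dict(stats)


# Made-up mapper output for a word count job: (word, count) as strings.
mapper_output = [("hadoop", "3"), ("jumbune", "x"), ("", "1")]
result = validate_pairs(mapper_output, r"[a-z]+", r"\d+")
print(result)
```

A high count of failing keys or values at one stage hints at where in the job the data starts going wrong.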

JVM profiler

Profiles MapReduce jobs in a cluster and gives insight into the CPU and heap dumps of a Hadoop job. It also provides an in-depth graphical view of the MapReduce phases: Map, Reduce, Sort, Shuffle, Setup, and Cleanup.

HDFS data validation

Provides a simple, easy-to-use utility for finding discrepancies and errors in HDFS data. Users may check for data violations in various categories, such as data type, null values, or regular expression matches.
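
As a rough illustration of the three violation categories (data type, null, regex), the sketch below checks delimited records against per-field rules. This is an assumption-laden example in Python, not how Jumbune is implemented or configured; `check_record`, the rule dictionary layout, and the sample lines are hypothetical.

```python
import re


def check_record(line, rules, delimiter=","):
    """Return a list of (field_index, violation) tuples for one record.

    Hypothetical helper for illustration only; not part of Jumbune.
    """
    fields = line.split(delimiter)
    violations = []
    for index, rule in rules.items():
        value = fields[index] if index < len(fields) else ""
        if value == "":
            # Empty or missing field: a null violation unless nulls are allowed.
            if not rule.get("nullable", True):
                violations.append((index, "null"))
            continue
        if "type" in rule:
            try:
                rule["type"](value)  # e.g. int("abc") raises ValueError
            except ValueError:
                violations.append((index, "data_type"))
                continue
        if "regex" in rule and not re.fullmatch(rule["regex"], value):
            violations.append((index, "regex"))
    return violations


# Made-up rules: field 0 must be an integer, field 1 a two-letter uppercase code.
rules = {
    0: {"type": int, "nullable": False},
    1: {"regex": r"[A-Z]{2}", "nullable": False},
}
lines = ["42,IN", "abc,IN", ",in", "7,"]
report = {line: check_record(line, rules) for line in lines}
print(report)
```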

Cluster monitor

One of the distinguishing features of Jumbune is on-demand monitoring, which can be turned on or off as the user requires. Jumbune is loosely coupled, which allows it to be deployed on a remote machine without additional setup on each cluster machine. The cluster monitoring features can be summarized as follows:

  • Node level cluster view to monitor system and Hadoop parameters
  • Network latency view to detect network latency across nodes in the cluster
  • Data load partition to monitor data load distribution among various nodes of the cluster
  • Replica management view to show data block replication in HDFS

Jumbune supports the Apache, HDP and CDH distributions of Hadoop.
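
For readers wondering what monitoring data load distribution might involve, here is a small Python sketch of the underlying arithmetic. It is only an assumption about the general idea, not Jumbune's code: the function names, the tolerance value, and the block counts are invented. It computes each node's share of the stored blocks and flags nodes whose share is well above an even split.

```python
def load_share(blocks_per_node):
    """Return each node's fraction of the total block count."""
    total = sum(blocks_per_node.values())
    if total == 0:
        return {node: 0.0 for node in blocks_per_node}
    return {node: count / total for node, count in blocks_per_node.items()}


def overloaded_nodes(blocks_per_node, tolerance=0.10):
    """List nodes whose share exceeds an even split by more than `tolerance`.

    Hypothetical helper for illustration only; not part of Jumbune.
    """
    if not blocks_per_node:
        return []
    shares = load_share(blocks_per_node)
    fair_share = 1 / len(shares)
    return sorted(n for n, s in shares.items() if s > fair_share + tolerance)


# Made-up block counts per data node.
blocks = {"node1": 60, "node2": 20, "node3": 20}
print(load_share(blocks))
print(overloaded_nodes(blocks))
```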

Please feel free to use Jumbune, ask questions, and contribute to this open source initiative. Our website is at www.jumbune.org. The source code is available at https://github.com/impetus-opensource/jumbune

Mahesh Nair

Jul 4, 2014, 2:13:13 AM
to chenn...@googlegroups.com
A roadmap discussion is under way on the Jumbune users mailing list; you can follow it at http://bit.ly/1rjGyh1

To participate in the discussion, just subscribe to the users mailing list by sending a subscription mail to the alias below; we will take care of the rest.

users-s...@collaborate.jumbune.org

Subrata Biswas

Jul 29, 2014, 11:16:59 AM
to chenn...@googlegroups.com
Hi Mahesh,
 Does it work for Hadoop 2 as well? A couple of questions:
 1. Is it agentless?
 2. How does it discover all the nodes in the cluster? You mentioned it can be installed on a remote box, so I assume that could be a completely separate box that is not part of the cluster?

Regards
Subrata


--
You received this message because you are subscribed to the Google Groups "Hadoop Users Group (HUG) Chennai" group.
