Hi guys,
I am part of the Jumbune team. We have created a tool for debugging MR code, profiling the various phases of MR execution, validating data on HDFS, and monitoring Hadoop-based clusters.
This is entirely an open source initiative, so we are looking for more traction from the community; it is what motivates us to keep devoting time to this cause.
Here is a brief overview of the modules of Jumbune:
Provides code-level control-flow statistics for a MapReduce job. Users may apply regex validations or their own user-defined validation classes. Based on the validations applied, Flow Debugger checks the flow of data through the mapper and the reducer and gives a detailed view of where your MapReduce application is going wrong.
Profiles MapReduce jobs in a cluster and gives insights into the CPU and heap dumps of a Hadoop job. It also provides an in-depth graphical view of the MapReduce phases: Map, Reduce, Sort, Shuffle, Setup, and Cleanup.
Provides a simple, easy-to-use utility for finding discrepancies and errors in HDFS data. Users may check for data violations in categories such as data type, null values, or regular expression matches.
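To give a feel for what these three violation categories mean, here is a minimal sketch in Python. It is not Jumbune's code, API, or configuration format; the rule layout, the find_violations function, and the sample records are all made up for illustration, and it runs over an in-memory list of delimited lines rather than over HDFS.

```python
import re

# Hypothetical per-field rules, keyed by field index.
# This layout is an assumption for illustration, not Jumbune's format.
RULES = {
    0: {"type": int, "nullable": False},
    1: {"regex": r"[A-Z]{2}\d{3}", "nullable": False},
    2: {"type": float, "nullable": True},
}


def find_violations(lines, rules, delimiter=","):
    """Return (line_number, field_index, category) for each failed check."""
    violations = []
    for line_no, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split(delimiter)
        for idx, rule in rules.items():
            # A missing trailing field is treated the same as an empty one.
            value = fields[idx].strip() if idx < len(fields) else ""
            if value == "":
                if not rule.get("nullable", True):
                    violations.append((line_no, idx, "null"))
                continue
            if "type" in rule:
                try:
                    rule["type"](value)  # e.g. int("x") raises ValueError
                except ValueError:
                    violations.append((line_no, idx, "data type"))
            if "regex" in rule and not re.fullmatch(rule["regex"], value):
                violations.append((line_no, idx, "regex"))
    return violations


# Made-up sample records: the first is clean, the other two are not.
sample = ["1,AB123,9.5", "x,AB123,", ",ab12,7"]
for violation in find_violations(sample, RULES):
    print(violation)
```

Each reported tuple names the line, the field, and the category that failed, which is roughly the kind of report one would want from a data validation pass before feeding the data to a job.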
One of the distinguishing features of Jumbune is on-demand monitoring, which can be turned on or off as the user requires. Jumbune is loosely coupled, which allows it to be deployed on a remote machine without additional setup on each cluster machine. The cluster monitoring features can be summarized as follows:
Please feel free to use Jumbune, ask questions, and contribute to this open source initiative. Our website is at www.jumbune.org. The source code can be accessed at https://github.com/impetus-opensource/jumbune