Dear Rohit!
I have read through your proposal. Before we discuss details and plans, I have two big things to discuss.
1. Current work in the library (though pretty slow, mostly because of me) is going towards separating the core (DataFrame/Vector/Index) from the "periphery" (IO, analysis, views and so on), which will simplify maintenance of and contributions to each library. Therefore, the plan to monkey-patch some methods and include new modules in the daru library itself may not be as useful as creating a separate library (or libraries) for BI and data cleaning. In fact, the separate-library approach lets you focus the work and limit its scope. Some "core functionality" methods could still be committed to daru itself, while others that exist but are NOT that "core" for a DataFrame could, vice versa, be duplicated and optimized in the BI module (with future deletion from the core).
In fewer words, try to rethink your plan in a modular fashion: "the cleaning module will receive a dataframe as an argument, do this and that, and can be configured this and that way".
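To make the "receives a dataframe as an argument" idea concrete, here is a minimal sketch. The `DataCleaning` module name and `drop_missing` method are hypothetical, purely illustrative; and a plain hash of column-name => values stands in for the dataframe so the sketch runs without daru installed (a real version would accept a Daru::DataFrame and use its API instead of monkey-patching daru itself):

```ruby
# Hypothetical cleaning module: receives a dataframe-like object as an
# argument instead of monkey-patching methods into daru. The "dataframe"
# here is a plain hash of column => values, as a self-contained stand-in.
module DataCleaning
  # Drop rows where any column holds nil; returns a new hash-of-columns.
  def self.drop_missing(df)
    n = df.values.map(&:length).max
    keep = (0...n).reject { |i| df.values.any? { |col| col[i].nil? } }
    df.transform_values { |col| col.values_at(*keep) }
  end
end

df = { status: [200, 500, nil], duration: [12, nil, 30] }
DataCleaning.drop_missing(df)
# => { status: [200], duration: [12] }
```

The point is only the shape of the API: the library takes data in, transforms it, and hands it back, so it can live and be versioned outside the daru core.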
2. My largest concern: for now, your proposal seems extremely ambitious. Besides coding, it will require testing, documenting, studying related libraries and integrating with them, documenting again, trying different approaches and so on... For one person, it seems really easy to fall behind at any stage, which endangers all following stages. The end result may be a lot of good and useful code (or a lot of demos and experimental code), but no particular deliverables that are easy for others to use and maintain. I can suggest considering two options:
a) simplify the proposal (for example, concentrate on reading different log formats into daru dataframes plus visualizing the standard daru summaries; or, vice versa, just data cleaning, with the simplest Rails log reader as a source of data); or
b) plan the work in a "circular" manner, like: stage 1: add a standard Rails log reader, try some visualizations, commit; stage 2: more log formats, some data cleaning, commit; stage 3: a bit of BI, a few more visualizations, commit; and so on. This way, instead of a monolithic "fail-or-succeed" plan you have more flexible "useful at each stage" steps.
The end result could be a bit less expressive than planned, yet still solid, useful and demonstrable. And either approach allows extending the scope a bit if you find everything done with a lot of time left.
An interesting perspective is to think this way: what is the minimum change to the Daru ecosystem that enables Rails log analysis? (I believe just an IO module, as we already have some grouping/aggregation and some visualization.) What would be the next minimum useful step, and its goal? And the next?
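For a feel of how small that minimum IO step could be, here is a sketch that pulls the "Completed ..." lines out of a Rails log. The `RailsLogReader` name is hypothetical, the regex covers only the common `Completed <status> ... in <N>ms` shape, and a plain hash of arrays again stands in for the result (a real reader would build a Daru::DataFrame from these columns):

```ruby
# Illustrative sketch of a minimal Rails log reader: extract the status
# code and total duration from "Completed ..." lines. A real IO module
# would return a Daru::DataFrame built from these columns.
module RailsLogReader
  COMPLETED = /^Completed (\d+) .+? in (\d+)ms/

  # Returns { status: [...], duration_ms: [...] } parsed from raw log text.
  def self.read(text)
    rows = text.each_line.map { |line| COMPLETED.match(line) }.compact
    {
      status:      rows.map { |m| m[1].to_i },
      duration_ms: rows.map { |m| m[2].to_i }
    }
  end
end

log = <<~LOG
  Started GET "/users" for 127.0.0.1
  Completed 200 OK in 12ms (Views: 7.0ms | ActiveRecord: 2.1ms)
  Completed 500 Internal Server Error in 41ms
LOG
RailsLogReader.read(log)
# => { status: [200, 500], duration_ms: [12, 41] }
```

Once columns like these land in a dataframe, the existing grouping/aggregation and visualization already give you something demonstrable, which is exactly why the reader alone can be a complete first deliverable.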
Note also that a typical GSoC plan includes some "buffer" weeks (deliberately left for fixing bugs, documenting and facing unexpected problems), especially before a phase ends. Tight schedules can look shiny at the planning stage, but when something unexpected happens (and it will), they do not have enough flexibility and tend to break completely.
Hope that sounds reasonable!
V.