What would it take to make DataCleaner cloud-ready?


Pmohan

Oct 13, 2011, 12:29:28 PM
to DataCleaner-dev
I was just exploring what it would take to:

1) deploy DataCleaner in the cloud
2) access it through a web client / API
3) distribute job processing across multiple machines (a long shot, though)

Thanks
Pmohan

Kasper Sørensen

Oct 13, 2011, 1:03:46 PM
to datacle...@googlegroups.com
Hi Pmohan,

Thanks for the question, an interesting one!

1+2) Actually, we have already been playing around with this idea at Human Inference. We have made some loose plans to deploy DC jobs as invokable web services running on a server. The architecture fully supports this idea, and I see no major impediments except "just doing it".

3) This is a good fit for some tasks, but not for every feature. Specifically, the transformer and filter components are closely analogous to the "map" part of a MapReduce system (like Hadoop or GridGain) and thus could be REALLY scalable. The Analyzer components are also somewhat analogous to "reduce" in a MapReduce system, but making that work would impose certain restrictions on what an Analyzer can do, and specifically on how it saves state. So yes, it is in our thoughts, but it's not likely to be something we would create in the short term.
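
To make the map/reduce analogy a bit more concrete, here is a rough sketch in Python. None of these names come from DataCleaner's actual API: `trim_transformer`, `not_null_filter`, `ValueCountAnalyzer`, `process_partition` and `run_distributed` are all hypothetical stand-ins. The assumption illustrated is that transformers and filters are stateless per-row functions, while an analyzer keeps its state in a form that can be merged across machines.

```python
from collections import Counter
from functools import reduce

# Hypothetical stand-ins for illustration only; not DataCleaner's real API.


def trim_transformer(row):
    """'Map' step: a stateless per-row transformation (strip string values)."""
    return {key: value.strip() if isinstance(value, str) else value
            for key, value in row.items()}


def not_null_filter(row, column):
    """'Map' step: a stateless per-row decision to keep or drop the row."""
    return row.get(column) not in (None, "")


class ValueCountAnalyzer:
    """'Reduce' step: counts values, keeping state that can be merged."""

    def __init__(self, column):
        self.column = column
        self.counts = Counter()

    def consume(self, row):
        self.counts[row[self.column]] += 1

    def merge(self, other):
        # Merging must not depend on row order or on which node saw which row.
        merged = ValueCountAnalyzer(self.column)
        merged.counts = self.counts + other.counts
        return merged


def process_partition(rows, column):
    """Work done independently on one machine for its share of the rows."""
    analyzer = ValueCountAnalyzer(column)
    for row in rows:
        row = trim_transformer(row)
        if not_null_filter(row, column):
            analyzer.consume(row)
    return analyzer


def run_distributed(partitions, column):
    """Simulate several machines locally, then merge their partial results."""
    partials = [process_partition(rows, column) for rows in partitions]
    return reduce(lambda a, b: a.merge(b), partials, ValueCountAnalyzer(column))
```

The restriction mentioned above shows up in `merge`: an analyzer whose result depends on seeing all rows in one place, or in a particular order, could not be split up this way without being redesigned.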

Now that you have a few answers, may I ask (out of curiosity) why you are asking? Are you considering building such an application? Would you maybe be interested in a cooperation?

Best regards,
Kasper



