Prashant,
Spark is a back-office analytics platform. It started as
infrastructure to support faster iterative algorithms, such as machine
learning workloads where you keep applying an operation over and over
until you converge. It is fantastic at providing analysts with
the ability to run queries and analyze large amounts of data with a
wide array of different algorithms.
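
To make "keep applying an operation until you converge" concrete, here is a minimal sketch in plain Python. It is not Spark code and uses no Spark API; the helper name iterate_until_converged and the square-root example are made up purely for illustration.

```python
def iterate_until_converged(step, state, tolerance=1e-9, max_iterations=1000):
    """Apply `step` repeatedly until the state stops changing by more than `tolerance`.

    Hypothetical helper for illustration only; not part of Spark or Druid.
    """
    for iteration in range(1, max_iterations + 1):
        new_state = step(state)
        if abs(new_state - state) < tolerance:
            return new_state, iteration
        state = new_state
    raise RuntimeError("did not converge within max_iterations")


def sqrt2_step(x):
    # One Newton update for solving x * x = 2.
    return 0.5 * (x + 2.0 / x)


estimate, iterations = iterate_until_converged(sqrt2_step, 1.0)
print(estimate, iterations)
```

In a real iterative job the state would be a large distributed data set rather than a single number, which is why keeping it in memory between iterations matters.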
We've heard feedback from others who have tried Spark and chosen Druid
that Spark is not currently the greatest choice for powering
applications. That is, if you have a web UI that is powered entirely
by Spark queries, you are going to be reluctant to expose that to
users outside of your analytics team. Instead, you would use Spark to
generate some other data set, load that into a key-value store, a
database, or something like Druid, and serve your application from that.
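
A rough sketch of that "precompute, then serve" split, again in plain Python. The batch step stands in for whatever the analytics job would produce, and an ordinary dict stands in for the key-value store; the event fields and function names are invented for the example and are not any real Spark or Druid API.

```python
from collections import defaultdict

# Made-up sample events; in practice this would be the large raw data set.
raw_events = [
    {"page": "/home", "country": "US", "clicks": 3},
    {"page": "/home", "country": "US", "clicks": 2},
    {"page": "/home", "country": "DE", "clicks": 1},
    {"page": "/docs", "country": "US", "clicks": 4},
]


def build_serving_table(events):
    """Batch step: aggregate raw events into a small lookup table (runs offline)."""
    totals = defaultdict(int)
    for event in events:
        totals[(event["page"], event["country"])] += event["clicks"]
    return dict(totals)


def lookup_clicks(table, page, country):
    """Serving step: a cheap keyed lookup, the only thing the application calls."""
    return table.get((page, country), 0)


serving_table = build_serving_table(raw_events)
print(lookup_clicks(serving_table, "/home", "US"))
```

The point of the split is that the user-facing path only ever does the cheap lookup, never the expensive scan over the raw data.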
Druid is infrastructure built to power an application. It does not
have the query flexibility that Spark provides and it doesn't
maintain/update state, so it's not currently the greatest choice for
high-scale iterative machine learning algorithms that need to update
large amounts of state between iterations.
For the queries that it does provide, however, it was designed to
answer them quickly and in a highly concurrent environment. More
specifically, it is designed to allow
query latency and concurrency requirements to be dictated by how much
$$ you want to throw at the problem. If you want to hit a specific
query latency or level of concurrent queries, you can do that by
increasing the amount of hardware available.
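
As a back-of-the-envelope illustration of trading hardware for concurrency, here is a small sizing sketch. It assumes throughput scales linearly with cores, which is a simplification and not a measured property of Druid. The function name, the 70% utilization target, and the example numbers are all hypothetical. It only covers query throughput; latency for a single query is not modeled.

```python
import math


def nodes_needed(target_qps, cpu_seconds_per_query, cores_per_node,
                 target_utilization=0.7):
    """Estimate node count for a target query rate.

    Assumption (not from Druid docs): work spreads evenly and scales linearly.
    """
    # Core-seconds of work arriving per second, padded for headroom.
    cores_required = target_qps * cpu_seconds_per_query / target_utilization
    return math.ceil(cores_required / cores_per_node)


# Hypothetical workload: 200 queries/sec, 0.5 CPU-seconds each, 16-core nodes.
print(nodes_needed(200, 0.5, 16))
# Doubling the target query rate roughly doubles the hardware.
print(nodes_needed(400, 0.5, 16))
```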
Does that make sense?
--Eric