Hopping in on this older thread.
If Sky supported ES JSON semantics for bulk import/export, I could reuse my existing scripts and point them at the Sky endpoint the same way I do for ES.
Simply having a basic endpoint compatibility layer would mean I could use tools like elasticsearch-hadoop out of the box, for instance. That'd be swell.
That corresponds to Sky's bulk import stream:
Both packages have a simple JSON API (DSL?). In some sense, Sky is a little simpler as the goal is more focused.
ES does not yet have a mature aggregation language. That's something they're making a priority for 1.x (enhanced aggregates).
So there's not much reason to try to fit Sky's queries into the existing faceting DSL. Sky has something to offer ES on that end.
Bulk export is handled through scan:
That corresponds to "List all events" in Sky.
That's all we'd really need to re-use most tools out there.
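To make the import half concrete, here is a rough sketch of what such a compatibility shim might do with an incoming ES _bulk body. The newline-delimited layout (an action line, then a source line, with no source line for delete) is the standard ES bulk format. Everything on the Sky side is my assumption for illustration: to_sky_event and the event shape it returns (table, object_id, timestamp, data) are hypothetical, not Sky's actual API.

```python
import json


def parse_es_bulk(body):
    """Yield (action, meta, source) tuples from an ES _bulk request body.

    The body is newline-delimited JSON: an action line such as
    {"index": {"_index": "...", "_id": "..."}} followed by a source line.
    A "delete" action has no source line.
    """
    lines = [ln for ln in body.split("\n") if ln.strip()]
    i = 0
    while i < len(lines):
        header = json.loads(lines[i])
        if not isinstance(header, dict) or len(header) != 1:
            raise ValueError("bad action line: %r" % lines[i])
        (action, meta), = header.items()
        i += 1
        source = None
        if action != "delete":
            if i >= len(lines):
                raise ValueError("missing source line for %r" % action)
            source = json.loads(lines[i])
            i += 1
        yield action, meta, source


def to_sky_event(meta, source, timestamp_field="timestamp"):
    """Map one ES document to a HYPOTHETICAL Sky-style event dict.

    Assumption: the index name stands in for the Sky table, the document
    id for the Sky object id, and one source field carries the timestamp.
    """
    data = dict(source)
    timestamp = data.pop(timestamp_field, None)
    return {
        "table": meta.get("_index"),
        "object_id": meta.get("_id"),
        "timestamp": timestamp,
        "data": data,
    }


if __name__ == "__main__":
    body = (
        '{"index": {"_index": "users", "_id": "u1"}}\n'
        '{"timestamp": "2013-01-01T00:00:00Z", "action": "signup"}\n'
    )
    for action, meta, source in parse_es_bulk(body):
        if action in ("index", "create"):
            print(json.dumps(to_sky_event(meta, source)))
```

A real shim would also have to answer with an ES-style bulk response so that existing clients accept it; I've left that out here.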
As for a plugin, the benefit isn't as big; it'd just make sure Sky popped up/down when ES does and maybe provide for some data locality (and save an extra port).
So we might have a nice _scan request type to populate Sky with results without (necessarily) hopping across the network; instead it'd just be IPC.
Such a plugin would be a lot more work than a simple compatibility class for scan/_bulk, and it doesn't provide all that much benefit that can't already be handled through elasticsearch-hadoop.
So, that's more-or-less how I see the integration story, based on the work I've been doing.
-Charles