Connect to JDBC/Calcite?

Ben Vogan

May 4, 2017, 12:46:45 PM

to airbnb_superset

Hi all,

I am playing with Druid 0.10, which adds support for SQL queries and JDBC. I know Superset has native support for Druid, but I would like to try out the new SQL features and see how well they work. Is it possible to connect to Druid over JDBC/Calcite from Superset?

Thanks,

BENJAMIN VOGAN | Data Platform Team Lead

Maxime Beauchemin

May 7, 2017, 12:43:56 AM

to airbnb_superset

I spoke with Fangjin Yang (one of the top Druid contributors) two weeks ago about SQL support and the path forward for Superset. It sounded like the longer-term solution would be to go the SQLAlchemy route.

A few related considerations:

* SQL being declarative, we don't need to handle the different groupby/topn/timeseries query types on a case-by-case basis; we can just issue simple SQL and Druid will pick the best query plan behind the scenes

* we can keep the current implementation working for people running earlier Druid versions and eventually deprecate it

* someone has to take on creating and maintaining the Druid SQLAlchemy dialect. It should be pretty easy to write, but Airbnb won't take this on until we upgrade Druid. I think we had a blocker there with our Hadoop [HDFS?] distro, the Druid admin is on paternity leave, ... so we won't lead that in the next few months. We welcome people to step up in the meantime

* pydruid isn't a great lib; it obfuscates more than it properly abstracts, and I'd move away from depending on it altogether

* Superset may need to support a new `supports_subquery` flag in `db_engine_spec`, which would be false for Druid, and run the "Series Limit"-type queries as two-phase SQL queries when the flag is false
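
To make the last point a bit more concrete, here is a rough sketch of how a two-phase "Series Limit" query could look. Only the `supports_subquery` flag name comes from the list above. Everything else is an assumption for illustration and not Superset's actual API: `EngineSpec`, `top_series_sql`, `series_limit_sql`, the `run_query` callback, and the table and column names are all hypothetical. The idea is that engines with subquery support get a single statement, while an engine with `supports_subquery = False` first runs the "top N series" query and then inlines the returned values as literals.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EngineSpec:
    """Hypothetical stand-in for a db_engine_spec entry."""
    name: str
    supports_subquery: bool = True


def quote_literal(value: str) -> str:
    # Standard SQL string literal: double any embedded single quotes.
    return "'" + value.replace("'", "''") + "'"


def top_series_sql(table: str, series_col: str, metric: str, limit: int) -> str:
    # Phase 1: find the N series with the highest metric value.
    return (
        f"SELECT {series_col} FROM {table} "
        f"GROUP BY {series_col} ORDER BY {metric} DESC LIMIT {int(limit)}"
    )


def series_limit_sql(
    spec: EngineSpec,
    table: str,
    time_col: str,
    series_col: str,
    metric: str,
    limit: int,
    run_query: Callable[[str], List[str]],
) -> str:
    """Build the final time-series SQL restricted to the top N series."""
    top_sql = top_series_sql(table, series_col, metric, limit)
    if spec.supports_subquery:
        # Single statement: embed phase 1 as a subquery.
        members = top_sql
    else:
        # Two phases: run phase 1 now, then inline its results as literals.
        values = run_query(top_sql)
        # IN (NULL) matches no rows, which keeps the SQL valid when empty.
        members = ", ".join(quote_literal(v) for v in values) or "NULL"
    return (
        f"SELECT {time_col}, {series_col}, {metric} AS metric FROM {table} "
        f"WHERE {series_col} IN ({members}) "
        f"GROUP BY {time_col}, {series_col}"
    )
```

For example, with `EngineSpec("druid", supports_subquery=False)` and a `run_query` that executes SQL and returns the first column as a list of strings, `series_limit_sql(...)` issues one extra round trip and returns a plain `IN ('a', 'b', ...)` filter. Identifiers are interpolated as-is here, so a real implementation would build the statement through SQLAlchemy rather than string formatting.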