[Proposal] Built-in SQL for Druid


Gian Merlino

Oct 12, 2016, 12:51:20 PM
to druid-de...@googlegroups.com
Inspired by the Calcite Druid adapter (https://groups.google.com/d/topic/druid-development/FK5D162ao74/discussion) I've been playing around with something similar that lives inside of the Druid Broker. It seems promising, so in this proposal I'm suggesting we include an official SQL server inside Druid itself.

I am hoping that we can:

1) Use Calcite for SQL parsing and optimizing, and use Avatica (https://calcite.apache.org/docs/avatica_overview.html) for the server and the JDBC client.
2) Like the official Calcite Druid adapter, have a set of rules that push down filters, projections, aggregations, sorts, etc into normal Druid queries.
3) Unlike the official Calcite Druid adapter, use Druid objects (like DimFilter, ExtractionFn, etc) as model classes, since it avoids extra code, helps with type safety, and speeds up development.
4) Have this all run on the Broker, which would then make normal Druid queries to data nodes.
5) Work towards being able to push down more and more SQL into normal Druid queries over time.

Current status

If people are interested in this proposal then I'll clean up the code a bit and do a PR. Currently it's a rough prototype. Some things that do work:

1) Avatica handler running at /druid/v2/sql/ + Avatica JDBC driver
2) Determining column types with segment metadata queries
3) Pushing down operator sequences that look like filter -> project -> aggregate -> project -> sort into groupBy, timeseries, and select queries as appropriate
4) Using "intervals" to filter on time when appropriate
5) LIKE, range, equality, and boolean filters
6) SUM, MIN, MAX, AVG, COUNT, COUNT DISTINCT
7) Some extraction fns like SUBSTRING, CHAR_LENGTH
8) GROUP BY FLOOR(__time TO gran) for time-series
9) Arithmetic post-aggregations
10) Filtered aggregations using CASE or using FILTER(WHERE ...)
11) Semi-joins like SELECT ... WHERE xxx IN (SELECT ...) can run by materializing the inner result on the broker and applying it to the outer query as a filter. Obviously doesn't always work, but it works sometimes (and it works more often than pulling the lefthand side into the Broker…).
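The semi-join materialization in item 11 can be sketched roughly as follows. This is an illustrative Python sketch using SQLite in place of Druid, with made-up table names, not the actual Broker code: run the inner query first, then turn its result into an IN filter on the outer query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE outer_tbl (dim TEXT, val INTEGER);
CREATE TABLE inner_tbl (dim TEXT, score INTEGER);
INSERT INTO outer_tbl VALUES ('a', 1), ('b', 2), ('c', 3);
INSERT INTO inner_tbl VALUES ('a', 10), ('c', 30);
""")

# Step 1: materialize the inner query result on the "broker".
inner_values = [row[0] for row in
                conn.execute("SELECT dim FROM inner_tbl WHERE score > 5")]

# Step 2: rewrite the outer query with the materialized values as a filter.
# (Sketch only; assumes the inner result is non-empty and small enough
# to materialize.)
placeholders = ",".join("?" for _ in inner_values)
outer_sql = f"SELECT SUM(val) FROM outer_tbl WHERE dim IN ({placeholders})"
result = conn.execute(outer_sql, inner_values).fetchone()[0]
print(result)  # prints 4
```

When the inner result is too large to materialize, this rewrite stops being viable, which is why it only works sometimes.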

Non-exhaustive list of things that don't work:

1) Pushing down filter after aggregate (HAVING)
2) Push down of anything without a native Druid analog, like multi-column extraction fns, aggregation of expressions, window functions, etc.
3) Any extraction fns other than SUBSTRING, CHAR_LENGTH
4) A lot of time stuff, like x + INTERVAL, FLOOR(__time TO MONTH) = x, etc.
5) Query time lookups
6) Select with pagination – only the first 1000 results are used
7) Any sort of memory usage controls on the Broker side

FAQ

1) Why another SQL on Druid thing? There's already, like, 7 of them.

I think the fact that there are 7 of them means there's clearly some value in having a built-in implementation. Partially this is so we can hopefully share some work between the projects. Partially this is because Druid doesn't support some things that are needed for well rounded SQL support (like multi-column extraction fns, aggregations of expressions, etc) and having the SQL layer inside the Druid repo will make it possible to develop those sorts of features hand in hand with the SQL planner rules.


2) Is the proposed SQL language actually SQL or is it "SQL-like"?

In terms of what can be efficiently pushed down to Druid queries, it's "SQL-like". A lot of common SQL features aren't supported – although I think it makes sense to add more over time. Technically Calcite does speak full SQL, but a lot of it at the start would get planned as pulling all the raw data into the Broker and processing it in Calcite's interpreter.

3) Why not use the Druid adapter in Calcite?

Calcite's Druid adapter doesn't depend on any Druid jars; it implements the query language and protocol using its own set of model and client classes. For a builtin approach I wanted to be able to use Druid's own Query, DimFilter, ExtractionFn, etc in a type-safe way, and wanted to use the query code that already exists in Druid for discovering and querying data nodes. I think this will also help speed up development of Druid features that allow more SQL to be pushed down.

4) Can we share work between a builtin Druid SQL and the other SQL on Druid adapters that people are working on?

Hopefully! I think it would make sense if a builtin Druid SQL could be used for whatever Druid supports natively, and external SQL on Druid adapters could be used when users want to do something that Druid doesn't support. Sharing the work needed to translate "whatever Druid supports natively" into Druid queries would help everyone.

Hive and Drill already use Calcite internally, and I hope it's workable to stuff Druid's own rules into their planners without changing too much. If those projects are comfortable embedding druid-server then that should work straight away. If they aren't comfortable embedding druid-server (perhaps understandably) then we could bite the bullet and work on a light(er) weight druid-client jar that has just enough to give us the benefit of type checking, and does not include all the heavy Druid functionality.

If you're working on one of those projects, feedback is greatly appreciated.

5) What happens when parts of the SQL query can't be converted to a native Druid query?

Calcite is rad and runs the parts that can't be pushed down through an interpreter on the Druid Broker. Of course this means that if you use constructs that are close to the data and can't be pushed down, like grouping on CONCAT(foo, bar) or aggregating SUM(3 * bar), potentially a surprisingly large amount of data will be pulled out into the Broker. This is not great behavior and something should be done about that…
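To picture the fallback, here is a toy Python sketch (SQLite standing in for the data nodes; table and column names are made up): a grouping expression with no native Druid analog forces the raw columns to be pulled up and grouped broker-side.

```python
import sqlite3
from collections import Counter

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE druid_foo (foo TEXT, bar TEXT);
INSERT INTO druid_foo VALUES ('a', 'x'), ('a', 'x'), ('b', 'y');
""")

# GROUP BY CONCAT(foo, bar) has no Druid analog, so the "broker"
# pulls the raw columns and does the grouping itself.
raw_rows = conn.execute("SELECT foo, bar FROM druid_foo").fetchall()
counts = Counter(foo + bar for foo, bar in raw_rows)
print(sorted(counts.items()))  # prints [('ax', 2), ('by', 1)]
```

The cost is proportional to the raw row count rather than to the grouped result, which is where the surprise comes from.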

6) What about JOINs?

I don't know, maybe it makes sense for Druid to have query types usable for joins in the future. But it doesn't now; the closest thing is query-time lookups, which is like a broadcast join. Without native join support in Druid, it makes more sense to pull data out of Druid into another system (like Drill or Hive or Spark) and do the join there. Even if Druid did support native joins, there's still some value in using an external execution engine to join Druid data with data from some other system. Filters and aggregations can still potentially be pushed down, depending on the query.

7) JDBC works, but what about ODBC?

Avatica's home page says work on an ODBC client has not yet started. The page at https://hortonworks.com/hadoop-tutorial/bi-apache-phoenix-odbc/ is interesting, since the Phoenix Query Server is also Avatica based, so maybe that work could be useful? However, it doesn't seem to be open source, and when I tried to get the binary to work, the Windows ODBC setup thing crashed after calling getTableTypes. Maybe someone at Hortonworks can comment :)

Gian

Fangjin Yang

Oct 12, 2016, 1:10:18 PM
to Druid Development
+1, This will be extremely useful for the community.

Julian Hyde

Oct 12, 2016, 1:43:29 PM
to Druid Development

Gian,

Sounds exciting.

You say 'SQL-like'. Do you plan to add extensions that are not SQL, or for certain queries to return different results than on other SQL databases (e.g. implicitly adding joins)? It doesn't sound like you do. In which case I'd describe your plan as 'core SQL': not everything in SQL works, but what does work, works consistently with the standard. That sounds like a fine plan.

For a minute I thought you were proposing to ditch the relational algebra (Calcite's Filter, Scan, Project, Aggregate classes). I'm glad you're keeping those. They are essential for semantic rewrites such as materialized views, which you will need at some point. It makes sense to find a plan, then transcribe directly to Druid's model classes.

Different people will need different engines. Some people will want massive distributed joins, which Druid is not designed for. Since you're using Calcite, people will be able to switch to a different engine (say Hive or Drill) and keep the same SQL syntax and same metadata. (The proposed Drillix[1] project - combining Apache Phoenix and Apache Drill - follows a similar pattern. And indeed, what you are proposing is very similar to how Phoenix is embedding Calcite.)

It sounds as if your biggest challenge is to implement the necessary runtime for SQL (SQL types, SQL functions, and maybe one or two new relational operators). I don't know whether it's feasible to re-use parts of an existing engine (e.g. Calcite's or Drill's implementation of SQL operators), but even if it's not, you should re-use Calcite's unit and system tests to help develop your engine as quickly as possible. (I've talked for a long time about Calcite having "pluggable engines/runtimes". Maybe it's time to add that to Calcite.)

Good luck with this, and let us know what we can do to help on the Calcite side.

Julian

[1] https://apurtell.s3.amazonaws.com/phoenix/Drillix+Combined+Operational+%26+Analytical+SQL+at+Scale.pdf



Nishant Bangarwa

Oct 12, 2016, 1:50:16 PM
to Druid Development
+1

--
You received this message because you are subscribed to the Google Groups "Druid Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to druid-developm...@googlegroups.com.
To post to this group, send email to druid-de...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/druid-development/582c4241-eca0-4622-81ac-3c8fde1a4f1c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Nishant Bangarwa

Oct 12, 2016, 1:58:21 PM
to Druid Development
+1, super excited about druid having built-in support for SQL, going to be super useful for the druid community. 

Gian Merlino

Oct 12, 2016, 2:16:19 PM
to druid-de...@googlegroups.com
Hey Julian,

Thanks for the feedback!

By "SQL-like" I really just mean "not all of SQL". For the most part I think it's best if Druid's SQL behaves as close to standard SQL as possible. I think we might want to add some new SQL functions as extensions, but I don't see a need to make any major changes to how Calcite thinks of things. I think you're right that the biggest challenge is building out the runtime. A lot of things you can express in SQL just can't be pushed to Druid right now.

Do you think it's likely that Calcite & Hive would be comfortable using the planner rules from a druid-sql jar? It seems like there's active development in the Druid adapter of both of those, and it'd be great to share work.

Also, do you have any insight into what's going on with Avatica and ODBC, or any insight into where the Hortonworks Phoenix Query Server ODBC driver came from and whether it'd be useful for other Avatica servers?

Gian


Charles Allen

Oct 12, 2016, 2:36:19 PM
to Druid Development
If I recall correctly, Druid violates the way standard SQL thinks of NULL values in pretty deep and severe ways. That would probably need to be reviewed very closely for anything that wants to claim SQL-like behavior.

Burak Emre Kabakcı

Oct 12, 2016, 3:10:04 PM
to Druid Development
Even though Druid has its own way of handling NULL values, I think a SQL-ish query language would be great for the community, since everybody is familiar with it. ClickHouse also doesn't have NULL; that's stated in their documentation.


Gian Merlino

Oct 12, 2016, 3:47:53 PM
to druid-de...@googlegroups.com
I decided that Druid doesn't have nulls and all those things you think are nulls are actually empty strings :)

That made a lot of the tests easier to write.

Gian

Gian Merlino

Oct 12, 2016, 3:48:23 PM
to druid-de...@googlegroups.com
One other thing I forgot to mention is that I have no idea how to handle multi-value dimensions, so those aren't supported right now.

Gian

Julian Hyde

Oct 12, 2016, 4:09:34 PM
to druid-de...@googlegroups.com
Saying that Druid doesn’t support null values sounds like a good approach.

Note that Oracle converts empty strings to null values. I do not recommend that! They had that behavior historically and (I gather) would love to get to standard behavior.

I much prefer “SQL compliant with bugs and/or missing features” to “SQL-like”. Some of your users won’t care whether you’re SQL-compliant or SQL-like, but those of them using tools (e.g. Tableau) will definitely care, and they will be much happier if you say “that’s a bug” rather than “oh yeah, we decided not to follow the standard for that feature”.

Maybe at some point you will decide to allow null values for dimensions and measures, and at that point you can make sure that you implement 3-valued logic. 

Regarding multi-valued dimensions: let’s discuss. Maybe they can be modeled as list-valued columns.

Julian



Julian Hyde

Oct 12, 2016, 4:13:43 PM
to druid-de...@googlegroups.com
By the way, NULL values will arrive if & when you decide to support outer joins, IN or scalar sub-queries, or windowed aggregates.

What do you return for empty tables?

  select max(x) from emp where 1 = 0;

Julian

Carter Shanklin

Oct 12, 2016, 6:04:22 PM
to Druid Development
The Phoenix Query Server ODBC driver is in fact commercial and not open source. It is developed by Simba Technologies.

In addition to ODBC, we have been going down the path of using Avatica as a simple way of hooking up any language that can speak protobuf to Phoenix, both for reads and writes, using the idioms of the respective environments. At this point we have some embryonic Python and Golang adapters, and they are quite easy to develop. Someone could even take a run at ODBC if they wanted to deal with all the cross-platform nonsense.

Also, my 2c: be careful about nulls. Hive's sloppy handling of null versus empty string caused us a lot of pain over the years.


Gian Merlino

Oct 12, 2016, 6:09:40 PM
to druid-de...@googlegroups.com
"SQL compliant with bugs and missing features" is exactly what I'm going for so I think we're on the same page there. The query "select max(x) from emp where 1 = 0" currently returns an empty result set. I guess it should return a row with a NULL, huh?

Our docs on how multi-valued dimensions behave are here: http://druid.io/docs/latest/querying/multi-value-dimensions.html. I'm not sure if anything in SQL behaves like that.

Gian


Gian Merlino

Oct 12, 2016, 6:13:10 PM
to druid-de...@googlegroups.com
Thanks for the info Carter. It seems like there's a real dearth of open source ODBC drivers for any of the big data ecosystem stuff. They must be a pain to write :)

Yeah I think I agree that nulls are going to be an issue in the future. I *think* we're okay for now since we can take the stance that all Druid columns are non-nullable. But like Julian pointed out, some SQL features involve nulls appearing out of nowhere and the runtime will have to deal with that.

Gian


Jihoon Son

Oct 13, 2016, 9:52:07 PM
to Druid Development
Hi, it looks great and I'm very interested in your proposal!

Regarding NULLs, I think NULL value issues will arise early. For example, a simple COUNT query returns different values depending on which column it is applied to:

pgsql> select * from t;
  id
------
  10
  20
  30
  40
 null
(5 rows)

pgsql> select count(*) from t;
 count
-------
     5
(1 row)

pgsql> select count(id) from t;
 count
-------
     4
(1 row)

Regarding semi (anti) joins: subqueries with an IN clause can be rewritten as semi (anti) joins. Once they are rewritten, I think there will be some chances to execute those queries more efficiently, by using broadcast joins or by delegating the join to some other system like you said.

Jihoon


Kurt Young

Oct 14, 2016, 12:06:00 AM
to Druid Development
Great to hear that the community is starting to consider serious SQL support for Druid; it will benefit lots of users and the community itself.

But I think maybe it's not the perfect time to start on full built-in SQL support now; it will involve lots of runtime modification, or maybe a whole rewrite of the Broker. In particular, there are still some semantic issues that haven't been well discussed, like NULL values, multi-value dimensions, the difference between dimensions and metrics, and so on. What I suggest is to divide the plan into sequential steps. The first and most important one is to revisit the data model and other SQL-related issues, and make sure we clear all the obstacles to supporting SQL well in the future. For example, we could eliminate the difference between dimensions and metrics, make indexing and encoding optional, and address some other fundamental stuff. I think many of these only involve changes to index building and local query processing in the historical and realtime nodes. In the meantime, we can adopt some mature SQL query engine to work with these changes. There are indeed lots of them; we can get lots of feedback as we go, or we can just choose one to work with and see where it goes.

When everything works fine (but not perfectly) with other engines, I believe Druid's storage & query engine will be flexible enough. Then we can think about adding built-in SQL support within the Broker. Because we won't have all the limits other engines may have, we will have a good chance at more efficient SQL support. In the meanwhile, Druid can still offer useful features that other engines may not easily have, like pre-aggregation and native support for time-series data. By that time, Druid will be a very competitive product.

Best,
Kurt

Gian Merlino

Oct 14, 2016, 12:43:40 PM
to druid-de...@googlegroups.com
Jihoon,

I think we can't avoid the NULL issue forever but I think we can get away with it for a little while. Specifically I think we can avoid problems for now by claiming that all columns in Druid are NOT NULL for purposes of the SQL layer. Druid query results actually do return nulls sometimes, but they never return empty strings (internally, nulls and empty strings are represented the same way, and we chose to represent those as null). So for purposes of the SQL layer I think we can treat all those nulls as empty strings instead, and aggs like COUNT(foo) will behave as expected.
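The nulls-as-empty-strings convention can be illustrated with a small SQLite sketch (an illustration of the semantics, not Druid code): mapping NULLs to '' before aggregation makes COUNT(foo) count every row, consistent with the "all columns are NOT NULL" stance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (foo TEXT)")
conn.executemany("INSERT INTO t VALUES (?)", [("a",), (None,), ("b",)])

# Standard SQL: COUNT(foo) skips NULLs.
count_foo = conn.execute("SELECT COUNT(foo) FROM t").fetchone()[0]
# With NULLs coerced to empty strings, COUNT(foo) behaves like COUNT(*).
count_coerced = conn.execute(
    "SELECT COUNT(COALESCE(foo, '')) FROM t").fetchone()[0]
print(count_foo, count_coerced)  # prints 2 3
```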

Maybe it would be good to identify what SQL features really need proper NULL handling to support in a standard way, so we consider the issue properly. I don't have a list in mind but I know it's at least "NULLable columns" and "outer joins".

Kurt,

Even though all the groundwork to support standard SQL isn't in Druid yet, and there are some clear issues around multi-value dimensions and NULLable columns, I think it still makes sense to start working on a built-in SQL layer today. I expect that having the SQL layer and the core Druid stuff in the same project will actually accelerate the process of clearing out obstacles for proper SQL, since it'll be easy to evolve them simultaneously. If we do this, I fully expect people to keep working on integrating external engines with Druid.

What I was hoping we could do for SQL in general is a plan like:

1) Start working on a built-in SQL layer immediately, even though many capabilities will be missing.
2) Evolve core Druid together with the built-in SQL layer towards supporting more SQL features.
3) Simultaneously, anyone already working on integrating an external engine like Hive, Drill, Spark, etc, can keep working on those. They'll benefit from the improvements to core Druid from #2, and be able to push more computation down to Druid. Maybe they'll even benefit from Druid's SQL layer itself, if they are Calcite based and choose to integrate Druid's own planner rules (which will be published in a druid-sql jar on Maven).

Gian


Jihoon Son

Oct 14, 2016, 8:07:22 PM
to druid-de...@googlegroups.com
Gian, thank you for your detailed answer. 
It sounds reasonable and I agree with you.

Jihoon


Gian Merlino

Nov 9, 2016, 6:41:19 PM
to druid-de...@googlegroups.com
Hey Julian,

Is it doable to disable the enumerable/bindable implementations of project, filter, aggregate, etc, for a planner? The justification here is similar to https://issues.apache.org/jira/browse/PHOENIX-3316. In most situations I think it'd be better for queries that can't be wholly translated to Druid queries to fail at the planning stage, rather than to execute in an unscalable way.

It looks like I could remove all the relevant rules from the planner, but that seems brittle since I need to name them individually.

Gian


Julian Hyde

Nov 10, 2016, 2:18:15 AM
to druid-de...@googlegroups.com
Gian,

There’s no particularly nice way to do it, I’m afraid. You could build a planner with the rules you need (probably using the org.apache.calcite.tools.Programs class and the rule sets defined in it). Or you could selectively remove rules from an existing planner, using RelOptPlanner.removeRule. Or you could give RelOptPlanner.setRuleDescExclusionFilter a try (I see we don’t have any tests for it, but it should work).

Julian

Gian Merlino

Nov 10, 2016, 12:59:59 PM
to druid-de...@googlegroups.com
Thanks Julian, I'll give those a shot.

Gian


Gian Merlino

Nov 11, 2016, 2:50:44 PM
to druid-de...@googlegroups.com
I posted a WIP PR at: https://github.com/druid-io/druid/pull/3682, feedback is very welcome.

Gian

Gian Merlino

Nov 11, 2016, 2:56:03 PM
to druid-de...@googlegroups.com
From the original list of "things that don't work", some of them _do_ work now:

1) Pushing down filter after aggregate (HAVING)
2) Aggregation of expressions (using the math expression language)
3) WHERE FLOOR(__time TO MONTH) = x
4) Select with pagination
5) Memory usage controls on the Broker side for semi-joins

The docs in the PR (sql.md) describe some things that still don't work. Copying it here for convenience:

Druid does not support all SQL features. Most of these are due to missing features in the Druid native query language. A non-exhaustive list of unsupported features is:

- Grouping on functions of multiple columns, like concatenation: `SELECT COUNT(*) FROM druid.foo GROUP BY dim1 || ' ' || dim2`
- Grouping on long and float columns.
- Filtering on float columns.
- Filtering on non-boolean interactions between columns, like two columns equaling each other: `SELECT COUNT(*) FROM druid.foo WHERE dim1 = dim2`.
- A number of miscellaneous functions, like `TRIM`.
- Joins, other than semi-joins as described above.

Gian

sreehar...@gmail.com

Aug 21, 2017, 2:35:46 AM
to Druid Development


Hello,
Can someone help me out with SQL over HTTP? Whenever I run with verbose output it shows 405 Method Not Allowed.

Problem :

curl -XPOST -H'Content-Type: application/json' http://localhost:8082/druid/v2/sql/ -d '{"query":"SELECT COUNT(*) FROM wikiticker"}' -v

* Connected to localhost (127.0.0.1) port 8082 (#0)
> POST /druid/v2/sql/ HTTP/1.1
> User-Agent: curl/7.35.0
> Host: localhost:8082
> Accept: */*
> Content-Type: application/json
> Content-Length: 75
* upload completely sent off: 75 out of 75 bytes
< HTTP/1.1 405 Method Not Allowed
< Date: Mon, 21 Aug 2017 06:32:59 GMT
< Allow: DELETE,OPTIONS
< Content-Length: 0
* Server Jetty(9.3.16.v20170120) is not blacklisted
< Server: Jetty(9.3.16.v20170120)
<  

Gian Merlino

Aug 21, 2017, 11:58:14 AM
to druid-de...@googlegroups.com
Try adding druid.sql.enable=true to your runtime properties.

Gian


Jakub Liska

Oct 19, 2017, 5:19:25 AM
to Druid Development
Hey Gian,

I currently have druid.sql.enable=true in 0.10.1 and I get error response "Cannot POST /druid/v2/sql/".


Jakub. 



Jakub Liska

Oct 19, 2017, 5:34:29 AM
to Druid Development
OK, that was my fault. However, now:

 curl -XPOST -H'Content-Type: application/json' http://localhost/druid/v2/sql/ -d '{"query":"SELECT COUNT(*) FROM gwiq-daily-p"}'

{"error":"Unknown exception","errorMessage":"Encountered \"-\" at line 1, column 26.\nWas expecting one of:\n    <EOF> \n    \"ORDER\" ...\n    \"LIMIT\" ...\n    \"OFFSET\" ...\n    \"FETCH\" ...\n    \"NATURAL\" ...\n    \"JOIN\" ...\n    \"INNER\" ...\n    \"LEFT\" ...\n    \"RIGHT\" ...\n    \"FULL\" ...\n    \"CROSS\" ...\n    \",\" ...\n    \"OUTER\" ...\n    \"EXTEND\" ...\n    \"(\" ...\n    \".\" ...\n    \"AS\" ...\n    <IDENTIFIER> ...\n    <QUOTED_IDENTIFIER> ...\n    <BACK_QUOTED_IDENTIFIER> ...\n    <BRACKET_QUOTED_IDENTIFIER> ...\n    <UNICODE_QUOTED_IDENTIFIER> ...\n    \"TABLESAMPLE\" ...\n    \"WHERE\" ...\n    \"GROUP\" ...\n    \"HAVING\" ...\n    \"WINDOW\" ...\n    \"UNION\" ...\n    \"INTERSECT\" ...\n    \"EXCEPT\" ...\n    \"MINUS\" ...\n    ","errorClass":"org.apache.calcite.sql.parser.SqlParseException","host":null}


Jakub Liska

Oct 19, 2017, 6:11:59 AM
to Druid Development
Fixed: the table name was missing quotes. Sorry, bad day :-)
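(For anyone who hits the same parse error: a hyphen isn't legal in a bare SQL identifier, so the table name needs double quotes. The same rule can be checked in any standard-ish engine, e.g. SQLite:)

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# 'CREATE TABLE gwiq-daily-p ...' would fail to parse; quoting fixes it.
conn.execute('CREATE TABLE "gwiq-daily-p" (x INTEGER)')
conn.execute('INSERT INTO "gwiq-daily-p" VALUES (1), (2)')
n = conn.execute('SELECT COUNT(*) FROM "gwiq-daily-p"').fetchone()[0]
print(n)  # prints 2
```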

Gian Merlino

Oct 20, 2017, 1:22:26 AM
to druid-de...@googlegroups.com
Just saw this, but glad you got it figured out!

Gian


Jakub Liska

Oct 23, 2017, 12:05:31 PM
to Druid Development
Just a question: is it possible to enable logging of the native Druid queries that the SQL translates to?

I have an issue where GroupBy returns "null" instead of the split results even though it should not, and I don't know whether the problem is in the SQL engine or in Druid.

Gian Merlino

Oct 23, 2017, 1:09:32 PM
to druid-de...@googlegroups.com
It isn't at this point, although you can get some more details by prepending "EXPLAIN PLAN FOR" before your query. In 0.11.0 you will have two additional options that you don't have now,

- You could enable request logs and have the Druid queries logged to disk/emitter/etc
- When you prepend "EXPLAIN PLAN FOR" you will get the JSON that would be sent down to Druid

Gian
