Druid Count Distinct with Superset

Stelios Philippou

Mar 13, 2023, 7:06:15 AM
to Druid User
I am looking for some help with this.
Count Distinct in Druid is an approximation of the actual count.
This does not work well with Superset, as a lot of our counts end up wrong.

We have updated the following configurations on Druid:

druid.sql.planner.useApproximateCountDistinct → False

druid.sql.planner.useGroupingSetForExactDistinct → True

and I can see that on Druid the following is disabled when running any query:

Screen Shot 2023-03-13 at 1.03.51 PM.png

My problem now is that when using Superset, Count Distinct still does not work correctly and the data is returned wrong.

I am open to suggestions or ideas on how I can properly sort this out.



Sergio Ferragut

Mar 13, 2023, 11:38:28 AM
to Druid User
Hi Stelios,

Are you getting approximations when you query through Superset? 
One thought is that Superset has both a native connector and a SQLAlchemy-based connector to Druid. My understanding is that the configurations you set only apply to the SQL-based interface between Superset and Druid, so perhaps changing the connector would help.
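
One way to check whether the approximation comes from Druid or from the Superset side is to send the same query straight to Druid's SQL API (POST to /druid/v2/sql) and compare the result with what Superset shows. Below is a minimal sketch of that check, not a definitive implementation. It sets useApproximateCountDistinct and useGroupingSetForExactDistinct in the per-query context, which are the query-level counterparts of the two planner properties above. The URL, the helper names build_payload, run_query and compare_counts, and any datasource or column names you pass in are assumptions for illustration only.

```python
import json
import urllib.request

# Hypothetical Router address; adjust host and port for your cluster.
DRUID_SQL_URL = "http://localhost:8888/druid/v2/sql"


def build_payload(sql, exact):
    """Build a Druid SQL API request body with per-query context flags."""
    return {
        "query": sql,
        "context": {
            # False asks Druid for an exact COUNT(DISTINCT ...) on this query.
            "useApproximateCountDistinct": not exact,
            # Use grouping sets for queries with multiple exact distinct aggregations.
            "useGroupingSetForExactDistinct": exact,
        },
    }


def run_query(sql, exact, url=DRUID_SQL_URL):
    """POST the query to Druid and return the decoded JSON rows."""
    body = json.dumps(build_payload(sql, exact)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def compare_counts(sql, url=DRUID_SQL_URL):
    """Run the same query in exact and approximate mode, side by side."""
    return {
        "exact": run_query(sql, exact=True, url=url),
        "approximate": run_query(sql, exact=False, url=url),
    }
```

Calling compare_counts with a query such as SELECT COUNT(DISTINCT "user_id") FROM "my_datasource" (both names are placeholders) should return two result sets. If the exact result matches the expected count but Superset still shows a different number, the difference is probably introduced on the Superset connection rather than in Druid.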