Connecting to Druid from Spark via JDBC


Ben Vogan

May 23, 2017, 5:06:23 PM
to druid...@googlegroups.com
Hi all,

I am interested in querying Druid via Spark.  I know there is a separate project for doing so (https://github.com/SparklineData/spark-druid-olap), but I was curious whether the new JDBC support might be a better-supported option.

I am wholly unfamiliar with the Avatica driver and I am unclear as to what class is the proper entry point.
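For reference, here is a minimal sketch of the connection parameters involved. The driver class and URL scheme below follow the Avatica remote driver convention used against a Druid broker; the host name is a placeholder:

```scala
// Builds the Avatica remote JDBC URL for a Druid broker.
// Host and port are placeholders for your own broker.
def avaticaUrl(host: String, port: Int): String =
  s"jdbc:avatica:remote:url=http://$host:$port/druid/v2/sql/avatica/"

// The remote driver class is the usual entry point for Avatica JDBC clients.
val driverClass = "org.apache.calcite.avatica.remote.Driver"

println(avaticaUrl("mydruidbroker", 8082))
```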

I have tried:

val druidDf = sqlContext.read.format("jdbc").options(Map("url" -> "jdbc:avatica:remote:url=http://mydruidbroker:8082/druid/v2/sql/avatica/", "dbtable" -> "mydruidtable", "driver" -> "org.apache.calcite.avatica.remote.Driver", "fetchSize"->"10000")).load()

But this gives me an UnsupportedOperationException.

I tried changing the driver to org.apache.calcite.avatica.UnregisteredDriver but this gives me:

java.lang.IllegalAccessException: Class org.apache.spark.sql.execution.datasources.jdbc.DriverRegistry$ can not access a member of class org.apache.calcite.avatica.UnregisteredDriver with modifiers "protected"

I presume this is because the constructor is protected.

If someone can point me in the correct direction I would greatly appreciate it.

Thanks,
--
BENJAMIN VOGAN | Data Platform Team Lead

Gian Merlino

May 23, 2017, 5:22:23 PM
to druid...@googlegroups.com
Do you have a message or stack trace for the UnsupportedOperationException? That'd help.

The Spark docs have a troubleshooting step that talks about getting a classloader set up, which may or may not be related (https://spark.apache.org/docs/latest/sql-programming-guide.html#jdbc-to-other-databases):

> The JDBC driver class must be visible to the primordial class loader on the client session and on all executors. This is because Java’s DriverManager class does a security check that results in it ignoring all drivers not visible to the primordial class loader when one goes to open a connection. One convenient way to do this is to modify compute_classpath.sh on all worker nodes to include your driver JARs.

Gian

--
You received this message because you are subscribed to the Google Groups "Druid User" group.
To unsubscribe from this group and stop receiving emails from it, send an email to druid-user+unsubscribe@googlegroups.com.
To post to this group, send email to druid...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/druid-user/CAAoNsd%3Dgtb%2BFVzXXD2Z4eL304dusuGDvwKUDwSf1wW0ueV7-kA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Ben Vogan

May 24, 2017, 1:33:54 PM
to druid...@googlegroups.com
My apologies for the delay.  It appears to be a problem using prepared statements:

scala> val dw2 = sqlContext.read.format("jdbc").options(Map("url" -> "jdbc:avatica:remote:url=http://jarvis-druid-query002:8082/druid/v2/sql/avatica/", "dbtable" -> "sor_business_events_all", "driver" -> "org.apache.calcite.avatica.remote.Driver", "fetchSize"->"10000")).load()

java.lang.UnsupportedOperationException
at org.apache.calcite.avatica.AvaticaConnection.prepareStatement(AvaticaConnection.java:275)
at org.apache.calcite.avatica.AvaticaConnection.prepareStatement(AvaticaConnection.java:121)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:122)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation.<init>(JDBCRelation.scala:91)
at org.apache.spark.sql.execution.datasources.jdbc.DefaultSource.createRelation(DefaultSource.scala:57)
at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:158)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:25)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:30)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:32)
at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:34)
at $iwC$$iwC$$iwC$$iwC.<init>(<console>:36)
at $iwC$$iwC$$iwC.<init>(<console>:38)
at $iwC$$iwC.<init>(<console>:40)
at $iwC.<init>(<console>:42)
at <init>(<console>:44)
at .<init>(<console>:48)
at .<clinit>(<console>)
at .<init>(<console>:7)
at .<clinit>(<console>)
at $print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1045)
at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1326)
at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:821)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:852)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:800)
at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:657)
at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:665)
at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:670)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:997)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1064)
at org.apache.spark.repl.Main$.main(Main.scala:31)
at org.apache.spark.repl.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)




Gian Merlino

May 25, 2017, 5:10:13 AM
to druid...@googlegroups.com
Hmm, looks like something missing in Avatica. Is it possible to get Spark to avoid using prepared statements?
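If Spark cannot be steered away from prepareStatement, one possible workaround (a sketch only, assuming the Avatica jar is on the classpath and a broker is reachable; the URL and SQL are placeholders) is to bypass Spark's JDBC reader and go through plain java.sql with an unprepared Statement:

```scala
import java.sql.DriverManager

// Hypothetical workaround sketch: query Druid through plain JDBC using a
// Statement (createStatement/executeQuery), so prepareStatement is never
// called. The broker URL and SQL passed in are placeholders.
def queryDruid(url: String, sql: String): Seq[String] = {
  val conn = DriverManager.getConnection(url)
  try {
    val stmt = conn.createStatement()   // no prepared statement involved
    val rs   = stmt.executeQuery(sql)
    val out  = scala.collection.mutable.Buffer[String]()
    while (rs.next()) out += rs.getString(1)
    out.toSeq
  } finally conn.close()
}
```

The resulting rows could then be turned into a DataFrame by hand (e.g. via sqlContext.createDataFrame), at the cost of losing Spark's JDBC partitioning and pushdown.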

The Calcite folks may also be able to help with this, maybe they can shed some light on why the method isn't implemented.

Gian

Ben Vogan

May 25, 2017, 12:17:20 PM
to druid...@googlegroups.com
Thanks Gian.  I have opened a ticket with the Avatica/Calcite folks.

--Ben



lethuy...@gmail.com

Jun 11, 2018, 5:32:42 AM
to Druid User
Hi guys,

Does anybody know where we should copy avatica.jar in Druid?
How can we connect Druid with Tableau or Grafana (as a SQL datasource)?

Thank you so much,

zhangxin...@gmail.com

Jul 14, 2018, 11:38:47 AM
to Druid User
There is a Druid plugin for Grafana; you can use it.

On Monday, June 11, 2018 at 5:32:42 PM UTC+8, lethuy...@gmail.com wrote:

seyyed Safavie

Aug 10, 2020, 2:27:53 PM
to Druid User
This is a problem for me too.
Did you solve this problem?


vijay narayanan

Aug 10, 2020, 11:14:45 PM
to druid...@googlegroups.com
This works for me:

val dw2 = sqlContext.read.format("jdbc").options(Map("url" -> "jdbc:avatica:remote:url=http://localhost:8082/druid/v2/sql/avatica/", "dbtable" -> "Fire_Department_Calls_for_service", "driver" -> "org.apache.calcite.avatica.remote.Driver", "fetchSize"->"10000")).load()

dw2: org.apache.spark.sql.DataFrame = [ALS Unit: string, Address: string ... 43 more fields]



I ran spark-shell like: ./spark-shell --driver-class-path ../../avatica-1.12.0.jar --jars ../../avatica-1.12.0.jar



vijay

