SQL queries on 0.10: "Table not found" for one of two brokers

1,495 views
Skip to first unread message

Alex Charlton

unread,
May 3, 2017, 4:41:17 PM5/3/17
to Druid User
Hi,

I've been evaluating the use of Druid and recently just upgraded our cluster from 0.9 to 0.10. Aside from needing to configure the allowed memory for everything, this process went relatively smoothly. The one remaining issue is puzzling. 

The ability to perform SQL queries was a major selling point internally for this new release. Unfortunately, in our cluster that contains two brokers, only one seems to be able to execute SQL queries. The other is seemingly unable to find our table.

root@druid-broker:/druid/druid-dist> curl -X POST -H'Content-Type: application/json' http://localhost:8082/druid/v2/sql/ -d '{"query":"SELECT COUNT(*) FROM eventer"}'

{"error":"Unknown exception","errorMessage":"org.apache.calcite.runtime.CalciteContextException: From line 1, column 15 to line 1, column 21: Table 'eventer' not found","errorClass":"org.apache.calcite.tools.ValidationException","host":null}


Queries of this sort on the other broker work just fine.

From the logs:

2017-05-03T20:19:17,582 ERROR [qtp2073333566-78] org.apache.calcite.runtime.CalciteException - org.apache.calcite.sql.validate.SqlValidatorException: Table 'eventer' not found
2017-05-03T20:19:17,583 ERROR [qtp2073333566-78] org.apache.calcite.runtime.CalciteException - org.apache.calcite.runtime.CalciteContextException: From line 1, column 15 to line
1, column 21: Table 'eventer' not found
2017-05-03T20:19:17,583 WARN [qtp2073333566-78] io.druid.sql.http.SqlResource - Failed to handle query: SqlQuery{query='SELECT * FROM eventer LIMIT 2', context={}}
org.apache.calcite.tools.ValidationException: org.apache.calcite.runtime.CalciteContextException: From line 1, column 15 to line 1, column 21: Table 'eventer' not found
        at org.apache.calcite.prepare.PlannerImpl.validate(PlannerImpl.java:187) ~[calcite-core-1.11.0.jar:1.11.0]
        at io.druid.sql.calcite.planner.DruidPlanner.plan(DruidPlanner.java:76) ~[druid-sql-0.10.0.jar:0.10.0]
        at io.druid.sql.http.SqlResource.doPost(SqlResource.java:86) [druid-sql-0.10.0.jar:0.10.0]
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_121]
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_121]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_121]
        at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_121]
        at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60) [jersey-server-1.19.jar:1.19]
        at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:205) [
jersey-server-1.19.jar:1.19]
        at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75) [jersey-server-1.19.jar:1.19]
        at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:302) [jersey-server-1.19.jar:1.19]
        at com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108) [jersey-server-1.19.jar:1.19]
        at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147) [jersey-server-1.19.jar:1.19]
        at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84) [jersey-server-1.19.jar:1.19]
        at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1542) [jersey-server-1.19.jar:1.19]
        at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1473) [jersey-server-1.19.jar:1.19]
        at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1419) [jersey-server-1.19.jar:1.19]
        at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1409) [jersey-server-1.19.jar:1.19]
        at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:409) [jersey-servlet-1.19.jar:1.19]
        at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:558) [jersey-servlet-1.19.jar:1.19]
        at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:733) [jersey-servlet-1.19.jar:1.19]
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) [javax.servlet-api-3.1.0.jar:3.1.0]
        at com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:286) [guice-servlet-4.1.0.jar:?]
        at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:276) [guice-servlet-4.1.0.jar:?]
        at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:181) [guice-servlet-4.1.0.jar:?]
        at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91) [guice-servlet-4.1.0.jar:?]
        at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:85) [guice-servlet-4.1.0.jar:?]
        at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:120) [guice-servlet-4.1.0.jar:?]
        at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:135) [guice-servlet-4.1.0.jar:?]
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759) [jetty-servlet-9.3.16.v20170120.jar:9.3.16.v20170120]
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582) [jetty-servlet-9.3.16.v20170120.jar:9.3.16.v20170120]
        at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:224) [jetty-server-9.3.16.v20170120.jar:9.3.16.v20170120]
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180) [jetty-server-9.3.16.v20170120.jar:9.3.16.v20170120]
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512) [jetty-servlet-9.3.16.v20170120.jar:9.3.16.v20170120]
        at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185) [jetty-server-9.3.16.v20170120.jar:9.3.16.v20170120]
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112) [jetty-server-9.3.16.v20170120.jar:9.3.16.v20170120]
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) [jetty-server-9.3.16.v20170120.jar:9.3.16.v20170120]
        at org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:493) [jetty-server-9.3.16.v20170120.jar:9.3.16.v20170120]
        at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52) [jetty-server-9.3.16.v20170120.jar:9.3.16.v20170120]
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134) [jetty-server-9.3.16.v20170120.jar:9.3.16.v20170120]
        at org.eclipse.jetty.server.Server.handle(Server.java:534) [jetty-server-9.3.16.v20170120.jar:9.3.16.v20170120]
        at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320) [jetty-server-9.3.16.v20170120.jar:9.3.16.v20170120]
        at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251) [jetty-server-9.3.16.v20170120.jar:9.3.16.v20170120]
        at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283) [jetty-io-9.3.16.v20170120.jar:9.3.16.v20170120]
        at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:110) [jetty-io-9.3.16.v20170120.jar:9.3.16.v20170120]
        at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93) [jetty-io-9.3.16.v20170120.jar:9.3.16.v20170120]
        at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303) [jetty-util-9.3.16.v20170120.jar:9.3.16.v20170120]
        at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148) [jetty-util-9.3.16.v20170120.jar:9.3.16.v20170120]
        at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136) [jetty-util-9.3.16.v20170120.jar:9.3.16.v20170120]
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671) [jetty-util-9.3.16.v20170120.jar:9.3.16.v20170120]
        at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589) [jetty-util-9.3.16.v20170120.jar:9.3.16.v20170120]
        at java.lang.Thread.run(Thread.java:745) [?:1.8.0_121]

Any assistance in debugging this would be appreciated. Particularly: what would cause one Broker to end up with a different context from the other?

Thanks,
Alex

Gian Merlino

unread,
May 3, 2017, 4:54:24 PM5/3/17
to druid...@googlegroups.com
Hey Alex,

The brokers populate their list of SQL tables by doing metadata queries to your data nodes. It's possible one of the brokers wasn't able to do that. I bet restarting that broker would get it able to run SQL queries. I've seen one race where a broker could make a metadata query "too fast" for a freshly loaded datasource, and thereby miss it. So you might be hitting something like that.

Do you see any exceptions in the problematic broker's log? That'd help figure out if you're seeing a known or unknown issue. If it's not currently known we'll make sure to fix it.

Gian

--
You received this message because you are subscribed to the Google Groups "Druid User" group.
To unsubscribe from this group and stop receiving emails from it, send an email to druid-user+unsubscribe@googlegroups.com.
To post to this group, send email to druid...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/druid-user/78ecf2a5-a715-4a1a-80b8-0e249c5725d9%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Alex Charlton

unread,
May 3, 2017, 5:49:06 PM5/3/17
to Druid User
Hi Gian,

Sorry, I forgot to mention that I've restarted the broker in question multiple times. Additionally there are no exceptions in the log (except for those specifically related to the query, as provided in my previous message).

My most recent restart however seems to have fixed things (for now!) I'll post back if I run into this problem again.

Thanks!

Gian

To unsubscribe from this group and stop receiving emails from it, send an email to druid-user+...@googlegroups.com.

Gian Merlino

unread,
May 4, 2017, 1:46:48 AM5/4/17
to druid...@googlegroups.com
If you ever run into it again, maybe check if you see any "Refreshing metadata for dataSource[xx]" messages or any interesting looking followups to that message.

Gian

To unsubscribe from this group and stop receiving emails from it, send an email to druid-user+unsubscribe@googlegroups.com.

To post to this group, send email to druid...@googlegroups.com.

Alex Charlton

unread,
May 6, 2017, 7:55:44 PM5/6/17
to Druid User
Hi Gian,

Restarting my brokers continues to pose problems with respect to SQL queries. There appear to be some timing issues around fetching the SQL metadata:

1) If I perform a SQL query before I see a "Refreshing metadata for dataSource[xx]" message show up in the log, that broker will always respond with a "table not found" message for all subsequent SQL queries (although non-SQL queries work fine).
2) If I perform a SQL query after I see a "Refreshing metadata for dataSource[xx]" message show up in the log, the first set of SQL queries performed may hang for many seconds. It may take more than 2 minutes for a SQL query to return without timing out. After the first SQL query has returned successfully, all subsequent queries will return promptly. 

Aside from those observations, there does not seem to be anything particularly notable going on in the logs.

Cheers,
Alex

Gian

Gian Merlino

unread,
May 7, 2017, 12:43:23 PM5/7/17
to druid...@googlegroups.com
Hmm, that's interesting and definitely seems wrong. Do you ever see messages like "Refreshed metadata for dataSource[x] in y ms."? If so how big is "y"? And what API are you using to query -- JDBC or the JSON-over-HTTP on /druid/v2/sql/?

Gian

To unsubscribe from this group and stop receiving emails from it, send an email to druid-user+unsubscribe@googlegroups.com.

To post to this group, send email to druid...@googlegroups.com.

Alex Charlton

unread,
May 7, 2017, 8:48:09 PM5/7/17
to druid...@googlegroups.com
I do see messages like that. Sometimes two, one after another, but usually just once soon after starting the broker. It usually reports something between 2000-3000ms.

I've been using the JSON-over-HTTP API.


For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to a topic in the Google Groups "Druid User" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/druid-user/A_C9pKWzc0M/unsubscribe.
To unsubscribe from this group and all its topics, send an email to druid-user+...@googlegroups.com.

To post to this group, send email to druid...@googlegroups.com.

Arpan Khagram

unread,
May 8, 2017, 4:33:28 AM5/8/17
to Druid User
Hi Gian, We also came across the same issue of Table not found.

Gian Merlino

unread,
May 8, 2017, 7:29:48 PM5/8/17
to druid...@googlegroups.com
Hmm, let me see if I can reproduce this and figure out what's going on. Any repro steps would be helpful if you have a way to make this behavior appear in a smaller installation like the quickstart.


Gian

To unsubscribe from this group and stop receiving emails from it, send an email to druid-user+unsubscribe@googlegroups.com.

To post to this group, send email to druid...@googlegroups.com.

Gian Merlino

unread,
May 26, 2017, 3:48:02 PM5/26/17
to druid...@googlegroups.com
Alex, Arpan, were either of you using multiple brokers behind load balancers?

Gian

Alex Charlton

unread,
May 26, 2017, 10:32:24 PM5/26/17
to Druid User
Gian, 

Yes, I was. This behaviour was still present when querying from/to localhost, however.

Alex

Gian


Gian

Reply all
Reply to author
Forward
0 new messages