Metadata query is very slow and eventually times out. Help!

Noppanit Charassinvichai

Sep 13, 2016, 6:15:45 PM
to Druid User
We're trying to use Caravel on top of Druid, and the refresh process takes a very long time and eventually times out. I tried it a couple of times and tracked the issue down to this metadata query to Druid.

{
  "queryType":"segmentMetadata",
  "dataSource":"sparrow-firehose-web",
  "intervals":["2016-09-06T22:09:55.891000+00:00/2016-09-13T22:09:55.891000+00:00"]
}

I'm not sure whether this is normal, or how to debug or improve it.

I did some digging and queried this endpoint, which I think is where the metadata comes from:

/druid/coordinator/v1/metadata/datasources/{dataSourceName}

If I use curl and write the output to a file, it's about 69MB. I'm not sure whether that's normal.


Gian Merlino

Sep 13, 2016, 6:22:42 PM
to druid...@googlegroups.com
You can cut down how heavy this query is by only asking for the "analysisTypes" you want. By default a lot of them are enabled, and they can be expensive. Also, if you don't need details of every segment, you can turn on "merge" and possibly also "lenientAggregatorMerge".

Check the Druid docs for details on all of these options.
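
As a rough illustration of the options named above, here is a minimal Python sketch that builds a trimmed-down segmentMetadata query. The helper name build_segment_metadata_query and the choice of "cardinality" as the only analysis type are assumptions made for the example; which analysis types are accepted depends on the Druid version, so check the docs before relying on a particular value.

```python
import json


def build_segment_metadata_query(data_source, interval, analysis_types,
                                 merge=True, lenient_aggregator_merge=False):
    """Build a segmentMetadata query that asks only for the listed analysis types.

    Hypothetical helper for illustration; it only assembles the JSON body.
    """
    query = {
        "queryType": "segmentMetadata",
        "dataSource": data_source,
        "intervals": [interval],
        # Only the analyses listed here are computed.
        "analysisTypes": list(analysis_types),
        # merge=True asks for one combined result instead of one per segment.
        "merge": merge,
    }
    if lenient_aggregator_merge:
        query["lenientAggregatorMerge"] = True
    return query


query = build_segment_metadata_query(
    "sparrow-firehose-web",
    "2016-09-06T22:09:55.891000+00:00/2016-09-13T22:09:55.891000+00:00",
    analysis_types=["cardinality"],  # assumption: adjust to what you need
)
print(json.dumps(query, indent=2))
```

The printed JSON can be sent to Druid in place of the original query.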

Gian

--
You received this message because you are subscribed to the Google Groups "Druid User" group.
To unsubscribe from this group and stop receiving emails from it, send an email to druid-user+unsubscribe@googlegroups.com.
To post to this group, send email to druid...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/druid-user/0dd72fec-1952-48c3-b5f6-f7489c9413be%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Slim Bouguerra

Sep 15, 2016, 3:36:38 PM
to druid...@googlegroups.com

On Sep 13, 2016, at 3:30 PM, Noppanit Charassinvichai <noppa...@gmail.com> wrote:

Thanks, I'll check out the docs; if anything, I might have to submit a PR for Caravel. Does my 69MB size sound normal, or is there an optimization I can do?

Segment metadata queries return per-segment information about dimensions and other column details, so the size of the query result is linear in the number of segments covered by the query intervals.
You might be able to reduce the number of segments by using fewer partitions and/or increasing the segment granularity, but that can have a negative impact if a physical segment ends up with more than 5M rows.
Hope that helps.
 


Gian Merlino

Sep 15, 2016, 4:53:11 PM
to druid...@googlegroups.com
If you set "merge" to true then you should get a result substantially smaller than 69MB, since you'll get a single metadata object instead of an object per segment.
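
One way to check this is to post the merged query to the broker and count the objects that come back. The sketch below uses only the Python standard library; the broker address http://localhost:8082 and the helper names are assumptions for illustration, while /druid/v2/ is the broker's query endpoint.

```python
import json
import urllib.request

# Assumption: replace with the address of your own broker.
BROKER_URL = "http://localhost:8082/druid/v2/"


def build_request(query, broker_url=BROKER_URL):
    """Wrap a Druid query dict in an HTTP POST request (nothing is sent yet)."""
    body = json.dumps(query).encode("utf-8")
    return urllib.request.Request(
        broker_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def count_result_objects(query, broker_url=BROKER_URL, timeout=60):
    """Send the query and return how many metadata objects the broker returned."""
    with urllib.request.urlopen(build_request(query, broker_url), timeout=timeout) as resp:
        return len(json.load(resp))


merged_query = {
    "queryType": "segmentMetadata",
    "dataSource": "sparrow-firehose-web",
    "intervals": ["2016-09-06T22:09:55.891000+00:00/2016-09-13T22:09:55.891000+00:00"],
    "merge": True,
}
request = build_request(merged_query)
print(request.get_method(), request.full_url)
```

Calling count_result_objects(merged_query) against a live broker should report a single object when "merge" is true, compared with one object per segment when it is left out.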

Gian


Slim Bouguerra

Sep 16, 2016, 6:24:20 PM
to druid...@googlegroups.com
Hi,
Socket connections are treated like files and each one uses a file descriptor, which is a limited resource, so you need to increase the limit for open files.
Run 'ulimit -a' to find out how many open file handles are allowed per process.
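
The same limit can also be read programmatically. The following is a small Python sketch (Unix only) that prints the open-file limit and, on Linux, the number of descriptors currently open. Note that it inspects the Python process it runs in, not the broker; for the broker, run the check as the same user from the same shell that launches it, or look at /proc/<broker pid>/limits. The function names are made up for this example.

```python
import os
import resource


def open_file_limits():
    """Return the (soft, hard) limit on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def count_open_fds(pid="self"):
    """Count open descriptors of a process via /proc (Linux only).

    Reading another user's process usually requires matching permissions.
    """
    return len(os.listdir(f"/proc/{pid}/fd"))


soft, hard = open_file_limits()
print(f"soft limit: {soft}, hard limit: {hard}")

# /proc is not available on every platform, so guard the Linux-only part.
if os.path.isdir("/proc/self/fd"):
    print(f"descriptors open in this process: {count_open_fds()}")
```

If the broker's open descriptor count sits close to the soft limit while queries fan out to many historical nodes, that is consistent with the "Too many open files" error.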
-- 

B-Slim
_______/\/\/\_______/\/\/\_______/\/\/\_______/\/\/\_______/\/\/\_______

On Sep 16, 2016, at 5:21 AM, Noppanit Charassinvichai <noppa...@gmail.com> wrote:

I tried with "merge": true and I still get the timeout from the broker:

{
  "queryType":"segmentMetadata",
  "dataSource":"datasource",
  "intervals":["2016-09-09T02:40:27.468000+00:00/2016-09-16T02:40:27.468000+00:00"],
  "merge": true
}

It also looks like I got a "Too many open files" exception from the brokers. I'm not sure how to debug this.

org.jboss.netty.channel.ChannelException: Failed to open a socket.
        at org.jboss.netty.channel.socket.nio.NioClientSocketChannel.newSocket(NioClientSocketChannel.java:43) ~[netty-3.10.4.Final.jar:?]
        at org.jboss.netty.channel.socket.nio.NioClientSocketChannel.<init>(NioClientSocketChannel.java:82) ~[netty-3.10.4.Final.jar:?]
        at org.jboss.netty.channel.socket.nio.NioClientSocketChannelFactory.newChannel(NioClientSocketChannelFactory.java:212) ~[netty-3.10.4.Final.jar:?]
        at org.jboss.netty.channel.socket.nio.NioClientSocketChannelFactory.newChannel(NioClientSocketChannelFactory.java:82) ~[netty-3.10.4.Final.jar:?]
        at org.jboss.netty.bootstrap.ClientBootstrap.connect(ClientBootstrap.java:212) ~[netty-3.10.4.Final.jar:?]
        at org.jboss.netty.bootstrap.ClientBootstrap.connect(ClientBootstrap.java:182) ~[netty-3.10.4.Final.jar:?]
        at com.metamx.http.client.pool.ChannelResourceFactory.generate(ChannelResourceFactory.java:84) ~[http-client-1.0.4.jar:?]
        at com.metamx.http.client.pool.ChannelResourceFactory.generate(ChannelResourceFactory.java:41) ~[http-client-1.0.4.jar:?]
        at com.metamx.http.client.pool.ResourcePool$ImmediateCreationResourceHolder.get(ResourcePool.java:188) ~[http-client-1.0.4.jar:?]
        at com.metamx.http.client.pool.ResourcePool.take(ResourcePool.java:76) ~[http-client-1.0.4.jar:?]
        at com.metamx.http.client.NettyHttpClient.go(NettyHttpClient.java:133) ~[http-client-1.0.4.jar:?]
        at com.metamx.http.client.AbstractHttpClient.go(AbstractHttpClient.java:14) ~[http-client-1.0.4.jar:?]
        at io.druid.client.DirectDruidClient.run(DirectDruidClient.java:326) ~[druid-server-0.9.0.jar:0.9.0]
        at io.druid.client.CachingClusteredClient$2.addSequencesFromServer(CachingClusteredClient.java:386) ~[druid-server-0.9.0.jar:0.9.0]
        at io.druid.client.CachingClusteredClient$2.get(CachingClusteredClient.java:316) ~[druid-server-0.9.0.jar:0.9.0]
        at io.druid.client.CachingClusteredClient$2.get(CachingClusteredClient.java:310) ~[druid-server-0.9.0.jar:0.9.0]
        at com.metamx.common.guava.LazySequence.toYielder(LazySequence.java:43) ~[java-util-0.27.7.jar:?]
        at io.druid.query.RetryQueryRunner$1.toYielder(RetryQueryRunner.java:105) ~[druid-processing-0.9.0.jar:0.9.0]
        at io.druid.common.guava.CombiningSequence.toYielder(CombiningSequence.java:79) ~[druid-common-0.9.0.jar:0.9.0]
        at com.metamx.common.guava.MappedSequence.toYielder(MappedSequence.java:46) ~[java-util-0.27.7.jar:?]
        at com.metamx.common.guava.MappedSequence.toYielder(MappedSequence.java:46) ~[java-util-0.27.7.jar:?]
        at io.druid.query.CPUTimeMetricQueryRunner$1.toYielder(CPUTimeMetricQueryRunner.java:93) ~[druid-processing-0.9.0.jar:0.9.0]
        at com.metamx.common.guava.Sequences$1.toYielder(Sequences.java:98) ~[java-util-0.27.7.jar:?]
        at io.druid.server.QueryResource.doPost(QueryResource.java:167) [druid-server-0.9.0.jar:0.9.0]
        at sun.reflect.GeneratedMethodAccessor27.invoke(Unknown Source) ~[?:?]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.7.0_101]
        at java.lang.reflect.Method.invoke(Method.java:606) ~[?:1.7.0_101]
        at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60) [jersey-server-1.19.jar:1.19]
        at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:205) [jersey-server-1.19.jar:1.19]
        at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75) [jersey-server-1.19.jar:1.19]
        at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:302) [jersey-server-1.19.jar:1.19]
        at com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108) [jersey-server-1.19.jar:1.19]
        at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147) [jersey-server-1.19.jar:1.19]
        at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84) [jersey-server-1.19.jar:1.19]
        at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1542) [jersey-server-1.19.jar:1.19]
        at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1473) [jersey-server-1.19.jar:1.19]
        at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1419) [jersey-server-1.19.jar:1.19]
        at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1409) [jersey-server-1.19.jar:1.19]
        at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:409) [jersey-servlet-1.19.jar:1.19]
        at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:558) [jersey-servlet-1.19.jar:1.19]
        at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:733) [jersey-servlet-1.19.jar:1.19]
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) [javax.servlet-api-3.1.0.jar:3.1.0]
        at com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:278) [guice-servlet-4.0-beta.jar:?]
        at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:268) [guice-servlet-4.0-beta.jar:?]
        at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:180) [guice-servlet-4.0-beta.jar:?]
        at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:93) [guice-servlet-4.0-beta.jar:?]
        at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:85) [guice-servlet-4.0-beta.jar:?]
        at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:120) [guice-servlet-4.0-beta.jar:?]
        at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:132) [guice-servlet-4.0-beta.jar:?]
        at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:129) [guice-servlet-4.0-beta.jar:?]
        at com.google.inject.servlet.GuiceFilter$Context.call(GuiceFilter.java:206) [guice-servlet-4.0-beta.jar:?]
        at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:129) [guice-servlet-4.0-beta.jar:?]
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652) [jetty-servlet-9.2.5.v20141112.jar:9.2.5.v20141112]
        at org.eclipse.jetty.servlets.UserAgentFilter.doFilter(UserAgentFilter.java:83) [jetty-servlets-9.2.5.v20141112.jar:9.2.5.v20141112]
        at org.eclipse.jetty.servlets.GzipFilter.doFilter(GzipFilter.java:364) [jetty-servlets-9.2.5.v20141112.jar:9.2.5.v20141112]
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652) [jetty-servlet-9.2.5.v20141112.jar:9.2.5.v20141112]
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585) [jetty-servlet-9.2.5.v20141112.jar:9.2.5.v20141112]
        at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:221) [jetty-server-9.2.5.v20141112.jar:9.2.5.v20141112]
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1125) [jetty-server-9.2.5.v20141112.jar:9.2.5.v20141112]
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515) [jetty-servlet-9.2.5.v20141112.jar:9.2.5.v20141112]
        at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185) [jetty-server-9.2.5.v20141112.jar:9.2.5.v20141112]
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1059) [jetty-server-9.2.5.v20141112.jar:9.2.5.v20141112]
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) [jetty-server-9.2.5.v20141112.jar:9.2.5.v20141112]
        at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52) [jetty-server-9.2.5.v20141112.jar:9.2.5.v20141112]
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97) [jetty-server-9.2.5.v20141112.jar:9.2.5.v20141112]
        at org.eclipse.jetty.server.Server.handle(Server.java:497) [jetty-server-9.2.5.v20141112.jar:9.2.5.v20141112]
        at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310) [jetty-server-9.2.5.v20141112.jar:9.2.5.v20141112]
        at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:248) [jetty-server-9.2.5.v20141112.jar:9.2.5.v20141112]
        at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540) [jetty-io-9.2.5.v20141112.jar:9.2.5.v20141112]
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:620) [jetty-util-9.2.5.v20141112.jar:9.2.5.v20141112]
        at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:540) [jetty-util-9.2.5.v20141112.jar:9.2.5.v20141112]
        at java.lang.Thread.run(Thread.java:745) [?:1.7.0_101]
Caused by: java.net.SocketException: Too many open files
        at sun.nio.ch.Net.socket0(Native Method) ~[?:1.7.0_101]
        at sun.nio.ch.Net.socket(Net.java:441) ~[?:1.7.0_101]
        at sun.nio.ch.Net.socket(Net.java:434) ~[?:1.7.0_101]
        at sun.nio.ch.SocketChannelImpl.<init>(SocketChannelImpl.java:105) ~[?:1.7.0_101]
        at sun.nio.ch.SelectorProviderImpl.openSocketChannel(SelectorProviderImpl.java:60) ~[?:1.7.0_101]
        at java.nio.channels.SocketChannel.open(SocketChannel.java:142) ~[?:1.7.0_101]
        at org.jboss.netty.channel.socket.nio.NioClientSocketChannel.newSocket(NioClientSocketChannel.java:41) ~[netty-3.10.4.Final.jar:?]
        ... 71 more
