Exception while sending request
2017-09-06 11:07:30,597 WARN i.d.i.k.KafkaIndexTaskClient [KafkaIndexTaskClient-prod_sample_1_events-0] Exception while sending request
io.druid.java.util.common.IAE: Received 400 Bad Request with body: Task must be paused before changing the end offsets
at io.druid.indexing.kafka.KafkaIndexTaskClient.submitRequest(KafkaIndexTaskClient.java:515) [druid-kafka-indexing-service-0.10.1.jar:0.10.1]
at io.druid.indexing.kafka.KafkaIndexTaskClient.setEndOffsets(KafkaIndexTaskClient.java:293) [druid-kafka-indexing-service-0.10.1.jar:0.10.1]
at io.druid.indexing.kafka.KafkaIndexTaskClient$11.call(KafkaIndexTaskClient.java:429) [druid-kafka-indexing-service-0.10.1.jar:0.10.1]
at io.druid.indexing.kafka.KafkaIndexTaskClient$11.call(KafkaIndexTaskClient.java:425) [druid-kafka-indexing-service-0.10.1.jar:0.10.1]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_141]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_141]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_141]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_141]
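This 400 happens when the supervisor asks a task to change its end offsets while the task is not paused; the supervisor normally issues a pause first, and if that pause request times out or arrives out of order, the setEndOffsets call is rejected. A toy Python model (not Druid code) of the rule the task enforces:

```python
# Simplified model of a Kafka index task's state rule: end offsets may only
# be changed while the task is PAUSED. This mirrors the 400 response
# "Task must be paused before changing the end offsets" in the log above.
class KafkaIndexTaskModel:
    def __init__(self):
        self.status = "READING"
        self.end_offsets = None

    def pause(self):
        self.status = "PAUSED"

    def resume(self):
        self.status = "READING"

    def set_end_offsets(self, offsets):
        if self.status != "PAUSED":
            # The real task answers HTTP 400 here instead of raising.
            raise ValueError("Task must be paused before changing the end offsets")
        self.end_offsets = offsets

# The happy path the supervisor intends: pause, set offsets, resume.
task = KafkaIndexTaskModel()
task.pause()
task.set_end_offsets({0: 306648700})
task.resume()
```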
Exception while getting current/latest offsets
2017-09-06 11:28:43,877 WARN i.d.i.k.s.KafkaSupervisor [KafkaSupervisor-prod_sample_1_events-Reporting-0] Exception while getting current/latest offsets
java.util.concurrent.TimeoutException: Timeout waiting for task.
at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:269) ~[guava-16.0.1.jar:?]
at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:96) ~[guava-16.0.1.jar:?]
at io.druid.indexing.kafka.supervisor.KafkaSupervisor.updateCurrentOffsets(KafkaSupervisor.java:1802) ~[?:?]
at io.druid.indexing.kafka.supervisor.KafkaSupervisor.lambda$updateCurrentAndLatestOffsets$12(KafkaSupervisor.java:1810) ~[?:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_141]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_141]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_141]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_141]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_141]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_141]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_141]
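The supervisor gives each HTTP call to a task a bounded time to complete, so on a loaded cluster these TimeoutExceptions can sometimes be reduced by raising the chat timeout/retries in the supervisor spec's tuningConfig. The field names below are from my reading of the 0.10.x Kafka supervisor docs and should be verified there before use:

```json
{
  "tuningConfig": {
    "type": "kafka",
    "httpTimeout": "PT30S",
    "chatRetries": 8,
    "chatThreads": 4
  }
}
```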
Kafka partitions [,,,] do not match task partitions [,,,] (occurs frequently)
2017-09-06 11:25:31,410 WARN i.d.i.k.s.KafkaSupervisor [KafkaSupervisor-prod_sample_1_events-Reporting-0] Lag metric: Kafka partitions [0, 1, 2, 3, 4, 5, 6, 7] do not match task partitions [1, 2, 3, 4, 6]
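The lag metric can only be reported when the partition set Kafka advertises matches the union of partitions assigned to running tasks; this warning means some partitions currently have no active task. The mismatch in the log above works out to:

```python
# Compare the partition set reported by Kafka with the partitions covered by
# running tasks (both taken from the warning logged above).
kafka_partitions = {0, 1, 2, 3, 4, 5, 6, 7}  # from the broker
task_partitions = {1, 2, 3, 4, 6}            # from running tasks

unassigned = sorted(kafka_partitions - task_partitions)
print(unassigned)  # [0, 5, 7] -- partitions with no task, so no lag can be computed
```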
Cannot get a connection, general error
2017-09-06 11:44:10,741 WARN i.d.j.u.c.RetryUtils [qtp1373220972-115] Failed on try 1, retrying in 720ms.
org.skife.jdbi.v2.exceptions.UnableToObtainConnectionException: java.sql.SQLException: Cannot get a connection, general error
at org.skife.jdbi.v2.DBI.open(DBI.java:230) ~[jdbi-2.63.1.jar:2.63.1]
at org.skife.jdbi.v2.DBI.withHandle(DBI.java:279) ~[jdbi-2.63.1.jar:2.63.1]
at io.druid.metadata.SQLMetadataConnector$2.call(SQLMetadataConnector.java:129) ~[druid-server-0.10.1.jar:0.10.1]
at io.druid.java.util.common.RetryUtils.retry(RetryUtils.java:63) [java-util-0.10.1.jar:0.10.1]
at io.druid.java.util.common.RetryUtils.retry(RetryUtils.java:81) [java-util-0.10.1.jar:0.10.1]
at io.druid.metadata.SQLMetadataConnector.retryWithHandle(SQLMetadataConnector.java:133) [druid-server-0.10.1.jar:0.10.1]
at io.druid.metadata.SQLMetadataConnector.retryWithHandle(SQLMetadataConnector.java:142) [druid-server-0.10.1.jar:0.10.1]
at io.druid.metadata.IndexerSQLMetadataStorageCoordinator.getUsedSegmentsForIntervals(IndexerSQLMetadataStorageCoordinator.java:127) [druid-server-0.10.1.jar:0.10.1]
at io.druid.metadata.IndexerSQLMetadataStorageCoordinator.getUsedSegmentsForInterval(IndexerSQLMetadataStorageCoordinator.java:119) [druid-server-0.10.1.jar:0.10.1]
at io.druid.indexing.common.actions.SegmentAllocateAction.perform(SegmentAllocateAction.java:156) [druid-indexing-service-0.10.1.jar:0.10.1]
at io.druid.indexing.common.actions.SegmentAllocateAction.perform(SegmentAllocateAction.java:54) [druid-indexing-service-0.10.1.jar:0.10.1]
at io.druid.indexing.common.actions.LocalTaskActionClient.submit(LocalTaskActionClient.java:64) [druid-indexing-service-0.10.1.jar:0.10.1]
at io.druid.indexing.overlord.http.OverlordResource$3.apply(OverlordResource.java:341) [druid-indexing-service-0.10.1.jar:0.10.1]
at io.druid.indexing.overlord.http.OverlordResource$3.apply(OverlordResource.java:330) [druid-indexing-service-0.10.1.jar:0.10.1]
at io.druid.indexing.overlord.http.OverlordResource.asLeaderWith(OverlordResource.java:673) [druid-indexing-service-0.10.1.jar:0.10.1]
at io.druid.indexing.overlord.http.OverlordResource.doAction(OverlordResource.java:327) [druid-indexing-service-0.10.1.jar:0.10.1]
at sun.reflect.GeneratedMethodAccessor55.invoke(Unknown Source) ~[?:?]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_141]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_141]
at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60) [jersey-server-1.19.3.jar:1.19.3]
at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:205) [jersey-server-1.19.3.jar:1.19.3]
at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75) [jersey-server-1.19.3.jar:1.19.3]
at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:302) [jersey-server-1.19.3.jar:1.19.3]
at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147) [jersey-server-1.19.3.jar:1.19.3]
at com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108) [jersey-server-1.19.3.jar:1.19.3]
at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147) [jersey-server-1.19.3.jar:1.19.3]
at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84) [jersey-server-1.19.3.jar:1.19.3]
at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1542) [jersey-server-1.19.3.jar:1.19.3]
at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1473) [jersey-server-1.19.3.jar:1.19.3]
at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1419) [jersey-server-1.19.3.jar:1.19.3]
at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1409) [jersey-server-1.19.3.jar:1.19.3]
at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:409) [jersey-servlet-1.19.3.jar:1.19.3]
at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:558) [jersey-servlet-1.19.3.jar:1.19.3]
at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:733) [jersey-servlet-1.19.3.jar:1.19.3]
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) [javax.servlet-api-3.1.0.jar:3.1.0]
at com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:286) [guice-servlet-4.1.0.jar:?]
at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:276) [guice-servlet-4.1.0.jar:?]
at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:181) [guice-servlet-4.1.0.jar:?]
at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91) [guice-servlet-4.1.0.jar:?]
at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:85) [guice-servlet-4.1.0.jar:?]
at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:120) [guice-servlet-4.1.0.jar:?]
at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:135) [guice-servlet-4.1.0.jar:?]
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759) [jetty-servlet-9.3.19.v20170502.jar:9.3.19.v20170502]
at io.druid.server.http.RedirectFilter.doFilter(RedirectFilter.java:72) [druid-server-0.10.1.jar:0.10.1]
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759) [jetty-servlet-9.3.19.v20170502.jar:9.3.19.v20170502]
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582) [jetty-servlet-9.3.19.v20170502.jar:9.3.19.v20170502]
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:224) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512) [jetty-servlet-9.3.19.v20170502.jar:9.3.19.v20170502]
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
at org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:493) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
at org.eclipse.jetty.server.Server.handle(Server.java:534) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283) [jetty-io-9.3.19.v20170502.jar:9.3.19.v20170502]
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108) [jetty-io-9.3.19.v20170502.jar:9.3.19.v20170502]
at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93) [jetty-io-9.3.19.v20170502.jar:9.3.19.v20170502]
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671) [jetty-util-9.3.19.v20170502.jar:9.3.19.v20170502]
at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589) [jetty-util-9.3.19.v20170502.jar:9.3.19.v20170502]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_141]
Caused by: java.sql.SQLException: Cannot get a connection, general error
at org.apache.commons.dbcp2.PoolingDataSource.getConnection(PoolingDataSource.java:131) ~[commons-dbcp2-2.0.1.jar:2.0.1]
at org.apache.commons.dbcp2.BasicDataSource.getConnection(BasicDataSource.java:1413) ~[commons-dbcp2-2.0.1.jar:2.0.1]
at org.skife.jdbi.v2.DataSourceConnectionFactory.openConnection(DataSourceConnectionFactory.java:36) ~[jdbi-2.63.1.jar:2.63.1]
at org.skife.jdbi.v2.DBI.open(DBI.java:212) ~[jdbi-2.63.1.jar:2.63.1]
... 63 more
Caused by: java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014) ~[?:1.8.0_141]
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2048) ~[?:1.8.0_141]
at org.apache.commons.pool2.impl.LinkedBlockingDeque.takeFirst(LinkedBlockingDeque.java:524) ~[commons-pool2-2.2.jar:2.2]
at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:438) ~[commons-pool2-2.2.jar:2.2]
at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:361) ~[commons-pool2-2.2.jar:2.2]
at org.apache.commons.dbcp2.PoolingDataSource.getConnection(PoolingDataSource.java:119) ~[commons-dbcp2-2.0.1.jar:2.0.1]
at org.apache.commons.dbcp2.BasicDataSource.getConnection(BasicDataSource.java:1413) ~[commons-dbcp2-2.0.1.jar:2.0.1]
at org.skife.jdbi.v2.DataSourceConnectionFactory.openConnection(DataSourceConnectionFactory.java:36) ~[jdbi-2.63.1.jar:2.63.1]
at org.skife.jdbi.v2.DBI.open(DBI.java:212) ~[jdbi-2.63.1.jar:2.63.1]
... 63 more
... 63 more
Couldn't stop Thread, and Failed to get task runner
2017-09-06 11:44:25,739 WARN o.e.j.u.t.QueuedThreadPool [Thread-41] qtp1373220972{STOPPING,40<=40<=40,i=0,q=36} Couldn't stop Thread[qtp1373220972-102,5,main]
2017-09-06 11:44:25,739 WARN o.e.j.u.t.QueuedThreadPool [Thread-41] qtp1373220972{STOPPING,40<=40<=40,i=0,q=36} Couldn't stop Thread[qtp1373220972-84,5,main]
2017-09-06 11:44:25,739 WARN o.e.j.u.t.QueuedThreadPool [Thread-41] qtp1373220972{STOPPING,40<=40<=40,i=0,q=36} Couldn't stop Thread[qtp1373220972-91,5,main]
2017-09-06 11:51:03,706 ERROR i.d.i.k.s.KafkaSupervisor [KafkaIndexTaskClient-prod_sample_1_events-0] Failed to get task runner because I'm not the leader!
druid.host={{overlord-fqdn}}
druid.port={{overlord-port}}
druid.service=druid/overlord
# # Upload all task logs to deep storage
# druid.indexer.logs.type=s3
# druid.indexer.logs.s3Bucket={{druid-s3-bucket-name}}
# druid.indexer.logs.s3Prefix=prod/logs/v1
# Run in remote mode
druid.indexer.runner.type=remote
druid.indexer.runner.minWorkerVersion=#{WORKER_VERSION}
# Store all task state in the metadata storage
druid.indexer.storage.type=metadata
druid.monitoring.monitors=["com.metamx.metrics.SysMonitor","io.druid.client.cache.CacheMonitor","io.druid.server.metrics.EventReceiverFirehoseMonitor","com.metamx.metrics.JvmMonitor"]
druid.host={{mm-fqdn}}
druid.port={{mm-port}}
druid.service=druid/middlemanager
druid.monitoring.monitors=["com.metamx.metrics.SysMonitor","io.druid.client.cache.CacheMonitor","io.druid.server.metrics.EventReceiverFirehoseMonitor","com.metamx.metrics.JvmMonitor"]
# Resources for peons
druid.indexer.runner.javaOpts=-server -Xmx6g -XX:MaxDirectMemorySize=4g -XX:+UseG1GC -XX:MaxGCPauseMillis=100 -Dcom.amazonaws.services.s3.enableV4
druid.indexer.task.baseTaskDir=/persistent/task/
# # Store task logs in deep storage --> please refer to runtime.properties in the _common dir
# druid.indexer.logs.type=s3
# druid.indexer.logs.s3Bucket=s3-deep-storage
# druid.indexer.logs.s3Prefix=prod/logs/v1
# Peon properties
druid.indexer.fork.property.druid.monitoring.monitors=["com.metamx.metrics.JvmMonitor"]
druid.indexer.fork.property.druid.processing.buffer.sizeBytes=1073741824
druid.indexer.fork.property.druid.processing.numThreads=2
druid.indexer.fork.property.druid.processing.numMergeBuffers=1
#druid.indexer.fork.property.druid.segmentCache.locations=[{"path": "/persistent/zk_druid", "maxSize": 0}]
druid.indexer.fork.property.druid.server.http.numThreads=100
druid.indexer.fork.property.druid.storage.archiveBaseKey=prod
druid.indexer.fork.property.druid.storage.archiveBucket={{druid-s3-bucket-name}}-archive
druid.indexer.fork.property.druid.storage.baseKey=prod/v1
druid.indexer.fork.property.druid.storage.bucket={{druid-s3-bucket-name}}
druid.indexer.fork.property.druid.storage.type=s3
druid.worker.capacity=11
druid.worker.ip=localhost
druid.worker.version=0
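Worth noting about the config above: each peon can claim 6 GB heap plus 4 GB direct memory, and druid.worker.capacity allows 11 peons per host, so a fully loaded middleManager can demand roughly:

```python
# Back-of-the-envelope peak memory per middleManager implied by the config:
# -Xmx6g and -XX:MaxDirectMemorySize=4g per peon, druid.worker.capacity=11.
heap_gb, direct_gb, capacity = 6, 4, 11
peak_gb = (heap_gb + direct_gb) * capacity
print(peak_gb)  # 110 GB, before the middleManager's own JVM and OS overhead
```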
--
You received this message because you are subscribed to the Google Groups "Druid User" group.
To unsubscribe from this group and stop receiving emails from it, send an email to druid-user+...@googlegroups.com.
To post to this group, send email to druid...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/druid-user/8e1ef120-dbb2-4e36-b438-592f6337672e%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Sample SQL-access-related log lines from the overlord:
INFO i.d.m.IndexerSQLMetadataStorageCoordinator [qtp1373220972-92] Updated metadata
INFO i.d.m.IndexerSQLMetadataStorageCoordinator [qtp1373220972-92] Published segment
INFO i.d.m.IndexerSQLMetadataStorageCoordinator [qtp1373220972-90] Allocating pending segment
INFO i.d.m.IndexerSQLMetadataStorageCoordinator [qtp1373220972-90] Found existing pending segment
SQLException: Cannot get a connection, general error
2017-09-08 20:20:58,300 INFO i.d.i.k.KafkaIndexTaskClient [KafkaIndexTaskClient-prod_sample_1_events-5] submitRequest failed for [http://druid-mm02:8104/druid/worker/v1/chat/index_kafka_prod_sample_1_events_068972c80c54c05_himhgfpp/offsets/current], with message [Connection refused (Connection refused)]
ansible druid-mm -a "lsof | grep java | wc -l" -s
druid-mm05 | SUCCESS | rc=0 >>
1370934
druid-mm03 | SUCCESS | rc=0 >>
1731782
druid-mm02 | SUCCESS | rc=0 >>
1727398
druid-mm01 | SUCCESS | rc=0 >>
1715144
druid-mm04 | SUCCESS | rc=0 >>
1739617
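Those lsof counts (over a million lines per host) are suspicious. Note that plain lsof prints one line per task/thread per descriptor, so the true fd count is lower, but numbers like these still point at descriptor pressure, which fits the later "Connection refused" and "Cannot get a connection" failures. A quick hypothetical check flagging hosts over a threshold:

```python
# Flag middleManagers whose lsof line count (from the ansible output above)
# exceeds a chosen threshold. The threshold is illustrative; compare against
# the host's actual nofile ulimit in practice.
lsof_counts = {
    "druid-mm01": 1715144, "druid-mm02": 1727398, "druid-mm03": 1731782,
    "druid-mm04": 1739617, "druid-mm05": 1370934,
}
threshold = 1000000
over = sorted(h for h, n in lsof_counts.items() if n > threshold)
print(over)  # all five hosts exceed the illustrative threshold
```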
Discovered on the druid-user group that pendingSegments should be cleaned up manually, and that a count(id) over 163280 rows should not take almost 2 minutes:
MySQL [druid]> select count(id) from druid_pendingSegments;
+-----------+
| count(id) |
+-----------+
| 163280 |
+-----------+
1 row in set (1 min 48.18 sec)
MySQL [druid]> select count(id) from druid_segments;
+-----------+
| count(id) |
+-----------+
| 237642 |
+-----------+
1 row in set (1 min 27.62 sec)
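The manual pendingSegments cleanup suggested on the list boils down to deleting rows that are safely behind the active ingestion window. A sketch of that logic, simulated here with sqlite3 and a deliberately simplified table layout (the real druid_pendingSegments schema has more columns; against MySQL you would key the DELETE on created_date or the segment interval, and verify count(id) drops afterwards):

```python
# Simulate pruning old rows from a simplified druid_pendingSegments table.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE druid_pendingSegments (id TEXT, created_date TEXT)")
conn.executemany(
    "INSERT INTO druid_pendingSegments VALUES (?, ?)",
    [("seg1", "2017-08-01"), ("seg2", "2017-08-15"), ("seg3", "2017-09-06")],
)

# Delete entries created before a cutoff well behind the active ingestion window.
conn.execute("DELETE FROM druid_pendingSegments WHERE created_date < '2017-09-01'")
remaining = conn.execute("SELECT count(id) FROM druid_pendingSegments").fetchone()[0]
print(remaining)  # 1
```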
latestOffsets
"latestOffsets": {
"0": 306648700,
"1": 304854330,
"2": 308518602,
"3": 298993034,
"4": 305214569,
"5": 311090025,
"6": 315678079,
"7": 307856233
},
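For reference, supervisor lag is essentially latestOffsets minus each task's currentOffsets, summed across partitions. With the latestOffsets above and made-up current offsets (the real currentOffsets come from the supervisor status report):

```python
# Compute per-supervisor lag from latest and current offsets.
# latest is taken from the JSON above; current is hypothetical, for illustration.
latest = {"0": 306648700, "1": 304854330, "2": 308518602, "3": 298993034,
          "4": 305214569, "5": 311090025, "6": 315678079, "7": 307856233}
current = {p: o - 1000 for p, o in latest.items()}  # pretend each task is 1000 behind

lag = sum(latest[p] - current.get(p, 0) for p in latest)
print(lag)  # 8000 with the hypothetical positions above
```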