Unable to run batch ingestion example on cluster

Jeremy

Aug 16, 2017, 1:01:00 PM

to Druid User

I am attempting to set up druid-0.10.0 on an IBM BigInsights cluster (equivalent to HDP). After numerous attempts and tweaks, I still cannot get the simple file-based batch tutorial (http://druid.io/docs/0.10.0/tutorials/tutorial-batch.html) to work.

I have the Overlord, Coordinator, and Broker running on one server in my cluster, and the Middle Manager and Historical running on a second server. I am using HDFS for deep storage. BigInsights ships with Hadoop client 2.7.3, so I have tried several runs after setting:

mapreduce.job.classloader = true

mapreduce.job.user.classpath.first = true

Here is my task:

{
  "type" : "index_hadoop",
  "spec" : {
    "ioConfig" : {
      "type" : "hadoop",
      "inputSpec" : {
        "type" : "static",
        "paths" : "pageviews.json"
      }
    },
    "dataSchema" : {
      "dataSource" : "pageviews",
      "granularitySpec" : {
        "type" : "uniform",
        "segmentGranularity" : "day",
        "queryGranularity" : "none",
        "intervals" : ["2015-09-01/2015-09-02"]
      },
      "parser" : {
        "type" : "hadoopyString",
        "parseSpec" : {
          "format" : "json",
          "dimensionsSpec" : {
            "dimensions" : [ "url", "user" ]
          },
          "timestampSpec" : {
            "format" : "auto",
            "column" : "time"
          }
        }
      },
      "metricsSpec" : [
        {
          "name" : "views",
          "type" : "count"
        },
        {
          "name" : "latencyMs",
          "type" : "doubleSum",
          "fieldName" : "latencyMs"
        }
      ]
    },
    "tuningConfig" : {
      "type" : "hadoop",
      "partitionsSpec" : {
        "type" : "hashed",
        "targetPartitionSize" : 5000000
      },
      "jobProperties" : {
        "mapreduce.job.classloader": "true",
        "mapreduce.map.java.opts": "-Duser.timezone=UTC -Dfile.encoding=UTF-8",
        "mapreduce.reduce.java.opts": "-Duser.timezone=UTC -Dfile.encoding=UTF-8"
      },
      "ignoreInvalidRows" : "true",
      "leaveIntermediate" : "true"
    }
  }
}

Here is my common.runtime.properties:

#
# Licensed to Metamarkets Group Inc. (Metamarkets) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. Metamarkets licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#

#
# Extensions
#

# This is not the full list of Druid extensions, but common ones that people often use. You may need to change this list
# based on your particular setup.
druid.extensions.loadList=["druid-hdfs-storage", "druid-kafka-eight", "druid-histogram", "druid-datasketches", "druid-lookups-cached-global"]

# If you have a different version of Hadoop, place your Hadoop client jar files in your hadoop-dependencies directory
# and uncomment the line below to point to your directory.
#druid.extensions.hadoopDependenciesDir=/my/dir/hadoop-dependencies

#
# Logging
#

# Log all runtime properties on startup. Disable to avoid logging properties on startup:
druid.startup.logging.logProperties=true

#
# Zookeeper
#

druid.zk.service.host=10.10.209.22
druid.zk.paths.base=/druid

#
# Metadata storage
#

# For Derby server on your Druid Coordinator (only viable in a cluster with a single Coordinator, no fail-over):
druid.metadata.storage.type=derby
druid.metadata.storage.connector.connectURI=jdbc:derby://localhost:1527/var/druid/metadata.db;create=true
druid.metadata.storage.connector.host=127.0.0.1
druid.metadata.storage.connector.port=1527

# For MySQL:
#druid.metadata.storage.type=mysql
#druid.metadata.storage.connector.connectURI=jdbc:mysql://db.example.com:3306/druid
#druid.metadata.storage.connector.user=...
#druid.metadata.storage.connector.password=...

# For PostgreSQL (make sure to additionally include the Postgres extension):
#druid.metadata.storage.type=postgresql
#druid.metadata.storage.connector.connectURI=jdbc:postgresql://db.example.com:5432/druid
#druid.metadata.storage.connector.user=...
#druid.metadata.storage.connector.password=...

#
# Deep storage
#

# For local disk (only viable in a cluster if this is a network mount):
#druid.storage.type=local
#druid.storage.storageDirectory=var/druid/segments

# For HDFS (make sure to include the HDFS extension and that your Hadoop config files in the cp):
druid.storage.type=hdfs
druid.storage.storageDirectory=/druid/segments

# For S3:
#druid.storage.type=s3
#druid.storage.bucket=your-bucket
#druid.storage.baseKey=druid/segments
#druid.s3.accessKey=...
#druid.s3.secretKey=...

#
# Indexing service logs
#

# For local disk (only viable in a cluster if this is a network mount):
#druid.indexer.logs.type=file
#druid.indexer.logs.directory=var/druid/indexing-logs

# For HDFS (make sure to include the HDFS extension and that your Hadoop config files in the cp):
druid.indexer.logs.type=hdfs
druid.indexer.logs.directory=/druid/indexing-logs

# For S3:
#druid.indexer.logs.type=s3
#druid.indexer.logs.s3Bucket=your-bucket
#druid.indexer.logs.s3Prefix=druid/indexing-logs

#
# Service discovery
#

druid.selectors.indexing.serviceName=druid/overlord
druid.selectors.coordinator.serviceName=druid/coordinator

#
# Monitoring
#

druid.monitoring.monitors=["com.metamx.metrics.JvmMonitor"]
druid.emitter=logging
druid.emitter.logging.logLevel=debug

and the error portion of the log:

2017-08-16T16:17:01,770 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - map 100% reduce 100%
2017-08-16T16:17:02,786 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Job job_1499096125809_0333 failed with state FAILED due to: Task failed task_1499096125809_0333_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0

2017-08-16T16:17:02,911 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Counters: 16
   Job Counters
       Failed map tasks=4
       Killed reduce tasks=1
       Launched map tasks=4
       Other local map tasks=3
       Data-local map tasks=1
       Total time spent by all maps in occupied slots (ms)=35285
       Total time spent by all reduces in occupied slots (ms)=0
       Total time spent by all map tasks (ms)=35285
       Total time spent by all reduce tasks (ms)=0
       Total vcore-seconds taken by all map tasks=35285
       Total vcore-seconds taken by all reduce tasks=0
       Total megabyte-seconds taken by all map tasks=162593280
       Total megabyte-seconds taken by all reduce tasks=0
   Map-Reduce Framework
       CPU time spent (ms)=0
       Physical memory (bytes) snapshot=0
       Virtual memory (bytes) snapshot=0
2017-08-16T16:17:02,914 ERROR [task-runner-0-priority-0] io.druid.indexer.DetermineHashedPartitionsJob - Job failed: job_1499096125809_0333
2017-08-16T16:17:02,916 ERROR [task-runner-0-priority-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[HadoopIndexTask{id=index_hadoop_pageviews_2017-08-16T16:15:49.166Z, type=index_hadoop, dataSource=pageviews}]
java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
   at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.1.jar:?]
   at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:211) ~[druid-indexing-service-0.10.0.jar:0.10.0]
   at io.druid.indexing.common.task.HadoopIndexTask.run(HadoopIndexTask.java:176) ~[druid-indexing-service-0.10.0.jar:0.10.0]
   at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:436) [druid-indexing-service-0.10.0.jar:0.10.0]
   at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:408) [druid-indexing-service-0.10.0.jar:0.10.0]
   at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_121]
   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_121]
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_121]
   at java.lang.Thread.run(Thread.java:745) [?:1.8.0_121]
Caused by: java.lang.reflect.InvocationTargetException
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_121]
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_121]
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_121]
   at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_121]
   at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:208) ~[druid-indexing-service-0.10.0.jar:0.10.0]
   ... 7 more
Caused by: io.druid.java.util.common.ISE: Job[class io.druid.indexer.DetermineHashedPartitionsJob] failed!
   at io.druid.indexer.JobHelper.runJobs(JobHelper.java:369) ~[druid-indexing-hadoop-0.10.0.jar:0.10.0]
   at io.druid.indexer.HadoopDruidDetermineConfigurationJob.run(HadoopDruidDetermineConfigurationJob.java:91) ~[druid-indexing-hadoop-0.10.0.jar:0.10.0]
   at io.druid.indexing.common.task.HadoopIndexTask$HadoopDetermineConfigInnerProcessing.runTask(HadoopIndexTask.java:306) ~[druid-indexing-service-0.10.0.jar:0.10.0]
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_121]
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_121]
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_121]
   at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_121]
   at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:208) ~[druid-indexing-service-0.10.0.jar:0.10.0]
   ... 7 more
2017-08-16T16:17:02,928 INFO [task-runner-0-priority-0] io.druid.indexing.overlord.TaskRunnerUtils - Task [index_hadoop_pageviews_2017-08-16T16:15:49.166Z] status changed to [FAILED].
2017-08-16T16:17:02,934 INFO [task-runner-0-priority-0] io.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: {
"id" : "index_hadoop_pageviews_2017-08-16T16:15:49.166Z",
"status" : "FAILED",
"duration" : 65436
}
2017-08-16T16:17:02,944 INFO [main] io.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void io.druid.server.coordination.AbstractDataSegmentAnnouncer.stop()] on object[io.druid.server.coordination.BatchDataSegmentAnnouncer@4e958f08].
2017-08-16T16:17:02,945 INFO [main] io.druid.server.coordination.AbstractDataSegmentAnnouncer - Stopping class io.druid.server.coordination.BatchDataSegmentAnnouncer with config[io.druid.server.initialization.ZkPathsConfig@22e2266d]
2017-08-16T16:17:02,945 INFO [main] io.druid.curator.announcement.Announcer - unannouncing [/druid/announcements/nephos-4.campbell.com:8100]
2017-08-16T16:17:02,971 INFO [main] io.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void io.druid.server.listener.announcer.ListenerResourceAnnouncer.stop()] on object[io.druid.query.lookup.LookupResourceListenerAnnouncer@53016b11].
2017-08-16T16:17:02,971 INFO [main] io.druid.curator.announcement.Announcer - unannouncing [/druid/listeners/lookups/__default/nephos-4.campbell.com:8100]
2017-08-16T16:17:02,978 INFO [main] io.druid.server.listener.announcer.ListenerResourceAnnouncer - Unannouncing start time on [/druid/listeners/lookups/__default/nephos-4.campbell.com:8100]
2017-08-16T16:17:02,978 INFO [main] io.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void io.druid.query.lookup.LookupReferencesManager.stop()] on object[io.druid.query.lookup.LookupReferencesManager@53125718].
2017-08-16T16:17:02,979 INFO [main] io.druid.query.lookup.LookupReferencesManager - Stopping lookup factory references manager
2017-08-16T16:17:02,990 INFO [main] org.eclipse.jetty.server.AbstractConnector - Stopped ServerConnector@4ee8051c{HTTP/1.1,[http/1.1]}{0.0.0.0:8100}
2017-08-16T16:17:02,993 INFO [main] org.eclipse.jetty.server.handler.ContextHandler - Stopped o.e.j.s.ServletContextHandler@37ca3ca8{/,null,UNAVAILABLE}
2017-08-16T16:17:02,996 INFO [main] io.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void io.druid.indexing.worker.executor.ExecutorLifecycle.stop() throws java.lang.Exception] on object[io.druid.indexing.worker.executor.ExecutorLifecycle@62b3a2f6].
2017-08-16T16:17:02,997 INFO [main] io.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void io.druid.indexing.overlord.ThreadPoolTaskRunner.stop()] on object[io.druid.indexing.overlord.ThreadPoolTaskRunner@6ddc67d0].
2017-08-16T16:17:02,998 INFO [main] io.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void io.druid.curator.discovery.ServerDiscoverySelector.stop() throws java.io.IOException] on object[io.druid.curator.discovery.ServerDiscoverySelector@4bc6da03].
2017-08-16T16:17:03,002 INFO [main] io.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void io.druid.curator.announcement.Announcer.stop()] on object[io.druid.curator.announcement.Announcer@718fd7c1].
2017-08-16T16:17:03,004 INFO [main] io.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void io.druid.curator.discovery.ServerDiscoverySelector.stop() throws java.io.IOException] on object[io.druid.curator.discovery.ServerDiscoverySelector@5a0e0886].
2017-08-16T16:17:03,004 INFO [main] io.druid.curator.CuratorModule - Stopping Curator
2017-08-16T16:17:03,006 INFO [Curator-Framework-0] org.apache.curator.framework.imps.CuratorFrameworkImpl - backgroundOperationsLoop exiting
2017-08-16T16:17:03,014 INFO [main] org.apache.zookeeper.ZooKeeper - Session: 0x15d0914df35c1f0 closed
2017-08-16T16:17:03,014 INFO [main-EventThread] org.apache.zookeeper.ClientCnxn - EventThread shut down for session: 0x15d0914df35c1f0
2017-08-16T16:17:03,014 INFO [main] com.metamx.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void com.metamx.http.client.NettyHttpClient.stop()] on object[com.metamx.http.client.NettyHttpClient@33feda48].
2017-08-16T16:17:03,089 INFO [main] io.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void io.druid.storage.hdfs.HdfsStorageAuthentication.stop()] on object[io.druid.storage.hdfs.HdfsStorageAuthentication@4e3f2908].
2017-08-16T16:17:03,089 INFO [main] io.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void com.metamx.metrics.MonitorScheduler.stop()] on object[com.metamx.metrics.MonitorScheduler@6579cdbb].
2017-08-16T16:17:03,089 INFO [main] io.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void com.metamx.emitter.service.ServiceEmitter.close() throws java.io.IOException] on object[com.metamx.emitter.service.ServiceEmitter@1e3df614].
2017-08-16T16:17:03,089 INFO [main] com.metamx.emitter.core.LoggingEmitter - Close: started [false]
2017-08-16T16:17:03,090 INFO [main] io.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void io.druid.initialization.Log4jShutterDownerModule$Log4jShutterDowner.stop()] on object[io.druid.initialization.Log4jShutterDownerModule$Log4jShutterDowner@64aeaf29].
2017-08-16 16:17:03,137 Thread-2 ERROR Unable to register shutdown hook because JVM is shutting down. java.lang.IllegalStateException: Not started
   at io.druid.common.config.Log4jShutdown.addShutdownCallback(Log4jShutdown.java:45)
   at org.apache.logging.log4j.core.impl.Log4jContextFactory.addShutdownCallback(Log4jContextFactory.java:273)
   at org.apache.logging.log4j.core.LoggerContext.setUpShutdownHook(LoggerContext.java:256)
   at org.apache.logging.log4j.core.LoggerContext.start(LoggerContext.java:216)
   at org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:145)
   at org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:41)
   at org.apache.logging.log4j.LogManager.getContext(LogManager.java:182)
   at org.apache.logging.log4j.spi.AbstractLoggerAdapter.getContext(AbstractLoggerAdapter.java:103)
   at org.apache.logging.slf4j.Log4jLoggerFactory.getContext(Log4jLoggerFactory.java:43)
   at org.apache.logging.log4j.spi.AbstractLoggerAdapter.getLogger(AbstractLoggerAdapter.java:42)
   at org.apache.logging.slf4j.Log4jLoggerFactory.getLogger(Log4jLoggerFactory.java:29)
   at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:253)
   at org.apache.commons.logging.impl.SLF4JLogFactory.getInstance(SLF4JLogFactory.java:155)
   at org.apache.commons.logging.impl.SLF4JLogFactory.getInstance(SLF4JLogFactory.java:132)
   at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:685)
   at org.apache.hadoop.hdfs.LeaseRenewer.<clinit>(LeaseRenewer.java:72)
   at org.apache.hadoop.hdfs.DFSClient.getLeaseRenewer(DFSClient.java:699)
   at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:859)
   at org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:853)
   at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:2407)
   at org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer.run(FileSystem.java:2424)
   at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)

I have searched this group and made numerous tweaks, but I feel like I am going in circles now.

Thanks
Jeremy

hellobab...@gmail.com

Aug 16, 2017, 8:49:01 PM

to Druid User

Hi Jeremy,

You can check your MapReduce task log.
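As a rough sketch of how to find that log: the Druid task log above only names the failed job (job_1499096125809_0333), and on YARN the matching application ID is normally the same ID with the `job_` prefix replaced by `application_`. The helper below is hypothetical (the function name and the idea of scanning for "failed" lines are my own assumptions, not part of Druid or Hadoop); it just pulls job IDs out of the log text and prints the `yarn logs` command to run. This assumes YARN log aggregation is enabled on the cluster; if it is not, the task logs would have to be read from the ResourceManager or NodeManager web UI instead.

```python
import re

# Matches MapReduce job IDs such as job_1499096125809_0333.
JOB_ID = re.compile(r"\bjob_(\d+)_(\d+)\b")


def yarn_logs_commands(log_text):
    """Return one `yarn logs` command per distinct job ID found on a 'failed' line.

    Hypothetical helper for illustration only.
    """
    commands = []
    seen = set()
    for line in log_text.splitlines():
        # Only look at lines that report a failure.
        if "failed" not in line.lower():
            continue
        for cluster_ts, seq in JOB_ID.findall(line):
            # job_<ts>_<seq> corresponds to application_<ts>_<seq> on YARN.
            app_id = f"application_{cluster_ts}_{seq}"
            if app_id not in seen:
                seen.add(app_id)
                commands.append(f"yarn logs -applicationId {app_id}")
    return commands


# Two lines copied from the task log in the original post.
sample = (
    "2017-08-16T16:17:02,786 INFO [task-runner-0-priority-0] "
    "org.apache.hadoop.mapreduce.Job - Job job_1499096125809_0333 failed with "
    "state FAILED due to: Task failed task_1499096125809_0333_m_000000\n"
    "2017-08-16T16:17:02,914 ERROR [task-runner-0-priority-0] "
    "io.druid.indexer.DetermineHashedPartitionsJob - Job failed: "
    "job_1499096125809_0333\n"
)

for command in yarn_logs_commands(sample):
    print(command)
```

Running the printed command on a cluster node should dump the container logs for the failed map attempts, which is where the real exception is likely to appear.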

On Thursday, August 17, 2017 at 1:01:00 AM UTC+8, Jeremy wrote:
