Unable to run batch ingestion example on cluster


Jeremy

Aug 16, 2017, 1:01:00 PM
to Druid User

I am attempting to set up druid-0.10.0 on an IBM BigInsights cluster (equivalent to HDP). After numerous attempts and tweaks, I still cannot get the simple file-based batch tutorial (http://druid.io/docs/0.10.0/tutorials/tutorial-batch.html) to work.

I have the Overlord, Coordinator, and Broker running on one server in my cluster, the Middle Manager and Historical running on a second server, and I am using HDFS for deep storage. BigInsights ships with Hadoop client 2.7.3, so I have tried different runs after setting

mapreduce.job.classloader = true

or

mapreduce.job.user.classpath.first = true


Here is my task:

{
  "type" : "index_hadoop",
  "spec" : {
    "ioConfig" : {
      "type" : "hadoop",
      "inputSpec" : {
        "type" : "static",
        "paths" : "pageviews.json"
      }
    },
    "dataSchema" : {
      "dataSource" : "pageviews",
      "granularitySpec" : {
        "type" : "uniform",
        "segmentGranularity" : "day",
        "queryGranularity" : "none",
        "intervals" : ["2015-09-01/2015-09-02"]
      },
      "parser" : {
        "type" : "hadoopyString",
        "parseSpec" : {
          "format" : "json",
          "dimensionsSpec" : {
            "dimensions" : [ "url", "user" ]
          },
          "timestampSpec" : {
            "format" : "auto",
            "column" : "time"
          }
        }
      },
      "metricsSpec" : [
        {
          "name" : "views",
          "type" : "count"
        },
        {
          "name" : "latencyMs",
          "type" : "doubleSum",
          "fieldName" : "latencyMs"
        }
      ]
    },
    "tuningConfig" : {
      "type" : "hadoop",
      "partitionsSpec" : {
        "type" : "hashed",
        "targetPartitionSize" : 5000000
      },
      "jobProperties" : {
        "mapreduce.job.classloader" : "true",
        "mapreduce.map.java.opts" : "-Duser.timezone=UTC -Dfile.encoding=UTF-8",
        "mapreduce.reduce.java.opts" : "-Duser.timezone=UTC -Dfile.encoding=UTF-8"
      },
      "ignoreInvalidRows" : "true",
      "leaveIntermediate" : "true"
    }
  }
}
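
For reference, I am submitting the task to the Overlord with curl (a sketch assuming the default Overlord port 8090; OVERLORD_HOST is a placeholder for my Overlord server):

curl -X POST -H 'Content-Type: application/json' \
  -d @my-index-task.json \
  http://OVERLORD_HOST:8090/druid/indexer/v1/task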


My common.runtime.properties:

#
# Licensed to Metamarkets Group Inc. (Metamarkets) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. Metamarkets licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#

#
# Extensions
#

# This is not the full list of Druid extensions, but common ones that people often use. You may need to change this list
# based on your particular setup.
druid.extensions.loadList=["druid-hdfs-storage", "druid-kafka-eight", "druid-histogram", "druid-datasketches", "druid-lookups-cached-global"]

# If you have a different version of Hadoop, place your Hadoop client jar files in your hadoop-dependencies directory
# and uncomment the line below to point to your directory.
#druid.extensions.hadoopDependenciesDir=/my/dir/hadoop-dependencies

#
# Logging
#

# Log all runtime properties on startup. Disable to avoid logging properties on startup:
druid.startup.logging.logProperties=true

#
# Zookeeper
#

druid.zk.service.host=10.10.209.22
druid.zk.paths.base=/druid

#
# Metadata storage
#

# For Derby server on your Druid Coordinator (only viable in a cluster with a single Coordinator, no fail-over):
druid.metadata.storage.type=derby
druid.metadata.storage.connector.connectURI=jdbc:derby://localhost:1527/var/druid/metadata.db;create=true
druid.metadata.storage.connector.host=127.0.0.1
druid.metadata.storage.connector.port=1527

# For MySQL:
#druid.metadata.storage.type=mysql
#druid.metadata.storage.connector.connectURI=jdbc:mysql://db.example.com:3306/druid
#druid.metadata.storage.connector.user=...
#druid.metadata.storage.connector.password=...

# For PostgreSQL (make sure to additionally include the Postgres extension):
#druid.metadata.storage.type=postgresql
#druid.metadata.storage.connector.connectURI=jdbc:postgresql://db.example.com:5432/druid
#druid.metadata.storage.connector.user=...
#druid.metadata.storage.connector.password=...

#
# Deep storage
#

# For local disk (only viable in a cluster if this is a network mount):
#druid.storage.type=local
#druid.storage.storageDirectory=var/druid/segments

# For HDFS (make sure to include the HDFS extension and that your Hadoop config files are on the classpath):
druid.storage.type=hdfs
druid.storage.storageDirectory=/druid/segments

# For S3:
#druid.storage.type=s3
#druid.storage.bucket=your-bucket
#druid.storage.baseKey=druid/segments
#druid.s3.accessKey=...
#druid.s3.secretKey=...

#
# Indexing service logs
#

# For local disk (only viable in a cluster if this is a network mount):
#druid.indexer.logs.type=file
#druid.indexer.logs.directory=var/druid/indexing-logs

# For HDFS (make sure to include the HDFS extension and that your Hadoop config files are on the classpath):
druid.indexer.logs.type=hdfs
druid.indexer.logs.directory=/druid/indexing-logs

# For S3:
#druid.indexer.logs.type=s3
#druid.indexer.logs.s3Bucket=your-bucket
#druid.indexer.logs.s3Prefix=druid/indexing-logs

#
# Service discovery
#

druid.selectors.indexing.serviceName=druid/overlord
druid.selectors.coordinator.serviceName=druid/coordinator

#
# Monitoring
#

druid.monitoring.monitors=["com.metamx.metrics.JvmMonitor"]
druid.emitter=logging
druid.emitter.logging.logLevel=debug
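
In case it is relevant, this is roughly how I start the Middle Manager so that the Hadoop config files end up on the classpath (a sketch; /etc/hadoop/conf and the install directory are from my environment, adjust for yours):

cd /opt/druid-0.10.0
java `cat conf/druid/middleManager/jvm.config | xargs` \
  -cp conf/druid/_common:conf/druid/middleManager:/etc/hadoop/conf:lib/* \
  io.druid.cli.Main server middleManager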



and the error portion of the log:

2017-08-16T16:17:01,770 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 100%
2017-08-16T16:17:02,786 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Job job_1499096125809_0333 failed with state FAILED due to: Task failed task_1499096125809_0333_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0

2017-08-16T16:17:02,911 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Counters: 16
    Job Counters 
        Failed map tasks=4
        Killed reduce tasks=1
        Launched map tasks=4
        Other local map tasks=3
        Data-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=35285
        Total time spent by all reduces in occupied slots (ms)=0
        Total time spent by all map tasks (ms)=35285
        Total time spent by all reduce tasks (ms)=0
        Total vcore-seconds taken by all map tasks=35285
        Total vcore-seconds taken by all reduce tasks=0
        Total megabyte-seconds taken by all map tasks=162593280
        Total megabyte-seconds taken by all reduce tasks=0
    Map-Reduce Framework
        CPU time spent (ms)=0
        Physical memory (bytes) snapshot=0
        Virtual memory (bytes) snapshot=0
2017-08-16T16:17:02,914 ERROR [task-runner-0-priority-0] io.druid.indexer.DetermineHashedPartitionsJob - Job failed: job_1499096125809_0333
2017-08-16T16:17:02,916 ERROR [task-runner-0-priority-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[HadoopIndexTask{id=index_hadoop_pageviews_2017-08-16T16:15:49.166Z, type=index_hadoop, dataSource=pageviews}]
java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
    at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.1.jar:?]
    at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:211) ~[druid-indexing-service-0.10.0.jar:0.10.0]
    at io.druid.indexing.common.task.HadoopIndexTask.run(HadoopIndexTask.java:176) ~[druid-indexing-service-0.10.0.jar:0.10.0]
    at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:436) [druid-indexing-service-0.10.0.jar:0.10.0]
    at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:408) [druid-indexing-service-0.10.0.jar:0.10.0]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_121]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_121]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_121]
    at java.lang.Thread.run(Thread.java:745) [?:1.8.0_121]
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_121]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_121]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_121]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_121]
    at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:208) ~[druid-indexing-service-0.10.0.jar:0.10.0]
    ... 7 more
Caused by: io.druid.java.util.common.ISE: Job[class io.druid.indexer.DetermineHashedPartitionsJob] failed!
    at io.druid.indexer.JobHelper.runJobs(JobHelper.java:369) ~[druid-indexing-hadoop-0.10.0.jar:0.10.0]
    at io.druid.indexer.HadoopDruidDetermineConfigurationJob.run(HadoopDruidDetermineConfigurationJob.java:91) ~[druid-indexing-hadoop-0.10.0.jar:0.10.0]
    at io.druid.indexing.common.task.HadoopIndexTask$HadoopDetermineConfigInnerProcessing.runTask(HadoopIndexTask.java:306) ~[druid-indexing-service-0.10.0.jar:0.10.0]
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_121]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_121]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_121]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_121]
    at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:208) ~[druid-indexing-service-0.10.0.jar:0.10.0]
    ... 7 more
2017-08-16T16:17:02,928 INFO [task-runner-0-priority-0] io.druid.indexing.overlord.TaskRunnerUtils - Task [index_hadoop_pageviews_2017-08-16T16:15:49.166Z] status changed to [FAILED].
2017-08-16T16:17:02,934 INFO [task-runner-0-priority-0] io.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: {
  "id" : "index_hadoop_pageviews_2017-08-16T16:15:49.166Z",
  "status" : "FAILED",
  "duration" : 65436
}
2017-08-16T16:17:02,944 INFO [main] io.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void io.druid.server.coordination.AbstractDataSegmentAnnouncer.stop()] on object[io.druid.server.coordination.BatchDataSegmentAnnouncer@4e958f08].
2017-08-16T16:17:02,945 INFO [main] io.druid.server.coordination.AbstractDataSegmentAnnouncer - Stopping class io.druid.server.coordination.BatchDataSegmentAnnouncer with config[io.druid.server.initialization.ZkPathsConfig@22e2266d]
2017-08-16T16:17:02,945 INFO [main] io.druid.curator.announcement.Announcer - unannouncing [/druid/announcements/nephos-4.campbell.com:8100]
2017-08-16T16:17:02,971 INFO [main] io.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void io.druid.server.listener.announcer.ListenerResourceAnnouncer.stop()] on object[io.druid.query.lookup.LookupResourceListenerAnnouncer@53016b11].
2017-08-16T16:17:02,971 INFO [main] io.druid.curator.announcement.Announcer - unannouncing [/druid/listeners/lookups/__default/nephos-4.campbell.com:8100]
2017-08-16T16:17:02,978 INFO [main] io.druid.server.listener.announcer.ListenerResourceAnnouncer - Unannouncing start time on [/druid/listeners/lookups/__default/nephos-4.campbell.com:8100]
2017-08-16T16:17:02,978 INFO [main] io.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void io.druid.query.lookup.LookupReferencesManager.stop()] on object[io.druid.query.lookup.LookupReferencesManager@53125718].
2017-08-16T16:17:02,979 INFO [main] io.druid.query.lookup.LookupReferencesManager - Stopping lookup factory references manager
2017-08-16T16:17:02,990 INFO [main] org.eclipse.jetty.server.AbstractConnector - Stopped ServerConnector@4ee8051c{HTTP/1.1,[http/1.1]}{0.0.0.0:8100}
2017-08-16T16:17:02,993 INFO [main] org.eclipse.jetty.server.handler.ContextHandler - Stopped o.e.j.s.ServletContextHandler@37ca3ca8{/,null,UNAVAILABLE}
2017-08-16T16:17:02,996 INFO [main] io.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void io.druid.indexing.worker.executor.ExecutorLifecycle.stop() throws java.lang.Exception] on object[io.druid.indexing.worker.executor.ExecutorLifecycle@62b3a2f6].
2017-08-16T16:17:02,997 INFO [main] io.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void io.druid.indexing.overlord.ThreadPoolTaskRunner.stop()] on object[io.druid.indexing.overlord.ThreadPoolTaskRunner@6ddc67d0].
2017-08-16T16:17:02,998 INFO [main] io.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void io.druid.curator.discovery.ServerDiscoverySelector.stop() throws java.io.IOException] on object[io.druid.curator.discovery.ServerDiscoverySelector@4bc6da03].
2017-08-16T16:17:03,002 INFO [main] io.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void io.druid.curator.announcement.Announcer.stop()] on object[io.druid.curator.announcement.Announcer@718fd7c1].
2017-08-16T16:17:03,004 INFO [main] io.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void io.druid.curator.discovery.ServerDiscoverySelector.stop() throws java.io.IOException] on object[io.druid.curator.discovery.ServerDiscoverySelector@5a0e0886].
2017-08-16T16:17:03,004 INFO [main] io.druid.curator.CuratorModule - Stopping Curator
2017-08-16T16:17:03,006 INFO [Curator-Framework-0] org.apache.curator.framework.imps.CuratorFrameworkImpl - backgroundOperationsLoop exiting
2017-08-16T16:17:03,014 INFO [main] org.apache.zookeeper.ZooKeeper - Session: 0x15d0914df35c1f0 closed
2017-08-16T16:17:03,014 INFO [main-EventThread] org.apache.zookeeper.ClientCnxn - EventThread shut down for session: 0x15d0914df35c1f0
2017-08-16T16:17:03,014 INFO [main] com.metamx.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void com.metamx.http.client.NettyHttpClient.stop()] on object[com.metamx.http.client.NettyHttpClient@33feda48].
2017-08-16T16:17:03,089 INFO [main] io.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void io.druid.storage.hdfs.HdfsStorageAuthentication.stop()] on object[io.druid.storage.hdfs.HdfsStorageAuthentication@4e3f2908].
2017-08-16T16:17:03,089 INFO [main] io.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void com.metamx.metrics.MonitorScheduler.stop()] on object[com.metamx.metrics.MonitorScheduler@6579cdbb].
2017-08-16T16:17:03,089 INFO [main] io.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void com.metamx.emitter.service.ServiceEmitter.close() throws java.io.IOException] on object[com.metamx.emitter.service.ServiceEmitter@1e3df614].
2017-08-16T16:17:03,089 INFO [main] com.metamx.emitter.core.LoggingEmitter - Close: started [false]
2017-08-16T16:17:03,090 INFO [main] io.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void io.druid.initialization.Log4jShutterDownerModule$Log4jShutterDowner.stop()] on object[io.druid.initialization.Log4jShutterDownerModule$Log4jShutterDowner@64aeaf29].
2017-08-16 16:17:03,137 Thread-2 ERROR Unable to register shutdown hook because JVM is shutting down. java.lang.IllegalStateException: Not started
    at io.druid.common.config.Log4jShutdown.addShutdownCallback(Log4jShutdown.java:45)
    at org.apache.logging.log4j.core.impl.Log4jContextFactory.addShutdownCallback(Log4jContextFactory.java:273)
    at org.apache.logging.log4j.core.LoggerContext.setUpShutdownHook(LoggerContext.java:256)
    at org.apache.logging.log4j.core.LoggerContext.start(LoggerContext.java:216)
    at org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:145)
    at org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:41)
    at org.apache.logging.log4j.LogManager.getContext(LogManager.java:182)
    at org.apache.logging.log4j.spi.AbstractLoggerAdapter.getContext(AbstractLoggerAdapter.java:103)
    at org.apache.logging.slf4j.Log4jLoggerFactory.getContext(Log4jLoggerFactory.java:43)
    at org.apache.logging.log4j.spi.AbstractLoggerAdapter.getLogger(AbstractLoggerAdapter.java:42)
    at org.apache.logging.slf4j.Log4jLoggerFactory.getLogger(Log4jLoggerFactory.java:29)
    at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:253)
    at org.apache.commons.logging.impl.SLF4JLogFactory.getInstance(SLF4JLogFactory.java:155)
    at org.apache.commons.logging.impl.SLF4JLogFactory.getInstance(SLF4JLogFactory.java:132)
    at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:685)
    at org.apache.hadoop.hdfs.LeaseRenewer.<clinit>(LeaseRenewer.java:72)
    at org.apache.hadoop.hdfs.DFSClient.getLeaseRenewer(DFSClient.java:699)
    at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:859)
    at org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:853)
    at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:2407)
    at org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer.run(FileSystem.java:2424)
    at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)



I have searched this group and made numerous tweaks, but I feel like I am going in circles now.

Thanks
Jeremy

Attachments: common.runtime.properties, log, my-index-task.json

hellobab...@gmail.com

Aug 16, 2017, 8:49:01 PM
to Druid User
Hi, Jeremy,

You can check your MapReduce task logs to see why the map tasks actually failed.
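
For example, assuming YARN log aggregation is enabled, something like this should show the map task errors (the application id corresponds to the failed job id in your log):

yarn logs -applicationId application_1499096125809_0333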

On Thursday, August 17, 2017 at 1:01:00 AM UTC+8, Jeremy wrote: