Re: [druid-user] Ingest local data to s3 deep storage failed

Gian Merlino

Aug 26, 2016, 1:58:57 AM
to druid...@googlegroups.com
Hey 唐焱,

Did you also put your data on S3? There are a lot of ways to get Hadoop to load data from S3 (Hadoop is fun that way). One of them is documented here: https://imply.io/docs/latest/ingestion-batch#elastic-mapreduce-setup. It involves setting fs.s3*.awsAccessKeyId and fs.s3*.awsSecretAccessKey in your jobProperties.
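
For example, in the tuningConfig of your ingestion spec, something along these lines (a sketch only; the credential values are placeholders, and whether you need the fs.s3 or fs.s3n variants depends on the URL scheme your inputSpec uses):

  "tuningConfig" : {
    "type" : "hadoop",
    "jobProperties" : {
      "fs.s3.awsAccessKeyId" : "YOUR_ACCESS_KEY",
      "fs.s3.awsSecretAccessKey" : "YOUR_SECRET_KEY",
      "fs.s3.impl" : "org.apache.hadoop.fs.s3native.NativeS3FileSystem",
      "fs.s3n.awsAccessKeyId" : "YOUR_ACCESS_KEY",
      "fs.s3n.awsSecretAccessKey" : "YOUR_SECRET_KEY",
      "fs.s3n.impl" : "org.apache.hadoop.fs.s3native.NativeS3FileSystem"
    }
  }

NativeS3FileSystem is the same class that is failing to initialize in your stack trace, which suggests these are the properties it's missing.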

Gian

On Tue, Aug 9, 2016 at 9:21 PM, 唐焱 <yant...@gmail.com> wrote:
Hello,

I'm trying to load the wikiticker sample data into Druid and I'm getting the following exception. I'm using S3 as deep storage.


2016-08-10T03:46:48,072 WARN [Thread-59] org.apache.hadoop.mapred.LocalJobRunner - job_local826592096_0002
java.lang.Exception: java.lang.IllegalArgumentException: AWS Access Key ID and Secret Access Key must be specified as the username or password (respectively) of a s3n URL, or by setting the fs.s3n.awsAccessKeyId or fs.s3n.awsSecretAccessKey properties (respectively).
        at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462) ~[hadoop-mapreduce-client-common-2.3.0.jar:?]
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529) [hadoop-mapreduce-client-common-2.3.0.jar:?]
Caused by: java.lang.IllegalArgumentException: AWS Access Key ID and Secret Access Key must be specified as the username or password (respectively) of a s3n URL, or by setting the fs.s3n.awsAccessKeyId or fs.s3n.awsSecretAccessKey properties (respectively).
        at org.apache.hadoop.fs.s3.S3Credentials.initialize(S3Credentials.java:70) ~[hadoop-common-2.3.0.jar:?]
        at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.initialize(Jets3tNativeFileSystemStore.java:61) ~[hadoop-common-2.3.0.jar:?]
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.7.0_101]
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) ~[?:1.7.0_101]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.7.0_101]
        at java.lang.reflect.Method.invoke(Method.java:606) ~[?:1.7.0_101]
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186) ~[hadoop-common-2.3.0.jar:?]
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) ~[hadoop-common-2.3.0.jar:?]
        at org.apache.hadoop.fs.s3native.$Proxy191.initialize(Unknown Source) ~[?:?]
        at org.apache.hadoop.fs.s3native.NativeS3FileSystem.initialize(NativeS3FileSystem.java:272) ~[hadoop-common-2.3.0.jar:?]
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2316) ~[hadoop-common-2.3.0.jar:?]
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:90) ~[hadoop-common-2.3.0.jar:?]
        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2350) ~[hadoop-common-2.3.0.jar:?]
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2332) ~[hadoop-common-2.3.0.jar:?]
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:369) ~[hadoop-common-2.3.0.jar:?]
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296) ~[hadoop-common-2.3.0.jar:?]
        at io.druid.indexer.IndexGeneratorJob$IndexGeneratorReducer.reduce(IndexGeneratorJob.java:691) ~[druid-indexing-hadoop-0.9.1.1.jar:0.9.1.1]
        at io.druid.indexer.IndexGeneratorJob$IndexGeneratorReducer.reduce(IndexGeneratorJob.java:469) ~[druid-indexing-hadoop-0.9.1.1.jar:0.9.1.1]
        at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171) ~[hadoop-mapreduce-client-core-2.3.0.jar:?]
        at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627) ~[hadoop-mapreduce-client-core-2.3.0.jar:?]
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389) ~[hadoop-mapreduce-client-core-2.3.0.jar:?]
        at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319) ~[hadoop-mapreduce-client-common-2.3.0.jar:?]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[?:1.7.0_101]
        at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[?:1.7.0_101]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[?:1.7.0_101]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) ~[?:1.7.0_101]
        at java.lang.Thread.run(Thread.java:745) ~[?:1.7.0_101]
2016-08-10T03:46:48,567 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Job job_local826592096_0002 failed with state FAILED due to: NA
2016-08-10T03:46:48,572 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Counters: 33
        File System Counters
                FILE: Number of bytes read=34215426
                FILE: Number of bytes written=17310701
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
        Map-Reduce Framework
                Map input records=39244
                Map output records=39244
                Map output bytes=16736001
                Map output materialized bytes=16892983
                Input split bytes=309
                Combine input records=0
                Combine output records=0
                Reduce input groups=0
                Reduce shuffle bytes=16892983
                Reduce input records=0
                Reduce output records=0
                Spilled Records=39244
                Shuffled Maps =1
                Failed Shuffles=0
                Merged Map outputs=1
                GC time elapsed (ms)=96
                CPU time spent (ms)=0
                Physical memory (bytes) snapshot=0
                Virtual memory (bytes) snapshot=0
                Total committed heap usage (bytes)=1912078336
        Shuffle Errors
                BAD_ID=0
                CONNECTION=0
                IO_ERROR=0
                WRONG_LENGTH=0
                WRONG_MAP=0
                WRONG_REDUCE=0
        File Input Format Counters
                Bytes Read=0
        File Output Format Counters
                Bytes Written=0
2016-08-10T03:46:48,577 INFO [task-runner-0-priority-0] io.druid.indexer.JobHelper - Deleting path[var/druid/hadoop-tmp/wikiticker/2016-08-10T034630.036Z/8b44ff31242a48ff96ed6cdbc1547708]
2016-08-10T03:46:48,590 ERROR [task-runner-0-priority-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[HadoopIndexTask{id=index_hadoop_wikiticker_2016-08-10T03:46:29.980Z, type=index_hadoop, dataSource=wikiticker}]
java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
        at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.1.jar:?]
        at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:204) ~[druid-indexing-service-0.9.1.1.jar:0.9.1.1]
        at io.druid.indexing.common.task.HadoopIndexTask.run(HadoopIndexTask.java:208) ~[druid-indexing-service-0.9.1.1.jar:0.9.1.1]
        at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:436) [druid-indexing-service-0.9.1.1.jar:0.9.1.1]
        at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:408) [druid-indexing-service-0.9.1.1.jar:0.9.1.1]
        at java.util.concurrent.FutureTask.run(FutureTask.java:262) [?:1.7.0_101]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [?:1.7.0_101]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [?:1.7.0_101]
        at java.lang.Thread.run(Thread.java:745) [?:1.7.0_101]
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.7.0_101]
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) ~[?:1.7.0_101]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.7.0_101]
        at java.lang.reflect.Method.invoke(Method.java:606) ~[?:1.7.0_101]
        at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:201) ~[druid-indexing-service-0.9.1.1.jar:0.9.1.1]
        ... 7 more
Caused by: com.metamx.common.ISE: Job[class io.druid.indexer.IndexGeneratorJob] failed!
        at io.druid.indexer.JobHelper.runJobs(JobHelper.java:343) ~[druid-indexing-hadoop-0.9.1.1.jar:0.9.1.1]
        at io.druid.indexer.HadoopDruidIndexerJob.run(HadoopDruidIndexerJob.java:94) ~[druid-indexing-hadoop-0.9.1.1.jar:0.9.1.1]
        at io.druid.indexing.common.task.HadoopIndexTask$HadoopIndexGeneratorInnerProcessing.runTask(HadoopIndexTask.java:261) ~[druid-indexing-service-0.9.1.1.jar:0.9.1.1]
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.7.0_101]
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) ~[?:1.7.0_101]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.7.0_101]
        at java.lang.reflect.Method.invoke(Method.java:606) ~[?:1.7.0_101]
        at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:201) ~[druid-indexing-service-0.9.1.1.jar:0.9.1.1]
        ... 7 more
2016-08-10T03:46:48,596 INFO [task-runner-0-priority-0] io.druid.indexing.overlord.TaskRunnerUtils - Task [index_hadoop_wikiticker_2016-08-10T03:46:29.980Z] status changed to [FAILED].
2016-08-10T03:46:48,598 INFO [task-runner-0-priority-0] io.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: {
  "id" : "index_hadoop_wikiticker_2016-08-10T03:46:29.980Z",
  "status" : "FAILED",
  "duration" : 14282
}



Attached are the full log and the common configuration file.
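
For reference, S3 deep storage in Druid 0.9.x is configured in common.runtime.properties along these lines; the values below are placeholders rather than my actual settings:

    # S3 deep storage; requires druid-s3-extensions on druid.extensions.loadList
    druid.storage.type=s3
    druid.storage.bucket=example-bucket
    druid.storage.baseKey=druid/segments
    druid.s3.accessKey=YOUR_ACCESS_KEY
    druid.s3.secretKey=YOUR_SECRET_KEY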

Thanks!

