The loaded data can't be queried.


tao...@yahoo-inc.com

Apr 16, 2014, 1:07:59 PM
to druid-de...@googlegroups.com
Hi, All:


I loaded the wikipedia example data through the indexing service, but I can't query it.

Here is my JSON query file:
{
  "queryType": "groupBy",
  "dataSource": "wikipedia",
  "granularity": "all",
  "dimensions": [ "page" ],
  "orderBy": {
     "type": "default",
     "columns": [ { "dimension": "edit_count", "direction": "DESCENDING" } ],
     "limit": 10
  },
  "aggregations": [
    {"type": "longSum", "fieldName": "count", "name": "edit_count"}
  ],
  "filter": { "type": "selector", "dimension": "country", "value": "United States" },
  "intervals": ["2013-08-31T00:00/2020-01-01T00"]
}

When I send it to the broker, it returns a null result, and the historical node throws this exception:
2014-04-16 17:01:48,644 ERROR [ZkCoordinator-0] io.druid.server.coordination.ZkCoordinator - Failed to load segment for dataSource: {class=io.druid.server.coordination.ZkCoordinator, exceptionType=class io.druid.segment.loading.SegmentLoadingException, exceptionMessage=Exception loading segment[wikipedia_2013-08-31T00:00:00.000Z_2013-09-01T00:00:00.000Z_2014-04-16T17:00:40.325Z], segment=DataSegment{size=4462, shardSpec=NoneShardSpec, metrics=[count, added, deleted, delta], dimensions=[anonymous, city, continent, country, language, namespace, newpage, page, region, robot, unpatrolled, user], version='2014-04-16T17:00:40.325Z', loadSpec={type=local, path=/tmp/druid/localStorage/wikipedia/2013-08-31T00:00:00.000Z_2013-09-01T00:00:00.000Z/2014-04-16T17:00:40.325Z/0/index.zip}, interval=2013-08-31T00:00:00.000Z/2013-09-01T00:00:00.000Z, dataSource='wikipedia', binaryVersion='9'}}
io.druid.segment.loading.SegmentLoadingException: Exception loading segment[wikipedia_2013-08-31T00:00:00.000Z_2013-09-01T00:00:00.000Z_2014-04-16T17:00:40.325Z]
at io.druid.server.coordination.ZkCoordinator.addSegment(ZkCoordinator.java:239)
at io.druid.server.coordination.SegmentChangeRequestLoad.go(SegmentChangeRequestLoad.java:44)
at io.druid.server.coordination.ZkCoordinator$1.childEvent(ZkCoordinator.java:131)
at org.apache.curator.framework.recipes.cache.PathChildrenCache$5.apply(PathChildrenCache.java:494)
at org.apache.curator.framework.recipes.cache.PathChildrenCache$5.apply(PathChildrenCache.java:488)
at org.apache.curator.framework.listen.ListenerContainer$1.run(ListenerContainer.java:92)
at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:293)
at org.apache.curator.framework.listen.ListenerContainer.forEach(ListenerContainer.java:83)
at org.apache.curator.framework.recipes.cache.PathChildrenCache.callListeners(PathChildrenCache.java:485)
at org.apache.curator.framework.recipes.cache.EventOperation.invoke(EventOperation.java:35)
at org.apache.curator.framework.recipes.cache.PathChildrenCache$11.run(PathChildrenCache.java:755)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: io.druid.segment.loading.SegmentLoadingException: Asked to load path[/tmp/druid/localStorage/wikipedia/2013-08-31T00:00:00.000Z_2013-09-01T00:00:00.000Z/2014-04-16T17:00:40.325Z/0/index.zip], but it doesn't exist.
at io.druid.segment.loading.LocalDataSegmentPuller.getFile(LocalDataSegmentPuller.java:100)
at io.druid.segment.loading.LocalDataSegmentPuller.getSegmentFiles(LocalDataSegmentPuller.java:41)
at io.druid.segment.loading.OmniSegmentLoader.getSegmentFiles(OmniSegmentLoader.java:125)
at io.druid.segment.loading.OmniSegmentLoader.getSegment(OmniSegmentLoader.java:93)
at io.druid.server.coordination.ServerManager.loadSegment(ServerManager.java:129)
at io.druid.server.coordination.ZkCoordinator.addSegment(ZkCoordinator.java:235)
... 17 more

Could somebody help me take a look at it?

Thanks.

Fangjin Yang

Apr 16, 2014, 11:47:56 PM
to druid-de...@googlegroups.com
Hi Tao, the exception "Asked to load path[/tmp/druid/localStorage/wikipedia/2013-08-31T00:00:00.000Z_2013-09-01T00:00:00.000Z/2014-04-16T17:00:40.325Z/0/index.zip], but it doesn't exist." means that the files you indexed no longer exist on your system. Did you happen to restart the computer or clear out the /tmp directory? For local testing, you may want to keep these files in a more permanent folder.
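For illustration, a minimal sketch of what that might look like in runtime.properties for local deep storage (the /var/druid path here is just an example, not from the tutorial):

druid.storage.type=local
druid.storage.storageDirectory=/var/druid/localStorage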

- FJ

tao...@yahoo-inc.com

Apr 17, 2014, 12:17:49 AM
to druid-de...@googlegroups.com
Hi, Fangjin:

I stopped all the nodes, dropped the tables in MySQL, cleaned the "consumers" and "druid" directories in ZooKeeper, and finally removed the temp files with "rm -rf /tmp/*" on every node.

After finishing these steps, I restarted all the nodes and loaded the data through indexing again, but I still get this exception on the historical node:

Is there something else I forgot to clean up?

2014-04-17 04:16:59,087 ERROR [ZkCoordinator-0] io.druid.server.coordination.ZkCoordinator - Failed to load segment for dataSource: {class=io.druid.server.coordination.ZkCoordinator, exceptionType=class io.druid.segment.loading.SegmentLoadingException, exceptionMessage=Exception loading segment[wikipedia_2013-08-31T00:00:00.000Z_2013-09-01T00:00:00.000Z_2014-04-17T04:10:53.010Z], segment=DataSegment{size=4462, shardSpec=NoneShardSpec, metrics=[count, added, deleted, delta], dimensions=[anonymous, city, continent, country, language, namespace, newpage, page, region, robot, unpatrolled, user], version='2014-04-17T04:10:53.010Z', loadSpec={type=local, path=/tmp/druid/localStorage/wikipedia/2013-08-31T00:00:00.000Z_2013-09-01T00:00:00.000Z/2014-04-17T04:10:53.010Z/0/index.zip}, interval=2013-08-31T00:00:00.000Z/2013-09-01T00:00:00.000Z, dataSource='wikipedia', binaryVersion='9'}}
io.druid.segment.loading.SegmentLoadingException: Exception loading segment[wikipedia_2013-08-31T00:00:00.000Z_2013-09-01T00:00:00.000Z_2014-04-17T04:10:53.010Z]
at io.druid.server.coordination.ZkCoordinator.addSegment(ZkCoordinator.java:239)
at io.druid.server.coordination.SegmentChangeRequestLoad.go(SegmentChangeRequestLoad.java:44)
at io.druid.server.coordination.ZkCoordinator$1.childEvent(ZkCoordinator.java:131)
at org.apache.curator.framework.recipes.cache.PathChildrenCache$5.apply(PathChildrenCache.java:494)
at org.apache.curator.framework.recipes.cache.PathChildrenCache$5.apply(PathChildrenCache.java:488)
at org.apache.curator.framework.listen.ListenerContainer$1.run(ListenerContainer.java:92)
at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:293)
at org.apache.curator.framework.listen.ListenerContainer.forEach(ListenerContainer.java:83)
at org.apache.curator.framework.recipes.cache.PathChildrenCache.callListeners(PathChildrenCache.java:485)
at org.apache.curator.framework.recipes.cache.EventOperation.invoke(EventOperation.java:35)
at org.apache.curator.framework.recipes.cache.PathChildrenCache$11.run(PathChildrenCache.java:755)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: io.druid.segment.loading.SegmentLoadingException: Asked to load path[/tmp/druid/localStorage/wikipedia/2013-08-31T00:00:00.000Z_2013-09-01T00:00:00.000Z/2014-04-17T04:10:53.010Z/0/index.zip], but it doesn't exist.
at io.druid.segment.loading.LocalDataSegmentPuller.getFile(LocalDataSegmentPuller.java:100)
at io.druid.segment.loading.LocalDataSegmentPuller.getSegmentFiles(LocalDataSegmentPuller.java:41)
at io.druid.segment.loading.OmniSegmentLoader.getSegmentFiles(OmniSegmentLoader.java:125)
at io.druid.segment.loading.OmniSegmentLoader.getSegment(OmniSegmentLoader.java:93)
at io.druid.server.coordination.ServerManager.loadSegment(ServerManager.java:129)
at io.druid.server.coordination.ZkCoordinator.addSegment(ZkCoordinator.java:235)
... 17 more

Thanks,
Tao

Fangjin Yang

Apr 17, 2014, 12:21:01 AM
to druid-de...@googlegroups.com
Inline.


On Wed, Apr 16, 2014 at 9:17 PM, <tao...@yahoo-inc.com> wrote:
Hi, Fangjin:

I stopped all the nodes, dropped the tables in MySQL, cleaned the "consumers" and "druid" directories in ZooKeeper, and finally removed the temp files with "rm -rf /tmp/*" on every node.

Are you running everything on a single machine or on multiple nodes?
 
After finishing these steps, I restarted all the nodes and loaded the data through indexing again, but I still get this exception on the historical node.

Your deep storage is configured to be local, which means segments will be stored on the same filesystem where your indexing is running. If you are running in distributed mode, you will need to put these files in a deep storage that all nodes can reach (such as HDFS or S3).
 


tao...@yahoo-inc.com

Apr 17, 2014, 1:24:47 AM
to druid-de...@googlegroups.com
Hi, Fangjin:
I ran them on multiple nodes (different machines), and I configured the historical node to use HDFS as deep storage, but it doesn't seem to work.
Here is the historical node configuration.
druid.host=<hostname>
druid.service=druid/v1/historical
druid.port=8081

druid.zk.service.host=localhost

druid.extensions.coordinates=["io.druid.extensions:druid-hdfs-storage:0.6.52"]

# Dummy read only AWS account (used to download example data)
druid.s3.secretKey=QyyfVZ7llSiRg6Qcrql1eEUG7buFpAK6T6engr1b
druid.s3.accessKey=AKIAIMKECRUYKDQGR6YQ

druid.server.maxSize=536870912

# Change these to make Druid faster
druid.processing.buffer.sizeBytes=268435456
druid.processing.numThreads=2

druid.segmentCache.locations=[{"path": "/tmp/druid/indexCache", "maxSize"\: 268435456}]

druid.storage.type=hdfs
druid.storage.storageDirectory=hdfs://<hostname>:<port>/druid/segments

Nishant Bangarwa

Apr 17, 2014, 3:57:03 AM
to druid-de...@googlegroups.com
Hi Tao, 

druid.storage.type and druid.storage.storageDirectory are used by the nodes that create the segments, so you need to specify them in the runtime.properties for the indexing service rather than for the historical nodes. Historical nodes only learn the location of segments from the segment metadata stored in MySQL, so they don't need these properties.
It seems your segments are still being created in local storage instead of HDFS.
Once you add these properties and reindex the data, you should see the segments being created in HDFS instead of local storage.
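For illustration, the properties in question would look roughly like this in the indexing service's runtime.properties (namenode host and port are placeholders):

druid.storage.type=hdfs
druid.storage.storageDirectory=hdfs://<namenode-host>:<port>/druid/segments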
 





tao...@yahoo-inc.com

Apr 17, 2014, 5:30:09 AM
to druid-de...@googlegroups.com
Hi,Nishant:

I added these to the indexing configuration and restarted the system.
The historical node still can't load the data.
BTW, is a particular Hadoop version required? I used hadoop-2.2.0, and I added the Hadoop lib path with:
-classpath lib/*:/home/tt/hadoop-2.2.0/share/hadoop/hdfs/lib/*:/home/tt/hadoop-2.2.0/share/hadoop/hdfs/*
in the java command when starting the realtime, historical, and indexing nodes.
Is that right, or is there another way to support Hadoop 2?

2014-04-17 09:28:46,248 ERROR [ZkCoordinator-0] io.druid.curator.announcement.Announcer - Path[/druid/servedSegments/hostname:8081/wikipedia_2013-08-31T00:00:00.000Z_2013-09-01T00:00:00.000Z_2014-04-17T09:15:35.804Z] not announced, cannot unannounce.
2014-04-17 09:28:46,249 ERROR [ZkCoordinator-0] io.druid.server.coordination.ZkCoordinator - Failed to load segment for dataSource: {class=io.druid.server.coordination.ZkCoordinator, exceptionType=class io.druid.segment.loading.SegmentLoadingException, exceptionMessage=Exception loading segment[wikipedia_2013-08-31T00:00:00.000Z_2013-09-01T00:00:00.000Z_2014-04-17T09:15:35.804Z], segment=DataSegment{size=4462, shardSpec=NoneShardSpec, metrics=[count, added, deleted, delta], dimensions=[anonymous, city, continent, country, language, namespace, newpage, page, region, robot, unpatrolled, user], version='2014-04-17T09:15:35.804Z', loadSpec={type=local, path=hdfs:/hostname:port/druid/segments/wikipedia/2013-08-31T00:00:00.000Z_2013-09-01T00:00:00.000Z/2014-04-17T09:15:35.804Z/0/index.zip}, interval=2013-08-31T00:00:00.000Z/2013-09-01T00:00:00.000Z, dataSource='wikipedia', binaryVersion='9'}}
io.druid.segment.loading.SegmentLoadingException: Exception loading segment[wikipedia_2013-08-31T00:00:00.000Z_2013-09-01T00:00:00.000Z_2014-04-17T09:15:35.804Z]
at io.druid.server.coordination.ZkCoordinator.addSegment(ZkCoordinator.java:239)
at io.druid.server.coordination.SegmentChangeRequestLoad.go(SegmentChangeRequestLoad.java:44)
at io.druid.server.coordination.ZkCoordinator$1.childEvent(ZkCoordinator.java:131)
at org.apache.curator.framework.recipes.cache.PathChildrenCache$5.apply(PathChildrenCache.java:494)
at org.apache.curator.framework.recipes.cache.PathChildrenCache$5.apply(PathChildrenCache.java:488)
at org.apache.curator.framework.listen.ListenerContainer$1.run(ListenerContainer.java:92)
at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:293)
at org.apache.curator.framework.listen.ListenerContainer.forEach(ListenerContainer.java:83)
at org.apache.curator.framework.recipes.cache.PathChildrenCache.callListeners(PathChildrenCache.java:485)
at org.apache.curator.framework.recipes.cache.EventOperation.invoke(EventOperation.java:35)
at org.apache.curator.framework.recipes.cache.PathChildrenCache$11.run(PathChildrenCache.java:755)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: io.druid.segment.loading.SegmentLoadingException: Asked to load path[hdfs:/hostname:port/druid/segments/wikipedia/2013-08-31T00:00:00.000Z_2013-09-01T00:00:00.000Z/2014-04-17T09:15:35.804Z/0/index.zip], but it doesn't exist.
at io.druid.segment.loading.LocalDataSegmentPuller.getFile(LocalDataSegmentPuller.java:100)
at io.druid.segment.loading.LocalDataSegmentPuller.getSegmentFiles(LocalDataSegmentPuller.java:41)
at io.druid.segment.loading.OmniSegmentLoader.getSegmentFiles(OmniSegmentLoader.java:125)
at io.druid.segment.loading.OmniSegmentLoader.getSegment(OmniSegmentLoader.java:93)
at io.druid.server.coordination.ServerManager.loadSegment(ServerManager.java:129)
at io.druid.server.coordination.ZkCoordinator.addSegment(ZkCoordinator.java:235)
... 17 more

Thanks,
Tao

Nishant Bangarwa

Apr 17, 2014, 12:18:26 PM
to druid-de...@googlegroups.com
Hi Tao, 
It seems the segments are still being created locally; the loadSpec type in the segment metadata is local.
Can you share the runtime.properties for the indexer and the task spec?
To use hadoop-2.2.0, you can just change the hadoopCoordinates in the task spec to point to hadoop-2.2.0.





tao...@yahoo-inc.com

Apr 17, 2014, 12:46:35 PM
to druid-de...@googlegroups.com
Hi, Nishant:
Here is my indexer runtime.properties:

druid.host=hostname
druid.port=8087
#druid.service=overlord
druid.service=druid/v1/indexer

druid.zk.service.host=localhost

druid.db.connector.connectURI=jdbc:mysql://localhost:3306/druid
druid.db.connector.user=druid
druid.db.connector.password=diurd

druid.selectors.indexing.serviceName=druid:v1:indexer
druid.indexer.queue.startDelay=PT0M
druid.indexer.runner.javaOpts=-server -Xmx1g -Xms1g -XX:NewSize=256m -XX:MaxNewSize=256m -XX:MaxDirectMemorySize=536870912  -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
druid.indexer.runner.startPort=8088
druid.indexer.fork.property.druid.computation.buffer.size=268435456
#druid.indexer.storage.type=db
druid.db.connector.useValidationQuery=true

druid.storage.type=hdfs
druid.storage.storageDirectory=hdfs://<hostname>:port/druid/segments


and the task spec I used is the example provided with Druid.

{
  "type" : "index",
  "dataSource" : "wikipedia",
  "granularitySpec" : {
    "type" : "uniform",
    "gran" : "DAY",
    "intervals" : [ "2013-08-31/2013-09-01" ]
  },
  "aggregators" : [{
     "type" : "count",
     "name" : "count"
    }, {
     "type" : "doubleSum",
     "name" : "added",
     "fieldName" : "added"
    }, {
     "type" : "doubleSum",
     "name" : "deleted",
     "fieldName" : "deleted"
    }, {
     "type" : "doubleSum",
     "name" : "delta",
     "fieldName" : "delta"
  }],
  "firehose" : {
    "type" : "local",
    "baseDir" : "examples/indexing/",
    "filter" : "wikipedia_data.json",
    "parser" : {
      "timestampSpec" : {
        "column" : "timestamp"
      },
      "data" : {
        "format" : "json",
        "dimensions" : ["page","language","user","unpatrolled","newPage","robot","anonymous","namespace","continent","country","region","city"]
      }
    }
   }
}

Thanks,
Tao

         
         
         
         
         
         
         
 

Nishant Bangarwa

Apr 17, 2014, 2:07:37 PM
to druid-de...@googlegroups.com
Your runtime.properties is missing the extensions coordinates for hdfs-storage.
Add this -
druid.extensions.coordinates=["io.druid.extensions:druid-hdfs-storage:0.6.52"]

You will also need to specify the hdfs-storage module for the historical nodes so they can pull segments from HDFS.

Also specify hadoopCoordinates in the task spec - 
 "hadoopCoordinates": "org.apache.hadoop:hadoop-client:2.2.0"






tao...@yahoo-inc.com

Apr 18, 2014, 5:02:46 AM
to druid-de...@googlegroups.com
Hi, Nishant:
I added the parameters you mentioned:
"druid.extensions.coordinates=["io.druid.extensions:druid-hdfs-storage:0.6.52"]"
in overlord/runtime.properties and
"hadoopCoordinates": "org.apache.hadoop:hadoop-client:2.2.0" in the index JSON file.

I still get the exception when loading the data.

The commands I used are:

java -Duser.timezone=UTC -Dfile.encoding=UTF-8 -classpath lib/*:/home/taoluo/hadoop-lib/*:/home/taoluo/software/hadoop-2.2.0/etc/hadoop:config/overlord io.druid.cli.Main server overlord
on overlord node.

java -Xmx512m -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Ddruid.realtime.specFile=examples/indexing/wikipedia.spec -classpath lib/*:/home/taoluo/hadoop-lib:/home/taoluo/software/hadoop-2.2.0/etc/hadoop:config/realtime io.druid.cli.Main server realtime
on realtime node.

java -Xmx1g -Duser.timezone=UTC -Dfile.encoding=UTF-8 -classpath lib/*:/home/taoluo/hadoop-lib:/home/taoluo/software/hadoop-2.2.0/etc/hadoop:config/historical io.druid.cli.Main server historical
on historical node.

The dir "/home/taoluo/hadoop-lib" have "hadoop-client-2.2.0.jar" and "hadoop-client-2.2.0-tests.jar".

Did I use the wrong command, or were some of the parameters I configured wrong?
Here is the exception from the task log:

2014-04-18 08:50:56,175 INFO [task-runner-0] io.druid.segment.IndexIO$DefaultIndexIOHandler - Converting v8[/tmp/persistent/task/index_wikipedia_2014-04-18T08:50:46.657Z/work/wikipedia_2013-08-31T00:00:00.000Z_2013-09-01T00:00:00.000Z_2014-04-18T08:50:46.670Z_0/wikipedia_2013-08-31T00:00:00.000Z_2013-09-01T00:00:00.000Z_2014-04-18T08:50:46.670Z/spill0/v8-tmp] to v9[/tmp/persistent/task/index_wikipedia_2014-04-18T08:50:46.657Z/work/wikipedia_2013-08-31T00:00:00.000Z_2013-09-01T00:00:00.000Z_2014-04-18T08:50:46.670Z_0/wikipedia_2013-08-31T00:00:00.000Z_2013-09-01T00:00:00.000Z_2014-04-18T08:50:46.670Z/spill0]
2014-04-18 08:50:56,177 INFO [task-runner-0] io.druid.segment.IndexIO$DefaultIndexIOHandler - Processing file[dim_anonymous.drd]
2014-04-18 08:50:56,182 INFO [task-runner-0] io.druid.segment.IndexIO$DefaultIndexIOHandler - Dimension[anonymous] is single value, converting...
2014-04-18 08:50:56,203 INFO [task-runner-0] io.druid.segment.IndexIO$DefaultIndexIOHandler - Processing file[dim_city.drd]
2014-04-18 08:50:56,204 INFO [task-runner-0] io.druid.segment.IndexIO$DefaultIndexIOHandler - Dimension[city] is single value, converting...
2014-04-18 08:50:56,204 INFO [task-runner-0] io.druid.segment.IndexIO$DefaultIndexIOHandler - Processing file[dim_continent.drd]
2014-04-18 08:50:56,204 INFO [task-runner-0] io.druid.segment.IndexIO$DefaultIndexIOHandler - Dimension[continent] is single value, converting...
2014-04-18 08:50:56,207 INFO [task-runner-0] io.druid.segment.IndexIO$DefaultIndexIOHandler - Processing file[dim_country.drd]
2014-04-18 08:50:56,208 INFO [task-runner-0] io.druid.segment.IndexIO$DefaultIndexIOHandler - Dimension[country] is single value, converting...
2014-04-18 08:50:56,208 INFO [task-runner-0] io.druid.segment.IndexIO$DefaultIndexIOHandler - Processing file[dim_language.drd]
2014-04-18 08:50:56,208 INFO [task-runner-0] io.druid.segment.IndexIO$DefaultIndexIOHandler - Dimension[language] is single value, converting...
2014-04-18 08:50:56,209 INFO [task-runner-0] io.druid.segment.IndexIO$DefaultIndexIOHandler - Processing file[dim_namespace.drd]
2014-04-18 08:50:56,209 INFO [task-runner-0] io.druid.segment.IndexIO$DefaultIndexIOHandler - Dimension[namespace] is single value, converting...
2014-04-18 08:50:56,209 INFO [task-runner-0] io.druid.segment.IndexIO$DefaultIndexIOHandler - Processing file[dim_newpage.drd]
2014-04-18 08:50:56,209 INFO [task-runner-0] io.druid.segment.IndexIO$DefaultIndexIOHandler - Dimension[newpage] is single value, converting...
2014-04-18 08:50:56,210 INFO [task-runner-0] io.druid.segment.IndexIO$DefaultIndexIOHandler - Processing file[dim_page.drd]
2014-04-18 08:50:56,213 INFO [task-runner-0] io.druid.segment.IndexIO$DefaultIndexIOHandler - Dimension[page] is single value, converting...
2014-04-18 08:50:56,214 INFO [task-runner-0] io.druid.segment.IndexIO$DefaultIndexIOHandler - Processing file[dim_region.drd]
2014-04-18 08:50:56,214 INFO [task-runner-0] io.druid.segment.IndexIO$DefaultIndexIOHandler - Dimension[region] is single value, converting...
2014-04-18 08:50:56,214 INFO [task-runner-0] io.druid.segment.IndexIO$DefaultIndexIOHandler - Processing file[dim_robot.drd]
2014-04-18 08:50:56,215 INFO [task-runner-0] io.druid.segment.IndexIO$DefaultIndexIOHandler - Dimension[robot] is single value, converting...
2014-04-18 08:50:56,215 INFO [task-runner-0] io.druid.segment.IndexIO$DefaultIndexIOHandler - Processing file[dim_unpatrolled.drd]
2014-04-18 08:50:56,215 INFO [task-runner-0] io.druid.segment.IndexIO$DefaultIndexIOHandler - Dimension[unpatrolled] is single value, converting...
2014-04-18 08:50:56,216 INFO [task-runner-0] io.druid.segment.IndexIO$DefaultIndexIOHandler - Processing file[dim_user.drd]
2014-04-18 08:50:56,216 INFO [task-runner-0] io.druid.segment.IndexIO$DefaultIndexIOHandler - Dimension[user] is single value, converting...
2014-04-18 08:50:56,216 INFO [task-runner-0] io.druid.segment.IndexIO$DefaultIndexIOHandler - Processing file[index.drd]
2014-04-18 08:50:56,219 INFO [task-runner-0] io.druid.segment.IndexIO$DefaultIndexIOHandler - Processing file[inverted.drd]
2014-04-18 08:50:56,219 INFO [task-runner-0] io.druid.segment.IndexIO$DefaultIndexIOHandler - Processing file[met_added_LITTLE_ENDIAN.drd]
2014-04-18 08:50:56,226 INFO [task-runner-0] io.druid.segment.IndexIO$DefaultIndexIOHandler - Processing file[met_count_LITTLE_ENDIAN.drd]
2014-04-18 08:50:56,227 INFO [task-runner-0] io.druid.segment.IndexIO$DefaultIndexIOHandler - Processing file[met_deleted_LITTLE_ENDIAN.drd]
2014-04-18 08:50:56,227 INFO [task-runner-0] io.druid.segment.IndexIO$DefaultIndexIOHandler - Processing file[met_delta_LITTLE_ENDIAN.drd]
2014-04-18 08:50:56,228 INFO [task-runner-0] io.druid.segment.IndexIO$DefaultIndexIOHandler - Processing file[spatial.drd]
2014-04-18 08:50:56,231 INFO [task-runner-0] io.druid.segment.IndexIO$DefaultIndexIOHandler - Processing file[time_LITTLE_ENDIAN.drd]
2014-04-18 08:50:56,239 INFO [task-runner-0] io.druid.segment.IndexIO$DefaultIndexIOHandler - Skipped files[[index.drd, inverted.drd, spatial.drd]]
9.623: [Full GC9.623: [Tenured: 26657K->31422K(786432K), 0.1514110 secs] 85856K->31422K(1022400K), [Perm : 37566K->37566K(37568K)], 0.1515960 secs] [Times: user=0.15 sys=0.00, real=0.15 secs] 
2014-04-18 08:50:56,606 WARN [task-runner-0] io.druid.indexing.common.index.YeOldePlumberSchool - Failed to merge and upload
java.io.IOException: failure to login
	at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:490)
	at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:452)
	at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:1494)
	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1395)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
	at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
	at io.druid.storage.hdfs.HdfsDataSegmentPusher.push(HdfsDataSegmentPusher.java:82)
	at io.druid.indexing.common.task.IndexTask$2.push(IndexTask.java:339)
	at io.druid.indexing.common.index.YeOldePlumberSchool$1.finishJob(YeOldePlumberSchool.java:156)
	at io.druid.indexing.common.task.IndexTask.generateSegment(IndexTask.java:395)
	at io.druid.indexing.common.task.IndexTask.run(IndexTask.java:153)
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:216)
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:195)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:744)
Caused by: javax.security.auth.login.LoginException: unable to find LoginModule class: org.apache.hadoop.security.UserGroupInformation$HadoopLoginModule
	at javax.security.auth.login.LoginContext.invoke(LoginContext.java:800)
	at javax.security.auth.login.LoginContext.access$000(LoginContext.java:203)
	at javax.security.auth.login.LoginContext$4.run(LoginContext.java:690)
	at javax.security.auth.login.LoginContext$4.run(LoginContext.java:688)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:687)
	at javax.security.auth.login.LoginContext.login(LoginContext.java:595)
	at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:471)
	... 16 more

Thanks,
Tao

Taras Puhol

Apr 18, 2014, 5:37:29 AM
to druid-de...@googlegroups.com
Hi Nishant Bangarwa,
see inline


Hi Tao, 

druid.storage.type and druid.storage.storageDirectory are used by the nodes that create the segments, so you need to specify them in the runtime.properties for the indexing service rather than for the historical nodes. Historical nodes only learn the location of segments from the segment metadata stored in MySQL, so they don't need these properties.
It seems your segments are still being created in local storage instead of HDFS.
Once you add these properties and reindex the data, you should see the segments being created in HDFS instead of local storage.

Just to understand properly:
Data is ingested by batch indexing, and I can see this data in ..druid/localStorage. Is this storage just needed for indexing, so there is no need to keep it afterwards?
When batch indexing is done, I see the segment in druid/indexCache.
So, as I understand it, to make requests to the historical node in the future I don't need to keep .druid/localStorage and can delete the segments in it. Right?

Fangjin Yang

Apr 18, 2014, 2:32:23 PM
to druid-de...@googlegroups.com
Hi Tao, did you include the Hadoop configuration XML files in the classpath? Also, are you using index_task or hadoop_index_task? Make sure to use the latter.
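For reference, a rough skeleton of a hadoop index task spec, assuming the 0.6.x layout from the Part 2 tutorial (everything except the type and hadoopCoordinates is elided):

{
  "type" : "index_hadoop",
  "hadoopCoordinates" : "org.apache.hadoop:hadoop-client:2.2.0",
  "config" : {
    "dataSource" : "wikipedia",
    ... the rest of the hadoop indexer config (granularitySpec, pathSpec, and so on) ...
  }
}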

-- FJ

tao...@yahoo-inc.com

Apr 20, 2014, 1:09:23 PM
to druid-de...@googlegroups.com
Hi, Fangjin:
The Hadoop version I used is 2.2.0.
On the overlord node, I used this command to start it:
"java -Duser.timezone=UTC -Dfile.encoding=UTF-8 -classpath lib/*:/home/taoluo/hadoop-lib/*:/home/taoluo/software/hadoop-2.2.0/etc/hadoop:config/overlord io.druid.cli.Main server overlord"
In "/home/taoluo/hadoop-lib" dir, there are "hadoop-client-2.2.0.jar" and "hadoop-client-2.2.0-test.jar" in it.
In "/home/taoluo/software/hadoop-2.2.0/etc/hadoop", there are configuration files for hadoop-2.2.0.
For example:"hdfs-site.xml", "core-site.xml", "capacity-scheduler.xml" and etc.

You referred "hadoop_index_task", I think the example  your doc "Loading your Data(part 2)" described is this type.
I followed these, but task failed. The difference from your example was I used hadoop-2.2.0 not hadoop-1.0.3.
Which step I made the mistake?
Here is the exception from task log:

20T16:23:02.488Z] to overlord[http://<hostname>:<port>/druid/indexer/v1/action]: LockListAction{}
2014-04-20 16:23:25,654 INFO [task-runner-0] io.druid.indexing.common.task.HadoopIndexTask - Setting version to: 2014-04-20T16:23:02.489Z
2014-04-20 16:23:25,963 ERROR [task-runner-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[HadoopIndexTask{id=index_hadoop_wikipedia_2014-04-20T16:23:02.488Z, type=index_hadoop, dataSource=wikipedia}]
java.lang.RuntimeException: java.lang.RuntimeException: class org.apache.hadoop.security.ShellBasedUnixGroupsMapping not org.apache.hadoop.security.GroupMappingServiceProvider
	at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:899)
	at org.apache.hadoop.security.Groups.<init>(Groups.java:48)
	at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:140)
	at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:205)
	at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:184)
	at org.apache.hadoop.security.UserGroupInformation.isSecurityEnabled(UserGroupInformation.java:236)
	at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:466)
	at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:452)
	at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:1494)
	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1395)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:123)
	at io.druid.storage.hdfs.HdfsDataSegmentPusher.getPathForHadoop(HdfsDataSegmentPusher.java:70)
	at io.druid.indexing.common.task.HadoopIndexTask.run(HadoopIndexTask.java:178)
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:216)
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:195)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.RuntimeException: class org.apache.hadoop.security.ShellBasedUnixGroupsMapping not org.apache.hadoop.security.GroupMappingServiceProvider
	at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:893)
	... 19 more
2014-04-20 16:23:25,973 INFO [task-runner-0] io.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: {
  "id" : "index_hadoop_wikipedia_2014-04-20T16:23:02.488Z",
  "status" : "FAILED",
  "duration" : 12598
} 


I also have some other questions.
This topic is about running the example from your doc "Loading Your Data (Part 1)" (http://druid.io/docs/0.6.73/Tutorial:-Loading-Your-Data-Part-1.html).

1. It doesn't mention how to start the realtime node, so I used the command from part 2 to start it.

"java -Xmx256m -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Ddruid.realtime.specFile=examples/indexing/wikipedia.spec -classpath lib/*:/home/taoluo/hadoop-lib/*:/home/taoluo/software/hadoop-2.2.0/etc/hadoop:config/realtime io.druid.cli.Main server realtime"

I added hadoop-client.jar and the Hadoop configuration files to the classpath.

Is that right? And must "druid.realtime.specFile" be given? If I want to add another data source to Druid, should I define another specFile for this data source and restart the realtime node?

2. The difference from the steps in the doc "Loading Your Data (Part 1)" is that I want to use HDFS as deep storage.
In my understanding, the realtime node will aggregate the indexed data into segments in deep storage for the historical node to read.
So we should see these segments in HDFS. Is that right?
But I got the exception mentioned above when I loaded the data. Were my Hadoop configurations wrong?

Nishant Bangarwa

Apr 20, 2014, 3:15:38 PM
to druid-de...@googlegroups.com
Hi Tao, 
See Inline


On Sun, Apr 20, 2014 at 10:39 PM, <tao...@yahoo-inc.com> wrote:
Hi, Fangjin:
Can you check by updating to a newer version of hdfs-storage?


I also have some other questions.
This topic is about running the example from your doc "Loading Your Data (Part 1)" (http://druid.io/docs/0.6.73/Tutorial:-Loading-Your-Data-Part-1.html).

1. It doesn't mention how to start the realtime node, so I used the command from part 2 to start it.

"java -Xmx256m -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Ddruid.realtime.specFile=examples/indexing/wikipedia.spec -classpath lib/*:/home/taoluo/hadoop-lib/*:/home/taoluo/software/hadoop-2.2.0/etc/hadoop:config/realtime io.druid.cli.Main server realtime"

I added hadoop-client.jar and the Hadoop configuration files to the classpath.

Is that right? And must "druid.realtime.specFile" be given? If I want to add another data source to Druid, should I define another specFile for this data source and restart the realtime node?
This seems fine.
The realtime spec file is a JSON array of realtime schemas; you can add another data source to the same spec file and restart the realtime node.
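For example, a rough sketch assuming the same layout as the example wikipedia.spec (most fields elided; "my_second_source" is hypothetical):

[
  {
    "schema" : { "dataSource" : "wikipedia", ... },
    "config" : { ... },
    "firehose" : { ... },
    "plumber" : { ... }
  },
  {
    "schema" : { "dataSource" : "my_second_source", ... },
    "config" : { ... },
    "firehose" : { ... },
    "plumber" : { ... }
  }
]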
 
2. The difference from the steps in the doc "Loading Your Data (Part 1)" is that I want to use HDFS as deep storage.
In my understanding, the realtime node will aggregate the indexed data into segments in deep storage for the historical node to read.
So we should see these segments in HDFS. Is that right?
Yes, that's right; you can use realtime nodes to push segments to HDFS.
But I got the exception mentioned above when I loaded the data. Were my Hadoop configurations wrong?
Can you try running this after updating the hdfs-storage module and see if that fixes it?


tao...@yahoo-inc.com

Apr 20, 2014, 10:37:00 PM
to druid-de...@googlegroups.com
Hi, Nishant:

Sorry, I'm not very clear about how to update to a newer version of hdfs-storage. Is its version 0.6.96? I updated it by cloning your source code from git, running mvn package to build "druid-hdfs-storage-0.6.96-SNAPSHOT.jar", and then using that jar to replace "druid-hdfs-storage-0.53.jar" in the ".m2" dir on the realtime, historical, and indexing nodes.

I also copied the Hadoop configuration files from the Hadoop master and added them to the start command when starting the overlord, realtime, and historical nodes.

But when I load the data, I still get the same exception. It's confusing...

2014-04-21 02:26:09,242 INFO [task-runner-0] io.druid.indexing.common.task.IndexTask - Task[index_wikipedia_2014-04-21T02:25:58.617Z] interval[2013-08-31T00:00:00.000Z/2013-09-01T00:00:00.000Z] partition[0] took in 5 rows (5 processed, 0 unparseable, 0 thrown away) and output 5 rows
2014-04-21 02:26:09,245 ERROR [task-runner-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[IndexTask{id=index_wikipedia_2014-04-21T02:25:58.617Z, type=index, dataSource=wikipedia}]
java.lang.RuntimeException: java.io.IOException: failure to login
	at com.google.common.base.Throwables.propagate(Throwables.java:160)
	at io.druid.indexing.common.index.YeOldePlumberSchool$1.finishJob(YeOldePlumberSchool.java:165)
	at io.druid.indexing.common.task.IndexTask.generateSegment(IndexTask.java:395)
	at io.druid.indexing.common.task.IndexTask.run(IndexTask.java:153)
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:216)
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:195)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:744)
Caused by: java.io.IOException: failure to login
	at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:490)
	at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:452)
	at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:1494)
	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1395)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:123)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:238)
	at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
	at io.druid.storage.hdfs.HdfsDataSegmentPusher.push(HdfsDataSegmentPusher.java:77)
	at io.druid.indexing.common.task.IndexTask$2.push(IndexTask.java:339)
	at io.druid.indexing.common.index.YeOldePlumberSchool$1.finishJob(YeOldePlumberSchool.java:156)
	... 8 more
Caused by: javax.security.auth.login.LoginException: unable to find LoginModule class: org.apache.hadoop.security.UserGroupInformation$HadoopLoginModule
	at javax.security.auth.login.LoginContext.invoke(LoginContext.java:800)
	at javax.security.auth.login.LoginContext.access$000(LoginContext.java:203)
	at javax.security.auth.login.LoginContext$4.run(LoginContext.java:690)
	at javax.security.auth.login.LoginContext$4.run(LoginContext.java:688)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:687)
	at javax.security.auth.login.LoginContext.login(LoginContext.java:595)
	at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:471)
	... 18 more
2014-04-21 02:26:09,251 INFO [task-runner-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Removing task directory: /tmp/persistent/task/index_wikipedia_2014-04-21T02:25:58.617Z/work
2014-04-21 02:26:09,264 INFO [task-runner-0] io.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: {
  "id" : "index_wikipedia_2014-04-21T02:25:58.617Z",
  "status" : "FAILED",
  "duration" : 694
}

Thanks,
Tao


Fangjin Yang

Apr 21, 2014, 1:31:01 PM
to druid-de...@googlegroups.com
Hi Tao,

Getting Druid to work with different versions of Hadoop can be a bit daunting and often requires recompiling Druid with the version of Hadoop you require and all the necessary dependencies. If you are interested, we can set up some live help in our IRC. You can ping fj, neo_, or cheddar.
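Roughly, the rebuild looks like this (the repository location and build flags are assumptions here, not exact instructions):

git clone https://github.com/metamx/druid.git
cd druid
# change the hadoop dependency version in the relevant pom.xml to the one you run (e.g. 2.2.0), then rebuild:
mvn clean package -DskipTests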

-- FJ

tao...@yahoo-inc.com

Apr 25, 2014, 2:06:14 AM
to druid-de...@googlegroups.com
Hi, Fangjin:

I set the Hadoop version and recompiled hdfs-storage from the source code, and used this jar to replace the jar at ".m2/repository/io/druid/extensions/druid-hdfs-storage/0.6.52/druid-hdfs-storage-0.6.52.jar" on the indexing, realtime, and historical nodes.
But when I run the task, it still fails with an exception.
The exception is described below:

2014-04-25 05:52:29,787 ERROR [task-runner-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Uncaught Throwable while running task[IndexTask{id=index_wikipedia_2014-04-25T05:52:18.495Z, type=index, dataSource=wikipedia}]
java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FileSystem
	at java.lang.ClassLoader.defineClass1(Native Method)
	at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
	at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
	at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
	at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:270)
	at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:363)
	at java.util.ServiceLoader$1.next(ServiceLoader.java:445)
	at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2400)
	at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2411)
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2428)
	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
	at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:166)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:351)
	at org.apache.hadoop.fs.Path.getFileSystem(Path.java:287)
	at io.druid.storage.hdfs.HdfsDataSegmentPusher.push(HdfsDataSegmentPusher.java:77)
	at io.druid.indexing.common.task.IndexTask$2.push(IndexTask.java:339)
	at io.druid.indexing.common.index.YeOldePlumberSchool$1.finishJob(YeOldePlumberSchool.java:156)
	at io.druid.indexing.common.task.IndexTask.generateSegment(IndexTask.java:395)
	at io.druid.indexing.common.task.IndexTask.run(IndexTask.java:153)
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:216)
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:195)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:744)

It is so confusing; this class is in "druid-hdfs-storage-0.6.52.jar":

jar tf /home/taoluo/.m2/repository/io/druid/extensions/druid-hdfs-storage/0.6.52/druid-hdfs-storage-0.6.52.jar | grep "org/apache/hadoop/fs/FileSystem"
org/apache/hadoop/fs/FileSystem$1.class
org/apache/hadoop/fs/FileSystem$2.class
org/apache/hadoop/fs/FileSystem$3.class
org/apache/hadoop/fs/FileSystem$4.class
org/apache/hadoop/fs/FileSystem$5.class
org/apache/hadoop/fs/FileSystem$Cache$ClientFinalizer.class
org/apache/hadoop/fs/FileSystem$Cache$Key.class
org/apache/hadoop/fs/FileSystem$Cache.class
org/apache/hadoop/fs/FileSystem$Statistics.class
org/apache/hadoop/fs/FileSystem.class
org/apache/hadoop/fs/FileSystemLinkResolver.class

So why can't Druid find it?

Thanks,
Tao

tao...@yahoo-inc.com

Apr 25, 2014, 2:07:47 AM
to druid-de...@googlegroups.com
Hi, Nishant:

Could you help me take a look at the exception I described above?

Thanks,
Tao

tao...@yahoo-inc.com

Apr 25, 2014, 4:01:09 AM
to druid-de...@googlegroups.com
Hi, Fangjin & Nishant:

I have fixed it! The segments are now generated under the HDFS path.
I added the hadoop-2.2.0 HDFS lib jars to the java start command and removed "protobuf-java-2.4.0a.jar" from the Druid lib dir, because it conflicts with Hadoop's protobuf 2.5.
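Roughly, the fix was (paths taken from the earlier messages; the exact jar locations are illustrative):

# remove the protobuf jar bundled with Druid, since it conflicts with Hadoop 2.x's protobuf 2.5
rm lib/protobuf-java-2.4.0a.jar

# start each node with the hadoop-2.2.0 HDFS jars and config dir on the classpath, e.g. for the overlord:
java -Duser.timezone=UTC -Dfile.encoding=UTF-8 \
  -classpath lib/*:/home/taoluo/software/hadoop-2.2.0/share/hadoop/hdfs/*:/home/taoluo/software/hadoop-2.2.0/share/hadoop/hdfs/lib/*:/home/taoluo/software/hadoop-2.2.0/etc/hadoop:config/overlord \
  io.druid.cli.Main server overlord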

Thank you for your help.


Fangjin Yang

Apr 25, 2014, 2:06:34 PM
to druid-de...@googlegroups.com
Hi Tao, that's great to hear!

Let us know if you see more problems.