can drake handle hdfs urls with a host?

Koert Kuipers

Apr 22, 2015, 7:00:28 PM
to drake-w...@googlegroups.com
my test Drakefile is:
; bunch of hadoop fs operations
!hdfs://master:8020/tmp/test <- !hdfs:///user/koert/candidates.txt
  set -e
  hadoop fs -cp $INPUT $OUTPUT

the error i get is:
$ drake
java.io.IOException: No FileSystem for scheme: null
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2584)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2591)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2612)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
    at drake.fs$hdfs_filesystem.invoke(fs.clj:161)
    at drake.fs$fn__1899.invoke(fs.clj:175)
    at drake_interface.core$fn__1643$G__1593__1650.invoke(core.clj:8)
    at drake.fs$fn__1871.invoke(fs.clj:195)
    at drake_interface.core$fn__1710$G__1591__1717.invoke(core.clj:8)
    at drake.fs$data_in_QMARK__impl.invoke(fs.clj:64)
    at drake_interface.core$fn__1697$G__1589__1704.invoke(core.clj:8)
    at drake.fs$fs.invoke(fs.clj:443)
    at drake.core$should_build_QMARK_$fn__5080.invoke(core.clj:221)
    at clojure.core$some.invoke(core.clj:2515)
    at drake.core$should_build_QMARK_.invoke(core.clj:221)
    at drake.core$predict_steps$fn__5100.invoke(core.clj:255)
    at clojure.core.protocols$fn__6089.invoke(protocols.clj:127)
    at clojure.core.protocols$fn__6057$G__6052__6066.invoke(protocols.clj:19)
    at clojure.core.protocols$seq_reduce.invoke(protocols.clj:31)
    at clojure.core.protocols$fn__6080.invoke(protocols.clj:48)
    at clojure.core.protocols$fn__6031$G__6026__6044.invoke(protocols.clj:13)
    at clojure.core$reduce.invoke(core.clj:6289)
    at drake.core$predict_steps.invoke(core.clj:251)
    at drake.core$run.invoke(core.clj:646)
    at drake.core$drake$fn__5451.invoke(core.clj:878)
    at drake.core$with_workflow_file.invoke(core.clj:736)
    at drake.core$drake.doInvoke(core.clj:878)
    at clojure.lang.RestFn.invoke(RestFn.java:397)
    at clojure.lang.AFn.applyToHelper(AFn.java:152)
    at clojure.lang.RestFn.applyTo(RestFn.java:132)
    at clojure.core$apply.invoke(core.clj:624)
    at drake.core$_main.doInvoke(core.clj:892)
    at clojure.lang.RestFn.invoke(RestFn.java:397)
    at clojure.lang.AFn.applyToHelper(AFn.java:152)
    at clojure.lang.RestFn.applyTo(RestFn.java:132)
    at drake.core.main(Unknown Source)

if i drop the host and port from the output url in my Drakefile:
; bunch of hadoop fs operations
!hdfs:///tmp/test <- !hdfs:///user/koert/candidates.txt
  set -e
  hadoop fs -cp $INPUT $OUTPUT

then it works fine:
$ drake
The following steps will be run, in order:
  1: hdfs:///tmp/test <- hdfs:///user/koert/candidates.txt [timestamped]
Confirm? [y/n]
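
A possible reading of the trace, stated as an assumption and not confirmed against Drake's source: the message names a null scheme, which appears to be what Hadoop's FileSystem.get reports when it is handed a URI that has an authority (host:port) but no scheme. If the hdfs: prefix were stripped before the rest of the string is parsed, //master:8020/tmp/test would be exactly such a URI, while ///tmp/test has neither scheme nor authority and could fall back to the default filesystem. The sketch below uses only java.net.URI from the JDK (no Hadoop or Drake classes) to show how the three strings parse; the class name HdfsUriParts and the describe helper are made up for illustration.

```java
import java.net.URI;

// Hypothetical helper, not part of Drake or Hadoop: prints the parts of a URI
// that decide which filesystem implementation gets looked up.
class HdfsUriParts {
    static String describe(String s) {
        URI u = URI.create(s);
        return "scheme=" + u.getScheme()
             + " authority=" + u.getAuthority()
             + " path=" + u.getPath();
    }

    public static void main(String[] args) {
        // Full URL from the failing Drakefile: scheme and authority both present.
        System.out.println(describe("hdfs://master:8020/tmp/test"));
        // scheme=hdfs authority=master:8020 path=/tmp/test

        // URL from the working Drakefile: scheme present, no authority.
        System.out.println(describe("hdfs:///tmp/test"));
        // scheme=hdfs authority=null path=/tmp/test

        // ASSUMPTION: what is left if the "hdfs:" prefix is stripped first.
        // The authority survives but the scheme is null.
        System.out.println(describe("//master:8020/tmp/test"));
        // scheme=null authority=master:8020 path=/tmp/test
    }
}
```

If that reading is right, the host form fails because the scheme is lost while the host is kept, not because of the host itself.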
