My goal is to use Flume to pull messages from Kafka into HDFS,
but it fails even for the simplest example: an Exec source feeding a File_roll sink.
My Flume configuration:
# flm.conf
agent.sources = r1
agent.channels = c1
agent.sources.r1.type = exec
agent.sources.r1.command = tail -F /home/me/in.txt
agent.sources.r1.channels = c1
agent.channels.c1.type = memory
agent.sinks = k1
agent.sinks.k1.channel = c1
agent.sinks.k1.type = file_roll
agent.sinks.k1.sink.directory = /home/me/out.txt
agent.sinks.k1.sink.rollInterval = 10
Add some lines to the source file:
~$ echo apache >> in.txt
~$ echo bigtable >> in.txt
Run Flume:
~/bin/apache-flume-1.6.0-bin/bin/flume-ng agent -n agent -f flm.conf -Dflume.root.logger=INFO,console
Warning: No configuration directory set! Use --conf <dir> to override.
Info: Including Hive libraries found via () for Hive access
+ exec /usr/lib/jvm/java-8-openjdk-amd64/bin/java -Xmx20m -Dflume.root.logger=INFO,console -cp '/home/me/bin/apache-flume-1.6.0-bin/lib/*:/lib/*' -Djava.library.path= org.apache.flume.node.Application -n agent -f /home/me/flm.conf
log4j:WARN No appenders could be found for logger (org.apache.flume.lifecycle.LifecycleSupervisor).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
# No more logs printed
Append more text:
~$ echo camus >> in.txt
...
~$ echo yarn >> in.txt
~$ echo zookeeper >> in.txt
After stopping Flume, I checked the output directory but couldn't find any output file.
The problem turned out to be this line:
agent.sinks.k1.sink.directory = /home/me/out.txt
The value of sink.directory should be a directory, not a file name.
What surprises me is that I didn't see any error messages.
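Since the agent gave no feedback here, one option is to check the sink directory before starting Flume. The following is only a minimal sketch: check_sink_dir is a hypothetical helper, not part of Flume, and it assumes the agent and sink are named agent and k1 as in flm.conf and that the directory has to exist before the agent starts.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not part of Flume): confirm that the
# file_roll sink's sink.directory in a config file is an existing directory.
# The property key matches the agent/sink names used in flm.conf (assumption).
check_sink_dir() {
  conf_file="$1"
  # Take the value after '=', trim trailing whitespace, keep the last match.
  dir=$(sed -n 's/^[[:space:]]*agent\.sinks\.k1\.sink\.directory[[:space:]]*=[[:space:]]*//p' "$conf_file" \
        | sed 's/[[:space:]]*$//' | tail -n 1)
  if [ -z "$dir" ]; then
    echo "sink.directory is not set in $conf_file"
    return 2
  fi
  if [ -d "$dir" ]; then
    echo "ok: $dir is a directory"
  else
    echo "error: $dir is not an existing directory"
    return 1
  fi
}
```

It could be called as check_sink_dir flm.conf just before flume-ng. As for the silence: the "No configuration directory set" warning and the log4j "No appenders could be found" warnings in the output above suggest that no log4j.properties was on the classpath, which would likely explain why no error reached the console. Passing --conf with the distribution's conf directory (for this install presumably ~/bin/apache-flume-1.6.0-bin/conf) should let the logger initialize and report such errors.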