How to increase performance in Flume


Srinivasan Ramalingam

Mar 21, 2013, 1:25:55 AM
to chenn...@googlegroups.com
Hi Guys,
I am transferring a file from an Apache server to HDFS with the help of Flume. The received file is split into 1 KB files when it is stored in HDFS. How can I get the entire file in HDFS? My flume.conf file is:


dst_agent.sources = netcat
dst_agent.channels = memoryChannel
dst_agent.sinks = hbaseSink

dst_agent.sources.netcat.type = netcat
dst_agent.sources.netcat.bind = xx.xx.xxx.xxx
dst_agent.sources.netcat.port = 10000
dst_agent.sources.netcat.channels = memoryChannel

dst_agent.sinks.hbaseSink.channel = memoryChannel
dst_agent.sinks.hbaseSink.type = hdfs
dst_agent.sinks.hbaseSink.hdfs.path=hdfs://xxx:8020/flume/wedata
dst_agent.sinks.hbaseSink.hdfs.batchSize=100
dst_agent.sinks.hbaseSink.hdfs.rollSize=1024000
dst_agent.sinks.hbaseSink.hdfs.fileType=DataStream

dst_agent.channels.memoryChannel.type = memory
dst_agent.channels.memoryChannel.capacity = 100000000
dst_agent.channels.memoryChannel.transactionCapacity = 100000000



  

Senthil Kumar

Mar 21, 2013, 1:27:41 AM
to chenn...@googlegroups.com
Where is Binish? He is the expert in Flume.
@Binish
Your thoughts on this?




  

--
You received this message because you are subscribed to the Google Groups "Hadoop User Group (HUG) Chennai" group.

Srinivasan Ramalingam

Mar 21, 2013, 1:45:17 AM
to chenn...@googlegroups.com
Hi Senthil,
   How are you? I don't have Binish's email ID. Can you please provide it?
Thanks and regards
  R.Srinivasan

On Thu, Mar 21, 2013 at 10:57 AM, Senthil Kumar <senthilk...@gmail.com> wrote:
Where is Binish?? The expert in Flume..

Bini

Mar 24, 2013, 2:02:33 PM
to chenn...@googlegroups.com
Hi Srinivas,

Why don't you change the various roll parameter values and see how the data is written to HDFS?
e.g.: dst_agent.sinks.hbaseSink.hdfs.rollSize = 0
With this setting, the sink never rolls files based on size.

Look at the volume of data coming into the system, and set rollInterval and rollCount accordingly.
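
A minimal sketch of how those settings could look, reusing the sink name from the posted flume.conf. The posted config sets only hdfs.rollSize, so the HDFS sink's documented defaults for hdfs.rollCount (10 events) and hdfs.rollInterval (30 seconds) still apply, which is a likely reason for the small files. The 300-second interval below is an assumed placeholder, not a recommendation; pick it from the actual data volume.

```
# Assumed example values; tune them to the incoming data volume.
# A value of 0 disables that roll trigger.

# Never roll based on file size
dst_agent.sinks.hbaseSink.hdfs.rollSize = 0
# Never roll based on the number of events written
dst_agent.sinks.hbaseSink.hdfs.rollCount = 0
# Roll on time only: close the current file every 300 seconds (placeholder)
dst_agent.sinks.hbaseSink.hdfs.rollInterval = 300
```

With size and count rolling disabled, each HDFS file holds everything received during one interval, so a longer interval gives fewer, larger files at the cost of data staying longer in an open file.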

Regards,
Binish

Srinivasan Ramalingam

Mar 25, 2013, 1:03:45 AM
to chenn...@googlegroups.com
OK, thanks Binish, I will do that.
