Message from discussion Question about Improving performance
Harold Lim  
From: Harold Lim <harold.c....@gmail.com>
Date: Wed, 30 May 2012 00:02:37 -0400
Local: Wed, May 30 2012 12:02 am
Subject: Re: [storm-user] Question about Improving performance
Hi Nathan,

Thanks. I'll try both of them.

Are there instructions for compiling the branch? Also, does
storm-deploy work with 0.8.0?

-Harold

On Tue, May 29, 2012 at 11:50 PM, Nathan Marz <nathan.m...@gmail.com> wrote:
> First of all – if you want to understand where your performance bottleneck
> is, you should use a Java profiler rather than try to guess. I highly
> recommend YourKit, as it's really easy to use.

> Storm 0.8.0 (in development) has significant performance improvements
> (4-5x). It's possible the perf improvements from that branch will help with
> your situation. The branch is pretty stable now, so you can give it a shot:
> https://github.com/nathanmarz/storm/tree/0.8.0

> On Tue, May 29, 2012 at 6:46 PM, Harold Lim <harold.c....@gmail.com> wrote:

>> Hi Steve,

>> I don't think it's the file IO part. My file is stored in HDFS and I
>> am using the standard HDFS read API. Basically, in the open method, I
>> open a reader on the file. In nextTuple, it reads a line. I then
>> perform some post-processing, such as splitting the string, and then
>> emit the resulting fields.

>> I tested this by also commenting out the emit call and simply printing a
>> message when the file has been completely read. That takes only a few
>> seconds, but with emit not commented out, it takes longer to finish.

>> -Harold

>> On Tue, May 29, 2012 at 8:14 PM, Steven Siebert <smsi...@gmail.com> wrote:
>> > I'm wondering if it's the Spout implementation, specifically the file IO
>> > part.  Could you post your spout code?

>> > What kind of performance do you get if you read the file-based tuples
>> > into an in-memory queue in the ISpout#open method and then just poll
>> > from that queue in nextTuple?

>> > Regards,

>> > Steve

>> > On Tue, May 29, 2012 at 7:31 PM, Harold Lim <harold.c....@gmail.com> wrote:

>> >> Hi,

>> >> I am trying to figure out where the bottleneck is in my topology and
>> >> have simplified it to a single spout and bolt.
>> >> The spout simply reads from a file (~10MB). Each call to nextTuple
>> >> reads a line from the file, parses it, and emits it. The bolt
>> >> currently does nothing except ack the tuple. I also disabled the
>> >> reliability mechanism (#ackers = 0).
>> >> The issue is that it takes minutes to finish reading the whole file.
>> >> At first I thought there might be a delay between calls to
>> >> nextTuple(), so I commented out all of the emit calls in the spout to
>> >> measure how long it takes to read the whole file. It takes only a few
>> >> seconds (4-5s), so that does not seem to be the case.

>> >> Any ideas on how to improve the performance? I tried changing the
>> >> zmq.threads and zmq.linger.millis values, and it doesn't seem to help.
>> >> Changing the parallelism of the bolt doesn't seem to help either.

>> >> Thanks.

> --
> Twitter: @nathanmarz
> http://nathanmarz.com
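
Steve's suggestion in the quoted thread (load the file into an in-memory queue in ISpout#open, then poll that queue in nextTuple) could look roughly like the sketch below. It is plain Java with no Storm dependency, so it only models the idea: the class name QueueBackedSource is hypothetical, open() and nextTuple() merely mirror the ISpout method names, and the comma delimiter is an assumption, since the thread does not show the actual parsing code.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

/**
 * Hypothetical stand-in for a spout: all file IO and parsing happens once,
 * up front, so the per-tuple call is only a queue poll.
 */
class QueueBackedSource {
    private final Queue<List<String>> pending = new ConcurrentLinkedQueue<>();

    /** Reads every line from the input and stores the parsed fields in memory. */
    void open(Reader input) {
        try (BufferedReader reader = new BufferedReader(input)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // Assumed delimiter; the real post-processing is not shown in the thread.
                pending.add(Arrays.asList(line.split(",")));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the next parsed line, or null once the queue is drained. */
    List<String> nextTuple() {
        return pending.poll();
    }
}
```

In a real spout, nextTuple would emit the polled fields through the collector instead of returning them. If the topology is still slow with this arrangement, the file read is unlikely to be the bottleneck.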

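Harold's experiment in the quoted thread (run the same read loop with and without the emit call) can be made a little more systematic by timing the loop against a pluggable sink. This is a coarse check under stated assumptions, not a substitute for the profiler Nathan recommends: EmitTimer and timeNanos are hypothetical names, the comma split stands in for the unspecified post-processing, and a single wall-clock measurement is noisy.

```java
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper: times one pass over the lines with a given sink. */
class EmitTimer {
    /** Parses each line, hands the fields to the sink, and returns elapsed nanoseconds. */
    static long timeNanos(List<String> lines, Consumer<String[]> sink) {
        long start = System.nanoTime();
        for (String line : lines) {
            sink.accept(line.split(","));  // assumed delimiter
        }
        return System.nanoTime() - start;
    }
}
```

Calling timeNanos(lines, fields -> {}) corresponds to the run with emit commented out, and passing a sink that performs the real emit corresponds to the normal run. The difference between the two gives a rough estimate of the cost of emitting.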

 