EM sending and receiving large files
 
From: "Dan Mayer" <d...@devver.net>
Date: Sun, 28 Sep 2008 20:18:11 -0600
Local: Sun, Sep 28 2008 10:18 pm
Subject: [Eventmachine-talk] EM sending and receiving large files

We have been trying to send large files with EventMachine and have run into a
few issues. If we simply call send_data with the entire contents of a file,
the transfer is slow and the server uses about 98% of the CPU. The send_file
call only supports files up to 32K, but we are sending files as large as 5 MB.
Lastly, we have been unable to use stream_file_data because it depends on
evma_fastfilereader, which I can no longer find anywhere to install.

Some of these issues have been discussed in this thread:
http://groups.google.com/group/eventmachine/browse_thread/thread/3cc6...

Has anyone been sending large files with EventMachine who could share some
tips? In our case we are using EM for both the client and the server. We are
trying to sync a directory of many files. Is this just not a recommended use
of EM? Besides solutions that make this work better on EM, are there other
recommended ways to send and receive large amounts of file data with Ruby?

Thanks,
Dan

--
Dan Mayer
Co-founder, Devver
(http://devver.net)
follow us on twitter: http://twitter.com/devver
My Blog (http://mayerdan.com)

_______________________________________________
Eventmachine-talk mailing list
Eventmachine-t...@rubyforge.org
http://rubyforge.org/mailman/listinfo/eventmachine-talk


From: "Kirk Haines" <wyhai...@gmail.com>
Date: Sun, 28 Sep 2008 20:46:27 -0600
Local: Sun, Sep 28 2008 10:46 pm
Subject: Re: [Eventmachine-talk] EM sending and receiving large files

On Sun, Sep 28, 2008 at 8:18 PM, Dan Mayer <d...@devver.net> wrote:
> We have been trying to send large files with EventMachine and noticed a few
> issues. If we just use send data with the contents of a file inside it is
> slow, and the server eats about 98% of the CPU. The send_file call only
> supports files up to 32K, which we are sending files as large as 5mb. Lastly
> we have been unable to use stream_file_data, because it has a dependency on
> evma_fastfilereader, which I couldn't seem to find anywhere to install
> anymore.

Hmmm.  I think that was a confused oversight on Francis's/my part.
evma_fastfilereader should be part of EM.  Until it is, you can get it by
installing Swiftiply.

> Has anyone been sending large file with eventmachine that could share some
> tips. In our case we are using EM for both the client and the server. We are
> trying to sync over a directory of many files, is this just not a
> recommended usage of EM? Besides looking for solutions to make this work
> better on EM, are there other recommendations of better ways to send and
> receive large amounts of file data with Ruby?

Using stream_file_data, I regularly transfer very large files with
Swiftiply.

Kirk Haines

From: James Tucker <jftuc...@gmail.com>
Date: Mon, 29 Sep 2008 12:56:15 +0100
Local: Mon, Sep 29 2008 7:56 am
Subject: Re: [Eventmachine-talk] EM sending and receiving large files

On 29 Sep 2008, at 03:46, Kirk Haines wrote:

> On Sun, Sep 28, 2008 at 8:18 PM, Dan Mayer <d...@devver.net> wrote:
> We have been trying to send large files with EventMachine and  
> noticed a few issues. If we just use send data with the contents of  
> a file inside it is slow, and the server eats about 98% of the CPU.  
> The send_file call only supports files up to 32K, which we are  
> sending files as large as 5mb. Lastly we have been unable to use  
> stream_file_data, because it has a dependency on  
> evma_fastfilereader, which I couldn't seem to find anywhere to  
> install anymore.

> Hmmm.  I think that was confused oversight on Francis/my part.  
> evma_fastfilereader should be part of EM.  Until it is, you can get  
> it by installing Swiftiply.

I've been meaning to grab it and commit it to EM, as it's also the last
failing test in the suite run from trunk after the last month's work.
Assuming no other issues are raised, I will get this committed to the EM
code base.

From: "Dan Mayer" <d...@devver.net>
Date: Mon, 29 Sep 2008 18:45:11 -0600
Local: Mon, Sep 29 2008 8:45 pm
Subject: Re: [Eventmachine-talk] EM sending and receiving large files

Thanks for the tip on installing Swiftiply; that made stream_file_data work
perfectly.

Unfortunately, it didn't solve our problem. Large files were still taking a
long time to transfer, so I looked deeper into the issue. I had always
assumed the delay was slow transfer time. Running a profiler against our
code was enlightening, as always: our message buffer accounts for a
significant share of the time. If I completely remove the message buffer
that the server uses to split up multiple messages, either send_data or
stream_file_data (with larger files) drops to less than 1 second. After
searching around a bit I found BufferedTokenizer, which is one of the
protocols for EM. Switching from our apparently bad buffer to the one
included with EM brought us from 10 seconds to 1.2 seconds.

Thanks for the help. It looks like everything is back on track for our EM
performance.
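To illustrate why a tokenizer can be much cheaper than re-splitting the whole accumulated buffer on every receive_data call, here is a minimal sketch. It is not EventMachine's BufferedTokenizer source; the class name SketchTokenizer is hypothetical, and the sketch assumes a single-character delimiter so that a delimiter never straddles two chunks. The idea is to split only the newly arrived chunk and keep the unfinished tail as a list of pieces.

```ruby
# Hypothetical sketch of a chunk-splitting tokenizer (not EM's own code).
# Assumption: the delimiter is one character, so it cannot be cut in half
# by a chunk boundary.
class SketchTokenizer
  def initialize(delimiter = "\n")
    @delimiter = delimiter
    @pending = [] # pieces of the token that is still incomplete
  end

  # Feed one received chunk; returns every token that chunk completed.
  def extract(data)
    return [] if data.empty?
    pieces = data.split(@delimiter, -1) # -1 keeps a trailing empty piece
    @pending << pieces.shift            # first piece continues the pending token
    return [] if pieces.empty?          # no delimiter in this chunk yet
    pieces.unshift(@pending.join)       # the pending token is now complete
    @pending.clear
    @pending << pieces.pop              # last piece starts the next token
    pieces
  end

  # Return whatever incomplete data is left and reset the tokenizer.
  def flush
    rest = @pending.join
    @pending.clear
    rest
  end
end

tok    = SketchTokenizer.new("|")
first  = tok.extract("alpha|be") # one complete token, "be" stays pending
second = tok.extract("ta|gam")   # completes "beta", "gam" stays pending
rest   = tok.flush
```

Only the incoming chunk is scanned for the delimiter, so the cost of each call depends on the chunk size rather than on how much data has already piled up.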

thanks,
Dan Mayer

From: "Aman Gupta" <themastermi...@gmail.com>
Date: Mon, 29 Sep 2008 18:49:29 -0700
Local: Mon, Sep 29 2008 9:49 pm
Subject: Re: [Eventmachine-talk] EM sending and receiving large files
Do you know what specifically about your buffer was causing issues?
Were you using String#<<?

  Aman

From: "Dan Mayer" <d...@devver.net>
Date: Mon, 29 Sep 2008 20:07:30 -0600
Local: Mon, Sep 29 2008 10:07 pm
Subject: Re: [Eventmachine-talk] EM sending and receiving large files

Aman (and hopefully others on the list who are interested),

Here is a profiler dump after I optimized a bit. I got our time from about 26
seconds down to 10 by getting rid of things like String#<<:
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 14.44     3.49      0.66      668     0.99     0.99  String#split
 13.13     4.09      0.60      665     0.90     0.90  String#index
  4.16     4.28      0.19      668     0.28     3.29  DataBuffer#grab
  3.06     4.42      0.14      661     0.21     6.87  EmServerExample#receive_data
  0.88     4.46      0.04     2007     0.02     0.02  Array#length
  0.66     4.49      0.03     2007     0.01     0.01  Fixnum#>
  0.66     4.52      0.03      662     0.05     3.31  DataBuffer#append
What is the fastest way to append to strings?
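The thread does not settle this question, so one way to approach it is to measure on your own interpreter. The sketch below uses Ruby's standard Benchmark module to compare in-place appending (String#<<, which mutates the receiver) with concatenation (String#+, which builds a new String each time). The method names, chunk size, and round count are arbitrary choices for illustration, and the timings will differ between Ruby versions.

```ruby
require "benchmark"

CHUNK  = "x" * 1024 # arbitrary 1 KB chunk for illustration
ROUNDS = 1_000      # arbitrary number of appends

# Appends in place: the same String object grows on every round.
def append_in_place(chunk, rounds)
  buf = String.new
  rounds.times { buf << chunk }
  buf
end

# Concatenates: every round allocates a new String and copies the old one.
def append_by_copy(chunk, rounds)
  buf = String.new
  rounds.times { buf = buf + chunk }
  buf
end

Benchmark.bm(12) do |bm|
  bm.report("String#<<") { append_in_place(CHUNK, ROUNDS) }
  bm.report("String#+")  { append_by_copy(CHUNK, ROUNDS) }
end
```

Both methods produce the same final string, so any difference in the report comes from allocation and copying, not from the result.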

This is really messy, since I was trying a bunch of optimizations and other
things before finding and switching to the EM buffer.

class DataBuffer
  FRONT_DELIMITER = "0x5b".hex.chr # '['
  #']'[0].to_s(16).hex.chr
  BACK_DELIMITER = "0x5d".hex.chr # ']'
#crazy delimiter because normal ones kept showing up in binary files
  DELIMITER =
    "|#{FRONT_DELIMITER}#{FRONT_DELIMITER}#{FRONT_DELIMITER}GT_DELIM#{BACK_DELIMITER}#{BACK_DELIMITER}#{BACK_DELIMITER}#{BACK_DELIMITER}|"
#added to replace, dynamically making these
  DELIM_ESCAPE = /#{Regexp.escape(DELIMITER)}/
  DELIM_ESCAPE_END = /#{Regexp.escape(DELIMITER)}\Z/

    def initialize
      @unprocessed = ""
      @commands = []
    end

    def grab
      new_messages = @unprocessed.split(DELIM_ESCAPE)
      while new_messages.length > 1
        @commands << new_messages.shift
      end
      msg_length = new_messages.length
      if msg_length > 0
        if msg_length == 1 && (@unprocessed=~DELIM_ESCAPE_END)
          # @commands << new_messages.shift
          @commands.push(new_messages.shift)
          @unprocessed = ""
        else
          #put the rest of the last statement back into the buffer
          while(c...@unprocessed.index(DELIM_ESCAPE))
            @unprocessed = (@unprocessed[cu...@unprocessed.length
]).sub(DELIMITER,"")
          end
        end
      end
      if @commands.length > 0
        return @commands.shift
      else
        return nil #if @commands.length==0
      end
    end

    def prepare(str)
      str.to_s+DELIMITER
    end

    def append(data)
      # @unprocessed << data
      @unprocessed = @unprocessed + data
    end

  end

... client / server code usage...
send_data(@buffer.prepare("some_msg"))

 def receive_data(data)
      @buffer.append(data)
      while(command = @buffer.grab)
         process(command)
      end
  end

    def process(data)
      puts "got data: #{data}"
    end
...

I am probably going to look more closely at the EM buffer and our code, and I
am sure I will realize we did something pretty dumb.
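Since the code above needs an unusual delimiter because ordinary ones kept showing up in binary files, one alternative worth considering is length-prefixed framing, which avoids searching the payload for a delimiter at all. This is a sketch of that different technique, not anything proposed in the thread; the class name LengthPrefixedBuffer and its methods are hypothetical, and the 4-byte big-endian header is an arbitrary choice.

```ruby
# Hypothetical sketch of length-prefixed framing (not from the thread).
class LengthPrefixedBuffer
  HEADER_BYTES = 4 # assumed header size: 32-bit big-endian length

  def initialize
    @buffer = String.new(encoding: Encoding::BINARY)
  end

  # Frame a payload as a 4-byte length followed by the raw bytes.
  def self.frame(payload)
    [payload.bytesize].pack("N") + payload.b
  end

  # Append received bytes and return every complete payload so far.
  def extract(data)
    @buffer << data.b
    messages = []
    while @buffer.bytesize >= HEADER_BYTES
      length = @buffer.unpack1("N")
      break if @buffer.bytesize < HEADER_BYTES + length # body not complete yet
      messages << @buffer.byteslice(HEADER_BYTES, length)
      @buffer = @buffer.byteslice(HEADER_BYTES + length, @buffer.bytesize)
    end
    messages
  end
end

wire = LengthPrefixedBuffer.frame("hello") +
       LengthPrefixedBuffer.frame("a|b]]c".b)
buf   = LengthPrefixedBuffer.new
early = buf.extract(wire[0, 3])   # header not complete yet
late  = buf.extract(wire[3..-1])  # both messages arrive
```

Because the receiver knows how many bytes to expect, the payload can contain any byte sequence, including anything that looks like a delimiter.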

Thanks,
Dan

From: James Tucker <jftuc...@gmail.com>
Date: Tue, 30 Sep 2008 12:25:17 +0100
Local: Tues, Sep 30 2008 7:25 am
Subject: Re: [Eventmachine-talk] EM sending and receiving large files

Dan,

If you have some time, would you be able to run your data sets against
this other BufferedTokenizer implementation:

http://pastie.textmate.org/private/ykjtuipjedrwgzwgggu5w

Performance varies depending on the specific data sets and the chunk size
being added to the buffer. Ruby's GC certainly starts to cause performance
issues with too many objects, so I'm trying to strike a balance.

Any input would be welcome,

Kind regards,

J.

From: "Tony Arcieri" <t...@medioh.com>
Date: Tue, 30 Sep 2008 10:46:29 -0600
Local: Tues, Sep 30 2008 12:46 pm
Subject: Re: [Eventmachine-talk] EM sending and receiving large files

On Tue, Sep 30, 2008 at 5:25 AM, James Tucker <jftuc...@gmail.com> wrote:
> Dan,
> If you have some time, would you be able to use your data sets against this
> other BufferedTokenizer implementation:

> http://pastie.textmate.org/private/ykjtuipjedrwgzwgggu5w

A string-based one should generally be faster on Ruby 1.8.

--
Tony Arcieri
medioh.com

From: James Tucker <jftuc...@gmail.com>
Date: Tue, 30 Sep 2008 18:55:26 +0100
Local: Tues, Sep 30 2008 1:55 pm
Subject: Re: [Eventmachine-talk] EM sending and receiving large files

On 30 Sep 2008, at 17:46, Tony Arcieri wrote:

> On Tue, Sep 30, 2008 at 5:25 AM, James Tucker <jftuc...@gmail.com>  
> wrote:
> Dan,

> If you have some time, would you be able to use your data sets  
> against this other BufferedTokenizer implementation:

> http://pastie.textmate.org/private/ykjtuipjedrwgzwgggu5w

> A string-based one should generally be faster on Ruby 1.8

In a few tests I did here, the differences were mostly related to the size
of the incoming chunk and the number of chunks per token.

Speed differences between 1.8 and 1.9 vary; each has its own advantages at
certain tasks, but the two implementations were overall quite comparable on
both interpreters.

What I'm hoping to get an idea of is where and why the differences
really come up.

From: "Dan Mayer" <d...@devver.net>
Date: Tue, 30 Sep 2008 22:55:09 -0600
Local: Wed, Oct 1 2008 12:55 am
Subject: Re: [Eventmachine-talk] EM sending and receiving large files

Sure, no problem. Sorry it took me so long to get back to this; I got slammed
with some items that I had to take care of today.

I ran it on a small test set of data, and the results were very similar.
The current tokenizer in EM seemed to outperform your pastie by very small
amounts. Tomorrow I can run it against a much larger, real project, and I
will let you know if I notice any significant differences.

I am cleaning up some of the code I have been using, and will likely write a
post about various methods of sending files through EM in the next couple of
days. I noticed it wasn't easy to find examples of the various options on
the web, so it might help a few people running into similar problems.

peace,
Dan Mayer

From: "Dan Mayer" <d...@devver.net>
Date: Wed, 8 Oct 2008 09:42:16 -0600
Local: Wed, Oct 8 2008 11:42 am
Subject: Re: [Eventmachine-talk] EM sending and receiving large files

One final follow-up.

I posted some quick benchmarks comparing sending files with our buffer, EM's
buffer, the buffer James Tucker suggested, and stream_file_data. I also
included some benchmarks with compression, along with the code I used for
testing. Since I hadn't easily found a good way to send files, I thought it
might help some people in the future. It was nice to be able to just
switch buffers and get a 10X improvement in speed.

http://devver.net/blog/2008/10/sending-files-with-eventmachine/
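The message above does not say which compression was benchmarked. As an illustration only, here is a sketch that assumes Ruby's standard zlib library: the payload is deflated before sending and inflated after receiving. The helper names pack_for_transfer and unpack_after_transfer are hypothetical, and the sample data is deliberately repetitive so that it compresses well.

```ruby
require "zlib"

# Hypothetical helper: compress bytes before handing them to the transport.
# BEST_SPEED trades compression ratio for lower CPU cost.
def pack_for_transfer(bytes)
  Zlib::Deflate.deflate(bytes, Zlib::BEST_SPEED)
end

# Hypothetical helper: restore the original bytes on the receiving side.
def unpack_after_transfer(bytes)
  Zlib::Inflate.inflate(bytes)
end

original = "line of fairly repetitive text\n" * 10_000
packed   = pack_for_transfer(original)
restored = unpack_after_transfer(packed)

puts "original: #{original.bytesize} bytes, compressed: #{packed.bytesize} bytes"
```

Whether this helps depends on the data: already-compressed files (archives, images) gain little, and compression spends CPU that the earlier messages identified as a bottleneck.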

If anyone has any thoughts, tips, or alternative buffers, let me know.

thanks,
Dan

From: "Aman Gupta" <themastermi...@gmail.com>
Date: Wed, 8 Oct 2008 21:17:49 -0700
Local: Thurs, Oct 9 2008 12:17 am
Subject: Re: [Eventmachine-talk] EM sending and receiving large files

> If anyone has any thoughts, tips, or alternative buffers let me know.

You might also try Tony's C buffer:

  http://github.com/igrigorik/em-http-request/tree/master/ext/buffer/em...
  http://github.com/tarcieri/rev/tree/master/ext/rev/rev_buffer.c

  Aman

From: "Tony Arcieri" <t...@medioh.com>
Date: Wed, 8 Oct 2008 23:47:59 -0600
Local: Thurs, Oct 9 2008 1:47 am
Subject: Re: [Eventmachine-talk] EM sending and receiving large files

Although that buffer may be the source of the problems you were experiencing
with Rev... that'd be good to know.
