GetAttachment takes a very long time when executed over the network


Johannes Gustafsson

Aug 23, 2012, 10:47:38 AM
to rav...@googlegroups.com
Hi,

We store fairly large blobs in RavenDB, around 10MB. Maybe not the optimal solution, but it has worked well. Until now.

When GetAttachment is called locally, on the same server where RavenDB is installed, it takes no time at all to fetch a 10MB blob. If, however, the call is made from another machine, it takes much, much longer. In our production environment it takes up to 8 minutes!

Just to make sure nothing is broken in our production environment, I also ran some tests on our local LAN. The difference there is smaller, although still significant.

Storing and retrieving blob on my local machine:

Written to database: 00:00:01.9929243
Read from database: 00:00:00.7924333
deleted from database: 00:00:00.0290345

Storing and retrieving blob on another machine across the room:

Written to database: 00:00:04.6639674
Read from database: 00:00:32.7073306
deleted from database: 00:00:00.0059801

This is on a 1 Gbps network. 32 seconds seems very long to me. Is there something that I missed? I'm fairly certain this was much faster in builds before 960...

Oren Eini (Ayende Rahien)

Aug 23, 2012, 5:09:06 PM
to rav...@googlegroups.com
Hm,
Can you try testing this with a recent build?

For 960, the code in question is:


And it looks fine.
Can you try this through Fiddler, and see if the issue is in the network?

Johannes Gustafsson

Aug 24, 2012, 4:13:11 AM
to rav...@googlegroups.com
I've tried again in our test environment. I installed builds 616, 701, 960 and 2071. They all seem to show the same behaviour. Here is the output on the server:

[Inline image 1]

The spike in I/O is from the PUT. I/O barely reacts during the GET, which suggests that it reads very slowly from disk (which is weird).

Fiddler output. The GET actually timed out.
[Inline image 2]

raven_fiddler.JPG
raven_memory.JPG

Oren Eini (Ayende Rahien)

Aug 24, 2012, 4:51:19 PM
to rav...@googlegroups.com
Is this on the same system? Remote?

Oren Eini (Ayende Rahien)

Aug 24, 2012, 11:34:41 PM
to rav...@googlegroups.com
Hi,
I just ran the following test locally:


using (var docStore = new DocumentStore
{
}.Initialize())
{
    // 11 MB zero-filled payload
    var buffer = new byte[1024 * 1024 * 11];
    var sw = Stopwatch.StartNew();
    docStore.DatabaseCommands.PutAttachment("a", null, new MemoryStream(buffer), new RavenJObject());
    Console.WriteLine("Putting 11 MB in {0:#,#} ms", sw.ElapsedMilliseconds);

    sw.Restart();
    var attachment = docStore.DatabaseCommands.GetAttachment("a");
    Console.WriteLine("Getting 11 MB in {0:#,#} ms", sw.ElapsedMilliseconds);

    // Drain the attachment stream to time the actual read
    sw.Restart();
    attachment.Data().ReadData();
    Console.WriteLine("Reading 11 MB in {0:#,#} ms", sw.ElapsedMilliseconds);
}


With the following results:

Putting 11 MB in 484 ms
Getting 11 MB in 59 ms
Reading 11 MB in 9 ms

Johannes Gustafsson

Aug 26, 2012, 5:02:01 AM
to rav...@googlegroups.com
Yes, when run locally it runs fine. It is only when the server is on a remote machine that the Get is slow.

Johannes Gustafsson

Aug 27, 2012, 8:38:49 AM
to rav...@googlegroups.com
I ran some more tests. In the screenshot below I restarted the server and then did a new GET on an existing attachment:

[Inline image 1]

Notice the I/O levels. They never go over 170 KB/s. Instead, the server slowly streams the data back to the client. The client in this case is running on a separate network in another Windows domain, and there is a firewall between the server and client.

If I run the same code on the same machine as the server, the GET takes only a couple of seconds. Also, if I run the client on a remote machine but on the same network as the server, it also takes only a couple of seconds.

So I'm not at all sure that this is Raven's fault; it might just as well be that our network is configured wrong. However, the strange thing is that I have these exact problems in 2 completely different environments with different firewalls.

Also, if there was anything wrong with the network, shouldn't the PUTs be slow as well?

raven_memory_2.JPG

Chris Marisic

Aug 27, 2012, 9:14:57 AM
to rav...@googlegroups.com
The QoS / rate limiting functions of the firewall or networking gear could be limiting your download stream because of other users' consumption.

An easy way to test this in isolation would be to set up one of the free VMs on Azure or Amazon and test the download both inside your current network and outside it (probably from your home).

Oren Eini (Ayende Rahien)

Aug 27, 2012, 9:17:41 AM
to rav...@googlegroups.com
I just tried it and the major factor seems to be network saturation.
Take a look at this sample server app:

using System;
using System.Diagnostics;
using System.Net;

class Program
{
    static void Main(string[] args)
    {
        // Listen on port 8080 on all host names
        var listener = new HttpListener
        {
            Prefixes = {"http://+:8080/"}
        };
        listener.Start();
        while (true)
        {
            var context = listener.GetContext();
            var sp = Stopwatch.StartNew();
            // Number of 4 KB chunks to send, taken from ?chunks=N
            var chunks = int.Parse(context.Request.QueryString["chunks"]);
            var _4kb = new byte[4 * 1024];
            for (int i = 0; i < chunks; i++)
            {
                context.Response.OutputStream.Write(_4kb, 0, _4kb.Length);
            }
            context.Response.Close();
            var totalSize = (double) (_4kb.Length * chunks);
            Console.WriteLine("{0:#,#.##;;0} mb in {1:#,#}", Math.Round((totalSize / 1024) / 1024, 2), sp.ElapsedMilliseconds);
        }
    }
}


This is serving everything from memory, and it should be able to saturate the network because it is basically writing to it as fast as it can.
Run this over your remote system and measure how long it takes.
For example:



It will give you some idea of where the time is going.
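
A minimal client sketch for timing such a download, assuming the sample server above is reachable on port 8080. The host name `myserver` and the `DownloadTimer` class are placeholders for illustration, not part of the original post:

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Net.Http;

static class DownloadTimer
{
    // Builds the request URL for the sample server; "chunks" is the query
    // parameter that server reads. The host name is supplied by the caller.
    public static string BuildUrl(string host, int chunks)
    {
        return string.Format("http://{0}:8080/?chunks={1}", host, chunks);
    }

    // Downloads the whole response body, reports the elapsed time and
    // returns the number of bytes received.
    public static long Download(string url, out TimeSpan elapsed)
    {
        var sw = Stopwatch.StartNew();
        long total = 0;
        using (var client = new HttpClient())
        using (var stream = client.GetStreamAsync(url).Result)
        {
            var buffer = new byte[64 * 1024];
            int read;
            while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
            {
                total += read;
            }
        }
        elapsed = sw.Elapsed;
        return total;
    }
}
```

Calling `DownloadTimer.Download(DownloadTimer.BuildUrl("myserver", 500), out elapsed)` would request 500 chunks of 4 KB, roughly 2 MB, and the elapsed time can then be compared between a local and a remote client.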

Johannes Gustafsson

Aug 27, 2012, 9:35:38 AM
to rav...@googlegroups.com
Running the sample with 500 chunks (2MB) took 62 seconds across the network! So this means it's not a Raven issue :-). I just can't understand why; there is no QoS configured on the firewall. I suppose I have to dig deeper...

Many thanks,
Johannes


Oren Eini (Ayende Rahien)

Aug 27, 2012, 10:04:53 AM
to rav...@googlegroups.com
Okay

Oren Eini (Ayende Rahien)

Aug 27, 2012, 10:06:29 AM
to rav...@googlegroups.com
Would be interested in hearing what you found.

Johannes Gustafsson

Aug 28, 2012, 2:49:55 AM
to rav...@googlegroups.com
It seems that setting the correct buffer size is a big deal when using HttpListener. I changed your sample code a bit to test different buffer sizes (https://gist.github.com/3495420). This is what I found when running the server in our test environment behind a firewall with 100Mbps throughput:

Buffer size: 4 Kb. 1 Mb in 51 690 ms (20 Kbps)
Buffer size: 8 Kb. 1 Mb in 195 ms (5 377 Kbps)
Buffer size: 16 Kb. 1 Mb in 143 ms (7 333 Kbps)
Buffer size: 32 Kb. 1 Mb in 6 578 ms (159 Kbps)
Buffer size: 64 Kb. 1 Mb in 3 290 ms (319 Kbps)
Buffer size: 128 Kb. 1 Mb in 103 ms (10 180 Kbps)
Buffer size: 256 Kb. 1 Mb in 103 ms (10 180 Kbps)
Buffer size: 512 Kb. 1 Mb in 98 ms (10 700 Kbps)
Buffer size: 1024 Kb. 1 Mb in 307 ms (3 416 Kbps)

I ran this test several times and the numbers are pretty consistent. In my specific case, the worst buffer sizes are 4, 32 and 64 Kb; they slow the server down to a crawl! The best buffer size in my case would probably be 128Kb.

I also ran it on another computer in the same network (1Gbps):

Buffer size: 4 Kb. 1 Mb in 42 ms (24 966 Kbps)
Buffer size: 8 Kb. 1 Mb in 23 ms (45 590 Kbps)
Buffer size: 16 Kb. 1 Mb in 13 ms (80 660 Kbps)
Buffer size: 32 Kb. 1 Mb in 10 ms (104 858 Kbps)
Buffer size: 64 Kb. 1 Mb in 7 ms (149 797 Kbps)
Buffer size: 128 Kb. 1 Mb in 6 ms (174 763 Kbps)
Buffer size: 256 Kb. 1 Mb in 8 ms (131 072 Kbps)
Buffer size: 512 Kb. 1 Mb in 8 ms (131 072 Kbps)
Buffer size: 512 Kb. 1 Mb in 7 ms (149 797 Kbps)
Buffer size: 1024 Kb. 1 Mb in 8 ms (131 072 Kbps)

Much more consistent. However, anything below 64Kb should not be used I think.

It would be interesting to see what anyone else could find.
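
For anyone wanting to repeat the measurement, here is a minimal sketch of the two pieces involved: writing a fixed payload with a chosen buffer size, and computing the throughput figure. It is not the code from the gist; `BufferSizeProbe` and its method names are made up for illustration. The throughput column in the tables above matches bytes divided by milliseconds (1,048,576 / 195 ≈ 5,377), so the "Kbps" values read as kilobytes per second rather than kilobits.

```csharp
using System;
using System.IO;

static class BufferSizeProbe
{
    // Writes totalBytes to output using writes of at most bufferSize bytes
    // and returns how many Write calls were made.
    public static int WriteInChunks(Stream output, int totalBytes, int bufferSize)
    {
        if (bufferSize <= 0) throw new ArgumentOutOfRangeException("bufferSize");
        var buffer = new byte[bufferSize];
        int writes = 0;
        int remaining = totalBytes;
        while (remaining > 0)
        {
            int count = Math.Min(bufferSize, remaining);
            output.Write(buffer, 0, count);
            remaining -= count;
            writes++;
        }
        return writes;
    }

    // Bytes per millisecond, which is numerically kB/s (1 kB = 1000 bytes).
    public static double BytesPerMillisecond(long bytes, long elapsedMs)
    {
        return (double)bytes / elapsedMs;
    }
}
```

In a server like the sample above, `output` would be `context.Response.OutputStream`, with a Stopwatch around the call for each buffer size under test.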

Oren Eini (Ayende Rahien)

Aug 28, 2012, 3:53:06 AM
to rav...@googlegroups.com
Blah, this is probably because of kernel calls for http.sys.
I'll see what we can do about it.

Johannes Gustafsson

Aug 28, 2012, 4:34:56 AM
to rav...@googlegroups.com
I ran the same tests in our production environment, where the firewall is a big Cisco thing. The numbers fluctuate quite a bit, probably because there is quite a lot of activity on the network. These are the best scores for each chunk size:

Buffer size: 4 Kb. 1 Mb in 53 518 ms (20 Kbps)
Buffer size: 8 Kb. 1 Mb in 147 ms (7 133 Kbps)
Buffer size: 16 Kb. 1 Mb in 83 ms (12 633 Kbps)
Buffer size: 32 Kb. 1 Mb in 6 839 ms (153 Kbps)
Buffer size: 64 Kb. 1 Mb in 1 313 ms (799 Kbps)
Buffer size: 128 Kb. 1 Mb in 60 ms (17 476 Kbps)
Buffer size: 256 Kb. 1 Mb in 30 ms (34 953 Kbps)
Buffer size: 512 Kb. 1 Mb in 22 ms (47 663 Kbps)
Buffer size: 1024 Kb. 1 Mb in 232 ms (4 520 Kbps)



Johannes Gustafsson

Aug 28, 2012, 5:03:52 AM
to rav...@googlegroups.com
I suppose this also explains why the download of the Silverlight app takes such a long time for me...


Johannes Gustafsson

Aug 28, 2012, 5:21:49 AM
to rav...@googlegroups.com
I created my own build of 960 where I made the buffer size used when getting statics a configurable setting.

I also hardcoded HttpListenerResponseAdapter.WriteFile() to use a buffer of 8196. This made a huge difference to the download speed of the Silverlight app.

I suppose you have a much better solution in the works but if you want a pull request just let me know :-)
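
A rough sketch of the idea, for illustration only: read the buffer size from a configured value with a fallback, then copy the source stream to the response with that buffer. This is not the actual change to RavenDB; `BufferedCopy`, `ParseBufferSize` and `CopyWithBuffer` are hypothetical names, and how the configured value is looked up is left out.

```csharp
using System;
using System.IO;

static class BufferedCopy
{
    // Parses a configured buffer size; falls back to defaultSize when the
    // value is missing, not a number, or not positive.
    public static int ParseBufferSize(string configuredValue, int defaultSize)
    {
        int parsed;
        if (int.TryParse(configuredValue, out parsed) && parsed > 0)
        {
            return parsed;
        }
        return defaultSize;
    }

    // Copies source to destination using a buffer of bufferSize bytes and
    // returns the number of bytes copied.
    public static long CopyWithBuffer(Stream source, Stream destination, int bufferSize)
    {
        var buffer = new byte[bufferSize];
        long total = 0;
        int read;
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            destination.Write(buffer, 0, read);
            total += read;
        }
        return total;
    }
}
```

Keeping the size a parameter makes it easy to try the values from the tables earlier in the thread against a given network before settling on one.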

Johannes Gustafsson

Aug 28, 2012, 5:36:38 AM
to rav...@googlegroups.com
I'm seriously considering putting my own build into production. Is there anything special I should think of? I used the quick.ps1 file to build and changed the configuration to "Release". The build number becomes 13.

Oren Eini (Ayende Rahien)

Aug 28, 2012, 12:33:31 PM
to rav...@googlegroups.com
For the build number, you need to set the env variable "buildlabel"

In PowerShell:

$env:buildlabel=999;

We have a fix for that, which uses a staggered approach to the buffer issue and shows about a 4 to 6 times perf improvement.
It'll be in the next build.