Raven on Azure working group


georgiosd

Apr 19, 2013, 3:37:42 AM
to rav...@googlegroups.com
All,

I think that enough people are asking about RavenDB on Azure and some people have provided helpful insights but there doesn't seem to be an accepted solution **based on benchmarks**.

How about a few of us get together, collectively test out a few configurations, and report back for the sake of the whole community?

Georgios

Mauro Servienti

Apr 19, 2013, 3:51:51 AM
to rav...@googlegroups.com

Here I am. It would be great to have a performance test suite, with the same data set, to run on different Azure setups.

 

.m


From: rav...@googlegroups.com [rav...@googlegroups.com] on behalf of georgiosd [geor...@gmail.com]
Sent: Friday, April 19, 2013 9:37
To: rav...@googlegroups.com
Subject: [RavenDB] Raven on Azure working group

--
You received this message because you are subscribed to the Google Groups "ravendb" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ravendb+u...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Jahmai Lay

Apr 19, 2013, 7:19:27 AM
to rav...@googlegroups.com

I'll gladly run a suite of tests under our Azure configuration.

I don't exactly have time to design them though...

Kijana Woodard

Apr 19, 2013, 8:47:46 AM
to rav...@googlegroups.com

+1 on being willing to run some configurations. Been thinking about this myself, but haven't had time to design something.

Ideally it'd be an exe that filled the db with data and ran a suite of operations. Then we can modify one variable of the setup and run again.

Someone here called it a raven score.

Could be used to judge new versions as well as cloud and on premise hardware.

Georgios Diamantopoulos

Apr 19, 2013, 10:57:27 AM
to rav...@googlegroups.com
What about the test data that Rhinos use? That's supposed to be somewhere @ github right?
Is that a suitable data set?


You received this message because you are subscribed to a topic in the Google Groups "ravendb" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/ravendb/XNqobAx3yBs/unsubscribe?hl=en.
To unsubscribe from this group and all its topics, send an email to ravendb+u...@googlegroups.com.

viscious

Apr 19, 2013, 12:54:54 PM
to rav...@googlegroups.com
I would be happy to help out with this.

georgiosd

Apr 20, 2013, 8:58:34 AM
to rav...@googlegroups.com
I've created a Google Doc to gather some requirements before we start chipping away at code: https://docs.google.com/document/d/1IBH-iKEIfvrI8QJ65YVT6fN061U60j8RM_vvqceq8MM/edit?usp=sharing

Anyone can comment - I'll send out editing invitations to anyone who wishes to contribute

Justin A

Apr 22, 2013, 10:36:27 PM
to rav...@googlegroups.com
great idea! we really would love azure hosting .. (I've already bumped a thread in the RavenHQ forums saying +1-would-pay-if-offered :P  .. and yes i know ravenhq is a separate company to HR.)

Georgios Diamantopoulos

Apr 23, 2013, 4:39:12 AM
to rav...@googlegroups.com
The document I posted above is open for comments and I've laid out some questions for discussion - please feel free to contribute as you find appropriate



georgiosd

May 13, 2013, 4:04:04 PM
to rav...@googlegroups.com
Issues for discussion:

Generating data set

  • What target applications should we consider?
  • What kind of data do these target apps have?
  • Can we generate data that approximates real data? Is there a real source we can import from?

Generating load

  • What types of operations do these target apps perform?

Measuring score

  • How do we translate the performance to an objective score?

Test deployment

  • How do we run the load against the DB so that it
    • simulates a real environment
    • minimizes/eliminates interference from e.g. network delays

Kijana Woodard

May 13, 2013, 4:14:50 PM
to rav...@googlegroups.com
Georgios,

Thanks for keeping this alive. I am interested. I am also buried by work atm. I may have some time in a couple weeks to contribute more.



Georgios Diamantopoulos

May 13, 2013, 4:16:37 PM
to rav...@googlegroups.com
Same here so no worries - let's kick some ideas around and the implementation shouldn't be that challenging



Kijana Woodard

May 13, 2013, 4:21:26 PM
to rav...@googlegroups.com
The main theme I've picked up from the threads on this is that the definition of what constitutes the score will evolve over time.

To that end, I think the result of running this program should be something like "RavenScore 87 (v 1.21)". That way, looking at it, you can get _some clue_ about what was involved in making up the score.

Georgios Diamantopoulos

May 13, 2013, 4:23:50 PM
to rav...@googlegroups.com
Well, what about just having sub-scores only that get averaged or median'd for the "total" score?
Not to say they won't need revisions over time but hopefully only major revisions and we won't have 80 different versions and end up comparing apples to oranges :)

When I say sub-scores, I'm thinking of something like the Windows Experience Index: score each "ability" separately.
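
To make the idea concrete, here is a minimal Python sketch of a versioned score built from sub-scores. The area names, the numbers, and the choice of the median are all made up for illustration; only the "RavenScore 87 (v 1.21)" output format comes from the thread.

```python
from statistics import median

# Hypothetical version label, taken from the example format in the thread.
SCORE_VERSION = "1.21"


def total_score(sub_scores):
    """Combine per-area sub-scores into a single number using the median."""
    return median(sub_scores.values())


def format_score(sub_scores, version=SCORE_VERSION):
    """Render the total with the version of the scoring rules that produced it."""
    return f"RavenScore {total_score(sub_scores):.0f} (v {version})"


if __name__ == "__main__":
    # Invented sub-scores, not measurements.
    example = {"indexing": 80, "load": 90, "spatial": 87}
    print(format_score(example))
```

Using the median instead of the mean keeps one outlier area from dominating the total, but either would work; the important part is that the version travels with the number.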



Kijana Woodard

May 13, 2013, 4:26:12 PM
to rav...@googlegroups.com
Yeah. Like Indexing, Load, Spatial, Replication?, etc.

Chris Marisic

May 13, 2013, 4:26:44 PM
to rav...@googlegroups.com
If you do subscores I recommend scoring on these areas:

map
map/reduce
facets
geo-spatial
attachments
insert / modify
bulkinsert / bulk update
patch?

Georgios Diamantopoulos

May 13, 2013, 4:42:18 PM
to rav...@googlegroups.com
We should also avoid doing too much, to keep the project maintainable.

For example, we should first establish whether any of these areas display non-linear performance differences between different setups.

Meaning, we should mostly test for things that give something other than 2x performance on a 2x faster machine.
Which of course we'll only discover by trial and error.

Plus, a lot of this will depend on the data we use so I guess it'd be best to start from defining the business scope for this - and the data that goes with it
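
A rough Python sketch of that check, assuming we already have a throughput figure for one area on a baseline machine and on a machine that is nominally some factor faster. The function names and the 15% tolerance are hypothetical choices, not something agreed in the thread.

```python
def scaling_ratio(baseline_ops_per_sec, candidate_ops_per_sec):
    """How many times faster the candidate setup ran the same workload."""
    return candidate_ops_per_sec / baseline_ops_per_sec


def is_nonlinear(baseline_ops_per_sec, candidate_ops_per_sec,
                 hardware_factor=2.0, tolerance=0.15):
    """True when the speed-up deviates from the hardware factor by more than
    the given relative tolerance, i.e. the area is worth benchmarking."""
    ratio = scaling_ratio(baseline_ops_per_sec, candidate_ops_per_sec)
    return abs(ratio - hardware_factor) / hardware_factor > tolerance


if __name__ == "__main__":
    # Invented figures: a 2x machine that only delivers 1.2x on this area.
    print(is_nonlinear(1000, 1200))
    # Invented figures: a 2x machine that delivers almost exactly 2x.
    print(is_nonlinear(1000, 1950))
```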



Kijana Woodard

May 13, 2013, 5:12:18 PM
to rav...@googlegroups.com
Here's my personal user story.

As a RavenDB developer, I'd like to evaluate the relative capability of a raven db installation on an Azure VM vs AWS EC2 vs Some Physical hardware so that I can make deployment / setup decisions.

In other words, I'd like to try a vanilla Azure VM and then mess around with striping disks and partitioning logs/data/indexes and see whether my work made any difference. Then try the same thing with EC2 and compare. Then try on "other hardware" and get a comparison, so that the decision can be made from data, not gut feelings and prejudice (i.e. Azure sucks, or physical hardware is always the right choice, or I once had a bad experience with abcxyz, so....).

Georgios Diamantopoulos

May 13, 2013, 5:25:11 PM
to rav...@googlegroups.com
OK, that's a great starting point.

What are the characteristics of your (or anyone who'd see value in this) data? For example, I've never personally used attachments.



Kijana Woodard

May 13, 2013, 6:20:59 PM
to rav...@googlegroups.com
I guess it depends on the project, so I feel I'd have to understand what data was used to make the score to get an idea of how it might fit my particular project. Giant scope creep: pluggable data.

Let's not do that. :-)

For the most part, I lean on Load quite a bit, but when I use Query, I'd like it to be fast. :-)
Really, "my data" isn't so important for me. I really want to evaluate the hardware/environment.

Georgios Diamantopoulos

May 13, 2013, 6:38:21 PM
to rav...@googlegroups.com
Right, but in theory, different data will perform differently on the same machine (cache hits etc.), so I guess we may need to assess how much variance there is.

Perhaps an Amazon-like app is a good starting point.


Kijana Woodard

May 13, 2013, 6:48:28 PM
to rav...@googlegroups.com
I guess that's a good point for a decision.

Is the purpose of this score to give an indication of how "your app" will perform on "this hardware", or of how "this hardware" performs against a standard? I prefer the latter because the former is sooooooo expansive. I would submit that knowing how "your app performs" should be part of the app development process. I'm just looking for a nice way to know "if I stripe 4 data disks on an Azure VM, how much difference does that make over striping two?" I don't really care about "my app" at that moment. I then have to retest my findings continuously as my app progresses to validate my assumptions and calculations.

Georgios Diamantopoulos

May 14, 2013, 5:11:16 AM
to rav...@googlegroups.com
Hm... in that case, we should classify operations based on the type of data they operate on
For example, benchmark the load times for documents of 2KB, 10KB, 100KB, 500KB, 2MB

Of course, in all of this we'll have network costs, and running the benchmark on the same machine as Raven will compromise the accuracy a little. Is there any way to get timings (load, serialization, wire transfer) from Raven itself? Or would that be an easy patch?
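
As a starting point, a size-bucketed timing loop could look like the Python sketch below. It only builds payloads of the sizes listed above and times an arbitrary callable; `json.dumps` is used as a stand-in operation so the sketch runs without a server, and it does not talk to RavenDB. The document shape and helper names are hypothetical.

```python
import json
import time

# Document sizes from the message above: 2KB, 10KB, 100KB, 500KB, 2MB.
SIZES_KB = [2, 10, 100, 500, 2048]


def make_document(size_kb):
    """Build a JSON-serializable dict whose payload is size_kb kilobytes long."""
    return {"Id": f"docs/{size_kb}kb", "Payload": "x" * (size_kb * 1024)}


def time_operation(operation, document, repeats=5):
    """Return the best wall-clock time in seconds over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        operation(document)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    for size_kb in SIZES_KB:
        doc = make_document(size_kb)
        # Stand-in operation: serialization only. A real run would store or
        # load the document through a RavenDB client instead.
        elapsed = time_operation(json.dumps, doc)
        print(f"{size_kb:>5} KB: {elapsed * 1000:.3f} ms")
```

Taking the best of several runs is one way to reduce noise from a shared VM; reporting the median or the full distribution would be equally defensible.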





Oren Eini (Ayende Rahien)

May 14, 2013, 5:41:02 AM
to ravendb
We are already reporting those to the console.

Georgios Diamantopoulos

May 14, 2013, 5:48:36 AM
to rav...@googlegroups.com
Right - but how can we get hold of that data programmatically?



Oren Eini (Ayende Rahien)

May 14, 2013, 5:49:03 AM
to ravendb
It is available in the logs, and you can get that via the /logs endpoint
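
A minimal Python sketch of pulling that endpoint from a script. The base URL is a placeholder, and the assumption that the response body is JSON is mine, not something stated in the thread; check the actual response before relying on this.

```python
import json
from urllib.request import urlopen


def fetch_logs(base_url, opener=urlopen):
    """GET <base_url>/logs and decode the body as JSON.

    The JSON response format is an assumption. `opener` is injectable so the
    function can be exercised without a running server.
    """
    with opener(base_url.rstrip("/") + "/logs") as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical server address; adjust to wherever Raven is listening.
    try:
        print(fetch_logs("http://localhost:8080"))
    except OSError as error:
        print(f"Could not reach the server: {error}")
```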

Georgios Diamantopoulos

May 14, 2013, 5:53:13 AM
to rav...@googlegroups.com
Are there any plans for a performance monitoring endpoint (structured data, not text)?




Oren Eini (Ayende Rahien)

May 14, 2013, 5:56:38 AM
to ravendb
We do that, we have perf counters

Georgios Diamantopoulos

May 14, 2013, 5:58:18 AM
to rav...@googlegroups.com
That should be easier to interface with than parsing the logs, as long as it has the same amount of info or more



Matt Johnson

May 14, 2013, 10:50:37 AM
to rav...@googlegroups.com
Evaluating perf counters is probably a good bet, especially if we can get some Raven-specific ones as well as some of the key operating system metrics, such as I/O performance.

Speaking of metrics, have you guys seen Iometer (www.iometer.org)? It's a bit dated, but it's a very stable and useful tool for measuring I/O performance. It might be useful, at least for testing on Azure VMs. I'm not sure we could use it in other modes, such as with Raven embedded mode on an Azure web site or web role.

Speaking of IOPS, there have been some recent changes posted for Azure VMs: http://msdn.microsoft.com/library/dn197896.aspx

They don't guarantee a specific minimum IOPS, but they are specifying targeted maximums. We should be able to see how Raven performs within the ranges they are specifying for the different machine sizes.

We should also consider that Amazon offers guaranteed IOPS (provisioned IOPS) for their EC2 instances. Azure doesn't have anything like that yet.

-Matt

Matt Johnson

May 14, 2013, 10:55:12 AM
to rav...@googlegroups.com
Also, see this relatively new writeup on SQL Server IOPS on Azure VMs. We have a different load than SQL, of course, but these are similar metrics to what we need.

Kijana Woodard

May 18, 2013, 4:54:15 PM
to rav...@googlegroups.com
I just tried SQLIO Disk Subsystem Benchmark Tool (http://www.microsoft.com/en-us/download/details.aspx?id=20163).


The performance is still _much_ worse than my local SSD.

Local:
IOs/sec:  7423.98
MBs/sec:    14.49

Azure VM Striped Data Disk:
IOs/sec:   199.13
MBs/sec:     0.38

Azure VM C:
IOs/sec: 13309.53
MBs/sec:    25.99

Azure VM D:
IOs/sec: 19614.86
MBs/sec:    38.31

Given the numbers for the VM C/D drives, I'm wondering if I missed some steps setting up the disks. I was rushing to get everything back up so other developers could continue working and seeing their integrated changes.

At any rate, maybe SQLIO is a good (enough) approximation of a RavenScore.



Kijana Woodard

May 18, 2013, 5:28:44 PM
to rav...@googlegroups.com
Ok. I've found three problems with what I did:

1. Had Geo-Replication enabled
2. Did not have the Storage Accounts in an affinity group.
3. I'm using an existing VM that I thought was in an affinity group, but I'm not sure.

Anyone know how to set 1 and 2 with PowerShell cmdlets?

Kijana Woodard

May 18, 2013, 6:56:37 PM
to rav...@googlegroups.com
Ok. Maybe my real problem was my use of SQLIO.

For my original numbers, I used ".\sqlio.exe -dM" where M is my drive letter.


So now I tried ".\sqlio.exe -kR -t64 -s120 -dM -o8 -fsequential -b64 -BH -LS Testfile.dat"
IOs/sec:  2135.00
MBs/sec:   133.43

I attached a disk from the same storage account as the VM and got:
IOs/sec:  2989.66
MBs/sec:   186.85

I created another VM in the same affinity group as the storage account. I am watching to see if the "warmup" the article author suggests makes a difference.

Fresh striped disks from same account/container as VM:
IOs/sec:  2044.67
MBs/sec:   127.79

A single disk from the same storage account but a different container, which has been around a bit longer:
IOs/sec:  2908.49
MBs/sec:   181.78
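
One thing worth noting when comparing the two sets of numbers: in every result above, MBs/sec is just IOs/sec multiplied by the block size. The later runs match the 64 KB blocks requested with -b64, and the earlier runs (no -b flag) are consistent with 2 KB blocks, which I'm assuming was the block size in effect there. A small Python check of that arithmetic:

```python
def mb_per_sec(ios_per_sec, block_kb):
    """Throughput in MB/s implied by an IOPS figure and a block size in KB."""
    return ios_per_sec * block_kb / 1024


# (label, IOs/sec, block size in KB, MBs/sec as reported in the thread)
# The 2 KB block size on the first two rows is an assumption inferred from
# the numbers; the 64 KB block size comes from the -b64 flag.
RUNS = [
    ("local SSD, no -b flag", 7423.98, 2, 14.49),
    ("Azure striped data disk, no -b flag", 199.13, 2, 0.38),
    ("first -b64 run", 2135.00, 64, 133.43),
    ("disk from the VM's storage account, -b64", 2989.66, 64, 186.85),
]

if __name__ == "__main__":
    for label, iops, block_kb, reported in RUNS:
        computed = mb_per_sec(iops, block_kb)
        print(f"{label}: computed {computed:.2f} MB/s, reported {reported} MB/s")
```

So the two sets of results aren't directly comparable: the jump in MBs/sec between them mostly reflects the larger block size and the extra threads and outstanding I/Os, not a faster disk.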

Georgios Diamantopoulos

May 20, 2013, 11:52:13 AM
to rav...@googlegroups.com
That seems sufficient to test how good a setup is in terms of Azure disk I/O, but I don't think it's a measure of DB performance?



Kijana Woodard

May 20, 2013, 12:03:45 PM
to rav...@googlegroups.com

So, the tool is meant to give you an idea of SQL Server performance. It's not a real raven score, but it might serve as a rough proxy for the capabilities of a given setup, since Raven is so reliant on disk speed for perf.

Mircea Chirea

May 23, 2013, 5:37:52 PM
to rav...@googlegroups.com
I'm ready to move all my apps to use Raven hosted on Azure. We barely have any traffic but had quite low performance, which was surprising.

Here's what I've learned:
  • Some VMs are faster than others; it seems weird, but I've noticed significantly lower perf across different VMs, so if it feels slow, be prepared to tear it down.
  • Since GA no VM went down, not for one second. That's pretty good, as during preview VMs would just die randomly.
  • If you want to use Azure backup don't use ReFS as it only supports NTFS.
  • ReFS seems slower than NTFS on Raven, although I don't have hard data (and it might be the VM perf variation).
  • Latency kinda sucks. There's 20ms on average for each request, so five queries take 100ms just for the connection.
  • Query time spikes; I'm not sure if it's network latency or Raven going unresponsive, or the IO subsystem, but some queries take 3-4 seconds, while normally being 25-30ms. WEIRD.
  • You better install New Relic to keep an eye on everything. Standard is free through Azure or BizSpark.
I've only tested small instances; the larger ones may not have those issues, I don't know.

Kijana Woodard

May 23, 2013, 10:28:30 PM
to rav...@googlegroups.com
Faster VMs - We found the same to be true on AWS and read about someone else who had gone so far as to have an auto-kill when the app found itself on a "slow machine". In the cloud, they have old hardware sitting around that's "useful" and your VM may wind up there. :-D

For Azure, supposedly US West has newer hardware than US East.

I haven't had a VM die either - until I hit the spending limit :-D.

ReFS?

I've noticed some latency hiccups as well from Azure Websites to an Azure VM. I've been blaming it on not being able to use the private IP (at least that was true a couple of months ago). In general I try to do as many Lazy operations as possible to consolidate all the requests into one remote call. Your experience validates that effort.
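
The arithmetic behind that, as a tiny Python sketch: with the roughly 20 ms per request reported above, five sequential queries cost about 100 ms of pure round-trip time, while one consolidated call costs about 20 ms. The function name is made up and the model ignores server-side work entirely.

```python
def total_latency_ms(requests, round_trip_ms=20, batched=False):
    """Round-trip overhead for a number of requests.

    Sequential requests each pay one round trip; a batched call pays one
    round trip in total. Server processing time is deliberately ignored.
    """
    round_trips = 1 if (batched and requests > 0) else requests
    return round_trips * round_trip_ms


if __name__ == "__main__":
    print("5 sequential queries:", total_latency_ms(5), "ms")
    print("5 queries in one call:", total_latency_ms(5, batched=True), "ms")
```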

I'm still not happy with disk performance. I've tried striping 4 disks and even using disks from different storage accounts. Doesn't seem to make much difference.

At the moment, I'm running on the OS disk (C:). It's dev, so if I run out of room, no problem. Since the client wants to move to colo hardware, I'm not worried at the moment, but I wish the disk perf was better. The main caveats to this are that I realize there's something wrong with my index (I think because of LoadDocument, as mentioned in my other thread), that I'm on stable, and that 2.5 is set to bring better performance.

Good tip about New Relic. Duck and move. With AWS, we joked about bringing up 40% more boxes than we needed  for the web app and then killing the "bad ones".




Justin A

May 24, 2013, 1:03:45 AM
to rav...@googlegroups.com
Kijana: "Since the client wants to move to coloc hardware".

Do you mean you're going to have the website on azure but RavenDb on a coloc elsewhere? and accept the latency hit for better disk speeds?

(I'm looking at using RavenHQ with around 5 odd Gigs of data, just so I don't have to deal with managing/supporting the db software/backups/another os/etc with the websites on azure).

Mircea Chirea

May 24, 2013, 6:08:01 AM
to rav...@googlegroups.com
inline


On Friday, May 24, 2013 5:28:30 AM UTC+3, Kijana Woodard wrote:
> Faster VMs - We found the same to be true on AWS and read about someone else who had gone so far as to have an auto-kill when the app found itself on a "slow machine". In the cloud, they have old hardware sitting around that's "useful" and your VM may wind up there. :-D

Yeah well there isn't anything else you can do about that.

> For Azure, supposedly US West has newer hardware than US East.

I'm using EU West (Amsterdam). Azure has (had?) newer hardware than EC2, just because... well it's newer.

> I haven't had a VM die either - until I hit the spending limit :-D.

Well that never helps.

> ReFS?

New filesystem in Windows Server 2012.

> I've noticed some latency hiccups as well from Azure Websites to an Azure VM. I've been blaming it on not being able to use the private IP (at least that was true a couple of months ago). In general I try to do as many Lazy operations as possible to consolidate all the requests into one remote call. Your experience validates that effort.

Yeah, I was too lazy to use lazy operations.
That said, I am not able to use the internal IP. I think it's only available between VMs set up in the same availability group.

> I'm still not happy with disk performance. I've tried striping 4 disks and even using disks from different storage accounts. Doesn't seem to make much difference.

I can't comment on that as I don't have the volume of data necessary to have IO as a bottleneck.

> At the moment, I'm running on the OS disk (C:). It's dev so if I run out of room, no problem. Since the client wants to move to colo hardware, I'm not worried at the moment, but wish the disk perf was better. The main caveats to this are that I realize there's something wrong with my index (I think because of LoadDocument, as mentioned in my other thread), that I'm on stable, and that 2.5 is set to bring better performance.

The OS disk is stored on the storage account. If you want fast storage, the D (temp) drive is the fastest AFAIK, as it's directly on the host.

Mircea Chirea

May 24, 2013, 6:09:19 AM
to rav...@googlegroups.com
Don't use RavenHQ with everything else on Azure, the latency WILL kill you.
Running a VM is pretty much set and forget. Especially with Server 2012 you don't have to do anything after it's running, except apply updates.

Kijana Woodard

May 24, 2013, 8:33:31 AM
to rav...@googlegroups.com

No. Website and db on owned hardware. Not sure the reasoning even with the issues we're discussing.


Kijana Woodard

May 24, 2013, 8:35:15 AM
to rav...@googlegroups.com

D is for sure fastest, but azure reserves the right to kill the data on that drive at any time. Purely temporary.

Khalid Abuhakmeh

May 24, 2013, 10:41:12 AM
to rav...@googlegroups.com
Have you guys thought of making a VHD and making it available? It sucks to have to provision a VM and go through the steps to install IIS and then RavenDB.

Kijana Woodard

May 24, 2013, 10:58:03 AM
to rav...@googlegroups.com
I haven't even made my own image yet... which really sucked when I hit the spending cap and Azure automatically _deleted_ my VM. Good times. :-D

Manager called. I had to remind him what "dev environment" means.

Kijana Woodard

May 24, 2013, 10:59:11 AM
to rav...@googlegroups.com
I should add, it doesn't take much time for me to get up and running once the VM is provisioned. I don't use IIS for Raven; I just unzip the server, run it, and open port 8080.

Kijana Woodard

May 24, 2013, 11:11:10 AM
to rav...@googlegroups.com
Here's a PowerShell script I use to open ports instead of using the Azure portal.

# Name of the VM; the script assumes the cloud service has the same name.
$vm_name = "whatever"

# Endpoint name -> port. Each port is opened as a TCP endpoint with the
# same local and public port number.
$ports = @{
    "http" = @{
        "Port" = 80;
    };
    "nancy" = @{
        "Port" = 8888;
    };
    "raven" = @{
        "Port" = 8080;
    };
    "postgres" = @{
        "Port" = 5432;
    };
    "ftp" = @{
        "Port" = 21;
    };
    "ftp-data" = @{
        "Port" = 20;
    };
    "ftp-passive00" = @{
        "Port" = 7000;
    };
    "ftp-passive01" = @{
        "Port" = 7001;
    };
    "ftp-passive02" = @{
        "Port" = 7002;
    };
}

$vm = Get-AzureVM -ServiceName $vm_name -Name $vm_name;

# Queue one endpoint per entry; nothing is applied until Update-AzureVM runs.
$ports.GetEnumerator() | %{
    $port = $ports[$_.Name];
    Write-Host $_.Name $port.Port;
    $vm = $vm | Add-AzureEndpoint -Name $_.Name -Protocol 'TCP' -LocalPort $port.Port -PublicPort $port.Port;
}

$vm | Update-AzureVM;


Next I need a script to open ports in the firewall on the VM itself.
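
For the in-VM side, one possible approach is to generate `netsh advfirewall` commands for the same ports and run them on the VM from an elevated prompt. The Python sketch below only builds and prints the command strings; it does not execute anything, and reusing the endpoint names as rule names is just an illustrative choice.

```python
# Same name -> port mapping as the endpoint script above.
PORTS = {
    "http": 80,
    "nancy": 8888,
    "raven": 8080,
    "postgres": 5432,
    "ftp": 21,
    "ftp-data": 20,
    "ftp-passive00": 7000,
    "ftp-passive01": 7001,
    "ftp-passive02": 7002,
}


def firewall_command(name, port):
    """Build a netsh command that allows inbound TCP traffic on one port."""
    return (
        f'netsh advfirewall firewall add rule name="{name}" '
        f"dir=in action=allow protocol=TCP localport={port}"
    )


if __name__ == "__main__":
    for rule_name, rule_port in PORTS.items():
        print(firewall_command(rule_name, rule_port))
```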

Justin A

May 24, 2013, 8:36:51 PM
to rav...@googlegroups.com
drats - was hoping to leverage RavenHq + Azure.

If I go the VM route, I still have to update the VM with Windows updates etc., which I've been doing for years on my other servers. It's still a PITA, but I know how to do it. And firewall security, etc.

Would :heart: a pre-made Azure-RavenDb-VM which anyone could clone, with RavenDb running under IIS and backups set up to go to the Amazon S3 thing or whatever is the flavour of the month. (Of course, it won't work until we apply our licenses for RavenDB and Amazon backup, etc.)

^ - beg?

Georgios Diamantopoulos

May 24, 2013, 9:20:35 PM
to rav...@googlegroups.com
Maybe we can use scripts to do all this?


Kijana Woodard

May 24, 2013, 10:33:17 PM
to rav...@googlegroups.com

Probably scripts yes. The azure ones are not too bad. Just need to fill in some windows ones for iis and we should be good to go.

Justin, the firewall stuff is so simple, I can barely justify scripting it. :-]


Justin A

May 25, 2013, 11:09:48 AM
to rav...@googlegroups.com
totally agreed :) and it might be even easier (read: nothing to do manually) if the raven.install.exe thing i think i read once in here .. is made and does that stuff for us :P

but yes, have a powershell script would be really really awesomeballs - make creation + deployment soooooooo much simpler! :) not just the vm .. but some initial basic raven server configuration (eg. auto backups).

Kijana Woodard

May 25, 2013, 11:30:50 AM
to rav...@googlegroups.com

I wouldn't like a gui installer. I think you lose points for that. :-)

I'd take a cli for things like create a db and set bundles, users, etc. I've seen it done in a script by using curl though, so that's not urgent.


Georgios Diamantopoulos

May 25, 2013, 11:51:58 AM
to rav...@googlegroups.com
I've got some free time today - I may do some work on the performance stuff we've been talking about.

Any ideas on a dataset?



Mircea Chirea

May 25, 2013, 12:05:12 PM
to rav...@googlegroups.com
NuGet? Oren has done a bunch of blog posts about it.

Kijana Woodard

May 25, 2013, 12:05:22 PM
to rav...@googlegroups.com

Yeah. Last night I noticed a folder in the raven source code called raven.performance. It uses freedb. It also looks like it does a lot of what we want.

Mircea Chirea

May 25, 2013, 12:05:47 PM
to rav...@googlegroups.com
Or heck, take the StackOverflow dump =)
That ought to give Raven a run for its money.


Georgios Diamantopoulos

May 25, 2013, 12:25:14 PM
to rav...@googlegroups.com
You mean nuget packages themselves??


Date: Sat, 25 May 2013 09:05:12 -0700
From: chirea...@gmail.com
To: rav...@googlegroups.com

Georgios Diamantopoulos

May 25, 2013, 12:25:32 PM
to rav...@googlegroups.com
Hm... at 4.30GB, it may be a little hard to Git :)


Date: Sat, 25 May 2013 09:05:47 -0700
From: chirea...@gmail.com
To: rav...@googlegroups.com
Subject: [RavenDB] Re: Raven on Azure working group

Georgios Diamantopoulos

May 25, 2013, 12:27:22 PM
to rav...@googlegroups.com
I seem to remember this was somehow deprecated.

Have you identified what's missing, if anything?


Date: Sat, 25 May 2013 11:05:22 -0500
Subject: RE: [RavenDB] Re: Raven on Azure working group

Kijana Woodard

May 25, 2013, 12:28:30 PM
to rav...@googlegroups.com
No. Just saw it in the code. Didn't try to run it.

Georgios Diamantopoulos

May 25, 2013, 12:33:28 PM
to rav...@googlegroups.com
There's also Raven.SimulatedWorkLoad



Mircea Chirea

May 25, 2013, 1:11:45 PM
to rav...@googlegroups.com

Mircea Chirea

May 25, 2013, 1:13:07 PM
to rav...@googlegroups.com
Maybe Raven.Performance.Hardcode? =)

I don't think the SO data should live in a repository. Just an import app which reads the dumps and shoves them into Raven in the appropriate format.

Kijana Woodard

May 25, 2013, 1:32:47 PM
to rav...@googlegroups.com
Except the data should be consistent to give comparable results. Maybe we need a bit less than 4GB. :-D
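
One cheap way to make sure everyone benchmarks against byte-identical data is to publish a checksum of the dataset file alongside the results. A minimal Python sketch, assuming the dataset ends up as a single file; the file name in the usage example is a placeholder.

```python
import hashlib


def dataset_fingerprint(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MB chunks so that
    multi-gigabyte dumps do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    import sys

    # Usage: python fingerprint.py dataset.zip  (placeholder file name)
    if len(sys.argv) == 2:
        print(dataset_fingerprint(sys.argv[1]))
```

Two runs are only comparable if they report the same fingerprint, the same score version, and the same Raven build.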



Georgios Diamantopoulos

May 25, 2013, 1:46:55 PM
to rav...@googlegroups.com
Sure, but even with a fast line, downloading 4GB won't be a breeze


Date: Sat, 25 May 2013 10:13:07 -0700
From: chirea...@gmail.com
To: rav...@googlegroups.com
Subject: Re: [RavenDB] Re: Raven on Azure working group

Georgios Diamantopoulos

May 25, 2013, 1:55:17 PM
to rav...@googlegroups.com
Interestingly the Performance project is not included in the main Raven solution but IS in the zzz_RavenDB_Release.sln


Date: Sat, 25 May 2013 12:32:47 -0500
Subject: Re: [RavenDB] Re: Raven on Azure working group
From: kijana....@gmail.com
To: rav...@googlegroups.com

Georgios Diamantopoulos

May 25, 2013, 2:02:28 PM
to rav...@googlegroups.com
Does anyone know how to get data from the performance counters that Oren spoke about in a previous post?



Mircea Chirea

May 25, 2013, 3:47:06 PM
to rav...@googlegroups.com
You can always download a specific dump and import it, SO does monthly dumps (or used to at least). Unless the importer has bugs it should be the same in RavenDB :)

Mircea Chirea

May 25, 2013, 3:50:28 PM
to rav...@googlegroups.com
Of course, but you'd only need to download it once. We could standardize on the, say, October 2010 dump, which is 1GB: http://www.clearbits.net/torrents/1439-oct-2010

Kijana Woodard

May 25, 2013, 3:55:16 PM
to rav...@googlegroups.com

Well, that would make it tougher to, say, set up a new VM and evaluate its Raven capabilities. Maybe 100MB?

Georgios Diamantopoulos

May 25, 2013, 4:04:45 PM
to rav...@googlegroups.com
Just imported the NuGet DB. About 270MB, 130k docs! Sounds good?


Kijana Woodard

May 25, 2013, 4:15:49 PM
to rav...@googlegroups.com

Yeah. Let's roll with that.

Georgios Diamantopoulos

May 25, 2013, 4:19:23 PM
to rav...@googlegroups.com
I'll save the JSON files and zip them; the archive should be pretty small. (The dates also seem to need conversion.)
Here's an example, let's brainstorm some queries:

{
  "Id": "RavenDB.Client",
  "Version": "2.0.2230",
  "Authors": "Hibernating Rhinos",
  "Copyright": null,
  "Created": "/Date(1357564102500)/",
  "Dependencies": [],
  "Description": "This package includes the client API of RavenDB. RavenDB is a document database for the .NET platform, offering a flexible data model design to fit the needs of real world systems.",
  "DownloadCount": 98547,
  "IsLatestVersion": false,
  "IsAbsoluteLatestVersion": false,
  "IsPrerelease": false,
  "Language": "en-US",
  "LastUpdated": "/Date(1369502739593)/",
  "Published": "/Date(1357564102500)/",
  "PackageHash": "om9F1F6kSmAZIY21bTrLNxFRq4UKjRlOMx9w0AhZE/dNNDpTrORc4oooRMpQXsMsWQvjqva4pVV4sIQwm09ing==",
  "PackageHashAlgorithm": "SHA512",
  "PackageSize": "3580588",
  "ProjectUrl": "http://www.ravendb.net/",
  "ReleaseNotes": null,
  "RequireLicenseAcceptance": true,
  "Summary": "This package includes the client API of RavenDB.",
  "Tags": [
    "nosql",
    "ravendb",
    "raven",
    "document",
    "database",
    "client"
  ],
  "Title": "RavenDB Client",
  "VersionDownloadCount": 8644,
  "MinClientVersion": null
}
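
The "/Date(...)/" values in the document above wrap a count of milliseconds since the Unix epoch. A minimal sketch of the conversion mentioned earlier, assuming plain UTC values with no timezone suffix inside the parentheses; the function names are illustrative and not part of any RavenDB or NuGet API:

```python
import re
from datetime import datetime, timedelta, timezone

# Matches the legacy JSON date wrapper, e.g. "/Date(1357564102500)/".
# Assumption: no "+0100"-style offset inside the parentheses.
_MS_DATE = re.compile(r"^/Date\((-?\d+)\)/$")
_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def convert_ms_date(value):
    """Return an ISO 8601 UTC string for a /Date(ms)/ value; pass anything else through."""
    if isinstance(value, str):
        match = _MS_DATE.match(value)
        if match:
            return (_EPOCH + timedelta(milliseconds=int(match.group(1)))).isoformat()
    return value


def convert_document(doc):
    """Convert every top-level /Date(ms)/ field of a package document."""
    return {key: convert_ms_date(val) for key, val in doc.items()}


# A trimmed copy of the sample document above.
package = {
    "Id": "RavenDB.Client",
    "Created": "/Date(1357564102500)/",
    "DownloadCount": 98547,
}
print(convert_document(package)["Created"])  # 2013-01-07T13:08:22.500000+00:00
```

Only top-level fields are handled, which is enough for the sample document since its dates are not nested.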



Kijana Woodard

May 25, 2013, 4:41:06 PM
to rav...@googlegroups.com
Load(Id)
Query.Search(Description + Summary)
Where IsPrerelease = true/false
Where Author = "Someone"

Georgios Diamantopoulos

May 25, 2013, 4:46:17 PM
to rav...@googlegroups.com
How about this (some of this will be educational and prob removed later):

- Bulk insert jsons (measure perf)
- Time to page-load all docs?
- Insert static index and measure time until it's non-stale
- Find saturation point as in http://blogs.prodata.ie/post/IOPS-Planning-for-SQL-Server-in-Azure-IaaS.aspx for queries against the index using terms from the index and outside of the index
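
One way the "time until it's non-stale" step could be measured is to poll a staleness check and record the elapsed time. This is only a sketch: `is_stale` is a hypothetical callable that would wrap whatever staleness information the client or server exposes, and `FakeIndex` stands in for a real index so the example runs on its own.

```python
import time


def wait_until_non_stale(is_stale, timeout_s=600.0, poll_interval_s=0.5,
                         clock=time.monotonic, sleep=time.sleep):
    """Poll is_stale() until it returns False and return the elapsed seconds."""
    start = clock()
    while is_stale():
        if clock() - start > timeout_s:
            raise TimeoutError("index still stale after %.1f s" % timeout_s)
        sleep(poll_interval_s)
    return clock() - start


class FakeIndex:
    """Stand-in for a real index: reports stale for a fixed number of polls."""

    def __init__(self, stale_polls):
        self.remaining = stale_polls
        self.polls = 0

    def is_stale(self):
        self.polls += 1
        if self.remaining > 0:
            self.remaining -= 1
            return True
        return False


fake = FakeIndex(stale_polls=3)
elapsed = wait_until_non_stale(fake.is_stale, poll_interval_s=0.01)
print(f"index non-stale after {elapsed:.3f} s and {fake.polls} polls")
```

The poll interval bounds the measurement error, so it should be small relative to the indexing time being measured.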



Kijana Woodard

May 25, 2013, 4:49:51 PM
to rav...@googlegroups.com
page load? after index non-stale?
could do Stream for that as well to see difference, if any.


Georgios Diamantopoulos

May 25, 2013, 4:51:27 PM
to rav...@googlegroups.com
I'm setting up a repo for this, give me a few


Kijana Woodard

May 25, 2013, 5:01:51 PM
to rav...@googlegroups.com
No worries. I'm just tooling around today. I have to leave shortly.

Mircea Chirea

May 25, 2013, 6:52:47 PM
to rav...@googlegroups.com
  • Load by id
  • Query by prerelease
  • Query by version
  • Query by author
  • Query by tags
  • Full text search on title
  • Full text search on description
  • Full text search on release notes
  • Order by published/last updated/downloads, 25 results
That should do it :)

Of course the SO benchmark should also be considered, as a "hardcore" test.
Curious what the full text search performance would be on the latest 10GB dumps =)
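
A list of operations like the one above could be driven by a small timing harness that runs each named operation repeatedly and reports latency percentiles. The sketch below is not tied to any RavenDB client: the operations are placeholder lambdas over an in-memory dictionary, and in a real run each would wrap the corresponding load or query call.

```python
import statistics
import time


def run_benchmark(operations, iterations=100):
    """Time each named zero-argument callable and summarise latency in milliseconds."""
    results = {}
    for name, operation in operations.items():
        samples = []
        for _ in range(iterations):
            started = time.perf_counter()
            operation()
            samples.append((time.perf_counter() - started) * 1000.0)
        samples.sort()
        p95_index = min(len(samples) - 1, int(0.95 * len(samples)))
        results[name] = {
            "median_ms": statistics.median(samples),
            "p95_ms": samples[p95_index],
            "max_ms": samples[-1],
        }
    return results


# Placeholder data and operations (hypothetical, for illustration only).
documents = {
    "RavenDB.Client": {"IsPrerelease": False, "Authors": "Hibernating Rhinos"},
}
operations = {
    "load_by_id": lambda: documents["RavenDB.Client"],
    "query_by_prerelease": lambda: [d for d in documents.values() if d["IsPrerelease"]],
    "query_by_author": lambda: [d for d in documents.values()
                                if d["Authors"] == "Hibernating Rhinos"],
}

report = run_benchmark(operations, iterations=50)
for name, stats in report.items():
    print(f"{name}: median {stats['median_ms']:.4f} ms, p95 {stats['p95_ms']:.4f} ms")
```

Reporting the median and p95 rather than the mean keeps one slow outlier (a cold cache, a GC pause) from dominating the comparison between configurations.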

Justin A

May 26, 2013, 7:08:58 AM
to rav...@googlegroups.com
>> I wouldn't like a gui installer. I think you lose points for that. :-)

heh :) *I* was not suggesting/promoting a gui thingy. I thought I read somewhere that the next versions of Raven will include an 'installer' .. and I was suggesting to leverage that if it actually existed .. only to reduce our work effort.


but yeah - command line interface for me also. not a gui fan.

now, back to the rest of the azure convo :)

Oren Eini (Ayende Rahien)

May 26, 2013, 7:10:18 AM
to ravendb
We do have an installer, yes.


Mircea Chirea

May 26, 2013, 8:14:08 AM
to rav...@googlegroups.com
Blasphemy! Windows is all about GUI!

On-topic, I'm seeing much lower latencies today, under 10ms.

Georgios Diamantopoulos

May 29, 2013, 3:16:34 AM
to rav...@googlegroups.com
I've downloaded the nugets and saved them in a ZIP (man that takes ages, I think they throttle connections) - they're 111MB so I should be able to check in the file so that other people don't have to download them again - soonish...


Jahmai Lay

Jun 14, 2013, 2:52:06 AM
to rav...@googlegroups.com

We've been using Raven on Azure for a while now and it's worked fine due to low load.
We're now running into IOPS limitation issues which we are currently engaging with Azure support to resolve.

The core issue is that it doesn't seem to be possible to scale beyond ~200-300 random write IOPS on a mounted drive. Striping doesn't seem to help.
Random IO seems to be a big influence on Raven's performance.

Anyone else seeing this?

On Sunday, May 19, 2013 8:56:37 AM UTC+10, Kijana Woodard wrote:
Ok. Maybe my real problem was my use of SQLIO.

For my original numbers, I used ".\sqlio.exe -dM" where M is my drive letter.


So now I tried ".\sqlio.exe -kR -t64 -s120 -dM -o8 -fsequential -b64 -BH -LS Testfile.dat"
IOs/sec:  2135.00
MBs/sec:   133.43

I attached a disk from the same storage account as the VM and got:
IOs/sec:  2989.66
MBs/sec:   186.85

I created another VM in the same affinity group as the storage account. I am watching to see if the "warmup" the article author suggests makes a difference.

Fresh striped disks from same account/container as VM:
IOs/sec:  2044.67
MBs/sec:   127.79

A single disk from the same storage account but a different container that has been around a bit longer:
IOs/sec:  2908.49
MBs/sec:   181.78



On Sat, May 18, 2013 at 4:28 PM, Kijana Woodard <kijana....@gmail.com> wrote:
Ok. I've found three problems with what I did:

1. Had Geo-Replication enabled
2. Did not have the Storage Accounts in an affinity group.
3. I'm using an existing VM that I thought was in an affinity group, but I'm not sure.

Does anyone know how to set 1 and 2 with PowerShell cmdlets?


On Sat, May 18, 2013 at 3:54 PM, Kijana Woodard <kijana....@gmail.com> wrote:
I just tried SQLIO Disk Subsystem Benchmark Tool (http://www.microsoft.com/en-us/download/details.aspx?id=20163).


The performance is still _much_ worse than my local SSD.

Local:
IOs/sec:  7423.98
MBs/sec:    14.49

Azure VM Striped Data Disk:
IOs/sec:   199.13
MBs/sec:     0.38

Azure VM C:
IOs/sec: 13309.53
MBs/sec:    25.99

Azure VM D:
IOs/sec: 19614.86
MBs/sec:    38.31

Given the numbers for the VM C/D drives, I'm wondering if I missed some steps setting up the disks. I was rushing to get everything back up so other developers could continue working and seeing their integrated changes.

At any rate, maybe SQLIO is a good (enough) approximation of a RavenScore.
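
A quick sanity check on the SQLIO figures in this thread: dividing throughput by IOPS gives the implied size of each I/O. The sketch below uses only numbers quoted in these messages and assumes SQLIO reports MB as 1024 KB, which is consistent with the runs that pass -b64. The labels are descriptive shorthand, not SQLIO output.

```python
def implied_block_kb(iops, mb_per_sec):
    """Average size of one I/O in KB, assuming 1 MB = 1024 KB."""
    return mb_per_sec * 1024.0 / iops


# (IOs/sec, MBs/sec) pairs as reported in the thread.
runs = {
    "Local SSD (-dM only)": (7423.98, 14.49),
    "Azure striped data disk (-dM only)": (199.13, 0.38),
    "Azure VM C: (-dM only)": (13309.53, 25.99),
    "Azure VM D: (-dM only)": (19614.86, 38.31),
    "Drive M (-b64 -fsequential)": (2135.00, 133.43),
    "Attached disk, same account (-b64 -fsequential)": (2989.66, 186.85),
}

for label, (iops, mbps) in runs.items():
    print(f"{label}: ~{implied_block_kb(iops, mbps):.1f} KB per I/O")
```

The first four runs work out to roughly 2 KB per I/O and the -b64 runs to 64 KB, so the two sets of IOPS figures measure different workloads and are not directly comparable.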



On Tue, May 14, 2013 at 9:55 AM, Matt Johnson <mj1...@hotmail.com> wrote:
Also, see this relatively new writeup on SQL Server IOPS on Azure VMs.  We have a different load than SQL Server, of course, but these metrics are similar to what we need.


On Tuesday, May 14, 2013 7:50:37 AM UTC-7, Matt Johnson wrote:
Evaluating perf counters is probably a good bet.  Especially if we can get some Raven-specific ones as well as some of the key operating system metrics, such as I/O performance.

Speaking of metrics, have you guys seen Iometer?  (www.iometer.org) - It's a bit dated, but it's a very stable and useful tool for measuring I/O performance.  It might be useful - at least for testing on Azure VMs.  I'm not sure we could use it in other modes, such as with Raven embedded mode on an Azure web site or web role.

Speaking of IOPS, there have been some recent changes posted for Azure VMs: http://msdn.microsoft.com/library/dn197896.aspx

They don't guarantee a specific minimum IOPS, but they are specifying targeted maximums.  We should be able to see how Raven performs within the ranges they are specifying for the different machine sizes.

We should also be considering that Amazon offers guaranteed IOPS (provisioned IOPS) for their EC2 instances.  Azure doesn't have anything like that yet.

-Matt


On Tuesday, May 14, 2013 2:58:18 AM UTC-7, georgiosd wrote:
That should be easier to interface with than parsing the logs, as long as it has the same amount of info or more


From: aye...@ayende.com
Date: Tue, 14 May 2013 10:56:38 +0100

Subject: Re: [RavenDB] Re: Raven on Azure working group
To: rav...@googlegroups.com

We do that, we have perf counters


Kijana Woodard

Jun 14, 2013, 6:49:58 AM
to rav...@googlegroups.com

Yes.

Though I was able to get sqlio to look better, I couldn't see any noticeable difference in raven. My subscription ran over the limit and my vm was deleted. I didn't have an image. When I rebuilt, I just put everything in the os drive and it seemed better. Of course that is size limited, but for dev, no worries.

The client wants to buy hardware, so I stopped working on it. Please share any insights from Azure support.

It was also nontrivial to get all the striping done anyway. They should have a high-IO option that works without trying to hack their infrastructure (trying to coerce disks into the same DC, etc.). It also didn't seem trivial to attach 16 existing drives that were previously striped and preserve all the data.

Matt Watson

Jun 14, 2013, 9:44:56 AM
to rav...@googlegroups.com
We are also planning to use RavenDB on Azure and would love to be part of any group testing or best practices.

In our use case we should have a lot of very small databases, so we are hoping it will scale out well.

Felipe Leusin

Jun 15, 2013, 11:56:27 AM
to rav...@googlegroups.com
+1 for Azure here. I've been using it with a Medium VM for our development (the application is a WebSite) without a hitch, but I'm not very confident about going to production on Azure (it would be great if some people, like the folks from RavenHQ or CloudBird, would take up this challenge).

If you guys need any help testing or have already defined some methodology, let me know. I scanned the past emails but couldn't find anything. Also, I'm running our tests next week and will post the results here.
Message has been deleted

Chris Marisic

Jun 17, 2013, 1:07:09 PM
to rav...@googlegroups.com
We're going with dedicated hardware: SSDs in pairs with RAID 1 mirroring and 96GB of RAM. The features we've really needed are in 2.5, so I've been watching that closely. Now that it's in RC, we'll probably just wait until the official RTM/RTW.

Kijana Woodard

Jun 17, 2013, 1:09:53 PM
to rav...@googlegroups.com
On premise or with a vendor (rackspace/softlayer/etc)?

Chris Marisic

Jun 17, 2013, 3:34:30 PM
to rav...@googlegroups.com
 
On Monday, June 17, 2013 1:09:53 PM UTC-4, Kijana Woodard wrote:
On premise or with a vendor (rackspace/softlayer/etc)?



3rd party dedicated hosting. The box is in The Planet's data center in Dallas, which is now owned by SoftLayer.

Oren Eini (Ayende Rahien)

Jun 17, 2013, 3:34:43 PM
to ravendb
Please don't wait for this, we really need people to test the RC


On Mon, Jun 17, 2013 at 8:06 PM, Chris Marisic <ch...@marisic.com> wrote:
We're going dedicated hardware SSDs in pairs with raid0 and 96GB of RAM. The features we've really needed are in 2.5, so i've been watching that closely. now with it in RC we'll probably just wait until official RTM/RTW


Kijana Woodard

Jun 17, 2013, 3:47:01 PM
to rav...@googlegroups.com
Down the street from my house. ;-)


Georgios Diamantopoulos

Jun 22, 2013, 1:50:54 PM
to rav...@googlegroups.com
I guess I'm overdue on setting up that GitHub repo for the benchmark project.

I can do that pretty soon, including the ZIP of test files, but I won't be able to gain any serious momentum without some help.

Any takers? :)


Kijana Woodard

Jun 22, 2013, 3:09:59 PM
to rav...@googlegroups.com

I have a console app I've been running that may have some value for raven bench.

Georgios Diamantopoulos

Jul 9, 2013, 5:58:12 PM
to rav...@googlegroups.com
Cheer, scream or rejoice. I'm uploading the first commit for https://github.com/georgiosd/RavenDB.Contrib.Performance
It contains the NuGet dataset, a method to regenerate it (don't try it at home folks, it's throttled), and a method to bulk insert it into the DB.

The bulk insert could use batching; it needs about 5GB of memory to complete!
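
A minimal sketch of what batching the bulk insert could look like: read the documents lazily and hand them to the store in fixed-size chunks, so only one chunk is in memory at a time. `store_batch` is a hypothetical callable standing in for whatever write call the client provides, and the batch size of 512 is an arbitrary illustrative value rather than a tuned one.

```python
from itertools import islice


def batched(items, batch_size):
    """Yield lists of at most batch_size items without materialising the whole input."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def bulk_insert_in_batches(documents, store_batch, batch_size=512):
    """Send documents to store_batch one chunk at a time; return the number inserted."""
    total = 0
    for batch in batched(documents, batch_size):
        store_batch(batch)
        total += len(batch)
    return total


def fake_documents(count):
    """Lazily generate placeholder documents instead of loading them all up front."""
    for number in range(count):
        yield {"Id": f"package-{number}", "DownloadCount": number}


batch_sizes = []
inserted = bulk_insert_in_batches(
    fake_documents(1300),
    lambda batch: batch_sizes.append(len(batch)),
    batch_size=512,
)
print(inserted, batch_sizes)  # 1300 [512, 512, 276]
```

Because the input is a generator, peak memory is bounded by the batch size rather than by the size of the dataset.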

Feel welcome to fork and send pull requests :)


Kijana Woodard

Jul 9, 2013, 6:05:16 PM
to rav...@googlegroups.com
:-)

Georgios Diamantopoulos

Jul 11, 2013, 8:33:17 AM
to rav...@googlegroups.com
Sigh... the upload failed because GitHub has a limit of 100MB per file and the archive is 112MB. I'll have to repackage; will let you know.


Mircea Chirea

Jul 11, 2013, 11:18:59 AM
to rav...@googlegroups.com
Might be better to upload to a dedicated file host. Dropbox or Google Drive or something.

Khalid Abuhakmeh

Jul 11, 2013, 4:32:22 PM
to rav...@googlegroups.com
I can't wait till RavenDB supports running in a Unix environment. Anybody see Digital Ocean? They offer super cheap SSD servers which would be great for a database. Anyways, my 2 cents.