Large import fails with Voron.Exceptions.ScratchBufferSizeLimitException

Andrej Krivulčík

May 20, 2016, 8:13:44 AM
To: RavenDB - 2nd generation document database
I'm trying to import a large amount of data into a Voron database.

Server Build #30115
Client Build #30115
Enabled bundles: Periodic Export, Replication
Incremental Backup option enabled

It's the same database/server as the one experiencing problems with removing old .journal files, described here: https://groups.google.com/forum/#!topic/ravendb/Fyf4xnGRdAA.

There are no indexes except the default Raven/DocumentsByEntityName index.

I create 3 entities for each data entry: two of the entity types are about 2 kB in size, the third is up to 1 kB. Most of the documents already exist in the database, because the import sometimes fails partway through (some documents get changed, some not).

The file scratch.0000000000.buffers grows during the import, but not steadily. After it hits the size limit, the import fails (this last occurred after processing around 530k data entries, 1590k documents):

System.AggregateException: Error when executing write ---> Voron.Exceptions.ScratchBufferSizeLimitException: Cannot allocate more space for the scratch buffer.
Current file size is: 3,136 KB.
Requested size for current file: 2,496 KB.
Requested total size for all files: 6,291,460 KB.
Limit: 6,291,456 KB.
Already flushed and waited for 5,002 ms for read transactions to complete.
Do you have a long running read transaction executing?
Debug info:
Current transaction id: 191182
Requested number of pages: 1 (adjusted size: 1 == 4 KB)
Oldest active transaction: 191181 (snapshot: 191181)
Oldest active transaction when flush was forced: -1
Next write transaction id: 191183
Active transactions:
Id: 191182 - ReadWrite
Id: 191181 - Read
Scratch files usage:
scratch.0000000000.buffers - size: 6,291,520 KB, in active use: 6,288,964 KB
scratch.0000000001.buffers - size: 3,136 KB, in active use: 2,492 KB
Most available free pages:
scratch.0000000000.buffers
Size:1, ValidAfterTransactionId: -1
Size:2, ValidAfterTransactionId: 186151
scratch.0000000001.buffers
Compression buffer size: 24,640 KB

   at Voron.Impl.Scratch.ScratchBufferPool.Allocate(Transaction tx, Int32 numberOfPages)
   at Voron.Impl.Transaction.AllocatePage(Int32 numberOfPages, PageFlags flags, Nullable`1 pageNumber)
   at Voron.Impl.Transaction.ModifyPage(Int64 num, Tree tree, Page page)
   at Voron.Trees.Tree.DirectAdd(MemorySlice key, Int32 len, NodeFlags nodeType, Nullable`1 version)
   at Voron.Trees.Tree.Add(Slice key, Stream value, Nullable`1 version)
   at Voron.Impl.TransactionMergingWriter.HandleOperations(Transaction tx, List`1 writes, CancellationToken token)
   at Voron.Impl.TransactionMergingWriter.HandleActualWrites(OutstandingWrite mine, CancellationToken token)
   --- End of inner exception stack trace ---
   at Voron.Impl.TransactionMergingWriter.Write(WriteBatch batch)
   at Raven.Database.Storage.Voron.Impl.TableStorage.Write(WriteBatch writeBatch)
   at Raven.Storage.Voron.TransactionalStorage.ExecuteBatch(Action`1 action)
   at Raven.Storage.Voron.TransactionalStorage.Batch(Action`1 action)
   at Raven.Database.DocumentDatabase.Batch(IList`1 commands, CancellationToken token)
   at Raven.Database.Server.Controllers.DocumentsBatchController.d__8.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Threading.Tasks.TaskHelpersExtensions.d__3`1.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult()
   at System.Web.Http.Controllers.ApiControllerActionInvoker.d__0.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult()
   at System.Web.Http.Controllers.ActionFilterResult.d__2.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult()
   at System.Web.Http.Controllers.ExceptionFilterResult.d__0.MoveNext()
---> (Inner Exception #0) Voron.Exceptions.ScratchBufferSizeLimitException: Cannot allocate more space for the scratch buffer.
Current file size is: 3,136 KB.
Requested size for current file: 2,496 KB.
Requested total size for all files: 6,291,460 KB.
Limit: 6,291,456 KB.
Already flushed and waited for 5,002 ms for read transactions to complete.
Do you have a long running read transaction executing?
Debug info:
Current transaction id: 191182
Requested number of pages: 1 (adjusted size: 1 == 4 KB)
Oldest active transaction: 191181 (snapshot: 191181)
Oldest active transaction when flush was forced: -1
Next write transaction id: 191183
Active transactions:
Id: 191182 - ReadWrite
Id: 191181 - Read
Scratch files usage:
scratch.0000000000.buffers - size: 6,291,520 KB, in active use: 6,288,964 KB
scratch.0000000001.buffers - size: 3,136 KB, in active use: 2,492 KB
Most available free pages:
scratch.0000000000.buffers
Size:1, ValidAfterTransactionId: -1
Size:2, ValidAfterTransactionId: 186151
scratch.0000000001.buffers
Compression buffer size: 24,640 KB

   at Voron.Impl.Scratch.ScratchBufferPool.Allocate(Transaction tx, Int32 numberOfPages)
   at Voron.Impl.Transaction.AllocatePage(Int32 numberOfPages, PageFlags flags, Nullable`1 pageNumber)
   at Voron.Impl.Transaction.ModifyPage(Int64 num, Tree tree, Page page)
   at Voron.Trees.Tree.DirectAdd(MemorySlice key, Int32 len, NodeFlags nodeType, Nullable`1 version)
   at Voron.Trees.Tree.Add(Slice key, Stream value, Nullable`1 version)
   at Voron.Impl.TransactionMergingWriter.HandleOperations(Transaction tx, List`1 writes, CancellationToken token)
   at Voron.Impl.TransactionMergingWriter.HandleActualWrites(OutstandingWrite mine, CancellationToken token)<---

   at Raven.Client.Connection.Implementation.HttpJsonRequest.d__29.MoveNext() in c:\Builds\RavenDB-Stable-3.0\Raven.Client.Lightweight\Connection\Implementation\HttpJsonRequest.cs:line 389
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Raven.Client.Connection.Implementation.HttpJsonRequest.<>c__DisplayClassf.<b__e>d__11.MoveNext() in c:\Builds\RavenDB-Stable-3.0\Raven.Client.Lightweight\Connection\Implementation\HttpJsonRequest.cs:line 200
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Raven.Client.Connection.Implementation.HttpJsonRequest.d__16`1.MoveNext() in c:\Builds\RavenDB-Stable-3.0\Raven.Client.Lightweight\Connection\Implementation\HttpJsonRequest.cs:line 241
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Raven.Client.Connection.Async.AsyncServerClient.<>c__DisplayClass1f8.<b__1f6>d__1fb.MoveNext() in c:\Builds\RavenDB-Stable-3.0\Raven.Client.Lightweight\Connection\Async\AsyncServerClient.cs:line 1515
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Raven.Client.Connection.ReplicationInformerBase`1.d__29`1.MoveNext() in c:\Builds\RavenDB-Stable-3.0\Raven.Client.Lightweight\Connection\ReplicationInformerBase.cs:line 439
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.ValidateEnd(Task task)
   at Raven.Client.Connection.ReplicationInformerBase`1.d__19`1.MoveNext() in c:\Builds\RavenDB-Stable-3.0\Raven.Client.Lightweight\Connection\ReplicationInformerBase.cs:line 334
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Raven.Client.Connection.Async.AsyncServerClient.d__2b3`1.MoveNext() in c:\Builds\RavenDB-Stable-3.0\Raven.Client.Lightweight\Connection\Async\AsyncServerClient.cs:line 0
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Raven.Abstractions.Util.AsyncHelpers.<>c__DisplayClassb`1.<b__8>d__d.MoveNext() in c:\Builds\RavenDB-Stable-3.0\Raven.Abstractions\Util\AsyncHelpers.cs:line 75
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at Raven.Abstractions.Util.AsyncHelpers.RunSync[T](Func`1 task) in c:\Builds\RavenDB-Stable-3.0\Raven.Abstractions\Util\AsyncHelpers.cs:line 90
   at Raven.Client.Connection.ServerClient.Batch(IEnumerable`1 commandDatas) in c:\Builds\RavenDB-Stable-3.0\Raven.Client.Lightweight\Connection\ServerClient.cs:line 318
   at Raven.Client.Document.DocumentSession.SaveChanges() in c:\Builds\RavenDB-Stable-3.0\Raven.Client.Lightweight\Document\DocumentSession.cs:line 728
   at [my code]

The import processes around 15000 records/minute, so around 750 documents/s (15000*3/60) are stored.

The import is done in batches of 256 records (that's 768 documents), each batch saved in its own session.
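
In outline, the import loop looks like this (a minimal sketch; the document store setup and the real entity types are placeholders, not the actual code):

using System.Collections.Generic;
using Raven.Client;

static class ImportLoop
{
    // Sketch of the batching scheme described above: one session per batch of
    // 256 records (768 documents), saved and discarded before the next batch.
    public static void Run(IDocumentStore store, IEnumerable<object> documents)
    {
        var batch = new List<object>();
        foreach (var doc in documents)
        {
            batch.Add(doc);
            if (batch.Count == 256 * 3) // 256 records, 3 documents each
            {
                SaveBatch(store, batch);
                batch.Clear();
            }
        }
        if (batch.Count > 0) // don't lose the final partial batch
            SaveBatch(store, batch);
    }

    static void SaveBatch(IDocumentStore store, List<object> batch)
    {
        using (var session = store.OpenSession())
        {
            foreach (var doc in batch)
                session.Store(doc);
            session.SaveChanges();
        }
    }
}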

I found the following: https://groups.google.com/forum/#!topic/ravendb/b8MP7sZIB6Q, but I don't see any old read transactions in the exception message.

What can I do to make the import run more reliably?

Thanks
Andrej

Oren Eini (Ayende Rahien)

May 20, 2016, 8:59:29 AM
To: ravendb
Do you have any custom configuration for RavenDB?

Andrej Krivulčík

May 20, 2016, 9:50:38 AM
To: RavenDB - 2nd generation document database
No, it's a clean installation without any customization. The database was created with all settings at their defaults (except the storage engine); I also tested with a database without any bundles.

It seems to be related to the fact that I'm storing 3 different entities for each data record. When I stored only one entity, the problem didn't occur. I'm investigating and will report any findings.

Oren Eini (Ayende Rahien)

May 20, 2016, 9:53:35 AM
To: ravendb
What is the size for each document?

Andrej Krivulčík

May 20, 2016, 9:54:17 AM
To: RavenDB - 2nd generation document database
I create 3 entities for each data entry, two types of entities are about 2 kB in size, the third entity is up to 1 kB in size. 

Oren Eini (Ayende Rahien)

May 20, 2016, 9:55:35 AM
To: ravendb
That makes no sense.
The scratch file usage is 6 GB in size; the default value is 512 MB.

Andrej Krivulčík

May 20, 2016, 9:57:28 AM
To: RavenDB - 2nd generation document database
The limit is 6 GB by default:

https://ravendb.net/docs/article-page/3.0/csharp/start/whats-new

3.0.3785 - 2015/08/31
Server
[Voron] increased scratch buffer size to 6144 MB and added a threshold after which indexing/reducing batch sizes will start decreasing,

When I lower it (tried 256 MB), the import fails sooner.

Andrej Krivulčík

May 20, 2016, 10:00:56 AM
To: RavenDB - 2nd generation document database
I just found out that even when I'm importing only one entity type, the problem persists. I didn't let the import run long enough before. I'll try to create an isolated test case and report back here.

Oren Eini (Ayende Rahien)

May 20, 2016, 10:13:23 AM
To: ravendb
Do you have a large number of indexes, including map/reduce?

Andrej Krivulčík

May 20, 2016, 10:26:43 AM
To: RavenDB - 2nd generation document database
No, the database has no indexes.

Oren Eini (Ayende Rahien)

May 20, 2016, 10:28:04 AM
To: ravendb
Can you send a repro?

Andrej Krivulčík

May 20, 2016, 10:29:06 AM
To: RavenDB - 2nd generation document database
I'm trying to isolate the issue; I'll send it as soon as it's ready.

Andrej Krivulčík

May 23, 2016, 4:21:33 AM
To: RavenDB - 2nd generation document database
I couldn't replicate it with synthetic data, so here is a real-data repro:

To reproduce:
1. Download the file http://download.companieshouse.gov.uk/BasicCompanyData-2016-05-01-part1_5.zip and extract it to the working directory.
2. Create a database with the Voron storage engine (http://imgur.com/PtNbCV4).
3. Configure the database connection string for DataServer.
4. Let it run (took about 15 minutes for me).
5. The Voron scratch buffer file grows to 1 GB (http://imgur.com/0kittyb) and would grow further if the processing continued.


using Raven.Client.Document;
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

namespace VoronScratchTest
{
    // 1. Download the file http://download.companieshouse.gov.uk/BasicCompanyData-2016-05-01-part1_5.zip and extract to working directory.
    // 2. Create a database with Voron storage engine (http://imgur.com/PtNbCV4).
    // 3. Configure the database connection string for DataServer.
    // 4. Let it run (took about 15 minutes for me).
    // 5. Voron scratch buffer file grows to 1 GB (http://imgur.com/0kittyb).
    class Program
    {
        static void Main(string[] args)
        {
            var documentStore = new DocumentStore
            {
                ConnectionStringName = "DataServer",
                EnlistInDistributedTransactions = false,
            };
            documentStore.Initialize();

            using (var reader = new StreamReader("BasicCompanyData-2016-05-01-part1_5.csv"))
            {
                var header = reader.ReadLine();
                var headers = header.Split(',').Select(x => x.Trim()).ToArray();
                string line;
                var batch = new List<Company>();
                var position = 0;
                while ((line = reader.ReadLine()) != null && position < 500000)
                {
                    position++;
                    if (position % 1000 == 0)
                    {
                        Console.WriteLine($"Processed {position} records.");
                    }
                    var data = line.Split(new[] { @""",""" }, StringSplitOptions.None); // not correct CSV parsing but not important now
                    var companyAttributes = headers.Take(32).Zip(data.Take(32), (headerName, value) => new { Key = headerName, Value = value })
                        .ToDictionary(x => x.Key, x => x.Value);
                    var company = new Company
                    {
                        Id = companyAttributes["CompanyNumber"],
                        Attributes = companyAttributes,
                    };
                    batch.Add(company);
                    if (batch.Count > 64)
                    {
                        StoreBatch(documentStore, batch);
                        batch = new List<Company>();
                    }
                }
                // Store whatever is left in the final partial batch.
                if (batch.Count > 0)
                {
                    StoreBatch(documentStore, batch);
                }
            }
        }

        private static void StoreBatch(DocumentStore documentStore, List<Company> batch)
        {
            using (var session = documentStore.OpenSession())
            {
                foreach (var company in batch)
                {
                    session.Store(company);
                }
                session.SaveChanges();
            }
        }

        class Company
        {
            public string Id { get; set; }
            public Dictionary<string, string> Attributes { get; set; }
        }
    }
}


Andrej Krivulčík

June 6, 2016, 5:51:08 AM
To: RavenDB - 2nd generation document database
Hi, have you had a chance to look into this issue yet?



Michael Yarichuk

June 6, 2016, 8:29:51 AM
To: RavenDB - 2nd generation document database
Hi,
I've taken a look at this but couldn't replicate it so far. The largest scratch buffer size I've seen is about 300 MB.
What are the specs of the machine you're running this test on? If you try this on a machine significantly faster or slower than the current one, do the test results change?

Andrej Krivulčík

June 6, 2016, 9:38:30 AM
To: RavenDB - 2nd generation document database
Hi, the machine is an Azure "Standard DS3 v2 (4 cores, 14 GB memory)" instance, with the database located on an SSD disk.

How many records did it process? Was the scratch buffer size growing (without shrinking)? If so, removing the "end condition" (https://gist.github.com/krivulcik/f8e5efad7571abc352414c496a00a994#file-program-cs-L34) to let it process all 850k records might help replicate the issue. Also, processing more files downloadable from http://download.companieshouse.gov.uk/en_output.html might help.

I'll try to replicate on a different machine and report.

Michael Yarichuk

June 6, 2016, 9:52:10 AM
To: RavenDB - 2nd generation document database
The scratch buffer seems to grow only up to a certain size. Also, I tried removing the limit; it finished processing all the records without much growth in the scratch buffer.

I will continue to investigate on my end; the current suspect is the way locks are taken for flushing journals and scratch buffers under high load.

Andrej Krivulčík

June 6, 2016, 9:55:07 AM
To: RavenDB - 2nd generation document database
I have one more fact that might help in replicating the issue: the import itself runs on a different Azure instance ("Standard D2 (2 cores, 7 GB memory)") and connects to the database remotely.

Oren Eini (Ayende Rahien)

June 6, 2016, 10:06:07 AM
To: ravendb
What happens if they run on the same server?

Andrej Krivulčík

June 6, 2016, 10:36:33 AM
To: RavenDB - 2nd generation document database
Almost exactly the same thing: the scratch buffer file grew to 3 GB after processing 850k records. The growth rate was inconsistent; I noticed roughly the following:
500k - 768 MB
550k - 1 GB
650k - 3 GB
850k (after completely processing the file linked in the source) - 3 GB

Oren Eini (Ayende Rahien)

June 7, 2016, 3:11:55 AM
To: ravendb
Can you check the I/O rates while this is going on?

Oren Eini (Ayende Rahien)

June 7, 2016, 3:12:12 AM
To: ravendb
It looks like we have to increase the scratch size because we can't flush to disk fast enough.

Andrej Krivulčík

June 7, 2016, 3:12:58 AM
To: RavenDB - 2nd generation document database
I did a test on my local dev machine with the following specs:
Intel Core i5-4310U 2.00 GHz (working at 2.60 GHz)
16 GB RAM
SSD disk
unlicensed RavenDB installation (Development only)

The scratch buffer growth was much slower:
after 1 file (850k records): 384 MB
after 2 files (1700k records): 640 MB
after 3 files (2550k records): still 640 MB

Oren Eini (Ayende Rahien)

June 7, 2016, 3:16:59 AM
To: ravendb
Okay, so this is likely an issue with the disk I/O not keeping up, and us needing to increase the buffer size.
IIRC, there is a way to set rate limits on smuggler so you can slow down and not overload the server.
But even so, isn't the default scratch size 6 GB? So you shouldn't actually be getting the error.

Andrej Krivulčík

June 7, 2016, 3:33:48 AM
To: RavenDB - 2nd generation document database
I uploaded a screenshot of the progress: https://imgur.com/a/etPry
Not sure if these are the statistics you need to see; if you need other information, please let me know how to obtain it.

Andrej Krivulčík

June 7, 2016, 3:37:02 AM
To: RavenDB - 2nd generation document database
I'm going to try a slower import now. I'm not importing using smuggler; I'm using my own program.

The default scratch size is 6 GB, but after completing one file, when I try to import another one, the scratch file grows even more until it hits the limit. This happens even after quite a long period of time. IIRC, I ran another import after the server had been sitting idle overnight, and the scratch file still hadn't shrunk. Restarting the database helps, but I wasn't able to import whole files (850k records each) between restarts: the scratch file grew to 6 GB during the processing of a single 850k-record file.

Oren Eini (Ayende Rahien)

June 7, 2016, 3:39:49 AM
To: ravendb
What is the F: drive?

Andrej Krivulčík

June 7, 2016, 3:40:29 AM
To: RavenDB - 2nd generation document database
That's where both the database and data files are located (SSD drive).

Oren Eini (Ayende Rahien)

June 7, 2016, 3:45:19 AM
To: ravendb
It seems to be running under a bit of load.
Can you run the I/O test on that?

Andrej Krivulčík

June 7, 2016, 3:51:19 AM
To: RavenDB - 2nd generation document database
https://imgur.com/QbN7Mz6

The bottom bar covers the percentiles:

Write latency percentiles:
50%: 2.00, 75%: 2.00, 95%: 3.00, 99%: 4.00, 99.9%: 5.00, 99.99%: 5.00

Oren Eini (Ayende Rahien)

June 7, 2016, 3:55:02 AM
To: ravendb
This is _very_ slow.
You are writing about 1.7 MB/s?
And you were able to write just 50 MB in 30 seconds?

If you enable buffering, what do you get?

Andrej Krivulčík

June 7, 2016, 4:04:46 AM
To: RavenDB - 2nd generation document database
There's something weird in the statistics in RavenDB Studio.
I ran the same test (all values default, using drive F); here's a screenshot from the server during the test, plus the results: https://imgur.com/a/6AZpE

Resource Monitor shows 33 MB/s of writes, while the results show 0 for the first ~18 seconds and 5 MB/s after that?

Andrej Krivulčík

June 7, 2016, 4:20:35 AM
To: RavenDB - 2nd generation document database
The results are much better with buffering enabled: https://imgur.com/p2URO5K

Oren Eini (Ayende Rahien)

June 7, 2016, 4:23:21 AM
To: ravendb
And the problem is that RavenDB needs to write with buffering off, to make sure that we are actually hitting the disk and the data is safe.
I suggest pinging Azure support about those numbers.
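
To see how much write-through alone costs on a given disk, here is a minimal FileStream sketch (not RavenDB code): the same run of 4 KB writes, once with OS buffering and once with FileOptions.WriteThrough. This only approximates RavenDB's journal writes, which also use unbuffered I/O flags that FileStream does not expose directly.

using System;
using System.Diagnostics;
using System.IO;

class WriteThroughBench
{
    const int PageSize = 4096;
    const int Pages = 2048; // 8 MB total

    static void Main()
    {
        Console.WriteLine($"buffered:      {Run(FileOptions.None)} ms");
        Console.WriteLine($"write-through: {Run(FileOptions.WriteThrough)} ms");
    }

    static long Run(FileOptions options)
    {
        var page = new byte[PageSize];
        var sw = Stopwatch.StartNew();
        using (var fs = new FileStream("bench.tmp", FileMode.Create,
            FileAccess.Write, FileShare.None, PageSize, options))
        {
            for (var i = 0; i < Pages; i++)
                fs.Write(page, 0, page.Length);
            fs.Flush(true); // flush OS buffers so the buffered run is honest
        }
        File.Delete("bench.tmp");
        return sw.ElapsedMilliseconds;
    }
}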

Andrej Krivulčík

June 7, 2016, 4:29:19 AM
To: RavenDB - 2nd generation document database
Do you have any thoughts about the discrepancy between the write rate reported by resource monitor and by RavenDB Studio? https://imgur.com/a/6AZpE

Oren Eini (Ayende Rahien)

June 7, 2016, 4:33:11 AM
To: ravendb
It is measuring data from other processes (System, in this case), not RavenDB.

Andrej Krivulčík

June 7, 2016, 4:35:36 AM
To: rav...@googlegroups.com

But still, the process directly below the highlighted one is RavenDB and it shows ~8 MB/s, which is higher than the ~6 MB/s reported in the studio chart (and 1.96 MB/s average).

Andrej Krivulčík

June 7, 2016, 10:26:12 AM
To: RavenDB - 2nd generation document database
Is it possible to turn this behavior off? In our use case it would be okay to use buffering for storage: in case of a failure we can re-run the import, and lost data is not a big concern (corrupted data would be, though).

Also, is it possible to tweak the size of the writes going to the database? Azure storage is optimized for larger writes (256 kB), and I'm not sure how big Voron's writes from scratch to storage are.

Oren Eini (Ayende Rahien)

June 8, 2016, 6:48:23 AM
To: ravendb
When using Esent, you can use lazy transactions, which might help in this situation.
But I suggest increasing the IOPS for better performance overall.

Andrej Krivulčík

June 8, 2016, 7:26:11 AM
To: RavenDB - 2nd generation document database
Thanks for the suggestions.

I did more testing yesterday and today and found the following:
  • Azure limits the IOPS in this case to 1,500 writes/s. This is the cause of the IO Test result (the top value in the chart is ~6 MB/s, which is 1,500 × 4 kB). When I increase the chunk size to 64 kB or 256 kB, the performance improves considerably: the speed is ~28 MB/s and ~86 MB/s, respectively (still short of the theoretical limits of 96 MB/s and 684 MB/s, respectively).
  • The import in question is processed at around 100k records/minute. The records are between 1 and 2 kB each; calculating with 2 kB, that's 200,000 kB/min = 3,333 kB/s ≈ 3.25 MB/s. Even at 4 kB per record (and thus the 6 MB/s write limit), that should be enough to store the data. Can the internal processes cause the data to be written over so many times that the disk can't keep up and data accumulates in the scratch buffers? (The arithmetic is sanity-checked in the sketch below.)
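
A quick sanity check of the arithmetic above: when a disk is IOPS-limited, the sequential-write ceiling is simply IOPS × chunk size (using 1 MB = 1,000 kB, matching the numbers in this thread).

using System;

class IopsCeiling
{
    static void Main()
    {
        const int iops = 1500; // the Azure write limit observed in the IO Test
        foreach (var chunkKb in new[] { 4, 64 })
            Console.WriteLine($"{chunkKb,2} kB chunks -> {iops * chunkKb / 1000} MB/s ceiling");
        // Prints: 4 kB -> 6 MB/s, 64 kB -> 96 MB/s, matching the figures above.
    }
}
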
I reformatted the NTFS partition to use an "Allocation unit size" of 64 kB, but the rates remained the same.

The slides at http://www.slideshare.net/ayenderahien/voron mention that the page size for Voron is 4 kB (slide 8); I'm not sure if this is the size of the blocks written to the database. Can this value be changed?

We really want to use Voron, as Esent support will be removed in the 4.0 release and this is an ideal opportunity for us to test it.

So, is the 1,500 IOPS limit too low for this scenario? Do you have any further thoughts/comments?

Oren Eini (Ayende Rahien)

June 8, 2016, 8:10:00 AM
To: ravendb
Voron uses 4 KB writes, but that actually just means that it will write in multiples of 4 KB.
In 3.x, that means that the minimum write is 8 KB in size.
But we do that with no buffering, to ensure that we get persistence. There is no support for lazy transactions in Voron.

Andrej Krivulčík

June 13, 2016, 9:29:15 AM
To: RavenDB - 2nd generation document database
I hope this will help someone.

I tried increasing the scratch buffer size limit, and after that the import worked fine. The scratch buffer usage topped out at around 8-9 GB.
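
For anyone hitting this later: if I recall the RavenDB 3.0 configuration docs correctly, the setting in question is Raven/Voron/MaxScratchBufferSize (value in MB, default 6144), set as an appSetting in Raven.Server.exe.config for a standalone server. For an embedded store, something like the following sketch should work (the property path is an assumption based on the 3.0 configuration surface; verify it against your build):

using Raven.Client.Embedded;

class ScratchBufferConfig
{
    static EmbeddableDocumentStore CreateStore()
    {
        var store = new EmbeddableDocumentStore
        {
            DataDirectory = "Data",
        };
        // Assumed property path mirroring the Raven/Voron/MaxScratchBufferSize
        // appSetting; the value is in MB. 10240 MB = 10 GB, comfortably above
        // the 8-9 GB peak usage seen here.
        store.Configuration.Storage.Voron.MaxScratchBufferSize = 10 * 1024;
        store.Initialize();
        return store;
    }
}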

Before that, I tried several approaches, none of which helped:
  • Using larger and faster Azure disks. P30 drives are supposed to have more than twice the performance of the P20 drives used earlier.
  • Using multiple drives in striped volumes. This is supposed to multiply the write throughput by the number of drives.
  • Slowing the import.
  • Restarting RavenDB between import sets.
  • Waiting long periods of time between import sets.
In all cases, the scratch file grew until it reached the limit, regardless of the drive speed. Also, when IO testing the faster drives, the limit was always 1,500 writes/s (6 MB/s with 4 kB writes); I never found out why.