Import stops with Raven Smuggler tool


Tratjen

Jul 4, 2014, 3:04:18 AM
to rav...@googlegroups.com
I want to import a lot of documents into my database (around 40 million documents; the dump file is around 7 GB) and I am using the Smuggler tool.

The current database is build #2879, and the database from which the dump file was created was build #2370.

After around 550 000 documents it just suddenly stops, every time. It does not stop at exactly the same count each time - one run may stop at 530 000 and another at 565 000.

I have attached a screenshot here of how it looks. 

I know this isn't much to go on, but I haven't found anything by monitoring resources, disk space, etc...


ravendb_import_error.png

Oren Eini (Ayende Rahien)

Jul 4, 2014, 3:05:49 AM
to ravendb
Is it possible that you have a request timeout defined in IIS that is stopping this?
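If so, something like this in web.config might raise it (just a sketch, assuming the limit in play is the ASP.NET execution timeout, which is given in seconds):

    <!-- sketch only: raise the ASP.NET request execution timeout to one hour -->
    <system.web>
      <httpRuntime executionTimeout="3600" />
    </system.web>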

If you are using a recent Smuggler build, try --batch-limit or --limit (I don't recall which it is).



Oren Eini

CEO


Mobile: + 972-52-548-6969

Office:  + 972-4-622-7811

Fax:      + 972-153-4622-7811






Tratjen

Jul 4, 2014, 3:57:02 AM
to rav...@googlegroups.com
It is set to 120 seconds, and no request seems to take a long time at all. Even if the whole process counted as one request, that doesn't seem to be the problem.

I have now tried both --batch-limit and --limit, without any real difference. I also tried --wait-for-indexing (or something similar); the import ran slightly slower, but it still stopped, without errors, before reaching 600 000 documents.

Tratjen

Jul 4, 2014, 4:47:17 AM
to rav...@googlegroups.com
Ok, you might be right about the timeout in IIS. I increased it substantially and now 4 million documents have been imported.

Tratjen

Jul 5, 2014, 4:19:06 AM
to rav...@googlegroups.com
I may have some additional issues... 

After the import is done:

Wrote 1,024 (total 45,873,316 documents to server gzipped to 121 kb
Read 45875200 documents
Done with reading documents, total: 45875694
Begin reading attachments
Done with reading attachments, total: 0
Begin reading transformers
Done with reading transformers, total: 0
Imported 45,875,694 documents and 0 attachments in 10,771,923 ms
Wrote 1,024 (total 45,874,340 documents to server gzipped to 126 kb
Wrote 1,024 (total 45,875,364 documents to server gzipped to 131 kb
Wrote 330 (total 45,875,694 documents to server gzipped to 39 kb
Finished writing all results to server
Done writing to server


I still can't see any documents in my database and have no idea where anything has been imported! It looks like it just did nothing. The data is not in the system database, nor in any other database. I may have close to 1 GB less disk space, but I am not sure.

I used the following command:

Raven.Smuggler in http://localhost:8081/ dump.raven --database[=main]

The dump file was created using an older (#2370) version of RavenDB. 

The only way I can think of to solve this is to import the dump into the same old version, copy the data into the new version of RavenDB and let RavenDB migrate it, then export and import again.

The problem with that is that I can't import the old data into an old database, because of an error halfway through. Looking up the issue in this group, I found out that I had to update the database - hence this situation.

Oren Eini (Ayende Rahien)

Jul 6, 2014, 7:30:22 AM
to ravendb
Go to http://localhost:8081/docs
What do you see there?




Tratjen

Jul 6, 2014, 1:20:12 PM
to rav...@googlegroups.com
I can see documents, but only documents that I already had in the database from before...

Oren Eini (Ayende Rahien)

Jul 6, 2014, 1:25:09 PM
to ravendb

WHAT docs are those

Tratjen

Jul 6, 2014, 2:00:28 PM
to rav...@googlegroups.com
Ok, it actually seems to be documents of the type "messages" that I have been trying to import, and the "Last-Modified" attribute has a timestamp indicating that these documents have been imported. What documents are supposed to be under /docs? How are they sorted, given that I can only see 25 documents?

On Sunday, July 6, 2014 7:25:09 PM UTC+2, Oren Eini wrote:

WHAT docs are those

Oren Eini (Ayende Rahien)

Jul 6, 2014, 11:27:04 PM
to ravendb
All docs appear under /docs

You can actually see all of them if you go to the system database in the UI.

Or use ?pageSize=1024&start=0 to page through them.
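For example, to walk through the documents page by page (keeping pageSize fixed and advancing start):

    http://localhost:8081/docs?pageSize=1024&start=0
    http://localhost:8081/docs?pageSize=1024&start=1024
    http://localhost:8081/docs?pageSize=1024&start=2048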




Tratjen

Jul 7, 2014, 4:08:33 AM
to rav...@googlegroups.com
I still have trouble understanding exactly what's going on. One of the reasons is that, before the migration to the newest stable version, I tried to import the data a few times, and each attempt got about halfway.

Under localhost:8081/docs I can see documents from the dump file with timestamps indicating that they were put there by the most recent import. However, it is only 1 million documents - not the expected 40+ million.

Under localhost:8081/databases/main/docs I can only see the documents that are also visible in the graphical UI... which is about half of the documents from the most recent import, plus all of the documents from the earlier imports that got about halfway.

Any idea what's going on?

Oren Eini (Ayende Rahien)

Jul 7, 2014, 4:12:46 AM
to ravendb
Delete the database and try again from scratch.
Make _sure_ that the URL you use is correct.
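For example, targeting the database through the URL (a sketch of the form used later in this thread, assuming the database is named main):

    Raven.Smuggler in http://localhost:8081/databases/main dump.raven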

Tratjen

Jul 7, 2014, 5:01:32 AM
to rav...@googlegroups.com
Delete the database "main", or the entire installation of RavenDB at port 8081?

Oren Eini (Ayende Rahien)

Jul 7, 2014, 6:55:03 AM
to ravendb
The entire thing

Tratjen

Jul 10, 2014, 12:56:57 PM
to rav...@googlegroups.com
Ok, so I had time to try this. I deleted everything and started over from scratch. Judging by the command prompt everything imports, but checking the database I can see that only the first 23552 documents get imported.

Any idea why?

Oren Eini (Ayende Rahien)

Jul 10, 2014, 1:19:56 PM
to ravendb

Maybe they weren't indexed yet?
What do the stats say?

Tratjen

Jul 10, 2014, 4:53:36 PM
to rav...@googlegroups.com
No, I have waited quite a while; no additional documents are being indexed and none are stale. I have attached a screenshot of the stats from the index raven/bydocumentsname.

Tratjen

Jul 10, 2014, 5:24:28 PM
to rav...@googlegroups.com
Looking in the logs, I can see an error (logged as "Info") around the time it stopped actually importing, which says "Failed to execute background task 3", followed by:

System.AggregateException: One or more errors occurred. ---> System.Web.HttpException: Maximum request length exceeded.
   at System.Web.HttpBufferlessInputStream.ValidateRequestEntityLength()
   at System.Web.HttpBufferlessInputStream.Read(Byte[] buffer, Int32 offset, Int32 count)
   at System.IO.Stream.ReadByte()
   at Raven.Database.Util.Streams.PartialStream.Dispose(Boolean disposing) in c:\Builds\RavenDB-Stable\Raven.Database\Util\Streams\PartialStream.cs:line 77
   at System.IO.Stream.Close()
   at Raven.Database.Server.Responders.BulkInsert.<YieldBatches>d__6.System.IDisposable.Dispose() in c:\Builds\RavenDB-Stable\Raven.Database\Server\Responders\BulkInsert.cs:line 0
   at Raven.Database.DocumentDatabase.<>c__DisplayClass131.<BulkInsert>b__12e(IStorageActionsAccessor accessor) in c:\Builds\RavenDB-Stable\Raven.Database\DocumentDatabase.cs:line 2513
   at Raven.Storage.Esent.TransactionalStorage.ExecuteBatch(Action`1 action, EsentTransactionContext transactionContext) in c:\Builds\RavenDB-Stable\Raven.Database\Storage\Esent\TransactionalStorage.cs:line 677
   at Raven.Storage.Esent.TransactionalStorage.Batch(Action`1 action) in c:\Builds\RavenDB-Stable\Raven.Database\Storage\Esent\TransactionalStorage.cs:line 628
   at Raven.Database.Server.Responders.BulkInsert.<>c__DisplayClass4.<Respond>b__1() in c:\Builds\RavenDB-Stable\Raven.Database\Server\Responders\BulkInsert.cs:line 59
   at System.Threading.Tasks.Task.Execute()
   --- End of inner exception stack trace ---
---> (Inner Exception #0) System.Web.HttpException (0x80004005): Maximum request length exceeded.
   at System.Web.HttpBufferlessInputStream.ValidateRequestEntityLength()
   at System.Web.HttpBufferlessInputStream.Read(Byte[] buffer, Int32 offset, Int32 count)
   at System.IO.Stream.ReadByte()
   at Raven.Database.Util.Streams.PartialStream.Dispose(Boolean disposing) in c:\Builds\RavenDB-Stable\Raven.Database\Util\Streams\PartialStream.cs:line 77
   at System.IO.Stream.Close()
   at Raven.Database.Server.Responders.BulkInsert.<YieldBatches>d__6.System.IDisposable.Dispose() in c:\Builds\RavenDB-Stable\Raven.Database\Server\Responders\BulkInsert.cs:line 0
   at Raven.Database.DocumentDatabase.<>c__DisplayClass131.<BulkInsert>b__12e(IStorageActionsAccessor accessor) in c:\Builds\RavenDB-Stable\Raven.Database\DocumentDatabase.cs:line 2513
   at Raven.Storage.Esent.TransactionalStorage.ExecuteBatch(Action`1 action, EsentTransactionContext transactionContext) in c:\Builds\RavenDB-Stable\Raven.Database\Storage\Esent\TransactionalStorage.cs:line 677
   at Raven.Storage.Esent.TransactionalStorage.Batch(Action`1 action) in c:\Builds\RavenDB-Stable\Raven.Database\Storage\Esent\TransactionalStorage.cs:line 628
   at Raven.Database.Server.Responders.BulkInsert.<>c__DisplayClass4.<Respond>b__1() in c:\Builds\RavenDB-Stable\Raven.Database\Server\Responders\BulkInsert.cs:line 59
   at System.Threading.Tasks.Task.Execute()<---

Oren Eini (Ayende Rahien)

Jul 10, 2014, 11:05:06 PM
to ravendb
You need to extend the request length, because IIS is cutting us off.




Tratjen

Jul 11, 2014, 5:19:42 AM
to rav...@googlegroups.com
I changed the web.config to include <httpRuntime requestPathInvalidCharacters="&lt;,&gt;,%,&amp;,:,\,?" maxRequestLength="2097151"/>

It certainly made a difference: I could now import 15 673 344 documents before it stopped. Looking at the size of the gzipped data sent to the server, it seems to correlate with maxRequestLength. I have googled the issue and it doesn't seem possible to increase maxRequestLength beyond 2097151 - is there a way around this?

Oren Eini (Ayende Rahien)

Jul 11, 2014, 9:18:55 AM
to ravendb
Yes, use --limit (or --batch-limit); it will partition the data.
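For example (a sketch reusing the option spelling from earlier in this thread; the exact name may vary between builds):

    Raven.Smuggler in http://localhost:8081/databases/main dump.raven --batch-limit=50000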

Oren Eini (Ayende Rahien)

Jul 11, 2014, 9:19:08 AM
to ravendb
Also, in IIS8, you can pass more than 2GB there.
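If the 2 GB ceiling is coming from IIS request filtering rather than ASP.NET, that is a separate setting (a sketch; note that maxAllowedContentLength is in bytes, whereas maxRequestLength is in KB):

    <!-- sketch only: raise the IIS request filtering limit to ~4 GB -->
    <system.webServer>
      <security>
        <requestFiltering>
          <requestLimits maxAllowedContentLength="4294967295" />
        </requestFiltering>
      </security>
    </system.webServer>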




Tratjen

Jul 12, 2014, 1:18:41 PM
to rav...@googlegroups.com
This is starting to take a long time just to import a backup... 

a) I don't have the option to upgrade IIS right now...

b) I have tried every combination of --limit=X and --batch-limit[=X] that I can think of, and I get stuck at exactly the same number of documents each time (at the 2 GB mark). Can you explain exactly how to use these options, and how to tell whether they are working?

Some examples of what I have tried:

Raven.Smuggler in http://localhost:9090/databases/main x.raven --limit=50000

Raven.Smuggler in http://localhost:9090/databases/main x.raven --batch-limit=50000

Raven.Smuggler in http://localhost:9090/databases/main x.raven --limit[=50000]

Raven.Smuggler in http://localhost:9090/databases/main x.raven --batch-limit[=50000]

Oren Eini (Ayende Rahien)

Jul 13, 2014, 11:24:11 AM
to ravendb
What build of Smuggler are you using?

Raven.Smuggler.exe should give you the command line interface for that.

Tratjen

Jul 14, 2014, 1:35:22 AM
to rav...@googlegroups.com
The build is 2879, and yes, I get the command line interface, but it doesn't really work. I started working around the issue using --metadata-filter, but it only worked for the first document type. For the second document type I get this error:

Done with reading documents, total: 330
Begin reading attachments
Done with reading attachments, total: 0
Begin reading transformers
Done with reading transformers, total: 0
Imported 330 documents and 0 attachments in 3,282,890 ms
Finished writing all results to server
System.AggregateException: One or more errors occurred. ---> System.InvalidOperationException: Forbidden Forbidden
{"Error":"This single use token has expired"}
 ---> System.Net.WebException: The remote server returned an error: (403) Forbidden.

Tratjen

Jul 14, 2014, 1:36:39 AM
to rav...@googlegroups.com
And to clarify: no documents are actually added to the database. Looking at /docs I can only see documents of the first type.

Oren Eini (Ayende Rahien)

Jul 14, 2014, 1:37:20 AM
to ravendb
Can you go on Skype?

Tratjen

Jul 14, 2014, 1:58:23 AM
to rav...@googlegroups.com
Yes, what is your name on Skype?

Oren Eini (Ayende Rahien)

Jul 14, 2014, 2:19:53 AM
to ravendb
ayenderahien