Bulk Deletion of Documents in RavenDB


Frodo35

May 10, 2011, 2:38:28 PM
to ravendb
Hi, I have a test program and I want to re-run it with multiple
changes. Before each run I want to BULK delete all the items in the
database, rather than getting a list of all the items and
deleting them one by one. Is there a simple call to do that? I am
using the .NET 4.0 lightweight client.

Thanks,

Mike

Itamar Syn-Hershko

May 10, 2011, 2:40:02 PM
to rav...@googlegroups.com
Just delete the data folder instead, or run the database in memory in the first place if that makes sense.

It's going to be much faster.

Frodo35

May 10, 2011, 2:57:37 PM
to ravendb
Thanks,

I have been using the DeleteByIndex() method, but it is very slow
and often runs into timeouts. I inserted a bit over
500K Stack Overflow users (from their data dump) and wanted to delete them
to test different load strategies for performance. For my
eventual application I am also looking for the best way to handle large
deletions, and this doesn't seem very good at the moment.

Itamar Syn-Hershko

May 10, 2011, 3:03:08 PM
to rav...@googlegroups.com
Again, for such tests you should run in-memory (Munin storage) or delete the Data folder of your application (if you want to test using Esent).

What scenarios do you expect to have in your application requiring bulk deletes?

Bulk deletes are known to be slow, and they are considered a rare use case. See https://gist.github.com/ravendb/ravendb/issues/205.

Mike Wade

May 10, 2011, 3:06:50 PM
to rav...@googlegroups.com

Thanks for your kind responses. I have a scenario where I create very large repositories of objects (documents, actually), tens to hundreds of millions of them, and host them for long-term review. I am trying to determine whether I can leverage a NoSQL system to improve my scalability. The queries on date and metadata work really well, but occasionally I need to perform fairly large deletions: entire populations of data, or large numbers of items based on a key (e.g. all documents before a certain date/time). Given the volume and size of the data, it is fairly important to me that deletion can run in a defined time frame. Getting timeouts while trying to delete only 500K items makes me think I would have to use the embedded client API to work directly with the server(s).

Sincerely yours,

Michael Wade

CTO/EVP

Planet Data Solutions

Direct: (720) 851-2295

Office: (914) 593-6900

Mob: (914) 886-3620

Itamar Syn-Hershko

May 10, 2011, 3:14:20 PM
to rav...@googlegroups.com
Dropping a whole RavenDB Collection could be optimized rather easily - is this your case?

Mike Wade

May 10, 2011, 3:25:34 PM
to rav...@googlegroups.com

For this particular case, yes, although in the future I will be doing selective deletions.

 


Ayende Rahien

May 11, 2011, 3:12:14 AM
to rav...@googlegroups.com
Mike,
DeleteByIndex is the way to go here. You get timeouts because the request took over 15 seconds. You can increase the request timeout, mind you. Please note that even so, deletion is a long process in RavenDB.
It is generally advisable to do this at the database level (have a tenant db for those sorts of data), and then just delete the whole database when you want to get rid of the data.


Frodo35

May 11, 2011, 10:53:12 AM
to ravendb
I have searched around and can't find how to change this particular
timeout value. Any pointers?

Thanks,

Ayende Rahien

May 11, 2011, 11:48:39 AM
to rav...@googlegroups.com
documentStore.JsonRequestFactory.CustomizeRequest