How do I delete all entities of a specific kind in Datastore?


Ralf Rottmann

Apr 30, 2014, 6:08:27 PM
to google-ap...@googlegroups.com
How do I delete all entities of a specific kind in Datastore? Programmatically, not using the GAE Console.
GetAll doesn't help, as it has a 1000-entity limit.


David Symonds

Apr 30, 2014, 6:16:30 PM
to Ralf Rottmann, google-appengine-go
You will need to iterate through the entities and delete in batches.
Using datastore cursors and taskqueue can make it relatively easy.

Note that deleting an entity incurs datastore operation costs
equivalent to writing to all of its properties, since indexes need to
be updated, so watch that you don't blow out your quota. Having a
dedicated queue with a moderate rate can help control that.
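
A minimal sketch of the "delete in batches" step, using only the standard library. The helper name chunk and the string keys are assumptions made for illustration; in a real handler the slice would hold *datastore.Key values and each batch would go to one DeleteMulti call. The batch size of 500 matches the limit reported later in this thread.

```go
package main

import "fmt"

// chunk splits keys into consecutive batches of at most size elements.
// It panics if size is not positive. (Hypothetical helper, not part of
// the datastore package.)
func chunk(keys []string, size int) [][]string {
	if size <= 0 {
		panic("chunk: size must be positive")
	}
	var batches [][]string
	for len(keys) > 0 {
		n := size
		if len(keys) < n {
			n = len(keys)
		}
		batches = append(batches, keys[:n])
		keys = keys[n:]
	}
	return batches
}

func main() {
	// Stand-in keys; a real app would collect these from a keys-only query.
	keys := make([]string, 1200)
	for i := range keys {
		keys[i] = fmt.Sprintf("key-%04d", i)
	}
	for i, b := range chunk(keys, 500) {
		// Each batch would be handed to one delete call (or one task).
		fmt.Printf("batch %d: %d keys\n", i, len(b))
	}
}
```

Handing each batch to its own task on a rate-limited queue is one way to keep the write cost under control, as suggested above.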

Glenn Lewis

Apr 30, 2014, 6:31:52 PM
to Ralf Rottmann, google-appengine-go
This is totally off the top of my head... so it may not even compile, but here is something to get you started that you can try out:

import (
	"net/http"

	"appengine"
	"appengine/datastore"
)

// EntryKind is a placeholder; set it to the kind you want to delete.
const EntryKind = "Entry"

func handle(w http.ResponseWriter, req *http.Request) {
	c := appengine.NewContext(req)
	// A keys-only query avoids fetching the entities themselves.
	q := datastore.NewQuery(EntryKind).KeysOnly()
	keys := []*datastore.Key{}
	// Collect every key of the kind.
	for t := q.Run(c); ; {
		k, err := t.Next(nil)
		if err == datastore.Done {
			break
		}
		if err != nil {
			c.Errorf("datastore t.Next error: %v", err)
			return
		}
		keys = append(keys, k)
	}
	if len(keys) == 0 {
		return
	}
	// Delete all collected keys in one call.
	if err := datastore.DeleteMulti(c, keys); err != nil {
		c.Errorf("datastore.DeleteMulti error: %v", err)
		return
	}
	c.Infof("Deleted %d entries.", len(keys))
}

Note that you only have 60 seconds to do all this... unless you use a taskqueue...
Another alternative is to use "Query.GetAll" (with its limit of 1000) and call it multiple times until everything is gone.

Anyway, I hope that helps.
-- Glenn



Ralf Rottmann

Apr 30, 2014, 6:53:55 PM
to google-ap...@googlegroups.com, Ralf Rottmann
Glenn,

Almost worked like a charm, but it seems DeleteMulti can only delete 500 records in a batch. I'd love it if these limits were clearly stated in the Go for App Engine docs.

So I ended up doing this:

func DeleteKind(datastoreKind string, r *http.Request) (int64, error) {
	c := appengine.NewContext(r)
	q := datastore.NewQuery(datastoreKind).KeysOnly()

	var i int64
	var exit bool
	for !exit {
		count := 0
		keys := []*datastore.Key{}
		// Collect at most 500 keys per pass, then delete them.
		for t := q.Run(c); ; {
			key, err := t.Next(nil)
			if err == datastore.Done {
				exit = true
				break
			}
			if err != nil {
				// Without this check a failing query would append nil keys.
				return i, err
			}
			keys = append(keys, key)
			i++
			count++
			if count >= 500 {
				break
			}
		}
		if len(keys) == 0 {
			return i, nil
		}
		if err := datastore.DeleteMulti(c, keys); err != nil {
			return i, err
		}
	}
	return i, nil
}

Best,
-Ralf

David Byttow

Apr 30, 2014, 6:56:35 PM
to Ralf Rottmann, google-ap...@googlegroups.com
You don't need count >= 500 if you specify a Limit.

And, as Glenn says, you have limited time to do this...



David Byttow

Apr 30, 2014, 6:58:01 PM
to Ralf Rottmann, google-ap...@googlegroups.com
Also, that is probably not a good way to do it, because the kind index is eventually consistent. You should use a cursor to iterate through the entire keyspace and delete in batches.

Ralf Rottmann

Apr 30, 2014, 7:15:41 PM
to google-ap...@googlegroups.com, Ralf Rottmann
David, would you mind elaborating a bit on

>  because the kind index is eventually consistent

Ralf Rottmann

Apr 30, 2014, 7:18:26 PM
to google-ap...@googlegroups.com, Ralf Rottmann
> you don't need count >= 500 if you specify a Limit.

But as far as I understand this (the docs are not particularly thorough with these details), I'd run into the datastore.Done return value for err when I hit 500, though there are potentially thousands of entities left in the datastore.

What I really want is to grab batches of 500. Can I do this with Limit? Is there a datastore.Done and datastore.CompletelyDone (== empty) err value?

David Symonds

Apr 30, 2014, 9:16:16 PM
to Ralf Rottmann, google-appengine-go
You can just do a query with a Limit of 500, and use GetAll to get them
in one hit (no loop required), and then pass the keys to DeleteMulti.
And the old school way is to order that query by key (i.e.
Order("__key__")), remember the last key you saw, and use that as a
Filter in the next query (i.e. Filter("__key__ >", prevFinalKey)), and
so on. Cursors are the better way to do that now, but are a little
more subtle.

Glenn Lewis

Apr 30, 2014, 9:16:36 PM
to Ralf Rottmann, google-appengine-go
He's saying that even if you blow away 500 entities, and then immediately do another Query(), you may get some keys that were just deleted because this is not a strongly-consistent query... it is eventually consistent, which is an excellent point.
I don't have time right now to put together an example with a cursor, but I think David is on the right track.
But even if you do get duplicates, if you are able to blow away all your entities within 60 seconds, then this will probably work.

Since you are hitting a 500-key limit anyway, then it doesn't make sense to use the t.Next() version with a loop... you might as well just call ...Limit(500).GetAll(...)
-- Glenn
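
Since no cursor example appears in the thread, here is a rough sketch of the idea against an in-memory stand-in, so it runs with the standard library alone. Everything named here (fakeStore, page, deleteMulti, deleteAll) is hypothetical. In the real appengine/datastore package the resume position would be an opaque datastore.Cursor obtained from Iterator.Cursor and applied with Query.Start; this sketch imitates that by using the last key seen as the cursor and assumes non-empty keys.

```go
package main

import (
	"fmt"
	"sort"
)

// fakeStore is an in-memory stand-in for all entities of one kind.
type fakeStore struct {
	keys map[string]bool
}

// page returns up to limit keys that sort strictly after cursor, plus the
// cursor to pass to the next call. An empty cursor starts at the beginning.
func (s *fakeStore) page(cursor string, limit int) (keys []string, next string) {
	all := make([]string, 0, len(s.keys))
	for k := range s.keys {
		if k > cursor {
			all = append(all, k)
		}
	}
	sort.Strings(all)
	if len(all) > limit {
		all = all[:limit]
	}
	next = cursor
	if len(all) > 0 {
		next = all[len(all)-1]
	}
	return all, next
}

// deleteMulti removes the given keys from the store.
func (s *fakeStore) deleteMulti(keys []string) {
	for _, k := range keys {
		delete(s.keys, k)
	}
}

// deleteAll walks the keyspace once, resuming each page from the previous
// cursor, and returns the number of keys deleted.
func deleteAll(s *fakeStore, batchSize int) int {
	deleted := 0
	cursor := ""
	for {
		keys, next := s.page(cursor, batchSize)
		if len(keys) == 0 {
			return deleted
		}
		s.deleteMulti(keys)
		deleted += len(keys)
		cursor = next
	}
}

func main() {
	s := &fakeStore{keys: map[string]bool{}}
	for i := 0; i < 1200; i++ {
		s.keys[fmt.Sprintf("key-%04d", i)] = true
	}
	fmt.Println("deleted:", deleteAll(s, 500), "remaining:", len(s.keys))
}
```

Because each page resumes after the previous position instead of restarting the query, the loop never re-reads a range it has already deleted, which is the property that makes the approach tolerant of a stale index.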



ralf.r...@grandcentrix.net

May 1, 2014, 6:23:18 AM
to google-ap...@googlegroups.com
Thanks for all your great feedback. My current revised version looks like this:

func DeleteKind(datastoreKind string, r *http.Request) (int, error) {
	c := appengine.NewContext(r)

	var i int
	var lastSeenKey *datastore.Key
	q := datastore.NewQuery(datastoreKind).Limit(500).KeysOnly().Order("__key__")
	for {
		keys, err := q.GetAll(c, nil)
		if err != nil || len(keys) == 0 {
			return i, err
		} else {
			lastSeenKey = keys[len(keys)-1]
			i = i + len(keys)
			if err := datastore.DeleteMulti(c, keys); err != nil {
				return i, err
			}
		}
		q = datastore.NewQuery(datastoreKind).Limit(500).KeysOnly().Order("__key__").Filter("__key__ >", lastSeenKey)
	}
	return i, nil
}


Ralf Rottmann

May 1, 2014, 7:07:56 AM
to google-ap...@googlegroups.com
And here is the version which works best for me, using a timeout to return before the request gets cancelled.

func DeleteKind(datastoreKind string, r *http.Request) (int, error) {
	c := appengine.NewContext(r)
	var i int
	var lastSeenKey *datastore.Key
	q := datastore.NewQuery(datastoreKind).Limit(500).KeysOnly().Order("__key__")
	timeout := time.After(time.Second * 60)
	for {
		select {
		case <-timeout:
			{
				return i, nil
			}
		default:
			{
				keys, err := q.GetAll(c, nil)
				if err != nil || len(keys) == 0 {
					return i, err
				} else {
					lastSeenKey = keys[len(keys)-1]
					i = i + len(keys)
					if err := datastore.DeleteMulti(c, keys); err != nil {
						return i, err
					}
					c.Debugf("persistence > DeleteKind(%s) Entries deleted: %d", datastoreKind, i)
				}
				q = datastore.NewQuery(datastoreKind).Limit(500).KeysOnly().Order("__key__").Filter("__key__ >", lastSeenKey)
			}
		}
	}
	return i, nil
}

Glenn Lewis

May 1, 2014, 11:50:34 AM
to Ralf Rottmann, google-appengine-go
Great!  I'm glad you got it working.

If this were a code review, I would say "remove the '} else {' and replace with '}'", since the line before it is a return.
Also, you don't need the extra pair of curly braces around each case/default within the select.

And finally, (this is purely personal preference...) I like to always use "%v" in formatting unless I must use something else (like in tests where I'm told to use a '%q')... but you are welcome to keep using %s, %d if you want.  :-)

-- Glenn
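
To illustrate the review comments above, here is a sketch of the same control flow with the else branch and the extra braces removed. It uses only the standard library: the datastore work is hidden behind a hypothetical callback type (deleteBatch), and the function name deleteUntil and the 50-second budget in main are assumptions chosen for the example, not values from the thread.

```go
package main

import (
	"fmt"
	"time"
)

// deleteBatch is a hypothetical callback that deletes up to one batch and
// reports how many entities it removed. Zero means nothing is left.
type deleteBatch func() (int, error)

// deleteUntil calls next until it reports zero, returns an error, or the
// time budget runs out. It returns the total deleted and whether the work
// finished.
func deleteUntil(budget time.Duration, next deleteBatch) (int, bool, error) {
	total := 0
	timeout := time.After(budget)
	for {
		// Non-blocking check: stop cleanly once the budget has elapsed.
		select {
		case <-timeout:
			return total, false, nil
		default:
		}
		n, err := next()
		if err != nil {
			return total, false, err
		}
		if n == 0 {
			return total, true, nil
		}
		total += n
	}
}

func main() {
	// Simulate 1200 entities deleted in batches of at most 500.
	remaining := 1200
	next := func() (int, error) {
		n := 500
		if remaining < n {
			n = remaining
		}
		remaining -= n
		return n, nil
	}
	total, done, err := deleteUntil(50*time.Second, next)
	fmt.Println(total, done, err)
}
```

Returning a "finished" flag alongside the count lets the caller decide whether to schedule another run, for example from a task queue.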

