How to clear a model without incurring huge costs


Prem D

Dec 2, 2012, 6:23:07 PM
to google-a...@googlegroups.com
I have a model (table) which has accumulated a few GBs of data. I do not need any of the data and so want to truncate the table.

Yesterday I tried to DELETE ENTITIES using Datastore Admin but it hit my billing limit immediately.

What is the cheapest way to truncate a table in Google App Engine?

PS: I am using Python.

Carl Schroeder

Dec 2, 2012, 7:32:50 PM
to google-a...@googlegroups.com
I am not sure if it is the most efficient, but what I do is:
Remove all indexes associated with the model. This minimizes writes associated with deletes.
Then create a cron task that deletes n entities per day according to how much quota I feel like using.
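
A minimal sketch of the cron-driven part of this approach, assuming a handler that runs once per day and is allowed to delete at most a fixed number of entities. The names delete_in_batches, fetch_keys, delete_keys, daily_budget and batch_size are hypothetical and chosen only for illustration. The datastore calls are passed in as functions so the loop can be exercised without App Engine; the comment shows how they might be wired to ndb.

```python
def delete_in_batches(fetch_keys, delete_keys, daily_budget, batch_size=500):
    """Delete up to daily_budget entities in batches; return how many were deleted.

    fetch_keys(n)    -> list of up to n keys that still exist (hypothetical hook)
    delete_keys(ks)  -> deletes the given keys (hypothetical hook)
    """
    deleted = 0
    while deleted < daily_budget:
        # Never ask for more keys than the remaining budget allows.
        limit = min(batch_size, daily_budget - deleted)
        keys = fetch_keys(limit)
        if not keys:
            break  # nothing left to delete
        delete_keys(keys)
        deleted += len(keys)
    return deleted


# Possible wiring inside a cron handler (assumes an ndb model named MyEntity):
#
#   from google.appengine.ext import ndb
#   delete_in_batches(
#       fetch_keys=lambda n: MyEntity.query().fetch(n, keys_only=True),
#       delete_keys=ndb.delete_multi,
#       daily_budget=10000,
#   )
```

Keys-only queries are used in the wiring comment because the entities themselves never need to be loaded. The daily budget is whatever quota you are willing to spend.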

GAE really needs a "remove all entity and indexes for entity" function.

Jeff Schnitzer

Dec 3, 2012, 11:08:48 AM
to Google App Engine
Truth is, there is no way to "efficiently" truncate a table because of the nature of BigTable - your data isn't stored in separate tables that can be dropped individually.  Every row is stored in one big table, and those rows need to be deleted individually.

Sometimes I wonder if Google should simply double the price of a write operation and make delete operations "free".  Of course, it makes write operations look really expensive... but they *are* that expensive since all data will eventually get deleted.
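
A rough way to put numbers on this, as a sketch only. It assumes the per-operation billing rule as commonly documented at the time: an entity delete is charged 2 write ops, plus 2 per indexed property value, plus 1 per composite index value, which is the same count as the original put of a new entity. The function names and the price parameter are hypothetical; check the current billing documentation before relying on any figure.

```python
def delete_write_ops(indexed_values, composite_values=0):
    """Write ops billed for deleting one entity (assumed billing rule, see above)."""
    return 2 + 2 * indexed_values + composite_values


def estimated_delete_cost(entity_count, indexed_values, composite_values=0,
                          price_per_100k_ops=0.10):
    """Estimated cost of deleting entity_count entities.

    price_per_100k_ops is an assumption; substitute the rate on your bill.
    """
    ops = entity_count * delete_write_ops(indexed_values, composite_values)
    return ops * price_per_100k_ops / 100000.0


# With no indexed properties a delete costs 2 write ops per entity;
# with 10 indexed property values it costs 22.
print(delete_write_ops(0), delete_write_ops(10))
```

Under this assumed rule, the number of indexed values dominates the bill, which is why stripping indexes before a bulk delete is suggested earlier in the thread.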

Jeff



alex

Dec 3, 2012, 11:25:38 AM
to google-a...@googlegroups.com
> Sometimes I wonder if Google should simply double the price of a write
> operation and make delete operations "free"

what I wonder is when App Engine will get a Spanner implementation :)

> but they *are* that expensive since all data will eventually get deleted.

not unless you disable => delete the app. Am I missing something here?

Jason Collins

Dec 3, 2012, 3:08:28 PM
to google-a...@googlegroups.com
I'm guessing that due to the nature of tablet splitting, there is a lot of fragmentation, and some low-level background process comes along once in a while and reclaims space.

I've often wanted a feature that would let me "mark entities as deleted" so I could be part of this process (which may be just a figment of my imagination) and have my entities go away on Google's schedule and for very cheap/free. Of course, I would need to be responsible for ignoring these things in query results, etc., which often is not a problem because they are orphan entities that otherwise wouldn't be queried for anyway.

Something like:

  keys = MyEntity.query(ancestor=my_parent).fetch(1000, keys_only=True)
  ndb.delete_multi(keys, low_priority=True) # would mark both the entities and the index entries for background deletion

My alternative, often, is to just let dead stuff keep turning on spindles, which is asinine.

j

Barry Hunter

Dec 3, 2012, 4:55:31 PM
to google-appengine
>> but they *are* that expensive since all data will eventually get deleted.
>
> not unless you disable => delete the app. Am I missing something here?

I imagine that Google doesn't actually bother deleting the data. It's cheaper for them to just leave the data 'orphaned' all over the place than to actually enumerate it all and delete it.

So Google just swallows the cost. If they really wanted to delete the data, Google would have to charge to delete it :)


That is one reason to disallow reuse of app-ids: the new app wouldn't magically still see the data of the previous app.

Jeff Schnitzer

Dec 3, 2012, 6:01:01 PM
to Google App Engine
On Mon, Dec 3, 2012 at 11:25 AM, alex <al...@cloudware.it> wrote:
>> but they *are* that expensive since all data will eventually get deleted.
>
> not unless you disable => delete the app. Am I missing something here?

Apparently you cannot disable billing while there is more than 1GB of data in the datastore.  If you want to turn off billing, you must first delete data until it's under the 1GB free quota.

You could change your credit card number, but I imagine that if we're talking about significant amounts of money (the magnitude of which I don't know), Google could come after you with legal measures.

This particular aspect of billing seems rather janky to me.

Jeff

alex

Dec 3, 2012, 6:09:34 PM
to google-a...@googlegroups.com
Agreed. I didn't say it wasn't "janky"; I meant it only as a partial workaround:
delete data until you have <= 1GB, then disable and delete the app.

alex

Dec 3, 2012, 6:18:38 PM
to google-a...@googlegroups.com
btw, I couldn't find a feature request for this. I'm pretty sure it
must be somewhere. Does anyone have a link?

Stefano Ciccarelli

Dec 4, 2012, 1:22:52 AM
to google-a...@googlegroups.com
I've disabled billing with 8GB in the datastore; the app is still there, disabled and unusable.

Sent from iPad