Does deleting a backup also delete the files in Cloud Storage?


Jason Collins

Apr 1, 2013, 7:46:47 PM
to google-a...@googlegroups.com
We use the Datastore Admin backup tool (fired by a cron) to back up our data into Cloud Storage nightly. Our storage costs have crept up and I'd like to delete the old backups.

Aside: if you think, like I do, that you should be able to automate backup deletion, please star https://code.google.com/p/googleappengine/issues/detail?id=7412 or possibly even https://code.google.com/p/googleappengine/issues/detail?id=7428

Datastore Admin has a feature that allows me to delete old backups. However, I don't think that it actually deletes the Cloud Storage files themselves. Here is the code I'm looking at from google.appengine.ext.datastore_admin.backup_handler:

def delete_backup_files(filesystem, backup_files):
  if backup_files:

    if filesystem == files.BLOBSTORE_FILESYSTEM:

      blob_keys = []
      for fname in backup_files:
        blob_key = files.blobstore.get_blob_key(fname)
        if blob_key:
          blob_keys.append(blob_key)
          if len(blob_keys) == MAX_BLOBS_PER_DELETE:
            blobstore_api.delete(blob_keys)
            blob_keys = []
      if blob_keys:
        blobstore_api.delete(blob_keys)


I'm guessing that the blank line, where something has been redacted, says "TODO: implement Google Storage file deletion".

Can anyone confirm or deny? How do people manage their backups today? Is my only option to write some kind of custom tool to dump old files on Cloud Storage?

Thanks,
j

Bryce Cutt

Apr 2, 2013, 3:04:56 PM
to google-a...@googlegroups.com
IIRC it does not. Works fine on blobstore though.

Arie Ozarov

Apr 3, 2013, 4:26:54 PM
to google-a...@googlegroups.com
Right. Deleting a backup does not delete the associated Google Cloud Storage files (but does delete the associated blobstore files).
We may provide an option to delete the associated Google Cloud Storage files in the future (and the default would be False).
Using a folder per backup can help with maintenance.

Arie.

Jason Collins

Apr 3, 2013, 6:13:09 PM
to google-a...@googlegroups.com
Thanks for the response, Arie. Once I get access to the Cloud Storage JSON API, I will develop my own backup scrubber.
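For what it's worth, the selection logic of such a scrubber could look roughly like the sketch below. It is only an illustration under some assumptions: the function names (`select_expired`, `scrub`), the 30-day retention default, and the `(name, updated)` tuple shape are all made up for this example, and the actual listing and deletion of Cloud Storage objects is left to whatever client you end up using, passed in as the `delete` callable.

```python
from datetime import datetime, timedelta, timezone


def select_expired(objects, now, keep_days=30, prefix=""):
    """Return the names of objects under `prefix` older than `keep_days`.

    `objects` is an iterable of (name, updated) tuples, where `updated` is a
    timezone-aware datetime. This shape is an assumption for the sketch; a
    real listing would come from your Cloud Storage client of choice.
    """
    cutoff = now - timedelta(days=keep_days)
    return sorted(
        name
        for name, updated in objects
        if name.startswith(prefix) and updated < cutoff
    )


def scrub(objects, delete, now=None, keep_days=30, prefix=""):
    """Call `delete(name)` for every expired object and return the names.

    `delete` is a hypothetical callable supplied by the caller; nothing in
    this sketch talks to Cloud Storage directly.
    """
    now = now or datetime.now(timezone.utc)
    expired = select_expired(objects, now, keep_days, prefix)
    for name in expired:
        delete(name)
    return expired
```

Keeping the selection separate from the deletion makes it easy to do a dry run first (pass `print` as `delete`) before letting it remove anything.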

"Using a folder per backup can help with maintenance."

I agree this would help (though the current tooling doesn't allow easy deletion of an entire folder), but the cron-based backup tool does not allow for dynamically-specified folder names. In my opinion, the backup tool should automatically create date-named subfolders when creating backups.

j

Bryce Cutt

Apr 4, 2013, 12:17:21 AM
to google-a...@googlegroups.com
To give myself more control and more options when running backups, I have a cron job that calls my own handler, and that handler spins off a task to the backup handler. That way I can, based on a config I store in the datastore, have a lot more control over backup names, kinds, and schedule. I actually use multiple cron jobs on different schedules that call my backup handler with a parameter telling it which config to use, so that I can back up different things on different schedules and so on. It has worked quite well for me so far.
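As a rough sketch of the idea, the custom handler mostly needs to build the backup request with a name or folder computed at run time. The query parameters below (`name`, `filesystem`, `gs_bucket_name`, `queue`, `kind`) are the ones used with `/_ah/datastore_admin/backup.create` elsewhere in this thread; the helper name `build_backup_url`, its defaults, and the date-named subfolder layout are assumptions for illustration. Enqueuing the resulting URL as a task is left out.

```python
from datetime import date
from urllib.parse import urlencode

BACKUP_PATH = "/_ah/datastore_admin/backup.create"


def build_backup_url(bucket, kinds, day, name="my-backup", queue="backup"):
    """Build a backup.create URL that writes into a date-named subfolder.

    `bucket` may include a folder (e.g. 'bucket/folder'). The helper and
    its defaults are hypothetical; only the query parameter names are taken
    from the cron.yaml example in this thread.
    """
    folder = day.strftime("%Y-%m-%d")
    params = [
        ("name", name),
        ("filesystem", "gs"),
        ("gs_bucket_name", "%s/%s" % (bucket, folder)),
        ("queue", queue),
    ]
    # One 'kind' parameter per entity kind to back up.
    params.extend(("kind", kind) for kind in kinds)
    # Keep '/' unescaped so the bucket/folder path stays readable.
    return BACKUP_PATH + "?" + urlencode(params, safe="/")
```

Called with `date.today()` from the cron-triggered handler, this gives each nightly backup its own folder, which in turn makes deleting an old backup a matter of removing one folder.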

Arie Ozarov

Apr 4, 2013, 3:08:09 PM
to google-a...@googlegroups.com
I think it should work if you provide the folder as part of the gs_bucket_name value (e.g. 'bucket/folder/'). 
Also, gsutil does support wildcards (including deleting with a wildcard), and the gs online browser supports deleting multiple files and/or folders.

Jason Collins

Apr 4, 2013, 4:37:28 PM
to google-a...@googlegroups.com
Arie, the issue is that you can't provide dynamic parts in your cron.yaml (unless I were to upload a new cron.yaml every day):

- description: brand analytics backup
  url: /_ah/datastore_admin/backup.create?name=my-backup&filesystem=gs&gs_bucket_name=vendasta-backup/my-folder&queue=backup&kind=Foo&kind=Bar&...
  schedule: every day 00:00

To be sure, the technique that Bryce points out certainly works, but as a Platform as a Service, I always argue that database backup and restore should be a first class citizen and have strong, integrated tooling.

j

Arie Ozarov

Apr 4, 2013, 7:52:38 PM
to google-a...@googlegroups.com
Yes, I agree with that. We are constantly trying to improve it (and this request was noted).

Arie.
