Large text files from blobstore are not being gzipped (AppEngine Java)

285 views

Emanuele Ziglioli

Jun 21, 2012, 5:56:20 AM
to google-a...@googlegroups.com
Hi everyone,

I'm serving a number of text files from the blobstore, and while the smaller ones are being gzipped by the frontend servers, the larger ones are not.
I'm not sure what the threshold is; it could be as low as 4MB.
I couldn't find any mention of it anywhere.
Our files are "text/csv" and "application/json".
We serve them with a servlet, just like in the documentation:
blobstoreService.serve(blobKey, res);

That's a major problem for us in terms of customer experience. Has anyone seen that?
Thank you

Stuart Langley

Jun 21, 2012, 7:37:08 AM
to google-a...@googlegroups.com

Emanuele Ziglioli

Jun 21, 2012, 5:05:43 PM
to google-a...@googlegroups.com
Thanks Stuart,

indeed, it looks like the same issue. I'm going to try loading the blob content into memory first.
It's going to be slower, but that's what I was doing until yesterday: I was compressing large payloads and storing them across entities of up to 1MB each. I switched to the blobstore to avoid compressing/decompressing and to reduce the delay when fetching them.
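For reference, the compress/decompress step behind the "compressed entities" approach can be sketched with plain `java.util.zip` (a minimal sketch; the `GzipUtil` class name and the 8 KB read buffer are illustrative, not from the original post — splitting the compressed bytes across sub-1MB entities would still be up to the caller):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: gzip a payload before storing it in datastore
// entities (each under the 1 MB limit), and gunzip it again when
// reassembling the bytes for the response.
public class GzipUtil {

    // Compress a byte array with gzip.
    public static byte[] gzip(byte[] raw) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        GZIPOutputStream gz = new GZIPOutputStream(buf);
        gz.write(raw);
        gz.close(); // flushes the gzip trailer
        return buf.toByteArray();
    }

    // Decompress a gzipped byte array back to the original bytes.
    public static byte[] gunzip(byte[] zipped) throws IOException {
        GZIPInputStream gz =
            new GZIPInputStream(new ByteArrayInputStream(zipped));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = gz.read(chunk)) > 0) {
            out.write(chunk, 0, n);
        }
        return out.toByteArray();
    }
}
```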
Unfortunately the lack of Gzip compression is really biting us. 

Richard Watson

Jun 22, 2012, 2:08:51 AM
to google-a...@googlegroups.com
As Jeff mentioned on the SDK thread, maybe try Cloudflare.com.  I've just turned it on and it's not too painful, although I had to set up my page rules just right.  If you have static content, they'll gzip and cache it for you on their CDN.

One option if you don't want them proxying your whole app: deliver your content from a different subdomain and tell CF to only proxy that domain. Then all they are is a DNS host for you.  Worth a try at least.

Brandon Wirtz

Jun 22, 2012, 2:35:49 AM
to google-a...@googlegroups.com

Cloudflare would still pull the uncompressed file down and then zip it; that is unlikely to speed things up.

Check my old posts about headers for edge caching. My guess is that you have the expiration set in the past, or to expire immediately, which causes the edge cache not to compress most MIME types.
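The kind of cacheable headers being described here can be sketched in plain Java (a minimal sketch; the `CacheHeaders` class name and the one-hour lifetime are illustrative examples, not values from this thread — in a servlet you would pass the results to `res.setHeader(...)`):

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper: build a future Expires value and a public
// Cache-Control value, the kind of headers an edge cache wants to see
// before it will cache (and potentially gzip) a response.
public class CacheHeaders {

    // Build an RFC 1123 date string for "now + maxAgeSeconds", in GMT
    // as HTTP requires.
    public static String expiresHeader(long nowMillis, int maxAgeSeconds) {
        SimpleDateFormat fmt =
            new SimpleDateFormat("EEE, dd MMM yyyy HH:mm:ss zzz", Locale.US);
        fmt.setTimeZone(TimeZone.getTimeZone("GMT"));
        return fmt.format(new Date(nowMillis + maxAgeSeconds * 1000L));
    }

    // A publicly cacheable Cache-Control value with the same lifetime.
    public static String cacheControlHeader(int maxAgeSeconds) {
        return "public, max-age=" + maxAgeSeconds;
    }
}
```

Usage would look something like `res.setHeader("Cache-Control", CacheHeaders.cacheControlHeader(3600))` alongside the matching `Expires` header.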

--
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To view this discussion on the web visit https://groups.google.com/d/msg/google-appengine/-/x1nUlmwpplIJ.
To post to this group, send email to google-a...@googlegroups.com.
To unsubscribe from this group, send email to google-appengi...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.

Emanuele Ziglioli

Jun 22, 2012, 7:06:47 AM
to google-a...@googlegroups.com
Thanks for the suggestion. The resources I was talking about are large but not static; that's why I've been using compressed entities, and now the blobstore.
I've given up on the blobstore for now and reverted to compressed entities.
I might try Google Cloud Storage.

Emanuele Ziglioli

Jun 22, 2012, 7:08:53 AM
to google-a...@googlegroups.com


On Friday, 22 June 2012 18:35:49 UTC+12, Brandon Wirtz wrote:

Cloudflare would still pull the uncompressed file down and then zip it; that is unlikely to speed things up.

Check my old posts about headers for edge caching. My guess is that you have the expiration set in the past, or to expire immediately, which causes the edge cache not to compress most MIME types.

I'm not sure what the expiration is set to for blobs in the blobstore, but there's no difference in how we serve small resources and big ones:
the small ones get gzipped, the large ones don't.


Stephen Lewis

Jun 22, 2012, 7:38:40 AM
to google-a...@googlegroups.com
When you serve up a Cloud Storage object directly from Cloud Storage, you can certainly pre-gzip the content and make sure it's served with the correct 'Content-Encoding'. The reference to this is at:


I'd be interested to know whether this still applies if you serve the Cloud Storage object using send_blob in App Engine. I'd imagine it would (and should) - it certainly works for Content-Type, because we're already using this in one of our apps.

This is, of course, only useful if all your clients understand gzip content encoding; if not, you'd probably have to store two versions of your objects (compressed and uncompressed) and detect which type of client you're talking to in your AE code.
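The client-detection step of that two-version idea boils down to inspecting the request's `Accept-Encoding` header. A minimal sketch (the `EncodingNegotiator` name is illustrative; the parsing is deliberately simple and does not honor q-values such as `gzip;q=0`, which a production version should):

```java
// Hypothetical helper: decide whether to serve the pre-gzipped object
// or the uncompressed one, based on the raw Accept-Encoding header
// value from the request.
public class EncodingNegotiator {

    // Returns true if the client advertises gzip support.
    public static boolean clientAcceptsGzip(String acceptEncoding) {
        if (acceptEncoding == null) {
            return false;
        }
        // Accept-Encoding is a comma-separated list of codings,
        // each optionally followed by ";q=..." parameters.
        for (String token : acceptEncoding.split(",")) {
            String coding = token.trim().split(";")[0].trim();
            if (coding.equalsIgnoreCase("gzip") || coding.equals("*")) {
                return true;
            }
        }
        return false;
    }
}
```

In the servlet you would then pick the blob key of the compressed or uncompressed copy accordingly, and set `Content-Encoding: gzip` only on the compressed path.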

If you do try this, please let the rest of us know!

Thanks

Stephen

Emanuele Ziglioli

Jun 22, 2012, 7:42:27 AM
to google-a...@googlegroups.com
Thanks for the tip! I will certainly try next week. Storing two versions would be nice and easy to do!

Stephen Lewis

Jun 22, 2012, 1:23:02 PM
to google-a...@googlegroups.com
Curiosity got the better of me, and I've just tried this - unfortunately, it doesn't work. When serving from Google Storage using send_blob I received the pre-gzipped content, but the Content-Encoding header was not sent.

You might want to try something with signed URLs (https://developers.google.com/storage/docs/accesscontrol#Signed-URLs). I've just discovered that the edge cache in front of Google Storage will actually uncompress pre-compressed content before serving it (which would mess things up for you), but I suspect this is a bug, and I also imagine that it won't affect signed URLs.

Stephen

Sameer Lodha

Jun 23, 2012, 12:57:24 AM
to google-a...@googlegroups.com
Stuart,

Issue 2820 has been hanging fire for quite some time now (30 months). It would be a huge help if we could get visibility into whether it will ever be addressed.


Thanks,
Sameer


Emanuele Ziglioli

Jun 24, 2012, 5:59:40 PM
to google-a...@googlegroups.com
Thanks for that. I tried to set up Cloud Storage and thought I could be on a free quota (we're already paying for GAE and didn't really want to pay again for something GAE should be doing). Anyway, if there is a free quota for Cloud Storage, I don't know how to enable it.
The cost of working around GAE's restrictions keeps increasing.

On the upside, https has started to work for us on a custom domain, although with a certificate error.

Stephen Lewis

Jun 25, 2012, 2:11:08 AM
to google-a...@googlegroups.com
I signed up for a GCS account relatively recently, and I certainly got the free quota (https://developers.google.com/storage/docs/pricingandterms) applied. It's easy to miss, though: to see whether it's active, you need to go to the API Console and choose the 'Billing' option from the menu. When I do that, I see the text "The following promotion has been applied to your project: Google Storage 5GB free plan." This only applies to the first API project you create, though.

Stephen

Emanuele Ziglioli

Jun 27, 2012, 6:53:51 AM
to google-a...@googlegroups.com
Hi Stephen,

I couldn't see the promotion being applied after enabling billing. Anyway, let's see if release 1.7.0 is any different; they say you can set headers for static content now (maybe you can set the encoding type now).
As far as I understand, static content and blobs are the same for the purpose of being served by the front-end servers.