Transcript: App Engine chat time March 4

Marzia Niccolai

Mar 5, 2009, 2:27:06 PM

to Google App Engine

[7:00pm] marzia_google: (and this marks the beginning of App Engine chat time)
[7:00pm] smwaveforms: so does the app-engine-patcher run on top of django 0.96
[7:00pm] smwaveforms: or instead of it
[7:01pm] cresloyd joined the chat room.
[7:01pm] marzia_google: i'm not an app-engine-patch expert, but i think it works with either 0.96 or 1.0
[7:02pm] Kardax: Is there any particular format to this chat?
[7:02pm] marzia_google: no format
[7:02pm] dan_google: Shout questions. And answers, if you have 'em.
[7:02pm] smwaveforms: I am wondering if you are running both 0.96 and the patch at the same time. That would seem to be expensive
[7:02pm] Kardax: Ok, question: permanent unique user identifiers... something we can use in a key name... any chance this will happen soon?
[7:03pm] oktopus joined the chat room.
[7:03pm] marzia_google: re:app-engine-patch, i don't think there is significant additional overhead involved, it's basically just an extension of django that bridges the gap between app engine and traditional environments
[7:04pm] taylorhughes: @Kardax good question!
[7:04pm] smwaveforms: @Marzia, that is good news.
[7:04pm] marzia_google: re: permanent unique user identifiers, we don't have any very immediate plans to introduce it, but it's something we are actively investigating
[7:04pm] Kardax: So... how can one safely use the Google User's API... what if a user changes their email address?
[7:05pm] amichail: How to fix an index that is stuck on "Building"?
[7:05pm] marzia_google: so that is an issue with the user's api in theory, in reality, this is a very rare occurrence
[7:05pm] marzia_google: it almost never happens
[7:06pm] Kardax: That's true... but it's always the rare stuff that bites ya
[7:06pm] marzia_google: if an index has been in building more than 24 hours you will need to send an email to the group and someone will be able to fix it
[7:06pm] taylorhughes: the main issue I ran into is, I changed my user account's e-mail address and couldn't create a User object to use in a query because my old e-mail address was not linked to an existing account anymore
[7:06pm] marzia_google: if it's less than 24 hours it could still be building
[7:07pm] marzia_google: @taylorhughes yes, currently, unfortunately this happens, you would basically need to treat them like a new user
[7:07pm] marzia_google: or manually link the two accounts together
[7:07pm] taylorhughes: the way I'm dealing with it is to record the user's e-mail address when the account is created separately so I can at least find it later to migrate it
[7:08pm] marzia_google: that is the best solution right now
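The workaround taylorhughes describes (keep the sign-up e-mail as plain data so an account can be found and migrated later) might look roughly like the sketch below. Plain dicts stand in for datastore entities, and `find_account`, `accounts`, and `manual_links` are names invented for illustration; nothing here is part of the Users API.

```python
def find_account(accounts, email, manual_links=None):
    """Look up an account by e-mail, falling back to manually linked addresses.

    accounts: dict mapping the e-mail recorded at sign-up -> account record.
    manual_links: dict mapping a user's new e-mail -> their sign-up e-mail
                  (hypothetical table maintained by an admin).
    """
    if email in accounts:
        return accounts[email]
    old_email = (manual_links or {}).get(email)
    if old_email is not None:
        return accounts.get(old_email)
    return None  # no match: treat the visitor as a new user


accounts = {
    "old@example.com": {"nickname": "taylor", "signup_email": "old@example.com"},
}
links = {"new@example.com": "old@example.com"}
migrated = find_account(accounts, "new@example.com", links)
```

Without an entry in `manual_links`, the changed address simply looks like a new user, which matches the behaviour marzia_google describes.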
[7:08pm] Kardax: I posted an issue about the permanent ID's a while back, for anyone who's interested. #1019
[7:08pm] Kardax:        http://code.google.com/p/googleappengine/issues/detail?id=1019
[7:08pm] dw: i was wondering if there was an official comment on the recent failure that resulted in high latencies. how did this happen? what is being done to prevent such an evil problem in future?
[7:09pm] dw: particularly im interested in how you segment your rollouts. seems something went live that you couldn't roll back, and that's worrying
[7:09pm] marzia_google: the problems stemmed from monday's maintenance
[7:09pm] marzia_google: you can read the most concise useful summary that was just posted:
[7:09pm] marzia_google:        http://groups.google.com/group/google-appengine-downtime-notify/browse_thread/thread/86ab36fa2e7fac2c
[7:09pm] dw: th
[7:10pm] dw: *thx
[7:10pm] marzia_google: but we take what happened very seriously
[7:10pm] marzia_google: obviously this is one of the worst downtimes we've seen since launch
[7:11pm] marzia_google: generally at Google when something like this happens there is a full internal post mortem to ensure that this kind of stuff doesn't happen again
[7:11pm] marzia_google: so the team will be going through everything and fixing things so that this kind of thing is less likely to happen in the future
[7:12pm] dw: can you expand at all on what caused the failure? is a new component resulting in higher io?   *very curious*
[7:14pm] marzia_google: at this point, beyond what is published there isn't much more I can say, it wasn't the result of any specific feature or system
[7:14pm] Kardax: My app is just a little experiment at the moment, but I'm seeing greatly improved performance vs. yesterday and this morning
[7:14pm] marzia_google: it was basically just the result of the scheduled maintenance
[7:14pm] dw: marzia_google: thanks again
[7:14pm] marzia_google: it should be coming back in line with more typical performance now
[7:15pm] marzia_google: i can say the team has been working _very_ hard the past couple of days on this problem
[7:15pm] marzia_google: (as you can imagine)
[7:15pm] dw: telling me that only makes me more curious sounds like a fun problem
[7:15pm] pranny: http://appgallery.appspot.com/results?topapps=true&start=15&num=5 any progress on thie error that pops up when viewing this. Sorry can't file bug, in hurry
[7:16pm] amichail: Do you plan to provide access to previous versions of the datastore in case the current version becomes corrupted?
[7:16pm] marzia_google: oh, hmm that page works for me, actually
[7:16pm] marzia_google: we have no current plans for datastore versioning, but you should star the issue for it (or file it if it isn't already there) if it's something you would like to see
[7:16pm] oktopus: 10 days ago I posted a message asking for reactions to my app at simplifyconnections.appspot.com and got zero responses. That likely means no on had anything good to say. But just in case it was overlooked by everyone, let me note here that I would love to hear from people. http://groups.google.com/group/google-appengine/browse_thread/thread/a9473091ecc2e829/1318ebf58f98744b?lnk=gst&q=thebrianschott#1318ebf58f98744b
[7:17pm] pranny: marzia_google: might be this could help http://dpaste.com/6652/
[7:18pm] marzia_google: hmmm, it's a timeout error
[7:18pm] marzia_google: so the call took too long
[7:18pm] marzia_google: the appgallery could use better error recovery
[7:18pm] dw: oktopus: i posted a datastore backup app a week ago and got one response, and only 20 or so hits to the web page   i think the group's demographic is not the idle hobbyist who clicks on every link
[7:19pm] pranny: marzia_google: yeah, I guess so. My internet speed is fairly good. Gotta go. Bye. Good work, AppEngine team.
[7:19pm] Kardax: Ok, next question: we have db.run_in_transaction; is there a chance of getting a db.run_in_snapshot?
[7:19pm] dankles: Q: What (if any) progress / plans are there on reducing Datastore CPU usage?
[7:20pm] Kardax: I'd like to be able to read an entity group from a consistent state without the overhead of a transaction.
[7:21pm] marzia_google: @kardax file a feature request right now there are no plans for such a thing
[7:21pm] marzia_google: you can db.get multiple keys
[7:21pm] marzia_google: which seems like it would be similar
[7:21pm] dan_google: Kardax: Depending on what you're reading, you can store all the stuff you want in a "snapshot" on a single entity.
[7:21pm] dan_google: Kardax: Reads are consistent within an entity.
[7:21pm] Kardax: Yes, I'm doing that a lot... but I don't know what the next key is until I've read the first.
[7:22pm] marzia_google: @dankles there isn't anything specific to report wrt to datastore cpu
[7:22pm] Kardax: I'm playing with B-tree storage in the datastore, a single entity can't hold one of these...
[7:22pm] oktopus: dw: Yes this is a great group, but maybe not the ones to ask such questions. My app is very difficult to publicize and this particular feature requires almost a nationwide commitment to be effective.
[7:22pm] Kardax: So, I'm using an entity group for the b-tree.
[7:23pm] Kardax: This gives me the consistent state I need for updates/deletes.
[7:23pm] Kardax: But if something changes while another app instance is reading it, it can fail in interesting ways.
[7:23pm] Kardax: So, being able to run_in_snapshot is the only way to fix it
[7:24pm] Kardax: I guess I'll have to make that feature request, then...
[7:24pm] dankles: Q: (Not a Q) I would again like to voice my anti-starring of all non-Python languages, with the motivation being that GAE team focus on platform itself without fragmentation.
[7:25pm] marzia_google: noted
[7:25pm] Kardax: As much as I hate python, I do agree.
[7:25pm] • angerman votes for haskell
[7:25pm] Kardax: There are a lot of things I'd like to see before we get more languages, as you might have guessed from my comments here
[7:26pm] dan_google: But then what would all our Fortran engineers do? They don't know Python.
[7:26pm] nwinter: Marzia, thanks ever so much for diagnosing my CPU sidelining after the last chat.
[7:26pm] nwinter: It would help even more to know how this works: "Applications that are heavily cpu-bound, on the other hand, may incur
[7:26pm] nwinter: some additional latency in long-running requests in order to make room
[7:26pm] nwinter: for other apps sharing the same servers."
[7:26pm] • angerman has yet to see a fortran webframework
[7:27pm] dankles: Q: Not sure if this has come up... Any plans to offer/partner with CDN, so a few lines in app.yaml could push a directory of static assets out for efficient serving?
[7:27pm] nwinter: Like, how is a handler identified as high-cpu, and how long does that classification last?
[7:27pm] cresloyd: dan_google: could you re-train those Fortran guys to learn Cobol?
[7:27pm] marzia_google: the high-cpu is done on a per-request basis
[7:27pm] dan_google: angerman: If you write one, you could be a hero to the Fortran web development community!
[7:27pm] amichail: Do you know if python 2.5's random module will give the same results for the same seed across all architectures, compilers, etc.?
[7:27pm] marzia_google: and it's basically when cpu is in the hundreds of runtime ms per request
[7:27pm] angerman: dan_google: i question the last 3 words
[7:28pm] nwinter: So, if a request goes over X cpu seconds, it can be put on the backburner, but other requests to the same handler will be unaffected by that?
[7:28pm] marzia_google: it's possible that other requests will be unaffected
[7:28pm] amichail: Is there any chance that the random module will give different results as Google upgrades GAE servers?
[7:28pm] nwinter: Hmm. Okay, thanks!
[7:29pm] marzia_google: @dankles, that is an interesting idea, i don't think it's one we've explored
[7:29pm] • angerman is running his blog http://journal.moritzangermann.com pretty well on app-engine. The only thing that annoys me from time to time is that deployments fail... I even had it fail 5 times in a row...
[7:30pm] angerman: @dankles, @marzia_google: dunno if that's legal, but I saw instructions on using GAE as a CDN of its own.
[7:30pm] angerman: (having an implicit file(+size) limit though
[7:30pm] dankles: @angerman: but GAE isn't really well suited for being a CDN
[7:31pm] marzia_google: concerning the random module in python i'm going to have to cop to not knowing the answer to what it uses to seed the random number generator
[7:32pm] dankles: re random: i wouldn't rely on it not changing, but it probably won't
[7:32pm] angerman: dankles: apart from the file+size limit, what else holds you back?
[7:32pm] marzia_google: i imagine that aside from the seed, the algorithm is the same for all python implementations, but that's just a random guess
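One way to check the seed behaviour under discussion, at least on a single machine: two generators built from the same seed should emit identical sequences. This sketch uses only the standard `random` module; the seed values and the helper name `sample` are made up for illustration. It demonstrates determinism within one interpreter only. It does not prove that results match across Python versions, platforms, or future server upgrades, so if generated values must never change, storing them is the safer route.

```python
import random


def sample(seed, n=5):
    """Return the first n floats from a generator seeded with `seed`."""
    rng = random.Random(seed)  # private generator; module-level state is untouched
    return [rng.random() for _ in range(n)]


first = sample(20090304)
second = sample(20090304)     # same seed, so the same sequence is expected
different = sample(20090305)  # different seed
```

Using a private `random.Random` instance rather than `random.seed()` keeps other code that shares the module-level generator from disturbing the sequence.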
[7:32pm] smwaveforms: when I log into dashboard I always get UNAUTHORIZED but then it gives me a link 'Return to applications screen' and I can get in
[7:32pm] smwaveforms: should I be concerned?
[7:32pm] marzia_google: have you tried appengine.google.com/a/mydomain.com
[7:33pm] marzia_google: it seems like this could be a Google Apps / Google Accounts issue (@smwaveforms)
[7:33pm] smwaveforms: In what way?
[7:33pm] dankles: @angerman : a good CDN has servers all over the world, serving each user from the closest edge, and optimized for low bandwidth costs, static assets... GAE design goals were different, as it clusters around itself + Datastore, etc.
[7:33pm] marzia_google: it's certainly possible to use App Engine as a CDN, nothing is stopping you
[7:34pm] marzia_google: if you have an account that is both a Google Apps account and a Google Account... it's just
[7:34pm] marzia_google: the system sometimes authenticates against the wrong account and can cause problems accessing your application
[7:34pm] dankles: right, the easiest thing is to just serve up static_files. I'm thinking about an upgrade, possibly one that Google could make $ from, instead of people going to outside CDNs.
[7:35pm] marzia_google: if the admin is the Google Apps account and you login with the Google Account for instance
[7:35pm] Kardax: Another question: How "big" can an entity group be before problems are encountered (besides concurrency)? What's the definition of "big"? What problems are seen?
[7:36pm] marzia_google: with entity groups, the biggest issue is contention
[7:36pm] marzia_google: since writes to an entity group happen serially
[7:37pm] voluntas left the chat room.
[7:37pm] marzia_google: so an entity group is too 'big', i would say, when it can't reasonably handle the number of writes to the group
[7:37pm] marzia_google: writing to one entity in the group locks the entire group
[7:37pm] Kniht joined the chat room.
[7:37pm] marzia_google: which is the main issue
[7:37pm] voluntas joined the chat room.
[7:38pm] muthu_ left the chat room. (Remote closed the connection)
[7:38pm] deltab joined the chat room.
[7:38pm] Kardax: I'm more concerned with physical size limitations... for example, if I can make reasonably sure that contention will be low, but it could have 10's of thousands of entities in size, is that going to work?
[7:39pm] LuchoVtn3d joined the chat room.
[7:39pm] marzia_google: 10s of thousands... i don't know that there is a limitation that would prevent this
[7:40pm] dan_google: Kardax: I think you'll be fine.
[7:40pm] nwinter: Is it possible to get a rough estimate of how long an app instance will stick around once spawned?
[7:40pm] Kardax: Good to know, thanks
[7:40pm] marzia_google: @nwinter, not really
[7:40pm] marzia_google: there are no guarantees
[7:40pm] dankles: Q: What's the latest with background tasks?
[7:41pm] Kardax: Good question
[7:41pm] savraj joined the chat room.
[7:41pm] nwinter: No average time or anything?
[7:41pm] savraj: hi guys... just jumping in here. Are there any decent libraries to integrate Google Checkout and App Engine?
[7:42pm] marzia_google: @nwinter i'm not sure an average time would be meaningful, even if i knew off hand what it was ( i don't)
[7:42pm] voluntas left the chat room. (Client Quit)
[7:42pm] marzia_google: since it really can be quite variable
[7:42pm] marzia_google: concerning background tasks, i don't have any specific announcements
[7:42pm] nwinter: @marzia: Okay; thanks.
[7:42pm] Kardax: Re instance lifetime: So an app instance could last a few seconds or a few hours? Just curious
[7:42pm] marzia_google: @savraj, i know there is a python checkout library, gchecky?
[7:43pm] marzia_google: i've never used it
[7:43pm] dan_google: nwinter: Apps are evicted by least-recently-used on an app server. As someone noted recently (forums or chat I forget), low traffic could mean lots of "restarts", but so could spikes in traffic which may start new instances on multiple app servers.
[7:43pm] voluntas joined the chat room.
[7:43pm] nwinter: @dan_google: good to know!
[7:43pm] dan_google: Kardax: Yes, depending on the weather. By which I mean, request patterns, other apps on each app server, and so forth. Not really predictable.
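As a toy illustration of the least-recently-used eviction dan_google mentions, the sketch below models one app server that can hold a fixed number of loaded apps. The class name `AppServer`, the capacity, and the app names are all hypothetical; the real scheduler is not described in this chat and almost certainly weighs more factors than this.

```python
from collections import OrderedDict


class AppServer(object):
    """Toy app server that keeps at most `capacity` app instances loaded."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.loaded = OrderedDict()  # app id -> True, least recently used first

    def handle_request(self, app_id):
        """Serve a request; return the id of the evicted app, or None."""
        if app_id in self.loaded:
            del self.loaded[app_id]  # re-inserted below as most recently used
        self.loaded[app_id] = True
        if len(self.loaded) > self.capacity:
            evicted, _ = self.loaded.popitem(last=False)  # drop the oldest entry
            return evicted
        return None


server = AppServer(capacity=2)
events = [server.handle_request(a) for a in ["blog", "shop", "blog", "wiki"]]
```

In this run the request for "wiki" pushes out "shop", the app that had gone longest without traffic, which is why a low-traffic app can see frequent restarts.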
[7:44pm] Kardax: Makes sense
[7:44pm] savraj: @marzia -- ah, cool. found it on google code, will try getting this working tonight. Thanks!
[7:44pm] Kardax: Ok, my next question: Currently, it looks like db.get, when fed a bunch of keys, only counts as one datastore API call... is this a bug or by design?
[7:45pm] amichail: Do we need to make datastore backups? Could we trust GAE not to lose our data?
[7:45pm] dankles: Good question, amichail
[7:45pm] marzia_google: @kardax that's by design
[7:45pm] dan_google: Kardax: By design. The datastore has a batch API for gets, puts and deletes.
[7:46pm] dan_google: Kardax: That also means you have to be aware of API request size limits when doing batches.
[7:46pm] Kardax: Excellent... I've been purposely grouping together gets/puts/deletes as much as possible hoping that was the case
[7:46pm] Kardax: How do API request size limits come into play?
[7:46pm] marzia_google: @amichail i think it's probably a good idea to make backups for personal reasons
[7:47pm] angerman: what was with the background jobs and xmpp? did I miss something intermediate?
[7:47pm] marzia_google: but i don't believe, due to the natural redundancies in the system, there is much chance of App Engine losing the data that is stored there
[7:47pm] marzia_google: backups would be a good idea in the case of overwritten/deleted data
[7:47pm] dan_google: Kardax: There's a limit of 1 MB for an API call request, and an API call response. With a batch put, for instance, the *total* size of all entities being put needs to be under 1 MB. If you put each entity separately (not in a batch), then each entity could be up to 1 MB.
[7:48pm] marzia_google: @angerman these things are things we are working on
[7:48pm] dan_google: Kardax: Similar for batch gets, since the limit applies to API responses.
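Given the batch limit dan_google describes, one option is to split a large put into several calls. The following is a minimal sketch, assuming you have some way to estimate each entity's encoded size: `size_of` is a caller-supplied function, and `split_into_batches`, `MAX_BATCH_BYTES`, and the byte strings in the example are invented for illustration, not part of the SDK. Each resulting batch could then be passed to a single `db.put()` call.

```python
MAX_BATCH_BYTES = 1000000  # assumption: stay a little under the 1 MB limit from the chat


def split_into_batches(items, size_of, limit=MAX_BATCH_BYTES):
    """Group items so that the summed size of each batch stays within `limit`."""
    batches, current, current_size = [], [], 0
    for item in items:
        size = size_of(item)
        if size > limit:
            raise ValueError("single item of %d bytes exceeds the limit" % size)
        if current and current_size + size > limit:
            batches.append(current)  # close the full batch and start a new one
            current, current_size = [], 0
        current.append(item)
        current_size += size
    if current:
        batches.append(current)
    return batches


# Byte strings stand in for encoded entities; the limit is shrunk for the demo.
blobs = [b"a" * 400, b"b" * 400, b"c" * 400, b"d" * 100]
batches = split_into_batches(blobs, len, limit=1000)
```

The request itself adds some encoding overhead on top of the entities, so leaving headroom below the limit is probably wise.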
[7:48pm] marzia_google: though they aren't yet released and I don't have a timeline on when they will be
[7:48pm] marzia_google: aside from the one on the roadmap
[7:48pm] Kardax: dan_google: Ok, makes sense. Is there a good way to tell how large an entity is?
[7:49pm] nwinter: Is a get_by_key_name() lookup significantly faster than a query.get() lookup?
[7:49pm] oktopus: My app allows users to delete records. An evil doer could delete a lot of stuff. Is there an example of a system that stores all added records from users to an archive file?
[7:49pm] dan_google: Kardax: Entity size = kind + key name/id + key path elements + property names + property values.
[7:50pm] nwinter: oktopus: why not have user deletions set a "deleted" flag instead of actually deleting the record and having a separate backup?
[7:50pm] Kardax: dan_google: Ok, that's what I was expecting... good to know
[7:50pm] dan_google: Kardax: Specifically, it's the size of the resulting proto buffer, if you want to dig for that.
[7:50pm] marzia_google: @nwinter, yes, faster
[7:50pm] oktopus: nwinter: yes, that might work for me. Thanx.
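nwinter's suggestion of a "deleted" flag can be sketched as follows. Plain dicts stand in for datastore entities here; in a real model the flag would presumably be a boolean property that normal queries filter on. The function names and the `deleted_at` field are hypothetical.

```python
import time


def soft_delete(record, now=None):
    """Mark a record as deleted instead of removing it."""
    record["deleted"] = True
    record["deleted_at"] = time.time() if now is None else now
    return record


def restore(record):
    """Undo a soft delete, for example after abuse is discovered."""
    record["deleted"] = False
    record["deleted_at"] = None
    return record


def visible(records):
    """Records that normal pages should show."""
    return [r for r in records if not r.get("deleted")]


records = [{"name": "alice"}, {"name": "bob"}]
soft_delete(records[1], now=1236210600)
shown = [r["name"] for r in visible(records)]
```

Because nothing is physically removed, a malicious mass deletion can be reversed by calling `restore` on the affected records.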
[7:51pm] nwinter: Maybe I'm missing this somewhere, but what's the limit to how much the memcache can hold before it starts evicting items?
[7:51pm] Kardax: dan_google: I do know a little about protocol buffers... not surprising to hear that they're integral to the datastore's operation
[7:51pm] marzia_google: @nwinter, it's not listed anywhere, because it depends a lot on usage
[7:52pm] marzia_google: but around 100MB is a safe assumption
[7:52pm] nwinter: oh, sweet!
[7:52pm] angerman: hehe. I'm caching so much on my blog I'm running out of traffic before I run out of CPU
[7:52pm] dan_google: nwinter: Be sure to code defensively, since memcache can evict for other reasons. It's rare, but possible.
[7:53pm] nwinter: will do; was just wondering if it could support the size of my "working set", which it more than can -- very happy to hear that
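"Coding defensively" around memcache usually means treating every read as a possible miss. Here is a minimal sketch of that pattern: `get_or_compute` and `DictCache` are hypothetical names, and the cache is passed in as a parameter so the example runs without the SDK. App Engine's `memcache` module offers `get(key)` and `set(key, value, time=...)` calls of the same shape.

```python
def get_or_compute(cache, key, compute, ttl=900):
    """Return the cached value for `key`, recomputing it on a miss.

    `cache` is any object with get(key) and set(key, value, time=...).
    """
    value = cache.get(key)
    if value is None:  # miss: never stored, expired, or evicted
        value = compute()
        cache.set(key, value, time=ttl)
    return value


class DictCache(object):
    """In-memory stand-in for the demo; it ignores the `time` argument."""

    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value, time=0):
        self.data[key] = value
        return True


cache = DictCache()
calls = []


def render_front_page():
    calls.append(1)  # count how often the expensive work actually runs
    return "<html>front page</html>"


page = get_or_compute(cache, "front_page", render_front_page)
page_again = get_or_compute(cache, "front_page", render_front_page)
cache.data.clear()  # simulate an unexpected eviction
page_after_eviction = get_or_compute(cache, "front_page", render_front_page)
```

One limitation of this sketch: a legitimately cached `None` is indistinguishable from a miss, so values that can be `None` would need a wrapper or sentinel.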
[7:54pm] Kardax: I just wanted to take a moment to say that this chat has been very, very valuable to me
[7:54pm] angerman: dan_google: are there people who rely on existence of objects in the cache(?)
[7:54pm] nwinter: me too; thanks so much, guys!
[7:54pm] marzia_google: great, we still have a couple more minutes
[7:54pm] marzia_google:
[7:55pm] angerman: marzia_google: could you publish a list of "successful"/high traffic app-engine apps?
[7:55pm] dankles: Q: All my static files are returning with HTTP 200 (not being cached). What am I doing wrong?
[7:55pm] Kardax: Good question, dankles
[7:55pm] marzia_google: have you set an expiration on the files? in the app.yaml
[7:55pm] dan_google: angerman: I've seen people suggest techniques that use memcache for inter-request work space without persisting to the datastore. There's nothing inherently wrong with that, as long as you don't mind starting the calculation over or losing results.
[7:55pm] Kardax: Expiration shouldn't be necessary... the browser sends an If-Modified-Since header when it checks a file it has in its cache.
[7:55pm] dankles: by "all" I may have meant "all except for the ones that are"... But yes, I have in app.yaml: default_expiration: "5d"
[7:56pm] Kardax: The server should check the date of the file and compare it against that header value, returning 304 (and no content) if the browser/proxy cache is current.
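The comparison Kardax describes can be sketched with the standard library alone. This is a Python 3 illustration of the general HTTP mechanism, not a description of how App Engine's static file serving works; `conditional_status` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_status(if_modified_since, last_modified):
    """Return 304 if the client's copy is current, else 200.

    `if_modified_since` is the raw header value or None;
    `last_modified` is a timezone-aware datetime for the file.
    """
    if not if_modified_since:
        return 200
    try:
        client_time = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable header: just send the content
    if client_time.tzinfo is None:
        client_time = client_time.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds first.
    if last_modified.replace(microsecond=0) <= client_time:
        return 304
    return 200


modified = datetime(2009, 3, 4, 19, 0, 0, tzinfo=timezone.utc)
fresh_header = format_datetime(modified, usegmt=True)
stale_header = format_datetime(modified - timedelta(days=1), usegmt=True)
```

A handler would send the body only when this returns 200; on 304 it sends headers and no content.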
[7:56pm] Dennis_TW: i'm new to memcache. does that 100mb memcache stay around for days even if app has not been called?
[7:56pm] marzia_google: i don't know any reason off the top of my head this wouldn't work
[7:56pm] marzia_google: if you notice it consistently i would post some headers in the group
[7:57pm] marzia_google: which would be the most helpful, request response headers + app.yaml
[7:57pm] dan_google: Dennis_TW: Yes, memcache eviction is unrelated to app cache eviction.
[7:57pm] dankles: OK will look into it. Am pre-launch, so haven't done deep research yet, just wondering if there was something obvious.
[7:57pm] angerman: Dennis_TW: I had cached items for over 72hs
[7:57pm] Dennis_TW: cool!
[7:58pm] angerman: marzia_google: btw, the memcache.get_statistics behaves different on dev and prod
[7:58pm] dankles: My app is trying to memcache for 1 week, but haven't tested yet.
[7:58pm] marzia_google: yes i believe there was a thread about this
[7:58pm] Kardax: Implementing support for the If-Modified-Since header is super easy... I did it for an ASP.NET image processing app I wrote a few years ago
[7:59pm] Kardax: In the next day or two I'll see if there are any feature requests out for it, and add to it or create one if necessary.
[7:59pm] dankles: Kardax: one man's "super easy" is another man's nightmare   however I was talking about my static files, which is beyond my control.
[7:59pm] savraj left the chat room.
[7:59pm] angerman: dankles: I'm having either never expiring values in the memcache (e.g. for static stuff) where I just cache the response indeterminate... or usually 15minutes
[7:59pm] marzia_google: sorry i can't find it write now
[7:59pm] marzia_google: *right
[8:00pm] Kardax: dankles: Yeah, I know what you mean... it's up to Google to support If-Modified-Since on static files
[8:00pm] dankles: Hey Googlers, thank you very much for the chat & all your hard work!
[8:00pm] angerman: marzia_google: maybe it's even been filed by me, wait a sec
[8:00pm] pranny left the chat room. ("Leaving.")
[8:01pm] angerman: hmm no.
[8:01pm] smwaveforms: thanks !
[8:01pm] angerman: marzia_google: but this one was strange too: http://code.google.com/p/googleappengine/issues/detail?id=1070&can=4&colspec=ID%20Type%20Status%20Priority%20Stars%20Owner%20Summary%20Log%20Component
[8:01pm] oktopus: Thank you, all.
[8:01pm] smwaveforms left the chat room.
[8:01pm] oktopus left the chat room.
[8:02pm] marzia_google: @angerman
[8:02pm] marzia_google:        http://groups.google.com/group/google-appengine/browse_thread/thread/8e0304f1a4ce6ecc/f4e046f773451aa5?lnk=gst&q=memcache+get_stats()%2C+what+do+they+mean%3F#f4e046f773451aa5
[8:02pm] marzia_google: ok, i have to be off
[8:02pm] marzia_google: thanks everyone for coming
[8:02pm] angerman:        http://code.google.com/p/googleappengine/issues/detail?id=624
[8:02pm] Kardax: Thanks for your help
[8:02pm] taylorhughes: @marzia_google thanks
[8:02pm] nwinter: thanks!
[8:02pm] marzia_google: as always the transcript will be on the group tomorrow!
[8:02pm] angerman: that's the memcache issue
[8:03pm] marzia_google: goodnight!

ryan

Mar 6, 2009, 6:09:57 PM

to Google App Engine

good chat! sorry i missed it. i have a few comments, inline:

On Mar 5, 11:27 am, Marzia Niccolai <ma...@google.com> wrote:
> [7:05pm] amichail: How to fix an index that is stuck on "Building"?

...

> [7:06pm] marzia_google: if an index has been in building more than 24 hours
> you will need to send an email to the group and someone will be able to fix
> it

more info:

http://code.google.com/appengine/articles/index_building.html
http://code.google.com/appengine/docs/python/datastore/queriesandindexes.html#Big_Entities_and_Exploding_Indexes

> [7:19pm] Kardax: Ok, next question: we have db.run_in_transaction; is there
> a chance of getting a db.run_in_snapshot?

> [7:20pm] Kardax: I'd like to be able to read an entity group from a
> consistent state without the overhead of a transaction.

db.run_in_transaction() actually adds very little extra overhead. for
read-only transactions, all it adds is a single disk seek to read the
timestamp of the entity group's log header. that read will always be
smaller than any of get(), and more importantly for latency, it's only
a single disk seek, as opposed to queries, which take multiple disk
seeks.

given that, i'd recommend using db.run_in_transaction(). you should
only see minimal performance impact.

> [7:19pm] dankles: Q: What (if any) progress / plans are there on reducing
> Datastore CPU usage?

> [7:22pm] marzia_google: @dankles there isn't anything specific to report wrt
> to datastore cpu

we've actually made a number of solid improvements to datastore CPU in
the last couple months, both to how much CPU we actually use and to
the accuracy of our measurements.

having said that, one useful thing to note is that CPU is a broad
category. along with pure CPU, it also includes the cost of disk
accesses, particularly disk seeks, which are slow (relative to pure
CPU and in-memory operations) and expensive.

> [7:27pm] dankles: Q: Not sure if this has come up... Any plans to
> offer/partner with CDN, so a few lines in app.yaml could push a directory of
> static assets out for efficient serving?

...

> [7:30pm] dankles: @angerman: but GAE isn't really well suited for being a
> CDN

> [7:33pm] dankles: @angerman : a good CDN has servers all over the world,
> serving each user from the closest edge, and optimized for low bandwidth
> costs, static assets... GAE design goals were different, as it clusters
> around itself + Datastore, etc.

that's true for serving dynamic requests, but static files aren't
subject to those constraints. you'll probably find that static files
served through app engine are already fairly efficient and low
latency, regardless of where you are in the world, and we expect that
to get even better in the future.

of course, no need to take our word for it. you can always measure
yourself and compare to other CDNs, or look at a few of the blog posts
and threads on the group that have already done that.

> [7:35pm] Kardax: Another question: How "big" can an entity group be before
> problems are encountered (besides concurrency)? What's the definition of
> "big"? What problems are seen?
> [7:36pm] marzia_google: with entity groups, the biggest issue is contention
> [7:36pm] marzia_google: since writes to an entity group happen serially

> [7:37pm] marzia_google: so an entity group is too 'big', i would say, when
> it can't reasonably handle the number of writes to the group

...

> [7:38pm] Kardax: I'm more concerned with physical size limitations... for
> example, if I can make reasonably sure that contention will be low, but it
> could have 10's of thousands of entities in size, is that going to work?

marzia's absolutely right here. there isn't really any meaningful
limit to entity group size, either bytes or number of entities. the
main thing entity groups limit is write throughput, not size.

> [7:55pm] angerman: marzia_google: could you publish a list of
> "successful"/high traffic app-engine apps?

definitely! http://code.google.com/appengine/casestudies.html

ryan

Mar 6, 2009, 6:13:42 PM

to Google App Engine

On Mar 6, 3:09 pm, ryan <ryanb+appeng...@google.com> wrote:
> db.run_in_transaction() actually adds very little extra overhead. for
> read-only transactions, all it adds is a single disk seek to read the
> timestamp of the entity group's log header. that read will always be
> smaller than any of get(), and more importantly for latency, it's only
> a single disk seek, as opposed to queries, which take multiple disk
> seeks.

another way to phrase this is, if you only do reads inside
db.run_in_transaction(), it effectively already is run_in_snapshot().
it provides the "read a consistent snapshot" functionality with the
minimum possible overhead.
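For Kardax's case, where the next key is only known after reading the previous node, ryan's advice amounts to doing the whole walk inside one read-only transaction. A rough sketch follows. To keep it runnable without the SDK, the transaction runner and the getter are parameters: on App Engine you would presumably pass `db.run_in_transaction` and `db.get`. The helper `read_chain`, the dict-shaped nodes, and the `next` field are assumptions for illustration; real model instances would expose the link as a property instead, and all entities would need to be in the same entity group.

```python
def read_chain(run_in_transaction, get, start_key, max_nodes=1000):
    """Follow 'next' pointers inside one transaction and return the nodes read."""
    def txn():
        nodes, key = [], start_key
        while key is not None and len(nodes) < max_nodes:
            node = get(key)
            if node is None:  # dangling pointer: stop rather than fail
                break
            nodes.append(node)
            key = node.get("next")
        return nodes
    return run_in_transaction(txn)


# Stand-ins for the datastore: a dict of nodes and a pass-through "transaction".
store = {
    "root": {"value": 1, "next": "leaf-a"},
    "leaf-a": {"value": 2, "next": "leaf-b"},
    "leaf-b": {"value": 3, "next": None},
}


def fake_transaction(fn, *args, **kwargs):
    return fn(*args, **kwargs)


values = [n["value"] for n in read_chain(fake_transaction, store.get, "root")]
```

The `max_nodes` cap is a safety net against cycles in the chain; it is not something the chat mentions.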
