Chat Time Transcript December 17

Marzia Niccolai

Dec 17, 2008, 1:43:14 PM
to Google App Engine
[09:00am] marzia_google: well, I have my morning coffee in hand
[09:00am] marzia_google: so
[09:00am] dan_google_ joined the chat room.
[09:00am] MBoffin joined the chat room.
[09:00am] marzia_google: it's time to call this session of app engine chat time to order
[09:01am] angerman: ohh right... is it 9am pst?
[09:01am] scudder: do we have quorum? :-p
[09:01am] marzia_google: today (in addition to me), dan_google, danielobrien and scudder join us
[09:01am] marzia_google: and i may have missed people
[09:02am] marzia_google: so i would like to take a minute to do a bit of a plug
[09:02am] marzia_google: on our new system status dashboard: http://code.google.com/status/appengine
[09:02am] marzia_google: just because i think it's awesome
[09:02am] marzia_google: *end shameless plug*
[09:02am] marzia_google: :)
[09:04am] dan_google_: The new quota details section of the admin console is pretty awesome, too.  :)
[09:04am] bFlood: it is awesome, thx
[09:04am] bFlood: billing looks great as well
[09:04am] marzia_google: yes, the team has been putting in a lot of work to get all of these things out
[09:06am] scudder: Yesterday was quite a day for releases, announcements, ...
[09:06am] scudder: What questions do you all have?
[09:06am] amichail: Why not provide an option for profit sharing, without billing at all?
[09:06am] angerman: amichail: count me in on that :)
[09:07am] marzia_google: by profit sharing, i assume you mean an adsense integration?
[09:07am] amichail: yes
[09:07am] bFlood: will cPickle be added in the future?
[09:07am] MBoffin: That would be nice. Sort of keep yourself paid up automatically.
[09:08am] marzia_google: it's definitely a great feature request, but we went initially with something that would work for everyone
[09:09am] tw3k left the chat room. ("Lost terminal")
[09:09am] danielobrien: bFlood: I don't recall any recent discussion on adding cPickle support. It's probably worth adding to the public issue tracker as a feature request.
[09:09am] picalolabu: when do you expect to roll out billing?
[09:11am] tpiep_ joined the chat room.
[09:11am] bFlood: danielobrien: will do, thx. seems like an easy add with mucho perf gains for storage/retrieval
[09:11am] Lennie joined the chat room.
[09:11am] jlivni joined the chat room.
[09:12am] scudder: picalolabu: I can't say yet when billing will be available for everyone, but we plan to allow trusted testers to start using billing this week.
[09:13am] scudder: so a few people will be using billing very soon
[09:13am] picalolabu: great, thx
[09:14am] danielobrien: bFlood: If I recall, the main blocker was securing cPickle. Since we use pickle in our own API modules, though, I can understand the demand for it.
[09:15am] Lennie is now known as Lennie|Food.
[09:15am] matija: Hi everybody I have several questions
[09:15am] matija: Do you plan to inform us about average cost for some standard python statements and some python/app engine apis or even allow us somehow to test by our self, so that we can predict cpu usage during design and not after putting it to production ?
[09:15am] ryan_google joined the chat room.
[09:15am] jcgregorio joined the chat room.
[09:15am] jcgregorio is now known as jcg_google.
[09:16am] Wooble: that sounds nearly impossible.
[09:16am] dan_google left the chat room. (Read error: 110 (Connection timed out))
[09:16am] dan_google_: You can profile your own code to come up with your own estimate.
[09:16am] tpiep left the chat room. (Read error: 110 (Connection timed out))
[09:17am] danielobrien: You'll also still have free quota once billing is launched.
[09:17am] danielobrien: So you'll still be able to test your consumption rates within those limits.
[09:17am] amichail: Transactions and inverted indices don't go well together.  What to do?
[09:18am] ryan_google: amichail: i think i glanced at your group post on that. i take it you can't do what you want with a standard datastore query?
[09:18am] amichail: the problem is updating the inverted index iff a user item was stored successfully
[09:18am] matija: how can I profile if I don't know will my design go under quota, let say extensive simplejson usage for example in connection with client javascript side...
[09:19am] hp joined the chat room.
[09:19am] ryan_google: yes, i understand the problem. if you want to store your own index that crosses entity groups, you can't really make updates transactional. that's due to the datastore core architecture.
[09:20am] amichail: so how do I work around that?
[09:20am] matija: Is it okay to have more than one write in one web request in relation to quota ? Let's say that my every request has two datastore write operation and I have 10 requests per minute (so that I can't avoid high CPU quota credits) will I be ever 'punished' if I stay under maximum request time and under daily quotas?
[09:20am] ryan_google: i'm more curious why the built-in and developer-defined indices themselves aren't good enough
[09:20am] ryan_google: finding your group post...
[09:20am] ryan_google: never mind, the group post doesn't have much more detail.
[09:20am] hp: Is it possible to get web.py's error pages while working with app engine?
[09:21am] ianbicking left the chat room. (Read error: 104 (Connection reset by peer))
[09:21am] ianbicking joined the chat room.
[09:22am] dan_google_: matija: Are you asking if more than one datastore update would fit within the request deadline, or the high CPU quota?
[09:22am] marzia_google: hp: I imagine that it's possible
[09:22am] amichail: How would you support AND queries say without an inverted index?
[09:22am] ryan_google: filters are always ANDed together
[09:23am] ryan_google: what kind of query are you trying to do?
[09:23am] marzia_google: you would just render the web.py error page when you catch an exception
[09:23am] amichail: I'm doing tag lookups.
[09:23am] matija: High cpu quota
[09:23am] ryan_google: sure, you shouldn't need a custom index for that
[09:24am] amichail: Given a query with words w1...wn, I find items with at least one tag containing all those words.
[09:24am] dw_ joined the chat room.
[09:24am] ryan_google: do you mean phrase match?
[09:25am] ryan_google: ie find items with the exact tag "w1 w2 ... wn"?
[09:25am] scudder: matija: in most cases multiple writes per request would be no problem (though it is possible to shoot yourself in the foot with high contention)
[09:25am] amichail: no.. the tag may contain more words
[09:25am] hp: marzia: I don't know how to explicitly render web.py's error page but I guess I could look this up... would you say it is a bad idea though?
[09:25am] amichail: order doesn't matter
[09:25am] ryan_google: ok. then you're basically asking for full text search
[09:25am] ryan_google: more or less
[09:26am] scudder: matija: The high CPU request limit applies to runtime CPU and not datastore CPU directly
[09:26am] amichail: entity groups are used for efficiency?  why not use them only as efficiency hints?
[09:26am] ryan_google: appengine.ext.search does that right now, but it's weak, which we're aware of, and we're working on something better
[09:26am] ryan_google: entity groups are the transactional unit
[09:26am] amichail: but why? efficiency?
[09:26am] matija:  Somewhere in documentation/groups etc... is written that datastore updates aren't counted for high cpu quota but there are warnings in log only for information purpose. Is this true ?
[09:26am] scudder: for more details on high CPU request limits, see this FAQ (and let me know if any of it is unclear) http://code.google.com/appengine/kb/general.html#highcpu
[09:26am] ryan_google: they do also contribute to locality in the Entities table, but that's secondary
[09:27am] amichail: why are entity groups the transactional unit?
[09:28am] skiv02 joined the chat room.
[09:28am] ryan_google: it's a distributed system; we couldn't include top-level or kind-level locking and have it scale. entity groups are the design we went with.
[09:29am] ryan_google: i also misspoke there, since we use optimistic concurrency, not locking. more on the group, e.g. http://groups.google.com/group/google-appengine/browse_thread/thread/52d6faa1c9a131dc/1fd2d0546c1c9ba2
[09:29am] matija: scudder: Contention? in index creation related way or transaction isolation way. If you mean about that is my design problem.
[09:30am] amichail: so there is some complicated protocol to update the inverted index eventually?
[09:30am] jcg_google: matija: contention if you are trying to write to the same entity or entity group
[09:30am] ryan_google: there's also a description of how entity groups and txes work in http://snarfed.org/space/datastore_talk.html and http://sites.google.com/site/io/under-the-covers-of-the-google-app-engine-datastore
[09:31am] ryan_google: hmm. by inverted index, do you mean appengine.ext.search? or if you were to build one yourself?
[09:31am] jasonadams joined the chat room.
[09:31am] scudder: matija: by contention I mean conflicting writes to the same entity or entity group. Since writes are atomic/transactional, they will be rolled back and retried in the case of a conflicting write
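The retry behaviour described here can be pictured with a small in-memory sketch. It is not the datastore's implementation: `VersionedStore`, `ConflictError` and `run_with_retries` are hypothetical names, and a single version number stands in for whatever the datastore tracks per entity group. It only illustrates the idea of optimistic concurrency, where a write is rejected and retried if the group changed after it was read.

```python
class ConflictError(Exception):
    """Raised when another writer committed first (hypothetical name)."""


class VersionedStore:
    """In-memory stand-in for one entity group with a version number."""

    def __init__(self):
        self.version = 0
        self.data = {}

    def commit(self, read_version, changes):
        # Optimistic check: reject the write if the group changed since it was read.
        if read_version != self.version:
            raise ConflictError("entity group changed since it was read")
        self.data.update(changes)
        self.version += 1


def run_with_retries(store, update_fn, retries=3):
    """Read, compute and commit; start over if a conflicting write won."""
    for _ in range(retries):
        read_version = store.version
        snapshot = dict(store.data)
        changes = update_fn(snapshot)
        try:
            store.commit(read_version, changes)
            return True
        except ConflictError:
            continue  # re-read the latest state and try again
    return False
```

Each retry repeats the read and the computation, which is one way to see why heavy contention on a single entity group tends to show up as extra datastore CPU time.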
[09:31am] amichail: both... they should work the same way no?
[09:32am] ryan_google: sure. there's no complex protocol. appengine.ext.search just tokenizes all of an entity's text into words, then stores the list of unique words in a list property
[09:32am] scudder: matija: a common indicator of write contention is a high level of Datastore CPU Time consumption
[09:32am] matija: scudder: Let's say that I don't have updates than only inserts. Then I will have no contention problem or there is some limit ?
[09:33am] ryan_google: to query, it does WHERE __searchable_text_index = 'w1' AND __searchable_text_index = 'w2' AND ...
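A rough pure-Python sketch of the approach ryan_google describes: tokenize an entity's text into a list of unique words, then answer a query by requiring every query word to be in that list. This is not the code from google/appengine/ext/search; the function names, the sample entities and the tokenizing regex are assumptions made for illustration.

```python
import re


def tokenize(text):
    """Return the sorted unique lowercase words in text."""
    return sorted(set(re.findall(r"[a-z0-9]+", text.lower())))


def matches_all(index_words, query):
    """True if every query word is in the entity's word list (an AND query)."""
    return all(word in index_words for word in tokenize(query))


# Hypothetical entities; the word list plays the role of the list property.
entities = {
    "post1": "App Engine chat time transcript",
    "post2": "Datastore transactions and entity groups",
}
index = {key: tokenize(text) for key, text in entities.items()}


def search(query):
    """Return the keys of entities whose word list contains all query words."""
    return sorted(key for key, words in index.items() if matches_all(words, query))
```

For example, `search("chat engine")` matches only `post1` regardless of word order, which corresponds to the `... = 'w1' AND ... = 'w2'` filter above.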
[09:33am] jasonadams left the chat room. (Client Quit)
[09:33am] amichail: but I guess you can guarantee that indexing will take place where that is harder to do from an app
[09:33am] scudder: matija: it would depend if the inserts were within the same entity group, meaning if your inserts have the same parent entity, you may see contention
[09:33am] ryan_google: actually, no, appengine.ext.search is entirely in user land
[09:33am] amichail: but you can't use transactions
[09:33am] ryan_google: the index update is transactional because the index data is stored as a property on the entity
[09:34am] ryan_google: not in a separate entity
[09:34am] scudder: matija: but in general the datastore will support a large number of concurrent inserts
[09:34am] ryan_google: you can see the code in the sdk, in google/appengine/ext/search/__init__.py
[09:35am] ryan_google: or http://code.google.com/p/googleappengine/source/browse/trunk/google/appengine/ext/search/__init__.py
[09:35am] matija: scudder: Why I asked about multiple writes are counters. If I use counter for something, usually I will write also something else to database. If I do two writes in separate request I have problem if first finishes and second not.
[09:36am] scudder: matija: counters are the classic example of a candidate for datastore contention :) and there are several great solutions out there
[09:36am] cancerbero_sgx joined the chat room.
[09:36am] scudder: matija: this video might be helpful http://sites.google.com/site/io/building-scalable-web-applications-with-google-app-engine
[09:37am] matija: scudder: like... I am using shard counter princip... and I have watched video...
[09:37am] scudder: matija: ok :)
[09:38am] ropiku joined the chat room.
[09:38am] scudder: matija: I think I may have missed something, did you want the counter increment and insert to happen in one transaction?
[09:39am] cancerbero_sgx: hi all. I developed some little math utilities as a appengine application (http://mylocalcollection.appspot.com/) and also make a gadget that works ok in igoogle (http://mylocalcollection.appspot.com/static/gadget1.xml). My question is: is it possible to show this gadget inside myspace??? or google gadgets only works in igoogle ? any explanation about this is very appreciated
[09:39am] matija: scudder: Let's say I have comments in my web application and with every comment I also increment counter for comments. But now I do this in two ajax requests. First can return successfully and second not. Is this good candidate for two write operation request ?
[09:39am] dobee left the chat room.
[09:40am] dan_google_: cancerbero_sgx: You can definitely use App Engine to make MySpace gadgets.
[09:40am] scudder: matija: yes I think incrementing the sharded counter and performing the write in one request is a good idea
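For readers unfamiliar with the sharded counter idea, here is a minimal in-memory sketch, assuming a fixed number of shards and random shard selection. It does not use the datastore API; `ShardedCounter` and `add_comment` are hypothetical names, and plain Python lists stand in for entities.

```python
import random


class ShardedCounter:
    """Spread increments over several shards so writers rarely collide."""

    def __init__(self, num_shards=20):
        self.shards = [0] * num_shards

    def increment(self):
        # Pick a shard at random; concurrent writers mostly hit different shards.
        index = random.randrange(len(self.shards))
        self.shards[index] += 1

    def total(self):
        # Reading the count means summing every shard.
        return sum(self.shards)


def add_comment(comments, counter, text):
    """Store the comment and bump the counter in the same request handler."""
    comments.append(text)
    counter.increment()
```

Doing both writes in one request, as suggested above, avoids the case where one Ajax call succeeds and the other never arrives. In this sketch the two writes are still not atomic with respect to each other.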
[09:40am] cancerbero_sgx: dan_google: ok but you are saying that myspaces gadgets are not the same technologie that igoogle gadgets ?
[09:41am] matija: scudder: But if somehow I have more request per minute (beyond high cpu quota limit) will I be punished for that ?
[09:42am] matija: Do you plan to create somehow different web documentation style like let's say MSDN or Sql Server Books On Line (yes I was last ten years 'Microsoft technology man'). Current style is nice for first reading (book style), but after a while I usually spend few seconds/minutes too much time to find something that I have read but I don't know where. Complete reference guide with examples.
[09:42am] dan_google_: cancerbero: If your gadget is an OpenSocial gadget, it should work in both iGoogle and MySpace.
[09:42am] kodisha joined the chat room.
[09:42am] matija: ListProperty usage. Are ListProperty designed primarily for storage and less for querying? Lets say in some usage there is 1:100 ratio in write:searching operation. Is it efficient to have one write (three items in one ListProperty) and querying that attribute for hundred times or three writes each item in separate entity and than querying that attribute?
[09:42am] dan_google_: cancerbero: http://wiki.opensocial.org/index.php?title=Main_Page#Container_Information
[09:42am] matija: When sending e-mail through mail API without filling to field (only bcc or cc field),  why is app engine changing from field to some strange x...@apphosting.bounces.google.com value, although it should leave user's email address?
[09:42am] matija: Is name for email quota 'Recipients Emailed' accurate? Because if I send email to 10 people in cc or bcc field, quota usage changes only for 1.
[09:42am] dan_google_: My understanding is that iGoogle's OpenSocial support is limited to developers (you) at the moment.
[09:44am] dan_google_: The older gadget standard for iGoogle (the one that's live) is not the same as OpenSocial.  MySpace doesn't support the older Google gadget protocol.
[09:44am] scudder: matija: if doing both writes in a single request triggers a high CPU warning in the logs, then I guess it would be better to split it into two requests. I would be surprised if the request takes more than 1000 runtime megacycles (though it could if there is something highly processor intensive).
[09:46am] cancerbero_sgx: dan_google: thanks for the info!
[09:46am] scudder: matija: if you were just asking about how rapidly your Ajax page can make HTTP requests, there isn't a simple set limit it depends on overall resource usage (bandwidth, requests, etc.)
[09:46am] dan_google_: cancerbero: You're welcome!
[09:47am] lmorchard is now known as lmorchard|away.
[09:48am] picalolabu left the chat room. ("Computer went to sleep")
[09:48am] matija: scudder: Now I am under 1000 mcycles quota but with two request for something what would be good to do in one request. Counting comments number isn't to important. Maybe information or pageing purpose. But what If my design needs to be certain that two writes are finished or no one is. What now ?
[09:49am] matija: Hey, have google people read my other questions ?
[09:49am] matija: ListProperty usage and maybe mail api bug.
[09:49am] angerman: yeeeeha. my app-engine blog works again... [after being stupid] :)
[09:50am] danielobrien: The mail bug is something I'd need to look into - have you filed it as an issue in our public issue tracker already?
[09:50am] • angerman just hopes that the lower level caching will reduce workload :)
[09:50am] Tim_ is now known as fiffle.
[09:50am] matija: no... I noticed it today, few hours before...
[09:50am] ryan_google: matija: sorry, i'm not seeing your list property question. mind repeating it?
[09:50am] matija: ListProperty usage. Are ListProperty designed primarily for storage and less for querying? Lets say in some usage there is 1:100 ratio in write:searching operation. Is it efficient to have one write (three items in one ListProperty) and querying that attribute for hundred times or three writes each item in separate entity and than querying that attribute?
[09:51am] dan_google_: ListProperty is definitely useful for queries.  Each value of a ListProperty has a corresponding row in an index table, allowing for easy membership tests.
[09:51am] cancerbero_sgx: dan_google_: if you have a minit, please visit my appengine application: now it contains a julia fractal generator and a visual matrix calculator... http://mylocalcollection.appspot.com/ I think you will get impressed.. :-) the idea is to use those windows maximized when the app is showed in a gadget...
[09:52am] matija: Performance?
[09:52am] warreninaustinte joined the chat room.
[09:52am] dan_google_: Just watch out for combinatoric explosions: an index of two multi-valued properties will have a row for each combination of values.
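The row count dan_google_ warns about can be estimated with a short sketch, assuming one index row per combination of one value from each multi-valued property. The property values and the helper name `composite_index_rows` are made up for illustration.

```python
from itertools import product


def composite_index_rows(*value_lists):
    """One row per combination, taking one value from each property."""
    return list(product(*value_lists))


# Hypothetical entity with two list properties.
tags = ["python", "appengine", "datastore"]
authors = ["ann", "bob"]

rows = composite_index_rows(tags, authors)
print(len(rows))  # 3 values x 2 values = 6 rows
```

The count is the product of the list lengths, so two lists of 100 values each would already imply 10,000 rows for a single entity under this assumption.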
[09:52am] ryan_google: matija: +1 to dan's answer. for that specific example, for queries, the list property in a single entity will be marginally more efficient than a single-valued property in three entities.
[09:52am] dan_google_: matija: Can you give more detail about your performance question?
[09:53am] scudder: matija: re ensuring two writes, it sounds like this is a good place to use transactions
[09:53am] ryan_google: writing the single entity with a list property will also be marginally more efficient than writing three entities
[09:53am] dan_google_: matija: By "querying that attribute," do you mean a membership test, or retrieving the entity?
[09:53am] matija: retrieving the entity
[09:53am] scudder: By my count matija is holding ~four conversations at once! :-)
[09:54am] matija: multithreading :)
[09:54am] dan_google_: cancerbero: impressed!
[09:54am] angerman: yes. email counting in the dashboard is broken for me too
[09:54am] angerman: I did not send 7 emails... non of my programs knows about sending emails
[09:54am] amichail: but put([...]) does not ensure that all are written or none are, correct?
[09:55am] ryan_google: amichail: correct, batch puts are atomic within entity groups, but not across them
[09:55am] ryan_google: if the put() call returns without raising an exception, though, then all of them were written
[09:56am] dobee joined the chat room.
[09:56am] dan_google_: If you're seeing unusual behavior in the quota reporting, please file an issue so we can track it.  Whether we need to fix the report, or just fix our explanation, we'd like to fix it.
[09:56am] MunkyJunk left the chat room.
[09:57am] moraes: hey marzia_google and googlers, congrats for the new dashboard stuff. pretty awesome.
[09:57am] amichail: if the app request exceeds its deadline, then I can't check for the exception.
[09:57am] • moraes has no questions, so he will just say 'weee'
[09:57am] marzia_google: i will pass along your compliments to the rest of the team
[09:57am] scudder: matija: if you wanted to be really sure that your sharded counter didn't update without a successful write, you _could_ make the comment a child of the counter shard. This would probably be ok in terms of write contention, unless the comment writes conflict over their parent shard
[09:57am] marzia_google: we're pretty excited about the new releases
[09:57am] warreninaustinte: yes, thanks on the new dashboard stuff
[09:58am] cancerbero_sgx: dan_google_: thanks... the nice thing of those web applications is that they are normal java desktop applications (use swt gui toolkit). so they run in a java vm and in the browser, just almost everywhere..... (I use the amazing java2script project)
[09:59am] ryan_google: amichail: you can catch DeadlineExceededError and handle it, as long as you don't take too long
[09:59am] angerman: hmmm... "BadRequestError: can't operate on multiple entity groups in a single transaction." I only operate on one...
[09:59am] dan_google_: amichail: You actually get a runtime.DeadlineExceededError when you're about to hit the request deadline.  You get a very brief period of time to prepare a friendly error message.
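A hedged sketch of the pattern described here: wrap the request's work in a try/except and return a short, friendly message when the deadline error is raised. The import is the one from the Python runtime; the fallback class is only a local stand-in so the sketch runs outside App Engine, and `handle_request` with its status-tuple return value is a hypothetical helper, not part of any framework.

```python
try:
    from google.appengine.runtime import DeadlineExceededError
except ImportError:
    class DeadlineExceededError(Exception):
        """Local stand-in so the sketch runs outside App Engine."""


def handle_request(do_work):
    """Run do_work() and return (status, body), handling a deadline error."""
    try:
        return 200, do_work()
    except DeadlineExceededError:
        # Keep this path short: only a brief window remains after the error.
        return 500, "Sorry, this request took too long. Please try again."
```

The handler does not know how much of `do_work` completed before the error, so any cleanup done in the except branch should be cheap and should not assume earlier writes succeeded.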
[09:59am] matija: There was some tag related problem. Is it more efficient to fill one StringListproperty with few items (always less than ten) and than querying for entities that have some tag. Or to have n (less than ten) write operations in separate 'table' and to query StringProperty and somehow on client side accumulate data.
[09:59am] ryan_google: amichail: see the Request Timer section in http://code.google.com/appengine/docs/python/requestsandcgi.html
[10:00am] amichail: With scalable apps, do you sometimes give up on correctness and settle for something that works most of the time?
[10:00am] ryan_google: matija: the answer is the same as before, it's generally more efficient to use a single entity and a list property.
[10:00am] ryan_google: angerman: got a code snippet?
[10:00am] matija: scudder: hm... i need to see...
[10:00am] r0ver left the chat room. (Read error: 60 (Operation timed out))
[10:00am] angerman: ryan_google: just preparing :)
[10:00am] ryan_google: amichail: not as a general rule, no. that would be pretty depressing.
[10:00am] matija: ryan_google: excellent...
[10:01am] angerman: http://pastie.org/341509
[10:01am] amichail: transactions are problematic though...
[10:01am] ryan_google: transactions that cross entity groups, yes. you could implement distributed transactions right now, though, entirely in userland.
[10:02am] matija: scudder: tnx for suggestions
[10:02am] ryan_google: we're actually thinking about distributed txes ourselves now. they're not a priority, but we have a few ideas
[10:03am] angerman: ryan_google: the idea is to duplicate the entity. but only put the newly created one or the old one...
[10:03am] angerman: so the put on db.Models only receives 1 entity
[10:03am] ryan_google: angerman: is this from the sdk? if so, which version?
[10:03am] angerman: ryan_google: it's from my live deployment at http://journal.moritzangermann.com
[10:04am] angerman: I think it did work locally before I deployed it
[10:05am] ryan_google: ok. is it reproducible, deterministically?
[10:05am] ryan_google: if it is, but the same thing works in the sdk, then that's a bug on our part
[10:05am] bFlood: if I create a dynamic kind via datastore.Entity(), is there a way to add a composite index to it (something that normally would be defined in index.yaml)
[10:05am] angerman: yes it does work on the sdk 1.1.7
[10:05am] scudder: matija: you're welcome, just to clarify, the reason the comment would be a child of the counter shard would be to allow the update and insert to happen in a single transaction (in case this was unclear to others)
[10:05am] angerman: ryan_google: seems pretty reproducible
[10:05am] angerman: ryan_google: wait a second i'll upload the whole code to github
[10:07am] angerman: ryan_google: http://github.com/angerman/reflection-blog/
[10:08am] angerman: The only thing I can think of is that it somehow tracks the old entity... only why I don't get
[10:08am] ryan_google: angerman: out of curiosity, why super(Post, self).put() instead of just self.put() ?
[10:08am] angerman: ryan_google: because I'm overwriting put
[10:09am] matija: scudder: problem could still evolve if between two request somehow connection breaks... so with no contention problem ever...
[10:10am] ryan_google: angerman: hmm, ok. do you always use that idiom in this class?
[10:10am] angerman: it's pretty much my only class :)
[10:10am] ryan_google: ok. the other odd line is:
[10:10am] ryan_google: if not self.is_saved() and self.get(self.key()):
[10:10am] ryan_google: how could you get it if it's not saved?
[10:11am] Lennie|Food is now known as Lennie.
[10:11am] angerman: ryan_google: because eventually I have a key collision
[10:12am] marzia_google: bflood: there is no way to programmatically create a composite index
[10:12am] QuickT joined the chat room.
[10:12am] moraes: angerman, that will be always False
[10:12am] marzia_google: you can anticipate dynamically creating an entity and add it to your app.yaml
[10:12am] ryan_google: angerman: actually, it looks like the second part of that conditional never happens
[10:12am] ryan_google: key() raises an exception if is_saved() returns false
[10:12am] marzia_google: you can always do the typical = queries without a composite index
[10:12am] marzia_google: rather index.yaml
[10:13am] bFlood: given proper login, can we talk directly to http://appengine.google.com/api/datastore/index/add?
[10:13am] ryan_google: actually, bFlood: if you know the kind name, you can create an index for it manually in index.yaml
[10:13am] bFlood: (like in appcfg.py)
[10:13am] ryan_google: you don't need a db.Model or db.Expando subclass for it in code
[10:13am] ryan_google: bFlood: yes, you can
[10:14am] ropiku_ joined the chat room.
[10:14am] ryan_google: we've definitely anticipated having third party code that talks to the admin console
[10:14am] angerman: moraes:, ryan_google if that would always be false, I could not raise a NonUniqueException
[10:14am] angerman: but I can do that
[10:14am] ryan_google: the mac launcher is one example of a non-appcfg.py program that talks to the admin console
[10:15am] ryan_google: angerman: do you override is_saved() or key()?
[10:15am] angerman: ryan_google: I inject a "key_name" during __init__
[10:15am] angerman: http://github.com/angerman/reflection-blog/tree/master/blog/models.py
[10:16am] angerman: lines 50 following
[10:16am] ryan_google: ah, i see
[10:16am] dan_google_: I gotta head out.  Thanks, all!
[10:16am] dan_google_ left the chat room.
[10:16am] marzia_google: yes this seems a good time to officially end the chat
[10:16am] marzia_google: though I'll be hanging out for a few more minutes