How to move data reliably from frontends to backends?


Richard

unread,
Jul 29, 2012, 10:19:25 AM7/29/12
to google-a...@googlegroups.com
This is for all you guys who know app engine really well:

I want to be able to move data RELIABLY and with low latency from many front end instances to a backend within 5 seconds.

I am currently getting the F1's to do a db.put() on a lightweight object.  The data comes from web clients.  Around 10 seconds later, the backend reaps them.  This should work nicely in theory.

However, it is just unreliable.  Sometimes it will handle a load of 600 db put()'s in that 10 second window.... and other times (in the same day!), it won't even have completed 100 put()'s in 10 seconds.... so when I do the reap... I get nothing!   It seems put() has some problems... 'eventually consistent' is useless if it is several minutes later!

How can I do this reliably in a 10 second window?   I have had a PULL queue suggested, but I don't want to do all the work of converting the app over if it will be just as unreliable (I remember posts about task queues getting "stuck")....


Darien Caldwell

unread,
Jul 29, 2012, 4:34:42 PM7/29/12
to google-a...@googlegroups.com
Memcache?

Jeff Schnitzer

unread,
Jul 29, 2012, 7:42:04 PM7/29/12
to google-a...@googlegroups.com
On Sun, Jul 29, 2012 at 7:19 AM, Richard <stev...@gmail.com> wrote:
>
> How can I do this reliably in a 10 second window ? I have had using a PULL
> queue suggested, but I don't want to do all the work of converting the app
> over if it will be just as unreliable (I remember posts about tasksqueue's
> getting "stuck")....

The task queues that have gotten "stuck" in the past were push queues;
basically, the machines that do the pushing fell behind (or were
temporarily suspended due to problems). I haven't heard of any
equivalent problem with pull queues. The queue would have to start
throwing errors on service calls, or "losing" tasks... probably not
impossible, but on par with "datastore requests failing". It would be
a major failure.

Does anyone have any comments about the reliability of pull queues?

I'm only just now starting to work with pull queues. While I do allow
that my push queues may not fire in a timely manner, I'm engineering
to expect pull queues to always be available.

Jeff

David Hardwick

unread,
Jul 30, 2012, 10:40:23 AM7/30/12
to Google App Engine
We do pull queues on backends, and use the pipeline library to ensure
reliability.



On Jul 29, 7:42 pm, Jeff Schnitzer <j...@infohazard.org> wrote:

Richard

unread,
Jul 30, 2012, 2:54:37 PM7/30/12
to google-a...@googlegroups.com
Well, I need performance of at least 1k/second throughput, including non-batch adds from many F1's and batch dequeue/reaping.

Richard

unread,
Jul 31, 2012, 11:38:22 AM7/31/12
to google-a...@googlegroups.com
Update.  Tried PULL queues.  Some entries arrive > 10 seconds later into the queue.  This completely won't work for my application.  This was occurring when the total adds to the queue was around 180 over a 1-5 second

D X

unread,
Jul 31, 2012, 12:21:03 PM7/31/12
to google-a...@googlegroups.com
How are you doing the reaps?  Are you doing an eventually-consistent query, or fully-consistent get by keys?

Richard

unread,
Jul 31, 2012, 12:35:26 PM7/31/12
to google-a...@googlegroups.com
Reaping is done using:

    allScores = Score.all()
    for s in allScores:
         etc

I could store each entry with a unique key (user's id), but my understanding is that supplying a key name does not improve things.  I don't know who is in the game a priori, so I don't know how to do a fully consistent 'get' in batch mode.....

Michael Hermus

unread,
Jul 31, 2012, 2:04:24 PM7/31/12
to google-a...@googlegroups.com
Since I know very few details, I could be completely wrong, but it seems like this is a candidate for the typical high throughput 'fan-in' batch processing. The way I prefer to do this is:

1) Add each work item to a Pull queue
2) Create a named task on a Push queue at your desired interval (say 1 second)
3) The named task executes (on a front-end or back-end according to your needs) and pulls the existing work from the Pull queue in batch
4) Process the batch of work as desired (i.e. update the leader-board)

You still cannot guarantee latency, and I think it is hard to do that under any circumstances, but it does eliminate the need to write temporary work objects to the datastore (you are using the task queue system instead).
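Steps 1 and 3 might look roughly like this with the App Engine task queue API (the queue name and payload handling are my own illustration, untested; imports are done lazily so the sketch parses off-platform):

```python
def enqueue_score(payload):
    # Step 1: a frontend adds one work item to a pull queue.
    from google.appengine.api import taskqueue
    taskqueue.Queue('scores-pull').add(
        taskqueue.Task(payload=payload, method='PULL'))

def reap_scores(max_tasks=1000):
    # Step 3: the named push task leases the accumulated work in batch,
    # and deletes the tasks only once it has the payloads in hand.
    from google.appengine.api import taskqueue
    q = taskqueue.Queue('scores-pull')
    tasks = q.lease_tasks(lease_seconds=30, max_tasks=max_tasks)
    payloads = [t.payload for t in tasks]
    q.delete_tasks(tasks)
    return payloads
```

The queue itself would be declared as a pull queue (mode: pull) in queue.yaml.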

Barry Hunter

unread,
Jul 31, 2012, 2:47:02 PM7/31/12
to google-a...@googlegroups.com
This is from the other thread:

> Also, when processing at around 160 players/game, the number of 'late' entries was randomly between 1-10.
> Obviously, this is completely unacceptable as it appears to players that we 'lose' their score!

But it seems a more natural fit for this thread.


... One technique I've seen advocated to 'hide' the latency, so that
players don't see their score missing: when you display the
leaderboard, fetch the scoreboard using a standard eventually
consistent query (OK, it's gone via a backend that has effectively done
a 'group by' and cached the result for you).

But you have the user-id, so you can do a strongly consistent get for
the player's own score, and alter the leaderboard for display using
the very latest data for just the current player. Their own score is
correct.

The chances of them even realising they are getting a slightly
imperfect view (i.e. everyone else's score is outdated) are very small,
as long as their own score is right.


... ie expect latency, architect so it doesn't matter.
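A minimal pure-Python sketch of that display-time patch (the function name and tuple shapes are illustrative, not from the game):

```python
def patch_leaderboard(stale_board, user_id, fresh_score, size=100):
    """Overlay the player's strongly consistent score onto an
    eventually consistent leaderboard before display.

    stale_board: list of (user_id, score) from the cached/stale query
    fresh_score: the player's own score from a consistent get by key
    """
    # Drop any stale entry for this player, insert the fresh one, re-rank.
    board = [(u, s) for (u, s) in stale_board if u != user_id]
    board.append((user_id, fresh_score))
    board.sort(key=lambda entry: entry[1], reverse=True)
    return board[:size]
```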

Richard

unread,
Jul 31, 2012, 4:10:12 PM7/31/12
to google-a...@googlegroups.com
Yeah, but my worry is that if the queue architecture cannot handle 160 entries... how will it handle > 1000?  Remember, these entries hang around for the next game round.  So at that point, each client is also going to have to remove duplicates from other people that were left over from the previous round (hoping this makes sense!).  Or else I purge the queue in between... after attempting to patch the leaderboards a second time with the extra entries.

It just seems like a hack on top of a hack and extremely ugly....

Richard Watson

unread,
Jul 31, 2012, 4:54:51 PM7/31/12
to google-a...@googlegroups.com
How many queues are you using? Could you add more?
Can you batch data into fewer tasks?  If they're arriving at such a high rate, batching and adding fewer tasks might help.

Jeff Schnitzer

unread,
Jul 31, 2012, 6:12:05 PM7/31/12
to google-a...@googlegroups.com
If Richard's metrics are right (he has created a very exact timing
system which accounts for skew, so I suspect they are), there's
sometimes a 4s+ delay between submitting a task to a pull queue and
when it becomes available for lease. This makes pull queues difficult
for precisely timing data collection and reaping in a 10s window.

Honestly I'm struggling to find an in-appengine solution to this
problem. Let's say at T+0 clients submit 1000 data points. At T+10s,
the data must be collated to produce a leaderboard, and shortly
afterwards clients fetch the leaderboard.

* Saving to the datastore and querying for the data points suffers
from eventual consistency issues. It works mostly but not always.

* Putting items in a pull queue and then fetching them out in batch
would be the most logical solution, but the queue delay causes
problems. There doesn't seem to be any defined limit to queue
latency.

* Using in-memory state in a Backend would be the next logical
solution, but backends can't handle the throughput.

At this point I think the GAE toolbox is empty. It might be possible
to hack something together using memcache but it would be difficult
and fail whenever memcache resets.

At this point all I can think of is to use something external to GAE
to queue the submissions. A simple custom in-memory server using
technology that handles concurrency well (eg Java) would work. For
something more off-the-shelf, a Redis instance fronted by a couple web
frontends to handle the submit() and reap() requests. At least we
*know* that Redis can handle thousands of submissions per second.

Jeff

alex

unread,
Jul 31, 2012, 6:48:17 PM7/31/12
to google-a...@googlegroups.com, je...@infohazard.org
> At this point I think the GAE toolbox is empty.

How about Cloud SQL?

Richard Watson

unread,
Aug 1, 2012, 1:10:42 AM8/1/12
to google-a...@googlegroups.com, je...@infohazard.org
On Wednesday, August 1, 2012 12:12:05 AM UTC+2, Jeff Schnitzer wrote:

At this point all I can think of is to use something external to GAE
to queue the submissions.  A simple custom in-memory server using
technology that handles concurrency well (eg Java) would work.  For
something more off-the-shelf, a Redis instance fronted by a couple web
frontends to handle the submit() and reap() requests.

Unless you move the whole thing out of GAE, I suspect 1000 TPS could well suffer from variable performance whether you're using puts, tasks or URL fetch.  Before I built out some second external setup and spent time integrating it, I'd look at reducing the transactions within GAE, again by batching them.

For example:
Each instance receives values, caches them in memory (maybe add memcache as backup), submits them at the end of each second (or two, or whenever max entity size is hit) as one blob-like entity, via whatever medium ends up working best.  The risk here is that the instance fails, but I'd get it working now and then improve the resilience later, because the devil Richard has now is worse than that one.

This way, we're back to using e.g. put(), but hopefully changing 1000/sec into 100, 50 or 10/sec.
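As a rough sketch of that per-instance buffer (the flush callback stands in for "whatever medium ends up working best"; the clock parameter exists only so the logic can be exercised off-platform):

```python
import time

class ScoreBuffer(object):
    """Collect values in instance memory; flush them as one batch
    when either a size threshold or a time interval is reached."""

    def __init__(self, flush_fn, interval=1.0, max_items=500, clock=time.time):
        self.flush_fn = flush_fn      # e.g. one db.put() of a blob-like entity
        self.interval = interval
        self.max_items = max_items
        self.clock = clock
        self.items = []
        self.last_flush = clock()

    def add(self, item):
        self.items.append(item)
        now = self.clock()
        if len(self.items) >= self.max_items or now - self.last_flush >= self.interval:
            # Swap the list out before flushing so a slow flush
            # doesn't block new items from being collected.
            batch, self.items = self.items, []
            self.last_flush = now
            self.flush_fn(batch)
```

An instance crash still loses whatever is sitting in the buffer; memcache as a write-behind backup would narrow that window.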

What sucks about this idea?

Jeff Schnitzer

unread,
Aug 1, 2012, 4:50:59 AM8/1/12
to Richard Watson, google-a...@googlegroups.com
It's hard to imagine this working without some sort of background
thread in the frontend instances. Otherwise how do you get the
frontend to commit its data to whatever datasource you want? You
can't really hold open every connection for 2s... at max concurrency
of 10, you'd need 100 instances. Maybe you could make some sort of
latch whereby when a second request comes in, it releases the first
request, but wow that would be complicated.

GAE can easily proxy 1000qps to another service; that's just a
question of having enough instances. The question is whether the
other service can handle 1000 submissions per second. If it all
funnels to a redis (or some other in-memory store) instance, the
answer is almost certainly yes.

It wouldn't require moving 'the whole thing' out of GAE, just the
queue. Ideally get the client to submit directly to that queue
instead of proxying through GAE, but that might require a client
update.

In theory this should be the kind of thing that a backend is good for.
Too bad it isn't.

Jeff

Richard Watson

unread,
Aug 1, 2012, 5:36:36 AM8/1/12
to Jeff Schnitzer, google-a...@googlegroups.com
On Wed, Aug 1, 2012 at 10:50 AM, Jeff Schnitzer <je...@infohazard.org> wrote:
> It's hard to imagine this working without some sort of background
> thread in the frontend instances. Otherwise how do you get the
> frontend to commit its data to whatever datasource you want?

I suspect I misunderstand this next point:
"All threads in a request must finish before the request deadline (60
seconds for online requests and 10 minutes for offline)."
https://developers.google.com/appengine/docs/python/python27/newin27#Multithreading
It most likely means the thread can't outlive the request, which is
why it's bound by the request's lifetime deadline. Sad.

One option is to use a periodic cron or task to ping the front-end and
clear the list, but you can't address individual instances, so no
guarantee that you'll clean up orphans. Sad.

You could also scale the amount in the list depending on how many
requests per second you're getting. If 1000, batch like mad. If <
20, submit every request. The risk with this is that you get 1000's,
then instantly zero. But you then likely have another problem.
Hopefully workable?

> GAE can easily proxy 1000qps to another service; that's just a
> question of having enough instances

I thought GAE can also do 1000 puts per second if you have a lot of
instances? I'm kinda assuming Richard's trying to push lots through
relatively few instances so one pause in the list affects many values
being processed. Richard, could you clarify?

One other question, since I've never used a backend. We're able to
address them directly, yes? Can't he just do an HTTP POST to the
backend directly rather than a put()?

Richard

Jeff Schnitzer

unread,
Aug 1, 2012, 6:13:06 AM8/1/12
to Richard Watson, google-a...@googlegroups.com
On Wed, Aug 1, 2012 at 2:36 AM, Richard Watson <richard...@gmail.com> wrote:
>
> You could also scale the amount in the list depending on how many
> requests per second you're getting. If 1000, batch like mad. If <
> 20, submit every request. The risk with this is that you get 1000's,
> then instantly zero. But you then likely have another problem.
> Hopefully workable?

Anything that involves batching in a frontend risks orphaning data in
the frontend... there's just no effective way to ensure that batching
happens and that the queue is purged when "done".

> I thought GAE can also do 1000 puts per second if you have a lot of
> instances? I'm kinda assuming Richard's trying to push lots through
> relatively few instances so one pause in the list affects many values
> being processed. Richard, could you clarify?

That's the solution that Richard is using right now. The problem is
that the reaping process is a simple query; because of eventual
consistency, you can't guarantee that the collator will get all the
scores. A burp in the datastore and you get nothing.

> One other question, since I've never used a backend. We're able to
> address them directly, yes? Can't he just do an HTTP POST to the
> backend directly rather than a put()?

See my thread about performance of backends: Horrible. They top out
at 80qps doing no-ops, and even minor amounts of work cut that to 20.
Plus there's a mysterious 200ms (!!) added for every
frontend-to-backend call. I'm having a genuinely hard time finding
any application for which backends are actually suitable.

I really don't see any problem with running this service in another
cloud provider. It's slightly more complicated but it gives you "the
right tool for the job". Use another PaaS provider like Heroku to
keep the fuss minimal.

Jeff

alex

unread,
Aug 1, 2012, 6:16:19 AM8/1/12
to Jeff Schnitzer, google-a...@googlegroups.com

I don't know. Do you? If not, I think it is at least worth a consideration.

-- alex

On Aug 1, 2012 1:33 AM, "Jeff Schnitzer" <je...@infohazard.org> wrote:
On Tue, Jul 31, 2012 at 3:48 PM, alex <al...@cloudware.it> wrote:
>> At this point I think the GAE toolbox is empty.
>
> How about Cloud SQL?

Can it handle a thousand inserts in a 2s period?

Jeff

Richard Watson

unread,
Aug 1, 2012, 7:14:19 AM8/1/12
to Jeff Schnitzer, google-a...@googlegroups.com
On Wed, Aug 1, 2012 at 12:13 PM, Jeff Schnitzer <je...@infohazard.org> wrote:

> Anything that involves batching in a frontend risks orphaning data in
> the frontend... there's just no efective way to ensure that batching
> happens and that the queue is purged when "done".

It's definitely not bulletproof, but I'd investigate whether I could
achieve "good enough" for 99% and be happy that for the rest the
client could re-request the score if it didn't reflect their own
correct value.

As you say, if clients can connect directly to some other more
reliable service, maybe that's the way to go. Mixing platforms (GAE
url fetch to Amazon SQS, or whatever) would leave me worrying about
extra points of failure - you'll now fail if either GAE or
Amazon/Heroku go down.

Pity. I would imagine a back-end for a game with hundreds of users is
something GAE should eat up.

Richard

Richard

unread,
Aug 1, 2012, 9:37:21 AM8/1/12
to google-a...@googlegroups.com, Jeff Schnitzer
The thing is.... GAE does a wonderful job.  It has been fine for a year or so.  Now suddenly, we are getting bitten by 'eventual consistency'.  And not at peak load either!  This is hitting us at the lowest load time and at the same time each day. 

So, maybe we were just lucky before this..... but it still sucks to have something work for a year or so, and then suddenly find out you might need to consider moving to a different stack.

Takashi Matsuo

unread,
Aug 1, 2012, 10:30:55 AM8/1/12
to google-a...@googlegroups.com, Jeff Schnitzer
Hi Richard,

I actually played your game and probably encountered the exact issue.
BTW, it's a cool addictive game.

I agree that the eventual consistency window should ideally be
steadier than it is now. I have passed your issue on to the engineering team.

I'll get back to you when I have some updates.

-- Takashi



--
Takashi Matsuo

Takashi Matsuo

unread,
Aug 1, 2012, 10:55:35 AM8/1/12
to google-a...@googlegroups.com, Jeff Schnitzer
I forgot to mention one possible workaround which could mitigate
your pain. In the frontend instances, maybe you can get those small
entities by their newly created keys just after putting them, in order
to force the pending change to be applied.

This article describes this behavior well:
https://developers.google.com/appengine/articles/transaction_isolation

However, please note that
* It doesn't guarantee 100% 'strong consistency'.
* The latency of submitting scores will become larger.
* Definitely it will cost you more.

Sorry if this doesn't work well for you, but I think it's worth trying.

-- Takashi
--
Takashi Matsuo

Richard

unread,
Aug 1, 2012, 11:45:56 AM8/1/12
to google-a...@googlegroups.com, Jeff Schnitzer
Ok, so I read that web page, and I understand that the commit() has completed, but not the (asynchronous) apply.

However, I don't really see the potential solution you are suggesting  :(  Can you (or someone else) please explain in a bit more detail?

Jeff Schnitzer

unread,
Aug 1, 2012, 12:39:41 PM8/1/12
to Richard, google-a...@googlegroups.com
I presume the essential line is this:

"In the (now standard) High Replication Datastore, the transaction
typically is completely applied within a few hundred milliseconds
after the commit returns. However, even if it is not completely
applied, subsequent reads, writes, and ancestor queries will always
reflect the results of the commit, because these operations apply any
outstanding modifications before executing."

...which is interesting and not something I ever thought of. I don't
think this intermediate state was discussed in Alfred's "Under The
Covers" I/O talk last year. Can someone describe it in more detail?

Jeff

Richard

unread,
Aug 1, 2012, 2:01:43 PM8/1/12
to google-a...@googlegroups.com, Richard, je...@infohazard.org
Yep... got that.  However, a query() will still return stale data between the commit() and the (internal) apply.  Which is where I believe I am sitting...

Joshua Smith

unread,
Aug 1, 2012, 2:20:37 PM8/1/12
to google-a...@googlegroups.com
I haven't been following this thread too closely. Can you summarize the problem as you understand it at this point?


Richard

unread,
Aug 1, 2012, 2:24:10 PM8/1/12
to google-a...@googlegroups.com, Richard, je...@infohazard.org
Ok, based on Takashi's suggestion, I now do the following:

new_user_score = Score()
new_user_score.member = thing
new_user_score.put()

# next part is new based on Takashi's suggestion
k = new_user_score.key()
db.get(k)

WOW... I mean... HOLY CRIPES!   Time to process each request went from ~150msec to 4.5-7 SECONDS.

This means I effectively need one instance PER REQUEST.



On Wednesday, August 1, 2012 2:01:43 PM UTC-4, Richard wrote:
Yep... got that.  However, a query() will still return stale data between the commit() and the (internal) apply.  Which is where I believe I am sitting...


Richard

unread,
Aug 1, 2012, 2:27:51 PM8/1/12
to google-a...@googlegroups.com
Summary so far:

I have a massive multiplayer Android game.  It has tight timings.  "Sometimes" (at the same time of day... when load is lowest), GAE will do a put(), and 5-10 seconds later when I do a query-all to create leaderboards for that game round, I get stale results... for some of the entries.  This is because the put()'s commit has completed, but the (internal) apply has not.

This started 5 days ago.  Moving to PULL queues did not work, because entries sometimes take 5+ seconds to 'show up' in the pull queue.

Michael Hermus

unread,
Aug 1, 2012, 2:28:55 PM8/1/12
to google-a...@googlegroups.com, Richard, je...@infohazard.org
If you collect the keys of newly created 'small entities' and send them in batch to the back end for reaping, you could use a batch get() which would force all the entities to roll forward, and eliminates the eventual consistency issue.

Unfortunately, I believe that no matter the solution architecture, you ALWAYS have the possibility of a latency spike with any GAE service, which is why the 10s window is so challenging.


On Wednesday, August 1, 2012 2:01:43 PM UTC-4, Richard wrote:

Michael Hermus

unread,
Aug 1, 2012, 2:31:36 PM8/1/12
to google-a...@googlegroups.com, Richard, je...@infohazard.org
Our posts crossed paths in the ether.

I don't think you should do 'get(key)' per request on the front end; rather, you should collect the keys and send them to the back end for use in a single batch get() call. It still might not completely suit your needs, but it would be much closer.

Richard

unread,
Aug 1, 2012, 2:36:42 PM8/1/12
to google-a...@googlegroups.com, Richard, je...@infohazard.org
The backend has 5 seconds to reap all the results, create a leaderboard and make it available to the front ends.  This part typically runs in under 2 seconds atm.

I don't believe there is any way to move the keys to the back end (PULL queues and urlget have been shown not to work), execute a get() on all the keys (which can take at least 7 seconds), then create a leaderboard and store the results.

Joshua Smith

unread,
Aug 1, 2012, 2:45:05 PM8/1/12
to google-a...@googlegroups.com
I don't know if this will help (depends on some specifics of your data model), but I've found that using a query to get a list of entities, followed by a get to actually get the data, is a good workaround for many eventual consistency issues.

Here's a snippet of code that provides the functionality I use:

class HRModel(db.Model):
  @classmethod
  def gql_keys(cls, query_string, *args, **kwds):
    return db.GqlQuery('SELECT __key__ FROM %s %s' % (cls.kind(), query_string), *args, **kwds)

  @classmethod
  def gql_ids(cls, query_string, *args, **kwds):
    return [x.id() for x in cls.gql_keys(query_string, *args, **kwds)]

  @classmethod
  def gql_with_get(cls, query_string, *args, **kwds):
    # Materialize the keys-only query before handing the keys to db.get()
    keys = list(cls.gql_keys(query_string, *args, **kwds))
    return filter(None, db.get(keys))

The gql_with_get is the magic one. It does a keys-only query, and then does a get to fetch the entities with those keys. While the query might be out-of-date, the get is guaranteed to be consistent with the most recent writes.

If an entity returned by the possibly stale query results is deleted, the filter will clean that out.

If an entity should have been returned but was not, then you won't pick it up. That is only a problem for brand-new entities, and I don't have a solution for that. For already-existing entities, you could just query for near-miss leaders, and then filter out the scores that made the query but aren't justified by the actual state of the leaderboard after the get. For example, if your leaderboard has 100 entries, query for the top 150, then re-sort the results and take the top 100 of those.
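That over-fetch-and-re-sort step might look like this (pure-Python sketch; the names and tuple shapes are illustrative, with the fresh scores coming from the consistent get):

```python
def top_n_from_stale(stale_rows, fresh_by_user, n=100, overfetch=150):
    """stale_rows: (user_id, score) pairs from a possibly stale
    top-`overfetch` query; fresh_by_user: authoritative scores
    fetched by key for those users."""
    # Prefer the strongly consistent score where we have one,
    # then re-rank and keep only the real top n.
    rows = [(u, fresh_by_user.get(u, s)) for (u, s) in stale_rows[:overfetch]]
    rows.sort(key=lambda r: r[1], reverse=True)
    return rows[:n]
```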

Make sense?

-Joshua


Takashi Matsuo

unread,
Aug 1, 2012, 2:51:53 PM8/1/12
to google-a...@googlegroups.com, Richard, je...@infohazard.org
On Thu, Aug 2, 2012 at 3:24 AM, Richard <stev...@gmail.com> wrote:
> Ok, based on Takashi's suggestion, I now do the following:
>
> new_user_score = Score()
> new_user_score.member = thing
> new_user_score.put()
>
> # next part is new based on Takashi's suggestion
> k = new_user_score.key()
> db.get(k)
>
> WOW... I mean... HOLY CRIPES! Time to process each request went from
> ~150msec to 4.5-7 SECONDS.
>
> This means I effectively need a one instance PER REQUEST.

Sorry it didn't work like magic, but a single datastore get
shouldn't take that long, so there's another issue with datastore
latency. It might be the tablet-splitting issue which I mentioned in
another thread:
http://ikaisays.com/2011/01/25/app-engine-datastore-tip-monotonically-increasing-values-are-bad/

I'd suggest generating somewhat random (well-distributed) keys for
those small entities instead of using automatic key assignment.
Besides using the datastore, maybe I'll experiment with an implementation
of a memory db on some backend instances tomorrow.
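For example (my own illustration, not a tested recommendation), a hashed prefix keeps the key deterministic per user while spreading writes across the key space:

```python
import hashlib

def score_key_name(user_id):
    # Same user always maps to the same key (so a later put() overwrites),
    # but the hashed prefix spreads keys across the key space, avoiding
    # the monotonically-increasing hot-tablet pattern.
    prefix = hashlib.md5(user_id.encode('utf-8')).hexdigest()[:8]
    return '%s-%s' % (prefix, user_id)

# used as: Score(key_name=score_key_name(uid), ...) on App Engine
```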

-- Takashi

Richard

unread,
Aug 1, 2012, 3:11:20 PM8/1/12
to google-a...@googlegroups.com, Richard, je...@infohazard.org
Hi Takashi & Joshua,

How about using my user's id as the key?

eg:
new_score = Score(key_name='ahFkZXZ-c3Zlbi13b3JkaGVyb3ILCxIEVXNlchjFAww')

I assume this would not count as monotonically increasing?

@ Joshua:
At the moment, I create all new Score objects for each player of that game round and then have the backend reap them, and afterwards fire a task to delete them.  HOWEVER, what if I did not delete them, but instead keyed them with the user's account key_name as the Score's key_name (see above) and also included the following:
  timestamp = db.DateTimeProperty(auto_now = True)

Then I could figure out which ones were from the last round and which were from players that had last played a while ago.  The assumption is that at this point, players from the previous round would have their results stored using a put() with a key name, which would be strongly consistent.  Players whose first game it is would have the chance that the DB was slow and did not yet have an entry for them... in which case, they would just miss out on one round (unless the internal apply takes > 3 minutes!)... and I could update the game client to add a user's own score to the results if it were not found, to 'fudge' things.

Comments/thoughts?

-R


Joshua Smith

unread,
Aug 1, 2012, 3:25:23 PM8/1/12
to google-a...@googlegroups.com
I think that approach sounds pretty good.

The trick of pasting the new user's data in client-side is similar to a technique I have used to ensure newly added records appear in the list.

On Aug 1, 2012, at 3:11 PM, Richard <stev...@gmail.com> wrote:

Comments/thoughts ?

Takashi Matsuo

unread,
Aug 1, 2012, 3:32:27 PM8/1/12
to google-a...@googlegroups.com, Richard, je...@infohazard.org
On Thu, Aug 2, 2012 at 4:11 AM, Richard <stev...@gmail.com> wrote:
> Hi Takashi & Joshua,
>
> How about using my user's id as the key ?
>
> eg:
> new_score = Score(key_name='ahFkZXZ-c3Zlbi13b3JkaGVyb3ILCxIEVXNlchjFAww')
>
> I assume this would not count as monotonically increasing ?

I don't think it's ok. Probably the first few characters would
always be the same. It's not good.
Additionally, from your explanation below, a single user has multiple
scores, right? If you use the userid as key_name, the latest score always
overrides the previous score.

What about something like user_name + timestamp?

Richard

unread,
Aug 1, 2012, 3:52:24 PM8/1/12
to google-a...@googlegroups.com, Richard, je...@infohazard.org
Hi Takashi,
Yeah, the first few characters would probably be very similar.  I am not sure if that would be a problem ?

A single user has only one score in the leaderboard for the game.  So, overwriting the score the next time is not a problem, provided we know when to NOT include a user in the leaderboards (timestamp... if they are no longer playing).

We could also use the user's name as that is guaranteed unique too.  Would that work ?

-R

Mauricio Aristizabal

unread,
Aug 1, 2012, 3:56:03 PM8/1/12
to google-a...@googlegroups.com
As others have said, why not Cloud SQL?

MySQL can handle tens of thousands of inserts per second into an average-width table.

I don't know what kind of performance you can expect from Google's setup, but considering your table would be very narrow (a couple of numeric columns) and with few to no indexes, you can probably get at least a couple thousand per second.

You may also consider it acceptable to use MEMORY tables.  I just tested creating one, so they are supported.  And if you do need to keep this data for a while, doing an 'insert into innodbtable select * from memtable' is much faster than individual inserts, so you can perhaps 'archive' your raw entries after every leaderboard calculation and truncate.

Lastly, you can then do your ranking in SQL, which may run a lot faster than loading tens of thousands of records into the backend to do the calculation in the app.

Of course, this will not scale indefinitely, but in this particular use case it seems like sharding wouldn't be much of an issue: the front end writes to any of 4 SQL dbs, the reaper backend queries all 4 (using threads to do so asynchronously) and collates the results.
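A minimal sketch of that sharding idea (instance names, table names, and column layout are all assumptions; the actual connection would come from GAE's rdbms API):

```python
import zlib

# Assumed shard instance names; on GAE each would be opened via
# google.appengine.api.rdbms.connect().
SHARDS = ['scores-db-0', 'scores-db-1', 'scores-db-2', 'scores-db-3']

INSERT_SQL = "INSERT INTO scores_mem (user_id, score) VALUES (%s, %s)"
# Archive the MEMORY table into InnoDB in one statement, then truncate.
ARCHIVE_SQL = "INSERT INTO scores_innodb SELECT * FROM scores_mem"

def pick_shard(user_id):
    # Deterministic choice so one user's writes always land on the
    # same instance, making the reaper's collation simpler.
    return SHARDS[zlib.crc32(user_id.encode('utf-8')) % len(SHARDS)]

def record_score(conn, user_id, score):
    # conn: a DB-API connection to the shard chosen by pick_shard().
    cur = conn.cursor()
    cur.execute(INSERT_SQL, (user_id, score))
    conn.commit()
```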





Jeff Schnitzer

unread,
Aug 1, 2012, 5:15:13 PM8/1/12
to google-a...@googlegroups.com
This definitely sounds like a good strategy... especially with memory
tables and no indexes.

Back when cloud sql was in beta, there was a restriction of 5 queries
per second, so I "wrote it off" for doing any kind of heavy lifting.
Presumably that restriction has been lifted.

Jeff

Jeff Schnitzer

unread,
Aug 1, 2012, 5:21:08 PM8/1/12
to Richard, google-a...@googlegroups.com
Would this really help? There's still going to be a splitting issue
with the index tablet, and that one you will have a hard time
distributing the keyspace for.

Jeff

Mauricio Aristizabal

unread,
Aug 1, 2012, 7:20:26 PM8/1/12
to google-a...@googlegroups.com, je...@infohazard.org
I think the 5 QPS restriction is still there... but it's only for connections from outside GAE, AFAIK to discourage using Cloud SQL from an outside app.  For GAE there are only limits on concurrent connections (1000) and requests (100), and request/response size (16MB).



Takashi Matsuo

unread,
Aug 1, 2012, 10:30:12 PM8/1/12
to google-a...@googlegroups.com, Richard, je...@infohazard.org

I think theoretically yes, but I've not tested it myself. Sorry if it doesn't work.

I'll do some experiments with an in-memory db on backends from now on, and
get back here with the results.

Another possible workaround is using multiple pull queues and task tagging:
* Sharding across multiple pull queues should increase the task throughput.
* Using task tags and leasing by explicit tag is also a way to
increase the throughput of a single queue.

For more details, please look for 'Task Tagging' on:
https://developers.google.com/appengine/docs/python/taskqueue/overview-pull

You have already tried a simple pull queue implementation, so it's
relatively easy for you to test this strategy.
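A rough sketch of the sharded-queue idea (the queue names, payload shape, and round ids are assumptions; the actual taskqueue calls are shown as comments):

```python
import zlib

NUM_QUEUES = 4  # assumed number of pull queues defined in queue.yaml

def queue_and_tag(round_id):
    # Spread game rounds across several pull queues; the tag lets the
    # reaper lease exactly one round's tasks in a single call.
    shard = zlib.crc32(round_id.encode('utf-8')) % NUM_QUEUES
    return 'scores-%d' % shard, 'round-%s' % round_id

# Front end (sketch):
#   q_name, tag = queue_and_tag(current_round)
#   taskqueue.Queue(q_name).add(taskqueue.Task(
#       payload=json.dumps({'user': uid, 'score': s}),
#       method='PULL', tag=tag))
#
# Backend reaper (sketch):
#   tasks = taskqueue.Queue(q_name).lease_tasks_by_tag(60, 1000, tag=tag)
```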

-- Takashi

hyperflame

unread,
Aug 2, 2012, 12:19:51 AM8/2/12
to Google App Engine
Just to be clear, the GAE datastore returns stale results ONLY at a
specific time, and only when load is low? Is it because of both of
those factors together or just a single one? Seems odd that GAE is
reacting slowly just when load is the lowest. Can you post some load
graphs?

If the staleness is due to low load, you could always run some fake
users to add to the load, then have your backend subtract those fake
users when it builds the scoreboards. It wouldn't be hard; you can
capture some real users' data, save it to a log, then replay it against
production servers.


Richard Watson

unread,
Aug 2, 2012, 12:36:04 AM8/2/12
to google-a...@googlegroups.com, Richard, je...@infohazard.org
If you reverse the user id on saving the score, it'll be random and you'll avoid the tablet issue.
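For instance (a sketch; the Score model usage is just an illustration):

```python
def spread_key_name(user_id):
    # Reversing the id moves the varying characters to the front, so
    # key_names no longer cluster in one tablet's key range.
    return user_id[::-1]

# new_score = Score(key_name=spread_key_name(user_id))
```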

Richard

unread,
Aug 2, 2012, 10:50:42 AM8/2/12
to google-a...@googlegroups.com
Kinda.  By 'low load', I mean 150-200 simultaneous users.  Oh... and it never used to have this problem.  This started 5 days ago.

Richard

unread,
Aug 2, 2012, 11:46:58 AM8/2/12
to google-a...@googlegroups.com, Richard, je...@infohazard.org
Well, I tried to save Score() objects with a key name equal to the username (guaranteed unique) last night.  GAE had the same problem in the wee hours of the morning when (most) of my users are not playing (only around 100-150 people playing).  So, apparently it is not a tablet issue. 

Time to play with Cloud SQL maybe ?

hyperflame

unread,
Aug 2, 2012, 12:29:35 PM8/2/12
to Google App Engine
It's just odd that GAE is having trouble not when load is high, but
when load is low. I doubt tablet issues are the cause here. Can you
run some fake load against the datastore? Just build a B1 that adds
some fake scores, and have your tabulator backend delete those fake
scores when it builds the leaderboard.

Also, I'm wondering why it just started 5 days ago. Did your load
profile change? Try this: change your instances to F2, then back to
F1. Perhaps your resident instance got moved to a faulty server 5 days
ago; this will force GAE to open a new instance on a new server.

Jeff Schnitzer

unread,
Aug 2, 2012, 12:44:32 PM8/2/12
to google-a...@googlegroups.com
Honestly this doesn't seem worth spending a lot of time debugging.
Google doesn't guarantee any finite limit to "eventual" consistency;
even if it works within the 10s window 95% of the time, it will
certainly fail eventually.

Seems like Cloud SQL is the next thing to try, and barring that, put
the collector in another cloud host using something that can handle
the throughput.

Jeff

Takashi Matsuo

unread,
Aug 2, 2012, 12:50:53 PM8/2/12
to google-a...@googlegroups.com, Richard, je...@infohazard.org

That's a good idea. Certainly 1k QPS is something Cloud SQL can theoretically handle, and Cloud SQL has strong consistency, which is important for your use case. I should have come to this conclusion earlier. Sorry about that. I'll try to experiment with Cloud SQL tomorrow and get back to you.

BTW, I have done some experiments with an in-memory score server using backend instances, and it consistently marked around 200 QPS with 20 B4 backends. So it might not be a good fit for your needs this time.

-- Takashi




Richard

unread,
Aug 2, 2012, 1:15:06 PM8/2/12
to google-a...@googlegroups.com
Typically, we run up to 20+ F1 instances to handle the burstiness, so I don't think it could be a case of a single 'bad' instance.  The query that is not getting all the scores is run on a resident B1 in a background thread.

Richard

unread,
Aug 2, 2012, 1:19:12 PM8/2/12
to google-a...@googlegroups.com, Richard, je...@infohazard.org
Thanks for the testing and benchmarking Takashi!

Considering B4 = 4x B1 (well, not really, but this is to illustrate a point!) ... and you had 20 of them, you effectively managed 200 QPS with 80 x B1's..... that's umm... 2.5 QPS each ?!?  That's crazy!   I think I see a performance development project in someone's future :)



Takashi Matsuo

unread,
Aug 2, 2012, 2:10:59 PM8/2/12
to google-a...@googlegroups.com, Richard, je...@infohazard.org
Richard,

Yes. The result is almost the same with 20 B1s. Apparently the bottleneck is somewhere other than the backends themselves.





--
Takashi Matsuo

Richard

unread,
Aug 2, 2012, 3:10:53 PM8/2/12
to google-a...@googlegroups.com, Richard, je...@infohazard.org
Ok, so I started looking at Cloud SQL.

To add an entry, appstats says the following:
  rdbms.OpenConnection   182 msec
  rdbms.Exec  144 msec
  rdbms.ExecOp  82 msec
  rdbms.CloseConnection  45msec

It seems to me that I should be able to cache the connection and keep it open for the life of the instance .... or at least while the instance is receiving a lot of scores.  This should remove almost half the DB overhead.  However, I don't know enough to understand under what conditions the DB connection would be dropped, how to detect the connection is stale and most importantly, how to close the connection when the instance is torn down.

Does anyone have a code pattern for this sort of thing ?

-R

alex

unread,
Aug 2, 2012, 3:55:05 PM8/2/12
to google-a...@googlegroups.com
As far as I understand, Google recommends closing connection right away, after you perform all operations. The latency to check whether a connection is still open is (almost?) the same as opening a new one, so it doesn't seem to be worth keeping it open.

They also say a per-hour-usage db instance goes inactive after 15 min idle, and 12 hours (if I remember correctly) for the other plan (you pay flat per day). I've seen high latencies only on the first connect to a "cold" db instance. You'll find more in the official docs.

I liked it. Works better than I expected.

Mauricio Aristizabal

unread,
Aug 2, 2012, 5:19:59 PM8/2/12
to google-a...@googlegroups.com
Richard, I do use a connection pool (Apache commons dbcp) but i'm on the Java side so can't help you with specifics.  The only wrinkle for me was that I couldn't use the strategy where it checks connection health at intervals because of the GAE thread limitations, so instead it has to do a test query before every query, but in practice this ends up costing only about 5ms (it's a fast query: "select 1").  

Depending on what you end up using, though, you may be able to do without this and instead trap the error, get a new connection and retry.  If you set up a monitoring ping to your app, or a cron job that results in a query always happening before the 15m instance-inactive window, then you can keep your GCSQL instance up around the clock and such stale connections will be very rare.

In fact, since in your case there will only be a couple of queries, you might just add this retry logic right into your app.  Should be pretty simple: just wait for a wind-down, see what exception is thrown, and trap that / connect / retry.
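That trap/connect/retry idea might look roughly like this (a sketch; `connect` stands for whatever factory opens your Cloud SQL connection, and the broad `except` should be narrowed to your driver's actual error type):

```python
def execute_with_retry(connect, sql, params=(), attempts=2):
    # Run a statement; if the cached connection has gone stale, drop it
    # and retry once with a fresh connection from the factory.
    conn = None
    last_err = None
    for _ in range(attempts):
        try:
            if conn is None:
                conn = connect()
            cur = conn.cursor()
            cur.execute(sql, params)
            conn.commit()
            return cur
        except Exception as err:  # narrow to your driver's OperationalError
            last_err = err
            conn = None  # force a fresh connection on the next attempt
    raise last_err
```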



Mauricio Aristizabal

unread,
Aug 2, 2012, 5:25:43 PM8/2/12
to google-a...@googlegroups.com
Wait, sorry I just realized you do still need some mechanism for pooling the connections (even if it doesn't automatically test for their health), so yeah hopefully Python folks can chime in with that.

Takashi Matsuo

unread,
Aug 2, 2012, 7:49:59 PM8/2/12
to google-a...@googlegroups.com

Hi everyone,

We don't recommend using a connection pooling mechanism with Cloud SQL. Also see this thread:
--
Takashi Matsuo

Takashi Matsuo

unread,
Aug 2, 2012, 8:57:44 PM8/2/12
to google-a...@googlegroups.com

Here is good news and bad news.

Bad news is that currently, the number of concurrent connections to a single Cloud SQL instance is limited to 100.

Good news is that my experimental implementation of an in-memory database with backend instances got 1k QPS constantly, even with 20 B1 backends.
Actually, the bottleneck in my previous experiment was just a network problem on the machine from which I ran the test. When I tested from a good environment, the throughput was consistently over 1k QPS. Yay!

Richard,

Can you start looking into my prototype? I definitely believe it's worth a look :)

-- Takashi
--
Takashi Matsuo

Richard

unread,
Aug 2, 2012, 10:23:43 PM8/2/12
to google-a...@googlegroups.com
Hi Takashi,

I can definitely look into it, but I would prefer a cheaper solution than MANY B1 backends... especially since my app is a free android game :)

In the meantime, I have configured another app that talks to a Cloud SQL instance.  It just generates a random username and score and saves it to the DB.  Here are the apache bench results for 1000 requests with 250 concurrent:

Concurrency Level:      250
Time taken for tests:   49.137 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0


Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:       31   48  17.8     38      93
Processing:   397 10564 3119.1  11951   12820
Waiting:      395 6176 3352.6   6091   12819
Total:        434 10613 3118.9  11990   12858

Percentage of the requests served within a certain time (ms)
  50%  11990
  66%  12193
  75%  12250
  80%  12290
  90%  12395
  95%  12436
  98%  12477
  99%  12507
 100%  12858 (longest request)

That is not very encouraging.  I removed appstats, but that did not make any difference.  Time to try connection pooling ?

-R

Takashi Matsuo

unread,
Aug 2, 2012, 11:18:09 PM8/2/12
to google-a...@googlegroups.com
Probably not. As I said before, we don't recommend using connection pooling with Cloud SQL.
BTW, my implementation steadily hit 300-500 QPS with 10 B1 backends. Is that still too expensive?

Here is the ab result:

Concurrency Level:      500
Time taken for tests:   1.842 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      803899 bytes
HTML transferred:       2000 bytes
Requests per second:    542.98 [#/sec] (mean)
Time per request:       920.850 [ms] (mean)
Time per request:       1.842 [ms] (mean, across all concurrent requests)
Transfer rate:          426.27 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:       32   34   0.7     34      35
Processing:   252  634 293.1    581    1776
Waiting:      252  634 293.1    581    1776
Total:        284  668 293.1    615    1809

Percentage of the requests served within a certain time (ms)
  50%    615
  66%    761
  75%    823
  80%    869
  90%   1037
  95%   1248
  98%   1485
  99%   1641
 100%   1809 (longest request)


 



--
Takashi Matsuo

Takashi Matsuo

unread,
Aug 3, 2012, 12:41:07 AM8/3/12
to google-a...@googlegroups.com


I should have mentioned that if you move to the in-memory implementation, the number of datastore reads/writes will decrease drastically, and so the cost will be smaller too.

Richard Watson

unread,
Aug 3, 2012, 3:28:02 AM8/3/12
to google-a...@googlegroups.com
What are the performance characteristics of connecting to Google Compute Engine?  Maybe slap the in-memory app onto that.

Mauricio Aristizabal

unread,
Aug 3, 2012, 3:49:24 AM8/3/12
to google-a...@googlegroups.com
Takashi, is there some more detailed information on why Google doesn't encourage using a connection pool?  Is it simply to encourage allowing the db instance to wind down instead of being kept alive only by pool connection health checks?  If so I'm sure it could be configured to avoid this.

It does seem to me that it could reduce Richard's costs drastically, by 2/3 just on the writes to the db by his own numbers.

I've been using pooling without issue for several months now, though admittedly with very little traffic so far, so if you think this is going to get me in trouble later I'm very eager to hear why.




Richard

unread,
Aug 3, 2012, 9:04:55 AM8/3/12
to google-a...@googlegroups.com
Connection pooling might be a good idea.  Since there are people in every game round and each round is 3 minutes, the SQL db will always be up.  I did try it, but I think my connection from home was limited. 

RE: SQL solution:   Can some of you with LOTS of bandwidth (from a *nix machine), please AB the following URL:

     http://sven-anagramhero.appspot.com/client/loadtest

Try at least 1000 connections with 250-500 concurrent and report back here please.

WRT costs:  DB read/writes are around $3/day.  Whereas 10 B1 backends would be almost $20/day.

In addition, the B1 solution does not scale.  Let's say the app suddenly gets a lot of new users.  Now I need to update the backend.  Additionally, peak load is approximately 4x the lowest value.  So, for part of each day, I need enough instances to handle the peak load while they do minimal work.  This is not scaling automatically.  I always need to pay for the maximum needed to handle peak load.... or else update the backends every few hours to add/remove B1's.  Not exactly fulfilling the automatic scaling promise!





Takashi Matsuo

unread,
Aug 3, 2012, 9:25:24 AM8/3/12
to google-a...@googlegroups.com

Good point, but I think you can work around this with dynamic backend instances, although you need to take care with the shutdown scenario. Here is just a theoretical implementation.

* Define one resident backend instance for handling the scores from backend instances which are shutting down. Let's call this instance the 'master'.
* Define a dynamic backend with an appropriate number of max instances according to your needs, and set up a shutdown hook which will pass the scores to the master instance.
* When reaping, you can ask the dynamic instances as well as the master instance.

That way, you get autoscaling capability at minimum cost.

Maybe I can create a prototype implementation hopefully early next week. Of course, you can create your own.
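The design above could be sketched in plain Python along these lines. This is a simulation of the pattern only: `MasterBackend`, `DynamicBackend`, and `reap` are illustrative names, and on App Engine the shutdown hook would be registered through the runtime API with the handoff done over HTTP, not a direct method call.

```python
class MasterBackend(object):
    """Resident instance that accumulates scores from dying instances."""
    def __init__(self):
        self.scores = {}

    def absorb(self, scores):
        self.scores.update(scores)


class DynamicBackend(object):
    """Auto-scaled instance holding the current round's scores in memory."""
    def __init__(self, master):
        self.master = master
        self.scores = {}

    def submit(self, user, score):
        self.scores[user] = score

    def shutdown(self):
        # Shutdown hook: hand everything to the master before dying.
        self.master.absorb(self.scores)
        self.scores = {}


def reap(master, live_dynamics):
    """Merge scores from every live dynamic instance plus the master."""
    merged = dict(master.scores)
    for backend in live_dynamics:
        merged.update(backend.scores)
    return merged
```

The key property is that a score is never lost: it is either still in a live dynamic instance or has been absorbed by the master before shutdown.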

-- Takashi

 




On Friday, August 3, 2012 3:49:24 AM UTC-4, Mauricio Aristizabal wrote:
Takashi, is there some more detailed information on why Google doesn't encourage using a connection pool?  Is it simply to encourage allowing the db instance to wind down instead of being kept alive only by pool connection health checks?  If so I'm sure it could be configured to avoid this.

It does seem to me that it could reduce Richard's costs drastically, by 2/3 just on the writes to the db by his own numbers.

I've been using pooling without issue for several months now, though admittedly with very little traffic so far, so if you think this is going to get me in trouble later I'm very eager to hear why.



On Fri, Aug 3, 2012 at 12:28 AM, Richard Watson <richard...@gmail.com> wrote:
What are the performance characteristics of connecting to Google Compute Engine?  Maybe slap the in-memory app onto that.




--
Takashi Matsuo

alex

unread,
Aug 3, 2012, 10:02:09 AM8/3/12
to google-a...@googlegroups.com
From Rackspace (London):

This is ApacheBench, Version 2.3 <$Revision: 655654 $>

Server Software:        Google
Server Hostname:        sven-anagramhero.appspot.com
Server Port:            80

Document Path:          /client/loadtest
Document Length:        2 bytes

Concurrency Level:      200
Time taken for tests:   15.694 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      171000 bytes
HTML transferred:       2000 bytes
Requests per second:    63.72 [#/sec] (mean)
Time per request:       3138.712 [ms] (mean)
Time per request:       15.694 [ms] (mean, across all concurrent requests)
Transfer rate:          10.64 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        8    8   1.5      8      22
Processing:   139 2827 1197.6   2910    8487
Waiting:      139 2827 1197.6   2910    8487
Total:        147 2835 1197.6   2918    8494

Percentage of the requests served within a certain time (ms)
  50%   2918
  66%   3341
  75%   3620
  80%   3874
  90%   4257
  95%   4700
  98%   5900
  99%   6131
 100%   8494 (longest request)


This is ApacheBench, Version 2.3 <$Revision: 655654 $>

Server Software:        Google
Server Hostname:        sven-anagramhero.appspot.com
Server Port:            80

Document Path:          /client/loadtest
Document Length:        2 bytes

Concurrency Level:      500
Time taken for tests:   6.879 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      171000 bytes
HTML transferred:       2000 bytes
Requests per second:    145.37 [#/sec] (mean)
Time per request:       3439.463 [ms] (mean)
Time per request:       6.879 [ms] (mean, across all concurrent requests)
Transfer rate:          24.28 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        8   17   9.0     11      28
Processing:   144 2210 1535.1   1885    6831
Waiting:      144 2210 1535.2   1885    6831
Total:        152 2227 1539.7   1894    6853

Percentage of the requests served within a certain time (ms)
  50%   1894
  66%   2410
  75%   3100
  80%   3225
  90%   4492
  95%   5628
  98%   6418
  99%   6484
 100%   6853 (longest request)

Richard

unread,
Aug 3, 2012, 12:19:24 PM8/3/12
to google-a...@googlegroups.com
Thanks Alex, VERY much appreciated, since I can't test this myself without buying a shell account somewhere.

Luckily, the backend crashed due to being unable to reuse the connection for the delete.  So I added some exception handling :)

Can I ask some more people to try this link:  http://sven-anagramhero.appspot.com/client/loadtest

Please ping it once from a web browser just before you hit it.  This will ensure the DB is up :)

I would like to see results for loads of n >= 1000 with c >= 500.

The server clears out results every 3 minutes (synchronized to NTP time) on the minute boundary, so please try to avoid doing it exactly on that boundary (in which case the results will be spread and it makes it more difficult to ensure we did not 'lose' any).
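The minute-boundary synchronization described above amounts to a little clock arithmetic. A sketch, assuming 3-minute rounds aligned to wall-clock time (`ROUND_SECONDS` and the function name are illustrative):

```python
import time

ROUND_SECONDS = 180  # 3-minute rounds, aligned to wall-clock minute boundaries


def seconds_until_next_round(now=None):
    """How long until the next round boundary (e.g. :00, :03, :06 ...)."""
    if now is None:
        now = time.time()
    return ROUND_SECONDS - (now % ROUND_SECONDS)
```

A reaper can sleep for this amount rather than relying on cron firing exactly on time; at an exact boundary the function returns a full round length, so the caller never wakes twice in the same round.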

NOTE:  It seems we can store at least 1k users within 10 seconds ..... I really don't like the 6.8 second response (I would prefer 300 msec)..... viable ?  y/n ?

Thanks !

-R

hyperflame

unread,
Aug 3, 2012, 2:05:34 PM8/3/12
to Google App Engine
Richard,

I did some testing overnight, and I have some good news, and some bad
news.

Good news, I can give you a system that stores 1,000 users and scores
in roughly 1 second. In less than a second, I can pull out all 1,000
scores, sort the scores numerically, and print out the score list.
Bad news: It depends on memcache.

Details: Last night, I wrote an application to generate 1000 users and
1000 scores randomly, and store them in memcache. On average, this
operation takes roughly 1 - 1.3 seconds, although I suspect the
slowness is due to the random number generator, not the memcache. I'll
test this more.

Then a task is enqueued, to call another F1 instance in three seconds.

The next instance pulls out all 1000 scores, sorts them using a
treemap, and prints out the sorted data into GAE logging in less than
a second. Then, the memcache is cleared for the next iteration of the
test, so we don't get old data.

A cron job repeats this test every 2 minutes.

Here is my memcache viewer screen: http://i.imgur.com/oypAm.png . As
you can see, this service ran overnight, and didn't drop a single user/
score. Over 330,000 scores were posted and accessed in total. Is this
good enough performance for your game?
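hyperflame's pipeline (generate N scores, stash them, pull them all out and sort) boils down to something like this in Python. The memcache round-trips are omitted and the names are illustrative; the Java TreeMap sort becomes a plain `sorted` call:

```python
import random


def generate_scores(n=1000, seed=None):
    """Simulate n users posting random scores (stand-in for the memcache reads)."""
    rnd = random.Random(seed)
    return {'user%d' % i: rnd.randint(0, 10000) for i in range(n)}


def leaderboard(scores):
    """Sort highest score first; ties broken by user name."""
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

Sorting a thousand entries this way is sub-millisecond work; the cost in the real system is the memcache fetch, not the sort.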

Richard

unread,
Aug 3, 2012, 2:13:50 PM8/3/12
to google-a...@googlegroups.com
Sounds interesting..... but how do you handle write contention on the memcache data structure from multiple F1's serving client-side score submissions?

Also, I thought memcache had a size limit ?  I store a lot more than just username + score (including a full stream of all actions the user takes in the UI to prevent cheating).

-R

hyperflame

unread,
Aug 3, 2012, 2:29:20 PM8/3/12
to Google App Engine


On Aug 3, 1:13 pm, Richard <steven...@gmail.com> wrote:
> Sounds interesting..... but how do you handle write contention to the
> memcache datastorage structure from multiple F1's serving client side score
> submissions ?

I'm sure it could be done, I have some ideas regarding that (perhaps
vary the key structure depending on the instance/user?) but I really
don't want to pay the cost of multiple F1s, B1s, etc to test my
theory. I might mock up something on my local dev server if I have
time over the weekend, but I don't know how memcache works on the
local development eclipse plugin.

On Aug 3, 1:13 pm, Richard <steven...@gmail.com> wrote:
> Also, I thought memcache had a size limit ?  I store a lot more than just
> username + score (including a full stream of all actions the user takes in
> the UI to prevent cheating).

How much do you store? My general rule of thumb is that I depend on
memcache to store 1 GB of data before it starts force-expiring objects
(this is for enterprise-level, paid apps). I'm trying to Google around
for some documentation regarding the memcache limit, but it seems that
there is very little documentation regarding memcache. Frankly, I
think shooting for a 100 MB self-imposed-limit should be fine. This is
something you really should ask Takashi.
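The key-variation idea could look like the following. This is a hedged sketch: `NUM_SHARDS` and the key format are made up, and on App Engine the reaper would fetch all shard keys in one call with `memcache.get_multi` and merge the results.

```python
import zlib

NUM_SHARDS = 20  # more shards = less write contention per key


def shard_key(round_id, user):
    """Pick a stable shard key for this user's write."""
    shard = zlib.crc32(user.encode('utf-8')) % NUM_SHARDS
    return 'scores:%s:%d' % (round_id, shard)


def all_shard_keys(round_id):
    """Every key the reaper must fetch and merge for a round."""
    return ['scores:%s:%d' % (round_id, s) for s in range(NUM_SHARDS)]
```

Sharding by user also sidesteps the 1MB single-value limit, since each shard only holds a fraction of the round's data.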

Richard

unread,
Aug 3, 2012, 2:38:32 PM8/3/12
to google-a...@googlegroups.com
Sorry, I should have been more explicit.

I thought memcache had a size limit on a single object (1MB).  Now imagine I have 2000 people submitting data for a game: I don't think I will be fitting all that into 1MB.  Which means I need to store multiple objects and fan out/fan in results into memory from memcache (assuming I solve the write contention problem... WITHOUT making clients time out waiting for a write lock!).

Richard

unread,
Aug 3, 2012, 3:34:11 PM8/3/12
to google-a...@googlegroups.com
Just moved the scoring over to CloudSQL ..... and got Featured on Google Play Store 30 min ago.

Let's PRAY that Cloud SQL saves our ass.... or else I am screwed.

hyperflame

unread,
Aug 3, 2012, 3:40:49 PM8/3/12
to Google App Engine

I'm assuming you need storage space to log past user actions so you
can prevent cheating, correct? If so, couldn't you just log, say, the
past 5 (or some relatively small number) of actions and check those
for cheating?

It's difficult to talk hypothetically about these issues without a
diagram or flowchart of what is actually happening in your
application.

hyperflame

unread,
Aug 3, 2012, 3:43:43 PM8/3/12
to Google App Engine
Congrats!

Let's see some graphs afterwards, I'd be interested in seeing how
Cloud SQL holds up.

Richard

unread,
Aug 3, 2012, 3:55:01 PM8/3/12
to google-a...@googlegroups.com
Well, Cloud SQL is NOT the answer.... it is topping out around 500 users.  The extras don't make it into the DB within the 10-second window.  Then they get shown in the next window.

I can 'fix' this by silently deleting the extras and doing a client update that will insert the user into the results leaderboard if they are not there.  HOWEVER, the user's stats will then possibly be wrong (best game/etc).

Richard

unread,
Aug 3, 2012, 4:34:02 PM8/3/12
to google-a...@googlegroups.com
Ditched Cloud SQL.  Went back to the old system of saving lightweight DB objects.  I have no idea what to do tonight when DB queries stop working again (as they have every night for the last week!)

Richard

unread,
Aug 3, 2012, 5:08:31 PM8/3/12
to google-a...@googlegroups.com
Currently moved to the following:
 
Lightweight Score() db object has a timestamp with the following:
  timestamp = db.DateTimeProperty(auto_now=True)
 
B1 backend reaps all Score()'s with a timestamp < 1 min old.
 
20 secs after the reap, the B1 deletes all Score()'s with timestamp at least 10 min old.
 
My theory is:
 - When the DB is playing up and creating a Score() is VERY slow and so they don't show up in queries... having an 'old' Score() already there will make the put() strongly consistent and force an instant update.
 - Queries based on a timestamp diff will then return the Score
 - It may take 1 round for a player to get into the leaderboard, since the first put() will take some time for things to settle.
 
Comments/thoughts?
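The reap and cleanup windows above, expressed as plain-Python filters. On App Engine these would be datastore queries over the `auto_now` timestamp property; the dict-based entities and the two window constants are illustrative:

```python
import datetime

REAP_WINDOW = datetime.timedelta(minutes=1)    # scores in the current round
DELETE_AFTER = datetime.timedelta(minutes=10)  # idle scores get cleaned up


def reap(entities, now):
    """Scores touched within the last minute belong to the current round."""
    return [e for e in entities if e['timestamp'] >= now - REAP_WINDOW]


def stale(entities, now):
    """Scores idle for 10+ minutes are deleted after the reap."""
    return [e for e in entities if e['timestamp'] < now - DELETE_AFTER]
```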
 

Richard

unread,
Aug 3, 2012, 5:33:47 PM8/3/12
to google-a...@googlegroups.com
Number of users in a game:
 
522
577
575
602
623
653
259   <----- WTF ?
684
 
 
So, queries are still slow/lazy/bad/don't work properly..... Grrrr.
 
The fail list:
 - use a synchronised cron job to reap Score() ..... had to build my own NTP query engine because cron is unreliable under load
 - save object in a put() and query for it later .... fails to get all objects
 - push objects into a PULL queue and get them using a backend .... 4-10 second delay before things can be seen in the PULL queue
 - Cloud SQL ... client submission tops out around 500 entries.... then the backend cannot connect to reap the results!
 - keep Score objects in the DB and just update them ..... they don't get found in a query
 - in memory backend .... requires MANY back end instances and does not scale automatically
 
Anyone have any other possibilities to try ?
 
-R

Takashi Matsuo

unread,
Aug 3, 2012, 6:07:10 PM8/3/12
to google-a...@googlegroups.com
On Sat, Aug 4, 2012 at 6:33 AM, Richard <stev...@gmail.com> wrote:
Number of users in a game:
 
522
577
575
602
623
653
259   <----- WTF ?
684
 
 
So, queries are still slow/lazy/bad/don't work properly..... Grrrr.
 
The fail list:
 - use a synchronised cron job to reap Score() ..... had to build my own NTP query engine because cron is unreliable under load
 - save object in a put() and query for it later .... fails to get all objects
 - push objects into a PULL queue and get them using a backend .... 4-10 second delay before things can be seen in the PULL queue
 - Cloud SQL ... client submission tops out around 500 entries.... then the backend cannot connect to reap the results!
 - keep Score objects in the DB and just update them ..... they dont get found in a query
 - in memory backend .... requires MANY back end instances and does not scale automatically

Just wanted to make sure...
Have you seen my post about auto-scaling in-memory backends?
 
 



--
Takashi Matsuo

Richard

unread,
Aug 3, 2012, 8:20:47 PM8/3/12
to google-a...@googlegroups.com
Hi Takashi,

Yes, I read your post with a theoretical model, but unfortunately, I don't really know how to tell it to scale up/down ?

-R

Takashi Matsuo

unread,
Aug 3, 2012, 11:18:37 PM8/3/12
to google-a...@googlegroups.com

Dynamic backend instances automatically scale up/down.


Richard

unread,
Aug 4, 2012, 12:39:47 AM8/4/12
to google-a...@googlegroups.com
6 hours of straight coding and testing later, we now have a new backend with 10 static B1's acting as a sharded memory proxy for the results.

We seem to be handling around 900 players with no problems at the moment.

Special thanks to Takashi for the design & a lot more!

Time for some much needed sleep.
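The sharded-memory-proxy arrangement can be sketched as frontends hashing each user to a stable backend instance. The addressing format follows App Engine's `instance.backend.app.appspot.com` convention, but the backend and app names here are made up:

```python
import zlib

NUM_SHARDS = 10  # the ten static B1 instances


def shard_for(user):
    """Stable shard assignment so a user's score always lands on one B1."""
    return zlib.crc32(user.encode('utf-8')) % NUM_SHARDS


def shard_url(user, backend='scoreproxy', app='myapp'):
    # Frontends urlfetch the shard; the reaper fans in across all shards.
    return 'http://%d.%s.%s.appspot.com/submit' % (shard_for(user), backend, app)
```

Because the assignment is a pure function of the user name, any F1 can route a submission without coordination, and the reaper knows exactly which ten instances to query.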



Takashi Matsuo

unread,
Aug 4, 2012, 5:32:00 PM8/4/12
to google-a...@googlegroups.com


Great to hear that and congrats to your success on Google Play!


Jeff Schnitzer

unread,
Aug 6, 2012, 1:23:29 AM8/6/12
to google-a...@googlegroups.com
Commendable... but insane! I still think the answer is to move the
collector outside of GAE. Hell, run it in Google Compute Engine. I
have a java server that handles a thousand qps doing push
notifications on an $11/mo rackspacecloud vps... and that isn't even
topped out.

Ultimately you're just collecting a bunch of numbers, then providing
them to a reaper process. You're spending 50X what you should and
writing all this ugly sharding code just because GAE backends have a
massive throughput problem.

I'm not saying abandon GAE, just move the parts that GAE does poorly
elsewhere. I can't see any compelling reason to keep the collector in
appengine, and a lot of good reasons to move it.

Jeff


Takashi Matsuo

unread,
Aug 7, 2012, 3:52:01 AM8/7/12
to google-a...@googlegroups.com

Richard,

I've done some experiments with the dynamic backends, and unfortunately I found that they're not suitable for your needs.
Please feel free to ask again if you have further questions.

-- Takashi
--
Takashi Matsuo

Richard

unread,
Aug 7, 2012, 3:54:51 PM8/7/12
to google-a...@googlegroups.com
Yeah, I pretty much figured that firing up backends dynamically would be a little slow.  Thank you for testing this option!

I just registered to test/use a Google Compute machine.  I would like to test some options on that.  Hopefully they approve me to use it soon.

-R