I am getting a lot of transaction collisions


Michael Robellard

Aug 13, 2010, 11:24:07 PM
to Google App Engine Ranklist
08-13 08:21PM 44.950
Transaction collision for entity group with key
datastore_types.Key.from_path(u'ranker', 23150L,
_app=u'railroadempire'). Retrying...

I have about 1.4 requests/sec that are trying to set the ranking.

Do I have something configured wrong, or does ranklist not support this high a throughput?

Bartholomew Furrow

Aug 14, 2010, 1:04:11 PM
to google-app-en...@googlegroups.com
Hi Michael,

I'm only guessing, but probably the best way to deal with this is to batch your updates.  Whenever a scoreboard row's score changes, set a 'dirty' field on it to True; then, separately, have a cron job continually querying for rows with 'dirty' set to True and running set_scores on the ranker for all of them.  That way you get the advantage of applying a lot of updates at once, and during periods of heavy use you just fall behind a bit instead of failing requests.
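
A minimal sketch of that dirty-flag batching idea, assuming a db.Model named Player and a get_ranker() helper (both illustrative, not from this thread), with set_scores spelled SetScores as in the ranklist source:

    from google.appengine.ext import db

    class Player(db.Model):
        score = db.IntegerProperty(default=0)
        dirty = db.BooleanProperty(default=False)  # set True whenever score changes

    def flush_dirty_scores():
        # Cron handler body: push all dirty scores into the ranker in one batch.
        players = Player.all().filter('dirty =', True).fetch(50)
        if not players:
            return
        r = get_ranker()  # however your app loads its ranker instance
        # ranklist scores are lists, hence [p.score]
        r.SetScores(dict((str(p.key().id_or_name()), [p.score])
                         for p in players))
        for p in players:
            p.dirty = False
        db.put(players)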

Beyond that I think you'd need to get into the specifics of the tree structure and whether certain nodes happen to be getting hit more often than others.  For example, if you're calling num_ranked_users (or whatever it's called) a lot, you could consider slapping a memcache layer on that with an expiry time of 20s so your root node doesn't get hit nearly as often.
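
A sketch of that memcache idea (the cache key and the TotalRankedScores method name are assumptions; check your ranker's actual API):

    from google.appengine.api import memcache

    def cached_total_ranked(r):
        # Serve the count from memcache for up to 20s so the root node is read rarely.
        count = memcache.get('ranker_total')
        if count is None:
            count = r.TotalRankedScores()  # this read hits the tree's root node
            memcache.add('ranker_total', count, time=20)
        return count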

Thoughts?
Bartholomew




Robellard, Michael

Aug 14, 2010, 6:37:35 PM
to google-app-en...@googlegroups.com
I batched my updates, and that seems to be working pretty well. I can get up to 50 in a single request before I start having memory issues, but fifty seems to be OK for now; I'm averaging about 38 player updates per minute. If it goes much higher I can put a task queue behind the cron job, and when I hit fifty just add a task to continue the work.
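
That task-chaining idea might look roughly like this (the handler URL is taken from the log line below, and Player comes from the earlier sketch; in 2010-era SDKs the Task Queue API lived under api.labs):

    from google.appengine.api.labs import taskqueue

    BATCH = 50

    def set_ranks():
        players = Player.all().filter('dirty =', True).fetch(BATCH)
        if not players:
            return
        # ... update the ranker and clear the dirty flags, as in the sketch above ...
        if len(players) == BATCH:
            # A full batch suggests more rows are waiting; chain another task.
            taskqueue.add(url='/traintasks/setranks')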

The log lines look like this:
08-14 03:28PM 00.970 /traintasks/setranks 200 4253ms 19410cpu_ms 14024api_cpu_ms 0kb AppEngine-Google; (+http://code.google.com/appengine

I am assuming the high CPU times are because of the large amount of work the datastore is doing to get all the entities?
--
Michael Robellard
(216)288-2745
Play my game Railroad Empire http://apps.facebook.com/railroadempire

Bartholomew Furrow

Aug 14, 2010, 7:21:25 PM
to google-app-en...@googlegroups.com
I batched my updates, and that seems to be working pretty well. I can get up to 50 in a single request before I start having memory issues, but fifty seems to be OK for now; I'm averaging about 38 player updates per minute. If it goes much higher I can put a task queue behind the cron job, and when I hit fifty just add a task to continue the work.

Cool!  I'm glad it worked out.
 
The log lines look like this:
08-14 03:28PM 00.970 /traintasks/setranks 200 4253ms 19410cpu_ms 14024api_cpu_ms 0kb AppEngine-Google; (+http://code.google.com/appengine

I am assuming the high CPU times are because of the large amount of work the datastore is doing to get all the entities?

Whew, that's a lot of CPU use.  What's the score range of your ranker?  I suspect that's exactly what it is -- there's some client-side calculation that happens to figure out exactly which nodes to update -- but I'm curious exactly how poorly this benchmarks for set_multi. :-)

Robellard, Michael

Aug 14, 2010, 8:42:09 PM
to google-app-en...@googlegroups.com
I think part of the problem is my score range:

I need to make it much smaller; I got carried away. The most money the best player in the game currently has is just over $1,000,000,000, but I allow for up to 100,000,000,000,000. I could also divide the money by $1,000, and I don't think I would even lose any granularity.

    r = ranker.Ranker.Create([-100000000000000, 100000000000000], 100)
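
Concretely, dividing by $1,000 shrinks the range by three orders of magnitude; something like this (the SetScore spelling and the cashOnHand property borrowed from later in this thread are assumptions, and ranklist scores are lists):

    r = ranker.Ranker.Create([-100000000000, 100000000000], 100)
    r.SetScore(str(player.key().id_or_name()), [player.cashOnHand // 1000])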



Bartholomew Furrow

Aug 14, 2010, 9:01:40 PM
to google-app-en...@googlegroups.com
Hm, I guess it's tricky if the amount of money is just unbounded.  It shouldn't really add much overhead to have that extra depth, though, if we implemented it right; unaffected nodes just shouldn't be changed by set_multi.

Sebastian Kanthak

Aug 16, 2010, 12:11:23 PM
to google-app-en...@googlegroups.com
On Sat, Aug 14, 2010 at 10:04 AM, Bartholomew Furrow <fur...@gmail.com> wrote:
Hi Michael,

I'm only guessing, but probably the best way to deal with this is to batch your updates.  Whenever a scoreboard row's score changes, set a 'dirty' field on it to True; then, separately, have a cron job continually querying for rows with 'dirty' set to True and running set_scores on the ranker for all of them.  That way you get the advantage of applying a lot of updates at once, and during periods of heavy use you just fall behind a bit instead of failing requests.

I agree with Bartholomew that "batching" is the way to go.


Beyond that I think you'd need to get into the specifics of the tree structure and whether certain nodes happen to be getting hit more often than others.  For example, if you're calling num_ranked_users (or whatever it's called) a lot, you could consider slapping a memcache layer on that with an expiry time of 20s so your root node doesn't get hit nearly as often.

The entire ranker structure is one "entity group", so concurrent transactions that modify any node of it (not necessarily the same node) will collide. That's why the throughput is limited to something around 1 (batched) update per second -- App Engine's optimistic concurrency model just doesn't give you more throughput per entity group.
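
In practice that means even a batched update can fail under contention. One way to cope (illustrative only, assuming a pending dict called batch and that the contention surfaces as db.TransactionFailedError once the built-in retries are exhausted) is to leave the rows dirty and let the next run pick them up:

    from google.appengine.ext import db

    try:
        r.SetScores(batch)  # the whole ranker tree is a single entity group
    except db.TransactionFailedError:
        pass  # rows stay dirty; the next cron run will retry them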

Splitting up the ranker into several entity groups would be an interesting experiment, but you'd have to think pretty hard about inconsistencies between those entity groups.
 
Sebastian

Robellard, Michael

Aug 16, 2010, 2:15:29 PM
to google-app-en...@googlegroups.com
I have implemented a batch system.

I wrote about the details today on my blog:

Bartholomew Furrow

Aug 16, 2010, 8:48:47 PM
to google-app-en...@googlegroups.com
Nice post!  The only reason I suggested a cron job, by the way, is that it's the technology we had available to us when we started doing this (actually we just had a machine constantly pinging a secret URL).  There might be some newer, better way to say "keep doing the following operation constantly, but pause for a few seconds between repetitions if you don't get any work done."  That way you wouldn't have the minute of latency.

This raises the interesting question of what you'd do if your consistent usage got past the point where ranklist could keep up with its current implementation.  How close to that do you think you are?

Robellard, Michael

Aug 16, 2010, 9:47:55 PM
to google-app-en...@googlegroups.com
Now I have about 20x room for growth, based on my current usage, before that would become a problem. After that it would get tricky; I would probably just say that rankings are only updated once every x hours.

Nickolas Daskalou

Aug 16, 2010, 10:22:54 PM
to google-app-en...@googlegroups.com
I've been following this thread with interest.

Cool blog post, Michael! Correct me if I'm wrong, though: your implementation (in your blog post) allows for a small window of time between:

for player in playerquery.fetch(50): # time 1

and

db.put(players) # time 2

where a new value for a player's cashOnHand will be overwritten by the db.put(players) line if that new value is saved to the datastore between time 1 and time 2.

Although this is probably unlikely, have you managed to get around this (e.g. by put()ting each player entity within a transaction)?

Nick

Bartholomew Furrow

Aug 17, 2010, 1:56:58 AM
to google-app-en...@googlegroups.com
Good call, Nickolas.  In fact, you particularly want to:

1. Get the scoreboard rows; save them as old_rows.
2. Update the scores in the ranker.
3. Transactionally:
     Get all the rows again; save them as new_rows.
     For each row:
       If it's the same as it was in old_rows:
         Put it with "dirty" set to False.
       Else:
         Don't change it.  We'll deal with it next time.
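
A sketch of those three steps, reusing the hypothetical Player model and get_ranker() helper from earlier. Note it runs one transaction per player, since (as the follow-up below confirms) a single transaction can't span every player's entity group:

    from google.appengine.ext import db

    def clear_if_unchanged(key, old_score):
        def txn():
            p = db.get(key)  # re-read inside the transaction
            if p.score == old_score:
                p.dirty = False  # unchanged since we ranked it; safe to clear
                p.put()
            # else: leave it dirty; the next batch re-ranks the newer score
        db.run_in_transaction(txn)

    def flush_batch():
        old = dict((p.key(), p.score) for p in
                   Player.all().filter('dirty =', True).fetch(50))   # step 1
        if not old:
            return
        r = get_ranker()
        r.SetScores(dict((str(k.id_or_name()), [s])
                         for k, s in old.items()))                   # step 2
        for key, old_score in old.items():                           # step 3
            clear_if_unchanged(key, old_score)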

Robellard, Michael

Aug 17, 2010, 9:25:53 AM
to google-app-en...@googlegroups.com
That's a good point about wanting to do something transactionally.

It'd be nice if Memcache had a way to atomically add something to a list the way you can do atomic increment and decrement operations. That would make the whole issue very easy to deal with.

Can I put all the calls to different individual players in the same transaction? I thought you would have issues with that.

Bartholomew Furrow

Aug 17, 2010, 3:27:29 PM
to google-app-en...@googlegroups.com
It'd be nice if Memcache had a way to atomically add something to a list the way you can do atomic increment and decrement operations. That would make the whole issue very easy to deal with.

Don't I wish...
 

Can I put all the calls to different individual players in the same transaction? I thought you would have issues with that.

Bleh, of course not; sorry, my brain's a little fried. :-)