counter use pattern


josh l

Nov 1, 2008, 1:50:42 PM
to Google App Engine
I was just looking over Bill Katz' counter code (
http://billkatz.com/2008/10/Fault-tolerant-counters-for-App-Engine )
and had a question about a reasonable pattern for using it in the
following situation:
- I want to count daily usage in certain parts of my site by each
user.

Right now I have a UserStats model which has the user, the date, and a
reference to a counter. Previously I was using a more simplistic
sharded counter that was based on the db.Model, rather than a python
object, so in my UserStats model I could write a simple method called
get_daily_counter to get counter results for the day with a line like
this:

def get_daily_counter(self, counter_key):
    return Counter.get_or_insert(counter_key, name=counter_key)

The above would create the counter entity for the day if it didn't
exist, and then I could get the count, in my view.

With the new code, we have a python object rather than a model I can
get_or_insert, so I am wondering if the following seems like a
reasonable pattern for GAE:

def get_daily_counter(self, counter_key):
    if counter_key not in vars():
        vars()[counter_key] = Counter(counter_key)
    return vars()[counter_key]

If not, is there a recommended way for implementing this counter such
that it gets instantiated if not already in existence, and returned
appropriately if so?

Thanks,

-Josh

Bill

Nov 1, 2008, 5:13:36 PM
to Google App Engine
> With the new code, we have a python object rather than a model I can
> get_or_insert

My modified counter uses a simple python object (rather than a model)
at the level above the shards. Shard models are still used so when
you change a counter, you'll invoke a get_or_insert style transaction
at the shard level. If you look at the other counter
implementations, you'll see that a high-level Counter model stores
data like counter name and number of shards. Since I handle counter
names and the number of shards at the program level, hardwired to
particular tasks, my version simply bypasses those Counter entities as
unnecessary.

Stat-like tasks like your use case are a good fit for the approach
I've taken because counter accuracy (although very likely) is not
guaranteed. For example, if you have a string of datastore timeouts
and your memcache goes away, the counter will be inaccurate. In cases
where accurate counts, or knowledge that counting has failed, are
essential, simply returning errors and handling them on the client
side seems better.

But unless I'm misunderstanding your question, you should be able to
just always instantiate the Counter object and use it.
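Bill's suggestion amounts to treating construction as free. A minimal sketch of the idea (the Counter class below is a stand-in for Bill's, showing only that construction records configuration and does no datastore or memcache work):

```python
# Sketch: the shard-level Counter object is cheap to construct, so it
# can simply be instantiated wherever it is needed. Stand-in class;
# names and the num_shards default are assumptions.
class Counter(object):
    def __init__(self, name, num_shards=20):
        # Construction just records configuration -- no datastore or
        # memcache calls happen until the counter is actually used.
        self.name = name
        self.num_shards = num_shards

def get_daily_counter(counter_key):
    # No get_or_insert or caching needed at this level.
    return Counter(counter_key)
```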
-Bill

josh l

Nov 1, 2008, 9:39:54 PM
to Google App Engine
Bill,

Thanks for the response.
I noticed of course you're using db.models and the associated
get_or_insert at the shard level. My question, though not well
worded, was really regarding whether the method I plan to use to
instantiate your counter object (specifically, checking the vars()
dict and going from there) seemed reasonable, or had any negative
implications on a system like GAE.

As I see it, any code that uses your new counter object will need to
deal with instantiating it somewhere, and the example used in your
older version won't fit (since again it is using the get_or_insert on
a db.model Counter rather than a python object). Perhaps the answer
is so obvious it wasn't worth asking, but others might have a similar
question when wondering how best to actually implement it: ensuring
the variable name they wish to use for the counter isn't taken, and
that they're not unnecessarily instantiating multiple Counter objects.

Cheers,

-Josh

yejun

Nov 2, 2008, 1:17:41 AM
to Google App Engine
The initialization code seems very light: only three object creations
and four attribute assignments, without touching either memcache or
the datastore.
I think your code is correct; a local singleton map shouldn't be
needed.

Bill

Nov 2, 2008, 1:11:19 AM
to Google App Engine
> As I see it, any code that uses your new counter object will need to
> deal with instantiating it somewhere
...
> others might have a
> similar question when wondering how best to actually implement it and
> ensure the variable name they wish to use for the counter is not
> taken, and that they're not unnecessarily instantiating multiple
> Counter objects for no reason.

The way I'm using counters, they are hardwired into the code right at
the point where counter info needs to be retrieved or set, and I do
the instantiation right there. For my app, I guarantee variable
names are OK just via code review and by making sure each group of
counters has a special prefix, so there can't be conflicts across
groups of counters.

I'm still relatively new to python web apps, so I can use educating
here as well, but my understanding was that a request creates a
limited lifecycle for the request handler and instantiated models,
unless you put them in the global space. So if you really want to be
efficient, you could create the counters globally so they are
sometimes kept across requests. But even if you are dynamically
creating multiple instances of Counter objects, there shouldn't be
much of a performance penalty or access issue, because these objects
are so lightweight and, by the very nature of a web app, the whole
response gets recreated each request.
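The module-global approach Bill mentions can be sketched like this (CPython runtimes on App Engine may be reused across requests, so module-level objects can persist; the Counter class and function names here are illustrative):

```python
# Sketch of caching counter instances at module level so a reused
# runtime can hand back the same object across requests.
# Counter is a stand-in class; _counters and get_counter are
# illustrative names.
class Counter(object):
    def __init__(self, name):
        self.name = name

_counters = {}  # module-level, so it may survive across requests

def get_counter(name):
    # Reuse an existing instance if this runtime already built one.
    if name not in _counters:
        _counters[name] = Counter(name)
    return _counters[name]
```

Since the objects are so lightweight, this is an optimization rather than a necessity, as Bill notes.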

Best,
Bill

yejun

Nov 2, 2008, 2:39:19 AM
to Google App Engine
I found a bug in that code. If you do an increment after memcache
exhaustion, without reading the value first, it will reset the count
to 0.

yejun

Nov 2, 2008, 3:15:22 AM
to Google App Engine
I found another bug.
Counter.get_count tests whether self.memcached.count is None, however
self.memcached.count will never return None.



josh l

Nov 2, 2008, 11:11:55 PM
to Google App Engine
I haven't actually looked deeply at the countershard/memcache code yet
(though I plan to tomorrow), so I can't speak to any bugs in it, but
just off the top of my head I think you can easily get None when
checking memcache.get('invalid_key'), no?
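The distinction Josh is pointing at — a miss returning None versus a cached count of 0 — can be demonstrated with a dict-backed stand-in for google.appengine.api.memcache (the stub class is illustrative, not the real API):

```python
# A cache miss returns None, so counter code must distinguish
# "no cached value" from a legitimately cached count of 0.
# MemcacheStub is a dict-backed stand-in for the real memcache API.
class MemcacheStub(object):
    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)  # None on a miss

memcache = MemcacheStub()
memcache.set('counter', 0)
hit = memcache.get('counter')       # 0 -- a real cached zero
miss = memcache.get('invalid_key')  # None -- key was never set
```

Testing `value is None` (rather than truthiness) is what separates the two cases, since `0` is falsy in Python.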

-Josh

josh l

Nov 3, 2008, 1:21:48 PM
to Google App Engine
Not sure by your wording if this is the same bug you noticed, but as I
see it lines 144-145 should be under the try: on line 137. That way,
the delayed counter count won't get reset to zero even in the case of
a failed transaction.

I think I'm going to rewrite this a bit, possibly from scratch, to
bring in the concept of forcing a delayed/memcache counter if
requested. So you'd pass the increment method not only the increment
amount, but also the minimum delay integer before it even attempts to
write to a shard. Under the assumption your memcache object will
stick around for a few hundred increments, you could set this to a
relatively safe low number of 10 or so, and then it would only even
attempt to write to the datastore shard (incrementing it by ~10) if
the delayed count was >= 10. This would drastically reduce the number
of attempted datastore writes, and thus the average response time
(especially for situations like mine, when I'm working with multiple
counters per http request).
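The buffering scheme described above can be sketched in plain Python. Dicts stand in for memcache and the datastore shard, and the function and parameter names (increment, min_delay, the flush logic) are illustrative, not code from the linked counter:

```python
# Sketch of Josh's proposal: accumulate increments in memcache and
# only write a shard once the buffered delta reaches min_delay.
cache = {}                  # stands in for memcache
shard_total = {'count': 0}  # stands in for a datastore shard

def increment(key, amount=1, min_delay=10):
    buffered = cache.get(key, 0) + amount
    if buffered >= min_delay:
        # Flush the whole buffered delta to the datastore shard
        # (a transaction in a real implementation).
        shard_total['count'] += buffered
        cache[key] = 0
    else:
        cache[key] = buffered

for _ in range(25):
    increment('hits')
# After 25 increments with min_delay=10, two flushes of 10 each have
# hit the "datastore"; 5 increments remain buffered in the cache.
```

The total count is then the shard total plus whatever is still buffered, which is why a memcache flush between datastore writes loses at most min_delay counts.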

The issue here is that, even more so than with Bill's code, you're
relying on memcache not disappearing all the time. In general, I
think this is a safe bet. And if it happens just occasionally (e.g.
once every few days), then I don't care - it doesn't matter to me if
my total counts are off by a tiny fraction of a percent (and of
course, for counters that are more important, just set the minimum
delay integer lower).
More of an issue, however, is your counter memcache disappearing
because it's outdated due to tons of other memcache items having been
created since your original counter was created. This is because I
_think_ the memcache items are removed based on first created, first
out - and not on last modified, last out. To solve this, I'm also
planning to destroy and recreate the memcache object upon successful
datastore write (and associated memcache delay being reset to zero).

Any comments on this approach are appreciated.

-Josh

yejun

Nov 3, 2008, 3:16:07 PM
to Google App Engine
I fixed some problems in my copy of Bill's code. You can find it here:
http://github.com/yejun/sharded_counter/tree/master/counter.py

It is very hard to simulate a real contention incident, so my code
isn't tested. The problem in Bill's code is that his memcache class
is completely backend agnostic, so it makes a wrong assumption when
the counter is unavailable in memcache.

I added a default_value parameter to get_count() and increment() to
alter the behavior when a cache miss happens.
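As I read it, the idea is roughly the following (a plain-Python sketch with a dict standing in for memcache; the signature is an assumption about the fork's API, not a verbatim excerpt):

```python
# Sketch of a default_value parameter: on a cache miss, fall back to
# the caller-supplied default instead of silently treating the
# counter as zero. The dict `cache` stands in for memcache.
def get_count(cache, key, default_value=0):
    value = cache.get(key)
    if value is None:
        # Cache miss: the caller decides what a miss should mean.
        return default_value
    return value
```

This lets callers distinguish "not cached" from "cached as zero," which is exactly the bug discussed earlier in the thread.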

yejun

Nov 3, 2008, 3:22:39 PM
to Google App Engine
> To solve this, I'm also
> planning to destroy and recreate the memcache object upon successful
> datastore write (and associated memcache delay being reset to zero).

You shouldn't do this. It completely negates the reason you use this
counter class.
If you just need a counter for a local object, you should just store
the count with the object itself.

This counter class should only be used in situations where the same
named counter needs to be incremented more than 10 times per second
by different requests/users concurrently.

josh l

Nov 3, 2008, 3:43:14 PM
to Google App Engine
Yejun,

Thanks for the updated code example. Since a counter is such a common
need, I think it might be helpful if we all worked together on the
same codebase, rather than forking each time, or at least if we could
be specific about the changes we made (I know I can do a diff, but if
you noted what exactly you changed, and why, that would be awesome for
all future users who are curious).

Moving on, I think I didn't explain myself well regarding the
destruction of the memcache object. Imagine an app where:
1) There will definitely be at least 10 counter requests/sec for the
same named counter (let's call it the TotalRequests counter, referred
to by some Stats model)
2) Lots and lots of other entities get written to memcache (millions
of entities in the system, and each gets cached upon initial request)

In this situation, it is guaranteed that objects in our memcache will
disappear after some use, since we have less total memcache available
than the size of the items that will be cached over a few days of
use. Now, which items get removed? In this case, our counter is the
first created item in memcache, and definitely one of the first items
to be nuked from memcache when we hit the max storage limit for our
app's memcache. To ensure it never gets nuked for being the 'oldest
object in memcache', we could 'occasionally' destroy/recreate it.
Maybe, for example, I could also have a time_created on it, and if
it's older than a few hours, nuke/recreate it upon resetting it. I
figured I might as well do this every time, but anyway, hopefully you
see my point as to why I was thinking about the need to
destroy/recreate.

Much more important than this very occasional miscount for a
destroyed memcache item, though, is my general idea of not even
attempting to write to a shard entity unless we've had a few (10?
50?) counter increments. I am getting ~350ms/request average due to
the time it takes writing to the shards (multiple counters/request),
and this is my main concern with the current code.

I will diff your code (thanks again) and check it out this afternoon.

-Josh

yejun

Nov 3, 2008, 3:54:54 PM
to Google App Engine
I believe the memcached algorithm keeps the most used objects, not
just the most recently created.
The reason you use a sharded counter is that you want the increment
operation to always hit the disk. Otherwise you wouldn't need shards,
because you wouldn't access the datastore very often if writes were
cached in memcache. Sharding is used here to improve write
concurrency.
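Yejun's point about write concurrency is the core of the sharded-counter technique: each increment picks one of N shards at random, so concurrent writers rarely contend on the same entity. A minimal sketch (a list stands in for the shard entities; in the real code each slot is a datastore entity updated in its own transaction):

```python
import random

# Minimal sketch of a sharded counter: increments are spread across
# NUM_SHARDS slots so concurrent writers rarely touch the same one.
# The list stands in for per-shard datastore entities.
NUM_SHARDS = 20
shards = [0] * NUM_SHARDS

def increment():
    # Pick a shard at random; in the real implementation this slot
    # update is a datastore transaction on one shard entity.
    index = random.randint(0, NUM_SHARDS - 1)
    shards[index] += 1

def get_count():
    # Reading requires summing every shard.
    return sum(shards)

for _ in range(100):
    increment()
```

The trade-off is visible even in the sketch: writes scale with the number of shards, but every read must fetch and sum all of them.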

josh l

Nov 3, 2008, 4:32:57 PM
to Google App Engine
Yejun,

I've been told that the memcached framework on GAE uses a first-
created, first-deleted algorithm, and that we have a finite amount (a
few hundred megs, maybe even up to a GB) of memcache for any specific
app. This means that once you hit your limit, older objects WILL get
deleted. And my app will definitely be going over this max limit.
This is not a huge deal to me (it probably won't happen that my
counter gets deleted that often, and it's ok if it's a little bit
off), but I figured my counter might as well handle that small case
anyway.

Again, the much bigger case: Not writing to the datastore each time
you want to increment. And yes, I am aware of why to use the sharded
counter. The point is what if you have about 50QPS coming in (so you
need a sharded counter with a lot of shards for sure), and every
single request is writing to ~3 different counters. Now each request
is taking a while because of the datastore writes it attempts each
time, even with no transaction collisions on shard writes. Also, I
believe there is a GAE watchdog checking whether, over a period of
time, your average request is >300ms.

So I am simply saying: why not try to cut down on the total datastore
writes and write to a shard only 1/10 of the time, but still get the
correct totals? This is the reasoning for my arguments above. So now
you have an order of magnitude fewer datastore writes, and the
average response time is way down. This sounds good to me, and I am
sure others who plan to write apps with a number of sharded counter
increments per average request might feel similarly. Am I missing
something obvious here?

-Josh

yejun

Nov 3, 2008, 4:56:08 PM
to Google App Engine
The reason not to use shard counting together with memcache-cached
writes is that with memcache you will only write to the datastore
once per second, or once every 10 seconds. I see no reason why you
would need a sharded counter at that kind of frequency.
And when a memcache reset happens, losing 1/10 of a second of data or
a full second seems to have similar reliability to me.

josh l

Nov 3, 2008, 5:11:40 PM
to Google App Engine
Why do you say that with memcache I would only write to the datastore
once per second, or every 10 seconds? This doesn't make any sense...
If I only attempt to write 1/10 of my counter increments to the
datastore, I will still be writing a few times per second at least (I
am writing this to expect >30QPS, but likely it will be more).

Also, my issue/question is not regarding Transaction Collisions. Of
course these are avoidable by just using many shards, and adding more
shards until collisions are extremely infrequent. My issue is
regarding total request/response time. Attempting to write anything
to the datastore takes time. It just does. The cpu cycles may not
count against us (in $cost), but the time does (or seems to, at
least). If a request ends up needing to write to multiple counters
(and all of mine do), quickly the total response is in the many
hundreds of ms. I don't want this, I want quicker.

My testing shows memcache to be fairly reliable. I am not worried
about the very occasional (I hope) memcache issue. I am worried about
total time per request. I believe I can drastically shorten this time
by not attempting to write to the datastore as often, and having the
counter use just memcache most of the time for increments. I can't
imagine I am the only person to think of this?

-Josh

yejun

Nov 3, 2008, 5:17:28 PM
to Google App Engine
I am not saying they are the same. But you are degrading a 100%
reliable solution to 99% or 99.9%. To me, 99.9% and 99% have similar
reliability compared to 100%. And the complexity of combining
memcache writes and sharded writes also possibly makes the solution
somewhat unreliable.

josh l

Nov 3, 2008, 5:26:40 PM
to Google App Engine
How is your counter 100% reliable? It could have a delayed
transaction (or a bunch) and then a memcache failure.

I agree my write-to-memcache-first counter would be (slightly) more
complex and, like yours, it _could_ have issues if memcache failed
for one reason or another. In that (rare) case it is likely mine
_would_ lose some counts and yours likely would not -- but I think it
is a rare case, and I think the very occasional loss of some counts
is more than made up for by an order of magnitude (or more) fewer
writes, and drastically faster response times.

According to this video
http://www.youtube.com/watch?v=CmyFcChTc4M&eurl=http://www.technorati.com/videos/youtube.com/watch%3Fv%3DCmyFcChTc4M
(at 21:10), you will hit a watchdog if your avg. requests are >300ms,
and I have seen it happen. I need to avoid it, and I don't see
another way to do that, when the app requires multiple counter
increments per request, except with my counter example (far) above.

-Josh

yejun

Nov 3, 2008, 5:34:24 PM
to Google App Engine
Of course nothing can be 100% reliable. Sorry, I can't express my
opinion very clearly.
The problem is that you are making the counter so complex that the
effort you put in may not be worth the man-hours.


josh l

Nov 3, 2008, 5:52:38 PM
to Google App Engine
Yejun,

Well I definitely appreciate your input. I do not think it would be
much more complex, but of course I also would like to avoid it if I
can.

I've posted a message just now requesting clarification on the 300ms
average. If it turns out datastore (shard-write) operations won't
count against us for the average response time, I will not have to
write my own, and instead be able to just use your counter code - so
thanks again for that.

-Josh

Bill

Nov 11, 2008, 4:12:10 PM
to Google App Engine
I probably should've piped up on the thread earlier. I'm currently
looking at yejun's fork and will merge pending some questions I have
on his optimizations.

Here's one:
My old code stored the counter name in each shard so I could get all
shards with a single fetch. If you have 20 shards, you could have any
number of actually created shards. In a very high transaction system,
probably all 20 shards exist.
In yejun's optimization, he's iterating through each shard using
get_by_key_name and checking if it exists. Which is likely to be
faster?

A nice optimization by yejun is making count a TextProperty. This
will prevent indexing and would probably save some cycles.

Josh, you said "lines 144-145 should be under the try: on line 137.
That way, the delayed counter count won't get reset to zero even in
the case of a failed transaction."

I thought any datastore errors are handled via the db.Error exception
which forces a return before the delayed counter count is reset.
(db.Error is defined in appengine/api/datastore_errors.py)

On Josh's memcached buffering scheme, I can definitely see the utility
if you're willing to sacrifice some reliability (we're down to one
point of failure -- memcache) for possibly a lot of speed. Using
memcache buffers for counters makes sense because it's easy to
accumulate requests while for other model puts, like comments or other
text input, the amount of memcache buffer could grow pretty large
quickly.

Does it make sense to use a sharded backend to a memcache buffer?
Depends on the frequency of the final datastore writes, as mentioned
above. (I'm not as concerned with complexity as yejun because I think
this buffering is reasonably simple for counters.) So I think this
would be a good thing to add onto the sharded counter code through a
fork. Then people who want that kind of speed can opt for it.

-Bill



yejun

Nov 11, 2008, 4:41:15 PM
to Google App Engine
Every transfer of a value between memcache and the datastore is a
non-atomic operation. The original design caches the value from the
datastore into memcache; the equivalent failure mode is just a random
exception thrown from the transaction.

I feel a non-sharded counter plus memcache would be the more robust
solution.

A reasonable-frequency update from memcache to the datastore
completely eliminates the need for a sharded implementation, while a
high-frequency memcache-to-datastore implementation will be very
unreliable due to the high frequency of non-atomic operations.