I think of it this way:
- a task executes only if there is a token in the bucket, and it consumes that token
- the bucket size sets the maximum number of tokens the bucket can hold
- the bucket is refilled with tokens at the queue rate
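So, when the bucket is "empty", tasks can only run as fast as tokens arrive, which means you are effectively processing at your queue rate (i.e., the rate at which the bucket is being refilled). When the bucket has tokens saved up, tasks can run in a short burst above that rate until the tokens are used. The max bucket size is 100, so this allows you to have a small burst in processing if desired.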
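To make the burst-then-steady behaviour concrete, here is a rough simulation of the idea above. It is only a sketch under simplifying assumptions: time advances in whole seconds, the bucket starts full, and refills happen once per second rather than continuously. The function name `simulate` and its parameters are made up for illustration and are not part of any App Engine API.

```python
def simulate(rate, bucket_size, pending, seconds):
    """Return how many tasks run in each one-second step.

    rate        -- tokens added per second (the queue rate)
    bucket_size -- maximum number of tokens the bucket can hold
    pending     -- number of tasks waiting in the queue at the start
    seconds     -- how many one-second steps to simulate
    """
    tokens = bucket_size  # assumption: the bucket starts full
    executed_per_second = []
    for _ in range(seconds):
        # Each task consumes one token, so we can run at most `tokens` tasks.
        executed = min(pending, tokens)
        tokens -= executed
        pending -= executed
        executed_per_second.append(executed)
        # Refill at the queue rate, never exceeding the bucket size.
        tokens = min(bucket_size, tokens + rate)
    return executed_per_second


if __name__ == "__main__":
    # 30 queued tasks, rate of 5 tokens/s, bucket size of 10.
    print(simulate(rate=5, bucket_size=10, pending=30, seconds=6))
    # prints [10, 5, 5, 5, 5, 0]
```

The first second drains the full bucket (a burst of 10), after which the bucket is "empty" and throughput settles at the refill rate of 5 per second.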
This is the clearest description from the docs:
The task queue uses token buckets to control the rate of task execution. Each named queue has a token bucket that holds a certain number of tokens, defined by the bucket_size directive. Each time your application executes a task, it uses a token. Your app continues processing tasks in the queue until the queue's bucket runs out of tokens. App Engine refills the bucket with new tokens continuously based on the rate that you specified for the queue.
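For reference, the two settings mentioned in that passage would sit side by side in a queue definition roughly like this. The queue name and the values are placeholders chosen for illustration, not recommendations:

```yaml
queue:
- name: example-queue   # hypothetical queue name
  rate: 5/s             # refill rate: 5 tokens per second
  bucket_size: 10       # bucket holds at most 10 tokens
```

With these values, up to 10 tasks could run in a burst, after which processing would settle at about 5 tasks per second.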