So it may help to understand how the plugin manages the quota. The quota gets divided into three parts:
- Burst (this is about 15-20%)
- Normal (this is about 75-80%)
- Exception (this is about 5%)
The idea is that the Normal quota is spread evenly over the whole hour, while the Burst quota covers "bursty" use cases. The Exception quota is a safety margin in case requests accidentally use more than we have budgeted: per the GitHub ToS, all your Jenkins instances are supposed to be using the same API key, so if you have 5 masters then, without the Exception quota, 4 of them could erroneously think that there is an unused API call and then fail.
Further complications arise because the GitHub API Java client library we are using does not make it easy to predict how many API calls will be made. For example, listing all the repositories in an organization is a paged API, and the client library just returns an Iterator that masks all the background API calls while you iterate the list. It may even serve the request from cached state. So a single call can turn into anywhere from 0 to an unbounded number of requests (realistically no more than 3-4 for most organizations). It's worse for listing PRs, where there can be many thousands and the page size is typically 50-100.
So what we do (all numbers are from memory and for illustrative purposes only):
- We use a linear allocation strategy for the Normal quota. For example, if we allow approx 4000 requests per hour, then 10 minutes into the hour the budget says we should have 4000*(60-10)/60 requests left in the Normal quota, i.e. approx 3333.
- Then we add on the Exception quota (approx 150)
- This gives the budget plan for now, i.e. 3333+150 = 3483 requests should be remaining
- If the actual number of requests remaining is more than this number, the request is made. If it is less, you get the log message you see. For example, if 3104 requests remain, then we have spent 379 requests more than the budget. Since we know the rate at which the Normal quota is handed out, we can sleep until the spend is expected to be back within budget: at approx 66 requests per minute, an overspend of 379 will be back on track if we make no more requests for approx the next 6 minutes.
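The arithmetic in the steps above can be sketched roughly as follows. This is only an illustration using the from-memory numbers in this comment, not the plugin's actual code: the function names, the default quota values, and the clamp to the end of the hour are all assumptions made for the example.

```python
# Illustrative sketch only: names and defaults are hypothetical,
# taken from the example numbers above rather than from the plugin.

def budget_remaining(minutes_into_hour, normal_quota=4000, exception_quota=150):
    """Requests that should still be unspent at this point in the hour."""
    # Linear allocation: the Normal quota drains evenly across the 60 minutes.
    normal_left = normal_quota * (60 - minutes_into_hour) / 60
    # The Exception quota is held back on top of that.
    return normal_left + exception_quota


def minutes_to_sleep(actual_remaining, minutes_into_hour,
                     normal_quota=4000, exception_quota=150):
    """Minutes to wait until the spend is expected to be back within budget."""
    budget = budget_remaining(minutes_into_hour, normal_quota, exception_quota)
    overspend = budget - actual_remaining
    if overspend <= 0:
        return 0.0  # within budget, so the request can go ahead now
    per_minute = normal_quota / 60  # rate at which the Normal quota is released
    # Assumption: never wait past the end of the hour, when the quota resets.
    return min(overspend / per_minute, 60 - minutes_into_hour)


if __name__ == "__main__":
    print(round(budget_remaining(10)))           # budget 10 minutes in
    print(round(minutes_to_sleep(3104, 10), 1))  # wait time with 3104 left
```

With 3104 requests left at the 10 minute mark, this gives a budget of about 3483 and a wait of a little under 6 minutes, matching the figures in the list.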
So what has happened is that in the first 10 minutes you already burned through the Burst allocation of 1000 requests. Yes, we could burst more, but that just means that, probabilistically, your builds will be delayed for an hour once the bigger burst is burned through. The current strategy means that all jobs have an equal chance of getting the "allocation". The larger the burst, the less even the spread of those 1.4 requests per second will be, and then you run the risk that specific jobs will never win and thus never get CI. I did a lot of experimenting with different strategies, and this one was the least worst. We could tweak the Burst to Normal ratio somewhat, but 1:4 produced fairer results overall while allowing for faster response to webhook notifications.