What I would do is set a “optimal” batch (you can vary this)
The scheduled task would call a page that processes 400 (for example). this runs every 30 seconds or minute or whatever. (a short period of time)
You can then make it self healing, so part of your query is “ok, get me the next 400 that haven’t been processed”
Part of the query of the queue can see how many are processing (start time but no end time yet) and if that is pretty high, you just exit.
Your script can have individual processing for each person etc. but the point here is that your system tracks and “self heals”
You could also add onError code in there and what not but that is another story.
So , to recap. You have a PersonQueue table, with :
id personID processingStartedDateTime processingEndedDateTime failureCount
You also have a scheduled task that runs very frequently. Your script will handle the throttling by aborting early if conditions are not met
1) It does a query to find out how many items are being processed (WHERE processingStartedDateTime IS NOT NULL AND processingEndedDateTime IS NULL)
2) If there are loads being processed (say 400?) Quit. you are done. Other requests are doing the work.
3) If there are a lot of OLD processingStartedDateTime WHERE processingEndedDateTime is NULL AND processingStartedDateTime > RequestTimeout… then we better fix these.
4) Select them, mark them as failed once (update PersonQueue SET failureCount ++) And clear the processingStartedDateTime (this request can now abort)
5) the next request comes in (from the scheduled task) and sees that we have met the parameters (there aren’t too many people being processed and there aren’t a bunch that need to be cleared up) we can then:
6) Get 400 people to process, set their processingStartedDateTime and get to it!
7) When you have processed one person set their processingEndedDateTime
as a quick note, I would not do:
UPDATE PersonQueue SET processingStartedDateTime = #now()#
(this was a tip from Cameron Childress actually)
But rather I would set up at the top of your processing something like request.nowtime = Now() and then do:
UPDATE PersonQueue SET processingStartedDateTime = #request.nowtime#
This makes it easier to then query the items as they would all have the same start time and end time.
Does this process make more sense?
MD