On 06/21/2012 04:40 PM, Jake - USPS wrote:
>
> This works for us although like I said I want to make it better, doing
> what you assumed I am doing ... shared storage. But since we can only
> make changes with a CHG ticket I basically make the update and then
> force a puppet run on my PMs (remote execution) and everything is
> updated in like 5 minutes. This is done during a time when the rest of
> the environment is not accessing the PMs.
None of this sounds nearly as bad to me as it may feel to you right
now :-)
> The only other issue I've run into is if apache on a PM restarts or a PM
> restarts while agents are accessing it sometimes I'll get failed runs.
> Out of 4800+ systems this usually amounts to like ~200 failures until
> the next batch of runs (every 30 minutes here) which clears it up (even
> if apache/node still down). I'm not sure if this is a limitation of
> something I am doing, or if it's just to be expected. Before using
There are some HAproxy options you can look into that may help you:
- option redispatch: should allow HAproxy to send a compilation request
to another master if the original target apache won't respond
- http-check disable-on-404: build a health check into your apache and
make it return 404 for a while before actually stopping the process.
HAproxy then stops opening new sessions with this apache instance.
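
As a rough sketch of how the two options above could be wired together
(untested against your setup): the section name, addresses, ports, the
/health check URL and all timing values are made up for illustration,
and I'm assuming SSL is terminated before the traffic reaches HAproxy
so that it sees plain HTTP.

```
defaults
    mode http
    retries 3                  # connection attempts before giving up on a server
    option redispatch          # after failed attempts, try another server
    timeout connect 5s
    timeout client  120s
    timeout server  120s

listen puppetmasters
    bind 127.0.0.1:8141        # hypothetical listener behind the SSL terminator
    balance roundrobin
    option httpchk GET /health # hypothetical health-check URL served by apache
    http-check disable-on-404  # a 404 on the check takes the server out of rotation
    server pm1 192.0.2.11:8140 check inter 2s fall 2 rise 2
    server pm2 192.0.2.12:8140 check inter 2s fall 2 rise 2
```

With something like this, the restart procedure on a PM would be: make
/health return 404, wait a few check intervals plus the length of a
typical catalog compilation, then restart apache. Note that redispatch
only covers requests whose connection could not be established; a
compilation that is already in flight when apache dies will most likely
still fail.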
There are more things that may improve this situation - HAproxy is
really quite powerful where HTTP is in use.
Best,
Felix