We won't need a downtime, as we can deploy both changes using the low
pool capacity technique [1].
We just need to set a range to do the work and philor to watch around :P
(JK)
Both changes are expected to have talos changes in some suites and
therefore we will trigger a new set of talos jobs to get a new baseline.
We would do this in the morning EDT and be done well before PDT
wakes up.
Tuesday morning for change #1 - duration 1 hour.
* assuming staging run goes well
Wednesday morning for change #2 - duration 2-3 hours.
* this change is already well tested on staging
Any objections? Does it play well with the betas?
Both will be rescheduled if they conflict with releases.
More details on deployment below.
cheers,
Armen
#############################
[1] low pool capacity technique
* take 2/3 of the pool down to deploy the change
* take down the remaining 1/3 and put the 2/3 back up
* put the 1/3 back with the change
This ensures that the slaves taking jobs switch to the change all at
once, which reduces noise in the talos numbers.
Normally a second batch of talos tests is requested to establish the new
baseline.
PS: Not sure if this technique has a name, but this is what I call it.
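The three steps of the low pool capacity technique can be sketched as a
small simulation. This is only an illustration, not the actual deployment
tooling: the Slave class, the slave names and the pool size are all
hypothetical. It shows that at every step the slaves taking jobs share a
single configuration (all old, then all new).

```python
from dataclasses import dataclass


@dataclass
class Slave:
    name: str                 # hypothetical slave name
    in_service: bool = True   # is the slave taking jobs?
    has_change: bool = False  # has the change been deployed on it?


def in_service_configs(pool):
    """Return the set of configurations seen among slaves taking jobs."""
    return {s.has_change for s in pool if s.in_service}


def low_pool_capacity_deploy(pool):
    """Roll the change out without mixing old and new slaves in service.

    Returns one snapshot of in_service_configs() per step.
    """
    cut = (2 * len(pool)) // 3
    first, rest = pool[:cut], pool[cut:]
    snapshots = []

    # Step 1: take 2/3 of the pool down and deploy the change on it.
    for s in first:
        s.in_service = False
        s.has_change = True
    snapshots.append(in_service_configs(pool))  # only old-config slaves

    # Step 2: take the remaining 1/3 down, then put the 2/3 back up.
    for s in rest:
        s.in_service = False
    for s in first:
        s.in_service = True
    snapshots.append(in_service_configs(pool))  # only new-config slaves

    # Step 3: deploy on the last 1/3 and put it back with the change.
    for s in rest:
        s.has_change = True
        s.in_service = True
    snapshots.append(in_service_configs(pool))
    return snapshots


pool = [Slave(f"slave-{i:02d}") for i in range(9)]
snapshots = low_pool_capacity_deploy(pool)
print(snapshots)
```

Each snapshot contains exactly one value, which is the point of the
technique: jobs never run on a mix of changed and unchanged slaves.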
DEPLOYMENT DETAILS
##################
The first one is a hard blocker, as we currently don't have hardware
acceleration coverage on 10.5 (it recently had to be turned off by default).
To be able to add coverage I need to change the resolution, as we already
did for 10.5 a few months ago.
I am testing the change now on staging and will update the plan as I
obtain results.
The deployment just requires copying a couple of plist files onto our
puppet servers.
The second one is not as crucial as it was at the beginning of the week.
A workaround was found to not block the test slaves. gfx has to block
a certain driver version, and the testing machines are currently below
that version.
The deployment requires manual intervention: access each Windows 7
machine through VNC to install both changes.
I had planned to review the staging run today but I will have to do it
tomorrow.
I am moving the scheduled days/times back by one day:
Wednesday EDT morning for change #1 - duration 1 hour.
* assuming staging run goes well
Thursday EDT morning for change #2 - duration 2-3 hours.
* this change is already well tested on staging
A reminder that this won't affect any trees and no downtime will be
needed.
Best regards,
Armen
On Jan 21, 1:44 pm, Armen Zambrano Gasparnian <arme...@mozilla.com>
wrote:
Isn't this likely to affect performance numbers? If so, shouldn't
it happen during a tree closure, with a full clean set of talos runs
both before (i.e., runs on the changeset before the tree closure
finish before changes start being made) and after?
-David
--
L. David Baron http://dbaron.org/
Mozilla Corporation http://www.mozilla.com/
Yes, we are going to have talos changes; I mentioned in my original post:
> Both changes are expected to have talos changes in some suites and
> therefore we will trigger a new set of talos jobs to get a new
> baseline.
I am trying to avoid a tree closure so that we don't affect the FF4
release.
It is hard to explain, but we don't gain much from a downtime.
Whether or not we do a tree closure, we must do the following:
* once the change goes live:
** cset N is the last push that has complete talos coverage BEFORE
changes (all platforms are done)
*** we re-trigger talos set to establish PREVIOUS baseline
** csets after cset N can have some jobs run BEFORE the change and some
AFTER the change gets deployed
*** nevertheless, we should re-trigger a new talos set to establish the
NEW baseline
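The bookkeeping above can be sketched in a few lines. This is an
illustration only, under stated assumptions: the Push class, the cset
names, the job times and the deployment time are all made up. A push
counts as "before" when all of its talos jobs ran before the change,
"after" when they all ran after it, and "mixed" otherwise. cset N is the
last "before" push.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Push:
    cset: str             # hypothetical changeset id
    job_times: List[int]  # time at which each talos job for this push ran


def classify(push: Push, deploy_time: int) -> str:
    """Say whether a push's talos jobs ran before, after, or across the change."""
    if all(t < deploy_time for t in push.job_times):
        return "before"
    if all(t >= deploy_time for t in push.job_times):
        return "after"
    return "mixed"


def last_clean_push(pushes: List[Push], deploy_time: int) -> Optional[Push]:
    """Return cset N: the last push fully covered BEFORE the change."""
    clean = [p for p in pushes if classify(p, deploy_time) == "before"]
    return clean[-1] if clean else None


# Made-up data: pushes in landing order, change deployed at time 30.
pushes = [
    Push("cset-a", [10, 12, 15]),
    Push("cset-b", [20, 22, 31]),
    Push("cset-c", [35, 40]),
]
deploy_time = 30

labels = {p.cset: classify(p, deploy_time) for p in pushes}
cset_n = last_clean_push(pushes, deploy_time)
print(labels, cset_n.cset)
```

Here cset-a would get the re-triggered talos set for the PREVIOUS
baseline, cset-b has mixed results, and a fresh talos set on a later
push establishes the NEW baseline.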
I would prefer to do a tree closure, since I could burn whatever I want
during it and do my job much faster, but it doesn't greatly improve the
end result.
Believe me (or give me a long phone call with a whiteboard and a lot of
Q&A) that we can do it without a tree closure.
I hope it makes sense.
cheers,
Armen
I have re-triggered new talos jobs and unit tests to indicate the new
baseline.
I am trying to formalize the right way to generate a new baseline and
make it easier to notice.
Below you will see what I did and why I did it.
If you notice a flaw in my reasoning please do let me know, as I want
to improve the process.
#### NEW BASELINE
The new baseline for 10.6 will be the talos runs for:
http://hg.mozilla.org/mozilla-central/rev/cb707fbc99d4 (which is still
building)
I will trigger a new set of talos jobs as soon as it is built.
I have triggered a new set of talos jobs for (as the OLD BASELINE/NEW
BASELINE indicator):
* http://hg.mozilla.org/mozilla-central/rev/af7e65f1ee6f
* http://hg.mozilla.org/mozilla-central/rev/a5f732abf109
This means we will have old results and new results.
I have triggered a new set of unit tests for:
* http://hg.mozilla.org/mozilla-central/rev/a5f732abf109
This should flush out any new ORANGES.
cheers,
Armen