On 02/16/2015 07:02 PM, R. Tyler Croy wrote:
> (replies inline)
>
> On Mon, 16 Feb 2015, Kohsuke Kawaguchi wrote:
>
>> In the last 6 months or so, we've handed out infra access rights to a few
>> more people (Daniel Beck and Oleg Nanoshev, IIRC), and that was good for
>> better time zone coverage and what not. But the problem still remains that
>> there is a leadership vacuum, that no one sufficiently "owns" the infra,
>> and that's difficult to solve by adding more hands alone.
>>
>> So here's what I'd like to propose:
>>
>> - Formalize our ops team more by designating the lead that reports to
>> the board. The lead shall be chosen in the discussion during the project
>> meeting.
>> - Under the new lead, accept another round of ops team members to help
>> spread the workload. I know for example Kostasya is interested in helping.
>> - Kohsuke (and Tyler if he can join) and the ops team will schedule a
>> series of "transfer of information" sessions to bring the new ops lead and
>> the team up to speed about how things are put together today.
>> - Identify and remove single-point-of-failure in our infra. Off the top
>> of my head:
>>   - I think I'm currently the only one who has the private key to sign
>>     update center root CA.
>>   - jenkins-ci.org domain name still appears to be registered under
>>     Tyler's personal account.
These kinds of things sound like good INFRA tickets; can we create a new
"spof" component/tag? :)
>> As the ops lead, I'd like the project to consider Adam Papai
>> <https://github.com/woohgit>. He's been a long time user of Jenkins and he
>> is a member of the CloudBees ops team. I'm sensitive to the fact that he
>> works for CloudBees and how that can come across, but OTOH this will be a
>> part of his day job, and I think that ensures that he can allocate
>> necessary time to the effort.
>
> Since I've got a couple of real-world things consuming a boatload of my time, I
> don't have any objections to Adam joining the infra team. I'm not sure I like
> the term "ops lead" as I've never thought of there being a leadership structure
> around our infrastructure so much as a steaming pile of JIRAs and not enough
> people to tackle them :-P
I was under the same impression regarding a leadership structure, but I
guess creating this position is reasonable.
As for adding infra team members — since I've been responsible in the
past for lots of nagging, waiting for people in the US to wake up,
asking about SPoFs, and adding a bunch of tickets to the INFRA pile —
I'd be keen to help solve some of these things, and I have a fair amount
of sysadmin and Puppet experience.
Aside from the obvious infrastructure/server-access roles, we also have
accumulated various moderator roles, which (I think) a very short list
of people have, usually to varying degrees. e.g. wiki moderation, wiki
user deletion, LDAP account authorisation, account deletion, mailing
list banning.
There are probably a couple other systems like that. It would be nice
to define/document the various roles and who has them, and how people
can join that role.
I've spent a lot of time in the past deleting wiki spam (more so when the
daily wiki email actually gets sent), and would love to help delete the
various spammers on the mailing lists, JIRA and the wiki.
> I would suggest ramping Adam up in the following ways to mitigate some of our
> current risk:
>
> * Documenting and migrating backend crawlers into the jenkins-infra GH
> organization. This is one of the places where I think we have a seriously
> low bus factor
> * Helping KostySha where I have failed, with feedback on this PR:
> <https://github.com/jenkins-infra/jenkins-infra/pull/66>
> * Drive migration of JIRA and Confluence onto the newer hardware and newer
> versions we've not been able to complete due to time
>
> There's a long tail of other smaller projects, but in terms of our current
> infra health and its effect on the project's continued growth and success, I
> think those are the areas of most need.
There is indeed always a lot to be done, but it's also worth pointing
out how well a lot of the stuff runs, and how much automation we have.
Thanks to you and Kohsuke for keeping the bulk of this stuff under
control. And of course to the other infra contributors :)
Regards,
Chris