[Essentials] Defining the client/server "update lifecycle"

R. Tyler Croy

Apr 18, 2018, 10:21:03 AM
to jenkin...@googlegroups.com

Howdy! I've been working this week to define the Jenkins Essentials
client<->server contracts for handling the _actual_ updates of Jenkins
Essentials.

To help me organize my thoughts, I sketched out a quick draft of a JEP
yesterday afternoon which can be viewed here.

https://github.com/jenkinsci/jep/pull/89


Olivier already gave me some really helpful and thought-provoking feedback, and
I would love to have a few more eyes on this client/server interaction as it's
likely to be one of the most important interactions in the entire Jenkins
Essentials project.


One area where input would be helpful, and where I don't have a lot of thought
invested yet, is that of "Update Manifest Authenticity". Fundamentally, we are
instructing the evergreen-client to download code for execution on an end-user
machine, and I want to be absolutely certain that the evergreen-client is
downloading the *right* code and is not subject to man-in-the-middle attacks or
other forgeries leading to end-user compromise.

I discussed briefly in JEP-303 (Registration/Authentication) the notion of
Certificate Pinning in the evergreen-client
https://github.com/jenkinsci/jep/tree/master/jep/303#certificate-pinning
which might be one potential solution here. Or we could model the existing
Update Center process, where a certificate authority is baked into the client
and a custom server-side certificate signs the Update Manifest. Another idea
which comes to mind is the "traditional" gpg signing/verification which
yum/apt perform (if you search YouTube, Joe Damato has done some great
presentations about how this doesn't give you the trust you think it does :)).

I'm open to suggestions on how we can effectively ensure Update Manifest
Authenticity; the easier and safer the solution, the better :)



Thanks for your time!


Toodles

R. Tyler Croy

Apr 20, 2018, 11:03:15 AM
to jenkin...@googlegroups.com
(replies inline)

On Wed, 18 Apr 2018, R. Tyler Croy wrote:

>
> Howdy! I've been working this week to define the Jenkins Essentials
> client<->server contracts for handling the _actual_ updates of Jenkins
> Essentials.
>
> To help me organize my thoughts, I sketched out a quick draft of a JEP
> yesterday afternoon which can be viewed here.
>
> https://github.com/jenkinsci/jep/pull/89


After a very helpful review by Olivier, his comment embedded below, I had a
good hangout with Ba(p)tiste to determine a good way to handle the
"disconnected client" problem.

Olivier asked (https://github.com/jenkinsci/jep/pull/89#discussion_r182350032)

Do you consider all updates as 'safe'?
What happened if a client didn't connect to the update service for month?
Is it an information that would be useful in the update manifest?


Baptiste and I discussed what the right way to handle this is, and some of the
thoughts I had swishing in the back of my brain bucket about utilizing "Update
Levels" rather than tailoring an Update Manifest for each instance
individually.

Consider two instances, Alpha and Bravo. They both are created at the same
time, at Update Level (UL) 1. Alpha stays online, and connected, for the next
14 days, while Bravo is disconnected until day 14.

Our state is now:

Alpha: UL14
Bravo: UL1


My first idea was to try to have Bravo jump from UL1 -> UL14, but with Jenkins
Essentials' testing process, this would effectively be a completely untested
upgrade jump, and Baptiste and I considered it too risky.

Another idea we discussed was using a git-bisect(1) type approach: try UL14,
if that fails, try UL7, and so on. We discarded this idea as well because it
would also be completely untested.


What we ultimately decided was the right path forward was to use Update Levels
(contrary to what the JEP presently describes), and stagger the upgrade logic
for Bravo so that it goes from UL1->UL2, then UL2->UL3, etc.

Baptiste also raised the point of "What if we know that Update Level 5 is a bad
update?" What we decided was that the backend services need the ability to mark
a specific Update Level as tainted. So in this example, once Bravo arrived at
UL4, it would skip UL5, and upgrade to the untainted UL6.
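
To make the staggered path concrete, here is a minimal Python sketch of the
idea, not the actual evergreen-client or backend code. The function name
`upgrade_path` and its parameters are hypothetical and exist only for
illustration; the sketch assumes Update Levels are consecutive integers and
that the set of tainted levels is already known.

```python
from typing import Iterable, List, Set


def upgrade_path(current: int, latest: int, tainted: Iterable[int] = ()) -> List[int]:
    """Return the Update Levels a client should apply, one at a time, in order.

    Hypothetical helper: walks every level between the client's current level
    and the latest published level, leaving out any level marked as tainted.
    """
    bad: Set[int] = set(tainted)
    if latest < current:
        raise ValueError("latest Update Level is behind the client's level")
    return [level for level in range(current + 1, latest + 1) if level not in bad]


# Bravo was disconnected at UL1 until UL14 existed; UL5 is marked tainted.
print(upgrade_path(1, 14, tainted={5}))
# Alpha stayed connected and is already at UL14, so there is nothing to apply.
print(upgrade_path(14, 14))
```

For Bravo this yields 2, 3, 4, 6, 7, ... 14: each step is a single-level
upgrade except for the hop over the tainted UL5.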


There are definitely some user experience concerns with downloading updates and
restarting, but we decided that is something we're going to have to find a way
to communicate to an end-user ("Why does Jenkins keep restarting?") and set up
the update lifecycle to prefer stability and tested upgrade paths.



I will be updating the JEP with this discussion shortly, thanks for the
feedback thus far Olivier and Jesse!


If you're interested in joining the Jenkins Essentials open planning meetings,
we have our next on Monday April 23rd at 14:30 UTC (https://www.youtube.com/watch?v=SZK8fdGaVhk)



Cheers

R. Tyler Croy

Apr 26, 2018, 4:28:14 PM
to jenkin...@googlegroups.com

A heads up on this thread, this work has become a Draft JEP:
https://github.com/jenkinsci/jep/tree/master/jep/307

Of particular note since I last updated this thread is the security section
which has come after a plethora of reading and reasoning on my part :)
https://github.com/jenkinsci/jep/tree/master/jep/307#security


Cheers