[JENKINS-49406] Evergreen snapshotting data safety system pre-JEP: feedback welcome

Baptiste Mathus

Mar 14, 2018, 8:05:25 AM
to Jenkins Developers
Hello everyone,

For Jenkins Essentials, one critical requirement is to be able to upgrade, and hence rollback in an automated manner.
So, as we are committed to an open design process, I have written a first draft of the associated Jenkins Enhancement Proposal.

It is up for review at https://github.com/batmat/jep/pull/1

I am very eager for any kind of feedback there.
I am especially interested in catching & clarifying (more or less) glaring holes in that design. 

I ran some tests locally to check that nothing was obviously flawed from the start, but we do not have a prototype ready yet. We hope to have something around the end of March.

Thanks everyone!

-- Baptiste

R. Tyler Croy

Mar 14, 2018, 1:55:41 PM
to Baptiste Mathus, Jenkins Developers
(replies inline)

On Wed, 14 Mar 2018, Baptiste Mathus wrote:

> Hello everyone,
>
> For Jenkins Essentials
> <https://github.com/jenkinsci/jep/tree/master/jep/300>, one critical
> requirement is to be able to upgrade, and hence rollback in an automated
> manner.
> So, as we are committed to an open design
> <https://github.com/jenkins-infra/evergreen#open-design> process, I have
> written a first draft of the associated Jenkins Enhancement Proposal.
>
> It is up for review at https://github.com/batmat/jep/pull/1
>
> I am very eager for any kind of feedback there.
> I am especially interested in catching & clarifying (more or less) glaring
> holes in that design.


Thanks for taking the time to send this out Ba(p)tiste! Now that I've had a
chance to take a look, I think the one thing that's missing from this document
is a bit more explanation of the problem which requires this solution.

My take on this problem space is that core and plugin upgrades can result in
modification of config.xml and other object-serialized-files on disk when an
upgrade occurs. As these files are serialized from objects in memory, when an
internal API changes within a plugin/core, it will necessarily result in
changes to files on disk. These changes may not be safe to "rollback" from,
i.e. Plugin A v0 cannot load a file generated by Plugin A v1.

This means an upgrade of Jenkins Essentials has a very real potential to cause
irreversible modifications to files on disk which prevent a safe rollback.
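
To make the failure mode above concrete, here is a minimal Python sketch. Everything in it is hypothetical: `load_v0`, `save_v1`, the `<retries>` and `<backoff>` elements, and the copy-the-directory `snapshot`/`restore` helpers are illustrative stand-ins, not Jenkins APIs and not the mechanism the JEP proposes. The v0 loader is also deliberately strict, so treat this as a worst-case assumption rather than a description of how Jenkins actually reads its files.

```python
import shutil
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def load_v0(config_path: Path) -> dict:
    """Hypothetical 'Plugin A v0' loader: it only understands <retries>."""
    root = ET.parse(str(config_path)).getroot()
    unknown = [child.tag for child in root if child.tag != "retries"]
    if unknown:
        raise ValueError(f"v0 cannot read elements from a newer version: {unknown}")
    return {"retries": int(root.findtext("retries"))}


def save_v1(config_path: Path, retries: int, backoff: str) -> None:
    """Hypothetical 'Plugin A v1' writer: it adds a <backoff> element."""
    root = ET.Element("pluginA")
    ET.SubElement(root, "retries").text = str(retries)
    ET.SubElement(root, "backoff").text = backoff
    ET.ElementTree(root).write(str(config_path))


def snapshot(home: Path, snapshot_dir: Path) -> None:
    """Assumed snapshot strategy: a plain copy of the whole directory."""
    shutil.copytree(home, snapshot_dir)


def restore(home: Path, snapshot_dir: Path) -> None:
    """Throw away the current state and put the snapshot back."""
    shutil.rmtree(home)
    shutil.copytree(snapshot_dir, home)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        home = Path(tmp) / "home"
        home.mkdir()
        config = home / "config.xml"
        config.write_text("<pluginA><retries>3</retries></pluginA>")

        snap = Path(tmp) / "snapshot"
        snapshot(home, snap)                # taken before the upgrade
        save_v1(config, 3, "exponential")   # the upgrade rewrites the file

        # Rolling back only the code: v0 now faces a file written by v1.
        try:
            load_v0(config)
            code_only_rollback_failed = False
        except ValueError:
            code_only_rollback_failed = True

        # Rolling back the data as well makes v0 usable again.
        restore(home, snap)
        return code_only_rollback_failed, load_v0(config)
```

Calling `demo()` exercises both paths: the code-only rollback hits the file written by v1, and restoring the snapshot lets the v0 loader read its original file again.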


So that type of background/context is (IMHO) missing a bit from the JEP document.

I think the Motivation section should also explain a bit more explicitly that
"bricking" a Jenkins Essentials instance is a severe failure for the project,
and thus we need to prevent against irreversible modifications to files causing
runtime failures for the Jenkins Essentials installation.


Overall, I think this looks quite reasonable. I look forward to seeing the
implementation and tests we get to write to support it :)


Cheers
- R. Tyler Croy

------------------------------------------------------
Code: <https://github.com/rtyler>
Chatter: <https://twitter.com/agentdero>
xmpp: rty...@jabber.org

% gpg --keyserver keys.gnupg.net --recv-key 1426C7DC3F51E16F
------------------------------------------------------

Jesse Glick

Mar 16, 2018, 11:10:41 AM
to Jenkins Dev
On Wed, Mar 14, 2018 at 1:55 PM, R. Tyler Croy <ty...@monkeypox.org> wrote:
> core and plugin upgrades can result in
> modification of config.xml and other object-serialized-files on disk when an
> upgrade occurs.

Does happen, but rarely. In most cases, format changes take effect on
disk only when a `Saveable` object is in fact saved for some other
reason—a *Save* button in the UI, for example.
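
A small Python sketch of that "changes reach disk only on save" behavior, under stated assumptions: `JobConfig`, `<timeoutMinutes>`, and `<timeoutSeconds>` are invented for illustration and only mimic the idea of a `Saveable`; none of this is Jenkins code.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


class JobConfig:
    """Hypothetical saveable object standing in for a Jenkins `Saveable`."""

    def __init__(self, path: Path):
        self.path = path
        root = ET.parse(str(path)).getroot()
        # Assumed old format: <timeoutMinutes>. The new in-memory model uses seconds.
        old_value = root.findtext("timeoutMinutes")
        if old_value is not None:
            self.timeout_seconds = int(old_value) * 60  # migrated in memory only
        else:
            self.timeout_seconds = int(root.findtext("timeoutSeconds"))

    def save(self) -> None:
        """Writing is the only step that puts the new format on disk."""
        root = ET.Element("job")
        ET.SubElement(root, "timeoutSeconds").text = str(self.timeout_seconds)
        ET.ElementTree(root).write(str(self.path))


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "config.xml"
        original = "<job><timeoutMinutes>5</timeoutMinutes></job>"
        path.write_text(original)

        job = JobConfig(path)  # loading with the "upgraded" code
        unchanged_after_load = path.read_text() == original

        job.save()             # e.g. someone clicks Save in the UI
        changed_after_save = path.read_text() != original
        return job.timeout_seconds, unchanged_after_load, changed_after_save
```

In this sketch the file stays in the old format after loading, so a rollback right after the upgrade would still find data it can read; the new format appears only once `save()` runs.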

> This means an upgrade of Jenkins Essentials has a very real potential to cause
> irreversible modifications to files on disk which prevent a safe rollback.

This is true.

> "bricking" a Jenkins Essentials instance is a severe failure for the project

This is what needs to be defined much more carefully. What would cause
an installation to be “bricked”, exactly? Years of work by core devs
(see JIRA issues with label `robustness`) have solved most cases where
Jenkins would fail to start or be used in a basic capacity merely due
to unreadable configuration files. You might get *Discard Old Data*
warnings, of course, but these are not fatal.

> we need to prevent against irreversible modifications to files causing
> runtime failures

That is a much broader requirement, at least if “runtime failures”
could be interpreted as things like “the deployment stage in all my
pipelines started failing”, and it is not clear to me that the
proposal as it stands comes close to satisfying it.

Baptiste Mathus

Mar 20, 2018, 6:21:44 PM
to Jenkins Developers
Hello everyone,

Sorry for the time it took to get back here. I think I finally addressed all comments.

https://github.com/batmat/jep/pull/1 is ready for another round of comments.

I hope nothing big surfaces again. Issues will obviously be discovered later, but I feel we have thought about this enough to be able to move forward.

Thanks a lot.


--
You received this message because you are subscribed to the Google Groups "Jenkins Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to jenkinsci-dev+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/jenkinsci-dev/CANfRfr2-jQ7HgKteHX%3DvyqPrCEHATD-q2QJwqE8ggJOYM%3D6Bcg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Baptiste Mathus

Mar 21, 2018, 10:15:02 AM
to Jenkins Developers
FYI JEP now officially filed for review at https://github.com/jenkinsci/jep/pull/67

Thank you everyone!

R. Tyler Croy

Mar 21, 2018, 1:11:01 PM
to jenkin...@googlegroups.com
(replies inline)

On Wed, 21 Mar 2018, Baptiste Mathus wrote:

> FYI JEP now officially filed for review at
> https://github.com/jenkinsci/jep/pull/67


A friendly reminder from one of the JEP Editors: please keep discussion about
the document on this mailing list thread.

At this stage of the game the only changes/edits on the pull request will
likely be copy edits rather than structural edits. By the end of the day I will
likely give this a number and merge it as a `Draft` into the repository, so this
PR is not a great place for design discussion :)

use the list, luke.

R. Tyler Croy

Mar 22, 2018, 2:37:11 PM
to jenkin...@googlegroups.com
(replies inline)

On Wed, 21 Mar 2018, Baptiste Mathus wrote:

> FYI JEP now officially filed for review at
> https://github.com/jenkinsci/jep/pull/67


Just a heads up! *puts on JEP editor hat* I have marked this as a Draft and
assigned it the number JEP-302.

It can now be found here:
https://github.com/jenkinsci/jep/tree/master/jep/302


If you have any concerns or questions about this proposal, please chime in on
this list before the middle of next week. If it looks like batmat has addressed
concerns and there is consensus on this mailing list thread, I will update the
status to 'Accepted'.


Thanks batmat for your work on this design!