"guaranteed boot" OS releases, for migration

Luigi Semenzato

Nov 7, 2017, 3:09:42 PM
to Chromium OS dev, Mike Frysinger, mnis...@chromium.org
When a chromebook hasn't booted for a long time, and it's missing a
few OS updates, our update engine can feed to it an update which
merges multiple releases. I don't know if we guarantee a maximum
number of merges, or if we can specify releases which are guaranteed
to boot at least once. Do we?

This would be useful because whenever we need to migrate state (see
for instance crbug.com/782284) we have to leave the migration code in
place in perpetuity, or at least for a suitably ridiculous length of time.

I am trying to think if there are other ways of getting this result,
but I don't think so. The migration code could be in any of our
binaries.

Mike Frysinger

Nov 7, 2017, 3:14:13 PM
to Luigi Semenzato, Chromium OS dev, Mattias Nissler
there's a few ways to look at this.  when it comes to /var state, there's a reasonable argument to be made that daemons should make sure their data is in a reasonable state at every boot.  you could be migrating from older systems, or things could have been corrupted (not necessarily at the fs level, but by some other process gone amok), or some malicious code ran which wedged you.  by having the init script do all the sanity checking every time it runs, it means rebooting is an inexpensive method to get back to a working state.
-mike
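
A minimal sketch of the "check state at every boot" idea, not anything that exists in Chromium OS: a helper that brings a daemon's state directory to a known-good shape and is safe to run on every start. The name `ensure_state_dir` and the 0o755 default mode are assumptions for illustration only.

```python
import os
import stat
from typing import List


def ensure_state_dir(path: str, mode: int = 0o755) -> List[str]:
    """Bring `path` to a known-good state; return the repairs performed."""
    repairs = []
    # Something other than a real directory is in the way (stale file, symlink).
    if os.path.lexists(path) and (os.path.islink(path) or not os.path.isdir(path)):
        os.unlink(path)
        repairs.append("removed non-directory")
    if not os.path.isdir(path):
        os.makedirs(path, mode=mode)
        repairs.append("created directory")
    # makedirs() is subject to the umask, so verify the mode explicitly.
    if stat.S_IMODE(os.stat(path).st_mode) != mode:
        os.chmod(path, mode)
        repairs.append("fixed mode")
    return repairs
```

Because the function only acts when something is wrong, a second call on a healthy directory returns an empty list, which is what makes it cheap enough to run at every boot.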

Mike Frysinger

Nov 7, 2017, 3:19:21 PM
to Luigi Semenzato, Chromium OS dev, Mattias Nissler, Bernie Thompson
wrt "how long do we need to keep migration logic", we don't have an answer for people to readily refer to.  effectively, you have to look at all the devices out there that were running that old version, and then look to see if we've set any stepping stone versions for upgrading, and once all those stepping stones are higher than you need, you can delete.  last time i wanted this info, i had to ask Bernie to look up stepping stone versions somewhere and report back.

"stepping stone versions" is what i think we call it.  i'm referring to omaha forcing devices beyond a certain vintage to upgrade to an intermediate version first.  so if i booted R19, omaha might only serve me R35, and once i'm on R35, it'll let me go to the latest version.  https://crbug.com/749166 has examples in the first few comments.
-mike
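
To make the "once all those stepping stones are higher than you need, you can delete" rule concrete, here is a small sketch. The board names and milestone numbers are made up, not real Omaha data, and the function name is hypothetical. It also assumes that a device served a stepping stone actually boots it before moving on, which is the guarantee the original question asks about.

```python
from typing import Dict, List


def boards_blocking_removal(migration_milestone: int,
                            stepping_stones: Dict[str, List[int]]) -> List[str]:
    """Return boards that could still skip past the migration code.

    A board is safe only if it has a stepping stone at or after the
    milestone that introduced the migration; otherwise an old device
    could jump straight to a release where the code has been deleted.
    """
    blocking = []
    for board, stones in sorted(stepping_stones.items()):
        if not any(stone >= migration_milestone for stone in stones):
            blocking.append(board)
    return blocking


# Illustrative data only (hypothetical boards and milestones).
stones = {"board_a": [35, 58], "board_b": [35]}
print(boards_blocking_removal(40, stones))
```

With this data, a migration introduced in R40 is still needed because of board_b, whose only stepping stone predates it. The migration code also has to stay in every release up to and including the stepping stone being relied on.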


Luigi Semenzato

Nov 7, 2017, 3:31:57 PM
to Mike Frysinger, Chromium OS dev, Mattias Nissler, Bernie Thompson
Yes that's it, thanks.

Where are the stepping stones specified? I guess I can look that up,
but do we have any that go across all devices? (Or probably I should
simply find the least recent one and treat it as such when deciding if
some migration code can be removed.)

Luigi Semenzato

Nov 7, 2017, 3:32:51 PM
to Mike Frysinger, Chromium OS dev, Mattias Nissler
On Tue, Nov 7, 2017 at 12:13 PM, Mike Frysinger <vap...@chromium.org> wrote:
> there's a few ways to look at this. when it comes to /var state, there's a
> reasonable argument to be made that daemons should make sure their data is
> in a reasonable state at every boot. you could be migrating from older
> systems, or things could have been corrupted (not necessarily at the fs
> level, but by some other process gone amok), or some malicious code ran
> which wedged you. by having the init script do all the sanity checking
> every time it runs, it means rebooting is an inexpensive method to get back
> to a working state.
> -mike

Yes that makes sense. I am not sure how well we follow this principle.

But migration can be quite a bit more involved than emergency
recovery. Often the init script doesn't have all the needed logic.

> --
> --
> Chromium OS Developers mailing list: chromiu...@chromium.org
> View archives, change email options, or unsubscribe:
> http://groups.google.com/a/chromium.org/group/chromium-os-dev?hl=en
>

Sonny Rao

Nov 7, 2017, 6:29:02 PM
to Luigi Semenzato, Mike Frysinger, Chromium OS dev, Mattias Nissler
On Tue, Nov 7, 2017 at 12:32 PM, Luigi Semenzato <seme...@chromium.org> wrote:
> Yes that makes sense. I am not sure how well we follow this principle.
>
> But migration can be quite a bit more involved than emergency
> recovery. Often the init script doesn't have all the needed logic.

Do you really need migration recovery? Maybe just treat it as an
emergency if the data is not user data?


Luigi Semenzato

Nov 8, 2017, 12:02:08 PM
to Sonny Rao, Mike Frysinger, Chromium OS dev, Mattias Nissler
On Tue, Nov 7, 2017 at 3:28 PM, Sonny Rao <sonn...@chromium.org> wrote:
> Do you really need migration recovery? Maybe just treat it as
> emergency if the data is not user data?

Specific example:

I need to change the metrics daemon to run as non-root. Currently it
is creating files, for instance in /var/log/vmlog, which are
root-owned (and so is the directory). The migration code needs to
remove that directory or change its mode or owner, or else the metrics
daemon will fail.

Granted it's not much code, a one-liner in the init script. It just
bugs me that it keeps adding up.

But then again, we should be able to remove it after 5 years...
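
For illustration, the ownership fix described above could look roughly like the sketch below. This is not the actual init-script change; `migrate_owner` is a hypothetical helper, and the `chown` parameter exists only so the logic can be exercised without root. It handles the directory itself; files already inside it would need the same treatment or removal.

```python
import os
from typing import Callable


def migrate_owner(path: str, uid: int, gid: int,
                  chown: Callable[[str, int, int], None] = os.chown) -> bool:
    """Hand `path` to uid:gid if it is not already owned that way.

    Returns True if a change was made. Safe to call on every boot: once
    the ownership is right, this costs a single stat() call.
    """
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return False  # nothing to migrate; the daemon can create it fresh
    if (st.st_uid, st.st_gid) == (uid, gid):
        return False
    chown(path, uid, gid)  # needs root (or CAP_CHOWN) when run for real
    return True
```

Since the check is a no-op once the owner is correct, leaving it in for years costs little at runtime; the cost is mainly the accumulated code.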