There are some additional ideas in this thread that may be relevant, such
as using a Merkle tree data structure and only storing the root hash in
the sw-description.
https://groups.google.com/g/swupdate/c/DSbJo83oc3c/m/dTuwuXqIAAAJ
I had also come across this library, which might give some good ideas
on how to implement the data structures efficiently for chunk validation.
https://github.com/IAIK/secure-block-device
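
To make the Merkle idea concrete, here is a minimal sketch of root-hash
verification over per-chunk leaf hashes (plain OpenSSL, nothing from the
swupdate tree; HASH_LEN and merkle_root are made-up names). The root
computed this way is what would be stored in sw-description.

    /* Fragment only; link with -lcrypto. */
    #include <openssl/evp.h>
    #include <string.h>

    #define HASH_LEN 32   /* sha256 */

    static void sha256(const unsigned char *in, size_t len,
                       unsigned char out[HASH_LEN])
    {
        unsigned int outlen = HASH_LEN;
        EVP_Digest(in, len, out, &outlen, EVP_sha256(), NULL);
    }

    /* Fold n leaf hashes pairwise, in place, down to a single root;
     * an odd node is promoted unchanged to the next level. */
    static void merkle_root(unsigned char (*h)[HASH_LEN], size_t n)
    {
        unsigned char pair[2 * HASH_LEN];

        while (n > 1) {
            size_t w = 0;
            for (size_t i = 0; i < n; i += 2, w++) {
                if (i + 1 < n) {
                    memcpy(pair, h[i], HASH_LEN);
                    memcpy(pair + HASH_LEN, h[i + 1], HASH_LEN);
                    sha256(pair, sizeof(pair), h[w]);
                } else {
                    memcpy(h[w], h[i], HASH_LEN);
                }
            }
            n = w;
        }
        /* h[0] now holds the root to compare with sw-description */
    }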
>
> I'm still overworked, but writing this mail rekindled my motivation
> here; unless there's a reason not to, I'll finish my PoC for chunked
> transfer next week; I think it shouldn't be too large a patch, and
> having a concrete implementation might give better grounds for
> discussions.
> (And if the zchunk approach seems better we can still throw it away, but
> right now even ignoring the backwards compatibility I don't quite see
> how to make the files in the cpio archive match up cleanly)
>
>
> Thanks,
> --
> Dominique
>
>
>
>
> ---------- Forwarded message ----------
> From: Dominique MARTINET <dominique...@atmark-techno.com>
> To: Stefano Babic <stefan...@swupdate.org>
> Date: Tue, 7 May 2024 14:02:11 +0900
> Subject: [security RFC] install_directly and checksum validation - chunkify the checksum?
> Hi Stefano,
>
> sorry for the direct mail; I've started writing this for the list but
> since I consider this to be a security issue for us I'd rather get your
> feedback first; happy to forward all our mails to the list once some
> basic patches are ready (doesn't necessarily have to be merged yet as I
> don't think it's necessarily a swupdate problem per se, more of a
> problem with how I'm using it; I just selfishly want at least some
> partial solution first before going public)
>
>
> This is rather obvious in hindsight and I'm not sure why I didn't
> consider this a problem at first, but using installed_directly we're
> basically trusting handlers to be no-ops (or fully reversible) if someone
> keeps an existing sw-description and its signature but swaps one of the
> files inside the swu archive:
>
> - swupdate checks sw-description signature matches (ok)
> - file X is fed to the handler while verifying the checksum in parallel
> - verification fails, making the handler ultimately fail, terminating the
> installation
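>
> To illustrate the flow above, a minimal stand-alone sketch (not
> swupdate's actual copyfile; plain POSIX + OpenSSL) of why streaming
> verification only fails after the fact:
>
>     #include <openssl/evp.h>
>     #include <string.h>
>     #include <unistd.h>
>
>     /* Forward data to the handler while hashing it; the comparison
>      * can only happen at EOF, so a tampered payload has already been
>      * consumed by the time this returns an error. */
>     static int stream_copy(int in_fd, int out_fd,
>                            const unsigned char expected[32])
>     {
>         unsigned char buf[16 * 1024], md[32]; /* BUFF_SIZE-like */
>         unsigned int mdlen = sizeof(md);
>         ssize_t n;
>         EVP_MD_CTX *ctx = EVP_MD_CTX_new();
>
>         EVP_DigestInit_ex(ctx, EVP_sha256(), NULL);
>         while ((n = read(in_fd, buf, sizeof(buf))) > 0) {
>             EVP_DigestUpdate(ctx, buf, (size_t)n);
>             /* the handler sees the data before it is verified */
>             if (write(out_fd, buf, (size_t)n) != n)
>                 break;
>         }
>         EVP_DigestFinal_ex(ctx, md, &mdlen);
>         EVP_MD_CTX_free(ctx);
>         return memcmp(md, expected, sizeof(md)) ? -1 : 0; /* too late */
>     }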
>
> For most handlers the worst that can happen is useless writes to disk
> (e.g. filling up an inactive partition that cannot be used), but for
> example the archive handler could easily be abused if there is a
> "shared" partition (we unfortuantely have such a "state" partition, so
> e.g. logs or databases can be updated during a new install and rebooting
> into the update will not lose anything -- it's mounted during the update
> if the user needs to interact with it directly, e.g. to clean up obsolete data)
>
> Encryption somewhat mitigates the issue because it's harder to mess with
> encrypted data, but while we suggest encryption for our users'
> application updates, we also distribute "official OS" updates for all
> our users and these are not encrypted.
>
>
> With the background out of the way, that leaves a few possibilities:
> - Stop mounting such a partition; I'm not considering this further for
> two reasons: 1/ it's breaking compatibility, which I'd rather avoid if
> possible, 2/ even if we don't mount the partition directly, swupdate can
> also write to different btrfs subvolumes (we're using snapshots for
> cheap A/B copy), so an attacker could still cause trouble by filling the
> fs at the very least and I'd like to protect from that as well.
>
> As a poor workaround I could ensure the shared partition isn't mounted
> for our official updates, but I'd like to think we can do better.
>
> - Make the archive handler buffer the whole archive so we can verify
> the checksum before starting extraction, this is brutish and might not
> work for large archives; we're also not setting the archive compressed
> size (checksum target; I'm not sure it's possible in a generic way?), so
> this does not address filling $TMPDIR.
>
> In our particular case we already accept large-ish images in TMPDIR
> before extracting them (podman doesn't support non-seekable container
> archive input so we store a copy there), so I'd consider this acceptable
> for us for previously generated archives if there is no better solution,
> but I wouldn't suggest this for swupdate upstream.
>
> - Modify the checksum in sw-description to not be a single value, but
> e.g. a list of checksums for each 1MB chunk; __swupdate_copy could then
> temporarily buffer these chunks in TMPDIR then feed the data to handlers
> after the chunk is validated
> (chunk size to be discussed; the current BUFF_SIZE is 16KB which does
> not seem practical so I suggested 1MB but I'm open to suggestions.
> I know swupdate doesn't/shouldn't use dynamically allocated memory for
> updates and reserving a full MB isn't realistic, so for example using an
> mmap'ed buffer in TMPDIR would allow the kernel to flush to disk if
> there is memory pressure, yet never touch disk if there is none; but
> we can also just write it out 16KB at a time and read it back if you
> prefer, I don't want to get lost in details yet)
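>
> A rough sketch of that mmap'ed-buffer idea, assuming plain POSIX calls
> and an illustrative CHUNK_SIZE (none of this is existing swupdate
> code):
>
>     #include <fcntl.h>
>     #include <string.h>
>     #include <sys/mman.h>
>     #include <unistd.h>
>     #include <openssl/evp.h>
>
>     #define CHUNK_SIZE (1024 * 1024)
>
>     /* Back a 1MB buffer with an unlinked TMPDIR file: the kernel may
>      * page it out under memory pressure but never touches disk
>      * otherwise. */
>     static unsigned char *chunk_buffer(const char *path)
>     {
>         int fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
>         if (fd < 0)
>             return NULL;
>         unlink(path); /* anonymous once mapped */
>         if (ftruncate(fd, CHUNK_SIZE) < 0) {
>             close(fd);
>             return NULL;
>         }
>         unsigned char *buf = mmap(NULL, CHUNK_SIZE,
>                                   PROT_READ | PROT_WRITE, MAP_SHARED,
>                                   fd, 0);
>         close(fd); /* the mapping keeps the file alive */
>         return buf == MAP_FAILED ? NULL : buf;
>     }
>
>     /* Verify one buffered chunk before it is handed to the handler. */
>     static int verify_chunk(const unsigned char *buf, size_t len,
>                             const unsigned char expected[32])
>     {
>         unsigned char md[32];
>         unsigned int mdlen = sizeof(md);
>
>         EVP_Digest(buf, len, md, &mdlen, EVP_sha256(), NULL);
>         return memcmp(md, expected, sizeof(md)) ? -1 : 0;
>     }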
>
> That's obviously the direction I prefer, but it requires allowing a new
> way of writing checksums in the config, so it's something to
> consider.
> iirc swupdate ignores unknown fields so we could keep the full checksum
> and the chunked checksums, old versions of swupdate would work as they
> previously did and new versions would only use the chunked one if
> present.
>
>
> What do you think?
> --
> Dominique
>
>
>
> ---------- Forwarded message ----------
> From: Stefano Babic <stefan...@swupdate.org>
> To: Dominique MARTINET <dominique...@atmark-techno.com>
> Date: Tue, 7 May 2024 10:00:30 +0200
> Subject: Re: [security RFC] install_directly and checksum validation - chunkify the checksum?
> Hi Dominique,
>
> On 07.05.24 07:02, Dominique MARTINET wrote:
> > Hi Stefano,
> >
> > sorry for the direct mail; I've started writing this for the list but
> > since I consider this to be a security issue for us I'd rather get your
> > feedback first; happy to forward all our mails to the list once some
> > basic patches are ready (doesn't necessarily have to be merged yet as I
> > don't think it's necessarily a swupdate problem per se, more of a
> > problem with how I'm using it; I just selfishly want at least some
> > partial solution first before going public)
>
> Sure - then a post to the ML is useful and it will be tracked for the
> future. You could also think about whether it makes sense to write something
> into the documentation - I collected some suggestions a while ago into a
> Best Practices chapter (or wherever you think it fits better).
>
> >
> >
> > This is rather obvious in hindsight and I'm not sure why I didn't
> > consider this a problem at first, but using installed_directly we're
> > basically trusting handlers to be no-ops (or fully reversible) if someone
> > keeps an existing sw-description and its signature but swaps one of the
> > files inside the swu archive:
> >
> > - swupdate checks sw-description signature matches (ok)
> > - file X is fed to the handler while verifying the checksum in parallel
> > - verification fails, making the handler ultimately fail, terminating the
> > installation
>
> Yes - but this is exactly how it works to allow streaming. It is exactly
> the same when we have a single-copy approach: something was installed,
> but the update is considered FAILED and shouldn't be taken. SWUpdate
> reports this via the progress interface and bootloader variables ("recovery"
> and "ustate"), and it is the duty of the integrator to do something if
> required.
>
>
> >
> > For most handlers the worst that can happen is useless writes to disk
> > (e.g. filling up an inactive partition that cannot be used), but for
> > example the archive handler could easily be abused if there is a
> > "shared" partition (we unfortuantely have such a "state" partition, so
> > e.g. logs or databases can be updated during a new install and rebooting
> > into the update will not lose anything -- it's mounted during the update
> > if the user needs to interact with it directly, e.g. to clean up obsolete data)
>
> You have then a mix between double-copy and single-copy. All data on
> your shared partition is updated in a single-copy concept, and should
> be considered invalid until SWUpdate has completed successfully.
>
> I generally avoid mixing the concepts - if I go for double-copy, I try
> to have double-copy for all artifacts. In this case, if space is enough,
> I would try to duplicate the directory where the archive handler
> extracts data, and then in a later handler (postinstall) switch the
> names.
>
> I have recently added (well, not so recently, but it is not yet in an official
> release, I have to push a 2024.05) "post-failure" scripts. The main goal
> was to restart containers that were previously stopped, but it can fit
> your use case. You could use them to drop what the archive has installed
> before the failure.
>
> If the shared partition is just mounted but not updated (or no files are
> updated), I do not see issues, because sw-description is verified before
> doing anything.
>
> >
> > Encryption somewhat mitigates the issue because it's harder to mess with
> > encrypted data, but while we suggest encryption for our users'
> > application updates, we also distribute "official OS" updates for all
> > our users and these are not encrypted.
>
> It could mitigate it, but the update should already be clean without
> encryption.
>
> >
> >
> > With the background out of the way, that leaves a few possibilities:
> > - Stop mounting such a partition; I'm not considering this further for
> > two reasons: 1/ it's breaking compatibility, which I'd rather avoid if
> > possible, 2/ even if we don't mount the partition directly, swupdate can
> > also write to different btrfs subvolumes (we're using snapshots for
> > cheap A/B copy), so an attacker could still cause trouble by filling the
> > fs at the very least and I'd like to protect from that as well.
> >
> > As a poor workaround I could ensure the shared partition isn't mounted
> > for our official updates, but I'd like to think we can do better.
> >
> > - Make the archive handler buffer the whole archive so we can verify
> > the checksum before starting extraction, this is brutish and might not
> > work for large archives; we're also not setting the archive compressed
> > size (checksum target; I'm not sure it's possible in a generic way?), so
> > this does not address filling $TMPDIR.
> >
>
> This drops the streaming feature. In fact, the issue you report is more
> about what the integrator should do than an issue inside the code, as
> you have reported, too.
>
> I am not sure if your concerns are because you are updating the
> shared partition, and then yes, the data is not valid, or because this
> partition is not updated but it is still present, and you are thinking of
> unmounting it. But as no rule for it is in sw-description, I do not see
> any issue with it.
>
> > In our particular case we already accept large-ish images in TMPDIR
> > before extracting them (podman doesn't support non-seekable container
> > archive input so we store a copy there),
>
> I know this, someone has patched podman to skip the hash check and let
> SWUpdate do this (but I don't have these patches for podman).
>
> > so I'd consider this acceptable
> > for us for previously generated archives if there is no better solution,
> > but I wouldn't suggest this for swupdate upstream.
> >
> > - Modify the checksum in sw-description to not be a single value, but
> > e.g. a list of checksums for each 1MB chunk; __swupdate_copy could then
> > temporarily buffer these chunks in TMPDIR then feed the data to handlers
> > after the chunk is validated
> > (chunk size to be discussed; the current BUFF_SIZE is 16KB which does
> > not seem practical so I suggested 1MB but I'm open to suggestions.
> > I know swupdate doesn't/shouldn't use dynamically allocated memory for
> > updates and reserving a full MB isn't realistic, so for example using an
> > mmap'ed buffer in TMPDIR would allow the kernel to flush to disk if
> > there is memory pressure, yet never touch disk if there is none; but
> > we can also just write it out 16KB at a time and read it back if you
> > prefer, I don't want to get lost in details yet)
> >
>
> This is pretty much what the delta handler is doing, but with the "delta"
> packed into the SWU.
>
> A file for delta is in zchunk format, which means it is split into chunks based
> on rolling-hash algorithms. The result of the delta handler is then
> streamed again to another handler, and it could be the archive handler.
>
> This is not implemented and it sounds a little weird as "delta", but
> splitting into chunks (and in a format already recognized and supported
> like zck) is done. So the pipeline is:
>
> tarball in ZCK format ==> modified delta ==> archive handler
>
> The delta handler verifies each chunk before forwarding to the chained
> handler using a pipe.
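>
> As a minimal sketch of that verify-then-forward step (illustrative
> only, not the real delta handler code; the per-chunk digest is assumed
> to come from the zck index):
>
>     #include <openssl/evp.h>
>     #include <string.h>
>     #include <unistd.h>
>
>     /* Hash-check one chunk, then push it into the pipe feeding the
>      * chained handler; a bad chunk never reaches the other side. */
>     static int forward_verified(int pipe_wr, const unsigned char *chunk,
>                                 size_t len,
>                                 const unsigned char digest[32])
>     {
>         unsigned char md[32];
>         unsigned int mdlen = sizeof(md);
>
>         EVP_Digest(chunk, len, md, &mdlen, EVP_sha256(), NULL);
>         if (memcmp(md, digest, sizeof(md)))
>             return -1; /* reject before anything is forwarded */
>
>         while (len) {
>             ssize_t n = write(pipe_wr, chunk, len);
>             if (n < 0)
>                 return -1;
>             chunk += n;
>             len -= (size_t)n;
>         }
>         return 0;
>     }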
>
> However, I would first suggest checking whether the post-failure scripts
> are enough for you. They run just in case of failure, and you can also track
> what happened before if they are in Lua, because the Lua context is
> extended to be per update instead of per script.
>
> > That's obviously the direction I prefer, but it requires allowing a new
> > way of writing checksums in the config, so it's something to
> > consider.
>
> I won't create our own method for this if there is already a standard
> way. I chose zchunk for several reasons, and one of them is that it is
> widely supported in distros like Fedora (but not only).
>
> > iirc swupdate ignores unknown fields
>
> Right - this is a "feature" that I often use, for example when some
> information should be sent to the application but ignored by SWUpdate.
>
> > so we could keep the full checksum
> > and the chunked checksums, old versions of swupdate would work as they
> > previously did and new versions would only use the chunked one if
> > present.
> >
> >
> > What do you think?
>
> Best regards,
> Stefano
>
>
>
>
> ---------- Forwarded message ----------
> From: Dominique MARTINET <dominique...@atmark-techno.com>
> To: Stefano Babic <stefan...@swupdate.org>
> Date: Tue, 7 May 2024 18:30:09 +0900
> Subject: Re: [security RFC] install_directly and checksum validation - chunkify the checksum?
> Thanks for the quick reply
>
> Stefano Babic wrote on Tue, May 07, 2024 at 10:00:30AM +0200:
> > Sure - then a post to the ML is useful and it will be tracked for the
> > future. You could also think about whether it makes sense to write something
> > into the documentation - I collected some suggestions a while ago into a
> > Best Practices chapter (or wherever you think it fits better).
>
> Yes, that'll be even better.
> I'll forward our discussion and suggest an update to
> doc/source/swupdate-best-practise.rst once we're done.
>
> > > For most handlers the worst that can happen is useless writes to disk
> > > (e.g. filling up an inactive partition that cannot be used), but for
> > > example the archive handler could easily be abused if there is a
> > > "shared" partition (we unfortuantely have such a "state" partition, so
> > > e.g. logs or databases can be updated during a new install and rebooting
> > > into the update will not lose anything -- it's mounted during the update
> > > if the user needs to interact with it directly, e.g. to clean up obsolete data)
> >
> > You have then a mix between double-copy and single-copy. All data on
> > your shared partition is updated in a single-copy concept, and should
> > be considered invalid until SWUpdate has completed successfully.
> >
> > I generally avoid mixing the concepts - if I go for double-copy, I try
> > to have double-copy for all artifacts. In this case, if space is enough,
> > I would try to duplicate the directory where the archive handler
> > extracts data, and then in a later handler (postinstall) switch the
> > names.
>
> Yes, in hindsight I definitely agree with this; I'm just stuck here
> because of backwards compatibility as I'm not the final integrator;
> but we have many very small users who just use small variations of the
> examples I provide so it's hard to change anything.
>
> I'm not giving any example that writes to the single-copy partition
> (because even without this security consideration, it's always dangerous
> to write somewhere the live system can access), but it's a directory that
> we've documented as writeable so it'd be a large user-facing change.
>
>
> > I have recently added (well, not so recently, but it is not yet in an official
> > release, I have to push a 2024.05) "post-failure" scripts. The main goal
> > was to restart containers that were previously stopped, but it can fit
> > your use case. You could use them to drop what the archive has installed
> > before the failure.
> >
> > If the shared partition is just mounted but not updated (or no files are
> > updated), I do not see issues, because sw-description is verified before
> > doing anything.
>
> sw-description is verified, but the way that "single-copy" partition is
> mounted doesn't allow me to check if it is written or not easily
> (e.g. my updates will have path = "/target", which is a double-copy
> partition, but the single-copy partition is available under
> /target/var/app/volumes so if the archive is modified to include files
> like /var/app/volumes/foo it will be written to the single-copy
> partition).
>
> A post-failure script also won't be able to know which files have been
> written by the update and what has been modified by the live system, so
> it'll be hard to undo - I have the same problem if I want to protect the
> live system and take a snapshot of this subvolume, I won't know which
> files to keep from which side at the end of the update, as the whole
> point is to keep logs/database entries from the running system while the
> update is running.
> It's also not possible to undo a file being written on both sides.
>
> At this point the right thing to do would be to just not mount this
> partition beneath the update path, but we're back to breaking
> compatibility; I'd rather avoid that if possible.
>
> > > - Make the archive handler buffer the whole archive so we can verify
> > > the checksum before starting extraction, this is brutish and might not
> > > work for large archives; we're also not setting the archive compressed
> > > size (checksum target; I'm not sure it's possible in a generic way?), so
> > > this does not address filling $TMPDIR.
> >
> > This drops the streaming feature. In fact, the issue you report is more
> > about what the integrator should do than an issue inside the code, as
> > you have reported, too.
>
> Yes, I definitely wouldn't suggest this for swupdate upstream.
> That would be a local workaround for otherwise vulnerable updates only.
>
> Once again I am bitten by users building their own updates; for our
> updates we could just rotate the signing key (that is planned at some
> point), but our users manage their own keys as well and many will not
> even have a second key ready on devices...
>
>
> > I am not sure if your concerns are because you are updating the
> > shared partition, and then yes, the data is not valid, or because this
> > partition is not updated but it is still present, and you are thinking of
> > unmounting it. But as no rule for it is in sw-description, I do not see
> > any issue with it.
>
> This is quite murky, let me try to recap:
> - the normal OS updates we provide do not write to this partition,
> but the partition is a submount of the update target path so an attacker
> could use these updates to write there if they want
> - user updates are allowed to write to this partition, so while just
> unmounting it would address the problem for normal OS updates it will
> break that for user updates.
> User updates could of course use another path that is not beneath the
> normal double-copy target, but I'd rather avoid this if possible as
> users explicitly writing there would also be vulnerable
> - even if I move this mount point, other parts of the update (TMPDIR,
> another submount that is double-copy) write to subvolumes of the same
> partition, so filling the partition is also easy and I'd like to try to
> protect from that as well.
>
>
> > > In our particular case we already accept large-ish images in TMPDIR
> > > before extracting them (podman doesn't support non-seekable container
> > > archive input so we store a copy there),
> >
> > I know this, someone has patched podman to skip the hash check and let
> > SWUpdate do this (but I don't have these patches for podman).
>
> This is off-topic so I'll open another thread on the list directly for
> this patch, I'll be interested as a later optimization.
>
> > > - Modify the checksum in sw-description to not be a single value, but
> > > e.g. a list of checksums for each 1MB chunk; __swupdate_copy could then
> > > temporarily buffer these chunks in TMPDIR then feed the data to handlers
> > > after the chunk is validated
> > > (chunk size to be discussed; the current BUFF_SIZE is 16KB which does
> > > not seem practical so I suggested 1MB but I'm open to suggestions.
> > > I know swupdate doesn't/shouldn't use dynamically allocated memory for
> > > updates and reserving a full MB isn't realistic, so for example using an
> > > mmap'ed buffer in TMPDIR would allow the kernel to flush to disk if
> > > there is memory pressure, yet never touch disk if there is none; but
> > > we can also just write it out 16KB at a time and read it back if you
> > > prefer, I don't want to get lost in details yet)
> > >
> >
> > This is pretty much what the delta handler is doing, but with the "delta"
> > packed into the SWU.
>
> Yes, I've seen the delta handler and tried it as you pushed it, but
> didn't keep it for two reasons:
> - it currently requires a web server (I didn't see a way to pack the
> delta into the SWU, please correct me if it's possible).
> This is normal for optimizing bandwidth (using RANGE requests to only
> fetch required chunks), but we also need offline updates.
> - if I start generating updates using a new handler, these will not be
> installable on old systems, and that will require quite some preparation
> (shipping systems where swupdate includes the current handler, then
> perhaps a year later(!) finally requiring it).
> Our policy here is way too nice for my taste, but it has to be backwards
> compatible for a (long) while.
>
> > A file for delta is in zchunk format, which means it is split into chunks based
> > on rolling-hash algorithms. The result of the delta handler is then
> > streamed again to another handler, and it could be the archive handler.
> >
> > This is not implemented and it sounds a little weird as "delta", but
> > splitting into chunks (and in a format already recognized and supported
> > like zck) is done. So the pipeline is:
> >
> > tarball in ZCK format ==> modified delta ==> archive handler
> >
> > The delta handler verifies each chunk before forwarding to the chained
> > handler using a pipe.
>
> I'm not familiar enough with the ZCK format to reply immediately, but my first
> impression is that it sounds more complex than just chunking checksums,
> and I'm not sure the little code that will be shared this way is worth
> the cognitive burden to understand this pipeline.
>
>
> To give a more concrete example, I'm suggesting something like this:
> - file foo.tar is 30MB after compression
> - compute checksums for it, e.g. replace (pseudoshell)
> sha256 = "`sha256sum foo.tar.zst`"
> with something like (not sure what libconfig lists look like)
> sha256-chunks = [
> `split -b $((1024*1024)) --filter='sha256sum -' foo.tar.zst`
> ]
> which will look like
> sha256-chunks = [
> "eac4aded3727a63f72d8bd641a0f08541c0d3e17f82a274b9d4ddfa48836a880",
> "626e97f31e585646665fc9a6682243965a6c8bf9693fd1a585d7d3dd972615b7",
> ...
> ]
> - use these in copyfile
>
> The other advantage of this is that it makes the input file size
> explicit, so even with no streaming there is no more risk of having
> TMPDIR filled by invalid data (which is the second problem I'm
> referring to -- as long as the file does not end, the checksum
> computation never fails and an attacker can write forever)
>
> > > That's obviously the direction I prefer, but it requires allowing a new
> > > way of writing checksums in the config, so it's something to
> > > consider.
> >
> > I won't create our own method for this if there is already a standard
> > way. I chose zchunk for several reasons, and one of them is that it is
> > widely supported in distros like Fedora (but not only).
>
> That's not so much a new method as making copyfile use
> 'sha256-chunks' if available and fall back to 'sha256' (or none) if not,
> but yes, it will be a bit of code;
> the checksum computation is all in place and can be reused, so all that
> will change is moving the HASH_init/final to a subfunction so it can be
> shared from within the loop at regular intervals; I don't expect this to
> add much code.
> (there is no advantage in computing both, so if both are present only
> the chunks variant will be checked)
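>
> Something like this, as a sketch only - using OpenSSL primitives
> directly here instead of swupdate's HASH_* wrappers, so the names
> differ from the real code:
>
>     #include <openssl/evp.h>
>     #include <string.h>
>
>     /* Finalize and check the running digest at a chunk boundary, then
>      * re-init it for the next chunk; the existing copy loop would only
>      * gain this one call site. */
>     static int close_and_check_chunk(EVP_MD_CTX *ctx,
>                                      const unsigned char expected[32])
>     {
>         unsigned char md[32];
>         unsigned int mdlen = sizeof(md);
>
>         if (EVP_DigestFinal_ex(ctx, md, &mdlen) != 1)
>             return -1;
>         if (memcmp(md, expected, sizeof(md)))
>             return -1; /* chunk hash mismatch: abort the copy */
>         return EVP_DigestInit_ex(ctx, EVP_sha256(), NULL) == 1 ? 0 : -1;
>     }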
>
> I've run out of time for today, but I can write a quick poc tomorrow if
> that can convince you the implementation is simple enough.
> (I'll re-check what you meant with zck first)
>
> Thanks,
> --
> Dominique
>
>
>
> ---------- Forwarded message ----------
> From: Stefano Babic <stefan...@swupdate.org>
> To: Dominique MARTINET <dominique...@atmark-techno.com>
> Date: Tue, 7 May 2024 13:37:44 +0200
> Subject: Re: [security RFC] install_directly and checksum validation - chunkify the checksum?
> Hi Dominique,
>
> On 07.05.24 11:30, Dominique MARTINET wrote:
> > Thanks for the quick reply
> >
> > Stefano Babic wrote on Tue, May 07, 2024 at 10:00:30AM +0200:
> >> Sure - then a post to the ML is useful and it will be tracked for the
> >> future. You could also think about whether it makes sense to write something
> >> into the documentation - I collected some suggestions a while ago into a
> >> Best Practices chapter (or wherever you think it fits better).
> >
> > Yes, that'll be even better.
> > I'll forward our discussion and suggest an update to
> > doc/source/swupdate-best-practise.rst once we're done.
> >
>
> Fine.
>
> >>> For most handlers the worst that can happen is useless writes to disk
> >>> (e.g. filling up an inactive partition that cannot be used), but for
> >>> example the archive handler could easily be abused if there is a
> >>> "shared" partition (we unfortuantely have such a "state" partition, so
> >>> e.g. logs or databases can be updated during a new install and rebooting
> >>> into the update will not lose anything -- it's mounted during the update
> >>> if the user needs to interact with it directly, e.g. to clean up obsolete data)
> >>
> >> You have then a mix between double-copy and single-copy. All data on
> >> your shared partition is updated in a single-copy concept, and should
> >> be considered invalid until SWUpdate has completed successfully.
> >>
> >> I generally avoid mixing the concepts - if I go for double-copy, I try
> >> to have double-copy for all artifacts. In this case, if space is enough,
> >> I would try to duplicate the directory where the archive handler
> >> extracts data, and then in a later handler (postinstall) switch the
> >> names.
> >
> > Yes, in hindsight I definitely agree with this; I'm just stuck here
> > because of backwards compatibility as I'm not the final integrator;
> > but we have many very small users who just use small variations of the
> > examples I provide so it's hard to change anything.
> >
>
> Ok, got it, but it is difficult to take over the responsibility when
> someone else was in charge and made mistakes.
>
> > I'm not giving any example that writes to the single-copy partition
> > (because even without this security consideration, it's always dangerous
>
> +1
>
> > to write somewhere the live system can access), but it's a directory that
> > we've documented as writeable so it'd be a large user-facing change.
>
> Well, writable does not yet say how it should be upgraded ;-)
>
> >
> >
> >> I have recently added (well, not so recently, but it is not yet in an official
> >> release, I have to push a 2024.05) "post-failure" scripts. The main goal
> >> was to restart containers that were previously stopped, but it can fit
> >> your use case. You could use them to drop what the archive has installed
> >> before the failure.
> >>
> >> If the shared partition is just mounted but not updated (or no files are
> >> updated), I do not see issues, because sw-description is verified before
> >> doing anything.
> >
> > sw-description is verified, but the way that "single-copy" partition is
> > mounted doesn't allow me to check if it is written or not easily
> > (e.g. my updates will have path = "/target", which is a double-copy
> > partition, but the single-copy partition is available under
> > /target/var/app/volumes so if the archive is modified to include files
> > like /var/app/volumes/foo it will be written to the single-copy
> > partition).
>
> ok - but in the end, the way in SWUpdate is to consider the update
> successful or not as a whole. A hash mismatch (due to an attacker) will
> lead to a failed update. So at least this information is available, and
> then some decisions can be taken.
>
> >
> > A post-failure script also won't be able to know which files have been
> > written by the update and what has been modified by the live system, so
> > it'll be hard to undo
>
> Yes, but I guess we always have this. Let's say we have a tarball with
> multiple hashes as you proposed. The archive handler is running, some
> files were extracted, some of them not. SWUpdate also uses
> libarchive in an atomic way, so either everything is unpacked or nothing. You
> still have a mix-up between extracted files with right hashes.
>
> And if these files can be updated, they can overwrite some files written
> by the live system, too. I mean: I do not know if we can provide a
> solution in SWUpdate if the integrator has not thought about what is
> happening. A tarball can always contain files that overwrite files
> written by the live system, even if hashes match - and then?
>
> > - I have the same problem if I want to protect the
> > live system and take a snapshot of this subvolume, I won't know which
> > files to keep from which side at the end of the update, as the whole
> > point is to keep logs/database entries from the running system while the
> > update is running.
> > It's also not possible to undo a file being written on both sides.
> >
> > At this point the right thing to do would be to just not mount this
> > partition beneath the update path, but we're back to breaking
> > compatibility; I'd rather avoid that if possible.
>
> But let's say: what happens if SWUpdate raises an error on a file that
> is important for the application?
>
> Anyway, if you are already using btrfs snapshots, why can't you add a
> snapshot before the update is running and restore it with a post-failure
> script?
>
> >
> >>> - Make the archive handler buffer the whole archive so we can verify
> >>> the checksum before starting extraction, this is brutish and might not
> >>> work for large archives; we're also not setting the archive compressed
> >>> size (checksum target; I'm not sure it's possible in a generic way?), so
> >>> this does not address filling $TMPDIR.
> >>
> >> This drops the streaming feature. In fact, the issue you report is more
> >> about what the integrator should do than an issue inside the code, as
> >> you have reported, too.
> >
> > Yes, I definitely wouldn't suggest this for swupdate upstream.
> > That would be a local workaround for otherwise vulnerable updates only.
> >
> > Once again I am bitten by users building their own updates; for our
> > updates we could just rotate the signing key (that is planned at some
> > point), but our users manage their own keys as well and many will not
> > even have a second key ready on devices...
>
> You are very kind to your customers :-).
>
> I tell my customers that I feel responsible for the integrations I do
> myself. If they do them on their own, they should solve the problems
> on their own...
>
> It seems to me that we are trying to fix a wrong concept, due to missing
> experience by some users. But you know: we cannot create something
> foolproof because fools are very creative and circumvent our measures ;-)
>
> >
> >
> >> I am not sure if your concerns are because you are updating the
> >> shared partition, and then yes, the data is not valid, or because this
> >> partition is not updated but it is still present, and you are thinking of
> >> unmounting it. But as no rule for it is in sw-description, I do not see
> >> any issue with it.
> >
> > This is quite murky, let me try to recap:
> > - the normal OS updates we provide do not write to this partition,
> > but the partition is a submount of the update target path so an attacker
> > could use these updates to write there if they want
>
> Ok - indeed, this is a failure in the update concept.
>
> > - user updates are allowed to write to this partition, so while just
> > unmounting it would address the problem for normal OS updates it will
> > break that for user updates.
>
> Got it.
>
> > User updates could of course use another path that is not beneath the
> > normal double-copy target,
>
> Exactly
>
> > but I'd rather avoid this if possible as
> > users explicitly writing there would also be vulnerable
> > - even if I move this mount point, other parts of the update (TMPDIR,
> > another submount that is double-copy) write to subvolumes of the same
> > partition, so filling the partition is also easy and I'd like to try to
> > protect from that as well.
> >
> >
> >>> In our particular case we already accept large-ish images in TMPDIR
> >>> before extracting them (podman doesn't support non-seekable container
> >>> archive input so we store a copy there),
> >>
> >> I know this, someone has patched podman to skip the hash check and let
> >> SWUpdate do this (but I don't have these patches for podman).
> >
> > This is off-topic so I'll open another thread on the list directly for
> > this patch, I'll be interested as a later optimization.
>
> Ok
>
> >
> >>> - Modify the checksum in sw-description to not be a single value, but
> >>> e.g. a list of checksums for each 1MB chunk; __swupdate_copy could then
> >>> temporarily buffer these chunks in TMPDIR then feed the data to handlers
> >>> after the chunk is validated
> >>> (chunk size to be discussed; the current BUFF_SIZE is 16KB which does
> >>> not seem practical so I suggested 1MB but I'm open to suggestions.
> >>> I know swupdate doesn't/shouldn't use dynamically allocated memory for
> >>> updates and reserving a full MB isn't realistic, so for example using an
> >>> mmap'ed buffer in TMPDIR would allow the kernel to flush to disk if
> >>> there is memory pressure, yet never touch disk if there is none; but
> >>> we can also just write it out 16KB at a time and read it back if you
> >>> prefer, I don't want to get lost in details yet)
> >>>
> >>
> >> This is pretty much what the delta handler is doing, but with the "delta"
> >> packed into the SWU.
> >
> > Yes, I've seen the delta handler and tried it as you pushed it, but
> > didn't keep it for two reasons:
> > - it currently requires a web server (I didn't see a way to pack the
> > delta into the SWU, please correct me if it's possible).
>
> In fact, what I have written is that the ZCK file should be part of the
> SWU instead of being retrieved from an external webserver. This is of
> course not done, because then we don't have any "delta" anymore. So you can
> just see it as another format for the artifact (ZCK format).
>
> In the build, you should take your tarball for the archive, say my.tar, pass
> it on to create a ZCK file and deliver this file into the SWU. It is not a
> delta anymore, chunks are simply read from the SWU instead of being
> compared with the live copy and downloaded from an external source.
>
>
> > This is normal for optimizing bandwidth (using RANGE requests to only
> > fetch required chunks), but we also need offline updates.
>
> See above.
>
> > - if I start generating updates using a new handler, these will not be
> > installable on old systems,
>
> Everything we will do, including adding multiple hashes, won't be
> compatible with the past. It is in the nature of things.
>
> The only way to fix this and stay backward compatible is to change the
> concept, that is, avoid extracting a tarball into the path above. Using a
> new partition, uploading an image instead of a tarball, etc.
>
> Proposals here will change SWUpdate code, and then customers need to
> update at least once to have the changes in place.
>
> > and that will require quite some preparation
> > (shipping systems where swupdate includes the current handler, then
> > perhaps a year later(!) finally requiring it).
>
> Sure, that depends on your customers. But well, if they are the product
> owner, they should decide.
>
> > Our policy here is way too nice for my taste, but it has to be backwards
> > compatible for a (long) while.
>
> ok, I am not sure what can be done when the concept is wrong. In the
> end, we can do a lot of nasty things by writing a bad sw-description.
>
> >
> >> A file for delta is in zchunk format, which means it is split into chunks based
> >> on rolling-hash algorithms. The result of the delta handler is then
> >> streamed again to another handler, and it could be the archive handler.
> >>
> >> This is not implemented and it sounds a little weird as "delta", but
> >> splitting into chunks (and in a format already recognized and supported
> >> like zck) is done. So the pipeline is:
> >>
> >> tarball in ZCK format ==> modified delta ==> archive handler
> >>
> >> The delta handler verifies each chunk before forwarding to the chained
> >> handler using a pipe.
> >
> > I'm not familiar enough with the ZCK format to reply immediately, but my first
> > impression is that it sounds more complex than just chunking checksums,
>
> It is more complex, it could be a pain to implement, but well, someone
> else (me) has already done it and it is working. It has a nice method to
> split into chunks with mathematical algorithms, and each chunk is
> compressed via zstd.
>
> > and I'm not sure the little code that will be shared this way is worth
> > the cognitive burden to understand this pipeline.
> >
> >
> > To give a more concrete example, I'm suggesting something like this:
> > - file foo.tar is 30MB after compression
> > - compute checksums for it, e.g. replace (pseudoshell)
> > sha256 = "`sha256sum foo.tar.zst`"
> > with something like (not sure what libconfig lists look like)
> > sha256-chunks = [
> > `split -b $((1024*1024)) --filter='sha256sum -' foo.tar.zst`
> > ]
>
> But let's say: your customer can already split the big tarball into a set
> of smaller tarballs, and each of them has its own hash. If they do this,
> they minimize the risks.
>
> Even with the changes above, an attacker can replace a chunk (then it
> depends on the size of the chunk..) and replace some important files. As
> usual, the bug itself due to the wrong concept cannot be solved, we are
> trying to find some work-arounds, which may work or not.
>
> > which will look like
> > sha256-chunks = [
> > "eac4aded3727a63f72d8bd641a0f08541c0d3e17f82a274b9d4ddfa48836a880",
> > "626e97f31e585646665fc9a6682243965a6c8bf9693fd1a585d7d3dd972615b7",
> > ...
> > ]
> > - use these in copyfile
> >
> > The other advantage of this is that it makes the input file size
> > explicit, so even with no streaming there is no more risk of having
> > TMPDIR filled by invalid data (which is the second problem I'm
> > referring to -- as long as the file does not end, the checksum
> > computation never fails and an attacker can write forever)
>
> But well, this is a separate issue. Some handlers already have a "size"
> attribute. It could be extended and checked against the CPIO header to
> avoid this.
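>
> For example (just a sketch; how the attribute reaches this point is
> hypothetical):
>
>     #include <stdint.h>
>
>     /* Compare the size declared in sw-description with the size found
>      * in the CPIO header before streaming anything; 0 means "attribute
>      * absent", which keeps the old behaviour. */
>     static int check_declared_size(uint64_t declared, uint64_t cpio_size)
>     {
>         if (declared && declared != cpio_size)
>             return -1; /* refuse to stream a mismatching artifact */
>         return 0;
>     }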
>
> The issue is known, but it does not mean that manipulated software is
> installed - SWUpdate will stop with OOM and files are erased. Of course,
> this is in case installed-directly is not set, else the device size is the
> effective limit.
>
> >
> >>> That's obviously the direction I prefer, but it requires allowing a new
> >>> way of writing checksums in the config, so it's something to
> >>> consider.
> >>
> >> I won't create our own method for this if there is already a standard
> >> way. I chose zchunk for several reasons, and one of them is that it is
> >> widely supported in distros like Fedora (but not only).
> >
> > That's not so much a new method as making copyfile use
> > 'sha256-chunks' if available and fall back to 'sha256' (or none) if not,
> > but yes, it will be a bit of code;
> > the checksum computation is all in place and can be reused, so all that
> > will change is moving the HASH_init/final to a subfunction so it can be
> > shared from within the loop at regular intervals; I don't expect this to
> > add much code.
> > (there is no advantage in computing both, so if both are present only
> > the chunks variant will be checked)
> >
> > I've run out of time for today, but I can write a quick poc tomorrow if
> > that can convince you the implementation is simple enough.
> > (I'll re-check what you meant with zck first)
> >
> > Thanks,
>
> Best regards,
> Stefano
>
>
>
>
> ---------- Forwarded message ----------
> From: Dominique MARTINET <dominique...@atmark-techno.com>
> To: Stefano Babic <stefan...@swupdate.org>
> Date: Wed, 8 May 2024 11:02:07 +0900
> Subject: Re: [security RFC] install_directly and checksum validation - chunkify the checksum?
> Stefano Babic wrote on Tue, May 07, 2024 at 01:37:44PM +0200:
> > > A post-failure script also won't be able to know which files have been
> > > written by the update and what has been modified by the live system, so
> > > it'll be hard to undo
> >
> > Yes, but I guess we always have this. Let's say we have a tarball with
> > multiple hashes as you proposed. The archive handler is running, some
> > files were extracted, some of them not. SWUpdate also uses
> > libarchive in an atomic way, so either everything is unpacked or nothing. You
> > still have a mix-up between extracted files with right hashes.
> >
> > And if these files can be updated, they can overwrite some files written
> > by the live system, too. I mean: I do not know if we can provide a
> > solution in SWUpdate if the integrator has not thought about what is
> > happening. A tarball can always contain files that overwrite files
> > written by the live system, even if hashes match - and then?
>
> Yes, as said above I do not think a post failure script could handle
> this in a generic way.
>
> > > - I have the same problem if I want to protect the
> > > live system and take a snapshot of this subvolume, I won't know which
> > > files to keep from which side at the end of the update, as the whole
> > > point is to keep logs/database entries from the running system while the
> > > update is running.
> > > It's also not possible to undo a file being written on both sides.
> > >
> > > At this point the right thing to do would be to just not mount this
> > > partition beneath the update path, but we're back to breaking
> > > compatibility; I'd rather avoid that if possible.
> >
> > But let's say: what happens if SWUpdate raises an error on a file that
> > is important for the application?
>
> Yes, even if there is no ill intent, if a real error occurs after the
> single-copy partition has been updated then what has been written is
> done and cannot be taken back.
> I'll definitely also update our own user documentation to strongly
> discourage writing to this partition (assuming we keep the possibility)
>
> > Anyway, if you are already using btrfs snapshots, why can't you add a
> > snapshot before the update is running and restore it with a post-failure
> > script?
>
> Because that would lose data written during the update.
> This partition is akin to /var for a normal system - database &
> logs; it shouldn't be rolled back.
> It's the same reason we don't snapshot before writing to it: in case of
> update success the partition would lose writes that happened during the
> update itself.
>
> It really should not have been mounted in the first place...
>
> > > Once again I am bitten by users building their own updates; for our
> > > updates we could just rotate the signing key (that is planned at some
> > > point), but our users manage their own keys as well and many will not
> > > even have a second key ready on devices...
> >
> > You are very kind to your customers :-).
> >
> > I tell my customers that I feel responsible for the integrations I do
> > myself. If they do them on their own, they should solve the problems
> > on their own...
> >
> > It seems to me that we are trying to fix a wrong concept, due to missing
> > experience by some users. But you know: we cannot create something
> > foolproof because fools are very creative and circumvent our measures ;-)
>
> Right, I think we can definitely say "these invalid use cases will stop
> working", but in our case the integration boundary is pretty blurry and
> while I do not necessarily want to support 100% I'd like to at least try
> to not change how things work too much as it can get very confusing
>
> > > > This is pretty much what the delta handler is doing, but with the "delta"
> > > > packed into the SWU.
> > >
> > > Yes, I've seen the delta handler and tried it as you pushed it, but
> > > didn't keep it for two reasons:
> > > - it currently requires a web server (I didn't see a way to pack the
> > > delta into the SWU, please correct me if it's possible).
> >
> > In fact, what I have written is that the ZCK file should be part of the
> > SWU instead of being retrieved from an external webserver. This is of
> > course not done, because then we don't have any "delta" anymore. So you can
> > just see it as another format for the artifact (ZCK format).
> >
> > In the build, you should take your tarball for the archive, say my.tar, pass
> > it on to create a ZCK file and deliver this file into the SWU. It is not a
> > delta anymore, chunks are simply read from the SWU instead of being
> > compared with the live copy and downloaded from an external source.
> >
> > > This is normal for optimizing bandwidth (using RANGE requests to only
> > > fetch required chunks), but we also need offline updates.
> >
> > See above.
>
> Yes, I was talking about the delta handler in general.
> For the archive case actually it would probably make sense to just use
> delta directly and have it extract files with zero pre-existing
> knowledge (that should behave as libarchive?), but we'd still need to
> implement this packing into SWU feature.
>
> > > - if I start generating updates using a new handler, these will not be
> > > installable on old systems,
> >
> > Everything we will do, including adding multiple hashes, won't be
> > compatible with the past. It is in the nature of things.
>
> My suggestion is compatible. It won't magically fix old swu to be
> non-vulnerable, but they'll still be installable:
> * old swu, old swupdate: current behaviour
> * old swu, new swupdate: it'll find the sha256 field and check full
> file hash as per current behaviour. Optionally as a local kludge I
> can disable the streaming, but for upstream we probably want to just
> keep current behaviour.
> * new swu with both sha256 and sha256-chunks, old swupdate:
> it'll find the sha256 field and work as per current, ignoring the chunks
> * (at some point integrator can say they no longer support older
> swupdate and stop putting in the sha256, in which case install will fail
> because no hash was found for these older swupdate versions)
> * new swu and new swupdate: it'll only use the chunks
>
> Whether that's good or not is up to discussion, but it's definitely
> possible.
>
> > The only way to fix this and stay backward compatible is to change the
> > concept, that is, avoid extracting a tarball into the path above. Using a
> > new partition, uploading an image instead of a tarball, etc.
> >
> > Proposals here will change SWUpdate code, and then customers need to
> > update at least once to have the changes in place.
>
> Right.
> We can mitigate the issue within the SWU itself (e.g. umount the
> partition before any archive extraction when it's not used), but old swu
> files will always have the problem.
> (thankfully the version check will make these not installable if the user
> is up to date)
>
> > > and that will require quite some preparation
> > > (shipping systems where swupdate include the current handler, then
> > > perhaps a year later(!) finally require it).
> >
> > Sure, that depends on your customers. But well, if they are the product
> > owner, they should decide.
>
> Yes, that can definitely be made an option for user generated updates;
> we set the pace for OS updates though.
>
> > > > A file for delta is in zchunk format, which means it is split into chunks based
> > > > on rolling-hash algorithms. The result of the delta handler is then
> > > > streamed again to another handler, and it could be the archive handler.
> > > >
> > > > This is not implemented and it sounds a little weird as "delta", but
> > > > splitting into chunks (and in a format already recognized and supported
> > > > like zck) is done. So the pipeline is:
> > > >
> > > > tarball in ZCK format ==> modified delta ==> archive handler
> > > >
> > > > The delta handler verifies each chunk before forwarding to the chained
> > > > handler using a pipe.
> > >
> > > I'm not familiar enough with the ZCK format to reply immediately, but my first
> > > impression is that it sounds more complex than just chunking checksums,
> >
> > It is more complex, it could be a pain to implement, but well, someone
> > else (me) has already done it and it is working. It has a nice method to
> > split into chunks with mathematical algorithms, and each chunk is
> > compressed via zstd.
>
> So the only two things that'd require implementing are:
> - allowing packing ZCK files into SWU itself
> - allowing chaining another handler piped from the delta handler
>
> as both of these do not look possible to me right now?
>
> > > To give a more concrete example, I'm suggesting something like this:
> > > - file foo.tar is 30MB after compression
> > > - compute checksums for it, e.g. replace (pseudoshell)
> > > sha256 = "`sha256sum foo.tar.zst`"
> > > with something like (not sure what libconfig lists look like)
> > > sha256-chunks = [
> > > `split -b $((1024*1024)) --filter='sha256sum -' foo.tar.zst`
> > > ]
> >
> > But let's say: your customer can already split the big tarball into a set
> > of smaller tarballs, and each of them has its own hash. If they do this,
> > they minimize the risks.
>
> There is no risk management there: the archive size is not in
> sw-description, so the archive in the cpio can be as big as an attacker
> wants, and they can overwrite as much data as they want as the hash will
> only fail when the file has been fully streamed.
>
> > Even with the changes above, an attacker can replace a chunk (then it
> > depends on the size of the chunk..) and replace some important files. As
> > usual, the bug itself due to the wrong concept cannot be solved, we are
> > trying to find some work-arounds, which may work or not.
>
> No, because I am suggesting that for the chunks variant the chunks get
> checked before passing the data on to the handler (if possible even
> before decompression, as there are occasional vulnerabilities in
> decompressors as well[1][2])
>
> [1] https://osv.dev/vulnerability/OSV-2020-691
> [2] https://osv.dev/vulnerability/OSV-2020-405
>
> This really fixes the problem by saying no untrusted data is ever
> processed by swupdate:
> - having a larger file in the cpio archive will cause errors because
> there are not enough chunks available
> - any invalid data will be caught before causing harm
>
> What is still possible in the single-copy case is to overwrite partial
> data, e.g. archive contains files A and B in different chunks, if you
> break the archive at B you can get swupdate to update only A before
> erroring out.
> I am sure in some specific scenario this can also cause problems, but at
> least it is no longer possible to write a file in a single-copy
> partition if there was not such a write in the original update, so this
> can be left to final integrators to just not use the single copy
> partition and it can be left present for people who really want to use
> it.
>
>
> But that made me check the delta handler code, and it's checking the
> hash while feeding the pipe through copybuffer (in non-local copy case),
> despite all the data being available in a local buffer.
> I'm not sure if it's redundant with a check zck already did before (the
> digest comes from it in the first place..), but if not this isn't a good
> solution for the reason you described: even if it's a single chunk, an
> attacker could overwrite files with it.
> OTOH the data is fully present in the buffer, so that's an easy change (just
> check before calling copybuffer) and won't increase overhead.
>
> > > The other advantage of this is that it makes the input file size
> > > explicit, so even with no streaming there is no more risk of having
> > > TMPDIR filled by invalid data (which is the second problem I'm
> > > referring to -- as long as the file does not end, the checksum
> > > computation never fails and an attacker can write forever)
> >
> > But well, this is a separate issue. Some handlers already have a "size"
> > attribute. It could be extended and checked against the CPIO header to
> > avoid this.
> >
> > The issue is known, but it does not mean that manipulated software is
> > installed - SWUpdate will stop with OOM and files are erased. Of course,
> > this is in case installed-directly is not set, else the device size is the
> > effective limit.
>
> Yes, it's somewhat separate, but I believe both can be addressed with
> the same solution. "Two birds with one stone".
>
>
> So we have three paths forward:
> * keep swupdate as is, and just make it clear single-copy updates are
> dangerous in docs (will be done anyway)
> * update delta handler to accept zck files embedded in swu & stream to
> another handler, eventually use this for new swu
> ** merit: re-uses the delta handler code for chunking, chunks themselves
> are in a well supported format with tools to generate them
> * add chunked checksums and use these if present
> ** merit: possible to generate SWUs compatible with both formats, generic
> for all handlers, checks the hash before piping to the next handler, and
> also addresses the "cpio size" problem even if install directly is off.
>
> I still think the chunked checksums approach is more straightforward (and
> thus easier to understand for new integrators), especially since tools
> like swugenerator could "just do it" without having to think about how
> handlers should be chained or other subtle install-directly
> implications, so I had a quick look at the code this morning, and the
> hard part would be config parsing...
> Which makes me want to reconsider zck a bit (feel free to skip details
> below)
>
> ----chunked hash implementation details-------------
> * __swupdate_copy easy side first: if it's a single hash, keep current
> behaviour; if it's a hash list, add an extra leading step that'll buffer a
> full chunk and check it before passing it on to the next step (re-chunking
> by the decompressor so the handler won't be called with too-large buffers)
>
> * __swupdate_copy/copyfile/copybuffer have too many arguments, but
> ultimately we'll only check either the full file hash or the chunks, so
> we can probably re-use the hash field...
> I'm not looking forward to updating all callers, but it should be
> mechanical enough and having the hash type change will make it easy to
> catch any missed caller.
>
> * which leaves the "chunked hash" type / extraction from config.
> For common parser/parser.c (libconfig/json) ideally the chunked version
> could return a pointer to the chunks hash node so that the copy step
> could fetch directly from config one at a time, but we're
> freeing/destroying the config at the end of parse_cfg/parse_json so
> that's not possible.
>
> That means pre-loading all hashes in img_type, which isn't optimal for
> very large images... (even assuming large-ish 1MB chunks, having a 512MB
> image would consume 512*32 = 16KB ; that's probably acceptable for most
> systems nowadays but more importantly it's not possible to have a static
> img_type; and using a list would have even more overhead, so I'd still
> favor a malloc from get_array_length() * SHA256_HASH_LENGTH; see the
> sketch just after this section)
>
> (for parse_external if we go that way it doesn't look overly difficult
> to fetch values in a similar way, but likewise I don't see how to make
> it work with a callback to fetch hashes as they are needed...)
>
> I guess that's probably not acceptable and we'd want something a bit
> better here; having hashes in yet another file doesn't sound great to me
> either (that's what zck is doing with the zck index, at this point might
> as well go full delta handler)...
> ---------------------------------------------------
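>
> To make the img_type point above concrete, a sketch of pre-loading the
> chunk hashes into a single malloc'ed array; hexval/load_chunk_hashes
> and the way the hex strings arrive from the parser are made up:
>
>     #include <stdlib.h>
>
>     #define SHA256_HASH_LENGTH 32
>
>     static int hexval(char c)
>     {
>         if (c >= '0' && c <= '9') return c - '0';
>         if (c >= 'a' && c <= 'f') return c - 'a' + 10;
>         if (c >= 'A' && c <= 'F') return c - 'A' + 10;
>         return -1;
>     }
>
>     /* Decode n 64-char hex digests into one allocation that can hang
>      * off img_type; caller frees. Returns NULL on malformed input. */
>     static unsigned char *load_chunk_hashes(const char *const *hex,
>                                             size_t n)
>     {
>         unsigned char *d = malloc(n * SHA256_HASH_LENGTH);
>
>         if (!d)
>             return NULL;
>         for (size_t i = 0; i < n; i++) {
>             for (size_t j = 0; j < SHA256_HASH_LENGTH; j++) {
>                 int hi = hexval(hex[i][2 * j]);
>                 int lo = hi < 0 ? -1 : hexval(hex[i][2 * j + 1]);
>
>                 if (hi < 0 || lo < 0) {
>                     free(d);
>                     return NULL; /* short or malformed digest string */
>                 }
>                 d[i * SHA256_HASH_LENGTH + j] =
>                     (unsigned char)(hi << 4 | lo);
>             }
>         }
>         return d;
>     }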
>
> I've got other urgent work (ugh) so I'll give it a day or two to think
> it through further; I really liked the idea of having something generic
> for all handlers though so I'm not fully giving up but I'd rather not
> add to the pile of not-upstreamable patches I'm running...
>
>
> Thanks,
> --
> Dominique
>
>
>
> ---------- Forwarded message ----------
> From: Dominique MARTINET <dominique...@atmark-techno.com>
> To: Stefano Babic <stefan...@swupdate.org>
> Date: Thu, 9 May 2024 16:35:04 +0900
> Subject: Re: [security RFC] install_directly and checksum validation - chunkify the checksum?
> Dominique MARTINET wrote on Wed, May 08, 2024 at 11:02:07AM +0900:
> > > Anyway, if you are already using btrfs snapshots, why can't you add a
> > > snapshot before the update is running and restore it with a post-failure
> > > script?
> >
> > Because that would lose data written during the update.
> > This partition is akin to /var for a normal system - database &
> > logs; it shouldn't be rolled back.
> > It's the same reason we don't snapshot before writing to it: in case of
> > update success the partition would lose writes that happened during the
> > update itself.
> >
> > It really should not have been mounted in the first place...
>
> I think I can make something not-too-disruptive that'll only mount it
> if the user requests it, and print a warning that it probably shouldn't
> be accessed; so I'll fix up a workaround for our users first then let's
> discuss a cleaner follow up on the list (best practice update first and
> then something about chunking or at least adding a size parameter at the
> cpio level)
>
> Our update will go out at the end of the month, so hopefully I'll have
> some time to work on the workaround until then and I'll follow up early
> June.
>
> (Until then feel free to reply anyway, we can keep discussing if you
> have ideas, but I probably won't have much time to work on anything
> concrete)
>
> Cheers,
> --
> Dominique
>
>
>
> ---------- Forwarded message ----------
> From: Stefano Babic <stefan...@swupdate.org>
> To: Dominique MARTINET <dominique...@atmark-techno.com>
> Date: Fri, 10 May 2024 07:53:07 +0200
> Subject: Re: [security RFC] install_directly and checksum validation - chunkify the checksum?
> Hi Dominique,
>
> On 09.05.24 09:35, Dominique MARTINET wrote:
> > Dominique MARTINET wrote on Wed, May 08, 2024 at 11:02:07AM +0900:
> >>> Anyway, if you are already using btrfs snapshots, why can't you add a
> >>> snapshot before the update is running and restore it with a post-failure
> >>> script?
> >>
> >> Because that would lose data written during the update.
> >> This partition is akin to /var for a normal system - database &
> >> logs; it shouldn't be rolled back.
> >> It's the same reason we don't snapshot before writing to it: in case of
> >> update success the partition would lose writes that happened during the
> >> update itself.
> >>
> >> It really should not have been mounted in the first place...
> >
> > I think I can make something not-too-disruptive that'll only mount it
> > if the user requests it, and print a warning that it probably shouldn't
> > be accessed; so I'll fix up a workaround for our users first then let's
> > discuss a cleaner follow up on the list (best practice update first and
> > then something about chunking or at least adding a size parameter at the
> > cpio level)
>
> Fine.
>
> >
> > Our update will go out at the end of the month, so hopefully I'll have
> > some time to work on the workaround until then and I'll follow up early
> > June.
> >
> > (Until then feel free to reply anyway, we can keep discussing if you
> > have ideas,
>
> Sure, I find this discussion very constructive - then we will switch to
> the ML if we want to go on with the thread.
>
> Regards,
> Stefano
>
> > but I probably won't have much time to work on anything
> > concrete)
> >
> > Cheers,