Bug#890950: initramfs-tools: Resuming from hibernated swapfile fails

Mario.Li...@dell.com

Feb 20, 2018, 5:30:04 PM
Package: initramfs-tools
Version: 0.130ubuntu2
Severity: important

Dear Maintainer,

I've found that resuming from a hibernated swapfile using the kernel hibernate
implementation fails. This is due to some assumptions made in initramfs-tools:

1) It requires that uswsusp is installed (to provide /bin/resume)
2) It doesn't properly detect offsets

I've produced a patch that resolves these problems (and also merges some
improvements found in Ubuntu's implementation of this).

Would you please consider adopting it in Debian?

It's available on my GitLab profile:
https://salsa.debian.org/superm1-guest/initramfs-tools/commit/8578edf9afaaaa9484f65095455bb2355968eef4

Thanks,

Ben Hutchings

Feb 21, 2018, 1:00:03 PM
Control: tag -1 moreinfo

On Tue, 2018-02-20 at 22:11 +0000, Mario.Li...@dell.com wrote:
> Package: initramfs-tools
> Version: 0.130ubuntu2
> Severity: important
>
> Dear Maintainer,
>
> I've found that resuming from a hibernated swapfile using the kernel hibernate
> implementation fails. This is due to some assumptions made in initramfs-tools:
>
> 1) It requires that uswsusp is installed (to provide /bin/resume)[...]

No, it runs /bin/resume which is installed by klibc-utils. (uswsusp
installs its resume implementation as /sbin/resume. That's what the
comment is about.)

> 2) It doesn't properly detect offsets

So far as I can see, the kernel has never really supported an offset
being passed through /sys/power/resume. However:

1. The kernel parses the resume_offset parameter, and uses that for
every resume request.
2. The implementation of /sys/power/resume is not very strict, and
ignores the trailing ":offset".

This second feature was briefly broken between Linux 4.1-rc1 and 4.1-
rc3, but otherwise still seems to work. So I don't see what your
change is fixing.

However I do think that either:

1. The kernel should add real support for setting the resume offset
after boot.
2. klibc should stop writing the unused offset parameter.

Ben.

--
Ben Hutchings
[W]e found...that it wasn't as easy to get programs right as we had
thought. ... I realized that a large part of my life from then on was
going to be spent in finding mistakes in my own programs. - Maurice
Wilkes, 1949

Mario.Li...@dell.com

Feb 21, 2018, 4:30:03 PM
Thanks, I appreciate your feedback.
Some more comments nested below.
> >
> > 1) It requires that uswsusp is installed (to provide /bin/resume)[...]

>
> No, it runs /bin/resume which is installed by klibc-utils. (uswsusp
> installs its resume implementation as /sbin/resume. That's what the
> comment is about.)

Ah thanks - this wasn't clear. I wasn't seeing /bin/resume on a standard
system and that's because it's in /usr/lib/klibc/bin/resume on a standard
system and copied to initramfs.

>
> > 2) It doesn't properly detect offsets
>
> So far as I can see, the kernel has never really supported an offset
> being passed through /sys/power/resume. However:
>
> 1. The kernel parses the resume_offset parameter, and uses that for
> every resume request.
> 2. The implementation of /sys/power/resume is not very strict, and
> ignores the trailing ":offset".
>
> This second feature was briefly broken between Linux 4.1-rc1 and 4.1-
> rc3, but otherwise still seems to work. So I don't see what your
> change is fixing.
>
> However I do think that either:
>
> 1. The kernel should add real support for setting the resume offset
> after boot.
> 2. klibc should stop writing the unused offset parameter.

If you don't mind, I'm going to follow up with an updated patch
that stops klibc's /bin/resume from writing the unused parameter.

Also, there were a few other aspects of my patch that I think are relevant
and that I'll make sure are still present when I follow up:

1) Using Plymouth, if present, to indicate resuming
2) Detection of the swapfile via blkid (the current "auto" stuff doesn't
work otherwise).

Mario.Li...@dell.com

Feb 21, 2018, 11:00:02 PM
I've reworked my patches and split them into 3 commits that take your feedback into account. Can you please review these?

With resume_offset set on the kernel command line, I confirmed this works out of the box for me.

https://salsa.debian.org/superm1-guest/initramfs-tools/commit/c8f4ae22a3332941940fb4b7d65583e2e29efb56
https://salsa.debian.org/superm1-guest/initramfs-tools/commit/20caffa22848372084893da5b2186167b8b143cb
https://salsa.debian.org/superm1-guest/initramfs-tools/commit/1ec25e76395a8108bfc3c8452a8b916db970dfe1

Mario.Li...@dell.com

Feb 28, 2018, 5:10:03 PM
Hi,

I've done some more testing this week and developed some changes that I think are more sustainable.
1) No longer revert setting the offset via /bin/resume.
I've started a discussion upstream to allow the kernel to read the offset this way. If it's adopted then this should
definitely stay; otherwise it's harmless.
2) Detect the offset from both the kernel command line and the filename.

Can you please ignore my old ones, and re-review my new patches and consider them for initramfs-tools?
https://salsa.debian.org/superm1-guest/initramfs-tools/commit/f74423c610bd3b9ee9baf97d42bfa0219bfb4826
https://salsa.debian.org/superm1-guest/initramfs-tools/commit/a39f61e48eca03e264a7235ee984db76e8c10aa9

Thanks,

Dirk Fieldhouse

Jul 19, 2018, 11:30:03 AM
Mario

Looking at your patches and the preceding discussion on #890950 and
having wrapped a cold towel round my head, I have a few comments.

As of kernel 4.17,
<https://elixir.bootlin.com/linux/v4.17/source/Documentation/power/swsusp.txt>
documents the use of /sys/power/resume_offset to set the offset block,
and this is the first version that does actually include the required
code in hibernate.c (static ssize_t resume_offset_store(...) line 1070).
But that is not the same as the offset parameter sent by /bin/resume.

Therefore ...

In all cases, the initramfs can only know that it should invoke a resume
through the resume kernel parameter. The local-premount/resume script is
needed to do this. It has to canonicalise the resume kernel parameter
value and pass it via the /sys/power/resume maj:min API. The resume code
can use the resume_offset kernel parameter if provided.
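
As a rough illustration of the first step described above, here is a minimal sketch of pulling resume= and resume_offset= out of a kernel command line. The function name parse_resume_args is hypothetical and not part of initramfs-tools, and defaulting the offset to 0 when the parameter is absent is an assumption made for the example.

```shell
# Hypothetical helper (not from initramfs-tools): extract resume= and
# resume_offset= from a kernel command line string.
# Real use would be: parse_resume_args "$(cat /proc/cmdline)"
parse_resume_args() {
    dev= off=
    # $1 is deliberately unquoted so that it splits into words
    for arg in $1; do
        case $arg in
            resume=*)        dev=${arg#resume=} ;;
            resume_offset=*) off=${arg#resume_offset=} ;;
        esac
    done
    # Without resume= there is nothing to resume from
    [ -n "$dev" ] || return 1
    # Assumption for this sketch: report 0 when no offset was given
    printf '%s %s\n' "$dev" "${off:-0}"
}
```

The device value printed here (for example UUID=... or PARTUUID=...) would still have to be resolved to a block device and then to maj:min before it can be written to /sys/power/resume.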

For kernels before 4.17, the only way to get resume_offset into the
resume process is via the kernel command line, either manually or via a
function in /etc/default/grub as below invoked during update-grub:

Get_Resume()
{
    local sname stype junk roff res

    # Only the first active swap area is considered
    swapon --show --noheadings --raw | head -1 | {
        read sname stype junk
        if [ "$stype" = "file" ]; then
            # Device holding the filesystem that contains the swapfile
            res=`df "$sname" | awk 'END {print $1}'`
            type swap-offset >/dev/null 2>&1 && # from suspend-utils|uswsusp
                roff=`swap-offset "$sname" |
                    awk '/resume offset =/ {print $4;exit}'`
            # Fall back to the first physical extent reported by filefrag
            [ -z "$roff" ] &&
                roff=`filefrag -v "$sname" |
                    awk '{if ($1=="0:") {print substr($4,1,length($4)-2);exit}}'`
            sname=$res
        fi
        # now [ "$stype" = "device" ]
        res=`lsblk -o PARTUUID --noheadings "$sname"`
        [ -n "$res" ] && res="resume=PARTUUID=$res"
        # roff is empty for a swap partition, so check that before comparing
        [ -n "$res" ] && [ -n "$roff" ] && [ "$roff" -ge 0 ] &&
            res="$res resume_offset=$roff"
        echo $res;
    }
}

GRUB_CMDLINE_LINUX="`Get_Resume`"

I tested a swapfile resume with 4.4 using the echo
maj:min:offset>/sys/power/resume API and believed that it worked, but
the kernel must have been reading offset from the grub command-line.

Whereas for 4.17 and later, you can save the resume device and
resume_offset into a conf file for initramfs-tools from a hook run from
update-initramfs (which has to happen once for each installed kernel),
so long as you replace/update /bin/resume to use the new API. These
values override any kernel command-line parameters. The resume_offset
parameter remains the default value if the /sys/power/resume_offset API
is not used.

I can't really see why that's any better than the update-grub approach.

There needs to be some global initramfs storage that can be mounted when
initiating hibernate/hybrid-sleep and then again by the initramfs when
booting. I really have no idea if that's possible.

As to the scripts ...

hooks/resume
<https://salsa.debian.org/superm1-guest/initramfs-tools/commit/f74423c610bd3b9ee9baf97d42bfa0219bfb4826>

1 line 59
For filefrag, e2fsprogs has to be installed. Is it a dependency of other
file systems?

Should the check on resume_offset be the same as in line 63?

2 global
If this is going to work with a resume_offset, /sys/power/resume_offset
has to be writable. There should be a test "[ -w
/sys/power/resume_offset ]" somewhere before writing resume_offset (>0)
to the conf file and error if not.
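
A minimal sketch of the two checks suggested for the hook, under the assumption that a valid offset is simply a non-empty string of decimal digits. The helper names is_uint and can_set_offset are hypothetical; the optional second argument exists only so the check can be tried against a file other than /sys/power/resume_offset.

```shell
# Hypothetical helper: succeed only if $1 is a non-empty string of digits.
is_uint() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Hypothetical helper: succeed only if the offset looks valid and the
# sysfs file can be written.
# $1 = offset, $2 = file to test (defaults to /sys/power/resume_offset)
can_set_offset() {
    is_uint "$1" && [ -w "${2:-/sys/power/resume_offset}" ]
}
```

A hook could call can_set_offset "$resume_offset" before writing the value to the conf file and report an error otherwise.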


scripts/local-premount/resume
<https://salsa.debian.org/superm1-guest/initramfs-tools/commit/a39f61e48eca03e264a7235ee984db76e8c10aa9#8609b9b67b65b1556d2a51cfeacee8a080c15af6>

Mostly issues with the unpatched source!

1 line 18
Test -e /sys/power/resume should be -w /sys/power/resume.

2 line 29
Test [ -x /bin/plymouth ] but then call plymouth instead of
/bin/plymouth - is PATH sure to begin /bin:...? Instead:

if type plymouth >/dev/null && plymouth --ping; then

3 line 30
Why not tell the whole truth:

plymouth message --text="Resuming from $resume ${resume_offset:+at offset $resume_offset}"

4 line 34
The test is redundant. Replace if ... fi by line 35.

There could be a test that resume_offset is an unsigned long number, as
in your hooks/resume:63, but then you'd need to handle the error case
where resume_offset is set and invalid.

5 lines 35ff
/bin/resume
<https://github.com/mirror/busybox/blob/master/klibc-utils/resume.c> is
no help. It doesn't do what's needed if the resume_offset has been
provided from within the initramfs, since it only writes to
/sys/power/resume using the never actually implemented maj:min:offset API.

You can do the right thing in the script itself per
<https://bugs.launchpad.net/ubuntu/+source/initramfs-tools/+bug/983805>
and avoid possible confusion with uswsusp's resume binary. This is a
modified script to handle the 4.17+ API.

resume() # resume_device [resume_offset]
{
    local majmin x

    majmin=
    # Get the major and minor numbers for the resume device
    x=$(stat -L -c '0x%t 0x%T' "$1") && majmin=$(printf '%d:%d' $x)

    # No device (could not stat device) or not a real device
    [ -n "$majmin" ] && [ "${majmin%:*}" != "0" ] || return 99

    # Write the offset first (4.17+ API), but only if one was given
    if [ -n "$2" ]; then
        [ "$2" -ge 0 ] && printf '%s' "$2" > /sys/power/resume_offset ||
            return 1
    fi

    printf '%s' "$majmin" > /sys/power/resume
}


Hope that helps.
/df

--
London SW6
UK

Ben Hutchings

Jul 19, 2018, 3:30:07 PM
Please make a merge request on Salsa and then we can nitpick^Wdiscuss
this in detail.

Ben.

--
Ben Hutchings
We get into the habit of living before acquiring the habit of thinking.
- Albert Camus


Ben Hutchings

Mar 10, 2019, 6:30:02 PM
On Sun, 2019-03-10 at 19:25 +0100, Michael Biebl wrote:
> Hi Mario,
>
>
> On Fri, 6 Jul 2018 12:41:37 +0000 <Mario.Li...@dell.com> wrote:
[...]
> > Yes I could see two swap partitions causing the wrong one to be picked.
> > It's trying to select the bigger of the two.
> >
> > If they don't match the one you're putting in /etc/initramfs-tools/conf.d/resume
> > then that would cause problems. Please do confirm if you switch what's in
> > initramfs conf.d/resume that the problem is fixed.
> >
> > Really matching code is needed for the initramfs-tools. I submitted some patches
> > that may help here. You can refer to them at the end of this bug report:
> >
>
> Somehow we dropped the ball here.
> What's the state regarding those initramfs-tools patches?
> Is there a chance to get them into buster?
> If not, should we revert commit
> https://github.com/systemd/systemd/commit/17c40b3a8fbfb797110c88d749bd5
>
> What do you suggest? The current situation doesn't seem ideal.

I think it would make sense for systemd to only set the hibernation
device if it's not already set (i.e. if /sys/power/resume contains
"0:0\n").

Ben.

--
Ben Hutchings
The program is absolutely right; therefore, the computer must be wrong.



Ben Hutchings

Mar 10, 2019, 8:00:03 PM
On Mon, 2019-03-11 at 00:20 +0100, Michael Biebl wrote:
> Hi Ben
>
> Am 10.03.19 um 23:17 schrieb Ben Hutchings:
> > I think it would make sense for systemd to only set the hibernation
> > device if it's not already set (i.e. if /sys/power/resume contains
> > "0:0\n").
>
> I get $ cat /sys/power/resume
> 8:4
>
> which part is responsible for setting that?

initramfs-tools, or any alternative that implements resume from
hibernation. This is because writing to /sys/power/resume is the way
to resume from hibernation, as well as the way to set the device for
the next hibernation. (systemd writes the number of an active swap
device, so it won't cause an immediate resume.)
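
The check suggested earlier in this thread (only set the hibernation device while /sys/power/resume still reads 0:0) could be sketched in shell as follows. This only illustrates the test; systemd itself is not a shell script, and the function name resume_device_unset and its optional file argument are hypothetical.

```shell
# Hypothetical helper: succeed if no resume device has been set yet.
# $1 = file to read (defaults to /sys/power/resume)
# An unreadable file counts as "set", so that a caller does not write to it.
resume_device_unset() {
    # Command substitution strips the trailing newline from "0:0\n"
    [ "$(cat "${1:-/sys/power/resume}" 2>/dev/null)" = "0:0" ]
}
```

With the value 8:4 shown above, the function fails and a caller would leave the device set by the initramfs alone.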

Christoph Anton Mitterer

Dec 25, 2022, 12:40:04 AM
Hey.

Is this still followed up or already supported to some extent in the
meantime?

I've looked through the previous messages and suggested patches and
have some notes on these:

- Auto-detection of resume device, when resuming happens, is most
likely a security hole, as I've described previously in #1020713.

Similarly, auto-detection of a resume device when performing the
hibernation, could be quite disastrous for the security of people:
Just imagine someone has attached some HDD which has swap partition
(unencrypted)... and the system would somehow automatically start to
use that and thereby leak likely sensitive data (at least the dm-crypt
keys of the running system).

For those who run some full disk encryption, it should be possible 
to disable any such auto-detections and only use a statically
configured swap device/swapfile instead.


- People may have their swapfile on top of some dm-crypt block layer...
either directly (in case of a swap "partition") or indirectly (in
case of a swapfile where the fs and possibly further DM layers are in
between.)


- The auto-detection of the resume_offset using filefrag does not
generally work, as the values will be wrong for swap files on btrfs
(the kernel wants the physical offset with respect to the block
device, which filefrag doesn't give with btrfs).

btrfs-progs 6.1 brought a new command for that:
btrfs inspect-internal map-swapfile
with the --resume-offset option.
Btw. there's now also:
btrfs filesystem mkswapfile


- Also on btrfs, people that want to use btrfs and a swapfile will
most likely want to have that in a separate btrfs subvolume.
The reason for this is subvolumes containing a swapfile have several
limitations (e.g. no snapshots).

This in turn however, may need some further support from any
integration into initramfs-tools:

One could have something like e.g. /var/local/hibernate/swapfile
where hibernate is a btrfs subvolume, that is really created at that
path (i.e. not just mounted at it).
That subvolume should then be available as soon as the fs is
mounted.
Even if / is not the actual / on the btrfs.
E.g. when the real top-level subvolume of the btrfs contains:
/root-subvol/var/local/hibernate/swapfile
and is mounted with -o subvol=/root-subvol to / things should still
be there as soon as the root fs is there.

It could however be, that people choose a different layout, with the
top-level subvolume containing:
/root-subvol/
/hibernate/swapfile
where both, /root-subvol and /hibernate are subvols and where
/root-subvol/ is / ... in which case hibernate would not even be
visible in the normal system, unless manually mounted to some point
below / .
In this case, initramfs-tools need to either support this, or at
least somehow fail gracefully.


Some further questions that came to my mind:


Does anyone here know, whether the kernel somehow invalidates the swap
file once it has been unhibernated from - or does initramfs-tools
somehow disable it from being used again?

I ask because, e.g. in my case, I boot from a USB stick (which contains
GRUB/kernel/initramfs)... and that USB stick is typically already gone
once the system has finished booting... so initramfs-tools couldn't
just regenerate the initramfs with new information to not use the swap
file on next boot.


Not sure whether this would even work right now, but in principle
people may want to disable swap until it's really actually needed for
hibernation.

The problem is that vm.swappiness doesn't really disable swapping (even
if set to 0, the kernel still might swap). Also it would affect any
swap area and not just the one that should only be used for
hibernation.

So my hope would be that systemd's hibernate.target could perhaps be
used to make some other service (that actually starts the hibernation-
swap on-demand only, right before needed) reverse depend on the former.
It could perhaps even allow to create the swapfile right on-demand
(freeing up disk space otherwise).

But in that case, and if initramfs-tools also do some work with the
file (e.g. determining the offset) it would kinda need to support
something like this, or at least it would be nice if it did.



So... is there any consensus already on what hibernation from swapfiles
should look like in Debian?


Thanks,
Chris.