Automatic mount/unmount using systemd/CentOS_7


Randy Rue

Nov 17, 2016, 5:34:19 PM
to s3ql
Hello All,

Based on what appear to be partial solutions from a couple of sources, I've cobbled together the systemd service file below in the hopes of having an s3ql file system mount on bootup and unmount cleanly on shutdown. I confess I'm more of an init.d guy but I'm working on catching up.

The volume mounts great if I start the service via systemctl after startup (if the file system is clean), that is, "systemctl start s3ql"

But if I reboot I find the file system is corrupt and fails to remount, and I need to run fsck.s3ql before it will mount again.

It appears the file system isn't unmounting cleanly on shutdown.

What am I missing?

Randy


Here's my /lib/systemd/system/s3ql.service file:
[Unit]
Description=mount s3ql filesystem
Requires=NetworkManager-wait-online.service
After=NetworkManager-wait-online.service

[Service]
ExecStart=/usr/bin/mount.s3ql --fg --authfile /etc/s3ql.authinfo --allow-other swift://tin.fhcrc.org/fast_dr/ /fast_dr
ExecStop=/usr/bin/umount.s3ql /fast_dr
TimeoutStopSec=5min


Daniel Jagszent

Nov 17, 2016, 6:47:12 PM
to s3ql

Hello Randy,

> [...]
> The volume mounts great if I start the service via systemctl after startup (if the file system is clean), that is, "systemctl start s3ql"
> But if I reboot I find the file system is corrupt and fails to remount, and I need to run fsck.s3ql before it will mount again.
> It appears the file system isn't unmounting cleanly on shutdown.
> [...]
> Here's my /lib/systemd/system/s3ql.service file:
> [Unit]
> Description=mount s3ql filesystem
> Requires=NetworkManager-wait-online.service
> After=NetworkManager-wait-online.service
>
> [Service]
> ExecStart=/usr/bin/mount.s3ql --fg --authfile /etc/s3ql.authinfo --allow-other swift://tin.fhcrc.org/fast_dr/ /fast_dr
> ExecStop=/usr/bin/umount.s3ql /fast_dr
> TimeoutStopSec=5min

I am using this unit file:

[Unit]
Description=Mount s3ql file system
Requires=nss-lookup.target network.target time-sync.target
After=nss-lookup.target network.target network-online.target remote-fs-pre.target time-sync.target
Conflicts=shutdown.target
ConditionPathIsDirectory=/fast_dr

[Service]
#Type=notify
Type=simple
ExecStart=/usr/local/sbin/mount-fast_dr.sh
LimitNOFILE=66000
NotifyAccess=all
TimeoutStopSec=10min
TimeoutStartSec=10min

[Install]
WantedBy=multi-user.target

It uses the following start script (/usr/local/sbin/mount-fast_dr.sh):

#!/bin/bash

FSCK_OPTS="--batch --authfile /etc/s3ql.authinfo"
MOUNT_OPTS="--fg --allow-other --authfile /etc/s3ql.authinfo"
STORAGE_URL="swift://tin.fhcrc.org/fast_dr/"
MOUNTPOINT="/fast_dr"

# Check and mount file system
echo executing fsck.s3ql $FSCK_OPTS "$STORAGE_URL"
/usr/bin/fsck.s3ql $FSCK_OPTS "$STORAGE_URL"
FSCK_RESULT=$?
if [[ $FSCK_RESULT != 0 && $FSCK_RESULT != 128 ]]; then
  echo "fsck.s3ql reported errors! exit code $FSCK_RESULT"
  exit $FSCK_RESULT
fi
/bin/systemd-notify --ready --status="Waiting for data..."
echo executing mount.s3ql $MOUNT_OPTS "$STORAGE_URL" "$MOUNTPOINT"
exec /usr/bin/mount.s3ql $MOUNT_OPTS "$STORAGE_URL" "$MOUNTPOINT"

Nikolaus Rath

Nov 17, 2016, 6:55:29 PM
to s3...@googlegroups.com
On Nov 18 2016, Daniel Jagszent <dan...@jagszent.de> wrote:
> [Service]
> #Type=notify
> Type=simple
> ExecStart=/usr/local/sbin/mount-fast_dr.sh
> LimitNOFILE=66000
> NotifyAccess=all
> TimeoutStopSec=10min
> TimeoutStartSec=10min
>
> [Install]
> WantedBy=multi-user.target
>
> It uses the following start script (/usr/local/sbin/mount-fast_dr.sh):
>
> #!/bin/bash
[...]

> /bin/systemd-notify --ready --status="Waiting for data..."

If you use systemd-notify, you should set "Type = notify". However, you
don't actually need to use systemd-notify -- mount.s3ql is already doing
that for you.
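For reference, a minimal sketch of what that would look like with the unit from this thread (untested, just to illustrate):

[Service]
Type=notify
# mount.s3ql sends READY=1 itself once the file system is mounted,
# so no wrapper script or systemd-notify call is needed
ExecStart=/usr/bin/mount.s3ql --fg --authfile /etc/s3ql.authinfo --allow-other swift://tin.fhcrc.org/fast_dr/ /fast_dr
ExecStop=/usr/bin/umount.s3ql /fast_dr
TimeoutStopSec=10min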

> echo executing mount.s3ql $MOUNT_OPTS "$STORAGE_URL" "$MOUNTPOINT"
> exec /usr/bin/mount.s3ql $MOUNT_OPTS "$STORAGE_URL" "$MOUNTPOINT"

... and this should be run with --foreground, *especially* if you use
"Type = simple".


Best,
-Nikolaus

--
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

»Time flies like an arrow, fruit flies like a Banana.«

Daniel Jagszent

Nov 17, 2016, 7:05:29 PM
to s3...@googlegroups.com

Hi Nikolaus,

Nikolaus Rath wrote:

> /bin/systemd-notify --ready --status="Waiting for data..."
> If you use systemd-notify, you should set "Type = notify". However, you
> don't actually need to use systemd-notify -- mount.s3ql is already doing
> that for you.

Ah, cool – nice to know. If I recall correctly I tried Type=notify and it did not work (that was probably S3QL 2.17 and Debian 8 back then). I did not bother to figure out why, since it works great with Type=simple for my use case, but it probably was a problem with my self-packaged .deb.

> echo executing mount.s3ql $MOUNT_OPTS "$STORAGE_URL" "$MOUNTPOINT"
> exec /usr/bin/mount.s3ql $MOUNT_OPTS "$STORAGE_URL" "$MOUNTPOINT"
> ... and this should be run with --foreground, *especially* if you use
> "Type = simple".

Yes, --fg is in $MOUNT_OPTS.

Randy Rue

Nov 18, 2016, 10:14:21 AM
to s3...@googlegroups.com
Hi Niko, Daniel,

Firstly, thanks for your help with what's essentially not an s3ql
question but an OS issue. I'm reading up on systemd.

The main problem I seem to be having is unmounting the file system
cleanly. Do I understand from your example that you've circumvented this
by running fsck.s3ql again before each remount?

If so, this might not meet our needs as I'm hoping to use this for
fairly large volumes and waiting for a file system check could take a while.

Any guidance on a clean unmount on shutdown?

Randy

Nikolaus Rath

Nov 18, 2016, 11:54:04 AM
to s3...@googlegroups.com
On Nov 18 2016, Randy Rue <rand...@gmail.com> wrote:
> The main problem I seem to be having is unmounting the file system
> cleanly. Do I understand from your example that you've circumvented
> this by running fsck.s3ql again before each remount?

That is not a good idea: fsck.s3ql is not always able to fix everything
that may go wrong if the mount.s3ql process is killed.

You need to configure systemd to wait until the mount.s3ql process
terminates. I don't know how, but I do know that it can be done.

Randy Rue

Nov 18, 2016, 2:47:19 PM
to s3ql
OK, we're now almost purely into the world of OS questions (I do have one s3ql question), but this pertains to running s3ql so I'll keep going.

I've changed the TimeoutStopSec=5min setting under [Service] to 1min. This setting appears not to enforce any timeout; instead it's just how long the call to unmount will wait before marking it failed or killing it.

I've added ExecStopPost=/usr/bin/sleep 60 to the [Service] section in the hopes that it would add a one-minute pause to unmounting.

So my unit file /lib/systemd/system/s3ql.service now looks like:
[Unit]
Description=mount s3ql filesystem
Requires=NetworkManager-wait-online.service
Before=nfs-server.service
After=NetworkManager-wait-online.service


[Service]
ExecStart=/usr/bin/mount.s3ql --fg --authfile /etc/s3ql.authinfo --allow-other swift://tin.fhcrc.org/fast_dr/ /fast_dr
ExecStop=/usr/bin/umount.s3ql /fast_dr
ExecStopPost=/usr/bin/sleep 60
TimeoutStopSec=1min


[Install]
WantedBy=multi-user.target
RequiredBy=nfs-server.service

(note that I want to serve this volume via NFS and don't want nfs-server running unless this volume is available)

ran "systemctl daemon-reload" to load changes

Still no love. If I fsck the volume, I can run "systemctl start s3ql.service" and the volume mounts. If I "systemctl restart", the volume unmounts cleanly, waits a minute and mounts again. "systemctl stop" also unmounts gracefully, including a 60 second pause.

But if I reboot, the system is down within a second (tailing syslog shows this). What am I missing? You've mentioned systemd-notify here; I believe that without any Type setting in the unit file the service defaults to "simple". Should I be checking on the OS that something is running for that to work?

s3ql question, finally: a clean unmount shows the metadata being backed up and the backup rotated. Is that the complete DB? How large? Where's it kept? If the file system is large (say millions of inodes or more) how much time/space will that take?

Reading up on systemd and notify....


Randy Rue

Nov 18, 2016, 3:02:10 PM
to s3ql
Niko,

Earlier in tweaking on this you had me change a few lines in mount.py, from:

try:
    from systemd.daemon import notify as sd_notify
except ImportError:
    sd_notify = None

to

sd_notify = None

Do I gather correctly that my version of mount is NOT systemd-notify compatible? Also, I don't see any mention of "notify" at all in the umount.py code. Even if your stock mount code is compatible, is this not the case for umount?



Randy Rue

Nov 18, 2016, 3:56:12 PM
to s3ql
Some more information:

Mounting the volume via mount.s3ql works with either version of mount.py (with or without the import of sd_notify from systemd.daemon).

From a python3 session I can also run "from systemd.daemon import notify as sd_notify" and then autocompletion shows a wide list of options for calls to sd_notify.
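(A quick way to exercise the same import from a shell; outside a running service NOTIFY_SOCKET is unset, so notify() should simply return False rather than fail:)

python3 -c 'from systemd.daemon import notify; print(notify("STATUS=test"))'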

You mentioned believing I might be running some rebel version of systemd. Again, this is a pretty stripped-down install of CentOS 7 with:
[root@fast-dr-proxy ~]# rpm -qa | grep systemd
systemd-devel-219-19.el7_2.13.x86_64
systemd-sysv-219-19.el7_2.13.x86_64
systemd-python-219-19.el7_2.13.x86_64
systemd-libs-219-19.el7_2.13.x86_64
systemd-219-19.el7_2.13.x86_64

Is one of these wrong?

If I change the unit file to "Type=notify" with either version of mount.py (sd_notify imported or sd_notify = None), the service no longer starts via systemctl.

Right now I'm back to the stock mount.py and no Type argument in the unit file (which defaults me to simple). Back to where mounts and startups work but not shutdowns. But without any sd_notify call in umount.py, I believe this means that systemd is never hearing whether the umount completes?

Can it? I see a call to sd_notify that passes "STOPPING=1" but that seems to only tell systemd that the shutdown has begun?

Daniel Jagszent

Nov 18, 2016, 5:00:29 PM
to s3...@googlegroups.com

Nikolaus Rath wrote:

> You need to configure systemd to wait until the mount.s3ql process
> terminates. I don't know how, but I do know that it can be done.

That’s what the Conflicts=shutdown.target in my unit file is for. With that line, the unit gets stopped before the shutdown target is reached and systemd honors TimeoutStopSec. Without it, the unit gets stopped during the shutdown target and then cannot wait that long before being forcefully killed.
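Concretely, these are the lines that matter (just a sketch, not my full unit file from earlier in the thread):

[Unit]
# stop this unit before shutdown.target is reached, so that
# TimeoutStopSec is still honored and umount.s3ql can finish
Conflicts=shutdown.target

[Service]
TimeoutStopSec=10min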

Nikolaus Rath

Nov 18, 2016, 11:22:47 PM
to s3...@googlegroups.com
On Nov 18 2016, Randy Rue <rand...@gmail.com> wrote:
> I've changed the TimeoutStopSec=5min setting under [Service] to
> 1min. This
> [...]
> But (tailing syslog shows this) if I reboot, the system is down within a
> second. What am I missing?

I'd take this to the systemd mailing list instead. They're much more
likely to be able to help you. If you find the solution, please share it
here as well though :-).

> You've mentioned systemd-notify here; I believe that without any Type
> setting in the unit file the service defaults to "simple". Should I be
> checking on the OS that something is running for that to work?

I don't understand what you mean.

> s3ql question, finally: a clean unmount shows the metadata being backed up
> and the backup rotated. Is that the complete DB?

Yes.

> How large?

s3qlstat should tell you.
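For example, with the mountpoint from this thread (the file system has to be mounted):

s3qlstat /fast_dr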

> Where's it
> kept?

Locally? In --cachedir.

> If the file system is large (say millions of inodes or more) how much
> time/space will that take?

That depends on the kind of data and your system. Create a file system
with a few thousand inodes and extrapolate from that; it should be
linear.
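(For illustration, with made-up numbers: if a test file system with 10,000 inodes yields a 50 MB metadata database that takes 5 seconds to back up, linear scaling suggests a million inodes would mean roughly 5 GB and about 500 seconds.)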

Nikolaus Rath

Nov 18, 2016, 11:24:22 PM
to s3...@googlegroups.com
On Nov 18 2016, Randy Rue <rand...@gmail.com> wrote:
> Niko,
>
> Earlier in tweaking on this you had me change a few lines in mount.py, from
> try:
>     from systemd.daemon import notify as sd_notify
> except ImportError:
>     sd_notify = None
>
> to
>
> sd_notify = None

That was when I thought there was a bug in S3QL. But as I later wrote,
that's not the case. You just have a "bad" systemd module
installed. Don't use the workaround; install the proper python module
that comes with systemd.

> Also, I don't see any mention of "notify" at all in the
> umount.py code? Even if your stock mount code is compatible, is this not
> the case for unmount?

umount doesn't need to notify systemd. systemd monitors the mount.s3ql
pid, and this is how it detects when mount.s3ql has terminated.
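A sketch of the lifecycle this implies, assuming ExecStart runs mount.s3ql with --fg so that it stays the unit's main process:

# start: ExecStart -> mount.s3ql --fg       (becomes the unit's MainPID)
# stop:  ExecStop  -> umount.s3ql /fast_dr  (asks the file system to shut down)
#        systemd then waits (up to TimeoutStopSec) for the MainPID to exit;
#        the unit counts as stopped only once mount.s3ql has terminated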

Nikolaus Rath

Nov 18, 2016, 11:26:31 PM
to s3...@googlegroups.com
On Nov 18 2016, Randy Rue <rand...@gmail.com> wrote:
> You mentioned believing I might be running some rebel version of systemd.
> Again, this is pretty stripped down install of CentOS_7 with:
> [root@fast-dr-proxy ~]# rpm -qa | grep systemd
> systemd-devel-219-19.el7_2.13.x86_64
> systemd-sysv-219-19.el7_2.13.x86_64
> systemd-python-219-19.el7_2.13.x86_64
> systemd-libs-219-19.el7_2.13.x86_64
> systemd-219-19.el7_2.13.x86_64
>
> Is one of these wrong?

I don't know.

> If I change the unit file to "Type=notify" with either version of mount.py
> (sd_notify imported or sd_notify = None), the service no longer starts via
> systemctl

Since you said it crashes when you don't set sd_notify = None, that's
not surprising. Or are you using the proper systemd module now?

> Right now I'm back to the stock mount.py and no Type argument in the unit
> file (which defaults me to simple). Back to where mounts and startups work
> but not shutdowns. But without any sd_notify call in umount.py, I believe
> this means that systemd is never hearing whether the umount completes?

I think as long as you use --fg, it should still work most of the time.

Chris Davies

Dec 5, 2016, 6:41:26 AM
to s3ql
A somewhat belated reply...

On Thursday, 17 November 2016 22:34:19 UTC, Randy Rue wrote:
> [...] if I reboot I find the file system is corrupt and fails to remount, and I need to run fsck.s3ql before it will mount again.
> It appears the file system isn't unmounting cleanly on shutdown.

My remote filesystems take up to 10 minutes to unmount cleanly. My solution is also to run fsck on boot, combined with the automounter (autofs) so that the S3QL filesystems are mounted only on demand. (The fsck exits rapidly if it determines the filesystem was unmounted cleanly, which suits me because otherwise it takes over an hour to complete.)

Chris
