[PATCH 0/5] Support booting from virtio-fs

Fotis Xenakis

Aug 19, 2020, 5:41:03 AM
to osv...@googlegroups.com, Fotis Xenakis
This patch series adds support for booting from virtio-fs, just like
zfs, rofs and ramfs. On the way, it makes some other changes in this
area, most notably it adds a command line option for specifying the root
fs type.

With this functionality, virtio-fs should be more or less on par with
the other filesystems feature-wise. A list of still-pending items:
1. Support mmap
2. Implement seek
3. Improve getattr
4. Various other (mostly minor) TODOs in the code

Hopefully, the way the patches have been broken down makes the various
changes easier to review. All feedback is as always more than welcome!

Fotis Xenakis (5):
virtio-fs: use exclusive device ids
vfs: homogenize mount_rofs_rootfs and mount_zfs_rootfs
loader: add rootfs type command line option
loader: add support for booting from virtio-fs
scripts/build: don't exit when exporting files

drivers/virtio-fs.cc | 10 ++++--
fs/vfs/main.cc | 74 +++++++++++++++++++++++++++++++-------------
loader.cc | 64 ++++++++++++++++++++++++++++----------
scripts/build | 26 +++++++++++-----
4 files changed, 125 insertions(+), 49 deletions(-)

--
2.28.0

Fotis Xenakis

Aug 19, 2020, 5:41:58 AM
to osv...@googlegroups.com, Fotis Xenakis
Previously, the devfs entry for each virtio-fs device was derived from
virtio::virtio_driver::_disk_idx, which is shared with virtio-blk and
virtio-scsi (those two do share a devfs namespace anyway). This
introduced unnecessary complexity, since e.g. changing the number of
virtio-blk devices could shift the naming of the virtio-fs devices.

This switches virtio-fs to using a separate, exclusive id. The logic
behind the devfs name (i.e. virtiofs<instance>) remains the same.

Signed-off-by: Fotis Xenakis <fo...@windowslive.com>
---
drivers/virtio-fs.cc | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/virtio-fs.cc b/drivers/virtio-fs.cc
index 00f862a1..a8a81265 100644
--- a/drivers/virtio-fs.cc
+++ b/drivers/virtio-fs.cc
@@ -131,10 +131,14 @@ fs::fs(virtio_device& virtio_dev)
// Step 8
add_dev_status(VIRTIO_CONFIG_S_DRIVER_OK);

+ // TODO: Don't ignore the virtio-fs tag and use that instead of _id for
+ // identifying the device (e.g. something like /dev/virtiofs/<tag> or at
+ // least /dev/virtiofs-<tag> would be nice, but devfs does not support
+ // nested directories or device names > 12). Linux does not create a devfs
+ // entry and instead uses the virtio-fs tag passed to mount directly.
std::string dev_name("virtiofs");
- dev_name += std::to_string(_disk_idx++);
-
- struct device* dev = device_create(&fs_driver, dev_name.c_str(), D_BLK); // TODO Should it be really D_BLK?
+ dev_name += std::to_string(_id);
+ struct device* dev = device_create(&fs_driver, dev_name.c_str(), D_BLK);
dev->private_data = this;
debugf("virtio-fs: Add device instance %d as [%s]\n", _id,
dev_name.c_str());
--
2.28.0

Fotis Xenakis

Aug 19, 2020, 5:43:05 AM
to osv...@googlegroups.com, Fotis Xenakis
Signed-off-by: Fotis Xenakis <fo...@windowslive.com>
---
fs/vfs/main.cc | 52 +++++++++++++++++++++++++++++---------------------
loader.cc | 20 +++++++++----------
2 files changed, 40 insertions(+), 32 deletions(-)

diff --git a/fs/vfs/main.cc b/fs/vfs/main.cc
index 3c8b327b..6cee319e 100644
--- a/fs/vfs/main.cc
+++ b/fs/vfs/main.cc
@@ -1593,7 +1593,7 @@ int faccessat(int dirfd, const char *pathname, int mode, int flags)
return error;
}

-extern "C"
+extern "C"
int euidaccess(const char *pathname, int mode)
{
return access(pathname, mode);
@@ -2375,45 +2375,53 @@ extern "C" void unmount_devfs()

extern "C" int mount_rofs_rootfs(bool pivot_root)
{
- int ret;
-
- if (mkdir("/rofs", 0755) < 0)
- kprintf("failed to create /rofs, error = %s\n", strerror(errno));
+ constexpr char* mp = "/rofs";

- ret = sys_mount("/dev/vblk0.1", "/rofs", "rofs", MNT_RDONLY, 0);
+ if (mkdir(mp, 0755) < 0) {
+ int ret = errno;
+ kprintf("failed to create %s, error = %s\n", mp, strerror(errno));
+ return ret;
+ }

+ int ret = sys_mount("/dev/vblk0.1", mp, "rofs", MNT_RDONLY, nullptr);
if (ret) {
- kprintf("failed to mount /rofs, error = %s\n", strerror(ret));
- rmdir("/rofs");
+ kprintf("failed to mount %s, error = %s\n", mp, strerror(ret));
+ rmdir(mp);
return ret;
}

if (pivot_root) {
- pivot_rootfs("/rofs");
+ pivot_rootfs(mp);
}

return 0;
}

-extern "C" void mount_zfs_rootfs(bool pivot_root, bool extra_zfs_pools)
+extern "C" int mount_zfs_rootfs(bool pivot_root, bool extra_zfs_pools)
{
- if (mkdir("/zfs", 0755) < 0)
- kprintf("failed to create /zfs, error = %s\n", strerror(errno));
+ constexpr char* mp = "/zfs";

- int ret = sys_mount("/dev/vblk0.1", "/zfs", "zfs", 0, (void *)"osv/zfs");
-
- if (ret)
- kprintf("failed to mount /zfs, error = %s\n", strerror(ret));
-
- if (!pivot_root) {
- return;
+ if (mkdir(mp, 0755) < 0) {
+ int ret = errno;
+ kprintf("failed to create %s, error = %s\n", mp, strerror(errno));
+ return ret;
}

- pivot_rootfs("/zfs");
+ int ret = sys_mount("/dev/vblk0.1", mp, "zfs", 0, (void *)"osv/zfs");
+ if (ret) {
+ kprintf("failed to mount %s, error = %s\n", mp, strerror(ret));
+ rmdir(mp);
+ return ret;
+ }

- if (extra_zfs_pools) {
- import_extra_zfs_pools();
+ if (pivot_root) {
+ pivot_rootfs(mp);
+ if (extra_zfs_pools) {
+ import_extra_zfs_pools();
+ }
}
+
+ return 0;
}

extern "C" void unmount_rootfs(void)
diff --git a/loader.cc b/loader.cc
index 66bfb52c..9d9d3173 100644
--- a/loader.cc
+++ b/loader.cc
@@ -87,7 +87,7 @@ extern "C" {
void premain();
void vfs_init(void);
void unmount_devfs();
- void mount_zfs_rootfs(bool,bool);
+ int mount_zfs_rootfs(bool, bool);
int mount_rofs_rootfs(bool);
void rofs_disable_cache();
}
@@ -396,19 +396,20 @@ void* do_main_thread(void *_main_args)

if (opt_mount) {
unmount_devfs();
- //
+
// Try to mount rofs
- if(mount_rofs_rootfs(opt_pivot) != 0) {
- //
+ if (mount_rofs_rootfs(opt_pivot) != 0) {
// Failed -> try to mount zfs
zfsdev::zfsdev_init();
- mount_zfs_rootfs(opt_pivot, opt_extra_zfs_pools);
+ auto error = mount_zfs_rootfs(opt_pivot, opt_extra_zfs_pools);
+ if (error) {
+ debug("Could not mount zfs root filesystem.\n");
+ }
bsd_shrinker_init();

boot_time.event("ZFS mounted");
- }
- else {
- if(opt_disable_rofs_cache) {
+ } else {
+ if (opt_disable_rofs_cache) {
debug("Disabling ROFS memory cache.\n");
rofs_disable_cache();
}
@@ -491,8 +492,7 @@ void* do_main_thread(void *_main_args)

if (opt_bootchart) {
boot_time.print_chart();
- }
- else {
+ } else {
boot_time.print_total_time();
}

--
2.28.0

Fotis Xenakis

Aug 19, 2020, 5:43:49 AM
to osv...@googlegroups.com, Fotis Xenakis
Previously, the loader always attempted to:
1. Mount a rofs root filesystem.
2. Mount a zfs root filesystem, if the previous failed.
3. Do nothing (i.e. keep the ramfs mounted as bootfs) if the others
failed.

This adds a new command line option to control this behaviour instead,
specifying which filesystem should be attempted for mounting root
(defaulting to zfs, to avoid surprises). This option is set by
scripts/build (scripts/run.py would be the obvious choice for setting a
command line option, but it would require the user to specify the root
fs type at run time in addition to build time).

In addition, in a way this formalizes supporting ramfs as the root
filesystem. As noted in the code, this support consists of simply
mounting the fstab entries when ramfs is selected (this is in sync with
what happened previously, when mounting zfs failed). So, there is no
functionality added here, just documenting the option and making the
code more clear and explicit.

Signed-off-by: Fotis Xenakis <fo...@windowslive.com>
---
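A usage illustration (the app name and invocation below are placeholders,
not part of this patch): after e.g. './scripts/build fs=rofs', the stored
command line starts with the matching option, so the image boots with
something like

    --rootfs=rofs /hello

and the value can still be overridden by hand through run.py's -e flag:

    ./scripts/run.py -e "--rootfs=rofs /hello"
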
fs/vfs/main.cc | 2 +-
loader.cc | 44 +++++++++++++++++++++++++++++++++-----------
scripts/build | 3 +++
3 files changed, 37 insertions(+), 12 deletions(-)

diff --git a/fs/vfs/main.cc b/fs/vfs/main.cc
index 6cee319e..957333c2 100644
--- a/fs/vfs/main.cc
+++ b/fs/vfs/main.cc
@@ -2330,7 +2330,7 @@ static void mount_fs(mntent *m)
}

extern std::vector<mntent> opt_mount_fs;
-void pivot_rootfs(const char* path)
+extern "C" void pivot_rootfs(const char* path)
{
int ret = sys_pivot_root(path, "/");
if (ret)
diff --git a/loader.cc b/loader.cc
index 9d9d3173..60550344 100644
--- a/loader.cc
+++ b/loader.cc
@@ -86,6 +86,7 @@ void setup_tls(elf::init_table inittab)
extern "C" {
void premain();
void vfs_init(void);
+ void pivot_rootfs(const char*);
void unmount_devfs();
int mount_zfs_rootfs(bool, bool);
int mount_rofs_rootfs(bool);
@@ -134,6 +135,7 @@ bool opt_power_off_on_abort = false;
static bool opt_log_backtrace = false;
static bool opt_mount = true;
static bool opt_pivot = true;
+static std::string opt_rootfs = "zfs";
static bool opt_random = true;
static bool opt_init = true;
static std::string opt_console = "all";
@@ -161,8 +163,9 @@ static void usage()
std::cout << " --trace=arg tracepoints to enable\n";
std::cout << " --trace-backtrace log backtraces in the tracepoint log\n";
std::cout << " --leak start leak detector after boot\n";
- std::cout << " --nomount don't mount the ZFS file system\n";
- std::cout << " --nopivot do not pivot the root from bootfs to the ZFS\n";
+ std::cout << " --nomount don't mount the root file system\n";
+ std::cout << " --nopivot do not pivot the root from bootfs to the root fs\n";
+ std::cout << " --rootfs=arg root filesystem to use (zfs, rofs or ramfs)\n";
std::cout << " --assign-net assign virtio network to the application\n";
std::cout << " --maxnic=arg maximum NIC number\n";
std::cout << " --norandom don't initialize any random device\n";
@@ -274,6 +277,14 @@ static void parse_options(int loader_argc, char** loader_argv)
debug("console=%s\n", opt_console);
}

+ if (options::option_value_exists(options_values, "rootfs")) {
+ auto v = options::extract_option_values(options_values, "rootfs");
+ if (v.size() > 1) {
+ printf("Ignoring '--rootfs' options after the first.");
+ }
+ opt_rootfs = v.front();
+ }
+
if (options::option_value_exists(options_values, "mount-fs")) {
auto mounts = options::extract_option_values(options_values, "mount-fs");
for (auto m : mounts) {
@@ -397,23 +408,34 @@ void* do_main_thread(void *_main_args)
if (opt_mount) {
unmount_devfs();

- // Try to mount rofs
- if (mount_rofs_rootfs(opt_pivot) != 0) {
- // Failed -> try to mount zfs
- zfsdev::zfsdev_init();
- auto error = mount_zfs_rootfs(opt_pivot, opt_extra_zfs_pools);
+ if (opt_rootfs.compare("rofs") == 0) {
+ auto error = mount_rofs_rootfs(opt_pivot);
if (error) {
- debug("Could not mount zfs root filesystem.\n");
+ debug("Could not mount rofs root filesystem.\n");
}
- bsd_shrinker_init();

- boot_time.event("ZFS mounted");
- } else {
if (opt_disable_rofs_cache) {
debug("Disabling ROFS memory cache.\n");
rofs_disable_cache();
}
boot_time.event("ROFS mounted");
+ } else if (opt_rootfs.compare("ramfs") == 0) {
+ // NOTE: The ramfs is already mounted, we just need to mount fstab
+ // entries. That's the only difference between this and --nomount.
+
+ // TODO: Avoid the hack of using pivot_rootfs() just for mounting
+ // the fstab entries.
+ pivot_rootfs("/");
+ } else {
+ // Go with ZFS in all other cases
+ zfsdev::zfsdev_init();
+ auto error = mount_zfs_rootfs(opt_pivot, opt_extra_zfs_pools);
+ if (error) {
+ debug("Could not mount zfs root filesystem.\n");
+ }
+
+ bsd_shrinker_init();
+ boot_time.event("ZFS mounted");
}
}

diff --git a/scripts/build b/scripts/build
index 5d480d57..8fa18071 100755
--- a/scripts/build
+++ b/scripts/build
@@ -315,6 +315,9 @@ rofs)
ramfs)
qemu-img convert -f raw -O qcow2 loader.img usr.img ;;
esac
+# Prepend the root fs type option to the command line (preserved by run.py)
+cmdline=$(cat cmdline)
+echo -n "--rootfs=${fs_type} ${cmdline}" > cmdline

if [[ -f "$OSV_BUILD_PATH/usr.img" ]]; then
"$SRC"/scripts/imgedit.py setargs usr.img `cat cmdline`
--
2.28.0

Fotis Xenakis

Aug 19, 2020, 5:44:31 AM
to osv...@googlegroups.com, Fotis Xenakis
This makes the necessary and pretty straight-forward additions to the
loader to support using virtio-fs as the root filesystem. It also makes
minimal changes to scripts/build to add support there as well. Note
that, to obtain a directory with contents specified by the manifest
files, usable as the virtio-fs host directory, one can use the existing
'export' and 'export_dir' (previously undocumented) options to
scripts/build.

Ref https://github.com/cloudius-systems/osv/issues/1062.

Signed-off-by: Fotis Xenakis <fo...@windowslive.com>
---
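For reference, the intended workflow looks roughly like this (paths,
socket and directory names are illustrative only, not mandated by this
patch):

    # build an image with virtio-fs as the root fs, exporting its files
    ./scripts/build fs=virtiofs export=all export_dir=/path/to/export

    # serve the exported directory over vhost-user (currently needs root)
    virtiofsd --socket-path=/tmp/vhostqemu -o source=/path/to/export

QEMU then attaches the device with '-device vhost-user-fs-pci,...' plus
a shared memory backend, as described in the virtio-fs documentation.
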
fs/vfs/main.cc | 24 ++++++++++++++++++++++++
loader.cc | 10 +++++++++-
scripts/build | 22 +++++++++++++++-------
3 files changed, 48 insertions(+), 8 deletions(-)

diff --git a/fs/vfs/main.cc b/fs/vfs/main.cc
index 957333c2..f26e2275 100644
--- a/fs/vfs/main.cc
+++ b/fs/vfs/main.cc
@@ -2424,6 +2424,30 @@ extern "C" int mount_zfs_rootfs(bool pivot_root, bool extra_zfs_pools)
return 0;
}

+extern "C" int mount_virtiofs_rootfs(bool pivot_root)
+{
+ constexpr char* mp = "/virtiofs";
+
+ if (mkdir(mp, 0755) < 0) {
+ int ret = errno;
+ kprintf("failed to create %s, error = %s\n", mp, strerror(errno));
+ return ret;
+ }
+
+ int ret = sys_mount("/dev/virtiofs0", mp, "virtiofs", MNT_RDONLY, nullptr);
+ if (ret) {
+ kprintf("failed to mount %s, error = %s\n", mp, strerror(ret));
+ rmdir(mp);
+ return ret;
+ }
+
+ if (pivot_root) {
+ pivot_rootfs(mp);
+ }
+
+ return 0;
+}
+
extern "C" void unmount_rootfs(void)
{
int ret;
diff --git a/loader.cc b/loader.cc
index 60550344..a588af9a 100644
--- a/loader.cc
+++ b/loader.cc
@@ -91,6 +91,7 @@ extern "C" {
int mount_zfs_rootfs(bool, bool);
int mount_rofs_rootfs(bool);
void rofs_disable_cache();
+ int mount_virtiofs_rootfs(bool);
}

void premain()
@@ -165,7 +166,7 @@ static void usage()
std::cout << " --leak start leak detector after boot\n";
std::cout << " --nomount don't mount the root file system\n";
std::cout << " --nopivot do not pivot the root from bootfs to the root fs\n";
- std::cout << " --rootfs=arg root filesystem to use (zfs, rofs or ramfs)\n";
+ std::cout << " --rootfs=arg root filesystem to use (zfs, rofs, ramfs or virtiofs)\n";
std::cout << " --assign-net assign virtio network to the application\n";
std::cout << " --maxnic=arg maximum NIC number\n";
std::cout << " --norandom don't initialize any random device\n";
@@ -426,6 +427,13 @@ void* do_main_thread(void *_main_args)
// TODO: Avoid the hack of using pivot_rootfs() just for mounting
// the fstab entries.
pivot_rootfs("/");
+ } else if (opt_rootfs.compare("virtiofs") == 0) {
+ auto error = mount_virtiofs_rootfs(opt_pivot);
+ if (error) {
+ debug("Could not mount virtiofs root filesystem.\n");
+ }
+
+ boot_time.event("Virtio-fs mounted");
} else {
// Go with ZFS in all other cases
zfsdev::zfsdev_init();
diff --git a/scripts/build b/scripts/build
index 8fa18071..24e930d3 100755
--- a/scripts/build
+++ b/scripts/build
@@ -24,8 +24,9 @@ usage() {
--help|-h Print this help message
arch=x64|aarch64 Specify the build architecture; default is x64
mode=release|debug Specify the build mode; default is release
- export=none|selected|all If 'selected' or 'all' export the app files to build/export
- fs=zfs|rofs|ramfs Specify the filesystem of the image partition
+ export=none|selected|all If 'selected' or 'all' export the app files to <export_dir>
+ export_dir=<dir> The directory to export the files to; default is build/export
+ fs=zfs|rofs|ramfs|virtiofs Specify the filesystem of the image partition
fs_size=N Specify the size of the image in bytes
fs_size_mb=N Specify the size of the image in MiB
app_local_exec_tls_size=N Specify the size of app local TLS in bytes; the default is 64
@@ -182,12 +183,17 @@ manifest=bootfs.manifest.skel
fs_type=${vars[fs]-zfs}
usrskel_arg=
case $fs_type in
-zfs);; # Nothing to change here. This is our default behavior
-rofs) manifest=bootfs_empty.manifest.skel
+zfs)
+ ;; # Nothing to change here. This is our default behavior
+rofs|virtiofs)
+ # Both are read-only (in OSv) and require nothing extra on bootfs to work
+ manifest=bootfs_empty.manifest.skel
usrskel_arg="--usrskel usr_rofs.manifest.skel";;
-ramfs) manifest=$OUT/usr.manifest
+ramfs)
+ manifest=$OUT/usr.manifest
usrskel_arg="--usrskel usr_ramfs.manifest.skel";;
-*) echo "Unknown filesystem \"$fs_type\"" >&2
+*)
+ echo "Unknown filesystem \"$fs_type\"" >&2
exit 2
esac

@@ -312,7 +318,9 @@ rofs)
partition_size=`stat --printf %s rofs.img`
image_size=$((partition_offset + partition_size))
create_rofs_disk ;;
-ramfs)
+ramfs|virtiofs)
+ # No need to create extra fs like above: ramfs is already created (as the
+ # bootfs) and virtio-fs is specified with virtiofsd at run time
qemu-img convert -f raw -O qcow2 loader.img usr.img ;;
esac
# Prepend the root fs type option to the command line (preserved by run.py)
--
2.28.0

Fotis Xenakis

Aug 19, 2020, 5:45:47 AM
to osv...@googlegroups.com, Fotis Xenakis
This allows exporting an image's files while also doing a complete
build in the same invocation (the two should anyway be orthogonal).

Signed-off-by: Fotis Xenakis <fo...@windowslive.com>
---
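For example (invocation illustrative, the image name is a placeholder),
this now produces both a bootable usr.img and the exported file tree in
one go:

    ./scripts/build image=native-example fs=rofs export=all export_dir=build/export
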
scripts/build | 1 -
1 file changed, 1 deletion(-)

diff --git a/scripts/build b/scripts/build
index 24e930d3..fed78fd3 100755
--- a/scripts/build
+++ b/scripts/build
@@ -271,7 +271,6 @@ fi
if [ "$export" != "none" ]; then
export_dir=${vars[export_dir]-$SRC/build/export}
"$SRC"/scripts/export_manifest.py -e "$export_dir" -m usr.manifest -D libgcc_s_dir="$libgcc_s_dir"
- exit 0
fi

if [[ ${vars[create_disk]} == "true" ]]; then
--
2.28.0

Waldek Kozaczuk

Aug 26, 2020, 12:02:19 AM
to Fotis Xenakis, OSv Development
Hi,

I have seen your patches but have not been able to review them yet. I will hopefully be able to do it soon.

Waldek


Waldek Kozaczuk

Aug 27, 2020, 1:05:15 AM
to OSv Development
I think all your patches look good (though I have not had a chance to test them yet).

I see you added a new option, '--rootfs', to explicitly tell which filesystem to use. Now, can you clarify whether your changes to the mounting logic are backward compatible?

In other words, when I do not specify '--rootfs', will it try to mount things in the same order it used to before your changes? I wonder if that may break any existing scripts.

Waldek

Waldek Kozaczuk

Aug 28, 2020, 12:05:08 AM
to OSv Development
So I played more and I think I answered my question myself. The new default behavior (no --rootfs passed) is not backward compatible. For example, './scripts/build check fs=rofs' fails because test.py does not get/pass the proper value of --rootfs to run.py. I wonder if we have other cases/scripts where it breaks like that.

So I wonder if we should tweak the 3rd patch in your series like so:
- make --rootfs optional, without the default value of 'zfs'
- if --rootfs is not specified (as with test.py), keep the old behavior: try rofs and then zfs

I like the ability to specify an explicit root filesystem, but I think we need to keep the default behavior if the new parameter is not specified. What do you think?

Waldek


Fotis Xenakis

Aug 28, 2020, 4:22:33 PM
to OSv Development
Indeed, there was no way for this change to be 100% backwards compatible, except the one you just described. Breaking test.py was admittedly an omission on my side; I should have tested more thoroughly.

What you suggest (change the loader's behavior) should be the option with the least effort involved and probably the best guarantee for not breaking any script. My only worry is adding unnecessary complexity to the loader.
The only other solution I see is adding a --rootfs flag to run.py (useful only for overriding, e.g. when using the -e flag to run.py to override the command line) and utilizing that in the other scripts. Nevertheless, this still leaves the possibility of breaking some other use case, while it's also not a particularly robust (or elegant) solution in my eyes.

Your suggestion seems to be the safest option, but if we think the other one is worth a shot, I am also willing to implement that. What do you (or anyone else who wants to chime in) think?

Fotis

Commit Bot

Aug 28, 2020, 8:41:43 PM
to osv...@googlegroups.com, Fotis Xenakis
From: Fotis Xenakis <fo...@windowslive.com>
Committer: Waldemar Kozaczuk <jwkoz...@gmail.com>
Branch: master

virtio-fs: use exclusive device ids

Previously, the devfs entry for each virtio-fs device was derived from
virtio::virtio_driver::_disk_idx, which is shared with virtio-blk and
virtio-scsi (those two do share a devfs namespace anyway). This
introduced unnecessary complexity, since e.g. changing the number of
virtio-blk devices could shift the naming of the virtio-fs devices.

This switches virtio-fs to using a separate, exclusive id. The logic
behind the devfs name (i.e. virtiofs<instance>) remains the same.

Signed-off-by: Fotis Xenakis <fo...@windowslive.com>

---
diff --git a/drivers/virtio-fs.cc b/drivers/virtio-fs.cc

Commit Bot

Aug 28, 2020, 8:41:44 PM
to osv...@googlegroups.com, Fotis Xenakis
From: Fotis Xenakis <fo...@windowslive.com>
Committer: Waldemar Kozaczuk <jwkoz...@gmail.com>
Branch: master

vfs: homogenize mount_rofs_rootfs and mount_zfs_rootfs

Signed-off-by: Fotis Xenakis <fo...@windowslive.com>
Message-Id: <AM0PR03MB6292ABEC8F...@AM0PR03MB6292.eurprd03.prod.outlook.com>

---
diff --git a/fs/vfs/main.cc b/fs/vfs/main.cc

Waldek Kozaczuk

Aug 28, 2020, 9:01:18 PM
to OSv Development
Let us see what others think (Nadav?). In my humble opinion, making loader.cc backward compatible would not make it all that much more complicated (it really amounts to changing the last "else" in that "if" to do what the old logic did). It would simply act as a fallback if the user did not specify what the filesystem on the drive is and loader.cc has to discover it, which to me makes perfect sense.

Now, adding a --rootfs flag to run.py also makes sense, but then it would somewhat collide with the logic in scripts/build that adds --rootfs to the cmdline file, which I like as well and would love to preserve. Unless we make run.py read the cmdline file (which I think it does) and replace the --rootfs stored by scripts/build with the value passed to run.py. What do you think?

Waldek

BTW. I have committed the first 2 patches, as they would not change regardless of which way we go forward with the 3rd and the other patches.

Fotis Xenakis

Sep 3, 2020, 6:25:08 PM
to osv...@googlegroups.com, Fotis Xenakis
Changes since v1:
- Made the loader backwards compatible: if the --rootfs flag is not
specified, it falls back to the previous behavior of trying to mount
rofs, followed by zfs.

On the root filesystem selection front, the thought of generalizing the
current implementation to try out the other fs as well when --rootfs is
not specified has passed through my mind. This could be a basic
auto-discovery process and should be straight-forward as long as the
various mount_*_rootfs are side-effect free when they fail. Do you think
it's something worth pursuing?
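
A minimal sketch of what such auto-discovery could look like in
do_main_thread() (illustrative only; it assumes the mount_*_rootfs
helpers from this series and picks one possible order):

    // Sketch: try each root filesystem in turn, cheapest first; every
    // mount_*_rootfs returns 0 on success and is expected to be
    // side-effect free on failure.
    if (mount_rofs_rootfs(opt_pivot) == 0) {
        boot_time.event("ROFS mounted");
    } else if (mount_virtiofs_rootfs(opt_pivot) == 0) {
        boot_time.event("Virtio-fs mounted");
    } else {
        zfsdev::zfsdev_init();
        if (mount_zfs_rootfs(opt_pivot, opt_extra_zfs_pools) == 0) {
            bsd_shrinker_init();
            boot_time.event("ZFS mounted");
        } else {
            debug("Could not mount any root filesystem.\n");
        }
    }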

Fotis Xenakis (3):
loader: add rootfs type command line option
loader: add support for booting from virtio-fs
scripts/build: don't exit when exporting files

fs/vfs/main.cc | 26 ++++++++++++++++++-
loader.cc | 68 ++++++++++++++++++++++++++++++++++++++++++--------
scripts/build | 26 +++++++++++++------
3 files changed, 101 insertions(+), 19 deletions(-)

--
2.28.0

Fotis Xenakis

Sep 3, 2020, 6:28:20 PM
to osv...@googlegroups.com, Fotis Xenakis
Previously, the loader always attempted to:
1. Mount a rofs root filesystem.
2. Mount a zfs root filesystem, if the previous failed.
3. Do nothing (i.e. keep the ramfs mounted as bootfs) if the others
failed.

This adds a new command line option to better control this behaviour,
specifying which filesystem should be attempted for mounting root,
falling back to the above if not specified. This option is set by
scripts/build (scripts/run.py would be the obvious choice for setting a
command line option, but it would require the user to specify the root
fs type at run time in addition to build time).

In addition, in a way this formalizes supporting ramfs as the root
filesystem. As noted in the code, this support consists of simply
mounting the fstab entries when ramfs is selected (this is in sync with
what happened previously, when mounting zfs failed). So, there is no
functionality added here, just documenting the option and making the
code more clear and explicit.

Signed-off-by: Fotis Xenakis <fo...@windowslive.com>
---
fs/vfs/main.cc | 2 +-
loader.cc | 60 +++++++++++++++++++++++++++++++++++++++++---------
scripts/build | 3 +++
3 files changed, 54 insertions(+), 11 deletions(-)

diff --git a/fs/vfs/main.cc b/fs/vfs/main.cc
index 6cee319e..957333c2 100644
--- a/fs/vfs/main.cc
+++ b/fs/vfs/main.cc
@@ -2330,7 +2330,7 @@ static void mount_fs(mntent *m)
}

extern std::vector<mntent> opt_mount_fs;
-void pivot_rootfs(const char* path)
+extern "C" void pivot_rootfs(const char* path)
{
int ret = sys_pivot_root(path, "/");
if (ret)
diff --git a/loader.cc b/loader.cc
index 9d9d3173..19b47e0e 100644
--- a/loader.cc
+++ b/loader.cc
@@ -86,6 +86,7 @@ void setup_tls(elf::init_table inittab)
extern "C" {
void premain();
void vfs_init(void);
+ void pivot_rootfs(const char*);
void unmount_devfs();
int mount_zfs_rootfs(bool, bool);
int mount_rofs_rootfs(bool);
@@ -134,6 +135,7 @@ bool opt_power_off_on_abort = false;
static bool opt_log_backtrace = false;
static bool opt_mount = true;
static bool opt_pivot = true;
+static std::string opt_rootfs;
static bool opt_random = true;
static bool opt_init = true;
static std::string opt_console = "all";
@@ -161,8 +163,9 @@ static void usage()
std::cout << " --trace=arg tracepoints to enable\n";
std::cout << " --trace-backtrace log backtraces in the tracepoint log\n";
std::cout << " --leak start leak detector after boot\n";
- std::cout << " --nomount don't mount the ZFS file system\n";
- std::cout << " --nopivot do not pivot the root from bootfs to the ZFS\n";
+ std::cout << " --nomount don't mount the root file system\n";
+ std::cout << " --nopivot do not pivot the root from bootfs to the root fs\n";
+ std::cout << " --rootfs=arg root filesystem to use (zfs, rofs or ramfs)\n";
std::cout << " --assign-net assign virtio network to the application\n";
std::cout << " --maxnic=arg maximum NIC number\n";
std::cout << " --norandom don't initialize any random device\n";
@@ -274,6 +277,14 @@ static void parse_options(int loader_argc, char** loader_argv)
debug("console=%s\n", opt_console);
}

+ if (options::option_value_exists(options_values, "rootfs")) {
+ auto v = options::extract_option_values(options_values, "rootfs");
+ if (v.size() > 1) {
+ printf("Ignoring '--rootfs' options after the first.");
+ }
+ opt_rootfs = v.front();
+ }
+
if (options::option_value_exists(options_values, "mount-fs")) {
auto mounts = options::extract_option_values(options_values, "mount-fs");
for (auto m : mounts) {
@@ -397,23 +408,52 @@ void* do_main_thread(void *_main_args)
if (opt_mount) {
unmount_devfs();

- // Try to mount rofs
- if (mount_rofs_rootfs(opt_pivot) != 0) {
- // Failed -> try to mount zfs
+ if (opt_rootfs.compare("rofs") == 0) {
+ auto error = mount_rofs_rootfs(opt_pivot);
+ if (error) {
+ debug("Could not mount rofs root filesystem.\n");
+ }
+
+ if (opt_disable_rofs_cache) {
+ debug("Disabling ROFS memory cache.\n");
+ rofs_disable_cache();
+ }
+ boot_time.event("ROFS mounted");
+ } else if (opt_rootfs.compare("zfs") == 0) {
zfsdev::zfsdev_init();
auto error = mount_zfs_rootfs(opt_pivot, opt_extra_zfs_pools);
if (error) {
debug("Could not mount zfs root filesystem.\n");
}
- bsd_shrinker_init();

+ bsd_shrinker_init();
boot_time.event("ZFS mounted");
+ } else if (opt_rootfs.compare("ramfs") == 0) {
+ // NOTE: The ramfs is already mounted, we just need to mount fstab
+ // entries. That's the only difference between this and --nomount.
+
+ // TODO: Avoid the hack of using pivot_rootfs() just for mounting
+ // the fstab entries.
+ pivot_rootfs("/");
} else {
- if (opt_disable_rofs_cache) {
- debug("Disabling ROFS memory cache.\n");
- rofs_disable_cache();
+ // Fallback to original behavior for compatibility: try rofs -> zfs
+ if (mount_rofs_rootfs(opt_pivot) == 0) {
+ if (opt_disable_rofs_cache) {
+ debug("Disabling ROFS memory cache.\n");
+ rofs_disable_cache();
+ }
+ boot_time.event("ROFS mounted");
+ } else {
+ zfsdev::zfsdev_init();
+ auto error = mount_zfs_rootfs(opt_pivot, opt_extra_zfs_pools);
+ if (error) {
+ debug("Could not mount zfs root filesystem (while "
+ "auto-discovering).\n");
+ }
+
+ bsd_shrinker_init();
+ boot_time.event("ZFS mounted");
}
- boot_time.event("ROFS mounted");
}
}

diff --git a/scripts/build b/scripts/build
index 5d480d57..8fa18071 100755
--- a/scripts/build
+++ b/scripts/build
@@ -315,6 +315,9 @@ rofs)
ramfs)
qemu-img convert -f raw -O qcow2 loader.img usr.img ;;
esac
+# Prepend the root fs type option to the command line (preserved by run.py)
+cmdline=$(cat cmdline)
+echo -n "--rootfs=${fs_type} ${cmdline}" > cmdline

Fotis Xenakis

unread,
Sep 3, 2020, 6:29:20 PM9/3/20
to osv...@googlegroups.com, Fotis Xenakis
This makes the necessary and pretty straight-forward additions to the
loader to support using virtio-fs as the root filesystem. It also makes
minimal changes to scripts/build to add support there as well. Note
that, to obtain a directory with contents specified by the manifest
files, usable as the virtio-fs host directory, one can use the existing
'export' and 'export_dir' (previously undocumented) options to
scripts/build.

Ref https://github.com/cloudius-systems/osv/issues/1062.

Signed-off-by: Fotis Xenakis <fo...@windowslive.com>
---
fs/vfs/main.cc | 24 ++++++++++++++++++++++++
loader.cc | 10 +++++++++-
scripts/build | 22 +++++++++++++++-------
3 files changed, 48 insertions(+), 8 deletions(-)

diff --git a/fs/vfs/main.cc b/fs/vfs/main.cc
index 957333c2..f26e2275 100644
--- a/fs/vfs/main.cc
+++ b/fs/vfs/main.cc
@@ -2424,6 +2424,30 @@ extern "C" int mount_zfs_rootfs(bool pivot_root, bool extra_zfs_pools)
return 0;
}

+extern "C" int mount_virtiofs_rootfs(bool pivot_root)
+{
+ constexpr char* mp = "/virtiofs";
+
+ if (mkdir(mp, 0755) < 0) {
+ int ret = errno;
+ kprintf("failed to create %s, error = %s\n", mp, strerror(errno));
+ return ret;
+ }
+
+ int ret = sys_mount("/dev/virtiofs0", mp, "virtiofs", MNT_RDONLY, nullptr);
+ if (ret) {
+ kprintf("failed to mount %s, error = %s\n", mp, strerror(ret));
+ rmdir(mp);
+ return ret;
+ }
+
+ if (pivot_root) {
+ pivot_rootfs(mp);
+ }
+
+ return 0;
+}
+
extern "C" void unmount_rootfs(void)
{
int ret;
diff --git a/loader.cc b/loader.cc
index 19b47e0e..9656cf3b 100644
--- a/loader.cc
+++ b/loader.cc
@@ -91,6 +91,7 @@ extern "C" {
int mount_zfs_rootfs(bool, bool);
int mount_rofs_rootfs(bool);
void rofs_disable_cache();
+ int mount_virtiofs_rootfs(bool);
}

void premain()
@@ -165,7 +166,7 @@ static void usage()
std::cout << " --leak start leak detector after boot\n";
std::cout << " --nomount don't mount the root file system\n";
std::cout << " --nopivot do not pivot the root from bootfs to the root fs\n";
- std::cout << " --rootfs=arg root filesystem to use (zfs, rofs or ramfs)\n";
+ std::cout << " --rootfs=arg root filesystem to use (zfs, rofs, ramfs or virtiofs)\n";
std::cout << " --assign-net assign virtio network to the application\n";
std::cout << " --maxnic=arg maximum NIC number\n";
std::cout << " --norandom don't initialize any random device\n";
@@ -435,6 +436,13 @@ void* do_main_thread(void *_main_args)
// TODO: Avoid the hack of using pivot_rootfs() just for mounting
// the fstab entries.
pivot_rootfs("/");
+ } else if (opt_rootfs.compare("virtiofs") == 0) {
+ auto error = mount_virtiofs_rootfs(opt_pivot);
+ if (error) {
+ debug("Could not mount virtiofs root filesystem.\n");
+ }
+
+ boot_time.event("Virtio-fs mounted");
} else {
// Fallback to original behavior for compatibility: try rofs -> zfs
if (mount_rofs_rootfs(opt_pivot) == 0) {
diff --git a/scripts/build b/scripts/build
index 8fa18071..24e930d3 100755
--- a/scripts/build
+++ b/scripts/build
qemu-img convert -f raw -O qcow2 loader.img usr.img ;;
esac
# Prepend the root fs type option to the command line (preserved by run.py)
--
2.28.0

Fotis Xenakis

Sep 3, 2020, 6:30:24 PM
to osv...@googlegroups.com, Fotis Xenakis
This allows exporting an image's files while also doing a complete
build in the same invocation (the two should anyway be orthogonal).

Signed-off-by: Fotis Xenakis <fo...@windowslive.com>
---
scripts/build | 1 -
1 file changed, 1 deletion(-)

diff --git a/scripts/build b/scripts/build
index 24e930d3..fed78fd3 100755
--- a/scripts/build
+++ b/scripts/build

Commit Bot

Sep 9, 2020, 6:20:56 PM
to osv...@googlegroups.com, Fotis Xenakis
From: Fotis Xenakis <fo...@windowslive.com>
Committer: Waldemar Kozaczuk <jwkoz...@gmail.com>
Branch: master

loader: add rootfs type command line option

Previously, the loader always attempted to:
1. Mount a rofs root filesystem.
2. Mount a zfs root filesystem, if the previous failed.
3. Do nothing (i.e. keep the ramfs mounted as bootfs) if the others
failed.

This adds a new command line option to better control this behaviour,
specifying which filesystem should be attempted for mounting root,
falling back to the above if not specified. This option is set by
scripts/build (scripts/run.py would be the obvious choice for setting a
command line option, but it would require the user to specify the root
fs type at run time in addition to build time).

In addition, in a way this formalizes supporting ramfs as the root
filesystem. As noted in the code, this support consists of simply
mounting the fstab entries when ramfs is selected (this is in sync with
what happened previously, when mounting zfs failed). So, there is no
functionality added here, just documenting the option and making the
code more clear and explicit.

Signed-off-by: Fotis Xenakis <fo...@windowslive.com>
Message-Id: <AM0PR03MB62926D4644...@AM0PR03MB6292.eurprd03.prod.outlook.com>

---
diff --git a/fs/vfs/main.cc b/fs/vfs/main.cc
--- a/fs/vfs/main.cc
+++ b/fs/vfs/main.cc
@@ -2310,7 +2310,7 @@ static void mount_fs(mntent *m)
}

extern std::vector<mntent> opt_mount_fs;
-void pivot_rootfs(const char* path)
+extern "C" void pivot_rootfs(const char* path)
{
int ret = sys_pivot_root(path, "/");
if (ret)
diff --git a/loader.cc b/loader.cc

Commit Bot

Sep 9, 2020, 6:20:57 PM
to osv...@googlegroups.com, Fotis Xenakis
From: Fotis Xenakis <fo...@windowslive.com>
Committer: Waldemar Kozaczuk <jwkoz...@gmail.com>
Branch: master

loader: add support for booting from virtio-fs

This makes the necessary and pretty straight-forward additions to the
loader to support using virtio-fs as the root filesystem. It also makes
minimal changes to scripts/build to add support there as well. Note
that, to obtain a directory with contents specified by the manifest
files, usable as the virtio-fs host directory, one can use the existing
'export' and 'export_dir' (previously undocumented) options to
scripts/build.

Ref https://github.com/cloudius-systems/osv/issues/1062.

Signed-off-by: Fotis Xenakis <fo...@windowslive.com>
Message-Id: <AM0PR03MB62921196B1...@AM0PR03MB6292.eurprd03.prod.outlook.com>

---
diff --git a/fs/vfs/main.cc b/fs/vfs/main.cc
--- a/fs/vfs/main.cc
+++ b/fs/vfs/main.cc
@@ -2404,6 +2404,30 @@ extern "C" int mount_zfs_rootfs(bool pivot_root, bool extra_zfs_pools)

Commit Bot

Sep 9, 2020, 6:20:58 PM
to osv...@googlegroups.com, Fotis Xenakis
From: Fotis Xenakis <fo...@windowslive.com>
Committer: Waldemar Kozaczuk <jwkoz...@gmail.com>
Branch: master

scripts/build: don't exit when exporting files

This allows exporting an image's files while also doing a complete
build in the same invocation (the two should anyway be orthogonal).

Signed-off-by: Fotis Xenakis <fo...@windowslive.com>
Message-Id: <AM0PR03MB62929812DF...@AM0PR03MB6292.eurprd03.prod.outlook.com>

---
diff --git a/scripts/build b/scripts/build

Waldek Kozaczuk

Sep 14, 2020, 1:49:50 PM
to Fotis Xenakis, OSv Development
Hey,

Thanks for your patches. I love how you improved build/run export mode to make it easier to run OSv apps from virtio-fs.

On Thu, Sep 3, 2020 at 6:25 PM Fotis Xenakis <fo...@windowslive.com> wrote:
Changes since v1:
- Made the loader backwards compatible: if the --rootfs flag is not
  specified, it falls back to the previous behavior of trying to mount
  rofs, followed by zfs.

On the root filesystem selection front, the thought of generalizing the
current implementation to try out the other fs as well when --rootfs is
not specified has passed through my mind. This could be a basic
auto-discovery process and should be straight-forward as long as the
various mount_*_rootfs are side-effect free when they fail. Do you think
it's something worth pursuing?
That sounds like a good idea. In essence, you would like to enhance the current default pseudo-discovery mode where it first tries rofs and then zfs, right? You would add virtio-fs to it, right? Please note that ZFS is the most expensive in terms of boot time (not sure how it compares to virtio-fs) so I would keep it last. Hopefully, you would not need to spend any time trying to boot from virtio-fs if there is no virtio-fs device present.

Relatedly, do you know if the virtio-fs team has made any changes to make it possible to expose/access virtio-fs without having to run it as a privileged user? I would imagine it raises some security concerns.

Fotis Xenakis (3):
  loader: add rootfs type command line option
  loader: add support for booting from virtio-fs
  scripts/build: don't exit when exporting files

 fs/vfs/main.cc | 26 ++++++++++++++++++-
 loader.cc      | 68 ++++++++++++++++++++++++++++++++++++++++++--------
 scripts/build  | 26 +++++++++++++------
 3 files changed, 101 insertions(+), 19 deletions(-)

--
2.28.0


Fotis Xenakis

Sep 18, 2020, 6:10:48 AM
to OSv Development
On Monday, September 14, 2020 at 7:49:50 PM UTC+2, jwkoz...@gmail.com wrote:
Hey,

Thanks for your patches. I love how you improved build/run export mode to make it easier to run OSv apps from virtio-fs.

On Thu, Sep 3, 2020 at 6:25 PM Fotis Xenakis <fo...@windowslive.com> wrote:
Changes since v1:
- Made the loader backwards compatible: if the --rootfs flag is not
  specified, it falls back to the previous behavior of trying to mount
  rofs, followed by zfs.

On the root filesystem selection front, the thought of generalizing the
current implementation to try out the other fs as well when --rootfs is
not specified has passed through my mind. This could be a basic
auto-discovery process and should be straight-forward as long as the
various mount_*_rootfs are side-effect free when they fail. Do you think
it's something worth pursuing?
That sounds like a good idea. In essence, you would like to enhance the current default pseudo-discovery mode where it first tries rofs and then zfs, right? You would add virtio-fs to it, right? Please note that ZFS is the most expensive in terms of boot time (not sure how it compares to virtio-fs) so I would keep it last. Hopefully, you would not need to spend any time trying to boot from virtio-fs if there is no virtio-fs device present.
Exactly as you describe. After a quick profiling of the mount times (both successful and unsuccessful) for each of the filesystems, I see that:
- ROFS and virtio-fs are pretty quick to mount (successfully): 1.5-2ms on my laptop (<1% of total boot time). On the other hand, ZFS is a lot slower (~65ms, or ~20%).
- When, during auto-discovery, a mount fails, it takes negligible time for ROFS and virtio-fs (0.01-0.05ms) and somewhat more (~1.2ms, ~0.5%) for ZFS.
So, the point is:
- ROFS and virtio-fs are on par regarding mounting performance.
- ZFS is slower, but when it fails it's not too bad.

The most backwards-compatible choice would be to try virtio-fs only after ROFS and ZFS have failed. Yet, this would add a <1% overhead when discovering virtio-fs. The alternative would be to try virtio-fs before ZFS (or before ROFS), exchanging backwards compatibility for negligible overhead. Which do you think is best?

Relatedly, do you know if the virtio-fs team has made any changes to make it possible to expose/access virtio-fs without having to run it as a privileged user? I would imagine it raises some security concerns.
I haven't gotten in touch with them recently regarding this matter, but their qemu development branches haven't been updated lately either and no obvious relevant update has been posted in the virtio-fs mailing list. So, I assume not much has changed, unfortunately.

Waldek Kozaczuk

Sep 18, 2020, 4:30:28 PM
to OSv Development
On Friday, September 18, 2020 at 6:10:48 AM UTC-4 Fotis Xenakis wrote:
On Monday, September 14, 2020 at 7:49:50 PM UTC+2, jwkoz...@gmail.com wrote:
Hey,

Thanks for your patches. I love how you improved build/run export mode to make it easier to run OSv apps from virtio-fs.

On Thu, Sep 3, 2020 at 6:25 PM Fotis Xenakis <fo...@windowslive.com> wrote:
Changes since v1:
- Made the loader backwards compatible: if the --rootfs flag is not
  specified, it falls back to the previous behavior of trying to mount
  rofs, followed by zfs.

On the root filesystem selection front, the thought of generalizing the
current implementation to try out the other fs as well when --rootfs is
not specified has passed through my mind. This could be a basic
auto-discovery process and should be straight-forward as long as the
various mount_*_rootfs are side-effect free when they fail. Do you think
it's something worth pursuing?
That sounds like a good idea. In essence, you would like to enhance the current default pseudo-discovery mode where it first tries rofs and then zfs, right? You would add virtio-fs to it, right? Please note that ZFS is the most expensive in terms of boot time (not sure how it compares to virtio-fs) so I would keep it last. Hopefully, you would not need to spend any time trying to boot from virtio-fs if there is no virtio-fs device present.
Exactly as you describe. After a quick profiling of the mount times (both successful and unsuccessful) for each of the filesystems, I see that:
- ROFS and virtio-fs are pretty quick to mount (successfully): 1.5-2ms on my laptop (<1% of total boot time). On the other hand, ZFS is a lot slower (~65ms, or ~20%).
- When, during auto-discovery, a mount fails, it takes negligible time for ROFS and virtio-fs (0.01-0.05ms) and somewhat more (~1.2ms, ~0.5%) for ZFS.
So, the point is:
- ROFS and virtio-fs are on par regarding mounting performance.
- ZFS is slower, but when it fails it's not too bad.

The most backwards-compatible choice would be to try virtio-fs only after ROFS and ZFS have failed. Yet, this would add a <1% overhead when discovering virtio-fs. The alternative would be to try virtio-fs before ZFS (or before ROFS), exchanging backwards compatibility for negligible overhead. Which do you think is best?
ROFS, then virtio-fs, then ZFS. 

Fotis Xenakis

Mar 28, 2021, 12:30:31 PM
to OSv Development
A patch addressing this has finally been posted.