Philipp Wollermann uploaded this change for review.
Improve sandboxing performance on Linux for actions with many inputs. Fixes #1836. Thanks to Austin Schuh for suggesting the excellent idea of putting the input files of sandboxed actions on /dev/shm, sending a working version to try it out in https://bazel-review.googlesource.com/#/c/6670/ and proving the performance improvements with a benchmark. This version works by: - Creating the SymlinkedExecRoot on an in-memory filesystem (/dev/shm). - Redirecting writes back to disk via overlayfs. This is an experimental feature and must be enabled by passing the flag --experimental_sandbox_shm. It will be enabled by default as soon as we're confident that it works reliably enough and have auto-detection for supported kernel versions. (There should be no harm in trying it out - in the worst case your build will just fail with weird error messages and you turn it off again.) This change also fixes a number of known issues with the sandbox: - Mounts /dev/shm as writable, empty tmpfs in sandbox. (Fix #2056, fix #1973) - Mounts /tmp as writable overlay on same device as the output_base, instead of unconditionally using a tmpfs. (Fix b/32826681) - Removes support for bind-mounting arbitrary paths inside the sandbox. Not sure yet, if this is coming back, but it wasn't exposed via an accessible Bazel flag anyway. RELNOTES: Try the new flag --experimental_sandbox_shm to improve sandboxing performance on Linux a lot. Requires a kernel with overlayfs support (>=3.18). Change-Id: I2db8730e9e3389e56b43666b192f2d55c9eda398 --- M src/main/java/com/google/devtools/build/lib/sandbox/LinuxSandboxRunner.java M src/main/java/com/google/devtools/build/lib/sandbox/LinuxSandboxedStrategy.java M src/main/java/com/google/devtools/build/lib/sandbox/SandboxOptions.java M src/main/java/com/google/devtools/build/lib/sandbox/SymlinkedExecRoot.java M src/main/tools/linux-sandbox-options.cc M src/main/tools/linux-sandbox-options.h M src/main/tools/linux-sandbox-pid1.cc M src/main/tools/linux-sandbox.cc M src/test/shell/bazel/linux-sandbox_test.sh 9 files changed, 205 insertions(+), 228 deletions(-)
diff --git a/src/main/java/com/google/devtools/build/lib/sandbox/LinuxSandboxRunner.java b/src/main/java/com/google/devtools/build/lib/sandbox/LinuxSandboxRunner.java
index da50cda..8b501ce 100644
--- a/src/main/java/com/google/devtools/build/lib/sandbox/LinuxSandboxRunner.java
+++ b/src/main/java/com/google/devtools/build/lib/sandbox/LinuxSandboxRunner.java
@@ -41,11 +41,11 @@
private final Path execRoot;
private final Path sandboxExecRoot;
private final Path sandboxTempDir;
+ private final Path sandboxUnderlayPath;
private final Path argumentsFilePath;
private final Set<Path> writableDirs;
private final Set<Path> inaccessiblePaths;
private final Set<Path> tmpfsPaths;
- private final Set<Path> bindMounts;
private final boolean sandboxDebug;
LinuxSandboxRunner(
@@ -53,21 +53,21 @@
Path sandboxPath,
Path sandboxExecRoot,
Path sandboxTempDir,
+ Path sandboxUnderlayPath,
Set<Path> writableDirs,
Set<Path> inaccessiblePaths,
Set<Path> tmpfsPaths,
- Set<Path> bindMounts,
boolean verboseFailures,
boolean sandboxDebug) {
super(sandboxExecRoot, verboseFailures);
this.execRoot = execRoot;
this.sandboxExecRoot = sandboxExecRoot;
this.sandboxTempDir = sandboxTempDir;
+ this.sandboxUnderlayPath = sandboxUnderlayPath;
this.argumentsFilePath = sandboxPath.getRelative("linux-sandbox.params");
this.writableDirs = writableDirs;
this.inaccessiblePaths = inaccessiblePaths;
this.tmpfsPaths = tmpfsPaths;
- this.bindMounts = bindMounts;
this.sandboxDebug = sandboxDebug;
}
@@ -133,6 +133,11 @@
fileArgs.add("-W");
fileArgs.add(sandboxExecRoot.toString());
+ if (sandboxUnderlayPath != null) {
+ fileArgs.add("-O");
+ fileArgs.add(sandboxUnderlayPath.toString());
+ }
+
// Kill the process after a timeout.
if (timeout != -1) {
fileArgs.add("-T");
@@ -153,11 +158,6 @@
for (Path tmpfsPath : tmpfsPaths) {
fileArgs.add("-e");
fileArgs.add(tmpfsPath.getPathString());
- }
-
- for (Path bindMount : bindMounts) {
- fileArgs.add("-b");
- fileArgs.add(bindMount.getPathString());
}
if (!allowNetwork) {
diff --git a/src/main/java/com/google/devtools/build/lib/sandbox/LinuxSandboxedStrategy.java b/src/main/java/com/google/devtools/build/lib/sandbox/LinuxSandboxedStrategy.java
index b4ed542..5626dfa 100644
--- a/src/main/java/com/google/devtools/build/lib/sandbox/LinuxSandboxedStrategy.java
+++ b/src/main/java/com/google/devtools/build/lib/sandbox/LinuxSandboxedStrategy.java
@@ -14,6 +14,7 @@
package com.google.devtools.build.lib.sandbox;
+import com.google.common.base.Throwables;
import com.google.common.collect.ImmutableSet;
import com.google.devtools.build.lib.actions.ActionExecutionContext;
import com.google.devtools.build.lib.actions.ExecException;
@@ -30,6 +31,8 @@
import com.google.devtools.build.lib.vfs.Path;
import com.google.devtools.build.lib.vfs.PathFragment;
import java.io.IOException;
+import java.nio.file.Files;
+import java.nio.file.Paths;
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.atomic.AtomicInteger;
@@ -107,21 +110,38 @@
// Each invocation of "exec" gets its own sandbox.
Path sandboxPath = SandboxHelpers.getSandboxRoot(blazeDirs, productName, uuid, execCounter);
Path sandboxExecRoot = sandboxPath.getRelative("execroot").getRelative(execRoot.getBaseName());
+
+ // This directory is used by the sandbox to store some temporary files. It is deleted by Bazel
+ // after execution is finished.
Path sandboxTempDir = sandboxPath.getRelative("tmp");
+
+ // This is used to store the SymlinkedExecRoot on an in-memory filesystem, which will then be
+ // mounted onto the real execroot via overlayfs as a "lower" directory, while the real execroot
+ // itself will be used as the "upper" directory (so all writes go to it).
+ Path sandboxUnderlayPath = null;
+ if (sandboxOptions.experimentalSandboxShm && fullySupported) {
+ try {
+ sandboxUnderlayPath = blazeDirs.getFileSystem().getPath(Files.createTempDirectory(Paths.get("/dev/shm"), productName + "-sandbox-").toString());
+ } catch (IOException e) {
+ throw new UserExecException("Using --experimental_sandbox_shm, but could not create a "
+ + "sandbox directory in /dev/shm", e);
+ }
+ }
Set<Path> writableDirs = getWritableDirs(sandboxExecRoot, spawn.getEnvironment());
- SymlinkedExecRoot symlinkedExecRoot = new SymlinkedExecRoot(sandboxExecRoot);
+ SymlinkedExecRoot symlinkedExecRoot = new SymlinkedExecRoot(sandboxUnderlayPath != null ? sandboxUnderlayPath : sandboxExecRoot, sandboxExecRoot);
ImmutableSet<PathFragment> outputs = SandboxHelpers.getOutputFiles(spawn);
try {
+ FileSystemUtils.createDirectoryAndParents(sandboxExecRoot);
symlinkedExecRoot.createFileSystem(
getMounts(spawn, actionExecutionContext), outputs, writableDirs);
sandboxTempDir.createDirectory();
} catch (IOException e) {
- throw new UserExecException("I/O error during sandboxed execution", e);
+ throw new UserExecException("I/O error during sandboxed execution: " + Throwables.getStackTraceAsString(e));
}
- SandboxRunner runner = getSandboxRunner(spawn, sandboxPath, sandboxExecRoot, sandboxTempDir);
+ SandboxRunner runner = getSandboxRunner(spawn, sandboxPath, sandboxExecRoot, sandboxTempDir, sandboxUnderlayPath);
try {
runSpawn(
spawn,
@@ -134,6 +154,9 @@
} finally {
if (!sandboxOptions.sandboxDebug) {
try {
+ if (sandboxUnderlayPath != null) {
+ FileSystemUtils.deleteTree(sandboxUnderlayPath);
+ }
FileSystemUtils.deleteTree(sandboxPath);
} catch (IOException e) {
executor
@@ -149,17 +172,18 @@
}
private SandboxRunner getSandboxRunner(
- Spawn spawn, Path sandboxPath, Path sandboxExecRoot, Path sandboxTempDir) {
+ Spawn spawn, Path sandboxPath, Path sandboxExecRoot, Path sandboxTempDir,
+ Path sandboxUnderlayPath) {
if (fullySupported) {
return new LinuxSandboxRunner(
execRoot,
sandboxPath,
sandboxExecRoot,
sandboxTempDir,
+ sandboxUnderlayPath,
getWritableDirs(sandboxExecRoot, spawn.getEnvironment()),
getInaccessiblePaths(),
getTmpfsPaths(),
- getBindMounts(blazeDirs),
verboseFailures,
sandboxOptions.sandboxDebug);
} else {
@@ -170,20 +194,12 @@
private ImmutableSet<Path> getTmpfsPaths() {
ImmutableSet.Builder<Path> tmpfsPaths = ImmutableSet.builder();
for (String tmpfsPath : sandboxOptions.sandboxTmpfsPath) {
+ // These are now mounted automatically as tmpfs in the sandbox.
+ if (tmpfsPath.equals("/dev/shm") || tmpfsPath.equals("/run/shm")) {
+ continue;
+ }
tmpfsPaths.add(blazeDirs.getFileSystem().getPath(tmpfsPath));
}
return tmpfsPaths.build();
- }
-
- private ImmutableSet<Path> getBindMounts(BlazeDirectories blazeDirs) {
- Path tmpPath = blazeDirs.getFileSystem().getPath("/tmp");
- ImmutableSet.Builder<Path> bindMounts = ImmutableSet.builder();
- if (blazeDirs.getWorkspace().startsWith(tmpPath)) {
- bindMounts.add(blazeDirs.getWorkspace());
- }
- if (blazeDirs.getOutputBase().startsWith(tmpPath)) {
- bindMounts.add(blazeDirs.getOutputBase());
- }
- return bindMounts.build();
}
}
diff --git a/src/main/java/com/google/devtools/build/lib/sandbox/SandboxOptions.java b/src/main/java/com/google/devtools/build/lib/sandbox/SandboxOptions.java
index 909e07c..e7e0e37 100644
--- a/src/main/java/com/google/devtools/build/lib/sandbox/SandboxOptions.java
+++ b/src/main/java/com/google/devtools/build/lib/sandbox/SandboxOptions.java
@@ -59,4 +59,15 @@
+ " (if supported by the sandboxing implementation, ignored otherwise)."
)
public List<String> sandboxTmpfsPath;
+
+ @Option(
+ name = "experimental_sandbox_shm",
+ defaultValue = "false",
+ category = "config",
+ help =
+ "Linux only. Create all the symlinks and directories for the sandbox in /dev/shm. "
+ + "This makes symlink creation significantly faster at the cost of more RAM usage. "
+ + "Requires a kernel that supports overlayfs."
+ )
+ public boolean experimentalSandboxShm;
}
diff --git a/src/main/java/com/google/devtools/build/lib/sandbox/SymlinkedExecRoot.java b/src/main/java/com/google/devtools/build/lib/sandbox/SymlinkedExecRoot.java
index bc60eae..48ea134 100644
--- a/src/main/java/com/google/devtools/build/lib/sandbox/SymlinkedExecRoot.java
+++ b/src/main/java/com/google/devtools/build/lib/sandbox/SymlinkedExecRoot.java
@@ -35,9 +35,24 @@
public final class SymlinkedExecRoot implements SandboxExecRoot {
private final Path sandboxExecRoot;
+ private final Path sandboxOutputRoot;
- public SymlinkedExecRoot(Path sandboxExecRoot) {
+ /**
+ * @param sandboxExecRoot directory where the symlinks for inputs and directories for outputs
+ * should be created in.
+ * @param sandboxOutputRoot directory where the output artifacts are to be copied from.
+ */
+ public SymlinkedExecRoot(Path sandboxExecRoot, Path sandboxOutputRoot) {
this.sandboxExecRoot = sandboxExecRoot;
+ this.sandboxOutputRoot = sandboxOutputRoot;
+ }
+
+ /**
+ * @param sandboxExecRoot directory where the symlinks for inputs and directories for outputs
+ * should be created in and the outputs copied from.
+ */
+ public SymlinkedExecRoot(Path sandboxExecRoot) {
+ this(sandboxExecRoot, sandboxExecRoot);
}
@Override
@@ -116,7 +131,8 @@
private void createWritableDirectories(Set<Path> createdDirs, Set<Path> writableDirs)
throws IOException {
for (Path writablePath : writableDirs) {
- if (writablePath.startsWith(sandboxExecRoot)) {
+ if (writablePath.startsWith(sandboxExecRoot)
+ || writablePath.startsWith(sandboxOutputRoot)) {
FileSystemUtils.createDirectoryAndParentsWithCache(createdDirs, writablePath);
}
}
@@ -135,7 +151,7 @@
@Override
public void copyOutputs(Path execRoot, Collection<PathFragment> outputs) throws IOException {
for (PathFragment output : outputs) {
- Path source = sandboxExecRoot.getRelative(output);
+ Path source = sandboxOutputRoot.getRelative(output);
Path target = execRoot.getRelative(output);
if (source.isFile() || source.isSymbolicLink()) {
Files.move(source.getPathFile(), target.getPathFile());
diff --git a/src/main/tools/linux-sandbox-options.cc b/src/main/tools/linux-sandbox-options.cc
index 49e6eaf..bb9edab 100644
--- a/src/main/tools/linux-sandbox-options.cc
+++ b/src/main/tools/linux-sandbox-options.cc
@@ -114,34 +114,46 @@
int c;
while ((c = getopt(args->size(), args->data(),
- ":CS:W:T:t:l:L:w:i:e:b:NRD")) != -1) {
+ ":CS:W:O:T:t:l:L:w:i:e:NRD")) != -1) {
switch (c) {
case 'C':
// Shortcut for the "does this system support sandboxing" check.
exit(CheckNamespacesSupported());
break;
case 'S':
- if (opt.sandbox_root_dir == NULL) {
- if (optarg[0] != '/') {
+ if (opt.sandbox_dir.empty()) {
+ if (strlen(optarg) == 0 || optarg[0] != '/') {
Usage(args->front(),
- "The -r option must be used with absolute paths only.");
+ "The -S option must be used with non-empty, absolute paths only.");
}
- opt.sandbox_root_dir = strdup(optarg);
+ opt.sandbox_dir.assign(optarg);
} else {
Usage(args->front(),
- "Multiple root directories (-r) specified, expected one.");
+ "Multiple sandbox directories (-S) specified, expected one.");
}
break;
case 'W':
- if (opt.working_dir == NULL) {
- if (optarg[0] != '/') {
+ if (opt.working_dir.empty()) {
+ if (strlen(optarg) == 0 || optarg[0] != '/') {
Usage(args->front(),
- "The -W option must be used with absolute paths only.");
+ "The -W option must be used with non-empty, absolute paths only.");
}
- opt.working_dir = strdup(optarg);
+ opt.working_dir.assign(optarg);
} else {
Usage(args->front(),
"Multiple working directories (-W) specified, expected one.");
+ }
+ break;
+ case 'O':
+ if (opt.lower_working_dir.empty()) {
+ if (strlen(optarg) == 0 || optarg[0] != '/') {
+ Usage(args->front(),
+ "The -O option must be used with non-empty, absolute paths only.");
+ }
+ opt.lower_working_dir.assign(optarg);
+ } else {
+ Usage(args->front(),
+ "Multiple working directory overlay directories (-O) specified, expected one.");
}
break;
case 'T':
@@ -192,13 +204,6 @@
"The -e option must be used with absolute paths only.");
}
opt.tmpfs_dirs.push_back(strdup(optarg));
- break;
- case 'b':
- if (optarg[0] != '/') {
- Usage(args->front(),
- "The -b option must be used with absolute paths only.");
- }
- opt.bind_mounts.push_back(strdup(optarg));
break;
case 'N':
opt.create_netns = true;
@@ -281,9 +286,17 @@
Usage(args.front(), "No command specified.");
}
- opt.tmpfs_dirs.push_back("/tmp");
+ if (opt.sandbox_dir.empty()) {
+ Usage(args.front(), "No sandbox directory specified (-S)");
+ }
- if (opt.working_dir == NULL) {
+ if (opt.working_dir.empty()) {
opt.working_dir = getcwd(NULL, 0);
}
+
+ if (access("/dev/shm", R_OK | W_OK | X_OK) < 0) {
+ PRINT_DEBUG("warning: /dev/shm does not exist, not mounting tmpfs there");
+ } else {
+ opt.tmpfs_dirs.push_back("/dev/shm");
+ }
}
diff --git a/src/main/tools/linux-sandbox-options.h b/src/main/tools/linux-sandbox-options.h
index 5554f9e..97aca54 100644
--- a/src/main/tools/linux-sandbox-options.h
+++ b/src/main/tools/linux-sandbox-options.h
@@ -18,14 +18,17 @@
#include <stdbool.h>
#include <stddef.h>
+#include <string>
#include <vector>
// Options parsing result.
struct Options {
- // Temporary root directory (-S)
- const char *sandbox_root_dir;
+ // Temporary directory that will be deleted by our parent process after execution (-S)
+ std::string sandbox_dir;
// Working directory (-W)
- const char *working_dir;
+ std::string working_dir;
+ // Lower directory to mount into the working_dir via overlayfs. (-O)
+ std::string lower_working_dir;
// How long to wait before killing the child (-T)
int timeout_secs;
// How long to wait before sending SIGKILL in case of timeout (-t)
@@ -40,8 +43,6 @@
std::vector<const char *> inaccessible_files;
// Directories where to mount an empty tmpfs (-e)
std::vector<const char *> tmpfs_dirs;
- // Files or directories to explicitly bind mount into the sandbox (-b)
- std::vector<const char *> bind_mounts;
// Create a new network namespace (-N)
bool create_netns;
// Pretend to be root inside the namespace (-R)
diff --git a/src/main/tools/linux-sandbox-pid1.cc b/src/main/tools/linux-sandbox-pid1.cc
index 5f0482e..8bdf092 100644
--- a/src/main/tools/linux-sandbox-pid1.cc
+++ b/src/main/tools/linux-sandbox-pid1.cc
@@ -54,9 +54,12 @@
#include <sys/wait.h>
#include <unistd.h>
+#include <sstream>
+#include <string>
+
static int global_child_pid;
-static char global_inaccessible_directory[] = "tmp/empty.XXXXXX";
-static char global_inaccessible_file[] = "tmp/empty.XXXXXX";
+static char global_inaccessible_directory[] = "/tmp/empty.XXXXXX";
+static char global_inaccessible_file[] = "/tmp/empty.XXXXXX";
static void SetupSelfDestruction(int *sync_pipe) {
// We could also poll() on the pipe fd to find out when the parent goes away,
@@ -167,107 +170,23 @@
}
}
-static bool IsDirectory(const char *path) {
- struct stat sb;
- if (stat(path, &sb) < 0) {
- DIE("stat(%s)", path);
- }
- return S_ISDIR(sb.st_mode);
-}
-
-// Recursively creates the file or directory specified in "path" and its parent
-// directories.
-static int CreateTarget(const char *path, bool is_directory) {
- PRINT_DEBUG("CreateTarget(%s, %s)", path, is_directory ? "true" : "false");
- if (path == NULL) {
- errno = EINVAL;
- return -1;
- }
-
- struct stat sb;
- // If the path already exists...
- if (stat(path, &sb) == 0) {
- if (is_directory && S_ISDIR(sb.st_mode)) {
- // and it's a directory and supposed to be a directory, we're done here.
- return 0;
- } else if (!is_directory && S_ISREG(sb.st_mode)) {
- // and it's a regular file and supposed to be one, we're done here.
- return 0;
- } else {
- // otherwise something is really wrong.
- errno = is_directory ? ENOTDIR : EEXIST;
- return -1;
- }
- } else {
- // If stat failed because of any error other than "the path does not exist",
- // this is an error.
- if (errno != ENOENT) {
- return -1;
- }
- }
-
- // Create the parent directory.
- if (CreateTarget(dirname(strdupa(path)), true) < 0) {
- DIE("CreateTarget(%s, true)", dirname(strdupa(path)));
- }
-
- if (is_directory) {
- if (mkdir(path, 0755) < 0) {
- DIE("mkdir(%s, 0755)", path);
- }
- } else {
- int handle;
- if ((handle = open(path, O_CREAT | O_WRONLY | O_EXCL, 0666)) < 0) {
- DIE("open(%s, O_CREAT | O_WRONLY | O_EXCL, 0666)", path);
- }
- if (close(handle) < 0) {
- DIE("close(%d)", handle);
- }
- }
-
- return 0;
-}
-
static void MountFilesystems() {
- if (mount("/", opt.sandbox_root_dir, NULL, MS_BIND | MS_REC, NULL) < 0) {
- DIE("mount(/, %s, NULL, MS_BIND | MS_REC, NULL)", opt.sandbox_root_dir);
+ // Make sure that our temporary and our working directory are mount points. The easiest way to
+ // do this is by bind-mounting them upon themselves.
+ PRINT_DEBUG("sandbox dir: %s", opt.sandbox_dir.c_str());
+ if (mount(opt.sandbox_dir.c_str(), opt.sandbox_dir.c_str(), NULL, MS_BIND, NULL) < 0) {
+ DIE("mount(%s, %s, NULL, MS_BIND, NULL)", opt.sandbox_dir.c_str(), opt.sandbox_dir.c_str());
}
- if (chdir(opt.sandbox_root_dir) < 0) {
- DIE("chdir(%s)", opt.sandbox_root_dir);
- }
-
- for (const char *tmpfs_dir : opt.tmpfs_dirs) {
- PRINT_DEBUG("tmpfs: %s", tmpfs_dir);
- if (mount("tmpfs", tmpfs_dir + 1, "tmpfs",
- MS_NOSUID | MS_NODEV | MS_NOATIME, NULL) < 0) {
- DIE("mount(tmpfs, %s, tmpfs, MS_NOSUID | MS_NODEV | MS_NOATIME, NULL)",
- tmpfs_dir + 1);
- }
- }
-
- // Make sure that our working directory is a mount point. The easiest way to
- // do this is by bind-mounting it upon itself.
- PRINT_DEBUG("working dir: %s", opt.working_dir);
- CreateTarget(opt.working_dir + 1, true);
- if (mount(opt.working_dir, opt.working_dir + 1, NULL, MS_BIND, NULL) < 0) {
- DIE("mount(%s, %s, NULL, MS_BIND, NULL)", opt.working_dir,
- opt.working_dir + 1);
- }
-
- for (const char *bind_mount : opt.bind_mounts) {
- PRINT_DEBUG("bind mount: %s", bind_mount);
- CreateTarget(bind_mount + 1, IsDirectory(bind_mount));
- if (mount(bind_mount, bind_mount + 1, NULL, MS_BIND, NULL) < 0) {
- DIE("mount(%s, %s, NULL, MS_BIND, NULL)", bind_mount, bind_mount + 1);
- }
+ PRINT_DEBUG("working dir: %s", opt.working_dir.c_str());
+ if (mount(opt.working_dir.c_str(), opt.working_dir.c_str(), NULL, MS_BIND, NULL) < 0) {
+ DIE("mount(%s, %s, NULL, MS_BIND, NULL)", opt.working_dir.c_str(), opt.working_dir.c_str());
}
for (const char *writable_file : opt.writable_files) {
PRINT_DEBUG("writable: %s", writable_file);
- if (mount(writable_file, writable_file + 1, NULL, MS_BIND, NULL) < 0) {
- DIE("mount(%s, %s, NULL, MS_BIND, NULL)", writable_file,
- writable_file + 1);
+ if (mount(writable_file, writable_file, NULL, MS_BIND, NULL) < 0) {
+ DIE("mount(%s, %s, NULL, MS_BIND, NULL)", writable_file, writable_file);
}
}
@@ -281,17 +200,13 @@
if (S_ISDIR(sb.st_mode)) {
PRINT_DEBUG("inaccessible dir: %s", inaccessible_file);
- if (mount(global_inaccessible_directory, inaccessible_file + 1, NULL,
- MS_BIND, NULL) < 0) {
- DIE("mount(%s, %s, NULL, MS_BIND, NULL)", global_inaccessible_directory,
- inaccessible_file + 1);
+ if (mount(global_inaccessible_directory, inaccessible_file, NULL, MS_BIND, NULL) < 0) {
+ DIE("mount(%s, %s, NULL, MS_BIND, NULL)", global_inaccessible_directory, inaccessible_file);
}
} else {
PRINT_DEBUG("inaccessible file: %s", inaccessible_file);
- if (mount(global_inaccessible_file, inaccessible_file + 1, NULL, MS_BIND,
- NULL) < 0) {
- DIE("mount(%s, %s, NULL, MS_BIND, NULL", global_inaccessible_file,
- inaccessible_file + 1);
+ if (mount(global_inaccessible_file, inaccessible_file, NULL, MS_BIND, NULL) < 0) {
+ DIE("mount(%s, %s, NULL, MS_BIND, NULL", global_inaccessible_file, inaccessible_file);
}
}
}
@@ -300,9 +215,11 @@
// We later remount everything read-only, except the paths for which this method
// returns true.
static bool ShouldBeWritable(char *mnt_dir) {
- mnt_dir += strlen(opt.sandbox_root_dir);
+ if (strcmp(mnt_dir, opt.sandbox_dir.c_str()) == 0) {
+ return true;
+ }
- if (strcmp(mnt_dir, opt.working_dir) == 0) {
+ if (strcmp(mnt_dir, opt.working_dir.c_str()) == 0) {
return true;
}
@@ -312,21 +229,14 @@
}
}
- for (const char *tmpfs_dir : opt.tmpfs_dirs) {
- if (strcmp(mnt_dir, tmpfs_dir) == 0) {
- return true;
- }
+ // Some system directories have to be writable for the sandbox to work
+ // correctly.
+ if (strcmp(mnt_dir, "/dev/shm") == 0
+ || strcmp(mnt_dir, "/run/shm") == 0
+ || strcmp(mnt_dir, "/tmp") == 0) {
+ return true;
}
- return false;
-}
-
-static bool IsUnderTmpDir(const char *mnt_dir) {
- for (const char *tmpfs_dir : opt.tmpfs_dirs) {
- if (strstr(mnt_dir, tmpfs_dir) == mnt_dir) {
- return true;
- }
- }
return false;
}
@@ -340,17 +250,6 @@
struct mntent *ent;
while ((ent = getmntent(mounts)) != NULL) {
- // Skip mounts that do not belong to our sandbox.
- if (strstr(ent->mnt_dir, opt.sandbox_root_dir) != ent->mnt_dir) {
- continue;
- }
- // Skip mounts that are under tmpfs directories because we've already
- // replaced such directories with new tmpfs instances.
- // mount() would fail with ENOENT if we tried to remount such mount points.
- if (IsUnderTmpDir(ent->mnt_dir + strlen(opt.sandbox_root_dir))) {
- continue;
- }
-
int mountFlags = MS_BIND | MS_REMOUNT;
// MS_REMOUNT does not allow us to change certain flags. This means, we have
@@ -400,10 +299,68 @@
endmntent(mounts);
}
+static void MountTmpfsFilesystems() {
+ for (const char *tmpfs_dir : opt.tmpfs_dirs) {
+ PRINT_DEBUG("mounting tmpfs: %s", tmpfs_dir);
+ if (mount("tmpfs", tmpfs_dir, "tmpfs", MS_NOSUID | MS_NODEV | MS_NOATIME, NULL) < 0) {
+ DIE("mount(tmpfs, %s, tmpfs, MS_NOSUID | MS_NODEV | MS_NOATIME, NULL)", tmpfs_dir);
+ }
+ }
+}
+
+static void MountTmpOverlay() {
+ std::string overlay_upper = opt.sandbox_dir + "/tmp_overlay_upper_dir";
+ if (mkdir(overlay_upper.c_str(), 0700) < 0) {
+ DIE("mkdir(%s, 0700)", overlay_upper.c_str());
+ }
+
+ std::string overlay_work = opt.sandbox_dir + "/tmp_overlay_work_dir";
+ if (mkdir(overlay_work.c_str(), 0700) < 0) {
+ DIE("mkdir(%s, 0700)", overlay_work.c_str());
+ }
+
+ std::ostringstream mount_opts;
+ mount_opts << "lowerdir=/tmp,upperdir=" << overlay_upper << ",workdir=" << overlay_work;
+
+ PRINT_DEBUG("mounting overlayfs: /tmp (%s)", mount_opts.str().c_str());
+ if (mount("overlay", "/tmp", "overlay", 0, mount_opts.str().c_str()) < 0) {
+ PRINT_DEBUG("warning: cannot mount overlayfs on /tmp");
+ }
+}
+
+static void MountWorkingDirOverlay() {
+ if (opt.lower_working_dir.empty()) {
+ return;
+ }
+
+ std::string overlay_work = opt.sandbox_dir + "/workingdir_overlay_work_dir";
+ if (mkdir(overlay_work.c_str(), 0700) < 0) {
+ DIE("mkdir(%s, 0700)", overlay_work.c_str());
+ }
+
+ std::ostringstream mount_opts;
+ mount_opts << "lowerdir=" << opt.lower_working_dir << ",upperdir=" << opt.working_dir << ",workdir=" << overlay_work;
+
+ if (access(opt.lower_working_dir.c_str(), R_OK | X_OK) < 0) {
+ DIE("access");
+ }
+ if (access(opt.working_dir.c_str(), R_OK | W_OK | X_OK) < 0) {
+ DIE("access");
+ }
+ if (access(overlay_work.c_str(), R_OK | W_OK | X_OK) < 0) {
+ DIE("access");
+ }
+
+ PRINT_DEBUG("mounting overlayfs: %s (%s)", opt.working_dir.c_str(), mount_opts.str().c_str());
+ if (mount("overlay", opt.working_dir.c_str(), "overlay", 0, mount_opts.str().c_str()) < 0) {
+ DIE("mount");
+ }
+}
+
static void MountProc() {
// Mount a new proc on top of the old one, because the old one still refers to
// our parent PID namespace.
- if (mount("proc", "proc", "proc", MS_NODEV | MS_NOEXEC | MS_NOSUID, NULL) <
+ if (mount("proc", "/proc", "proc", MS_NODEV | MS_NOEXEC | MS_NOSUID, NULL) <
0) {
DIE("mount");
}
@@ -441,31 +398,8 @@
}
static void EnterSandbox() {
- // Move the real root to old_root, then detach it.
- char old_root[] = "tmp/old-root-XXXXXX";
- if (mkdtemp(old_root) == NULL) {
- DIE("mkdtemp(%s)", old_root);
- }
-
- // pivot_root has no wrapper in libc, so we need syscall()
- if (syscall(SYS_pivot_root, ".", old_root) < 0) {
- DIE("pivot_root(., %s)", old_root);
- }
-
- if (chroot(".") < 0) {
- DIE("chroot(.)");
- }
-
- if (umount2(old_root, MNT_DETACH) < 0) {
- DIE("umount2(%s, MNT_DETACH)", old_root);
- }
-
- if (rmdir(old_root) < 0) {
- DIE("rmdir(%s)", old_root);
- }
-
- if (chdir(opt.working_dir) < 0) {
- DIE("chdir(%s)", opt.working_dir);
+ if (chdir(opt.working_dir.c_str()) < 0) {
+ DIE("chdir(%s)", opt.working_dir.c_str());
}
}
@@ -636,6 +570,9 @@
SetupMountNamespace();
SetupUserNamespace();
SetupUtsNamespace();
+ MountWorkingDirOverlay();
+ MountTmpfsFilesystems();
+ MountTmpOverlay();
MountFilesystems();
MakeFilesystemMostlyReadOnly();
MountProc();
diff --git a/src/main/tools/linux-sandbox.cc b/src/main/tools/linux-sandbox.cc
index 01e01fd..6e91125 100644
--- a/src/main/tools/linux-sandbox.cc
+++ b/src/main/tools/linux-sandbox.cc
@@ -73,7 +73,6 @@
int global_outer_uid;
int global_outer_gid;
-static char global_sandbox_root[] = "/tmp/sandbox.XXXXXX";
static int global_child_pid;
// The signal that will be sent to the child when a timeout occurs.
@@ -116,22 +115,6 @@
if (closedir(fds) < 0) {
DIE("closedir");
- }
-}
-
-static void RemoveSandboxRoot() {
- if (rmdir(global_sandbox_root) < 0) {
- DIE("rmdir(%s)", global_sandbox_root);
- }
-}
-
-static void SetupSandboxRoot() {
- if (opt.sandbox_root_dir == NULL) {
- if (mkdtemp(global_sandbox_root) == NULL) {
- DIE("mkdtemp(%s)", global_sandbox_root);
- }
- atexit(RemoveSandboxRoot);
- opt.sandbox_root_dir = global_sandbox_root;
}
}
@@ -281,8 +264,6 @@
// Make sure the sandboxed process does not inherit any accidentally left open
// file handles from our parent.
CloseFds();
-
- SetupSandboxRoot();
HandleSignal(SIGALRM, OnTimeout);
if (opt.timeout_secs > 0) {
diff --git a/src/test/shell/bazel/linux-sandbox_test.sh b/src/test/shell/bazel/linux-sandbox_test.sh
index 549f84f..ff1595f 100755
--- a/src/test/shell/bazel/linux-sandbox_test.sh
+++ b/src/test/shell/bazel/linux-sandbox_test.sh
@@ -28,12 +28,14 @@
readonly OUT="${OUT_DIR}/outfile"
readonly ERR="${OUT_DIR}/errfile"
readonly SANDBOX_DIR="${OUT_DIR}/sandbox"
+readonly SANDBOX_WORKING_DIR="${OUT_DIR}/work"
-SANDBOX_DEFAULT_OPTS="-W $SANDBOX_DIR"
+SANDBOX_DEFAULT_OPTS="-S $SANDBOX_DIR -W $SANDBOX_WORKING_DIR"
function set_up {
rm -rf $OUT_DIR
mkdir -p $SANDBOX_DIR
+ mkdir -p $SANDBOX_WORKING_DIR
}
function test_basic_functionality() {
To view, visit this change. To unsubscribe, visit settings.
Philipp Wollermann uploaded patch set #2 to this change.
Improve sandboxing performance on Linux for actions with many inputs. Fixes #1836. Thanks to Austin Schuh for suggesting the excellent idea of putting the input files of sandboxed actions on /dev/shm, sending a working version to try it out in https://bazel-review.googlesource.com/#/c/6670/
and proving the performance improvements with a benchmark. This version works by: - Creating the SymlinkedExecRoot on an in-memory filesystem. - Redirecting writes back to disk via overlayfs. This is an experimental feature and must be enabled by passing the flag --experimental_sandbox_shm. It will be enabled by default as soon as we're confident that it works reliably enough and have auto-detection for supported kernel versions. (There should be no harm in trying it out - in the worst case your build will just fail with weird error messages and you turn it off again.) This change also fixes a number of known issues with the sandbox: - Mounts /dev/shm as writable, empty tmpfs in sandbox. (Fix #2056, fix #1973) - Mounts /tmp as writable overlay on same device as the output_base, instead of unconditionally using a tmpfs. (Fix b/32826681) - Removes support for bind-mounting arbitrary paths inside the sandbox. Not sure yet, if this is coming back, but it wasn't exposed via an accessible Bazel flag anyway. RELNOTES: Try the new flag --experimental_sandbox_shm to improve sandboxing performance on Linux a lot. Requires a kernel with overlayfs support (>=3.18). Change-Id: I2db8730e9e3389e56b43666b192f2d55c9eda398 --- M src/main/java/com/google/devtools/build/lib/sandbox/LinuxSandboxRunner.java M src/main/java/com/google/devtools/build/lib/sandbox/LinuxSandboxedStrategy.java M src/main/java/com/google/devtools/build/lib/sandbox/SandboxOptions.java M src/main/java/com/google/devtools/build/lib/sandbox/SymlinkedExecRoot.java M src/main/tools/linux-sandbox-options.cc M src/main/tools/linux-sandbox-options.h M src/main/tools/linux-sandbox-pid1.cc M src/main/tools/linux-sandbox.cc M src/test/shell/bazel/linux-sandbox_test.sh 9 files changed, 205 insertions(+), 228 deletions(-)
To view, visit this change. To unsubscribe, visit settings.
Philipp Wollermann uploaded patch set #3 to this change.
Improve sandboxing performance on Linux for actions with many inputs. Fixes #1836. Thanks to Austin Schuh for suggesting the excellent idea of putting the input files of sandboxed actions on /dev/shm, sending a working version to try it out in https://bazel-review.googlesource.com/#/c/6670/
and proving the performance improvements with a benchmark. This version works by: - Creating the SymlinkedExecRoot on an in-memory filesystem. - Redirecting writes back to disk via overlayfs. This is an experimental feature and must be enabled by passing the flag --experimental_sandbox_shm. It will be enabled by default as soon as we're confident that it works reliably enough and have auto-detection for supported kernel versions. (There should be no harm in trying it out - in the worst case your build will just fail with weird error messages and you turn it off again.) This change also fixes a number of known issues with the sandbox: - Mounts /dev/shm as writable, empty tmpfs in sandbox. (Fix #2056, fix #1973) - Mounts /tmp as writable overlay on same device as the output_base, instead of unconditionally using a tmpfs. (Fix b/32826681) - Removes support for bind-mounting arbitrary paths inside the sandbox. Not sure yet, if this is coming back, but it wasn't exposed via an accessible Bazel flag anyway. RELNOTES: Try the new flag --experimental_sandbox_shm to improve sandboxing performance on Linux a lot. Requires a kernel with overlayfs support (>=3.18). Change-Id: I2db8730e9e3389e56b43666b192f2d55c9eda398 --- M src/main/java/com/google/devtools/build/lib/sandbox/LinuxSandboxRunner.java M src/main/java/com/google/devtools/build/lib/sandbox/LinuxSandboxedStrategy.java M src/main/java/com/google/devtools/build/lib/sandbox/SandboxOptions.java M src/main/java/com/google/devtools/build/lib/sandbox/SymlinkedExecRoot.java M src/main/tools/linux-sandbox-options.cc M src/main/tools/linux-sandbox-options.h M src/main/tools/linux-sandbox-pid1.cc M src/main/tools/linux-sandbox.cc M src/test/shell/bazel/linux-sandbox_test.sh 9 files changed, 234 insertions(+), 225 deletions(-)
To view, visit this change. To unsubscribe, visit settings.
Philipp Wollermann uploaded patch set #4 to this change.
Improve sandboxing performance on Linux for actions with many inputs. Fixes #1836. Thanks to Austin Schuh for suggesting the excellent idea of putting the input files of sandboxed actions on /dev/shm, sending a working version to try it out in https://bazel-review.googlesource.com/#/c/6670/
and proving the performance improvements with a benchmark. This version works by: - Creating the SymlinkedExecRoot on an in-memory filesystem. - Redirecting writes back to disk via overlayfs. This is an experimental feature and must be enabled by passing the flag --experimental_sandbox_shm. It will be enabled by default as soon as we're confident that it works reliably enough and have auto-detection for supported kernel versions. (There should be no harm in trying it out - in the worst case your build will just fail with weird error messages and you turn it off again.) This change also fixes a number of known issues with the sandbox: - Mounts /dev/shm as writable, empty tmpfs in sandbox. (Fix #2056, fix #1973) - Mounts /tmp as writable overlay on same device as the output_base, instead of unconditionally using a tmpfs. (Fix b/32826681) - Removes support for bind-mounting arbitrary paths inside the sandbox. Not sure yet, if this is coming back, but it wasn't exposed via an accessible Bazel flag anyway. RELNOTES: Try the new flag --experimental_sandbox_shm to improve sandboxing performance on Linux a lot. Requires a kernel with overlayfs support (>=3.18). Change-Id: I2db8730e9e3389e56b43666b192f2d55c9eda398 --- M src/main/java/com/google/devtools/build/lib/sandbox/LinuxSandboxRunner.java M src/main/java/com/google/devtools/build/lib/sandbox/LinuxSandboxedStrategy.java M src/main/java/com/google/devtools/build/lib/sandbox/SandboxOptions.java M src/main/java/com/google/devtools/build/lib/sandbox/SymlinkedExecRoot.java M src/main/tools/linux-sandbox-options.cc M src/main/tools/linux-sandbox-options.h M src/main/tools/linux-sandbox-pid1.cc M src/main/tools/linux-sandbox.cc M src/test/shell/bazel/linux-sandbox_test.sh 9 files changed, 221 insertions(+), 226 deletions(-)
To view, visit this change. To unsubscribe, visit settings.
Philipp Wollermann posted comments on this change.
To view, visit this change. To unsubscribe, visit settings.
Austin Schuh posted comments on this change.
To view, visit this change. To unsubscribe, visit settings.
Austin Schuh posted comments on this change.
To view, visit this change. To unsubscribe, visit settings.
Austin Schuh posted comments on this change.
To view, visit this change. To unsubscribe, visit settings.
Philipp Wollermann posted comments on this change.
To view, visit this change. To unsubscribe, visit settings.
Philipp Wollermann posted comments on this change.
To view, visit this change. To unsubscribe, visit settings.
Philipp Wollermann posted comments on this change.
To view, visit this change. To unsubscribe, visit settings.
Austin Schuh posted comments on this change.
To view, visit this change. To unsubscribe, visit settings.
Austin Schuh posted comments on this change.
To view, visit this change. To unsubscribe, visit settings.
Austin Schuh posted comments on this change.
To view, visit this change. To unsubscribe, visit settings.
Philipp Wollermann posted comments on this change.
To view, visit this change. To unsubscribe, visit settings.
Austin Schuh posted comments on this change.
To view, visit this change. To unsubscribe, visit settings.
Philipp Wollermann posted comments on this change.
Patch Set 4:
I'm currently working on a new version of your idea without overlayfs. It will take me some time, but be assured that I won't rest until this is done :) Feel free to ping if you'd like an update any time.
To view, visit this change. To unsubscribe, visit settings.
Austin Schuh posted comments on this change.
Patch Set 4:
I'm currently working on a new version of your idea without overlayfs. It will take me some time, but be assured that I won't rest until this is done :) Feel free to ping if you'd like an update any time.
Thanks! How are things looking?
To view, visit this change. To unsubscribe, visit settings.
Philipp Wollermann posted comments on this change.
Patch Set 4:
Thanks! How are things looking?
Today is my last day before my long vacation. I'll be back on January 30th and then get back to this again.
I have a CL that improves some aspects of the sandbox implementation and then wanted to build upon that with the symlink-on-tmpfs idea, but unfortunately had no time to finish them so far. :(
Can you keep your version in your fork for a little longer? There shouldn't be much if any activity on the (Linux) sandbox code while I'm away, so hopefully it doesn't require much maintenance effort or solving merge conflicts.
To view, visit this change. To unsubscribe, visit settings.
Austin Schuh posted comments on this change.
Patch Set 4:
> Thanks! How are things looking? Today is my last day before my long vacation. I'll be back on January 30th and then get back to this again. I have a CL that improves some aspects of the sandbox implementation and then wanted to build upon that with the symlink-on-tmpfs idea, but unfortunately had no time to finish them so far. :( Can you keep your version in your fork for a little longer? There shouldn't be much if any activity on the (Linux) sandbox code while I'm away, so hopefully it doesn't require much maintenance effort or solving merge conflicts.
I don't think we really get a choice :) We'll keep maintaining our fork.
Have a good vacation!
To view, visit this change. To unsubscribe, visit settings.
Josh Pieper posted comments on this change.
Patch set 4:
Any progress here? We are also blocked on newer versions of bazel because we aren't willing to give up the sandbox, but the performance regression from 0.3.1 is too much to handle.
To view, visit change 7538. To unsubscribe, visit settings.
Austin Schuh posted comments on this change.
Patch set 4:
Philipp,
Did you make any progress? I'd love to not have to maintain our patch going forwards.
Austin
Philipp Wollermann posted comments on this change.
Patch set 4:
Hi Austin, good to hear from you. Sorry for the lack of progress / updates here! But the good news is, I recently found time to work on this again, at last.
I'm about to push a change that will allow users to put their sandbox directories on a given root, e.g. --experimental_sandbox_base=/run/shm will let Bazel create all sandbox dirs under /run/shm/bazel-sandbox.xyz123/... - if you have enough RAM to store intermediate and output artifacts during a build, this should help.
That was the easy part. ;) Next I will try to implement Austin's initial idea of redirecting the bazel-out/ part onto HDD again.
As usual, I'll let you know once this is done, but if you don't hear from me, please ping me.
Philipp Wollermann abandoned this change.
To view, visit change 7538. To unsubscribe, visit settings.