RFC - Chrome sandbox driver file manifest XML/JSON file

James Jones

Jul 6, 2023, 2:50:42 PM
to Stéphane Marchesin, Arvind Gopalakrishnan, Chas Inman, Vaibhav Vyas (SW), Dominik Behr, Daniel Dadap, graphi...@chromium.org
NVIDIA engineers and users have regularly encountered issues with how
the Chrome browser sandboxing interacts with our driver. Generally, this
occurs when we alter the names of driver-internal libraries or modify
their dependencies. For example, when experimenting with using Chrome in
non-X11 environments, we ran into an issue where the Chrome sandbox's
handling of our driver was geared towards GLX-based usage, rather than
EGL or non-X11 Vulkan usage. EGL requires a different set of libraries,
and due to minor design differences, loads more of its dependency
libraries dynamically at runtime. Some of the dependencies are internal
libraries that have the driver version number baked into their DT_SONAME
field to avoid accidental mixing of driver components from different
driver versions. We also rearrange our code layout, adding and removing
libraries, on a somewhat regular basis as the code evolves and we seek
to optimize its runtime layout/memory footprint. This often requires
updating sandbox whitelists as well. We ran into one such issue recently
when we consolidated a number of libraries into a new one, which broke
some sandboxing environments. As a result, we have temporarily worked
around the issue by linking the new library in an otherwise suboptimal
way to trigger its inclusion in sandbox preloads and file list generators.

We have also experienced other issues, such as an inability to create
sockets in the sandbox or open device files (which we do to create
memory-sharing FDs and event-polling FDs, in addition to the usual
device file interaction via ioctls), that caused our driver to fall back
to much less optimal paths (software rendering, blit-based presentation
rather than page flipping, etc.).

Some of these issues have been encountered in other sandboxed or
container environments as well. Steam's pressure vessel, for example,
often runs into similar issues to Chrome, and we've encountered a few
other less widespread examples.

Lately, NVIDIA has put some thought into mechanisms that might address
such issues generally. We've been primarily interested in making the
list of files pre-loaded into the sandbox environment somewhat more
dynamic to avoid the need to hard-code lists of files in the Chrome
browser source that would need to be updated any time NVIDIA releases a
new driver. The straw-man proposal we came up with was modifying our
driver build process to generate an XML, JSON, or similar structured
manifest file that Chrome could consume at runtime. Then, Chrome and
other sandbox environments could parse that file to determine the list
of driver files needed in the sandbox. While none of us are security
specialists, the presumption we're working with is that this would be no
less secure than hard-coding a list of libraries for the following reasons:

- In ChromeOS, this manifest file would live on a verified file
partition, where it would be generated and placed by a build mechanism
completely under Google's control (.ebuilds, NV driver package built by
Google from shared source tree).

- On general Linux, it could be required that the file be root-owned, so
that using it as an attack vector would require the same privileges as
tampering with the system driver libraries themselves. E.g., rather than
inserting references to attack libraries in the manifest, an attacker
could just as easily replace libnvidia-glcore.so with a file that
exposes similar interfaces but performs arbitrary functions when called
by other components.

- There would be no inherent additional trust granted to the list of
NVIDIA libraries. Chrome already necessarily trusts that the NVIDIA
libraries it loads into its sandbox retain a relatively stable attack
surface and general security profile from release to release. The main
difference is that Chrome would now also effectively trust the manifest
to contain a list of files that provide the Vulkan and GL interfaces it
already grants some level of trust.
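
The root-ownership requirement in the second point above could be checked along these lines. This is only a minimal sketch under stated assumptions: the function name `manifest_is_trusted` and its `owner_uid` parameter are hypothetical, and a real sandbox would likely also need to check every parent directory of the manifest.

```python
import os
import stat


def manifest_is_trusted(path, owner_uid=0):
    """Return True if `path` is a regular file owned by `owner_uid`
    that neither group nor others can write to.

    `owner_uid` defaults to 0 (root); it is a parameter only so the
    check can be exercised without root privileges.
    """
    try:
        info = os.lstat(path)  # lstat: do not follow a symlink at `path`
    except OSError:
        return False
    if not stat.S_ISREG(info.st_mode):
        return False  # reject symlinks, directories, device nodes, ...
    if info.st_uid != owner_uid:
        return False
    # Reject files that are group-writable or world-writable.
    return (info.st_mode & (stat.S_IWGRP | stat.S_IWOTH)) == 0
```

With the default `owner_uid`, a manifest that an unprivileged user can create or modify is rejected, which matches the argument that abusing the manifest should need the same privileges as replacing a driver library.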

Ideally we could structure the manifest file into categories (As a
straw-man, things like kernel modules, userspace libraries needed for
GLX+GL, userspace libraries needed for Vulkan, userspace files needed
for EGL, etc.), and files would get tagged for one or more of these
categories by NVIDIA as part of our regular development process. We're
currently evaluating the feasibility of such categorization mechanisms.
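
To make the categorization idea concrete, here is a small sketch of how a consumer might select files by category. Everything in it is hypothetical: the file names, the category labels, and the `files_for` helper are placeholders for illustration, not actual NVIDIA driver contents or a proposed format.

```python
# Hypothetical manifest: each file is tagged with one or more categories.
# None of these names are real driver files.
MANIFEST = {
    "libexample-core.so": {"glx", "egl", "vulkan"},
    "libexample-glx.so": {"glx"},
    "libexample-egl.so": {"egl"},
    "example-module.ko": {"kernel"},
}


def files_for(manifest, wanted):
    """Return the sorted names of files tagged with any category in `wanted`."""
    wanted = set(wanted)
    return sorted(name for name, tags in manifest.items() if tags & wanted)


# A sandbox that only needs EGL would preload just these files.
print(files_for(MANIFEST, ["egl"]))
```

Because a file may carry several tags, shared libraries are listed once in the manifest and picked up by every category that needs them.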

As an example of another possible solution, NVIDIA has developed the
libnvidia-container libraries and utilities to solve a similar set of
problems when deploying our driver in a container environment. These
packages could be trivially modified to spit out a list of
userspace libraries useful within a container environment, though I
believe in a less targeted manner than what we've been discussing for
the XML file.

Further work could attempt to address issues such as ioctl filtering,
permissions to open or create files/sockets, launching of driver worker
threads, etc. We have no concrete solutions to propose for these issues
at this time, but we would be happy to work with the relevant Google
Chrome and security engineers to develop solutions that enable an
optimal Chrome browser experience for users running on NVIDIA hardware
and software stacks on Linux and ChromeOS.

Thanks,
-James

Vaibhav Vyas (SW)

Jul 21, 2023, 6:52:39 PM
to James Jones, Stéphane Marchesin, Arvind Gopalakrishnan, Chas Inman, Dominik Behr, Daniel Dadap, Prahlad Kilambi, Austin Eng, Nicole Yee, graphi...@chromium.org, Vaibhav Vyas (SW)
Hi Chrome Browser Team,
Could you please help evaluate the proposal below and share your thoughts? This will help us collaboratively design an agreed-upon, vendor-neutral solution that works for the Chrome browser sandbox as well as other applications on Linux and ChromeOS that use NVIDIA dGPUs and the NVIDIA software stack. Please feel free to add any required stakeholders from the Chrome Browser / ChromeOS security teams.

Thanks
Vaibhav Vyas

Ken Russell

Jul 28, 2023, 2:36:36 PM
to Vaibhav Vyas (SW), James Jones, Stéphane Marchesin, Arvind Gopalakrishnan, Chas Inman, Dominik Behr, Daniel Dadap, Prahlad Kilambi, Austin Eng, Nicole Yee, graphi...@chromium.org, Zhenyao Mo, tis...@chromium.org
Hi James, Vaibhav,

Apologies for the delay replying.

Hoping others will add their thoughts here, but a couple of other engineers and I discussed your proposal and we think it is a great idea. This more precise definition of what the driver needs to access would likely improve security. We'd like to work with you to spec the file format and do a prototype, probably on Linux. CC'ing tis...@chromium.org as an initial point of contact from Chrome's security team.

It might be easiest to make a Google doc in which we could collaborate directly - what do you think? In the meantime, here are a few initial thoughts.

- Chromium already contains supporting libraries for parsing JSON, and I think we would heavily lean toward using JSON rather than XML.

- Simplicity and size of the file are important, since parsing this file will be on the critical path of Chromium's GPU sub-process startup (and, hopefully, other browsers' and applications' in the future).

- One could imagine attacks involving deliberately corrupting this file or making it exceedingly large, and it's important to think through that possibility.

- How will the JSON file be located initially?

- Will there be rules about which portions of the file system can be whitelisted for read access by the JSON file?

- Will paths be required to be relative / absolute?

- Rather than partitioning the driver's files into kernel/userspace/etc., I think it would be more straightforward to group entries in this sandbox description by function; for example, a "read_access" key in a dictionary whose value is an array of file or directory names that the sandbox must grant read access to. From the point of view of the sandbox, it doesn't matter whether those are shared objects, data files, etc. Again for simplicity, we would not want to have to parse a more complex file and merge different portions of it, but would rather the contents of this file be targeted specifically at sandboxing the driver. Similarly, one could imagine another key specifying allowed ioctl request types, a "read_write_access" key specifying a directory where the driver caches its compiled pipelines, certain magic variables which refer to the temp directory, etc.

- One colleague suggested something like an EGL API which would return this data in some well-defined C struct. I personally think it is difficult to maintain and evolve such structs, and agree with your proposed direction of parsing a file.

- We'll need to think about how this file will evolve. Assuming it contains a JSON dictionary, once keys are defined in it and browsers are looking for them, it will be difficult to remove them or modify their structure. It may be beneficial to define a version key at the beginning of time, and specify which keys were added in which version.
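
Several of the points above (a size limit, tolerance of corruption, function-based keys, and a version key) can be illustrated with a short parser sketch. It is not a proposed specification. The "read_access" and "read_write_access" keys come from the discussion above, while the "version" key name, the 64 KiB cap, the absolute-path requirement, and the names `parse_manifest` and `ManifestError` are assumptions made for the example.

```python
import json
import posixpath

MAX_MANIFEST_BYTES = 64 * 1024  # assumed cap; bounds parse time at startup


class ManifestError(ValueError):
    """Raised when the manifest is oversized, corrupt, or malformed."""


def _path_list(doc, key):
    """Return doc[key] as a list of absolute paths without '..' components."""
    value = doc.get(key, [])
    if not isinstance(value, list):
        raise ManifestError(f"{key} must be an array")
    for path in value:
        if not isinstance(path, str) or not posixpath.isabs(path):
            raise ManifestError(f"{key} entries must be absolute paths")
        if ".." in path.split("/"):
            raise ManifestError(f"{key} entries must not contain '..'")
    return value


def parse_manifest(raw):
    """Validate raw manifest bytes and return only the keys understood here."""
    if len(raw) > MAX_MANIFEST_BYTES:
        raise ManifestError("manifest too large")
    try:
        doc = json.loads(raw)
    except (ValueError, RecursionError) as exc:
        raise ManifestError("manifest is not valid JSON") from exc
    if not isinstance(doc, dict):
        raise ManifestError("top level must be a JSON object")
    version = doc.get("version")
    if isinstance(version, bool) or not isinstance(version, int) or version < 1:
        raise ManifestError("missing or invalid version")
    # Unknown keys are ignored so that newer manifests still load.
    return {
        "version": version,
        "read_access": _path_list(doc, "read_access"),
        "read_write_access": _path_list(doc, "read_write_access"),
    }
```

Ignoring unknown keys is one way to let the format grow without breaking older consumers. Whether paths must be absolute is still an open question in this thread, so that check is a placeholder for whatever rule is eventually agreed.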

Again, looking forward to working with you on this more, and hope that other Chromium teammates will share their thoughts too.

-Ken

