Sentry memory management


Gabriele Minì

Dec 6, 2020, 7:31:11 AM
to gVisor Users [Public]
Hello everyone,  

I have some questions about how memory is managed in the Sentry. To keep things as simple as possible: if I have an application that allocates a buffer (with malloc) and then accesses the allocated pages by writing to them, then:
  1. Does the Sentry internally call mmap to request memory in order to satisfy the request made by malloc?
  2. When accessing memory, do we pay the cost of an address translation from the Sentry (virtual to physical) to the host (virtual to physical)? If so, how are those mappings stored?
I also have a question about write operations on regular files (for which a host fd is available, donated by the Gofer). Why does gVisor tend to perform better with bigger block sizes?

Thanks, everyone, for the help!


Adin Scannell

Dec 7, 2020, 12:03:59 PM
to Gabriele Minì, gVisor Users [Public]
On Sun, Dec 6, 2020 at 4:31 AM Gabriele Minì <mini.ga...@gmail.com> wrote:
> 1. Does the Sentry internally call mmap to request memory in order to satisfy the request made by malloc?
Not directly, no. The Sentry has its own vma b-tree, and each vma has its own "pma", which corresponds to some region allocated from the backing memfd. See pkg/sentry/mm and you'll be able to see how it all works. The host memory will be mapped when the first fault occurs (and only in appropriate chunks, independent of the size the user mmap'ed).
> 2. When accessing memory, do we pay the cost of an address translation from the Sentry (virtual to physical) to the host (virtual to physical)? If so, how are those mappings stored?
Generally speaking, yes. The user address must be translated to an internal Sentry mapping via the vma and pma b-trees noted above.
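
As a rough illustration of the two answers above, here is a toy model in Go. It is only a sketch under stated assumptions, not the code in pkg/sentry/mm: the names `vma`, `addressSpace`, `mmap`, and `translate` are hypothetical, a sorted slice and a map stand in for the real trees, and backing memory is handed out one page at a time rather than in larger chunks. It shows that recording a mapping allocates nothing, and that the first touch of a page is what assigns it an offset in the backing file.

```go
package main

import (
	"fmt"
	"sort"
)

const pageSize = 4096

// vma is a toy virtual memory area: a range of application addresses
// [start, end) created by an mmap-like call. Hypothetical, not gVisor's type.
type vma struct {
	start, end uint64
}

// addressSpace is a toy model of per-process bookkeeping.
type addressSpace struct {
	vmas    []vma             // sorted by start; stands in for the vma tree
	pmas    map[uint64]uint64 // page-aligned address -> offset in backing file
	nextOff uint64            // next free offset in the imaginary backing file
	faults  int               // number of pages populated on first touch
}

func newAddressSpace() *addressSpace {
	return &addressSpace{pmas: make(map[uint64]uint64)}
}

// mmap only records a vma. No backing memory is allocated here.
func (as *addressSpace) mmap(start, length uint64) {
	as.vmas = append(as.vmas, vma{start: start, end: start + length})
	sort.Slice(as.vmas, func(i, j int) bool { return as.vmas[i].start < as.vmas[j].start })
}

// findVMA returns the vma containing addr, if any (vmas must not overlap).
func (as *addressSpace) findVMA(addr uint64) (vma, bool) {
	i := sort.Search(len(as.vmas), func(i int) bool { return as.vmas[i].end > addr })
	if i < len(as.vmas) && as.vmas[i].start <= addr {
		return as.vmas[i], true
	}
	return vma{}, false
}

// translate resolves an application address to an offset in the backing
// file, allocating a page of backing memory on first touch.
func (as *addressSpace) translate(addr uint64) (uint64, error) {
	if _, ok := as.findVMA(addr); !ok {
		return 0, fmt.Errorf("fault at %#x: no vma covers this address", addr)
	}
	page := addr &^ (pageSize - 1)
	off, ok := as.pmas[page]
	if !ok {
		// First touch: assign backing memory now, not at mmap time.
		off = as.nextOff
		as.nextOff += pageSize
		as.pmas[page] = off
		as.faults++
	}
	return off + (addr - page), nil
}

func main() {
	as := newAddressSpace()
	as.mmap(0x10000, 8*pageSize)
	fmt.Println("pages backed after mmap:", len(as.pmas))

	off, _ := as.translate(0x10000 + 3*pageSize + 42)
	fmt.Println("backing offset:", off, "faults so far:", as.faults)

	off, _ = as.translate(0x10000 + 3*pageSize + 100)
	fmt.Println("backing offset:", off, "faults so far:", as.faults)
}
```

The second `translate` call hits the same page, so it reuses the existing entry and the fault counter does not change.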
 
> I also have a question about write operations on regular files (for which a host fd is available, donated by the Gofer). Why does gVisor tend to perform better with bigger block sizes?

The translations are collected in a safemem.BlockSeq, which is then used for the actual I/O operation. This sequence can encode contiguous runs efficiently. Linux does this page by page, so it can end up being a tiny bit slower for large regions.
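
To make "encode contiguous runs efficiently" concrete, here is a minimal Go sketch. It does not use the real safemem API: the `block` type and the `coalesce` function are hypothetical, and the input is assumed to be a list of page-aligned backing offsets, one per page of the user buffer. The idea shown is simply that adjacent pages collapse into a single run, so a large buffer can be described by very few entries.

```go
package main

import "fmt"

const pageSize = 4096

// block is a hypothetical stand-in for one contiguous run of backing memory.
type block struct {
	off, length uint64
}

// coalesce merges per-page backing offsets into contiguous runs.
// Each input offset is assumed to describe exactly one page.
func coalesce(pageOffsets []uint64) []block {
	var runs []block
	for _, off := range pageOffsets {
		if n := len(runs); n > 0 && runs[n-1].off+runs[n-1].length == off {
			// This page directly follows the previous run: extend it.
			runs[n-1].length += pageSize
			continue
		}
		runs = append(runs, block{off: off, length: pageSize})
	}
	return runs
}

func main() {
	// 32 pages (128 KiB) that happen to be contiguous, then one stray page.
	var offs []uint64
	for i := uint64(0); i < 32; i++ {
		offs = append(offs, 0x100000+i*pageSize)
	}
	offs = append(offs, 0x900000)

	for _, b := range coalesce(offs) {
		fmt.Printf("run at %#x, %d bytes\n", b.off, b.length)
	}
}
```

With this input, the 33 pages are described by two runs, which is what lets a single I/O operation cover a large region.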

Happy to help answer any more specific questions!




--
You received this message because you are subscribed to the Google Groups "gVisor Users [Public]" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gvisor-users...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/gvisor-users/20683dda-49cc-43a7-bc83-d85b715d36d0n%40googlegroups.com.

Gabriele Minì

Dec 8, 2020, 4:58:38 AM
to gVisor Users [Public]

Thank you for your detailed answer!


Great! So if I call malloc without touching the allocated pages, I only pay the cost of creating a mapping inside the Sentry. When I then access one of those pages, a fault occurs and some physical memory from the host needs to be allocated, right?

The address translation part is clear.
 

Sorry to bother you with another question, but the block size point is still not clear to me. I don't understand what additional cost we pay when issuing writes in smaller chunks (e.g. 4 KB) instead of bigger ones (e.g. 128 KB).

Adin Scannell

Dec 8, 2020, 12:28:30 PM
to Gabriele Minì, gVisor Users [Public]

Note that malloc() is one level removed from a system call itself. Depending on the implementation and parameters, a call to malloc() may or may not result in an mmap system call.
 

I'm not sure what "we" is here, but I'll just try to clarify.

Take a look at the implementation for generic_file_buffered_read in Linux. It looks up pages backing a file one at a time.

The Sentry works differently. It first has to resolve the user addresses (into a BlockSeq), a step that is effectively free for Linux, but it can then do the actual copy during tree traversal instead of doing a fresh lookup each time.
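
One way to picture the difference between 4 KB and 128 KB writes is a simple counting model, sketched below in Go. This is an assumption-laden illustration, not a measurement and not gVisor code: the `counts` type and `model` function are hypothetical, each write call is assumed to pay one up-front address resolution, each page touched is assumed to cost one page lookup, and chunks are assumed to be page-aligned.

```go
package main

import "fmt"

const pageSize = 4096

// counts tallies the two kinds of work in this toy model.
type counts struct {
	resolutions int // one up-front address resolution per write call
	pageLookups int // one lookup per page touched
}

// model counts the work needed to write total bytes in chunks of chunk bytes.
// Chunks are assumed to start on page boundaries.
func model(total, chunk int) counts {
	var c counts
	if chunk <= 0 {
		return c
	}
	for done := 0; done < total; done += chunk {
		n := chunk
		if total-done < n {
			n = total - done
		}
		c.resolutions++
		c.pageLookups += (n + pageSize - 1) / pageSize
	}
	return c
}

func main() {
	const total = 1 << 20 // 1 MiB written in both cases
	for _, chunk := range []int{4 << 10, 128 << 10} {
		c := model(total, chunk)
		fmt.Printf("chunk=%d KiB: resolutions=%d pageLookups=%d\n",
			chunk>>10, c.resolutions, c.pageLookups)
	}
}
```

In this model the number of pages touched is the same for both chunk sizes (256), while the per-call resolution is paid 256 times with 4 KiB chunks and only 8 times with 128 KiB chunks. That fixed per-call cost, amortized over more data, is the effect described above.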



Adin Scannell

Dec 8, 2020, 12:29:26 PM
to Gabriele Minì, gVisor Users [Public]

To clarify: I'm just not sure whether "we" means Linux or the Sentry in the sentence above. (The answer in my previous message assumes that it means the Sentry.)

Gabriele Minì

Dec 9, 2020, 10:27:50 AM
to gVisor Users [Public]
Thank you very much for your time and for the links to the source code, they helped me a lot.