Questions about gVisor's performance

Shaofeng Cen

Nov 7, 2023, 8:48:23 AM
to gVisor Users [Public]
Hi everyone!
Recently I tried to evaluate the performance of gVisor on the Raspberry Pi 3B+ hardware platform.
I evaluated LevelDB and Redis under both gVisor and Docker.
For Redis, throughput under gVisor is tens of times lower than under Docker.
For LevelDB, the fillseq, fillrandom, and overwrite benchmarks are also tens of times slower under gVisor than under Docker. The other benchmarks are much closer to Docker.
Are my benchmark results reasonable? Do your performance results show a similar gap?

Etienne Perot

Nov 7, 2023, 2:36:08 PM
to Shaofeng Cen, gVisor Users [Public]
Hello,

gVisor performance is a complex topic and the results vary a lot depending on many factors.
What matters most is the platform that you use. The current default platform is Systrap, but on a Raspberry Pi (which is bare metal) you'll likely get better performance with the KVM platform. See the platform guide for more. Additionally, Systrap (until recently) dealt poorly with machines that have low core counts. This has been addressed since, but the fix was submitted only 2 weeks ago so you may be using a build that doesn't have this fix.
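
For reference, switching platforms under Docker is done by registering a runtime entry that passes `--platform=kvm` to runsc. The snippet below is a minimal sketch that prints such an entry for `/etc/docker/daemon.json`. The runtime name `runsc-kvm` and the binary path `/usr/local/bin/runsc` are assumptions; adjust them to match your install.

```python
import json


def runsc_runtime_config(runsc_path, platform):
    """Build a Docker daemon.json fragment that registers runsc with a given platform.

    runsc_path and the derived runtime name are illustrative assumptions,
    not values taken from this thread.
    """
    runtime_name = f"runsc-{platform}"
    return {
        "runtimes": {
            runtime_name: {
                "path": runsc_path,
                # --platform selects the gVisor platform (for example kvm or systrap).
                "runtimeArgs": [f"--platform={platform}"],
            }
        }
    }


config = runsc_runtime_config("/usr/local/bin/runsc", "kvm")
print(json.dumps(config, indent=4))
```

After merging the printed fragment into daemon.json and restarting Docker, the container would be started with `docker run --runtime=runsc-kvm ...`. This assumes the Pi's kernel exposes `/dev/kvm`, which I have not verified for your setup.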

Another factor is the machine and CPU architecture you are using. The Raspberry Pi is an ARM platform, for which gVisor has been less optimized than x86. For example, in Systrap, there is an x86-specific optimization that involves rewriting the sandboxed program's syscall instructions ("mov sysno, %eax; syscall") to avoid a roundtrip to the kernel. This optimization doesn't exist on ARM. Moreover, the Raspberry Pi, while a very cool device that gVisor should be able to run on, is not really gVisor's target platform for high-performance workloads.

The nature of the application you run has a huge impact. The largest sources of gVisor overhead tend to be networking and file I/O. Databases (like LevelDB) are especially impacted, because they are usually bottlenecked on either or both of these. Redis is also one of the worst-case scenarios, because it spends almost all of its CPU time doing network interaction (very little processing), so the relative impact of gVisor's network overhead is magnified. For these reasons, the gVisor production guide does not recommend running this type of workload in a gVisor sandbox if you can avoid it.

Lastly, most off-the-shelf benchmarks out there tend to exercise gVisor in ways that don't represent real-world load. From a gVisor overhead perspective, the more complex the query, the lower the percentage of query time spent on I/O or networking, and so the lower the overhead of gVisor. By contrast, benchmarks like fillseq/fillrandom/overwrite/etc. perform exactly one type of simple (usually very low-processing) work, and so the numbers will show the worst-case scenario.
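
To make the reasoning above concrete, here is a rough back-of-the-envelope model. It is only a sketch: it assumes the sandbox slows down the I/O and networking part of a request by a fixed factor and leaves pure computation unchanged. The function name, the I/O fractions, and the penalty factor of 10 are all made-up illustrative values, not measurements.

```python
def relative_slowdown(io_fraction, io_penalty):
    """Estimate sandboxed runtime divided by native runtime.

    io_fraction: share of native runtime spent on I/O or networking (0..1).
    io_penalty:  hypothetical factor by which the sandbox slows that share.
    Compute time is assumed to be unaffected by the sandbox.
    """
    if not 0.0 <= io_fraction <= 1.0:
        raise ValueError("io_fraction must be between 0 and 1")
    compute_fraction = 1.0 - io_fraction
    return compute_fraction + io_fraction * io_penalty


# Hypothetical workloads: a microbenchmark that is almost pure I/O,
# and a complex query that is mostly computation.
HYPOTHETICAL_IO_PENALTY = 10.0
workloads = [("simple fill-style benchmark", 0.95), ("complex query", 0.20)]

for label, io_fraction in workloads:
    slowdown = relative_slowdown(io_fraction, HYPOTHETICAL_IO_PENALTY)
    print(f"{label}: about {slowdown:.2f}x slower")
```

Under these assumed numbers the I/O-heavy benchmark ends up close to the full penalty, while the compute-heavy query sees a much smaller slowdown, which is why single-operation benchmarks tend to show the worst case.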

All that being said, 10x overhead is much larger than I've ever seen. I've been running benchmarks on GKE Sandbox, and gVisor overhead while running DB workloads is a lot lower than the 10x you are seeing. Were you using gVisor's benchmark suite? (We'd welcome pull requests adding more benchmarks like LevelDB.)
