Hello!
I was wondering if I could get some pointers on why I am seeing significant latency with the virtio-net driver when serving multiple parallel clients. Hopefully I can describe my setup well enough for the issue to be reproduced.
Testing environment:
- Comparison: Ubuntu Server (Linux) VM and OSv (used the option "-nv" in the run.py script for tap networking)
- In common: 4 CPU cores, 4GB of RAM, QEMU KVM, used "taskset" to pin to the same cores
- Program: java-httpserver program from the apps directory, java8
- What was sent: data of varying sizes (1KB to 1MB, 4MB, 8MB...), sent to the VMs from a client on the same physical machine
Observations:
- With single-threaded requests and small data sizes, I measured a lower latency on OSv than on the Linux VM
  - example: for 32KB I measured ~4ms for OSv and 9.8ms for the Linux VM
- At larger data sizes (256KB+), OSv started to show a higher latency than the Linux VM
- When I sent multiple requests at the same time, OSv suffered a much larger average latency penalty
  - example: at 1MB data size and 16 parallel requests, average latency was:
    - OSv: 120ms (min-max 14-225ms, std: 62ms)
    - Linux VM: 82ms (min-max 24-144ms, std: 34ms)
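For anyone trying to reproduce the parallel measurement, a minimal Python sketch of such a client might look like the following. It is not the tool used for the numbers above. It assumes the server accepts an HTTP POST whose body is the payload; the function names and the choice of population standard deviation are illustrative assumptions, so adapt them to however the server is actually exercised.

```python
import statistics
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def timed_post(url, payload):
    """Send one POST carrying the payload; return elapsed time in milliseconds."""
    start = time.perf_counter()
    # Assumption: the server under test accepts POST bodies at this URL.
    req = urllib.request.Request(url, data=payload, method="POST")
    with urllib.request.urlopen(req) as resp:
        resp.read()  # drain the response so the whole round trip is timed
    return (time.perf_counter() - start) * 1000.0


def summarize(latencies_ms):
    """Return mean, min, max, and population standard deviation of the samples."""
    return {
        "mean": statistics.mean(latencies_ms),
        "min": min(latencies_ms),
        "max": max(latencies_ms),
        "std": statistics.pstdev(latencies_ms),
    }


def measure_parallel(url, payload_size, parallel):
    """Fire `parallel` requests of `payload_size` bytes at once and summarize them."""
    payload = b"x" * payload_size
    with ThreadPoolExecutor(max_workers=parallel) as pool:
        latencies = list(pool.map(lambda _: timed_post(url, payload), range(parallel)))
    return summarize(latencies)
```

Calling `measure_parallel(url, 1024 * 1024, 16)` with the VM's address as `url` would correspond to the 1MB, 16-request case. Running the same script against both VMs from the same pinned host cores keeps the client side constant between the two measurements.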
Other notes:
- I've been using the OSv profiling tools and have seen that the hot spots were typically in virtio::virtio_driver::wait_for_queue and virtio::net::receiver, but I was unable to pinpoint why the latency is higher
Hope this is clear enough! I would like to understand whether I am misconfiguring OSv or whether something else explains this latency difference. Thank you in advance for the help, and I am happy to provide any more information as needed.