Even though this driver implements stateless offloads - TXCSUM, RXCSUM, TSO, LRO - just like the original FreeBSD one (https://github.com/amzn/amzn-drivers/tree/master/kernel/fbsd/ena#stateless-offloads), the underlying ENA device does NOT implement RXCSUM or TSO (see amzn/amzn-drivers#29). It also looks like the LRO logic never gets activated, judging by the observed values of the relevant tracepoints. We use netchannels, so maybe it does not matter as much.
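For context, this is the kind of tracepoint-based check I mean, assuming OSv's TRACEPOINT macro from <osv/trace.hh>; the tracepoint names and the call site below are illustrative, not necessarily the ones the driver actually defines:

#include <osv/trace.hh>

// Illustrative tracepoints only -- the driver's real ones may be named and
// placed differently.
TRACEPOINT(trace_ena_rx_lro_accepted, "qid=%d", int);
TRACEPOINT(trace_ena_rx_lro_bypassed, "qid=%d", int);

// In the RX cleanup path, roughly:
//
//   if (lro_enabled && tcp_lro_rx(&rx_ring->lro, m, 0) == 0) {
//       trace_ena_rx_lro_accepted(qid);   // packet merged by LRO
//   } else {
//       trace_ena_rx_lro_bypassed(qid);   // packet goes straight to if_input
//       (*ifp->if_input)(ifp, m);
//   }
//
// If trace_ena_rx_lro_accepted never fires under a TCP_STREAM workload,
// the LRO logic is effectively dead code on this device.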
OSv running on a t3.nano instance reports this:
[D/22 ena]: Elastic Network Adapter (ENA)ena v2.6.3
[D/22 ena]: LLQ is not supported. Using the host mode policy.
[D/22 ena]: ena_attach: set max_num_io_queues to 2
[D/22 ena]: Enable only 3 MSI-x (out of 9), reduce the number of queues
[D/22 ena]: device offloads (caps): TXCSUM=2, TXCSUM_IPV6=0, TSO4=0, TSO6=0, RXCSUM=0, RXCSUM_IPV6=0, LRO=1, JUMBO_MTU=1
...
[D/22 ena]: ena_update_hwassist: CSUM_IP=1, CSUM_UDP=4, CSUM_TCP=2, CSUM_UDP_IPV6=0, CSUM_TCP_IPV6=0, CSUM_TSO=0
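For context, the hwassist line above is just the FreeBSD-style CSUM_* bits that the driver derives from the advertised caps. A minimal, self-contained sketch of that mapping; the helper name and signature are mine, only the flag names and values come from the log:

#include <cstdint>

// CSUM_* values as printed in the log above (they mirror the FreeBSD
// mbuf hwassist flags); the helper itself is illustrative, not the
// driver's actual code.
constexpr uint64_t CSUM_IP  = 0x0001;
constexpr uint64_t CSUM_TCP = 0x0002;
constexpr uint64_t CSUM_UDP = 0x0004;

// With TXCSUM advertised but TSO4/TSO6 and the IPv6 checksums all zero,
// the hwassist mask reduces to CSUM_IP | CSUM_TCP | CSUM_UDP -- exactly
// the 1/2/4 bits in the ena_update_hwassist line.
static uint64_t hwassist_from_caps(bool txcsum)
{
    return txcsum ? (CSUM_IP | CSUM_TCP | CSUM_UDP) : 0;
}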
Can anyone confirm whether the device indeed lacks RXCSUM and TSO? The issue I cite above was opened in 2017 and has never been updated. If so, how much of a performance impact does it have?
Relatedly, I have since improved the driver a bit. Mainly, I changed the "cleanup" logic (mostly the RX handling) so that each worker thread and its corresponding MSI-X vector are pinned to a single vCPU. That seems to reduce the number of IPIs, and in some workloads I see performance improve by 5-10%.
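A rough sketch of the thread side of that change, assuming one cleanup thread per IO queue and OSv's sched::thread::make()/attr() interface; the helper, the names, and the queue-to-CPU mapping are illustrative rather than the exact driver code (the matching MSI-X vector is then routed to the same vCPU):

#include <osv/sched.hh>
#include <functional>
#include <string>

// Create a per-queue RX/TX cleanup worker pinned to one vCPU so the
// interrupt, the cleanup work, and the wake-ups all stay on the same CPU
// and no cross-CPU IPI is needed to kick the worker.
// The name and the qid -> CPU mapping below are illustrative.
static sched::thread* make_pinned_cleanup_thread(int qid, std::function<void()> work)
{
    auto* cpu = sched::cpus[qid % sched::cpus.size()];
    auto* t = sched::thread::make(std::move(work),
                                  sched::thread::attr()
                                      .pin(cpu)
                                      .name("ena-cleanup-" + std::to_string(qid)));
    t->start();
    return t;
}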
I have also run more tests with iperf3 and netperf:
netperf -H 172.31.89.238 -t TCP_STREAM -l 5 -- -m 65536
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 172.31.89.238 () port 0 AF_INET : demo
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 65536  16384  65536    5.01     3776.40
iperf3 -t 5 -c 172.31.93.118
Connecting to host 172.31.93.118, port 5201
[ 5] local 172.31.90.167 port 55674 connected to 172.31.93.118 port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 444 MBytes 3.72 Gbits/sec 5901 199 KBytes
[ 5] 1.00-2.00 sec 421 MBytes 3.54 Gbits/sec 5529 147 KBytes
[ 5] 2.00-3.00 sec 464 MBytes 3.89 Gbits/sec 5923 157 KBytes
[ 5] 3.00-4.00 sec 440 MBytes 3.69 Gbits/sec 6117 158 KBytes
[ 5] 4.00-5.00 sec 450 MBytes 3.78 Gbits/sec 6686 260 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-5.00 sec 2.17 GBytes 3.72 Gbits/sec 30156 sender
[ 5] 0.00-5.03 sec 2.16 GBytes 3.69 Gbits/sec receiver
With iperf3 I typically see a relatively high number of retransmits (the Retr column). Do you think this indicates some sort of bottleneck on the OSv side?
Relatedly, with both iperf3 and netperf I never see OSv exceed the 4 Gbits/s mark, so it never approaches the NIC bandwidth limit (the maximum for a t3.nano is 5 Gbits/s).
Finally, all the tests were conducted with the clients (wrk, iperf3, netperf, etc.) running on a t3.micro Ubuntu instance deployed in the same availability zone (us-east-1f) and the same VPC. I have also had no chance to compare against a Linux guest, so I have no idea whether these results are half decent or not.
Any input is highly appreciated.