Thanks a lot for the information, all! In particular, the UMA link from the previous message is pretty awesome! I will have to play around with this more, and remember it for later.
I definitely agree that microbenchmarks only go so far in explaining real-world performance. That's actually related to the genesis of my question: while upgrading the kernel, we observed some changes in latency for copy-paste between Android and Chrome (which I believe involves Mojo at least on some level), and I was wondering whether the change is noticeable in the smaller-scale benchmarks or whether it's a deeper issue (or an issue in the integration of all the disparate parts). If you're interested in my past findings, take a look at the graphs attached to this bug comment: b/157615371#comment32
I took a look at the UMA page and filtered for the specific model/kernels I'm looking at to see if the change in latency is noticeable in these micro-measurements. I didn't see anything interesting, but the earlier kernel has 200,000x as many samples as the newer kernel, so I'm thinking the long-tail data is not directly comparable anyway.
I also have a follow-up issue to investigate the observed latency for this operation across the full suite of devices we have running regularly in our builders, which would give a really interesting view into this end-to-end experience. Feel free to cc yourself on b/168041545 if you're interested in that information.
It is certainly nice to hear that folks are working on optimizing the system even further :)
Thanks again, all!