running tests against benchmarks


Jeremy French

Mar 15, 2021, 1:44:50 PM
to golang-nuts
I keep running into this solution to a particular problem, but when I search for how to do it, I find that not only are there seemingly no solutions out in search-results-land, there doesn't seem to be anyone else even asking about it.  This leads me to suspect that I'm approaching it with completely the wrong mindset, and I'd appreciate anyone helping me see how everyone else is solving this problem.

The issue is this:  Let's say I have some function - call it BeefyFunc() - that is heavily relied upon and executed many times per second in my high-traffic, high-availability, critical application.  And let's say that BeefyFunc() is complex in the sense that it calls many other functions, which call other functions, and so on, and these functions are spread all over the code base and maintained by a broad, distributed development team (or even just one forgetful, overworked developer).  If BeefyFunc() currently executes at 80 ns/op on my development machine, I want to make sure that no one changes an underlying function in a way that balloons BeefyFunc() to 800 ns/op.  More to the point, if that ballooning does happen, I want someone to be notified, rather than relying on a QA person to notice the problem while scanning benchmark reports.  I want it highlighted the same way a test failure would be.

So it seems like the logical solution would be to create a test that runs a benchmark and makes sure the results fall within some acceptable range.  I realize that benchmarks differ from machine to machine, or with CPU load, etc.  But in a CI/CD pipeline, or on my personal dev machine, where the hardware is stable and known and the CPU load can be reasonably predicted, it still seems useful: at least a sufficiently wide range could be enforced and tested for.
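
Concretely, what I'm imagining is something like this sketch, where the 800 ns/op budget is made up and runBenchmarkSomehow is exactly the piece I can't find:

func TestBeefyFuncSpeed(t *testing.T) {
	// runBenchmarkSomehow is hypothetical: run the benchmark in-process
	// and hand back its ns/op figure.
	nsPerOp := runBenchmarkSomehow(BenchmarkBeefyFunc)
	if nsPerOp > 800 {
		t.Fatalf("BeefyFunc regressed: %d ns/op, budget is 800", nsPerOp)
	}
}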

Am I being stupid? Or is this a solved problem and it's just my google-fu that's failing me?

Marcin Romaszewicz

Mar 15, 2021, 2:48:52 PM
to Jeremy French, golang-nuts
What you want to do is common, but it's application-specific enough that there aren't many generalized solutions. What I've always done is create a Go program that takes a while to run (I shoot for at least a minute) and exercises your BeefyFunc in whatever ways make sense to you. Then I set up a Jenkins job that runs it on fixed hardware on every commit and uploads the timings to a metrics database like DataDog or Prometheus; both can trigger alerts on outliers. I realize this doesn't help much if you don't have DataDog, Prometheus, or Jenkins.
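
For illustration, the Prometheus flavor of that program looks roughly like this. The pushgateway URL, metric name, and iteration count are placeholders for your environment, and BeefyFunc stands in for however you exercise your code:

package main

import (
	"log"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/push"
)

func main() {
	// Run long enough that scheduling noise averages out; size N so the
	// whole run takes on the order of a minute on your fixed hardware.
	const N = 100_000_000
	start := time.Now()
	for i := 0; i < N; i++ {
		BeefyFunc()
	}
	nsPerOp := float64(time.Since(start).Nanoseconds()) / N

	// Ship the timing to the metrics database; an alert there watches
	// this series for outliers.
	g := prometheus.NewGauge(prometheus.GaugeOpts{
		Name: "beefyfunc_ns_per_op",
		Help: "BeefyFunc latency per op from the per-commit bench job.",
	})
	g.Set(nsPerOp)
	if err := push.New("http://pushgateway:9091", "beefyfunc_bench").
		Collector(g).
		Push(); err != nil {
		log.Fatalf("push failed: %v", err)
	}
}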

-- Marcin


Jeremy French

Mar 15, 2021, 4:16:30 PM
to Marcin Romaszewicz, golang-nuts
Yes, I thought of something similar.  You could certainly save the results of a benchmark to some flavor of db/textfile, then run another program to analyze it and do whatever you like with the findings.  That just seems like unnecessary overhead.  Since a test is just a function with any arbitrary amount or style of code in it, you could customize it to whatever application-specificity you wanted, if only you could import the benchmark results without having to run a separate process.  I don't know, maybe it's purity of separation of concerns: tests are for definitive, yes-or-no answers, whereas benchmarks are more subject to human judgment.  It just seems like the two pieces I need are sitting right next to each other in my test file.  It would be nice if they could talk to each other.
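
To be concrete, the roundabout version would be a second program fed the output of `go test -bench=BeefyFunc -run=NONE`, something like this (budget made up again):

package main

import (
	"bufio"
	"fmt"
	"os"
	"regexp"
	"strconv"
)

// Scans `go test -bench` output on stdin for lines like
//   BenchmarkBeefyFunc-8   14623210   80.1 ns/op
// and exits nonzero past an arbitrary budget.
var benchLine = regexp.MustCompile(`^BenchmarkBeefyFunc\S*\s+\d+\s+([\d.]+) ns/op`)

func main() {
	sc := bufio.NewScanner(os.Stdin)
	for sc.Scan() {
		m := benchLine.FindStringSubmatch(sc.Text())
		if m == nil {
			continue
		}
		ns, _ := strconv.ParseFloat(m[1], 64)
		if ns > 800 {
			fmt.Fprintf(os.Stderr, "BeefyFunc regressed: %.1f ns/op\n", ns)
			os.Exit(1)
		}
	}
}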

Wojciech S. Czarnecki

Mar 15, 2021, 10:36:55 PM
to golan...@googlegroups.com, ibi...@gmail.com
On 2021-03-15, at 10:44:50,
Jeremy French <ibi...@gmail.com> wrote:

> So it seems like the logical solution would be to create a test that runs a
> benchmark and makes sure the benchmark results are within some acceptable
> range. I realize that benchmarks are going to differ from machine to
> machine, or based on CPU load, etc.

Compare the old implementation with the new one: https://play.golang.org/p/w9giDccrpNa

PS. In a CI pipeline, since we do not know the exact hardware the tests will run on tomorrow, we
normalize to a known baseline. Usually crypto/des serves well as the stable unit of speed.
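
A minimal sketch of that normalization, using the stdlib testing.Benchmark; the 40-unit budget is illustrative, and BeefyFunc is your function under test:

package beefy

import (
	"crypto/des"
	"testing"
)

// desUnit measures one DES block encryption; this is the
// machine-independent "unit of speed".
func desUnit() float64 {
	block, _ := des.NewCipher([]byte("8bytekey")) // any 8-byte key
	src, dst := make([]byte, 8), make([]byte, 8)
	r := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			block.Encrypt(dst, src)
		}
	})
	return float64(r.NsPerOp())
}

func TestBeefyFuncBudget(t *testing.T) {
	r := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			BeefyFunc()
		}
	})
	ratio := float64(r.NsPerOp()) / desUnit()
	if ratio > 40 { // illustrative budget in DES units
		t.Fatalf("BeefyFunc costs %.1f DES units per op, budget is 40", ratio)
	}
}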


Hope this helps,

--
Wojciech S. Czarnecki
<< ^oo^ >> OHIR-RIPE

Jeremy French

Mar 15, 2021, 11:35:43 PM
to Wojciech S. Czarnecki, golang-nuts
Nice!  Thanks.