efficiency of read-only workloads


Andrei Matei

Sep 19, 2018, 11:48:43 PM
to CockroachDB
For your consideration, check out the following profile of workload "kv" (95% point reads) running at high concurrency on a single beefy node. What would you guess the ratio of "useful work" to "bookkeeping" is?

The highlighted area is batcheval.Scan, which contains all the RocksDB interactions. I'm not sure what to make of this exactly, but I can't help but wonder if the picture would look different were crdb written in C.

batcheval.png

Nikhil Benesch

Sep 19, 2018, 11:55:18 PM
to Andrei Matei, CockroachDB
Any chance you can post that profile somewhere? I'm very curious to know what the other boxes in this icicle chart are.

--
You received this message because you are subscribed to the Google Groups "CockroachDB" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cockroach-db+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/cockroach-db/CAPqkKgk_w4GtT8VzuynwuJduMQOGiK-3-bCMDsWxT%3DRzupVCdg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Peter Mattis

Sep 20, 2018, 9:50:20 AM
to Andrei Matei, CockroachDB
That's a fun picture. Looks like there is a lot of room for improvement in the "bookkeeping". From my experience, it's just as easy to have this bookkeeping overhead in C++ as it is in Go or any language. I don't want to get into a language war, just indicating my belief that we can fix this without switching languages.


Andrei Matei

Sep 20, 2018, 11:01:18 AM
to Peter Mattis, CockroachDB
Nikhil, I've attached a raw profile and an svg.
Btw, I don't know how to get the icicle graph without a running server. I know how to get it from the UI and from running pprof -http :<port> <url>, but saving the .html doesn't seem to work properly: the JavaScript stops working for some reason. If anyone knows how to get the pprof server working from a saved profile, please let me know.

pprof.svg
profile

Nathan VanBenschoten

Sep 20, 2018, 12:21:51 PM
to Andrei Matei, Peter Mattis, CockroachDB
I've been looking at these kinds of graphs for a while. There have been a few easy wins, but a lot of this bookkeeping overhead is going to be harder to get rid of with drive-by micro-optimizations. If we're going to solve this, we need to transition to an engineering culture that treats performance as a primary concern when writing code and architecting new components. We're probably also going to need to rip up some of our old code along the way.

It's easy to say that we're in this situation because of the language that we're using, but I don't think that's fair, accurate, or helpful. Go's biggest obstacle to performance is that it makes it so easy to write moderately inefficient code. It makes it easy to allocate memory without the full cost being obvious because it's hidden by the GC, it makes it trivial to toss around and abuse expensive synchronization mechanisms to make up for poorly thought-out object relations, and it imposes a cost on abstraction in the form of heap allocations and dynamic dispatch.
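As a minimal sketch of that last point (the cost of abstraction), the program below compares a method call on a concrete value with the same call made through an interface value that escapes to the heap. It is not taken from the profile above: the names Summer, pair, direct, and boxed are hypothetical and exist only for illustration, and the counts are measured with testing.AllocsPerRun from the standard library.

```go
package main

import (
	"fmt"
	"testing"
)

// Summer and pair are hypothetical names used only for this illustration.
type Summer interface {
	Sum() int
}

type pair struct{ a, b int }

func (p pair) Sum() int { return p.a + p.b }

// sink is package-level, so any value stored in it escapes to the heap.
var sink Summer

// direct calls the method on a concrete value; p can stay on the stack.
func direct(a, b int) int {
	p := pair{a, b}
	return p.Sum()
}

// boxed stores the value in an escaping interface and calls through it,
// paying for a heap allocation plus a dynamically dispatched call.
func boxed(a, b int) int {
	sink = pair{a, b}
	return sink.Sum()
}

// allocsDirect reports the average heap allocations per call of direct.
func allocsDirect() float64 {
	i := 0
	return testing.AllocsPerRun(1000, func() { i++; _ = direct(i, i+1) })
}

// allocsBoxed reports the average heap allocations per call of boxed.
func allocsBoxed() float64 {
	i := 0
	return testing.AllocsPerRun(1000, func() { i++; _ = boxed(i, i+1) })
}

func main() {
	fmt.Printf("direct: %.0f allocs/call\n", allocsDirect())
	fmt.Printf("boxed:  %.0f allocs/call\n", allocsBoxed())
}
```

Nothing in the source of boxed says "allocate", which is the sense in which the cost is hidden. Building with go build -gcflags=-m prints the compiler's escape-analysis decisions, which is one way to see where such allocations come from.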

But that doesn't mean that we can't use it to write highly performant code. The new cost-based optimizer is a perfect example of this. The team working on it has kept their eye on performance metrics throughout its development process. They began by creating a series of micro-benchmarks, they adopted a zero-allocation mindset, and they justified changes using benchmark results and profiles. As a result, the cost-based optimizer appears to be faster than the old heuristic-based optimizer even though it does significantly more work.
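For readers who have not worked this way, here is a rough sketch of what a micro-benchmark with a zero-allocation target can look like. It is not code from the optimizer: encodeAlloc, encodeReuse, and the "key-" format are hypothetical stand-ins, and the benchmarks are driven with testing.Benchmark so the example runs as a plain program rather than under go test.

```go
package main

import (
	"fmt"
	"strconv"
	"testing"
)

// encodeAlloc builds a fresh string on every call (hypothetical key format).
func encodeAlloc(id int) string {
	return "key-" + strconv.Itoa(id)
}

// encodeReuse appends into a caller-supplied buffer, so a caller that keeps
// the buffer pays no per-call allocation once the buffer is large enough.
func encodeReuse(buf []byte, id int) []byte {
	buf = append(buf[:0], "key-"...)
	return strconv.AppendInt(buf, int64(id), 10)
}

// Package-level sinks keep the compiler from discarding the results.
var (
	sinkStr string
	sinkLen int
)

func benchAlloc(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		sinkStr = encodeAlloc(i + 1000)
	}
}

func benchReuse(b *testing.B) {
	b.ReportAllocs()
	buf := make([]byte, 0, 32)
	for i := 0; i < b.N; i++ {
		buf = encodeReuse(buf, i+1000)
		sinkLen += len(buf)
	}
}

// allocsPerOp runs a benchmark function and returns its allocations per op.
func allocsPerOp(fn func(*testing.B)) int64 {
	return testing.Benchmark(fn).AllocsPerOp()
}

func main() {
	a := testing.Benchmark(benchAlloc)
	r := testing.Benchmark(benchReuse)
	fmt.Printf("alloc: %d ns/op, %d allocs/op\n", a.NsPerOp(), a.AllocsPerOp())
	fmt.Printf("reuse: %d ns/op, %d allocs/op\n", r.NsPerOp(), r.AllocsPerOp())
}
```

In a real package these would normally be BenchmarkXxx functions in a _test.go file, run with go test -bench . -benchmem, so that the allocs/op column can be compared before and after a change.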

One of the biggest mistakes we make is believing that code that is off the "hot path" (for some definition of "hot path") can be arbitrarily inefficient. If nothing else, https://github.com/cockroachdb/cockroach/issues/30208 should come as a warning sign for how impactful small inefficiencies (a few allocations and a few lock acquisitions) that are only performed once per transaction can be. 

To start moving in the right direction, I'd like to get some mindshare around the tools engineers have at their disposal to write efficient code. Would people be interested in a lunch and learn about profiling using pprof?

Rebecca Taft

Sep 20, 2018, 2:01:06 PM
to Nathan VanBenschoten, Andrei Matei, Peter Mattis, cockro...@googlegroups.com
> Would people be interested in a lunch and learn about profiling using pprof?

Yes please!


Peter Mattis

Sep 20, 2018, 2:03:35 PM
to Rebecca Taft, Nathan VanBenschoten, Andrei Matei, Cockroach DB
I gave a lunch and learn on using pprof a while ago. Here are the slides (kind of weak, I know). Someone smarter than me can probably locate the video. I think it would be useful for Nathan to give another lunch and learn on the topic.


Arjun Narayan

Sep 20, 2018, 2:07:57 PM
to Peter Mattis, Rebecca Taft, Nathan VanBenschoten, Andrei Matei, cockro...@googlegroups.com
Here are more of our pprof resources:

A wiki I started that I think will be useful, if you haven't already seen it: https://github.com/cockroachdb/cockroach/wiki/pprof

I endorse more lunch and learns on this topic. Nathan, I'll ping you separately about scheduling it.

Arjun Narayan

