Performance problem in strequal

72 views
Skip to first unread message

Graham Miller

unread,
Oct 4, 2010, 4:15:16 PM10/4/10
to golang-nuts
I realize it may be a bit early in the development of Go to start much optimization, but I recently profiled my code and found that a good 20% of the time was spent in strequal from runtime.c.  The existing implementation simply calls cmpstring and compares the result to 0. The call trace indicated that this is called during map[string]xxx operations quite a bit, so tried my hand at optimizing it.  My C is a bit rusty, so I would appreciate any feedback you might have.  Here is the diff:

http://pastie.org/private/psi7742jmcicjqmb0v7ag

I have put together a benchmark to measure the performance of a string insert into a map:

http://pastie.org/private/hubejibqiblug9pdfksw

This benchmark shows an improvement of 8%, 69%, 75%, 73% and 1% for strings of size 1, 100, 1000, 10000, and 100000 respectively.  If you do play around with the benchmark, please note that you cannot compare the sub-benchmarks with each other, as they each run the loop a different number of times.  You can only compare each of the five with different implementations of strequal.

All the unit tests pass, my code has seen a dramatic improvement.

graham

Evan Shaw

unread,
Oct 4, 2010, 10:19:44 PM10/4/10
to Graham Miller, golang-nuts

This seems to me like a reasonable thing to want to optimize and a
reasonable general approach. I think you'll have better luck getting
something done if you upload a CL:
http://golang.org/doc/contribute.html

I'd also suggest fixing style inconsistencies before uploading.

- Evan

roger peppe

unread,
Oct 5, 2010, 5:40:45 AM10/5/10
to Graham Miller, golang-nuts
is String.str guaranteed to be word-aligned?

chris dollin

unread,
Oct 5, 2010, 6:08:19 AM10/5/10
to roger peppe, Graham Miller, golang-nuts
On 5 October 2010 10:40, roger peppe <rogp...@gmail.com> wrote:
> is String.str guaranteed to be word-aligned?

I would have said not, seeing as the code for slicing a string includes
(I paraphrase)

result.stringpointer = incoming.stringpointer + delta;

So no guarantee.

Chris

--
Chris "allusive" Dollin

roger peppe

unread,
Oct 5, 2010, 8:39:36 AM10/5/10
to chris dollin, Graham Miller, golang-nuts
On 5 October 2010 11:08, chris dollin <ehog....@googlemail.com> wrote:
> On 5 October 2010 10:40, roger peppe <rogp...@gmail.com> wrote:
>> is String.str guaranteed to be word-aligned?
>
> I would have said not, seeing as the code for slicing a string includes
> (I paraphrase)
>
>   result.stringpointer = incoming.stringpointer + delta;
>
> So no guarantee.

sorry, it was kind of a rhetorical question.
evan's optimisation won't work on architectures that
disallow unaligned word access.

Rob 'Commander' Pike

unread,
Oct 5, 2010, 8:47:08 AM10/5/10
to chris dollin, roger peppe, Graham Miller, golang-nuts
Two points.

First, cmpstring is written simply and portably but it's probably worth writing architecture-dependent code for it, since it comes up a lot.

Second, and much more important, it's clear that cmpstring should be specialized for equality, since a==b can usually be detected as false due simply to the length, while cmpstring will examine all the bytes. This is the real optimization to make.

-rob

Russ Cox

unread,
Oct 5, 2010, 9:06:14 AM10/5/10
to Graham Miller, golang-nuts
memmove is already in the runtime in assembly
per-architecture. It would be fine to add memcmp
and memchr and then rewrite the existing code to
refer to those instead of mcpy, mchr, and mcpy
(mcpy in particular is already redundant).

As Rob says, a case you're ignoring is strings of
different length. strequal should say

return a->len == b->len && memcmp(a->str, b->str, a->len) == 0;

and then just make memcmp better.

The implementation of better may differ per architecture.
The x86 is pretty good about unaligned accesses in
this case, but the simple C code might be best for
something like ARM.

Your benchmarks are fighting the benchmark system.
If you drop the numIterations parameter and instead use b.N
as the iteration count, you'll find that the numbers are then
comparable.

Russ

Graham Miller

unread,
Oct 5, 2010, 10:34:10 AM10/5/10
to r...@golang.org, golang-nuts
OK, the word alignment issue seems to be a fairly big one.  So I'll back off on that, and take an approach similar to what Russ suggested (though I did have the length check in there to begin with :) ).  Now delegates to memequal also in runtime.c, and simply adjusted that func to use pointer math on byte pointers instead of array notation.  The speedups are not quite as good, but they are still respectable, 58% for 100 byte strings on my machine, for example.

Russ, thanks for the tip on benchmark.

Here are the revised diff and benchmark.

http://pastie.org/private/lqxhzszejzthlm05qw
http://pastie.org/private/do9lqebuizsmjlkqznwxvw

graham

Graham Miller

unread,
Oct 5, 2010, 2:22:20 PM10/5/10
to r...@golang.org, golang-nuts
Now with CL goodness... http://codereview.appspot.com/2317044

Thx.

graham

Sven Engelhardt

unread,
Oct 7, 2010, 11:43:33 AM10/7/10
to Graham Miller, r...@golang.org, golang-nuts
Graham,

your benchmark might be a bit set up for this, still, adding

if (a == b) {
  return 1;
}

at the beginning of memequal() got me from this:

Counter: 100000000
multiplexer.BenchmarkChar1           1    8194483000 ns/op
Counter: 10000000
multiplexer.BenchmarkChar100         1    4453044000 ns/op
Counter: 1000000
multiplexer.BenchmarkChar1000        1    3668401000 ns/op
Counter: 100000
multiplexer.BenchmarkChar10000       1    3686527000 ns/op
Counter: 100
multiplexer.BenchmarkChar100000      1    8314582000 ns/op


to this:

Counter: 100000000
multiplexer.BenchmarkChar1           1    7030998000 ns/op
Counter: 10000000
multiplexer.BenchmarkChar100         1    3239463000 ns/op
Counter: 1000000
multiplexer.BenchmarkChar1000        1    2583439000 ns/op
Counter: 100000
multiplexer.BenchmarkChar10000       1    2614330000 ns/op
Counter: 100
multiplexer.BenchmarkChar100000      1    8199042000 ns/op


YMMV,

Sven

2010/10/5 Graham Miller <graham...@gmail.com>

gmiller

unread,
Oct 8, 2010, 9:01:34 AM10/8/10
to golang-nuts
Awww! Good one. Code has already been committed, but I'll submit a
small supplementary CL.

graham
> 2010/10/5 Graham Miller <graham.mil...@gmail.com>
>
> > Now with CL goodness...http://codereview.appspot.com/2317044
>
> > Thx.
>
> > graham
Reply all
Reply to author
Forward
0 new messages