Wondering about "dict->putifabsent(k,v)"

11 views
Skip to first unread message

Ernie Rael

unread,
Mar 6, 2023, 2:57:49 AM3/6/23
to vim...@googlegroups.com
Is it worth adding a putifabsent(dict, key, value) builtin function?
Discuss among yourselves ;-)

I occasionally need/want a dictionary->putifabsent(k, v). Either an
existing key's value should not be changed, or the value would be the
same. The second case offers a potential performance optimization.
Should there be a builtin putifabsent()? Below are some profile
results of the different ways to change a key's value.

"d->extend({k: v})" is the only builtin way (that I know of).
Besides "d->extend()" there's "if !d->has_key(k) | d[k] = v | endif",
which is several times faster than d->extend(), but is cumbersome at
best. A builtin, based on has_key() results, is 20% or more of an
improvement over "if ..." and easy to type and understand. I was
surprised to see a user function ~1,000 times slower.

The profiling, in micro-sec/op, is of putting a single k,v into a
dictionary; the key is already in the target dictionary, column header
is the number of keys in the target dictionary. There's something
curious going on around 200 keys.

-ernie


Strict - don't write if key already present, has_key() is an approximation

         of a builtin putifabsent().
     10     30     65    100    200    300 : nKeys (micro-sec/line)
  0.428  0.451  0.444  0.451  0.442  0.448 : d->extend({k: v}, 'keep')
  0.045  0.116  0.105  0.110  0.131  0.092 : if !d->has_key(k) | d[k] =
v | endif
  0.110  0.105  0.096  0.093  0.096  0.075 : d->has_key(k)

Extend - Note the differences between 'force' and 'keep'. The fastest uses
         vdict already setup with the new values, only here for comparison.
     10     30     65    100    200    300 : nKeys (micro-sec/line)
  0.457  0.479  0.474  0.476  0.482  0.486 : d->extend({k: v}, 'force')
  0.420  0.441  0.436  0.445  0.440  0.443 : d->extend({k: v}, 'keep')
  0.264  0.273  0.283  0.312  0.191  0.323 : d->extend(vdict, 'force')
  0.219  0.229  0.253  0.258  0.199  0.288 : d->extend(vdict, 'keep')

Assign versus Inline - Assign always modifies, still very fast.
     10     30     65    100    200    300 : nKeys (micro-sec/line)
  0.114  0.103  0.113  0.101  0.110  0.076 : d[k] = v
  0.059  0.107  0.108  0.112  0.122  0.069 : if !d->has_key(k) | d[k] =
v | endif

PutIfAbsent - A trivial user function is about 1,000 times slower than
anything.
     10     30     65    100    200    300 : nKeys (micro-sec/line)
    595    609    607    612    621    641 : d->PutIfAbsent(k, v)
  0.490  0.504  0.483  0.509  0.492  0.496 : d->extend({k: v}, 'force')
  0.436  0.454  0.455  0.444  0.455  0.454 : d->extend({k: v}, 'keep')

Bram Moolenaar

unread,
Mar 7, 2023, 12:14:22 PM3/7/23
to vim...@googlegroups.com, Ernie Rael

Ernie Rael wrote:

> Is it worth adding a putifabsent(dict, key, value) builtin function?
> Discuss among yourselves ;-)
>
> I occasionally need/want a dictionary->putifabsent(k, v). Either an
> existing key's value should not be changed, or the value would be the
> same. The second case offers a potential performance optimization.
> Should there be a builtin putifabsent()? Below are some profile
> results of the different ways to change a key's value.
>
> "d->extend({k: v})" is the only builtin way (that I know of).
> Besides "d->extend()" there's "if !d->has_key(k) | d[k] = v | endif",
> which is several times faster than d->extend(), but is cumbersome at
> best. A builtin, based on has_key() results, is 20% or more of an
> improvement over "if ..." and easy to type and understand. I was
> surprised to see a user function ~1,000 times slower.

Invoking a user function has a lot of overhead. You did use a compiled
function?

> The profiling, in micro-sec/op, is of putting a single k,v into a
> dictionary; the key is already in the target dictionary, column header
> is the number of keys in the target dictionary. There's something
> curious going on around 200 keys.

I don't think adding a function specifically to make something a bit
faster is a good idea. Using extend() should work, perhaps a little
profiling can show how to make it go faster. It might be that looping
over the dictionary to find the one entry adds some overhead (there are
always empty entries in the dictionary to make adding more items fast).


--
A man is incomplete until he's married ... and then he's finished!

/// Bram Moolenaar -- Br...@Moolenaar.net -- http://www.Moolenaar.net \\\
/// \\\
\\\ sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ ///
\\\ help me help AIDS victims -- http://ICCF-Holland.org ///

Ernie Rael

unread,
Mar 7, 2023, 2:11:07 PM3/7/23
to vim...@googlegroups.com
On 23/03/07 9:14 AM, Bram Moolenaar wrote:
> Ernie Rael wrote:
>
>> Is it worth adding a putifabsent(dict, key, value) builtin function?
>> Discuss among yourselves ;-)
>>
>> ...
>>
>> "d->extend({k: v})" is the only builtin way (that I know of).
>> ...
>> I was surprised to see a user function ~1,000 times slower.
> Invoking a user function has a lot of overhead. You did use a compiled
> function?
I did. But I was uncomfortable with the results; finally this morning it
occurred to me that I wasn't using the profiler correctly for micro
benchmarking and I'm looking at that right now. I don't believe the
function is 1,000 slower, maybe more like 100 times slower.
>
>> The profiling, in micro-sec/op, is of putting a single k,v into a
>> dictionary; the key is already in the target dictionary, column header
>> is the number of keys in the target dictionary. There's something
>> curious going on around 200 keys.
> I don't think adding a function specifically to make something a bit
> faster is a good idea.
Agreed, that's the conclusion I came to. And for this case, extend looks
good and works well, I got started on this because a user function was
so slow.
> Using extend() should work, perhaps a little
> profiling can show how to make it go faster.
Maybe, at first glance it looks like building the temp dictionary to
pass to extend is ~1/3 of the time.
> It might be that looping
> over the dictionary to find the one entry adds some overhead (there are
> always empty entries in the dictionary to make adding more items fast).

Might take a look (I always enjoy running a profiler :-) )

-ernie

>
>

Ernie Rael

unread,
Mar 9, 2023, 3:18:51 AM3/9/23
to vim...@googlegroups.com
On 23/03/07 11:10 AM, Ernie Rael wrote:
> On 23/03/07 9:14 AM, Bram Moolenaar wrote:
>> Ernie Rael wrote:
>>
>>>
>>> I was surprised to see a user function ~1,000 times slower.
>> Invoking a user function has a lot of overhead.  You did use a compiled
>> function?
> I did. But I was uncomfortable with the results; finally this morning
> it occurred to me that I wasn't using the profiler correctly for micro
> benchmarking and I'm looking at that right now. I don't believe the
> function is 1,000 slower, maybe more like 100 times slower.

My hand slapping against my forehead...

Turns out the main performance issue is that I was running the test by
doing "source %" in a gvim that had been running for a while.
PutIfAbsent(), run from gvim that's been running for a long time, can be
around 12x slower than than when run from the command line as "gvim -c
'source test.vim' -c q". Alloc/free?

And curiously, when run from the command line, gvim is 50% slower than
vim. Some results below, the "command line" results are repeatable.
Doing ":source" in a just started gvim, versus command line gvim, is
~20% faster, I'm guessing that's because gvim has time to get to a
quiescent state before the test is run.

Note that only the user function implementation (which takes much longer
than anything else) shows such a huge variation in performance. The
bottom line is that the user function PutIfAbsent is ~30 times slower
than extend({[k]: v}, 'keep') when run with vim from command line.

-ernie

Using running for a while gvim, :sou %
       10       30       65      100      200      300 : nKeys
(micro-sec/algorithm)
      293      308      305      311      318      323 :
d->PutIfAbsent(k, v)
    0.468    0.448    0.465    0.484    0.467    0.488 :
d->extend({[k]: v}, 'keep')

Using gvim from command line
$ gvim -c 'source test.vim' -c q
    1.276    5.895   10.580   15.006   19.510   24.190 :
d->PutIfAbsent(k, v)
    0.479    0.476    0.473    0.464    0.477    0.493 :
d->extend({[k]: v}, 'keep')

Using console vim from command line
$ vim -c 'source test.vim' -c q
    0.974    4.223    7.227   10.401   13.967   16.784 :
d->PutIfAbsent(k, v)
    0.478    0.476    0.464    0.469    0.471    0.491 :
d->extend({[k]: v}, 'keep')

Using fresh start gvim, :sou
    1.038    4.948    8.836   12.581   16.383   20.328 :
d->PutIfAbsent(k, v)
    0.477    0.470    0.454    0.455    0.456    0.491 :
Reply all
Reply to author
Forward
0 new messages