Overflow detection issue in 9.0.2111, 9.0.2109

Michael Henry

unread,

Nov 16, 2023, 5:40:15 PM11/16/23

to vim...@googlegroups.com

All,

I noticed these two patches too late to comment on the associated pull
request:

- patch 9.0.2111: [security]: overflow in get_number:
- patch 9.0.2109: [security]: overflow in nv_z_get_count:

Both perform overflow detection similarly, verifying that multiplication
by 10 does not overflow; in both cases, the detection logic lacks a
necessary extra check to ensure the addition of the new digit value does
not overflow.

For the `int` case, on a typical box with 32-bit `int` the value of
`INT_MAX` will be 2147483647. Consider the case where all but the last
`7` has been typed; the current value is then 214748364, which is no
greater than `INT_MAX/10`, so another digit is accepted. Multiplying by
10 yields 21474836470. Any digit up to `7` will remain in range, but
`8` and `9` will cause overflow. Similar logic applies for the `long` case.

The below example program shows helper functions for detecting whether a
given value (`int` or `long`) can accept another digit.

```c
#include <stdio.h>
#include <limits.h>

#define OK   1
#define FAIL 0

#define CHECK_ADDITION 1

int
vim_append_digit_int(int *value, int digit)
{
    int x = *value;
    if (x > (INT_MAX / 10))
        return FAIL;
    x *= 10;
#if CHECK_ADDITION
    if (x > (INT_MAX - digit))
        return FAIL;
#endif
    x += digit;
    *value = x;
    return OK;
}

int
vim_append_digit_long(long *value, int digit)
{
    long x = *value;
    if (x > (LONG_MAX / 10))
        return FAIL;
    x *= 10;
#if CHECK_ADDITION
    if (x > (LONG_MAX - digit))
        return FAIL;
#endif
    x += digit;
    *value = x;
    return OK;
}

static void
testInt(int value, int digit)
{
    int oldValue = value;
    int success = vim_append_digit_int(&value, digit);
    if (success)
    {
        printf("int: %d:%d = %d <= %d\n", oldValue, digit, value, INT_MAX);
    }
    else
    {
        printf("int: %d:%d > %d (FAIL)\n", oldValue, digit, INT_MAX);
    }
}

static void
testLong(long value, int digit)
{
    long oldValue = value;
    int success = vim_append_digit_long(&value, digit);
    if (success)
    {
        printf("long: %ld:%d = %ld <= %ld\n", oldValue, digit, value,
LONG_MAX);
    }
    else
    {
        printf("long: %ld:%d > %ld (FAIL)\n", oldValue, digit, LONG_MAX);
    }
}

int
main(void)
{
    testInt(INT_MAX / 10, 0);
    testInt(INT_MAX / 10, INT_MAX % 10);
    testInt(INT_MAX / 10, INT_MAX % 10 + 1);
    testLong(LONG_MAX / 10, 0);
    testLong(LONG_MAX / 10, LONG_MAX % 10);
    testLong(LONG_MAX / 10, LONG_MAX % 10 + 1);
    return 0;
}
```

On my 64-bit Ubuntu Linux machine, I get the following output:

```
int: 214748364:0 = 2147483640 <= 2147483647
int: 214748364:7 = 2147483647 <= 2147483647
int: 214748364:8 > 2147483647 (FAIL)
long: 922337203685477580:0 = 9223372036854775800 <= 9223372036854775807
long: 922337203685477580:7 = 9223372036854775807 <= 9223372036854775807
long: 922337203685477580:8 > 9223372036854775807 (FAIL)
```

In the sample program change `CHECK_ADDITION` to `0` to demonstrate the
need for the second check.

Michael Henry

Ernie Rael

unread,

Nov 17, 2023, 1:41:30 AM11/17/23

to vim...@googlegroups.com

Hi Michael,

I appreciate seeing your analysis; I did a flawed analysis. I screwed up trying to get it "perfect" with a single compare.

I just tried the following as a single compare at entry (derived from: x * 10 + digit <= max)

if (x > ((INT_MAX - digit) / 10)) return FAIL;

AFAICT, it replicates your results without a separate check for addition. If I make it constant

if (x > ((INT_MAX - 9) / 10))

then you only lose less than 10 valid numbers, and get a single compare with no runtime calculations in the generated code.

int: 214748363:4 = 2147483634 <= 2147483647
int: 214748363:5 = 2147483635 <= 2147483647
int: 214748363:6 = 2147483636 <= 2147483647
int: 214748363:7 = 2147483637 <= 2147483647
int: 214748363:8 = 2147483638 <= 2147483647
int: 214748363:9 = 2147483639 <= 2147483647
int: 214748364:0 > 2147483647 (FAIL) 2147483640
int: 214748364:1 > 2147483647 (FAIL) 2147483641
int: 214748364:2 > 2147483647 (FAIL) 2147483642
int: 214748364:3 > 2147483647 (FAIL) 2147483643
int: 214748364:4 > 2147483647 (FAIL) 2147483644
int: 214748364:5 > 2147483647 (FAIL) 2147483645
int: 214748364:6 > 2147483647 (FAIL) 2147483646
int: 214748364:7 > 2147483647 (FAIL) 2147483647
int: 214748364:8 > 2147483647 (FAIL) -2147483648
int: 214748364:9 > 2147483647 (FAIL) -2147483647

Make it an inline function so the compiler can have its way with it and the generated code is pretty small.

Does this make sense to you?

-ernie

Michael Henry

unread,

Nov 17, 2023, 5:46:09 AM11/17/23

to vim...@googlegroups.com

Hi, Ernie,

> I just tried the following as a single compare at entry
> (derived from: x * 10 + digit <= max)

if (x > ((INT_MAX - digit) / 10)) return FAIL;

> AFAICT, it replicates your results without a separate check
> for addition.

Yes, I think `x > ((INT_MAX - digit) / 10)` is an accurate test
for predicting overflow. A separate check might be a little
easier to reason about, though in my experience people find it
complicated no matter how the check is implemented due to the
truncating nature of division; but I shouldn't have said a
second check was necessary, but rather that checking only `x`
(without consideration of `digit`) is insufficient to make an
accurate check. Extracting this logic into a function provides
a single location for explaining why the above test is correct,
so I see no reason not to use this more compact implementation.

As a user, I'd prefer in general that Vim perform range
calculations accurately rather than approximately. The most
important thing is to avoid generating signed overflow, as this
leads to undefined behavior in C. Approximating the above check
using `9` instead of `digit` is conservative; it avoids all
overflow cases while disallowing some valid (but extremely
large) values. Given that these values are generally being
typed in by humans and that most uses of these values have
additional range constraints far smaller than `INT_MAX`, I can
understand the temptation to throw away a few huge values. If
this were performance-critical code, it would make sense to
consider an approximate solution. But this code is generally
running character-by-character as the user types in keystrokes,
and after converting the characters to integer values, the
subsequent operations that use these integers will in general
take much longer to execute than the integer conversion code. I
think it's unlikely the difference could be measured in a Vim
benchmark. Since a helper function like this is general-purpose
code, I'd prefer the exact implementation unless profiling
demonstrates a performance bottleneck.

Michael Henry

Christian Brabandt

unread,

Nov 17, 2023, 6:50:52 AM11/17/23

to vim...@googlegroups.com

Thanks both. I have created the following PR to address this and another
issue reported by Coverity: https://github.com/vim/vim/pull/13539

I think it should work as expected now, but please verify. Happy to
give you @Michael credits for the second commit.

Thanks,
Chris
--
The Law of the Letter:
The best way to inspire fresh thoughts is to seal the envelope.

Michael Henry

unread,

Nov 17, 2023, 5:21:34 PM11/17/23

to vim...@googlegroups.com

On 11/17/23 06:50, Christian Brabandt wrote:
> Thanks both. I have created the following PR to address this
> and another issue reported by Coverity:
> https://github.com/vim/vim/pull/13539
>
> I think it should work as expected now, but please verify.

It looks correct to me. Thanks for the rapid response! You've
really been putting in the time lately. Bram left some big
shoes to fill, and I appreciate how hard you have been working
for the Vim community.

Michael Henry

Reply all

Reply to author

Forward