Clarifying preference of signed vs unsigned integer types in C++

95 views
Skip to first unread message

Mirko Brodesser

unread,
Nov 10, 2021, 5:07:47 AM11/10/21
to dev-pl...@mozilla.org
Hello,

TLDR: C++ offers signed and unsigned integer types, e.g. `int32_t` and `uint32_t`. Should variables which *should* not contain negative values have a signed or unsigned type? Arguments for both types can be found in Bjarne Stroustrup's paper [1].

I've discussed the issue in recent days with Botond, Emilio and Masayuki, since we disagreed on the answer and Mozilla's style guide [2] doesn't explicitly answer it either. The latter is based on Google's C++ Style Guide, which recommends using signed types [3].

For the sake of completeness, the most recent discussion about this on dev.platform can be found at [4].

Botond's proposal was to "contain" the use of unsigned types to certain layers of the code and cast when entering or exiting those layers [5] and to follow Google's Style Guide [2] for new code.
While I think that makes sense, I'd of course accept other decisions as well.
The important issue is to come to some decision which should be added to Mozilla's style guide. Otherwise, such discussions will emerge again, unnecessarily delaying reviews.

I don't know, how much static analysis currently helps to detect overflows of signed types, which would be one benefit of using signed instead of unsigned types. Does anyone know?

What do other people think about this?

Mirko

[5]

Mirko Brodesser

unread,
Nov 10, 2021, 5:09:14 AM11/10/21
to dev-pl...@mozilla.org, Mirko Brodesser
Adding incomplete footnote of previous message:

Frederik Braun

unread,
Nov 10, 2021, 5:14:48 AM11/10/21
to dev-pl...@mozilla.org


On 10.11.21 11:07, Mirko Brodesser wrote:
...

>
> I don't know, how much static analysis currently helps to detect
> overflows of signed types, which would be one benefit of using signed
> instead of unsigned types. Does anyone know?
>

We have very limited support to detect overflows using static analysis
and have recommended the use of CheckedInt
(<https://firefox-source-docs.mozilla.org/code-quality/coding-style/using_cxx_in_firefox_code.html?highlight=checkedint#safety-utilities>)
instead.

Christian Holler

unread,
Nov 10, 2021, 5:37:13 AM11/10/21
to Mirko Brodesser, dev-pl...@mozilla.org
Hey Mirko

On 10.11.21 11:07, Mirko Brodesser wrote:
>
> I don't know, how much static analysis currently helps to detect
> overflows of signed types, which would be one benefit of using signed
> instead of unsigned types. Does anyone know?
>
Static analysis is not very good in accurately detecting these issues,
but we do have dynamic analysis though UndefinedBehaviorSanitizer [1]
available to detect both types of overflows. The primary difference
between signed and unsigned here is that signed overflow is actual
undefined behavior and considered harmful in certain situations (the
compiler can perform optimizations and remove code based on the
assumption that signed types will not overflow).

One problem in our codebase is that we have a lot of old code that makes
excessive use of signed overflow where the results don't matter, which
makes deploying this kind of analysis challenging. I believe we are not
running this anywhere right now, despite the fact that we have an
extensive suppression list [2] and build system support for it [3]. I'll
actually go and figure out if we are planning to deploy this at any
point and get back to you when I know more about that.

For new code, if possible we should use `CheckedInt` which has a
diagnostic assert that indicates unchecked overflows. If that is not
possible for some reason and a variable is not supposed to contain
negative values (and not supposed to overflow), then from a memory
safety standpoint, it might be better to have it signed (if the variable
flows into a size parameter, it will be too large rather than too
small), but it really depends on the underlying code I guess. The UB
occurring might have negative side effects too, but that at least makes
the bug detectable, while we have very little chance of finding
unintended unsigned overflows unless they crash in some way.

For the situation above we should also always investigate why
`CheckedInt` wouldn't be applicable.


- Chris


[1]
https://releases.llvm.org/13.0.0/tools/clang/docs/UndefinedBehaviorSanitizer.html

[2]
https://searchfox.org/mozilla-central/rev/a48e21143960b383004afa9ff9411c5cf6d5a958/build/moz.configure/toolchain.configure#1832

[3]
https://searchfox.org/mozilla-central/rev/a48e21143960b383004afa9ff9411c5cf6d5a958/build/sanitizers/ubsan_signed_overflow_blacklist.txt

Emilio Cobos Álvarez

unread,
Nov 10, 2021, 5:54:51 AM11/10/21
to Mirko Brodesser, dev-pl...@mozilla.org
Thanks for posting this Mirko! :)

On 11/10/21 11:07, Mirko Brodesser wrote:
> What do other people think about this?

For the record, just re-instating some of the reasons I disagree with
preferring signed integers when the variable can't be negative:

* Types provide value to the reader, and are superior to comments. If
I'm reading a piece of code for the first time, knowing an argument
can't be negative is useful. Having this information in the type system
(rather than a comment + MOZ_ASSERT in the implementation) seems better.

* Existing code seems largely on the "using unsigned when possible"
camp, e.g., all the sizes in mfbt and xpcom data structures are
unsigned. Consistency with existing code might be valuable.

* Rust also prefers unsigned when possible. Consistency with the other
compiled language we have in the tree might also be valuable.

* Signed integer overflow does cause real issues and any theoretical
performance win we could have from abusing signed overflow being UB will
go away when https://bugzilla.mozilla.org/show_bug.cgi?id=1031653 is fixed.

* I think the real issue here is the implicit narrowing conversions
and integer promotions of C++, which we do have static analysis to
prevent (even if static analysis isn't always perfect, neither are the
diagnostics that compilers produce when you use signed types).

I mentioned this on a private thread too, but this is not a hill I want
to die on, so happy to accept whatever the module owners say about this.
I concur with Mirko that we just need _some_ direction to put in the
style guide :-)

-- Emilio

Olli Pettay

unread,
Nov 10, 2021, 6:05:35 AM11/10/21
to Mirko Brodesser, dev-pl...@mozilla.org
On 11/10/21 12:07, Mirko Brodesser wrote:
> Hello,
>
> TLDR: C++ offers signed and unsigned integer types, e.g. `int32_t` and `uint32_t`. Should variables which *should* not contain negative values have a
> signed or unsigned type? Arguments for both types can be found in Bjarne Stroustrup's paper [1].
>
> I've discussed the issue in recent days with Botond, Emilio and Masayuki, since we disagreed on the answer and Mozilla's style guide [2] doesn't
> explicitly answer it either. The latter is based on Google's C++ Style Guide, which recommends using signed types [3].
>
> For the sake of completeness, the most recent discussion about this on dev.platform can be found at [4].
>
> Botond's proposal was to "contain" the use of unsigned types to certain layers of the code and cast when entering or exiting those layers [5] and to
> follow Google's Style Guide [2] for new code.

That sounds reasonable, though what a "layer" is is vague. In general I think our DOM code, as an example, should follow the conventions the
specifications have - that just makes the code easier to follow and makes it less error prone.
For example Range object's offsets are unsigned long in the DOM spec, and WebIDL spec defines that type so our code should just use a native type
which maps to that. And CharacterData uses unsigned long too etc.
We should not use int64_t just that we can represent unsigned long in C++, that would increase memory usage for no good reason - and it would be
less clear to the reader too ("are negative values allowed here?").
In practice Gecko's DOM as a whole may be seen as one layer because it is so tightly connected to the DOM and WebIDL (and HTML and other) specs.
There are of course some areas even in our DOM code where we have some more flexibility. And other modules may not have similar specs behind them.



-Olli

> While I think that makes sense, I'd of course accept other decisions as well.
> The important issue is to come to /some/ decision which should be added to Mozilla's style guide. Otherwise, such discussions will emerge again,
> unnecessarily delaying reviews.
>
> I don't know, how much static analysis currently helps to detect overflows of signed types, which would be one benefit of using signed instead of
> unsigned types. Does anyone know?
>
> What do other people think about this?
>
> Mirko
>
> [1] http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1428r0.pdf.
> [2] https://firefox-source-docs.mozilla.org/code-quality/coding-style/coding_style_cpp.html#c-c-practices
> [3] https://google.github.io/styleguide/cppguide.html#Integer_Types
> [4] https://groups.google.com/g/mozilla.dev.platform/c/ekmfMjjF7Mg/m/ir6HXSYbAwAJ
> [5]
>
> --
> You received this message because you are subscribed to the Google Groups "dev-pl...@mozilla.org" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to dev-platform...@mozilla.org
> <mailto:dev-platform...@mozilla.org>.
> To view this discussion on the web visit
> https://groups.google.com/a/mozilla.org/d/msgid/dev-platform/c5e54c66-2fdb-4174-8093-0bbf2c62b102n%40mozilla.org
> <https://groups.google.com/a/mozilla.org/d/msgid/dev-platform/c5e54c66-2fdb-4174-8093-0bbf2c62b102n%40mozilla.org?utm_medium=email&utm_source=footer>.

Mats Palmgren

unread,
Nov 10, 2021, 11:41:34 AM11/10/21
to dev-pl...@mozilla.org
I fully agree with all the points Emilio makes below.
Unsigned types conveys the intent much more clearly to me.

Also, can somebody please just fix the -fwrapv bug he mentioned?
It's been open for 8 years already...

/Mats

Bobby Holley

unread,
Nov 10, 2021, 4:51:37 PM11/10/21
to Mats Palmgren, dev-pl...@mozilla.org
Hi folks,

I've chatted with my peers in the C++ style/usage module, and while there are advantages to each approach, we are most persuaded by the rationale provided by Emilio. Accordingly, I've submitted a patch in bug 1740616 to update the style guide to prefer unsigned types in this situation, and also encourage the use of CheckedInt for managing overflow.

Cheers,
Bobby

--
You received this message because you are subscribed to the Google Groups "dev-pl...@mozilla.org" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dev-platform...@mozilla.org.
To view this discussion on the web visit https://groups.google.com/a/mozilla.org/d/msgid/dev-platform/a3c1ea56-b199-b60f-9df9-ceaf81c797a2%40mozilla.com.

Mirko Brodesser

unread,
Nov 11, 2021, 4:19:58 AM11/11/21
to dev-pl...@mozilla.org, Bobby Holley, dev-pl...@mozilla.org, Mats Palmgren
Thanks for all the feedback and the decision.

Mirko

To unsubscribe from this group and stop receiving emails from it, send an email to dev-platform+unsubscribe@mozilla.org.
Message has been deleted

Andrew Sutherland

unread,
Nov 11, 2021, 5:59:07 PM11/11/21
to dev-pl...@mozilla.org
On 11/11/21 1:23 PM, Chris Peterson wrote:
> On 2021-11-10 2:54 AM, Emilio Cobos Álvarez wrote:
>>  * I think the real issue here is the implicit narrowing conversions
>> and integer promotions of C++, which we do have static analysis to
>> prevent (even if static analysis isn't always perfect, neither are
>> the diagnostics that compilers produce when you use signed types).
>
> For a sense of scale, here are the number of mozilla-central warnings
> for some clang diagnostics (disabled by default) related to integer
> signedness and narrowing:

Is this something where it could be useful for searchfox to be able to
help display these conversions/promotions as an optional display layer? 
For example, inserting little emoji on implicit conversions?  This would
be along the lines of the bug I filed to try and make NRVO
(in)eligibility surfaced in searchfox as tracked at
https://bugzilla.mozilla.org/show_bug.cgi?id=1709710.

The use-case would be that if you're looking at code in searchfox where
you're already suspicious of the code / trying to manually figure out
what the compiler would think of something, it could be nice to be able
to turn this on when desired.  The information could also be made
accessible in the navigation sidebar which will soon optionally start
behaving perhaps a bit more like an inspector.

Andrew

Reply all
Reply to author
Forward
0 new messages