[llvm-dev] RFC: Strong typedef for LLVM

David Greene via llvm-dev

unread,

Aug 1, 2019, 11:57:49 AM8/1/19

to llvm...@lists.llvm.org

Lately I've been using some utilities to increase the number of logic
errors caught at compile time. I thought they might be useful to the
LLVM project. I'd appreciate feedback on the below proposal. Would the
community find these useful?

-David

RFC: Strong typedef utilities for LLVM
--------------------------------------

Abstract
--------
This proposal describes a set of utility classes for increasing type
safety within the LLVM project codebase. It introduces a general
"strong typedef" template along with more specialized templates for
Boolean values, accumulators and counters.

The interface and implementation are inspired by Jonathan Mueller's
post on strong typedef [1].

The Problem
-----------
As noted on a side-thread of the variable naming convention change,
LLVM has a number of places where function parameter semantics are
unclear at the point of use [2]. To compensate, much code exists
in this style:

foo(arg1, /* SomeFlagName = */true);

Functions with, for example, two boolean parameters are often called
as:

foo(arg1, /* Flag1Name = */true, /* Flag2Name =*/false);

Unfortunately, it is easy to mix this up unintentionally:

foo(arg1, /* Flag1Name = */false, /* Flag2Name =*/true);

The above may or may not be a bug. The intent is unclear from the
context.

Things are even worse without comments:

foo(arg1, true, false); // What does this mean?

bool SomeFlag = true;
bool AnotherFlag = false;

foo(arg1, AnotherFlag, SomeFlag); // Is this correct?

bool DoThing = bar();
bool EnableFeature = baz();

foo(arg1, EnableFeature, DoThing); // Still not clear.

bar(size, alignment); // Did I get that in the right order?
// Is size in bits or bytes?

A Possible Solution
-------------------
It would be nice if the Flag1Name and Flag2Name comments had semantic
meaning, such that they could be checked for correctness at compile
time. It would be even better if such semantic information was
*required* at the point of use. One could construct special classes
to make it so:

class Flag1Value {
bool Value;

public:
Flag1Value : Value(false) {}
explicit Flag1Value(bool V) : Value(V) {}

explicit operator bool() { return Value; }
};

class Flag2Value {
bool Value;

public:
Flag2Value : Value(false) {}
explicit Flag2Value(bool V) : Value(V) {}

explicit operator bool() { return Value; }
};

void foo(int arg, Flag1Value F1, Flag2Value F2);

foo(arg1, Flag1Value(true), Flag2Value(false));
foo(arg1, Flag2Value(true), Flag1Value(false)); // Won't compile.

Of course, defining individual classes for every kind of flag does not
scale. However, a template class can make it workable:

template<typename Tag, typename T>
class StrongTypedef {
public:
using BaseType = T;

private:
BaseType Value;

public:
constexpr StrongTypedef() : Value() {}
constexpr StrongTypedef(const StrongTypedef<Tag, T> &) = default;
explicit constexpr StrongTypedef(const BaseType &V) : Value(V) {}
explicit StrongTypedef(BaseType &&V)
noexcept(std::is_nothrow_move_constructible<BaseType>::value) :
Value(std::move(V)) {}

public:
explicit operator BaseType&() noexcept {
return Value;
}

explicit operator const BaseType&() const noexcept {
return Value;
}

friend void swap(StrongTypedef &A, StrongTypedef &B) noexcept {
using std::swap;
swap(static_cast<BaseType&>(A), static_cast<BaseType&>(B));
}
};

class Flag1Value : public StrongTypedef<Flag1Value, bool> {
public:
using StrongTypedef::StrongTypedef;
};

class Flag2Value : public StrongTypedef<Flag2Value, bool> {
public:
using StrongTypedef::StrongTypedef;
};

Note that Tag does not have to be a derived class and StrongTypedef
does not have to be used in a CRTP manner, though it is often
convenient to do so.

The StrongTypedef template doesn't provide any operations outside of
explicit conversion to/from the base type. Mixin classes can imbue it
with new interfaces:

// Mixin to add equality and inequality comparison.
template<typename ST>
struct HasEqualityCompare {
friend bool operator==(const ST &LHS, const ST &RHS) {
using BaseType = typename ST::BaseType;

return (static_cast<const BaseType &>(LHS) ==
static_cast<const BaseType &>(RHS));
}

friend bool operator!=(const ST &LHS, const ST &RHS) {
return !(LHS == RHS);
}
};

class Flag1Value : public StrongTypedef<Flag1Value, bool>,
public HasEqualityCompare<Flag1Value> {
public:
using StrongTypedef::StrongTypedef;
};

Flag1Value True(true);
Flag1Value False(false);

assert(True != False);
assert(True == True);

StrongTypedef can be used to create rudimentary units types:

class Bits : StrongTypedef<Bits, int64_t> {
public:
using StrongTypedef::StrongTypedef;
};

class Bytes : StrongTypedef<Bytes, int64_t> {
public:
using StrongTypedef::StrongTypedef;
};

Bits size(32);
Bytes alignment(4);

void bar(Bytes size, Bytes alignment);

bar(size, alignment); // Won't compile

Boolean parameters are so common that it's convenient to have a
special helper for them:

template<typename Tag>
class NamedBoolean : public StrongTypedef<Tag, bool>,
public HasEqualityCompare<NamedBoolean<Tag>>,
public HasLogicalConjunction<NamedBoolean<Tag>>,
public HasLogicalDisjunction<NamedBoolean<Tag>> {
public:
using StrongTypedef<Tag, bool>::StrongTypedef;
};

class Flag1Value : public NamedBoolean<Flag1Value> {
public:
using NamedBoolean::NamedBoolean;
};

It can also be useful to have utilities for strong typedefed counters
and accumulators:

template<typename Tag, typename Int = int64_t>
class NamedCounter : public StrongTypedef<Tag, Int>,
public HasEqualityCompare<NamedCounter<Tag>>,
public HasIncrement<NamedCounter<Tag>> { // ++
public:
using StrongTypedef<Tag, int64_t>::StrongTypedef;
};

template<typename Tag, typename Int = int64_t>
class NamedAccumulator : public StrongTypedef<Tag, Int>,
public HasEqualityCompare<NamedAccumulator<Tag>>,
public HasAccumulate<NamedAccumulator<Tag>> { // +=
public:
using StrongTypedef<Tag, int64_t>::StrongTypedef;
};

Proposal
--------
This RFC proposes nothing more than introduction of these templates
under include/llvm/ADT to make them available to developers who want
extra safety in their code. It specifically does not propose rewriting
existing uses of flags, counters, accumulators or any other values to
use these utilities, nor does it mandate their use in new code. Such
changes may be desirable but should happen incrementally over time.

References
----------
[1] https://foonathan.net/blog/2016/10/19/strong-typedefs.html
[2] http://llvm.1065342.n5.nabble.com/llvm-dev-RFC-changing-variable-naming-rules-in-LLVM-code
_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Doerfert, Johannes via llvm-dev

unread,

Aug 1, 2019, 12:41:18 PM8/1/19

to llvm...@lists.llvm.org, David Greene

+1, I like this a lot.

Recently, I started to use two element `enum class`es to force people (mostly me)
to spell out what true/false actually means in a certain context. The solution proposed
here comes with more boilerplate but once available it is nicer (I think) and for sure
more generic.

________________________________________
From: llvm-dev <llvm-dev...@lists.llvm.org> on behalf of David Greene via llvm-dev <llvm...@lists.llvm.org>
Sent: Thursday, August 1, 2019 10:57
To: llvm...@lists.llvm.org
Subject: [llvm-dev] RFC: Strong typedef for LLVM

Reid Kleckner via llvm-dev

unread,

Aug 1, 2019, 1:29:05 PM8/1/19

to Doerfert, Johannes, llvm...@lists.llvm.org, David Greene

On Thu, Aug 1, 2019 at 9:41 AM Doerfert, Johannes via llvm-dev <llvm...@lists.llvm.org> wrote:

+1, I like this a lot.

Recently, I started to use two element `enum class`es to force people (mostly me)
to spell out what true/false actually means in a certain context. The solution proposed
here comes with more boilerplate but once available it is nicer (I think) and for sure
more generic.

I agree, I think for bools, an enum class is the simplest language provided way to get most of the benefits that a strong typedef (or value wrapper class) would provide. I would prefer to keep using them for these cases, especially for widely used user facing APIs. The less publicly visible the function is (private method, file local static, etc), the more inclined I am to just keep doing what we're doing for expediency and brevity.

For integer values that are easy to confuse (physical register numbers, byte quantities, bit quantities, alignments, etc), I agree, we should be leveraging the type system more than we are. Adding a strong typedef class to make it easier to make these types with less boilerplate would be great. See existing prior art for this kind of stuff in clang::CharUnits.

I suppose I don't love the name "strong typedef", since it kind of implies a value judgement that C typedefs are weak, and therefore worse. They aren't worse, they just aren't well suited to this use case and have other strengths and weaknesses. I see that the C++ paper proposes calling these "opaque typedefs". I don't know if that's the best name, but I prefer that shade of paint. :)

David Greene via llvm-dev

unread,

Aug 1, 2019, 4:44:57 PM8/1/19

to Reid Kleckner, llvm...@lists.llvm.org

Reid Kleckner <r...@google.com> writes:

> On Thu, Aug 1, 2019 at 9:41 AM Doerfert, Johannes via llvm-dev <llvm...@lists.llvm.org> wrote:
>
> +1, I like this a lot.
>
> Recently, I started to use two element `enum class`es to force people (mostly me)
> to spell out what true/false actually means in a certain context. The solution proposed
> here comes with more boilerplate but once available it is nicer (I think) and for sure
> more generic.
>
> I agree, I think for bools, an enum class is the simplest language
> provided way to get most of the benefits that a strong typedef (or
> value wrapper class) would provide. I would prefer to keep using them
> for these cases, especially for widely used user facing APIs. The less
> publicly visible the function is (private method, file local static,
> etc), the more inclined I am to just keep doing what we're doing for
> expediency and brevity.

This all sounds fine. My implementation of NamedBoolean allows this
kind of thing:

struct AFlag : public NamedBoolean<AFlag> {
using NamedBoolean::NamedBoolean;
};

struct AnotherFlag : public NamedBoolean<AnotherFlag> {
using NamedBoolean::NamedBoolean;
};

AFlag flag1;
AnotherFlag flag2;

if (flag1 || flag2) { // No explicit casts needed
...
};

I don't think that's quite as convenient with `enum class` but maybe
that's a minor point. Both spellings might be useful in different
contexts.

> For integer values that are easy to confuse (physical register
> numbers, byte quantities, bit quantities, alignments, etc), I agree,
> we should be leveraging the type system more than we are. Adding a
> strong typedef class to make it easier to make these types with less
> boilerplate would be great. See existing prior art for this kind of
> stuff in clang::CharUnits.

Thanks for the pointer! It looks like CharUnits has more functionality
than the proposed classes. I intend for these classes to be as simple
as possible to handle common cases. I wouldn't want to see them grow to
implement all of the interfaces required by CharUnits.

> I suppose I don't love the name "strong typedef", since it kind of
> implies a value judgement that C typedefs are weak, and therefore
> worse. They aren't worse, they just aren't well suited to this use
> case and have other strengths and weaknesses. I see that the C++ paper
> proposes calling these "opaque typedefs". I don't know if that's the
> best name, but I prefer that shade of paint. :)

OTOH, naming it OpaqueTypedef might give users the mistaken impression
that all of the qualities guaranteed by that proposal are provided by
these utilities. That is almost certainly not the case in general
(equality of sizes, for example).

I also think "opaque" is sort of a strange name. The types aren't
opaque. You can do ordinary operations on them (overloaded operators,
etc.). I tend to think of an "opaque" type as something that I can't
really do anything with except pass to/from some black-box interfaces.

That said, I'm not married to the name. If there is some sort of
consensus around a different name that's great! I don't love the Named*
names but couldn't think of anything better.

-David

Nevin Liber via llvm-dev

unread,

Aug 2, 2019, 3:03:13 PM8/2/19

to llvm...@lists.llvm.org

On Thu, Aug 1, 2019 at 10:57 AM David Greene via llvm-dev <llvm...@lists.llvm.org> wrote:

template<typename Tag, typename T>
class StrongTypedef {
public:
using BaseType = T;

private:
BaseType Value;

public:
constexpr StrongTypedef() : Value() {}
constexpr StrongTypedef(const StrongTypedef<Tag, T> &) = default;
explicit constexpr StrongTypedef(const BaseType &V) : Value(V) {}
explicit StrongTypedef(BaseType &&V)
noexcept(std::is_nothrow_move_constructible<BaseType>::value) :
Value(std::move(V)) {}

public:
explicit operator BaseType&() noexcept {
return Value;
}

explicit operator const BaseType&() const noexcept {
return Value;
}

friend void swap(StrongTypedef &A, StrongTypedef &B) noexcept {
using std::swap;
swap(static_cast<BaseType&>(A), static_cast<BaseType&>(B));
}
};

I'm generally a fan of using the type system to encapsulate this kind of thing. A few points:

Why no move assignment operator?
swap has the wrong noexcept specification.
In my experience, most of these types end up being equality comparable and hashable. I understand that equality comparison is separate concern which can be provided by mixins, but the more boilerplate that is required to use it, the less likely people are to adopt it. This could, of course, be constrained on whether or not the BaseType supports these operations.
Similar argument about using CRTP, as it requires more boilerplate (the using StrongTypedef::StrongTypedef; statement to bring in the converting constructor).
Not a fan of the name StrongTypedef (or OpaqueTypedef) either. If the C++ standard ever acquires related functionality, there will end up being a conceptual mismatch.

--

Nevin ":-)" Liber <mailto:ne...@cplusplusguy.com> +1-847-691-1404

David Greene via llvm-dev

unread,

Aug 5, 2019, 11:29:46 AM8/5/19

to Nevin Liber via llvm-dev, Nevin Liber

Nevin Liber via llvm-dev <llvm...@lists.llvm.org> writes:

> I'm generally a fan of using the type system to encapsulate this kind of thing. A few points:
>

> * Why no move assignment operator?

It's a sketch. :) The actual code will have a lot more.

> * swap has the wrong noexcept specification.

Noted, thanks.

> * In my experience, most of these types end up being equality

> comparable and hashable. I understand that equality comparison is
> separate concern which can be provided by mixins, but the more
> boilerplate that is required to use it, the less likely people are to
> adopt it. This could, of course, be constrained on whether or not the
> BaseType supports these operations.

Let's talk about this in the design review when I post the patch. I'll
definitely add you as a reviewer.

> * Similar argument about using CRTP, as it requires more boilerplate

> (the using StrongTypedef::StrongTypedef; statement to bring in the
> converting constructor).

Ditto above, though without CRTP operator overloading becomes less
convenient.

> * Not a fan of the name StrongTypedef (or OpaqueTypedef) either. If

> the C++ standard ever acquires related functionality, there will end
> up being a conceptual mismatch.

What about "ExplicitWrapper?" Let's talk more when the patch is posted.

Thanks for your helpful feedback!

-David

Jeremy Lakeman via llvm-dev

unread,

Aug 5, 2019, 9:57:29 PM8/5/19

to David Greene, Nevin Liber via llvm-dev, Nevin Liber

On Tue, 6 Aug 2019 at 00:59, David Greene via llvm-dev <llvm...@lists.llvm.org> wrote:

> * Not a fan of the name StrongTypedef (or OpaqueTypedef) either. If
> the C++ standard ever acquires related functionality, there will end
> up being a conceptual mismatch.

What about "ExplicitWrapper?" Let's talk more when the patch is posted.

The syntax of using bool's is fine, but the semantics are hard to get right, so how about SemanticTypedef?

David Greene via llvm-dev

unread,

Aug 13, 2019, 3:42:09 PM8/13/19

to llvm...@lists.llvm.org

For anyone interested, I've posted a patch as D66148.

https://reviews.llvm.org/D66148

-David

Reply all

Reply to author

Forward