[llvm-dev] Modernizing LLVM Coding Style Guide and enforcing Clang-tidy

304 views
Skip to first unread message

Piotr Padlewski via llvm-dev

unread,
Dec 29, 2016, 7:24:44 AM12/29/16
to Clang Dev, llvm-dev
Hi everyone,
I would like to start a discussion about enforcing use of clang-tidy (or better clang-tidy-diff) on patches before sending it to review. 
I like how clang-format simplified sending patches and reviews, e.g. "Use clang-format" instead of giving long list of lines that should be formatted by reviewer.
I believe that clang-tidy can be also be very helpful here.
Note that by enforcing I mean the same way as we enforce clang-format - It is not about "every changed line need to be formatted by clang-format" because thee might be some special formatting that looks better 
formatted by hand, it is about saving time for 99.9% of the lines. The same applies to clang-tidy, where there might be some bugs or false positives, but I would like to save my time as reviewer to not mention 
things like "use auto", "use range loop", "pass by value" over and over. 

But before enforcing clang-tidy, we have to get to consensus on adding new rules to coding style.
I will list clang-tidy checks that I think should be used in LLVM. To make it short I will not justify why it is better to use this style because the documentation of checks is doing good job there.

List of checks that I would like to add to .clang-tidy config
modernize-*



I skipped some checks because they work only in C++14.
Coding style right now only mention using auto.
Loops are mentioned here:
http://llvm.org/docs/CodingStandards.html#don-t-evaluate-end-every-time-through-a-loop - but thi probably should be changed to "use for range loop if possible. Try to not evaluate end() multiple times if for range loop is not possible to use"


I don't know which of the listed changes sounds controversial to you, so if you would like to start discussion about one of them, then please also mention that you agree/disagree with the others.
One of the things that I would like to propose, is using push_back whenever the inserted type is the same or it implicitly casts itself. 

std::vector<Value*> v;
Instr *I;
Value *V;
IntrinsicInstr *II;
v.push_back(I);
v.push_back(V);
v.push_back(II);

Use emplace_back only if we insert temporary object like:

std::vector<SomeType> v2;
v2.emplace_back(a, b, c); // instead of v.push_back(SomeType(a, b, c));

The first reason is make code simple and more readable. I belive that code 'v.push_back(I)' is more readable than 'v.emplace_back(I)' because I don't have to think about if the element type has special constructor taking Instr, or it is just vector of Instr.
From yesterday discussion, Daniel Berlin proposed using emplace_back everywhere to make code easier to maintain. I think it is valid argument, but I believe it would reduce readability.
There is also other reason - emplace_back can't take arguments of some types like bitfields, NULL, static member. Using emplace can also lead to memory leak in case of smart ptrs
"
auto *ptr = new int(1);
v.push_back(std::unique_ptr<int>(ptr));

This is because replacing it with emplace_back could cause a leak of this pointer if emplace_back would throw exception before emplacement (e.g. not enough memory to add new element).".

In the case of push_back/emplace_back for the same type, there should be no performance changes, however emplace_back might generate more template instantiations slowing down compilation.

There is also topic of using insert/emplace for maps showing that map.emplace can be slower than map.insert. I would not want to distinguish between emplace for maps and emplace for vectors,

so I believe using emplace for constructed temporaries, even if there would be some slightly performance regression, looks simpler and more readable.



I am happy to change to write changes to LLVM coding standards document, but I firstly would like to see what community thinks about it.

NOTE that I don't propose running clang-tidy and modifying whole codebase. It might happen some day, but we firstly need to agree that these changes are better.

There are also other checks from performance and misc that I would like to enable. Some of them are already enabled (all misc), but because they are more about finding bugs, I would like to narrow discussion only to modernize.


Piotr

Daniel Berlin via llvm-dev

unread,
Dec 29, 2016, 8:54:49 AM12/29/16
to Piotr Padlewski, llvm-dev, Clang Dev

From yesterday discussion, Daniel Berlin proposed using emplace_back everywhere to make code easier to maintain. I think it is valid argument, but I believe it would reduce readability.

Just to be clear; I proposed not trying to switch uses back and forth without real data, and to come to some agreement about what should be written in the first place, preferably based on real data, and then, switching in some organized fashion, not just randomly :)

IE Either have a clear coding standard and enforce it, or leave uses alone one way or the other without some demonstration that it actually matters.

I would *personally* prefer to just use emplace_back everywhere, and not try to switch between the two, without demonstration of real benefit.

(i'd also accept the other way, use push_back everywhere, and not try to switch between the two, without demonstration of real benefit).

This preference is based on a simple truth: People suck at deciding when to use one or the other. Yes, i know how to use it and when to use it.  Lots of our types are cheaply movable, etc, so you probably won't see any performance difference even when using them "right".  People have enough style/etc issues without having to try to think hard about this.

So that isn't what I would necessarily propose for LLVM.

For LLVM, my answer would be "either make tool do what we want, force use of tool, or be consistent and obvious with what we want and then fix it if it leads to performance issue"


 
There is also other reason - emplace_back can't take arguments of some types like bitfields, NULL, static member. Using emplace can also lead to memory leak in case of smart ptrs
"
auto *ptr = new int(1);
v.push_back(std::unique_ptr<int>(ptr));

This is because replacing it with emplace_back could cause a leak of this pointer if emplace_back would throw exception before emplacement (e.g. not enough memory to add new element).".


This seems, IMHO, not a useful argument for LLVM since it does not try to use exception based error handling to recover.

 

Piotr Padlewski via llvm-dev

unread,
Dec 29, 2016, 10:07:19 AM12/29/16
to Daniel Berlin, llvm-dev, Clang Dev
2016-12-29 14:54 GMT+01:00 Daniel Berlin <dbe...@dberlin.org>:

From yesterday discussion, Daniel Berlin proposed using emplace_back everywhere to make code easier to maintain. I think it is valid argument, but I believe it would reduce readability.

Just to be clear; I proposed not trying to switch uses back and forth without real data, and to come to some agreement about what should be written in the first place, preferably based on real data, and then, switching in some organized fashion, not just randomly :)

IE Either have a clear coding standard and enforce it, or leave uses alone one way or the other without some demonstration that it actually matters.

I would *personally* prefer to just use emplace_back everywhere, and not try to switch between the two, without demonstration of real benefit.

(i'd also accept the other way, use push_back everywhere, and not try to switch between the two, without demonstration of real benefit).

This preference is based on a simple truth: People suck at deciding when to use one or the other. Yes, i know how to use it and when to use it.  Lots of our types are cheaply movable, etc, so you probably won't see any performance difference even when using them "right".  People have enough style/etc issues without having to try to think hard about this.

So that isn't what I would necessarily propose for LLVM.

For LLVM, my answer would be "either make tool do what we want, force use of tool, or be consistent and obvious with what we want and then fix it if it leads to performance issue"

Sure, that's why we have clang-tidy. I propose using push_back everywhere except when temporary is passed, where using emplace_back would simplify by not having to write type name.
modernize-use-emplace can find places like push_back(Type(a, b, c)) and transform it into emplace_back(a, b, c); 
I hand someone a patches for LLVM and clang that 
1) modifies every push_back to emplace_back
2) modifies every emplace_back to push_back (if it is valid)
3) modifies every push_back(Type(a, b)) to emplace_back(a, b);

I actually wanted to do this last time at google because you have some framework to test performance of clang, but I didn't have enough time.
I only have compared time of running tests vs running tests after 3), and the results were almost the same.
That's why I think that performance is not concern here.
So if someone with such framework would like to help me, then we could check if there are any performance wins.

 

 
There is also other reason - emplace_back can't take arguments of some types like bitfields, NULL, static member. Using emplace can also lead to memory leak in case of smart ptrs
"
auto *ptr = new int(1);
v.push_back(std::unique_ptr<int>(ptr));

This is because replacing it with emplace_back could cause a leak of this pointer if emplace_back would throw exception before emplacement (e.g. not enough memory to add new element).".


This seems, IMHO, not a useful argument for LLVM since it does not try to use exception based error handling to recover.

But there is still problem with initializer lists, bit-fields, static members and nulls. When I was testing this check it actually found many of these cases in LLVM (specially bit fields and initializer lists).

 

David Blaikie via llvm-dev

unread,
Dec 29, 2016, 12:35:00 PM12/29/16
to Piotr Padlewski, Clang Dev, llvm-dev
I'm a bit confused by this whole discussion.

clang-format is neither mandated (by documentation) nor enforced (by any infrastructure/automation) for use in the LLVM project that I know of.

It's convenient, and in review people may reasonably ask authors to consider running it, etc - but we have no system that requires or checks for that. Might be nice, might not be.

It sounds like even if clang-format were mandated - we mostly accept that such a mandate is forward-looking, and that we don't want/intend to reformat the entire existing codebase. So I imagine the same would apply to clang-tidy (we wouldn't expect to tidy the entire codebase - we'd just use clang-tidy as advisory for any new changes as we do with clang-format) - so I don't think it'd run up against Danny's protest about choosing a philosophy for wide scale changes, because there would be no wide scale changes.

All that said, I think like clang-format we should probably have a project-wide config that specifies which checks we care about - but the real impediment to adoption of those checks is a streamlined process for running them on a patch. clang-format has nice integration with my editor (vim) and source control system (git-clang-format). I have yet to discover an equally convenient place to put clang-tidy in my workflow. Without that I doubt it'll see very much adoption within the LLVM community.

- Dave

_______________________________________________
cfe-dev mailing list
cfe...@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

Mehdi Amini via llvm-dev

unread,
Dec 29, 2016, 1:05:16 PM12/29/16
to Daniel Berlin, llvm-dev, Clang Dev
On Dec 29, 2016, at 5:54 AM, Daniel Berlin <dbe...@dberlin.org> wrote:


From yesterday discussion, Daniel Berlin proposed using emplace_back everywhere to make code easier to maintain. I think it is valid argument, but I believe it would reduce readability.

Just to be clear; I proposed not trying to switch uses back and forth without real data, and to come to some agreement about what should be written in the first place, preferably based on real data, and then, switching in some organized fashion, not just randomly :)

IE Either have a clear coding standard and enforce it, or leave uses alone one way or the other without some demonstration that it actually matters.

I would *personally* prefer to just use emplace_back everywhere, and not try to switch between the two, without demonstration of real benefit.

(i'd also accept the other way, use push_back everywhere, and not try to switch between the two, without demonstration of real benefit).

This preference is based on a simple truth: People suck at deciding when to use one or the other. Yes, i know how to use it and when to use it.  Lots of our types are cheaply movable, etc, so you probably won't see any performance difference even when using them "right".  People have enough style/etc issues without having to try to think hard about this.

I agree that “people suck at deciding”, me in the first place in the line, and that’s why I like clear and good guideline and stick to it unless "demonstration of real benefit”.
I also think we can have clear guideline that are different from “always use emplace_back” or “always use push_back”, like Piotr is suggesting.

That said, I’m not convinced the relative “ugliness" of emplace_back  (I’m not sure what’s ugly there, I only see it as a habit to take)  vs push_back is enough to justify a set of rules to decide between the two, I’d be fine with “always using emplace_back” in the name of “simple rule is better when possible".


So that isn't what I would necessarily propose for LLVM.

For LLVM, my answer would be "either make tool do what we want, force use of tool, or be consistent and obvious with what we want and then fix it if it leads to performance issue”

+1
Ultimately whatever we do is not practical if it can’t be done by a tool.




 
There is also other reason - emplace_back can't take arguments of some types like bitfields, NULL, static member. Using emplace can also lead to memory leak in case of smart ptrs
"
auto *ptr = new int(1);
v.push_back(std::unique_ptr<int>(ptr));

This is because replacing it with emplace_back could cause a leak of this pointer if emplace_back would throw exception before emplacement (e.g. not enough memory to add new element).".


This seems, IMHO, not a useful argument for LLVM since it does not try to use exception based error handling to recover.

This, and also we should not write code like that (naked pointers, calling new), using make_unique<> for instance would prevent any such situation. Passing a raw pointer to a container of unique_ptr can be flagged by a tool.


— 
Mehdi

Sean Silva via llvm-dev

unread,
Dec 29, 2016, 2:01:18 PM12/29/16
to David Blaikie, llvm-dev, Clang Dev
On Thu, Dec 29, 2016 at 9:34 AM, David Blaikie via llvm-dev <llvm...@lists.llvm.org> wrote:
I'm a bit confused by this whole discussion.

clang-format is neither mandated (by documentation) nor enforced (by any infrastructure/automation) for use in the LLVM project that I know of.

It's convenient, and in review people may reasonably ask authors to consider running it, etc - but we have no system that requires or checks for that. Might be nice, might not be.

It sounds like even if clang-format were mandated - we mostly accept that such a mandate is forward-looking, and that we don't want/intend to reformat the entire existing codebase. So I imagine the same would apply to clang-tidy (we wouldn't expect to tidy the entire codebase - we'd just use clang-tidy as advisory for any new changes as we do with clang-format) - so I don't think it'd run up against Danny's protest about choosing a philosophy for wide scale changes, because there would be no wide scale changes.

All that said, I think like clang-format we should probably have a project-wide config that specifies which checks we care about - but the real impediment to adoption of those checks is a streamlined process for running them on a patch. clang-format has nice integration with my editor (vim) and source control system (git-clang-format). I have yet to discover an equally convenient place to put clang-tidy in my workflow. Without that I doubt it'll see very much adoption within the LLVM community.

+1

I don't remember there being a discussion analogous to the "enforcement" aspect of thread for clang-format. The uptake of clang-format was organic.

Piotr, what is different about the situation with clang-tidy (vs clang-format) that we need to have any discussion of "enforcement" in this thread? I think it's totally reasonable to have a discussion of which checks we want to run by default for the "LLVM style" (to use clang-format's terminology), and this thread should probably focus on that.

wrt adding rules to the coding standards, I don't think we want to clutter the coding standards document with a specific rule for every clang-tidy check we want to have on by default for the "LLVM style", the same way that we don't have a rule in the coding standards document for every clang-format config value which happens to be set in the "LLVM style" (i.e. all the stuff you see when you do -dump-config). Just the main ones is fine.

-- Sean Silva
 
_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev


David Blaikie via llvm-dev

unread,
Dec 29, 2016, 2:21:24 PM12/29/16
to Mehdi Amini, Daniel Berlin, llvm-dev, Clang Dev
On Thu, Dec 29, 2016 at 10:04 AM Mehdi Amini via cfe-dev <cfe...@lists.llvm.org> wrote:
On Dec 29, 2016, at 5:54 AM, Daniel Berlin <dbe...@dberlin.org> wrote:


From yesterday discussion, Daniel Berlin proposed using emplace_back everywhere to make code easier to maintain. I think it is valid argument, but I believe it would reduce readability.

Just to be clear; I proposed not trying to switch uses back and forth without real data, and to come to some agreement about what should be written in the first place, preferably based on real data, and then, switching in some organized fashion, not just randomly :)

IE Either have a clear coding standard and enforce it, or leave uses alone one way or the other without some demonstration that it actually matters.

I would *personally* prefer to just use emplace_back everywhere, and not try to switch between the two, without demonstration of real benefit.

(i'd also accept the other way, use push_back everywhere, and not try to switch between the two, without demonstration of real benefit).

This preference is based on a simple truth: People suck at deciding when to use one or the other. Yes, i know how to use it and when to use it.  Lots of our types are cheaply movable, etc, so you probably won't see any performance difference even when using them "right".  People have enough style/etc issues without having to try to think hard about this.

I agree that “people suck at deciding”, me in the first place in the line, and that’s why I like clear and good guideline and stick to it unless "demonstration of real benefit”.
I also think we can have clear guideline that are different from “always use emplace_back” or “always use push_back”, like Piotr is suggesting.

I don't think Piotr is suggesting "always use push_back" but "always use push_back when it's valid" & I'd second this.

To show a simpler example:

std::unique_ptr<T> u(v);

std::unique_ptr<T> u = v;

I'd generally advocate for using copy init (the second form - it doesn't copy in this case, it moves) over direct init (the first form) because it's more legible - copy init can only execute implicit ctor operations, whereas direct init can invoke implicit and explicit operations. So from a readability perspective seeing the latter is easier than seing the former - I know the operation is further constrained to simpler/less interesting operations (eg: 'v' can't be a raw pointer in the second form, it might be (& then I have to worry about whether that's an ownership transfer that's intended), etc)

& push_back V emplace_back is the same choice - push_back means only implicit ops are used and so it's more readable, emplace_back requires more careful attention.

I think this is a reasonably good stylistic rule - but I'm happy leaving an open to judgment for sure - there might be readability reasons that an emplace_back may work better in some cases.

I'd think of this like "else after return" - we don't have any enforcement, sometimes we make judgment about it being better (for consistency - sometimes there's a pattern such that else makes the code more readable), we haven't gone back to do any cleanup of the codebase, but it's pretty consistently applied/desired in code review.

All that said - sure, I'd like to have tools to help me get this right (else after return and "prefer copy init style over direct init") & make that really easy to do.

- Dave
 

That said, I’m not convinced the relative “ugliness" of emplace_back  (I’m not sure what’s ugly there, I only see it as a habit to take)  vs push_back is enough to justify a set of rules to decide between the two, I’d be fine with “always using emplace_back” in the name of “simple rule is better when possible".


So that isn't what I would necessarily propose for LLVM.

For LLVM, my answer would be "either make tool do what we want, force use of tool, or be consistent and obvious with what we want and then fix it if it leads to performance issue”

+1
Ultimately whatever we do is not practical if it can’t be done by a tool.

Not sure I follow this - we have lots of things we don't currently do with a tool that's still part of our style & stuff we try to catch with code review, etc. (else after return is my best example)
 




 
There is also other reason - emplace_back can't take arguments of some types like bitfields, NULL, static member. Using emplace can also lead to memory leak in case of smart ptrs
"
auto *ptr = new int(1);
v.push_back(std::unique_ptr<int>(ptr));

This is because replacing it with emplace_back could cause a leak of this pointer if emplace_back would throw exception before emplacement (e.g. not enough memory to add new element).".


This seems, IMHO, not a useful argument for LLVM since it does not try to use exception based error handling to recover.

This, and also we should not write code like that (naked pointers, calling new), using make_unique<> for instance would prevent any such situation. Passing a raw pointer to a container of unique_ptr can be flagged by a tool.

Sure - but it generalizes further than that, it's just a nice simple example.

std::vector<int> v(i);

'i' could be an int & this would make a vector of that many ints, or it could be another vector & this is making a copy, etc.

std::vector<int> v = i;

I know taht's not creating a vector of 'i' many ints.

Yeah, not a great example either - but the general idea applies, the latter form is less powerful & so is easier to read because you don't have to worry/think about it as much.

I think we mostly do this sort of thing out of habit anyway - but codifying it (& if possible, tools to help) could be nice.

 


— 
Mehdi

Mehdi Amini via llvm-dev

unread,
Dec 29, 2016, 2:46:53 PM12/29/16
to David Blaikie, llvm-dev, Clang Dev
On Dec 29, 2016, at 11:20 AM, David Blaikie <dbla...@gmail.com> wrote:



On Thu, Dec 29, 2016 at 10:04 AM Mehdi Amini via cfe-dev <cfe...@lists.llvm.org> wrote:
On Dec 29, 2016, at 5:54 AM, Daniel Berlin <dbe...@dberlin.org> wrote:


From yesterday discussion, Daniel Berlin proposed using emplace_back everywhere to make code easier to maintain. I think it is valid argument, but I believe it would reduce readability.

Just to be clear; I proposed not trying to switch uses back and forth without real data, and to come to some agreement about what should be written in the first place, preferably based on real data, and then, switching in some organized fashion, not just randomly :)

IE Either have a clear coding standard and enforce it, or leave uses alone one way or the other without some demonstration that it actually matters.

I would *personally* prefer to just use emplace_back everywhere, and not try to switch between the two, without demonstration of real benefit.

(i'd also accept the other way, use push_back everywhere, and not try to switch between the two, without demonstration of real benefit).

This preference is based on a simple truth: People suck at deciding when to use one or the other. Yes, i know how to use it and when to use it.  Lots of our types are cheaply movable, etc, so you probably won't see any performance difference even when using them "right".  People have enough style/etc issues without having to try to think hard about this.

I agree that “people suck at deciding”, me in the first place in the line, and that’s why I like clear and good guideline and stick to it unless "demonstration of real benefit”.
I also think we can have clear guideline that are different from “always use emplace_back” or “always use push_back”, like Piotr is suggesting.

I don't think Piotr is suggesting "always use push_back" but "always use push_back when it's valid" & I'd second this.

Define “valid”? Just that it will compile?

To show a simpler example:

std::unique_ptr<T> u(v);

std::unique_ptr<T> u = v;

I'd generally advocate for using copy init (the second form - it doesn't copy in this case, it moves) over direct init (the first form) because it's more legible - copy init can only execute implicit ctor operations, whereas direct init can invoke implicit and explicit operations. So from a readability perspective seeing the latter is easier than seing the former - I know the operation is further constrained to simpler/less interesting operations (eg: 'v' can't be a raw pointer in the second form, it might be (& then I have to worry about whether that's an ownership transfer that's intended), etc)

& push_back V emplace_back is the same choice - push_back means only implicit ops are used and so it's more readable

I don’t see what’s more readable about “only implicit ctor”. Emplace is very explicit that we intended to construct an object, I don’t understand what hurts readability here?


, emplace_back requires more careful attention.

I think this is a reasonably good stylistic rule - but I'm happy leaving an open to judgment for sure - there might be readability reasons that an emplace_back may work better in some cases.

I'd think of this like "else after return" - we don't have any enforcement, sometimes we make judgment about it being better (for consistency - sometimes there's a pattern such that else makes the code more readable), we haven't gone back to do any cleanup of the codebase, but it's pretty consistently applied/desired in code review.

I have a different impression: we are actively cleaning the codebase with tidy-checks. For instance look for `git log --author=Zelenko`.
Another example is that a few months ago the entire LLDB codebase has been formatted with clang-format, after marking the specific places where it was not desirable (tables for instance) with comment to disable clang-format.

— 
Mehdi

David Blaikie via llvm-dev

unread,
Dec 29, 2016, 3:03:43 PM12/29/16
to Mehdi Amini, llvm-dev, Clang Dev
On Thu, Dec 29, 2016 at 11:46 AM Mehdi Amini <mehdi...@apple.com> wrote:
On Dec 29, 2016, at 11:20 AM, David Blaikie <dbla...@gmail.com> wrote:



On Thu, Dec 29, 2016 at 10:04 AM Mehdi Amini via cfe-dev <cfe...@lists.llvm.org> wrote:
On Dec 29, 2016, at 5:54 AM, Daniel Berlin <dbe...@dberlin.org> wrote:


From yesterday discussion, Daniel Berlin proposed using emplace_back everywhere to make code easier to maintain. I think it is valid argument, but I believe it would reduce readability.

Just to be clear; I proposed not trying to switch uses back and forth without real data, and to come to some agreement about what should be written in the first place, preferably based on real data, and then, switching in some organized fashion, not just randomly :)

IE Either have a clear coding standard and enforce it, or leave uses alone one way or the other without some demonstration that it actually matters.

I would *personally* prefer to just use emplace_back everywhere, and not try to switch between the two, without demonstration of real benefit.

(i'd also accept the other way, use push_back everywhere, and not try to switch between the two, without demonstration of real benefit).

This preference is based on a simple truth: People suck at deciding when to use one or the other. Yes, i know how to use it and when to use it.  Lots of our types are cheaply movable, etc, so you probably won't see any performance difference even when using them "right".  People have enough style/etc issues without having to try to think hard about this.

I agree that “people suck at deciding”, me in the first place in the line, and that’s why I like clear and good guideline and stick to it unless "demonstration of real benefit”.
I also think we can have clear guideline that are different from “always use emplace_back” or “always use push_back”, like Piotr is suggesting.

I don't think Piotr is suggesting "always use push_back" but "always use push_back when it's valid" & I'd second this.

Define “valid”? Just that it will compile?

Yep
 

To show a simpler example:

std::unique_ptr<T> u(v);

std::unique_ptr<T> u = v;

I'd generally advocate for using copy init (the second form - it doesn't copy in this case, it moves) over direct init (the first form) because it's more legible - copy init can only execute implicit ctor operations, whereas direct init can invoke implicit and explicit operations. So from a readability perspective seeing the latter is easier than seing the former - I know the operation is further constrained to simpler/less interesting operations (eg: 'v' can't be a raw pointer in the second form, it might be (& then I have to worry about whether that's an ownership transfer that's intended), etc)

& push_back V emplace_back is the same choice - push_back means only implicit ops are used and so it's more readable

I don’t see what’s more readable about “only implicit ctor”.

It limits the set of operations that can be performed by the code. So when I read it there's less I have to think about/consider. (& specifically, the implicit operations tend to be simpler/safer/more obvious - copy/move or operations that are similar - not complex/interesting things like "taking ownership from a raw pointer" or "creating a vector of N elements")
 
Emplace is very explicit that we intended to construct an object, I don’t understand what hurts readability here?

Going back to the example above, given the following two lines:

std::unique_ptr<T> u(foo());
std::unique_ptr<T> u = foo();

(& the equivalent: emplace_back(foo()) V push_back(foo()) for a vector of unique_ptr)

the copy init/push_back are simpler to read because they aren't as powerful - I don't have to wonder if something is taking ownership there (or if I'm creating a vector of N ints, etc). I know it's a simple/obvious operation, generally (because others shouldn't be implicit).
 


, emplace_back requires more careful attention.

I think this is a reasonably good stylistic rule - but I'm happy leaving an open to judgment for sure - there might be readability reasons that an emplace_back may work better in some cases.

I'd think of this like "else after return" - we don't have any enforcement, sometimes we make judgment about it being better (for consistency - sometimes there's a pattern such that else makes the code more readable), we haven't gone back to do any cleanup of the codebase, but it's pretty consistently applied/desired in code review.

I have a different impression: we are actively cleaning the codebase with tidy-checks.

I think that's where Danny is pushing back - I'm on the fence about that. So I'd push back on the automated cleanup (& have done so on several occasions as these sort of patches have been sent) & encourage efforts to provide the tools (like clang-tidy integration for patches and editors) to avoid mistakes going forward in an advisory (rather than mandatory) way.
 
For instance look for `git log --author=Zelenko`.
Another example is that a few months ago the entire LLDB codebase has been formatted with clang-format, after marking the specific places where it was not desirable (tables for instance) with comment to disable clang-format.

*nod* Only because it was so far out of LLVM's style (by design originally). LLVM's generally "close enough" that the cleanup isn't desired/intended.

I'd say the same for this particular instance - implicit V explicit ctor calls. Mostly we write them using copy init, not direct init. I suspect push_back/emplace_back is similar (if only due to history - and perhaps due to my/other code review feedback encouraging push_back when possible in reviews that might otherwise be going to more emplace_back) so I'm not sure I care too much about the cleanup.

I think the issue Danny was getting at - which I agree - is unless we have good tools in place to not make these mistakes again going forward (I think clang-format has sort of reached that point - it's got easy integration in editors and source control systems) then there's limited merit in doing the cleanup. +1 to that.

If we have good integration for clang-tidy, then I'd be more OK with doing cleanup.

Either way (good or not integration) - if we have some people who have ways of using clang-tidy in their development process, I'd say it's worth having a project-wide clang-tidy config. I'd vote for putting "use implicit-only ops (like copy init and push_back) over explicit ops (like direct init and emplace_back) where both compile" in that list.

- Dave
 

Chandler Carruth via llvm-dev

unread,
Dec 29, 2016, 5:04:26 PM12/29/16
to David Blaikie, Mehdi Amini, llvm-dev, Clang Dev
Somewhat unfortunately, we have two discussions here:

- Clang-tidy has checks that might improve code -- should we deploy them? If so which?

I'll address this in a separate email, and focus this one on:

- Should we have coding standards around 'push_back' and 'emplace_back', and if so, what should they say?

I think the amount of confusion makes a coding standard useful.

As for what it should say, I tend to agree with Dave here. In particular:

This is exactly where I come down as well.

Another useful way to think about it is "what do I need to understand to understand the semantics of this operation".

In order to understand push_back, I need only read its documentation. The type going in will have to have value semantics (potentially with moves). Otherwise it will be an error. And push_back's documentation says what it does.

In order to understand a given emplace_back call I have to *both* read its documentation and the documentation for all of the constructors on the type.


Still another way to see the consequence of this is to look at the nature of compiler errors when a programmer makes a mistake.

With emplace_back, if you fail to call the constructor correctly, all of the error messages come with a layer (or many layers) of template instantiation. With push_back(T(...)), the constructor call is direct and the error messages are as well.

With emplace_back, if you are converting from one type to another and the conversion fails, you again get template backtraces. With push_back, all the conversions happen before the push_back method and so the error is local to your code.

Chandler Carruth via llvm-dev

unread,
Dec 29, 2016, 5:20:31 PM12/29/16
to Chandler Carruth, David Blaikie, Mehdi Amini, llvm-dev, Clang Dev
Dave pointed out that I didn't complete one aspect of my argument on the push_back vs. emplace_back:

On Thu, Dec 29, 2016 at 2:04 PM Chandler Carruth <chan...@gmail.com> wrote:
Still another way to see the consequence of this is to look at the nature of compiler errors when a programmer makes a mistake.

With emplace_back, if you fail to call the constructor correctly, all of the error messages come with a layer (or many layers) of template instantiation. With push_back(T(...)), the constructor call is direct and the error messages are as well.

With emplace_back, if you are converting from one type to another and the conversion fails, you again get template backtraces. With push_back, all the conversions happen before the push_back method and so the error is local to your code.

This in turn makes me prefer push_back(T(...)) over empalce_back(...). I *like* the constructor being called explicitly in the users code if is an *explicit* constructor. For things that should be implicit, they already are with push_back. If the API of the type went out of its way to make a constructor call explicit, I'd like to see the user code actually calling it.


The consequence of this is very simple guidelines for programmers to: don't use emplace_back unless you *must* use it. So for containers of non-copyable and non-movable types, it can be very useful and important. But in every other case I would rather see push_back (and if an explicit constructor call is necessary, an explicit constructor call).

My two cents.
-Chandler

Piotr Padlewski via llvm-dev

unread,
Dec 30, 2016, 5:08:41 AM12/30/16
to Chandler Carruth, llvm-dev, Clang Dev
Thanks for very accurate responses.
- I totally agree what Dave and Chandler said about explicit and implicit operations, this is what I meant my first email.
  I believe there are places like 
    v.emplace_back(A, B);
  istead of
    v.push_back(make_pair(A, B));b
  That can make code simpler. I think in cases like this we can leave it for judgement of contributor.
  Having said that I think the case Chandler and Dave mentioned should be in LLVM style guide.
  @Daniel does this argument seem good enough to prefer push_back instead of emplace_back?

- About clang-tidy config (Dave): 
  We already have clang-tidy config in LLVM and clang - check ".clang-tidy" file. I feel tho that people don't use clang-tidy as often as clang-format. 
  This might be because clang-tidy didn't have enough features that they would like. I believe that clang-tidy is doing much better and could really save a lot of time for reviewers and contributors.
  But to make it happen we either have to change LLVM style guide, so we won't need to discuss it over and over in reviews, or gain clang-format trust - it is so good people don't question it.

- About enforcing (Sean)
  Sorry, I used wrong word here. I meant to use it the same way as clang-format, to trust people they send cleaned code by clang-tidy the same way we trust them that they used clang-format.
  Reviewer can then ask contributor to run clang-tidy instead of listing things that contributor should change. We are a little far from having mandatory running of clang-format and clang-tidy on every patch automatically.

- About workflow (Dave)
  If I remember correctly there is some integration for clang-tidy for vim (YCM?). There is also clang-tidy-diff, but maybe we should have git-clang-tidy to make it even simpler.


Chandler Carruth via llvm-dev

unread,
Dec 30, 2016, 5:34:24 AM12/30/16
to Piotr Padlewski, Chandler Carruth, llvm-dev, Clang Dev
On Fri, Dec 30, 2016 at 2:08 AM Piotr Padlewski via cfe-dev <cfe...@lists.llvm.org> wrote:
Thanks for very accurate responses.
- I totally agree what Dave and Chandler said about explicit and implicit operations, this is what I meant my first email.
  I believe there are places like 
    v.emplace_back(A, B);
  istead of
    v.push_back(make_pair(A, B));b
  That can make code simpler.

Do you have examples? The only ones i can come up with are the ones where the push_back variant literally can't compile because the type isn't movable.

Perhaps it would be useful to break down categories of can happen here...

Case 1: there is one object already -- this is a *conversion* of a type.
- If the author of the conversion made it *implicit*, then 'v.push_back(x)' just works.
- If the author of the conversion made it *explicit* I would like to see the name of the type explicitly: 'v.push_back(T(x))'.

Case 2a: There is a collection of objects that are being composed into an aggregate. We don't have any interesting logic in the constructor, it takes an initializer list.
- This should work with 'v.push_back({a, b, c})'
- If it doesn't today, we can fix the type's constructors so that it does.
- Using 'emplace_back' doesn't help much -- you still need {}s to form the std::initializer_list in many cases. Pair and tuple are somewhat unusual in not requiring them.

Case 2b: A specific constructor needs to be called with an argument list. These arguments are not merely being aggregated but are inputs to a constructor that contains logic.
- This is analogous to a case called out w.r.t. '{...}' syntax in the coding standards[1]
- Similar to that rule, I would like to see a *call to the constructor* rather than hiding it behind 'emplace_back' as this is a function with interesting logic.
- That means i would write T(a, b, c) anyways, and 'v.push_back(T(a, b, c))' works.


Case 3: Passing objects of type 'T' through 'push_back' fails to compile because they cannot be copied or moved.
- You *must* use 'emplace_back' here. No argument (obviously).

My experience with LLVM code and other codebases is that case 3 should be extremely rare. The intersection of "types that cannot be moved or copied" and "types that you put into containers" is typically small.


Anyways, I don't disagree with this point with a tiny fix:
I think in cases like this we can leave it for judgement of contributor.
*or reviewer*. ;]

I continue to think exceptions can be made in rare cases when folks have good reasons. But I expect this to be quite rare. =]

Piotr Padlewski via llvm-dev

unread,
Dec 30, 2016, 6:27:32 AM12/30/16
to Chandler Carruth, llvm-dev, Clang Dev
2016-12-30 11:34 GMT+01:00 Chandler Carruth <chan...@gmail.com>:
On Fri, Dec 30, 2016 at 2:08 AM Piotr Padlewski via cfe-dev <cfe...@lists.llvm.org> wrote:
Thanks for very accurate responses.
- I totally agree what Dave and Chandler said about explicit and implicit operations, this is what I meant my first email.
  I believe there are places like 
    v.emplace_back(A, B);
  istead of
    v.push_back(make_pair(A, B));b
  That can make code simpler.

Do you have examples? The only ones i can come up with are the ones where the push_back variant literally can't compile because the type isn't movable.

Perhaps it would be useful to break down categories of can happen here...

Case 1: there is one object already -- this is a *conversion* of a type.
- If the author of the conversion made it *implicit*, then 'v.push_back(x)' just works.
- If the author of the conversion made it *explicit* I would like to see the name of the type explicitly: 'v.push_back(T(x))'.

Case 2a: There is a collection of objects that are being composed into an aggregate. We don't have any interesting logic in the constructor, it takes an initializer list.
- This should work with 'v.push_back({a, b, c})'
- If it doesn't today, we can fix the type's constructors so that it does.
- Using 'emplace_back' doesn't help much -- you still need {}s to form the std::initializer_list in many cases. Pair and tuple are somewhat unusual in not requiring them.

 This sounds extremely reasonable. 
Case 2b: A specific constructor needs to be called with an argument list. These arguments are not merely being aggregated but are inputs to a constructor that contains logic.
- This is analogous to a case called out w.r.t. '{...}' syntax in the coding standards[1]
- Similar to that rule, I would like to see a *call to the constructor* rather than hiding it behind 'emplace_back' as this is a function with interesting logic.
- That means i would write T(a, b, c) anyways, and 'v.push_back(T(a, b, c))' works.

Calling emplace_back with 0 or multiple arguments is a clear way of saying "this constructor takes multiple arguments".
We can do it with initializer list with easy way like:
v.emplace_back()        == v.push_back({})
v.emplace_back(a, b ,c) == v.push_back({a, b, c})

I personally never liked the initializer syntax because of tricky casees like:

vector<string> v{{"abc", "def"}};
Which is equivalent of 
vector<string> v = {std::string("abc", "def")};
That will call std::string ctor with 2 iterators likely crashing, and putting same string might gives us empty string.

In this case programmer probably meant
std::vector<std:string> v({"abc", "def"});
or
std::vector<std::string> v = {"abc", "def"};

But this case is not possible to mess up with push_back (in the case of vector<vector<string>> or something). At least I hope it is not.
So avoiding braces is my personal preference. It is fine for me if we would choose to prefer 'v.push_back({a, b, c})' instead of 'v.emplace_back(a, b, c)', ofc as long as most of the community would prefer first form to the second :)



Case 3: Passing objects of type 'T' through 'push_back' fails to compile because they cannot be copied or moved.
- You *must* use 'emplace_back' here. No argument (obviously).

My experience with LLVM code and other codebases is that case 3 should be extremely rare. The intersection of "types that cannot be moved or copied" and "types that you put into containers" is typically small.


Anyways, I don't disagree with this point with a tiny fix:
I think in cases like this we can leave it for judgement of contributor.
*or reviewer*. ;]

I continue to think exceptions can be made in rare cases when folks have good reasons. But I expect this to be quite rare. =]

Piotr

Piotr Padlewski via llvm-dev

unread,
Jan 9, 2017, 9:17:51 AM1/9/17
to Chandler Carruth, llvm-dev, Clang Dev
Are there any other comments about changing style guide?
I would like to add points like

- prefer "using' instead of "typedef"
- use default member initialization
struct A {
  void *ptr = nullptr;
};

(instead of doing it in constructor)

- use default, override, delete
- skip "virtual" with override

The last point is to get to consensus with 

push_back({first, second}) 
or
emplace_back(first ,second);

Renato Golin via llvm-dev

unread,
Jan 9, 2017, 10:15:34 AM1/9/17
to Piotr Padlewski, llvm-dev, Clang Dev
On 9 January 2017 at 14:17, Piotr Padlewski via cfe-dev

<cfe...@lists.llvm.org> wrote:
> - prefer "using' instead of "typedef"
> - use default member initialization
> - use default, override, delete
> - skip "virtual" with override

I thought we had all of those already...


> The last point is to get to consensus with
>
> push_back({first, second})
> or
> emplace_back(first ,second);

I second Chandler's opinion on this.

cheers,
--renato

Piotr Padlewski via llvm-dev

unread,
Jan 9, 2017, 10:25:39 AM1/9/17
to Renato Golin, llvm-dev, Clang Dev
2017-01-09 16:15 GMT+01:00 Renato Golin <renato...@linaro.org>:
On 9 January 2017 at 14:17, Piotr Padlewski via cfe-dev
<cfe...@lists.llvm.org> wrote:
> - prefer "using' instead of "typedef"
> - use default member initialization
> - use default, override, delete
> - skip "virtual" with override

I thought we had all of those already...

Nope, some people use it, but I still see a lot of new code with typedefs.
I would like to have it written in style guide so it will be easier to convince to change in review.
 

> The last point is to get to consensus with
>
> push_back({first, second})
> or
> emplace_back(first ,second);

I second Chandler's opinion on this.

2:1 for push_back. From commits I also see that most of the people prefer push_back, but I will wait a little bit with the verdict.
 

cheers,
--renato

David Blaikie via llvm-dev

unread,
Jan 9, 2017, 11:24:29 AM1/9/17
to Piotr Padlewski, Chandler Carruth, llvm-dev, Clang Dev
On Mon, Jan 9, 2017 at 6:17 AM Piotr Padlewski via cfe-dev <cfe...@lists.llvm.org> wrote:
Are there any other comments about changing style guide?
I would like to add points like

- prefer "using' instead of "typedef"
- use default member initialization
struct A {
  void *ptr = nullptr;
};

(instead of doing it in constructor)

- use default, override, delete
- skip "virtual" with override

The last point is to get to consensus with 

push_back({first, second}) 
or
emplace_back(first ,second);


It might be a bit noisy, but I'd be inclined to start a separate thread for each of these on llvm-dev with a clear subject line relevant to each one. So they don't get lost and some of them don't get drowned out by the discussion of others, etc.
 

Piotr Padlewski via llvm-dev

unread,
Jan 9, 2017, 1:10:27 PM1/9/17
to David Blaikie, llvm-dev, Clang Dev
2017-01-09 17:24 GMT+01:00 David Blaikie <dbla...@gmail.com>:


On Mon, Jan 9, 2017 at 6:17 AM Piotr Padlewski via cfe-dev <cfe...@lists.llvm.org> wrote:
Are there any other comments about changing style guide?
I would like to add points like

- prefer "using' instead of "typedef"
- use default member initialization
struct A {
  void *ptr = nullptr;
};

(instead of doing it in constructor)

- use default, override, delete
- skip "virtual" with override

The last point is to get to consensus with 

push_back({first, second}) 
or
emplace_back(first ,second);


It might be a bit noisy, but I'd be inclined to start a separate thread for each of these on llvm-dev with a clear subject line relevant to each one. So they don't get lost and some of them don't get drowned out by the discussion of others, etc.
 
Sure, I can do it, but at least now I don't see much of attention of people to any of the subject, so I would not like to start 5 empty threads.
I guess as long as the thread is not getting noisy I will keep only one. Does it sound ok?

David Blaikie via llvm-dev

unread,
Jan 9, 2017, 1:16:27 PM1/9/17
to Piotr Padlewski, llvm-dev, Clang Dev
On Mon, Jan 9, 2017 at 10:10 AM Piotr Padlewski <piotr.p...@gmail.com> wrote:
2017-01-09 17:24 GMT+01:00 David Blaikie <dbla...@gmail.com>:


On Mon, Jan 9, 2017 at 6:17 AM Piotr Padlewski via cfe-dev <cfe...@lists.llvm.org> wrote:
Are there any other comments about changing style guide?
I would like to add points like

- prefer "using' instead of "typedef"
- use default member initialization
struct A {
  void *ptr = nullptr;
};

(instead of doing it in constructor)

- use default, override, delete
- skip "virtual" with override

The last point is to get to consensus with 

push_back({first, second}) 
or
emplace_back(first ,second);


It might be a bit noisy, but I'd be inclined to start a separate thread for each of these on llvm-dev with a clear subject line relevant to each one. So they don't get lost and some of them don't get drowned out by the discussion of others, etc.
 
Sure, I can do it, but at least now I don't see much of attention of people to any of the subject, so I would not like to start 5 empty threads.
I guess as long as the thread is not getting noisy I will keep only one. Does it sound ok?

Perhaps - if no one else pipes up *shrug*

If that's the case, I'd at least suggest submitting the changes to the style guide separately with clear subject/titles so people reading commits might see/notice/discuss there.

FWIW: I'm also in favor of push_back where valid. Though I'm not sure it's so much a matter of votes, but justification, etc. As for the others - sure, they all sound good to me.

Also - once these are in the style guide there's still a separate discussion to be had about whether automated cleanup is worthwhile & how best to go about that sort of thing.
 

Reid Kleckner via llvm-dev

unread,
Jan 9, 2017, 1:20:40 PM1/9/17
to Piotr Padlewski, llvm-dev, Clang Dev
On Mon, Jan 9, 2017 at 7:25 AM, Piotr Padlewski via llvm-dev <llvm...@lists.llvm.org> wrote:
2017-01-09 16:15 GMT+01:00 Renato Golin <renato...@linaro.org>:
On 9 January 2017 at 14:17, Piotr Padlewski via cfe-dev
<cfe...@lists.llvm.org> wrote:
> - prefer "using' instead of "typedef"
> - use default member initialization
> - use default, override, delete
> - skip "virtual" with override

I thought we had all of those already...

Nope, some people use it, but I still see a lot of new code with typedefs.
I would like to have it written in style guide so it will be easier to convince to change in review.

The last two are enforced by compiler warnings now. The second is hard because of bitfields.

I object to the first. If you need a new type name, use a typedef. It's time honored and everyone, including C programmers, will know what you're doing. I don't understand why people push the new thing just for the sake of new-ness. 

Piotr Padlewski via llvm-dev

unread,
Jan 9, 2017, 1:24:26 PM1/9/17
to David Blaikie, llvm-dev, Clang Dev
2017-01-09 19:16 GMT+01:00 David Blaikie <dbla...@gmail.com>:


On Mon, Jan 9, 2017 at 10:10 AM Piotr Padlewski <piotr.p...@gmail.com> wrote:
2017-01-09 17:24 GMT+01:00 David Blaikie <dbla...@gmail.com>:


On Mon, Jan 9, 2017 at 6:17 AM Piotr Padlewski via cfe-dev <cfe...@lists.llvm.org> wrote:
Are there any other comments about changing style guide?
I would like to add points like

- prefer "using' instead of "typedef"
- use default member initialization
struct A {
  void *ptr = nullptr;
};

(instead of doing it in constructor)

- use default, override, delete
- skip "virtual" with override

The last point is to get to consensus with 

push_back({first, second}) 
or
emplace_back(first ,second);


It might be a bit noisy, but I'd be inclined to start a separate thread for each of these on llvm-dev with a clear subject line relevant to each one. So they don't get lost and some of them don't get drowned out by the discussion of others, etc.
 
Sure, I can do it, but at least now I don't see much of attention of people to any of the subject, so I would not like to start 5 empty threads.
I guess as long as the thread is not getting noisy I will keep only one. Does it sound ok?

Perhaps - if no one else pipes up *shrug*

If that's the case, I'd at least suggest submitting the changes to the style guide separately with clear subject/titles so people reading commits might see/notice/discuss there.

Good idea.
 
FWIW: I'm also in favor of push_back where valid. Though I'm not sure it's so much a matter of votes, but justification, etc. As for the others - sure, they all sound good to me.

I totally agree with all the points Chandler pointed out except the 2.b (multiple or no arugments). I just prefer emplace_back, but I find the push_back({}) also clean. So I guess small voting would be a good way to figure out what people prefer here.
 
Also - once these are in the style guide there's still a separate discussion to be had about whether automated cleanup is worthwhile & how best to go about that sort of thing.
 
That's right, I am not proposing any automatic cleanups (yet).

Mehdi Amini via llvm-dev

unread,
Jan 9, 2017, 1:25:40 PM1/9/17
to Reid Kleckner, llvm-dev, Clang Dev
`using` handles strictly more cases than  `typedef`, in particular partial specialization of templates. So because we’ll end up with `using` anyway, uniformity can be valuable. So that could be a motivation: since `using` is needed anyway, might be better to just use it always (I’m not saying it is a “strong” motivation though, just some piece of rational). 
(I also find that typedef of function pointers in particular are terrible, and `using` is much better for these, but that can be a matter of taste).

Mehdi

Renato Golin via llvm-dev

unread,
Jan 9, 2017, 1:29:56 PM1/9/17
to Reid Kleckner, llvm-dev, Clang Dev
On 9 January 2017 at 18:20, Reid Kleckner <r...@google.com> wrote:
> I object to the first. If you need a new type name, use a typedef. It's time
> honored and everyone, including C programmers, will know what you're doing.
> I don't understand why people push the new thing just for the sake of
> new-ness.

The text read "prefer", which I think it's fair. I'm opposed to any
hard rule for the sake of new-ness, consistency, or personal
preference.

If we encode all nit-rules as "prefer", then it should be more like a
guideline than a rule book, which has been our motto for everything
else.

Piotr Padlewski via llvm-dev

unread,
Jan 9, 2017, 1:32:33 PM1/9/17
to Mehdi Amini, llvm-dev, Clang Dev
+1 Exactly this. 
I don't think C programmer will not understand using. The "=" makes it much simpler to read, even if it is the first time you see it, which is not the case of typedef. 

typedef MyType::NestedType (*fptr)(const MyOhterType&);
or
using fptr = MyType::NestedType (*)(const MyOhterType&);

Typedefs with function pointers are used in couple of places in LLVM and I find it terible to read.

So it is not about new-ness. Trust me, I would never use typedef if using was first :)




Piotr Padlewski via llvm-dev

unread,
Jan 9, 2017, 1:41:06 PM1/9/17
to Reid Kleckner, llvm-dev, Clang Dev

2017-01-09 19:20 GMT+01:00 Reid Kleckner <r...@google.com>:
On Mon, Jan 9, 2017 at 7:25 AM, Piotr Padlewski via llvm-dev <llvm...@lists.llvm.org> wrote:
2017-01-09 16:15 GMT+01:00 Renato Golin <renato...@linaro.org>:
On 9 January 2017 at 14:17, Piotr Padlewski via cfe-dev
<cfe...@lists.llvm.org> wrote:
> - prefer "using' instead of "typedef"
> - use default member initialization
> - use default, override, delete
> - skip "virtual" with override

I thought we had all of those already...

Nope, some people use it, but I still see a lot of new code with typedefs.
I would like to have it written in style guide so it will be easier to convince to change in review.

The last two are enforced by compiler warnings now. The second is hard because of bitfields.
Clang is smart enough to propose using override, but only if it see that it was used somewhere, so it is not totally enforced. Also I don't think it warns about redundant "virtual" if one writes "virtual void foo() override".

Yes, the bitfields is the only one reason we would need initializations in constructors in some places, but it is rare (as far as what clang-tidy check have found)

Piotr Padlewski via llvm-dev

unread,
Jan 9, 2017, 1:46:19 PM1/9/17
to Renato Golin, llvm-dev, Clang Dev


2017-01-09 19:29 GMT+01:00 Renato Golin <renato...@linaro.org>:
On 9 January 2017 at 18:20, Reid Kleckner <r...@google.com> wrote:
> I object to the first. If you need a new type name, use a typedef. It's time
> honored and everyone, including C programmers, will know what you're doing.
> I don't understand why people push the new thing just for the sake of
> new-ness.

The text read "prefer", which I think it's fair. I'm opposed to any
hard rule for the sake of new-ness, consistency, or personal
preference.

If we encode all nit-rules as "prefer", then it should be more like a
guideline than a rule book, which has been our motto for everything
else.

cheers,
--renato

That's right, everyone has some preferences, but as long as most (over 75%) of the community prefer one thing over another then I guess it would make sense to convice the rest and make it official by writing it down. Then we won't have to spend time dicussing the same things over and over on different reviews. It is just much simpler to point new (or not) contributor to a rule in style guide.

Piotr

Mehdi Amini via llvm-dev

unread,
Jan 9, 2017, 2:04:06 PM1/9/17
to Renato Golin, llvm-dev, Clang Dev

> On Jan 9, 2017, at 10:29 AM, Renato Golin via llvm-dev <llvm...@lists.llvm.org> wrote:
>
> On 9 January 2017 at 18:20, Reid Kleckner <r...@google.com> wrote:
>> I object to the first. If you need a new type name, use a typedef. It's time
>> honored and everyone, including C programmers, will know what you're doing.
>> I don't understand why people push the new thing just for the sake of
>> new-ness.
>
> The text read "prefer", which I think it's fair. I'm opposed to any
> hard rule for the sake of new-ness, consistency, or personal
> preference.
>
> If we encode all nit-rules as "prefer", then it should be more like a
> guideline than a rule book, which has been our motto for everything
> else.

This is not correct according to the number of “should” and the imperative tone for many aspects of http://llvm.org/docs/CodingStandards.html#source-code-formatting


Mehdi

Renato Golin via llvm-dev

unread,
Jan 9, 2017, 3:47:57 PM1/9/17
to Mehdi Amini, llvm-dev, Clang Dev
On 9 January 2017 at 19:04, Mehdi Amini <mehdi...@apple.com> wrote:
> This is not correct according to the number of “should” and the imperative tone for many aspects of http://llvm.org/docs/CodingStandards.html#source-code-formatting

You mistake the tone of the documentation. There are things that
cannot be (exceptions, RTTI), things that are important to get right
(includes vs. forward declaration), things that are preferred
(c++11-isms) and things that are optional and very much depends on the
situation. The four items in the list I replied to fall into the
latter category.

The tone used for each type is appropriate to its enforcement. If you
add compiler errors or warnings, it's pretty easy to enforce.
Everything else will have varying degrees of success, and being
obnoxious about it has never been, and I hope never will be, our way.

We don't force people to run clang-format on patches, we ask when it's
ugly and people do because they believe it's a good thing. When the
formatting doesn't hurt my eyes, I don't ask for clang-format. I
certainly won't start asking people to run clang-tidy, though I'd be
happy if they did. That's personal and with the volume of commits we
have, that last thing we need is people blocking or reverting patches
because they didn't conform to personal preferences, even if they were
encoded in the coding standards.

I also strongly oppose to encoding personal preferences with a
stronger wording that it's warranted. Personal is personal. If it's
legal C++ and it's an appropriate use of the language for the case at
hand, than it's fine. I couldn't care less if you use "using" or
"typedef". I can understand both. "Prefer using" is an interesting
proposition, but refuse patches because they have "typedefs" is silly.

Honestly, my "coding standards" would be as simple as "do whatever
Scott Meyers says you should", but the LLVM one is nice, too. Unless
it's used as a weapon.

cheers,
--renato

Mehdi Amini via llvm-dev

unread,
Jan 9, 2017, 3:49:00 PM1/9/17
to Renato Golin, llvm-dev, Clang Dev

> On Jan 9, 2017, at 12:47 PM, Renato Golin <renato...@linaro.org> wrote:
>
> On 9 January 2017 at 19:04, Mehdi Amini <mehdi...@apple.com> wrote:
>> This is not correct according to the number of “should” and the imperative tone for many aspects of http://llvm.org/docs/CodingStandards.html#source-code-formatting
>
> You mistake the tone of the documentation.

Either one of us is mistaken, but I find yourself being fairly confident here…

Try going above the 80 cols and defend it as your personal preference in a review, and let me know how it went.


Mehdi

Renato Golin via llvm-dev

unread,
Jan 9, 2017, 3:55:09 PM1/9/17
to Mehdi Amini, llvm-dev, Clang Dev
On 9 January 2017 at 20:48, Mehdi Amini <mehdi...@apple.com> wrote:
> Either one of us is mistaken, but I find yourself being fairly confident here…

I'm only speaking from what I've seen happening in the past few years
committing and reviewing code. Our personal opinions don't change what
happened before, though they may try to change the future.

I just hope we don't become nit-pick nazis. I certainly won't become one.

Sanjoy Das via llvm-dev

unread,
Jan 9, 2017, 5:53:11 PM1/9/17
to Piotr Padlewski, llvm-dev, Clang Dev
On Mon, Jan 9, 2017 at 10:31 AM, Piotr Padlewski via llvm-dev

I would prefer using typedefs at least for function pointers. I find either of

Sanjoy Das via llvm-dev

unread,
Jan 9, 2017, 5:59:13 PM1/9/17
to Piotr Padlewski, llvm-dev, Clang Dev
Hi,

Sorry I fat fingered an earlier send in the previous email. I was
trying to say:

On Mon, Jan 9, 2017 at 2:52 PM, Sanjoy Das
<san...@playingwithpointers.com> wrote:
>> +1 Exactly this.
>> I don't think C programmer will not understand using. The "=" makes it much
>> simpler to read, even if it is the first time you see it, which is not the
>> case of typedef.
>>
>> typedef MyType::NestedType (*fptr)(const MyOhterType&);
>> or
>> using fptr = MyType::NestedType (*)(const MyOhterType&);
>

I would prefer to please keep using typedefs at least for function


pointers. I find either of

typedef MyType::NestedType (*fptr)(const MyOhterType&);

or

typedef int fptr(const int&);

void f(fptr* ptr) {
...
}

easier to read than the "using" declaration (especially the second
form, with the explicit `fptr* ptr`).

-- Sanjoy

David Blaikie via llvm-dev

unread,
Jan 9, 2017, 6:07:14 PM1/9/17
to Sanjoy Das, Piotr Padlewski, llvm-dev, Clang Dev
On Mon, Jan 9, 2017 at 2:59 PM Sanjoy Das via llvm-dev <llvm...@lists.llvm.org> wrote:
Hi,

Sorry I fat fingered an earlier send in the previous email.  I was
trying to say:

On Mon, Jan 9, 2017 at 2:52 PM, Sanjoy Das
<san...@playingwithpointers.com> wrote:
>> +1 Exactly this.
>> I don't think C programmer will not understand using. The "=" makes it much
>> simpler to read, even if it is the first time you see it, which is not the
>> case of typedef.
>>
>> typedef MyType::NestedType (*fptr)(const MyOhterType&);
>> or
>> using fptr = MyType::NestedType (*)(const MyOhterType&);
>

I would prefer to please keep using typedefs at least for function
pointers.  I find either of

typedef MyType::NestedType (*fptr)(const MyOhterType&);

or

typedef int fptr(const int&);

void f(fptr* ptr) {
  ...
}

easier to read than the "using" declaration (especially the second
form, with the explicit `fptr* ptr`).

Not sure I follow. You're saying this:

  typedef int func_type(const int&);

is easier for you to read than this:

  using func_type = int(const int&);

?

Daniel Berlin via llvm-dev

unread,
Jan 9, 2017, 6:14:43 PM1/9/17
to David Blaikie, llvm-dev, Clang Dev
I'd say so, one looks like the functions i right. The other looks strange ;)

David Blaikie via llvm-dev

unread,
Jan 9, 2017, 6:22:23 PM1/9/17
to Daniel Berlin, llvm-dev, Clang Dev

Fair enough

(though mostly (going by a quick grep) we end up with (a very rough guess, looking at a grep over llvm - maybe 20 times more common than function typedefs):

  typedef void (*fptr)(const int&);

which is a bit more arcane & perhaps easier as:

  using fptr = void (*)(const int&);

Then you don't have to try to remember where the variable name goes in these weird declarations - it's still pretty ugly, admittedly)
 

Sean Silva via llvm-dev

unread,
Jan 10, 2017, 3:35:55 AM1/10/17
to Mehdi Amini, llvm-dev, Clang Dev
On Mon, Jan 9, 2017 at 12:48 PM, Mehdi Amini via llvm-dev <llvm...@lists.llvm.org> wrote:

> On Jan 9, 2017, at 12:47 PM, Renato Golin <renato...@linaro.org> wrote:
>
> On 9 January 2017 at 19:04, Mehdi Amini <mehdi...@apple.com> wrote:
>> This is not correct according to the number of “should” and the imperative tone for many aspects of http://llvm.org/docs/CodingStandards.html#source-code-formatting
>
> You mistake the tone of the documentation.

Either one of us is mistaken, but I find yourself being fairly confident here…

Try going above the 80 cols and defend it as your personal preference in a review, and let me know how it went.

I doubt most reviewers will notice if you go slightly over 80 cols without some sort of automated check warning about it. W.r.t. the higher-level semantic guidelines, no reviewer keeps them all in their head. Just writing down a rule doesn't buy anything no matter how you write it down. The real coding standard is the one that a critical mass of LLVM developers will comment on when they find something objectionable.


-- Sean Silva
 

Mehdi Amini via llvm-dev

unread,
Jan 10, 2017, 3:37:33 AM1/10/17
to Sean Silva, llvm-dev, Clang Dev
On Jan 10, 2017, at 12:33 AM, Sean Silva <chiso...@gmail.com> wrote:



On Mon, Jan 9, 2017 at 12:48 PM, Mehdi Amini via llvm-dev <llvm...@lists.llvm.org> wrote:

> On Jan 9, 2017, at 12:47 PM, Renato Golin <renato...@linaro.org> wrote:
>
> On 9 January 2017 at 19:04, Mehdi Amini <mehdi...@apple.com> wrote:
>> This is not correct according to the number of “should” and the imperative tone for many aspects of http://llvm.org/docs/CodingStandards.html#source-code-formatting
>
> You mistake the tone of the documentation.

Either one of us is mistaken, but I find yourself being fairly confident here…

Try going above the 80 cols and defend it as your personal preference in a review, and let me know how it went.

I doubt most reviewers will notice if you go slightly over 80 cols without some sort of automated check warning about it. W.r.t. the higher-level semantic guidelines, no reviewer keeps them all in their head. Just writing down a rule doesn't buy anything no matter how you write it down.

Well I believe it still buys you an interesting property: no bike shedding over “personal preference” in any review. There’s a guideline to point at and we can’t instead focus on the important bits.

— 
Mehdi



The real coding standard is the one that a critical mass of LLVM developers will comment on when they find something objectionable.

<download (8).png>

Piotr Padlewski via llvm-dev

unread,
Jan 10, 2017, 5:04:49 AM1/10/17
to David Blaikie, llvm-dev, Clang Dev
I never saw syntax

typedef int funct_type(const int&);

I tried to use it and I got compile error for code:
int z(const int&);

int main() {
    typedef int fptr(const int&);

    fptr f = z;
    f(42);
}

where typedef int (*fptr)(const int&)

works. 

Renato Golin via llvm-dev

unread,
Jan 10, 2017, 6:17:51 AM1/10/17
to Mehdi Amini, llvm-dev, Clang Dev
On 10 January 2017 at 08:37, Mehdi Amini <mehdi...@apple.com> wrote:
> Well I believe it still buys you an interesting property: no bike shedding
> over “personal preference” in any review. There’s a guideline to point at
> and we can’t instead focus on the important bits.

You can't have that certainty with guidelines.

These 4 examples are less enforceable than 80-columns or
no-tabs-2-spaces rules, despite both sets being personal. They're
clearly guidelines.

All writing it down will buy you is "prefer this, not that", but:

1. If you have a good reason to do that, please do.
2. If the surrounding code uses that pattern, and this is not a
structural change, please repeat.
3. If it makes no difference as to the clarity of the code, who cares?

The most important thing here is the reaction of the community.

Reverting patches because they don't conform to guidelines is a big
mistake. Even 80 columns violation should be fixed later, in place, by
the original committer, after post-commit review.

Blocking patches because they don't conform to personal preferences,
even if they're encoded in the coding standard as "prefer" or
"should", ignoring the other facts I mention above, is also really bad
form.

First, we don't have code stewards to clean up other people's mistakes
or preferences, not we should. This would create a culture of garbage
code, where people don't care about what they commit.

Second, we want developers to feel comfortable contributing to our
project. People DO have different cultures, preferences, levels of
comfort with the languages and the project. But they also have their
own lives and work, and can't spend months polishing small patches,
just because they're not perfect in the view of random developers in
our community. That would be extremely counter-productive and would
ultimately kill the project.

Similar behaviour has killed other OSS projects in the past, so this
is not theoretical.

All that said, I think those 4 *guidelines* are nice to have on our
coding standards. But I'm ultimately against forcing people to run
clang-tidy or blocking/reverting patches because they don't match
perfectly with the coding standards.

Sanjoy Das via llvm-dev

unread,
Jan 10, 2017, 11:50:23 AM1/10/17
to Piotr Padlewski, llvm-dev, Clang Dev


Sent from a mobile device, please excuse typos.

That needs to be fptr* f = z;

Sean Silva via llvm-dev

unread,
Jan 11, 2017, 4:44:59 AM1/11/17
to Sanjoy Das, llvm-dev, Clang Dev
Yeah. It was confusing in the original example to have "ptr" in the name of the typedef. Usually it is more like:

typedef int callback_func(void *closure);
int foo(callback_func *cb) {
  // ...
}

(though this style is pretty rare in my experience, and results in the callback_func name being a weird type that you can't assign or do other stuff with).

-- Sean Silva
 

    f(42);
}

where typedef int (*fptr)(const int&)

works. 

 

-- Sanjoy
_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev



Piotr Padlewski via llvm-dev

unread,
Jan 11, 2017, 7:11:33 AM1/11/17
to Sean Silva, llvm-dev, Clang Dev
That make sense!

 
Yeah. It was confusing in the original example to have "ptr" in the name of the typedef. Usually it is more like:

typedef int callback_func(void *closure);
int foo(callback_func *cb) {
  // ...
}

(though this style is pretty rare in my experience, and results in the callback_func name being a weird type that you can't assign or do other stuff with).

-- Sean Silva
 

Yep, I have never seen it in LLVM code.

Philip Reames via llvm-dev

unread,
Jan 14, 2017, 12:40:45 AM1/14/17
to David Blaikie, Piotr Padlewski, llvm-dev, Clang Dev

Can we get at least a summary of any decisions made before action is taken?  I'm definitely not reading this thread in detail and I doubt others are either.

Philip

Reply all
Reply to author
Forward
0 new messages