Page MenuHomePhabricator

Variable names rule
Needs RevisionPublic

Authored by michaelplatings on Thu, Feb 7, 7:48 AM.

Details

Reviewers
lattner
zturner
Summary

Following discussion and general agreement that the current naming rule for variables is not ideal, this patch switches the naming rule to make lowerCamelCase the standard, consistent with a prior RFC.

Given that over 450,000 variables are currently named in UpperCamelCase, the rule also permits using that form for consistency with existing code. I can't see a way to express that in .clang-tidy files.

Diff Detail

Event Timeline

michaelplatings created this revision.Thu, Feb 7, 7:48 AM
Herald added a project: Restricted Project. · View Herald TranscriptThu, Feb 7, 7:48 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript

Pretty sure this patch should have gone to llvm-commits, not cfe-commits.

Pretty sure this patch should have gone to llvm-commits, not cfe-commits.

I just set the repository, Phabricator did the rest - apparently the magic isn't working so well.

Pretty sure this patch should have gone to llvm-commits, not cfe-commits.

I just set the repository, Phabricator did the rest - apparently the magic isn't working so well.

  1. Does clang-tidy warn on every single existing variable now?
  2. It might be best to give this more visibility, by submitting a mail to llvm-dev, with a noticeable subject, like "RFC: changing variable naming rules in LLVM codebase"
  1. It might be best to give this more visibility, by submitting a mail to llvm-dev, with a noticeable subject, like "RFC: changing variable naming rules in LLVM codebase"

+1. I know the discussion took place there originally, but "RFC" will catch more people's eyes. Also the prior discussion had some digressions (ahem) which a new RFC can be more focused.

Herald added a project: Restricted Project. · View Herald TranscriptThu, Feb 7, 10:14 AM

does the readability-identifier-naming check need to be changed to support multiple allowed case types?

- key:             readability-identifier-naming.VariableCase
    value:        camelBack,CamelBack

I am generally in favour of this direction.

lattner accepted this revision.Thu, Feb 7, 10:04 PM
lattner added a subscriber: lattner.

I am very much +1 on this. That said, this isn't the sort of thing we just use patch review for. Please agitate a robust discussion about this on llvm-dev. :-)

This revision is now accepted and ready to land.Thu, Feb 7, 10:04 PM
  1. Does clang-tidy warn on every single existing variable now?
  2. It might be best to give this more visibility, by submitting a mail to llvm-dev, with a noticeable subject, like "RFC: changing variable naming rules in LLVM codebase"
  1. Pretty much. Previously clang-tidy gave 56,787 unique warnings. After this patch it gives 361,382 unique warnings. I personally would be happy to change the settings from camelBack to aNy_CasE.
  1. Done: http://lists.llvm.org/pipermail/llvm-dev/2019-February/130083.html
MyDeveloperDay added a comment.EditedFri, Feb 8, 6:27 AM

I personally would be happy to change the settings from camelBack to aNy_CasE.

Should we come up with a new style? say UpperOrLowerCamelCase or camelBackOrCase , I don't mind going and doing that in the readability-identifier-naming check, given that I just wrote up all the Options for that check https://clang.llvm.org/extra/clang-tidy/checks/readability-identifier-naming.html in D56563: [clang-tidy] add options documentation to readability-identifier-naming checker

zturner added a subscriber: zturner.Fri, Feb 8, 8:59 PM

Is this actually any better? Whereas before we can’t differentiate type names and variable names, under this proposal we can’t differentiate type names and function names. So it seems a bit of “6 of 1, half dozen of another”

Is this actually any better? Whereas before we can’t differentiate type names and variable names, under this proposal we can’t differentiate type names and function names. So it seems a bit of “6 of 1, half dozen of another”

Perhaps you mistyped? The proposal does not change the status quo of either type names nor function names. If you mean that we can't differentiate variable names and function names, then it seems worthwhile to point out that the actual letters (not just the case of said letters) matter too. Whereas the guidelines state that types and variables should have names that are nouns, the guidelines state that functions should have names that are verb phrases.

Should we come up with a new style? say UpperOrLowerCamelCase, I don't mind going and doing that in the readability-identifier-naming check, given that I just wrote up all the Options for that check https://clang.llvm.org/extra/clang-tidy/checks/readability-identifier-naming.html in D56563: [clang-tidy] add options documentation to readability-identifier-naming checker

Sounds good to me. I see that you've made D57966 a child of this issue, but we could swap the dependency around so that once your patch is applied I can update this patch to use camelBackOrCase.

MyDeveloperDay added a comment.EditedMon, Feb 11, 3:07 AM

Sounds good to me. I see that you've made D57966 a child of this issue, but we could swap the dependency around so that once your patch is applied I can update this patch to use camelBackOrCase.

I'm OK if we want to do that, but its very much a circular dependency, I don't want to land D57966: [clang-tidy] add camelBackOrCase casing style to readability-identifier-naming to support change to variable naming policy (if adopted), unless this whole variableName suggestion is accepted (which I think is a good idea).. I think your suggestion warrants being the driver, lets do the clang-tidy change and subsequent .clang-tidy changes on another revision post acceptance of both.

FWIW, I agree with the comments that the function name should be differentiated from the variable by the use of a verbs, I've spent too much time in my career grepping for the word type in code

Type type = type();

I think

Type type = getType();

or

Type objectType = getType();

adds some increased levels of clarity.

Is this actually any better? Whereas before we can’t differentiate type names and variable names, under this proposal we can’t differentiate type names and function names. So it seems a bit of “6 of 1, half dozen of another”

Perhaps you mistyped? The proposal does not change the status quo of either type names nor function names. If you mean that we can't differentiate variable names and function names, then it seems worthwhile to point out that the actual letters (not just the case of said letters) matter too. Whereas the guidelines state that types and variables should have names that are nouns, the guidelines state that functions should have names that are verb phrases.

There is still overlap, e.g. "process" can be a noun (a Linux process) or a verb (to process something)

I think it should also be pointed out there is not zero overhead -- it's not a lot (at least for native English speakers, which many LLVM developers are not), but determining if a word is a verb or a noun is harder than looking at the casing. Small, but worth observing.

A different convention, e.g. lower_case, avoids this. Personally, I'd prefer that, but I'm also fine with lowerCamelCase just so we can stop using UpperCamelCase.

llvm/docs/CodingStandards.rst
1194

It would be nice for this section to be expanded a bit, just to avoid inevitable code review churn, e.g. if I'm adding 50 lines to a 200 line file, am I allowed to change the existing var names elsewhere in the file or method, or is that outside the scope of my change? If I'm reviewing that patch, do I tell the author they have to be consistent and revert other changes? etc.

Is there any plan to use clang-tidy to do a global cleanup, or is this going to be a totally ad-hoc migration -- variables use the new scheme only when the code is updated?

Update .clang-tidy files to use aNy_CasE until camelBackOrCase is available.
Add more guidance around acronyms.
Add more guidance around consistency with existing CamelCase variable names.
Change other code examples to camelBack.

michaelplatings marked an inline comment as done.Mon, Feb 18, 8:50 AM
michaelplatings added inline comments.
llvm/docs/CodingStandards.rst
1194

I've had a go at expanding it. Please let me know if you have other suggestions.

Is there any plan to use clang-tidy to do a global cleanup, or is this going to be a totally ad-hoc migration -- variables use the new scheme only when the code is updated?

The latter. Given that the code doesn't keep to the existing .clang-tidy rules I'm not optimistic that we could persuade code owners to start now. That's not to say it couldn't happen eventually, but my aim at this point in time is to make it easier to use good variable names and I don't want perfect to be the enemy of better.

Changed recommendation for acronyms from lower case to upper case, as suggested by several responses to the RFC.

miyuki added a subscriber: miyuki.Tue, Feb 19, 3:40 AM
miyuki added inline comments.
llvm/docs/CodingStandards.rst
1065–1068

signed is a keyword, it can't be used as a variable name

michaelplatings marked an inline comment as done.

Changed recommendation for acronyms from lower case to upper case, as suggested by several responses to the RFC.

I haven't been following the discussion closely - why is this the preferred direction? I don't think that things like "Basicblock *bb" or "MachineInstr *mi" will be confusing, and going towards a consistently leading lower case letter seems simple and preferable.

-Chris

Changed recommendation for acronyms from lower case to upper case, as suggested by several responses to the RFC.

I haven't been following the discussion closely - why is this the preferred direction? I don't think that things like "Basicblock *bb" or "MachineInstr *mi" will be confusing, and going towards a consistently leading lower case letter seems simple and preferable.

Maybe I misunderstood you (http://lists.llvm.org/pipermail/llvm-dev/2019-February/130223.html):

> Maybe there should be an exception that variable names that start with an acronym still should start with an upper case letter?
That would also be fine with me - it could push such a debate down the road, and is definitely important for a transition period anyway.

Also Chandler (http://lists.llvm.org/pipermail/llvm-dev/2019-February/130313.html):

FWIW, I suspect separating the transition of our acronyms from the transition of identifiers with non-acronym words may be an effective way to chip away at the transition cost... Definitely an area that people who really care about this should look at carefully.

And Sanjoy (kind of) (http://lists.llvm.org/pipermail/llvm-dev/2019-February/130304.html):

maybe a gradual transition plan could be to allow these upper case acronyms for specific classes?

I agree that lower case acronyms would ultimately be more consistent, but given where we are it seems more achievable to only change the rule for non-acronyms.

Changed recommendation for acronyms from lower case to upper case, as suggested by several responses to the RFC.

I haven't been following the discussion closely - why is this the preferred direction? I don't think that things like "Basicblock *bb" or "MachineInstr *mi" will be confusing, and going towards a consistently leading lower case letter seems simple and preferable.

-Chris

I don’t think we should use this review as evidence of consensus. For example, I’m going to be against any change that doesn’t bring us closer to LLDB’s style of lower_case simply on the grounds that a move which brings us farther away from global consistency is strictly worse than one which brings us closer, despite ones personal aesthetic preferences.

And so far, I don’t really see that addressed here (or in the thread)

zturner requested changes to this revision.Tue, Feb 19, 7:15 AM

Since someone already accepted this, I suppose I should mark require changes to formalize my dissent

This revision now requires changes to proceed.Tue, Feb 19, 7:15 AM