Page MenuHomePhabricator

[Documentation] Proposal for plan to change variable names
ClosedPublic

Authored by michaelplatings on Mar 12 2019, 6:27 AM.

Details

Summary

This proposal summarizes feedback gathered in this thread: http://lists.llvm.org/pipermail/llvm-dev/2019-February/130083.html and aims to ultimately form it into a plan that can be agreed upon.

It is not expected that reviewers agree with the plan, especially its later stages, but it shouldn't misrepresent any views previously expressed, contain obvious errors or leave major gaps.

Diff Detail

Repository
rL LLVM

Event Timeline

jhenderson added inline comments.Mar 12 2019, 6:34 AM
llvm/docs/Proposals/VariableNames.rst
77 ↗(On Diff #190241)

Probably also worth noting that lower_case is consistent with the C++ standard, and maybe other projects like Boost?

t.p.northover added inline comments.
llvm/docs/Proposals/VariableNames.rst
46–47 ↗(On Diff #190241)

WTF!

133 ↗(On Diff #190241)

How was this list derived? It seems a bit skewed towards mid-end development over back-end.

MRI is MachineRegisterInfo to me (and about 70% of LLVM code by a quick grep); and TLI is TargetLoweringInfo (rarer than TargetLibraryInfo this time, but still about 30% of uses).

I know someone specifically mentioned being surprised by conflicting acronyms when moving to different parts of LLVM, but I think it's rare enough that we should still allow them.

michaelplatings marked 4 inline comments as done.
michaelplatings marked an inline comment as done.
michaelplatings added inline comments.
llvm/docs/Proposals/VariableNames.rst
77 ↗(On Diff #190241)

I've avoided referencing projects that don't also capitalise class names, as they need to balance different concerns to those we're considering.

133 ↗(On Diff #190241)

The list was largely compiled from those mentioned in the RFC thread - more can be added in future. I made a mistake with MRI, I've changed it to MachineRegisterInfo.
Was TargetLoweringInfo renamed to TargetLowering? In that case TLI seems less appropriate. If this is controversial then I'll remove it from the proposal for now.

michaelplatings marked an inline comment as done.Mar 12 2019, 7:33 AM
MyDeveloperDay added inline comments.Mar 12 2019, 7:49 AM
llvm/docs/Proposals/VariableNames.rst
297 ↗(On Diff #190253)

clang-tidy can autoformat any modification with -format-style=file and it will use a projects .clang-format file to clang-format just the lines that were changed by clang-tidy.

That removes the need for a separate clang-format run, this should at least help minimize the other changes caused by formatting code outside of the rename, or the need to run git clang-format.

Potentially you might want to run clang-tidy with JUST the readability-identifier-naming rules turned on so it only fixes those issues and not anything else... (otherwise clang-tidy is going to have a field day!)

t.p.northover added inline comments.Mar 12 2019, 7:51 AM
llvm/docs/Proposals/VariableNames.rst
133 ↗(On Diff #190241)

Wow, I never noticed the acronym didn't really match up for TLI! Yep, lowering seems reasonable under that.

michaelplatings marked 3 inline comments as done.
michaelplatings added inline comments.
llvm/docs/Proposals/VariableNames.rst
297 ↗(On Diff #190253)

Thanks, I've updated the plan accordingly.

michaelplatings marked an inline comment as done.Mar 12 2019, 8:07 AM
ruiu added inline comments.Mar 12 2019, 1:32 PM
llvm/docs/Proposals/VariableNames.rst
220 ↗(On Diff #190258)

By the way, do you have anything in mind about what tool can be used for batch renaming? Do we have to hack it up based on clang-format or something?

mehdi_amini added inline comments.Mar 12 2019, 3:33 PM
llvm/docs/Proposals/VariableNames.rst
144 ↗(On Diff #190258)

side note: I feel that single letter variable name are annoying (can't easily search in a text editor for example), I would rather use a short name like func.

264 ↗(On Diff #190258)

This can be seen as an advantage for a mass rename: it is a "one-time cost" (that can be helped by clang-tidy), while a progressive renaming will lead to many spurious merge conflicts (and successful merge breaking the builds, or worse changing the runtime behavior!!) for downstream users for multiple years.
It seems be easier to deal with a one time merge that is NFC rather than having many semantic change patches along the years that create conflict on variable naming.

this is a really great summary of the situation, thank you for collecting this in such a methodical way!

llvm/docs/Proposals/VariableNames.rst
46–47 ↗(On Diff #190241)

seriously :-)

90 ↗(On Diff #190258)

FWIW, I personally consider this to be a totally orthogonal discussion to the other issues, I think it would be nice to separate it out just so we have some hope of converging an already contentious topic. Ratcheting forward one decision at a time seems like better way to make progress.

141 ↗(On Diff #190258)

Small point, but I think that "block" is a much better name than bb in practice.

144 ↗(On Diff #190258)

Yeah, maybe this could include a list, like f, fn, func, etc. There is a diversity of contractions used here. Standardizing this is a theoretically nice thing, but a distraction from the core issue in this discussion and not particularly important in the big scheme of things (unlikely to cause confusion).

254 ↗(On Diff #190258)

FWIW, It has never been a goal for LLVM To prevent downstream merge conflicts. The "party line" (painful as it may be in practice) is that mainline moves ahead full speed without worrying about this, and anyone adversely affected should work to get their changes merged to mainline.

This isn't likely to affect the public APIs in llvm/include in a significant way, so it really only affects people with diffs against core code.

michaelplatings marked 4 inline comments as done.
michaelplatings marked 9 inline comments as done.Mar 13 2019, 7:50 AM
michaelplatings added inline comments.
llvm/docs/Proposals/VariableNames.rst
90 ↗(On Diff #190258)

I agree that it's orthogonal to reducing the number of acronyms (my personal intention with all this) but there seems to be a strong sentiment which we can't ignore that differentiating member variables should be considered as part of changing naming policy. For that reason I'm including it at this stage. I hope that results gathered from the experimentation phase will inform the discussion and we can agree at that point on a suitable way forward, which may be to defer this particular change as you suggest.

144 ↗(On Diff #190258)

I was under the misconception that F was one of the acronyms that people were fond of. I've removed it for now.

220 ↗(On Diff #190258)

Primarily clang-tidy. There will be some extra work to resolve name conflicts (there are some variables named Int for example) but we'll flush out such issues during the experimentation phase (currently step 3 in the provisional plan).

254 ↗(On Diff #190258)

Yes, and I'm fully supportive of that policy. But if we can make downstream folks' lives easier without compromising our choices upstream then I'm happy to put in the effort to do that.

264 ↗(On Diff #190258)

True, although see Chris Lattner's comment about the "party line" above - on that basis it's not something that should sway our decision.

Thanks for going through all this effort. I think this captures the discussion pretty well. Thanks for including all the citations. If anyone wants to continue on some more points, should we reply here, or reply on the RFC thread?

llvm/docs/Proposals/VariableNames.rst
86 ↗(On Diff #190413)

I also prefer this type... you could add my name here, but maybe I should ask more generally: it's good that the discussion points in favor/against each style are listed, but as far as individuals that approve/oppose a style, do we plan to run some kind of poll and go with whichever has a majority vote? (Should Chandler/Chris get more votes than me? :) )

97 ↗(On Diff #190413)

Similarly, I also (slightly) oppose this style, but I'm not sure if you need to add my name to the list, or if we're just going to tally votes at some point

141 ↗(On Diff #190413)

An important detail that seems to be left out -- not just here, but also the main llvm style guide -- is scoping of these variables.

For example:

for (DeterministicFiniteAutomaton* dfa : allTheDfas) // or all_the_dfas
  dfa->DoSomething();

dfa is great here, and in fact, a long name might just be distracting.

But, for something like:

DeterministicFiniteAutomaton* dfa = GetSomeSpecificDfa();
// ... 200 lines of other stuff ...
dfa->DoSomething(); // I forget, which dfa is this?

dfa is not a good name -- something that has such a long scope needs a more descriptive name, e.g. which dfa we're manipulating.

Which is a long way of saying: I don't think that having a blessed list of acronyms is a good idea. It's going to be entirely dependent on the context. We should just say that acronyms can be considered one word, and is a fine way of using short variable names. (Listing a couple of these as common examples is fine, though).

297 ↗(On Diff #190413)

Can we take a stab at defining "suitable delay" here? Would something like 2 weeks be enough? Or maybe a longer time (e.g. 1 month) for the first experiment, shorter times (e.g. 2 weeks) for the next few, and no delay for the remaining ones?

rupprecht accepted this revision.Mar 15 2019, 11:13 AM

Since this isn't the actual policy change, just a snapshot of an ongoing discussion, I think this fine to land, and we can modify it if/when there's more discussion off thread? Approving since nobody else did :)

This revision is now accepted and ready to land.Mar 15 2019, 11:13 AM
MyDeveloperDay added inline comments.Mar 18 2019, 3:09 AM
llvm/docs/Proposals/VariableNames.rst
294 ↗(On Diff #190413)

@michaelplatings I have identifier a number of issues

bugs.llvm.org/PR41119  - [clang-tidy] readability-identifier-naming incorrectly fixes lambda capture
bugs.llvm.org/PR41120  - [clang-tidy] readability-identifier-naming incorrectly fixes variables which become keywords
      (previously call out here with regards to Int being renamed to int)
bugs.llvm.org/PR41122 - [clang-tidy] readability-identifier-naming misses fixing member variables in destructor

(and there could likely me more)

Which would need to be fixed in clang-tidy prior to any rename activity to prevent this process being painful. I think this is a good dog fooding opportunity for clang-tidy.

jdenny added a subscriber: jdenny.Mar 20 2019, 10:39 AM
michaelplatings marked 11 inline comments as done.
michaelplatings marked an inline comment as done.Thu, Mar 28, 7:35 AM
michaelplatings added inline comments.
llvm/docs/Proposals/VariableNames.rst
86 ↗(On Diff #190413)

Yes, some kind of poll. Exactly how we do this is to be decided. @Charusso has pointed out https://reviews.llvm.org/vote/ but as we in the UK are painfully aware right now, giving people binary choices can lead to no choice at all. I'm inclined to copy Debian's voting method: https://www.debian.org/vote/
How we weight the voting is another interesting question. You could say more contributions = more weight, but given that we're specifically interested in the views of newcomers here that doesn't really work. On the other hand, 1 vote per person would mean that one person could get all their friends to vote for them which is even worse. Potentially we could give 1 vote to any person who has contributed before the discussion started.

141 ↗(On Diff #190413)

I don't think that having a blessed list of acronyms is a good idea. It's going to be entirely dependent on the context

If an acronym is entirely dependent on the context then I suggest it shouldn't be on the list.

Part of my hope with this list is to be able to say "learn these acronyms and you should be well placed to read most code in LLVM". At the moment the code has too many variables with long scope named using acronyms that may have any number of meanings.

I take your point about names with short scope. If a tool is developed to expand acronyms then it should avoid touching names whose scope is only a few lines.

294 ↗(On Diff #190413)

Thanks, I've added that to the proposal.

This revision was automatically updated to reflect the committed changes.
Charusso added inline comments.Thu, Mar 28, 7:47 AM
llvm/docs/Proposals/VariableNames.rst
86 ↗(On Diff #190413)

There you could fill 10 different options on-the-fly: https://reviews.llvm.org/vote/create/ and that is the common place across all the sub-projects. Immediately you could check-out this weight idea of a contributor. I see no problem with that.