Page MenuHomePhabricator

Rename variables so that they start with a lowercase letter.
ClosedPublic

Authored by ruiu on Jul 3 2019, 12:31 AM.

Details

Summary

This patch is mechanically generated by clang-llvm-rename tool that I wrote
using Clang Refactoring Engine just for creating this patch. You can see the
source code of the tool at https://reviews.llvm.org/D64123. There's no manual
post-processing; you can generate the same patch just by re-running the tool
against lld's code base.

This is a response to https://lists.llvm.org/pipermail/llvm-dev/2019-February/130083.html.
In the discussion thread, I proposed we use lld as a testbed for variable naming
scheme change, and this patch does that.

I chose to rename variables so that they are in camelCase, just because that is
a minimal change to make variables to start with a lowercase letter.

So, this is how it looks like if variables in an LLVM subproject start with a
lowercase letter. Any comments?

Diff Detail

Repository
rL LLVM

Event Timeline

ruiu created this revision.Jul 3 2019, 12:31 AM
Herald added a project: Restricted Project. · View Herald Transcript
ruiu planned changes to this revision.Jul 3 2019, 12:31 AM

FWIW my feeling regarding https://lists.llvm.org/pipermail/llvm-dev/2019-February/130083.html is that variable_name (used by lldb) is better than variableName. Though both are better than VariableName, the former makes it stand out from functionName...

ruiu updated this revision to Diff 207719.Jul 3 2019, 12:56 AM
  • tweak
ruiu planned changes to this revision.Jul 3 2019, 12:56 AM
ruiu updated this revision to Diff 207722.Jul 3 2019, 1:13 AM
  • tweak rules
ruiu planned changes to this revision.Jul 3 2019, 1:13 AM
ruiu added a comment.Jul 3 2019, 1:20 AM

FWIW my feeling regarding https://lists.llvm.org/pipermail/llvm-dev/2019-February/130083.html is that variable_name (used by lldb) is better than variableName. Though both are better than VariableName, the former makes it stand out from functionName...

I didn't intend to upload this patch to share, I just wanted to test how it would look like on Phabricator, but anyways. I don't have a strong preference over underscore or camelCase. I rename variables so that they are in camelCase because that's a minimal change -- for most variables, that results in a single letter change. It seems it is easier to get a consensus with this approach than adopting underscore naming style.

ruiu edited the summary of this revision. (Show Details)Jul 3 2019, 1:43 AM
ruiu edited reviewers, added: lattner, MaskRay, peter.smith, grimar; removed: espindola.
ruiu requested review of this revision.Jul 3 2019, 1:45 AM

In general I think a move to camelCase or snake_case is an improvement, don't have a particularly strong opinion over which one. I suspect that the camelCase transition is easier to automate. With just a quick scan of some of the output there are a few cases where some of the variable names are used as acronyms ISD -> InputSectionDescription. Or IS -> InputSection. I don't think that these survive the transition well, in particular IS -> is. I've not got a great idea for what to do about these right now, some possible options:

  • Do the full renaming as above and change the variable names later. Presumably it would help to be consistent.
  • Don't rename common acronyms like IS and change the variable names later.
  • Allow common acronyms to stay capitalised.
  • Have some custom renames for acronyms like IS.

I don't have a particularly strong opinion on which option. It would be helpful to be consistent though.

The other case where we may want to look at is use of capitals like S, P and A in relocation processing as these match the ELF specification. Of course we could use sym, place, addend instead.

ruiu added a comment.Jul 3 2019, 2:56 AM

In general I think a move to camelCase or snake_case is an improvement, don't have a particularly strong opinion over which one. I suspect that the camelCase transition is easier to automate. With just a quick scan of some of the output there are a few cases where some of the variable names are used as acronyms ISD -> InputSectionDescription. Or IS -> InputSection. I don't think that these survive the transition well, in particular IS -> is. I've not got a great idea for what to do about these right now, some possible options:

You are right; converting to camelCase is easier to automate than underscore. For some variables such as IsLMAAddr, it is not obvious where we should insert underscores and how to capitalize them. In addition to that, inserting underscores changes lengths of virtually all variables, so we need to run clang-format after running my tool, which effectively runs clang-format to the entire code base. I didn't want to do that, because for some files it looks like it drastically changes how source code is formatted.

  • Do the full renaming as above and change the variable names later. Presumably it would help to be consistent.
  • Don't rename common acronyms like IS and change the variable names later.
  • Allow common acronyms to stay capitalised.
  • Have some custom renames for acronyms like IS. I don't have a particularly strong opinion on which option. It would be helpful to be consistent though.

Currently, IS is just renamed is, but variables starting with IS (such as ISAddr) is renamed isecAddr by a special rule. Otherwise it would have looked like a predicate. I made a few special rules other than that, which you can see at https://reviews.llvm.org/D64123 (look for lowercase function).

The other case where we may want to look at is use of capitals like S, P and A in relocation processing as these match the ELF specification. Of course we could use sym, place, addend instead.

They are currently just renamed s, p and a, which doesn't look too awful to me.

Currently, IS is just renamed is, but variables starting with IS (such as ISAddr) is renamed isecAddr by a special rule. Otherwise it would have looked like a predicate. I made a few special rules other than that, which you can see at https://reviews.llvm.org/D64123 (look for lowercase function).

OK, thanks for a link to the rules. Reading https://reviews.llvm.org/D57896 (Variable Name Rules) it mentions that acronyms should be avoided but should favour upper case. There did seem to be some disagreement in the comments though and I don't know how strictly the guidelines will be enforced. There are trade-offs either way.

The other case where we may want to look at is use of capitals like S, P and A in relocation processing as these match the ELF specification. Of course we could use sym, place, addend instead.

They are currently just renamed s, p and a, which doesn't look too awful to me.

Agreed, I have a mild preference for the acronyms in upper case, but I can see the benefit of keeping the rule simple.

lld/ELF/Arch/MipsArchTree.cpp
77 ↗(On Diff #207722)

Looks like this got missed by the rename. I've not found any others yet.

lld/ELF/MapFile.cpp
90 ↗(On Diff #207722)

maybe worth a special case for OSec -> osec to match other uses of isec.

I'm not a reviewer for LLD, but this change looks really great to me. The major thing to watch out for is that things like "XYZVar" need to be renamed to "xyzVar", but it looks like LLD was already spelling these cases as "XyzVar", so this doesn't come up.

I'm very much +1 on this change, and thing we should roll it out across the entire LLVM project.

ruiu updated this revision to Diff 207940.Jul 3 2019, 5:45 PM
  • address review comments
ruiu marked 2 inline comments as done.Jul 3 2019, 5:54 PM

This patch is still machine-generated with an updated version of https://reviews.llvm.org/D64123. There's no manual post-processing.

lld/ELF/Arch/MipsArchTree.cpp
77 ↗(On Diff #207722)

Fixed

lld/ELF/MapFile.cpp
90 ↗(On Diff #207722)

Done

Thanks for the updates. I've scanned through the whole patch and only 2 things jumped out at me and they are minor and subjective; I've put comments inline. It is certainly possible that I missed something, but overall I think that this is good enough and we could fix up any other mistakes later on.

lld/ELF/AArch64ErrataFix.cpp
503 ↗(On Diff #207940)

ISLimit has gone to isecLimit. Could we make prevISLimit go to prevIsecLimit?

lld/ELF/InputSection.h
355 ↗(On Diff #207940)

I think this is the only use of Class in LLD. Given that it is the equivalence class could this be made eClass or eqClass?

ruiu added a comment.Jul 4 2019, 3:25 AM

Thank you for reviewing. I think both points make sense, I'll update my tool first and then update this patch so that your suggestions are applied to this patch.

I added everybody who joined https://lists.llvm.org/pipermail/llvm-dev/2019-February/130083.html to the subscriber list of this change.

FWIW, I skimmed through the first few files (up to the end of the Arch directory), and broadly am happy with it, but I do agree that "is" is not a good variable name, and would prefer it to be iSec or inputSec or something like that (I'd suggest inSec, but that is too predicate-like for my looking).

I definitely agree with a variable name change list this. I'd be happy with the underscore version too, though would personally think that should be for function names at least.

Generally looks good. +1 to Peter and James' suggestions: is -> isec and prevISLimit -> prevIsecLimit.

We have now an opportunity to rename global variables if they can be improved. I try to collect them below.

config (Config.h) in (SyntheticSections.h) driver (Driver.h) sharedFiles (InputFiles.h) outputSections (OutputSections.h) partitions (InputSection.h) script (LinkerScript) symtab (SymbolTable.h)

I am thinking in may be less ideal as a global variable name.

ruiu updated this revision to Diff 208115.Jul 4 2019, 9:29 PM

IS -> isec
prevISLimit -> prevIsecLimit
Class -> eqClass

MaskRay accepted this revision.Jul 9 2019, 12:02 AM
This revision is now accepted and ready to land.Jul 9 2019, 12:02 AM

I am fairly confident that we are in a stable state - reverting of a recent patch should be unlikely. (This change will make reverting of a commit very hard so I have to say this)

Agreed. I think now is as good a time as any to go for it, LGTM too.

This will cause big merge conflicts the next time I update our CHERI fork. However, the change itself looks good to me so I have no objections.

@grimar What do you think?

grimar added a comment.EditedJul 9 2019, 4:22 AM

I have no objections.
(Honestly I do not like the new style at all. I.e. for example for my personal pet projects I am using the current LLVM coding style and not going to switch.
I see that people are happy with it though and so I just have to follow).

ruiu updated this revision to Diff 208877.Jul 9 2019, 9:23 PM
  • rebase
This revision was automatically updated to reflect the committed changes.

Just chiming in to say this change makes me very happy. I'm pleasantly surprised by how much easier to read I'm finding the code. Thanks for doing this!