This is an archive of the discontinued LLVM Phabricator instance.

C++11 Range-based Loop Converter
ClosedPublic

Authored by panzer on Aug 20 2012, 10:47 AM.

Details

Reviewers
klimek
Summary

A tool which upgrades convertible loops to use the new range-based syntax. It's implemented as via source-to-source translation using clang-tooling.

Diff Detail

Event Timeline

panzer updated this revision to Unknown Object (????).Aug 20 2012, 2:34 PM

Added more comments explaining some of the less intuitive behaviors of the traversal, renamed a few badly-named typedefs, and integrated LoopFixerArgs into LoopFixer.

panzer updated this revision to Unknown Object (????).Aug 20 2012, 6:09 PM

Doxygen-ified comments.

It seems like this kind of tool would benefit a lot from the analysis infrastructure created for the Static Analyzer (include/clang/Analysis). Could you briefly explain why you didn't use that infrastructure?

silvas added inline comments.Aug 20 2012, 8:05 PM
loop-convert/LoopActions.cpp
837

You seem to have a lot of these bare calls to std::min() to compute the ConfidenceLevel all over the place. Could you encapsulate this operation and give it a descriptive name?

loop-convert/LoopMatchers.cpp
18

I think that most of these "matcher factory" functions would benefit from a documentation comment containing:

  • A list of bound variable names (or whatever you call the things captured by id() and .bind())
  • A chunk of example code with the part that would be matched called out in some way (maybe surround it with @@@, ||| or some other marker which will clearly stand out)

Also, for these in particular, I would like to see a brief explanation of how they fit into the larger for-loop conversion process.

22

Why are these static? (are they generating static constructors... ew...).

test/loop-convert/Inputs/structures.h
6

I don't think it's legal to use restrict in C++, even in extern "C".

First round of comments, more to come :)

loop-convert/LoopActions.cpp
26

In general, this seems like a design smell - can we instead always use Range, and just set it to E->getSourceRange when IsArrow is false? Perhaps even via a constructor, so this is impossible to use incorrectly?

29

Wondering what "true" means here: true as in indicates success, or true as in real?

41

Indent.

59

I'd change the interface to:
bool indexesMultipleContainers();
Expr *getSingleIndexedContainer();
where getSingleIndexedContainer() asserts(!indexesMultipleContainers());

That way more of the "how" is hidden in the class.

88

Perhaps call this findAndVerifyUsages; I think that'd better resemble the return value's meaning.

95

I'd order this roughly by abstraction level:

  1. typedefs
  2. methods from most abstract to most specific
  3. fields that are set at construction time
  4. fields that are changed at runtime
loop-convert/LoopActions.h
25

No using in headers...

31

Extra doesn't seem to fit in here (is that "extra safe"?)... Perhaps add a comment?

62

Consistency: public section first...

klimek added inline comments.Aug 21 2012, 10:51 AM
loop-convert/LoopActions.cpp
89

It seems strange that we set confidence level here, which would indicate that we can re-run this method, but initialize other values that are changed at runtime only in the constructor. I'm fine with both strategies, I'd just prefer to have a single one consistently done.

98

Above the comment makes it look like "ArraySubscriptExpression" would only apply to arrays, and not in general...

113

DependentExprs could need a comment.

139

We're making some assumptions here that would not make this fit a more general place: macros can cause all kind of havoc that needs to be checked for.

179

Rename to 'exprReferencesVariable'?

In response to silvas asking why I didn't reuse static analzer code, I wrote the first version based on a RecursiveASTVisitor, and didn't much of the analysis now present. The suggestions I received mostly pointed me towards Tooling.

Looking at the static analyzer now, I don't see too much that appears very similar, since I'm mostly working at a higher level than the CFG.

loop-convert/LoopActions.cpp
29

True as in real. I changed the comment to explain that ForLoopIndexUseVisitor exists to compute a UsageResult (and one boolean value).

41

Done.

59

I had imagined possibly doing an analysis to allow multiple containers if we could determine they were in fact the same object/array. As that's not likely to happen any time soon, done.

I actually need to check that exactly one container is indexed, since zero index
ed containers isn't convertible either (what would it loop over? (unless a range type exists...)).

88

Done.

89

This would theoretically allow multiple runs (from the pre-matcher days), but again it's not going to be used now. The new confidence level constructor is called in the ForLoopIndexUseVistor constructor.

95

Done.

98

Comment was out of date; fixed.

113

It does, and now has one.

139

This is partly because I've been putting off trying to deal with macros. I guess this function stays local, then.

179

Done.

837

I'm hesitant to create a class which is responsible for tracking a single enum value and making sure it decreases monotonically. But after doing so, I agree that the client code reads much more easily.

loop-convert/LoopActions.h
25

Missed this one... deleted.

31

Richard Smith and I tried to come up with a better name. I settled on TCK_Reasonable for now.

62

Missed this... fixed.

loop-convert/LoopMatchers.cpp
18

This is an excellent idea. In fact, I even caught a bug when explicitly documenting exactly what these matchers should do. Does the set of comments added to the next revision seem like what you were looking for?

22

This was a leftover from the tutorial I wrote, before I made this into a factory function. I missed deleting it - thanks!

test/loop-convert/Inputs/structures.h
6

This is how several other test files declare printf, in particular, test/SemaCXX/format-strings.cpp.

panzer updated this revision to Unknown Object (????).Aug 21 2012, 4:47 PM

Addressed Manuel's and Sean's suggestions, updated and improved comments, and fixed a bug in the loop matchers.

One more flight of comments, before switching to the latest diff :)

loop-convert/LoopActions.cpp
229

s/return/returns/ for consistency.

256

The "and the array's base exists" part seems pretty arbitrary.

300

Indent.

328

Formatting.

329–331

I'm not sure I understand what you're getting at. This seems to be a hint at the overall strategy of the various isValid* functions (this is not named isValid* btw, so this is further confusing).

355

TargetDecl is really confusing here. I thought you meant the target of the initialization, but that's TheDecl :)

Perhaps s/TargetDecl/IndexVar/. In general, naming those things consistently would help.

413

This showing up as the overview comment for the method in doxygen seems strange...

526

Leftover sentence?

716

There is no parameter named VDecl...

749

Please document this in the function comment.

760

The asymmetry between the name EndSourceExpr and ContainerExpr is confusing.
Perhaps BeginContainerExpr and EndContainerExpr?

775

Really long function, would be great to find a way to split this up.

780

Any reasons we don't want to make those transformations in headers?
(The strategy I've used so far was regexps on the path)

819

Indent.

859

When I came to this condition, I wondered where we had already tried to create replacements in this run() call. I think that's a general problem here - there's too many different things going on on the same level.

loop-convert/LoopConvert.cpp
82

I think that indent is bad, as it looks like there's two things on the same level.

loop-convert/LoopMatchers.cpp
23

+1 on that they shouldn't be static. Also, indent.

29

I'd only use one binding concept (.bind()), as it makes it much easier to see at a glance what is being bound.

loop-convert/LoopMatchers.h
35

Please insert newline.

loop-convert/StmtAncestor.h
22–33

I think adding some newlines would make this easier to read.

48

Seems to be only called once - why the bool parameter?

49

Can this ever be called twice on the same object? why not assert(StmtAncestors.empty())?

klimek added inline comments.Aug 22 2012, 9:39 AM
loop-convert/StmtAncestor.h
116

This one especially could need an example or two in the docs :) I find this to be one of the most complex parts of this code.

loop-convert/VariableNaming.cpp
23–24

Is that comment outdated?

53

Any reason not to use StringRef?

58–69

I think I mentioned that in one of the pre-phab review mails, but I'm not sure any more :) So, is this now used, or not?

loop-convert/VariableNaming.h
23

I'd add a whitespace for readability. In general, I find comments starting directly after something else without an empty line to be hard on the eye.

24

Also, I'd either leave out the VariableNamer here, or create a sentence like: VariableNamer creates names...

33–36

I'm wondering whether there's an abstraction missing for that kind of argument list, but I cannot come up with it currently. Something to keep in the back of our minds.

panzer updated this revision to Unknown Object (????).Aug 22 2012, 4:04 PM

Next round up for review.

loop-convert/LoopActions.cpp
26

Done. The nearby code is now a bit nicer too.

229

Done.

256

This function also used to do extra checking on the array expression. Removed.

300

Done.

328

Comment removed, as noted below.

329–331

The isValid* functions were supposed to abstract checking so that both overloaded and builtin operators could call the same function (e.g. isIndexInSubscriptExpr for operator[] and calls to at()). The abstraction wasn't worth it for isDereferenceOf{OpCall,Uop}, because the only common check is exprReferencesVariable().

In any case, I added a check to make sure that the opcode was correct to make the interface more uniform, and removed the now obsolete comment.

355

I must have missed that during the general rename. Done.

413

Comment fixed in a later revision :)

526

Yes, that doesn't belong. Fixed in later revision.

716

Fixed in later revision. Also, renamed the function to more accurately reflect what it does (getContainerFromBeginEndCall).

I see what you meant about assuming that the comments are wrong :)

749

Done. In the process, the function was rearranged a bit to be simpler.

760

Done.

775

120 lines is a bit long.

The function has been split into two pieces, each of which is far more reasonable.

780

This was the simplest safe solution I found for avoiding changes to files we weren't instructed to modify (especially system headers). I imagine this tool would conceptually be run on a portion of a project (e.g. clang/tools/extra/), and should only make changes within that project.

If there is a better mechanism (regexps on the path sound fine), I'd be happy to use it!

819

Done.

859

The answer is that we haven't already tried to create replacements in this run() call! This condition detects replacements across calls of run() (the logic for which belongs in RefactoringTool or a higher level, I think).

The refactored arrangement of this function now splits the callback into three pieces:

  1. run() gathers the matched nodes, and makes sure the loop's header is correct.
  2. findAndVerifyUsages() prepares and runs a ForLoopIndexUseVisitor, then decides if it should make the changes
  3. doConversion() does the actual refactoring work

Does this seem more manageable?

loop-convert/LoopConvert.cpp
82

I think this is better.

loop-convert/LoopMatchers.cpp
23

Done. What's the opinion on static matchers outside of the factory functions to increase sharing between the factory functions?

29

I tried to use id() only when the entire matcher being bound wasn't clearly visible at a glance. Especially for binding LoopName to the forStmt matcher, I
dislike placing the name 15 lines away from the matcher. Some formatting changes made .bind() much easier to read, so I'm happy now.

loop-convert/LoopMatchers.h
35

Done.

loop-convert/StmtAncestor.h
22–33

Done.

48

That was left as a hook in case some way to update the AST (after making a refactoring change) appeared. I've removed the extra parameter it for now.

49

It could be called twice on the same object; I just don't do that right now. In fact, if we run this across multiple TUs, then we will have to call gatherAncestors() once per TU.

116

I added an example that (I think!) makes it clear what this is trying to protect against.

loop-convert/VariableNaming.cpp
23–24

What gave that away? :)

53

No reason at all. StringRef it is.

58–69

No, it's not used...though it would be wonderful if someone who knows how DeclContexts work could explain what's going wrong.

Removed, along with all the parameters that were only involved in the ineffective code.

loop-convert/VariableNaming.h
23

Agreed; whitespace added.

24

Redundant name removed.

33–36

I've been considering that too. VariableNamer essentially takes a few naming hints (TheContainer, OldIndex), and some parameters which let it decide if a name was already taken (GeneratedDecls, ReverseAST, LoopContext, SourceStmt). The separation of responsibilities happens internally, though I did not feel creating more complexity because this is a function run once on a single object.

VariableNamer class pretty much exists as a place to put all the naming worries so that they doesn't clutter up LoopFixer. I'm open to ideas...

Ok, I think this is a good to go in after the Outside fix...

Thanks for bearing with me!

loop-convert/StmtAncestor.h
116

After an offline chat we came to the conclusion that Outside really means Inside here :) Please change...

panzer updated this revision to Unknown Object (????).Aug 24 2012, 10:15 AM

This update makes three kinds of changes:

  1. Fixes two bugs discovered when running the tool over a large codebase.
  2. Renames DependsOnInsideVariable to DependsOnOutsideVariable (Sorry about that mistake, Manuel...)
  3. Extracts another function from the matcher callback.
panzer updated this revision to Unknown Object (????).Aug 24 2012, 12:48 PM

Added tests for bugfixes from last revision, a spelling fix, and more \code blocks.

panzer updated this revision to Unknown Object (????).Aug 24 2012, 12:55 PM

Previous diff was made from incorrect base after upstream merge.

panzer added inline comments.Aug 24 2012, 12:58 PM
loop-convert/LoopActions.cpp
167

This was fixed near the end of checkDeferralsAndRejectsions().

lgtm for the latest updates

klimek accepted this revision.Aug 28 2012, 4:41 PM

This has landed as r162627

klimek closed this revision.Aug 28 2012, 4:41 PM
Via Old World