This is an archive of the discontinued LLVM Phabricator instance.

[clang-rename] improve performance for rename-all
AbandonedPublic

Authored by omtcyfz on Aug 18 2016, 2:49 AM.

Details

Reviewers
None
Summary

As Miklos Vajna noticed clang-rename rename-all has significant performance problems, which exposed the fact that clang-rename parses translation unit N times where N stands for the number of {offset | old-name} -> new-name pairs.

This patch prevents clang-rename from parsing translation unit multiple times.

Diff Detail

Event Timeline

omtcyfz updated this revision to Diff 68502.Aug 18 2016, 2:49 AM
omtcyfz retitled this revision from to [clang-rename] improve performance for rename-all.
omtcyfz updated this object.
omtcyfz added reviewers: alexfh, vmiklos.
omtcyfz added a subscriber: cfe-commits.
omtcyfz updated this revision to Diff 68503.Aug 18 2016, 3:10 AM

Prevent unnecessary std::vector copying. Explicitly write type.

vmiklos edited edge metadata.Aug 18 2016, 3:25 AM

I can confirm that with this, the test script from the mail thread shows that clang-rename is almost as fast as clang++ as expected. Thanks!

alexander-shaposhnikov added inline comments.
clang-rename/USRFindingAction.h
38–41

in the constructor SymbolOffsets, OldNames are passed by non-constant reference
but then you make a copy.

omtcyfz updated this revision to Diff 68511.Aug 18 2016, 5:08 AM
omtcyfz edited edge metadata.

Prevent std::vector from redundant copying.

omtcyfz marked an inline comment as done.Aug 18 2016, 5:08 AM
omtcyfz added inline comments.
clang-rename/USRFindingAction.h
38–41

Aw, you're right. Good catch, thanks!

alexfh requested changes to this revision.Aug 18 2016, 8:47 AM
alexfh edited edge metadata.
alexfh added inline comments.
clang-rename/USRFindingAction.cpp
69

Should USRs be a local variable now?

147

Use ArrayRef here as well. BTW, if the code relies on SymbolOffsets and OldNames being of the same length, maybe a single collection of pairs would work better? Or define a structure for keeping offset and old name together?

clang-rename/USRFindingAction.h
29

Use ArrayRef instead of const vector<>&. ArrayRef<> is less restrictive.

30

Use ArrayRef<std::string>.

38

Reference members always seem suspicious to me. One has to be really really careful not to mess up lifetimes. Are we actually saving much but not copying these vectors?

This revision now requires changes to proceed.Aug 18 2016, 8:47 AM
omtcyfz updated this revision to Diff 68656.Aug 19 2016, 1:24 AM
omtcyfz edited edge metadata.
omtcyfz marked 2 inline comments as done.

Partly address comments.

omtcyfz marked an inline comment as done.Aug 19 2016, 1:25 AM
omtcyfz added inline comments.
clang-rename/USRFindingAction.cpp
69

Can you elaborate please?

clang-rename/USRFindingAction.h
38

Not really, they aren't meant to be quite huge. At this point I have ensured the lifetimes, but if the code would be reused, I agree, it might cause some trouble if one is not careful enough.

Do you propose to perform copying to gain more safety?

omtcyfz updated this revision to Diff 68665.Aug 19 2016, 2:51 AM
omtcyfz edited edge metadata.

Address one more comment.

omtcyfz marked 2 inline comments as done.Aug 19 2016, 2:51 AM
omtcyfz added inline comments.
clang-rename/USRFindingAction.cpp
69–70

Ah, sure. Sorry, didn't understand it at first.

clang-rename/USRFindingAction.cpp
207

not particularly important, probably it can be simply

return llvm::make_unique<NamedDeclFindingConsumer>(
    SymbolOffsets, OldNames, SpellingNames, USRList);
clang-rename/USRFindingAction.cpp
192
USRList.push_back(std::move(NextUSRBunch))

or (IMO even better)

USRList.push_back(Finder.Find());
omtcyfz updated this revision to Diff 68667.Aug 19 2016, 3:16 AM
omtcyfz updated this revision to Diff 68668.
omtcyfz marked an inline comment as done.
omtcyfz marked an inline comment as done.
omtcyfz added inline comments.
clang-rename/USRFindingAction.cpp
193–194

Yep. Makes sense. Thanks!

207–208

Yes, it's way better.

alexfh requested changes to this revision.Aug 19 2016, 11:20 AM
alexfh edited edge metadata.
alexfh added inline comments.
clang-rename/USRFindingAction.cpp
147

What about the second part?

BTW, if the code relies on SymbolOffsets and OldNames being of the same length, maybe a single collection of pairs would work better? Or define a structure for keeping offset and old name together?

clang-rename/USRFindingAction.h
38–42

Yes, I think, copying here makes sense to uncouple lifetime of the creator of the class with the lifetime of this class.

This revision now requires changes to proceed.Aug 19 2016, 11:20 AM

It is expected that either SymbolOffsets or OldNames is empty, and the size of the non-empty container is the same as the size of the NewNames container. So no, the code does not rely on the offsets and the old names having the same length.

omtcyfz marked an inline comment as done.Aug 21 2016, 4:18 AM

It is expected that either SymbolOffsets or OldNames is empty, and the size of the non-empty container is the same as the size of the NewNames container. So no, the code does not rely on the offsets and the old names having the same length.

If I understood correctly, you and Alex are talking about different things.

As far as I understand Alex what he meant is the code inside USRFindingAction.cpp relies on SymbolOffsetsVector and OldNamesVector (or whatever I actually called them) having same size. And that's right. See clang-rename/USRFindingAction.cpp line 153 in current diff. And I actually construct them to have the same size (see line 212 in clang-rename/ClangRename.cpp). I do not claim this to be the best solution, open to any other ideas, but it isn't too ugly to do so IMO.

If I understand you correctly code doesn't rely on SymbolOffsets and OldNames having equal size. And that's also correct.

Thus said, what Alex proposes (bind offset and old-name together while passing to USRFindingAction) makes sense to me.

omtcyfz updated this revision to Diff 69214.Aug 25 2016, 2:15 AM
omtcyfz edited edge metadata.
omtcyfz marked 2 inline comments as done.

Address couple of comments.

Actually, it might make sense to merge rename-at and rename-all after all. Passing a list of old-name/offset -> new-name is just easier.

But that's for another patch.

If I understood correctly, you and Alex are talking about different things.

Yes, sorry, the context of my above comment was the cl::opt instances in the tool itself, not the various parameters to the different actions called from the tool.

Kirill, are you waiting for me on this one? AFAICS the patch in its current form applies on top of current trunk, but no longer builds.

Kirill, are you waiting for me on this one? AFAICS the patch in its current form applies on top of current trunk, but no longer builds.

Miklos, it is a bit more complicated :)

Bascially, I have another patch kind of complete, which is built on top of this patch, which kind of makes this one outdated. Pushing this will make things a bit more complicated. I'll write about it soon enough.

Ah, if you mean you squashed this into D24192, then I see what you mean, ignore me. :-)

omtcyfz updated this revision to Diff 70193.Sep 2 2016, 11:02 AM
omtcyfz edited edge metadata.

Revert diff, as the last one "deletes and creates" files instead of "moving and changing them" in the filesystem.

omtcyfz abandoned this revision.Sep 2 2016, 11:04 AM

Oops, wrong patch... Abandoning this one anyway.