This is an archive of the discontinued LLVM Phabricator instance.

[MachineSink] replace MachineLoop with MachineCycle
ClosedPublic

Authored by shchenz on Apr 19 2022, 3:53 AM.

Details

Summary

MachineCycle can handle irreducible loop. Natural loop analysis (MachineLoop) can not return correct loop depth if the loop is irreducible loop. And MachineSink is sensitive to the loop depth, see MachineSinking::isProfitableToSinkTo().

In https://reviews.llvm.org/D120800, we prevent a sinking from shallower loop to deeper loop after checking irreducible cfg in the whole function.

This patch tries to use MachineCycle so that we can handle irreducible loop better.

Diff Detail

Unit TestsFailed

Event Timeline

shchenz created this revision.Apr 19 2022, 3:53 AM
shchenz requested review of this revision.Apr 19 2022, 3:53 AM
nikic added a comment.Apr 19 2022, 9:17 AM

I was concerned about compile-time impact of using MachineCycleInfo, but it looks like the impact is fairly low on CTMark: http://llvm-compile-time-tracker.com/compare.php?from=42865819b22486963294434fb21b51ab3e6ebfa4&to=61706f3ee1c979e7ec893d0bc4858790561a27c0&stat=instructions Looks like SPASS with ReleaseLTO-g is the only one with some non-trivial impact.

Do note that the cycle hierarchy is implementation-defined, and hence the reported cycle depth is not an invariant property of the program being compiled. For example, in the following CFG:

     Entry
       |\
       v \
   .-> P  \
  /    |   \
 /     v    \
'      Q     \
|     / \     |
|    /   \    |
|   v     v   |
`-- S <-- R <-'
    |
    v
   Exit

Here. the analysis can either identify one cycle with header P, or two nested cycles with headers R and S.

The change description says that MachineSink is "sensitive" to cycle depth. Is it only the performance of the resulting program, or does it affect correctness too?

llvm/include/llvm/ADT/GenericCycleImpl.h
82

Why not just use succ_size()? It's defined in CFG.h for LLVM IR and in MachineSSAContext.h for Machine IR.

Also, minor optimization ... this check is probably slightly cheaper than isLegalToHoistInto(), so it might help to check it earlier.

llvm/include/llvm/ADT/GenericCycleInfo.h
140

This is not defined for LLVM IR. Maybe move it to the Machine IR specialization?

shchenz updated this revision to Diff 423868.Apr 20 2022, 4:06 AM

address @sameerds comments

shchenz marked 2 inline comments as done.Apr 20 2022, 4:16 AM

I was concerned about compile-time impact of using MachineCycleInfo, but it looks like the impact is fairly low on CTMark: http://llvm-compile-time-tracker.com/compare.php?from=42865819b22486963294434fb21b51ab3e6ebfa4&to=61706f3ee1c979e7ec893d0bc4858790561a27c0&stat=instructions Looks like SPASS with ReleaseLTO-g is the only one with some non-trivial impact.

Thanks very much for the testing @nikic and glad to hear that the compile time impact is low. Let me know if you think the %0.39 improve for SPASS needs to be further investigated.

The change description says that MachineSink is "sensitive" to cycle depth. Is it only the performance of the resulting program, or does it affect correctness too?

I think the depth check should only impact the performance in machinesink. However there are some codes(hasStoreBetween) which does not work right for irreducible loops, with this patch, we can get correct behavior. But I think we need more investigation for that function. I plan to do more investigation later to avoid errors.

llvm/include/llvm/ADT/GenericCycleImpl.h
82

Thanks, updated.

sameerds accepted this revision.Apr 20 2022, 4:52 PM

The changes in GenericCycleInfo look good to me. But please do wait for reviewers looking at MachineSink.

llvm/include/llvm/ADT/GenericCycleInfo.h
132

Instead of saying "for now", we could simply say that pre-header is well-defined for reducible cycles, as listed in LoopTerminology.html. Then also add "returns nullptr if irreducible".

Same for the next function.

llvm/include/llvm/CodeGen/MachineCycleAnalysis.h
61

I think you mean "add to GenericCycle"?

llvm/include/llvm/CodeGen/MachineSSAContext.h
31

Oops! Thanks for adding this to trunk! Now I realize I was looking at a topic branch in my repo where I had already put these in place. :)

This revision is now accepted and ready to land.Apr 20 2022, 4:52 PM
shchenz updated this revision to Diff 424813.Apr 24 2022, 9:08 PM
shchenz marked 3 inline comments as done.

address @sameerds comments

The changes in GenericCycleInfo look good to me. But please do wait for reviewers looking at MachineSink.

Thanks very much for review @sameerds . Sure, I will wait for the MachineSink part review.

llvm/include/llvm/ADT/GenericCycleInfo.h
132

Updated.

llvm/include/llvm/CodeGen/MachineCycleAnalysis.h
61

Oops...

gentle ping for the MachineSink part. Thanks!

MatzeB added inline comments.May 5 2022, 10:31 AM
llvm/include/llvm/ADT/GenericCycleImpl.h
70

I did not even know this is legal C++ syntax; Anyway please be consistent with the rest of LLVM here!

89
llvm/include/llvm/ADT/GenericCycleInfo.h
103

as above

221

Consider just pushing this as a small fix directly to git without review.

llvm/include/llvm/CodeGen/MachineCycleAnalysis.h
48

Do we need a TODO comment here? Will the verify analysis added at this particular line?

61–62

Do we need a TODO comment?

llvm/include/llvm/CodeGen/MachineSSAContext.h
31–32

Explicit types help everyone reading the code.

llvm/lib/CodeGen/MachineSink.cpp
384

Rename L to C (or better Cycle)?

468–473

Rename L/Loop to Cycle/Cycles ?

There's more place below that where the same applies.

MatzeB added a comment.May 5 2022, 1:21 PM

Thanks, I think this will be good to go when all the nitpicks are addressed!

llvm/include/llvm/ADT/GenericCycleImpl.h
71–72

This is already checked in getCyclePredecessor

82

Everything in the Cycle info appears to be only concerned about the structure of the CFG. This is the only function that looks at non-structural attributes like if something is an exception landing pad. I feel like it's probably better to have this helper function somewhere else and not be a member of GenericCycle.

How about keeping it local to MachineSink.cpp?

97
llvm/include/llvm/ADT/GenericCycleInfo.h
103
  • Other functions are called isEntry and printEntries and the CycleTerminology documentation calls them entries as well. So this should be called getEntries() I think.
  • Use SmallVectorImpl<T> without specifying the size for APIs
131–132

As far as I can tell LoopTerminology.rst defines the term *Header* but not pre-header. Please put a definition into the comment here.

(And consider moving the function out of here anyway as above, since it checks for non-structural properties in isLegalToHoistInto)

136–137

I think we should talk about "cycles" not "loops" here. How about this:

llvm/lib/CodeGen/MachineCycleAnalysis.cpp
17–18

I assume this can be removed given that you dropped the extern declarations in the header?

124–125

FWIW: We should probably change this to !MO.readsReg() (because moving undef use operands around should be no problem); but it's probably out of the scope of this diff and we better do that separately.

llvm/lib/CodeGen/MachineSink.cpp
236–239
468–469
645–651

Move the getCycle results into variables so we don't need to lookup the same block multiple times in the hashmap?

sameerds requested changes to this revision.May 5 2022, 9:52 PM
sameerds added inline comments.
llvm/include/llvm/ADT/GenericCycleImpl.h
70

There is no other precedent in LLVM for this to be consistent with. The proposed change of putting BlockT* as the return type will not work because BlockT is defined inside the GenericCycle<ContextT> class and cannot be used until that class is named. The alternative is to say "GenericCycle<ContextT>::BlockT", but I am very strongly opposed to that kind of repetition.

llvm/include/llvm/ADT/GenericCycleInfo.h
131–132

The document does define preheader, under the section Terminology:

https://llvm.org/docs/LoopTerminology.html#terminology

Good catch about isLegalToHoistInto. But maybe only that check needs to be moved out of this file? The preheader itself is still defined purely in terms of CFG structure.

llvm/include/llvm/CodeGen/MachineCycleAnalysis.h
22–23

Why was this removed? This is necessary to prevent automatic instantiation of the template at the point of use. This particular template instance is defined once in MachineCycleAnalysis.cpp

llvm/lib/CodeGen/MachineCycleAnalysis.cpp
10–14

This ordering of headers is wrong. MachineCycleAnalysis.h should be the first header. I am surprised that clang-format did not do this automatically.

17–18

No this should be retained along with the extern declarations in the header.

This revision now requires changes to proceed.May 5 2022, 9:52 PM
MatzeB added inline comments.May 6 2022, 9:00 AM
llvm/include/llvm/ADT/GenericCycleImpl.h
70

I would rather see the longer GenericCycle<ContextT>::BlockT than the unusual trailing return syntax that isn't even necessary here.

Seems like we have to agree to disagree here. And I guess that means leaving it unchanged if I cannot convince you...

llvm/include/llvm/ADT/GenericCycleInfo.h
131–132
  • Oh, I was reading CycleTerminology and missed this one. I would still recommend to add 1 or 2 sentences here to what a preheader is and maybe also add it to CycleTerminology.rst.
  • Moving just the isLegalToHoistInto call works too of course.
wxiao3 added a subscriber: wxiao3.May 6 2022, 10:50 PM
shchenz updated this revision to Diff 429136.May 12 2022, 9:24 PM
shchenz marked 18 inline comments as done.

address @MatzeB & @sameerds comments

shchenz added inline comments.May 12 2022, 9:24 PM
llvm/include/llvm/ADT/GenericCycleImpl.h
70

I leave it as now to make sure it is uniform at least in this file. We can change all these functions with a follow-up NFC patch if you insist.

82

This is from getLoopPreheader() and isLegalToHoistInto() check was intentionally added in https://reviews.llvm.org/D34487. And IMO, we also need this for cycle preheader?

llvm/include/llvm/ADT/GenericCycleInfo.h
103

Thanks. I will commit another patch to rename above getHeader() later.

131–132

I will handle isLegalToHoistInto when we decide where should we put it.

221
llvm/include/llvm/CodeGen/MachineCycleAnalysis.h
48

Moved from lib/CodeGen/MachineCycleAnalysis.cpp. I thought, it is a placeholder for a declare of verify function?

61–62

This function should be implemented for both MIR and IR. Now for machinesink, I just implement MIR version, once IR version is implemented, we can move this function to the template class GenericCycle. So I add this TODO, do you mean we don't need TODO for this case?

llvm/lib/CodeGen/MachineCycleAnalysis.cpp
124–125

OK.

gentle ping

LGTM but let's wait for @sameerds for accept.

llvm/include/llvm/ADT/GenericCycleInfo.h
103

The header is define in CycleTerminology.rst as the first entry we find. So the name of the getHeader function seems fine to me. (The getHeaders here was questionable, because the documented hinted that there is only one and those things were already called entries anyway).

llvm/include/llvm/CodeGen/MachineCycleAnalysis.h
48

Can you just remove it?

sameerds accepted this revision.May 23 2022, 8:26 PM

LGTM! My sincere apologies for the delay ... I have no excuse :(

llvm/include/llvm/ADT/GenericCycleImpl.h
70

FWIW, this got discussed in the original review where this file was added:

https://reviews.llvm.org/D112696#3175521

This revision is now accepted and ready to land.May 23 2022, 8:26 PM

LGTM! My sincere apologies for the delay ... I have no excuse :(

It doesn't matter at all : )

Thanks very much for review! @sameerds @MatzeB

llvm/include/llvm/CodeGen/MachineCycleAnalysis.h
48

Sure, will remove this in the commit.

This revision was landed with ongoing or failed builds.May 23 2022, 10:18 PM
This revision was automatically updated to reflect the committed changes.

It looks like this commit broke the -DLLVM_ENABLES_MODULES=1 build: https://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/43994/changes

@shchenz Would you consider reverting your patch? Let me know if you have trouble reproducing the issue.

@shchenz Would you consider reverting your patch? Let me know if you have trouble reproducing the issue.

@aprantl Sorry for the inconvenience. I reverted this patch. However, I can not reproduce the build failure on PowerPC Linux with follow command: (I get it from lldb buildbot):

cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_COMPILER=/stage1/clang -DCMAKE_CXX_COMPILER=/stage1/clang++ -DLLVM_ENABLE_ASSERTIONS=1  -DLLVM_ENABLES_MODULES=1 -DLLVM_ENABLE_PROJECTS="clang;lld;lldb;cross-project-tests" -DLLVM_TARGETS_TO_BUILD="X86;ARM;AArch64" ../llvm/ ; make

I will do more test to reproduce the failure and let me know if I miss something in the above command. Thanks.

nikic added a comment.May 30 2022, 4:18 AM

The change description says that MachineSink is "sensitive" to cycle depth. Is it only the performance of the resulting program, or does it affect correctness too?

I think the depth check should only impact the performance in machinesink. However there are some codes(hasStoreBetween) which does not work right for irreducible loops, with this patch, we can get correct behavior. But I think we need more investigation for that function. I plan to do more investigation later to avoid errors.

Just to double check, did you verify that hasStoreBetween() works fine with the new implementation? I'm slightly concerned that we may be reintroducing the miscompile depending on how CycleInfo gets chosen.

shchenz added a comment.EditedMay 30 2022, 9:10 PM

The change description says that MachineSink is "sensitive" to cycle depth. Is it only the performance of the resulting program, or does it affect correctness too?

I think the depth check should only impact the performance in machinesink. However there are some codes(hasStoreBetween) which does not work right for irreducible loops, with this patch, we can get correct behavior. But I think we need more investigation for that function. I plan to do more investigation later to avoid errors.

Just to double check, did you verify that hasStoreBetween() works fine with the new implementation? I'm slightly concerned that we may be reintroducing the miscompile depending on how CycleInfo gets chosen.

Yes, I did some check before, with the machinecycle, we should be ok now for hasStoreBetween() (I am not 100% sure though).
1: We will not sink instructions from BBFrom in a cycle leading by HeaderA to a BBTo in another cycle leading by HeaderB (HeaderA and HeaderB are two headers of a irreducible cycle), as BBFrom must dominate BBTo, but HeaderA and HeaderB does not need to be dominated.
2: We will also not sink instruction from predecessor block of header cycle to a block inside the cycle no matter the cycle is reducible or irreducible.

FYI: I suspect this change caused a compile-time regression from 60sec to >40 minutes for an interpreter-loop based on computed-gotos. See https://github.com/llvm/llvm-project/issues/57664

fhahn added a subscriber: fhahn.Sep 10 2022, 9:08 AM

It looks like this is causing a huge performance regression: https://github.com/llvm/llvm-project/issues/57664

@shchenz could you take a look?