Meanwhile, use UniformityAnalysis instead of LegacyDivergenceAnalysis to collect divergence info.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
- added checks for the opt command in si-annotate-nested-control-flow.ll
- applied clang-format
llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.cpp | ||
---|---|---|
126–127 | Not sure why you really need to change this, but the function should probably be a template argument. Is DA just missing a layer to present the PM independent analysis? |
llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.cpp | ||
---|---|---|
126–127 | The LegacyPM version depends on LegacyDivergenceAnalysis info whereas the NewPM version of this pass depends on DivergenceInfo. |
llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.cpp | ||
---|---|---|
126–127 | There should be a common logic DivergenceInfo that both pass versions export |
- isUniformlyReached now accepts a template argument instead of a std::function to allow both pass managers to use their corresponding DivergenceAnalysis info.
- DivergenceAnalysis::isUniform(Value &V) -> DivergenceAnalysis::isUniform(Value *V)
llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.cpp | ||
---|---|---|
126–127 | I'm still confused. LegacyDivergenceAnalysis and DivergenceAnalysis/DivergenceInfo are two *different* analyses. LegacyDivergenceAnalysis is *not* the Legacy pass manager version of DivergenceAnalysis. New pass manager shouldn't invisibly switch the analysis used. I think LegacyDivergenceAnalysis needs to be ported to new PM first |
llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.cpp | ||
---|---|---|
126–127 | Ahh I guess I got confused with the names as well. Sure, I will port LegacyDivergenceAnalysis to newPM first. |
At this point, there are three analyses as follows:
- LegacyDA is actually wrong, but useful in some cases involving irreducible control flow.
- GpuDA is correct, but treats irreducible control flow very conservatively.
- UniformityAnalysis is both correct and better at irreducible control flow.
Note that in almost every use on AMDGPU, LegacyDA simply acts as a wrapper for GpuDA.
Since this is fresh work, why not just go straight to UniformityAnalysis? I would definitely recommend that because not only does go straight to new stuff, it also avoids the unnecessary template. This would be a good time to use the new analysis. And if issues crop up that can't be fixed, then we should fall back no further than GpuDA, and unify their exposed Info objects along the way.
llvm/include/llvm/Analysis/DivergenceAnalysis.h | ||
---|---|---|
177 ↗ | (On Diff #491230) | Why is this change necessary? The point of accepting a reference is to indicate that nullptr is not expected inside the function. |
llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.cpp | ||
---|---|---|
126 | Can you give a more meaningful name to the template argument? For instance, DivergenceAnalysisType? |
llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.cpp | ||
---|---|---|
189–191 | I think you just removed this accidentally. |
I will create a new patch to replace LegacyDivergenceAnalysis with Uniformity analysis, unless it's too early to do so.
llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.cpp | ||
---|---|---|
189–191 | Was this comment addressed? | |
197 | Either declare parameters as references, or implement checks for nullptr in the function body. Also, can any of these be const? | |
337–338 | If you know that a variable cannot be nullptr, it is preferable to declare it as a reference rather than a pointer. It can be converted into a pointer at the point where some function expects a pointer. For example, the call below can be "whatever.run(&DT, &PDT, UA)". |
llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.cpp | ||
---|---|---|
189–191 | yup, I moved it under AMDGPUUnifyDivergentExitNodesImpl::run(..) |
llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.cpp | ||
---|---|---|
197 | DominatorTree may be updated but PostDominatorTree and UniformityInfo are preserved. |
The commit message should make it clear that you're also changing it to use UniformityAnalysis instead of (Legacy?)DivergenceAnalysis.
llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.h | ||
---|---|---|
58 | It looks like only this class needs to be declared in the .h file. The others could stay in the .cpp file. |
- Move AMDGPUUnifyDivergentExitNodesImpl and AMDGPUUnifyDivergentExitNodes classes to the source file from the header file.
- do not need to include PassManager
- TTI should be private instead of protected in the Impl
llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.cpp | ||
---|---|---|
342 | DT can be null where, which is UB https://lab.llvm.org/buildbot/#/builders/5/builds/32228/steps/16/logs/stdio |
Impl::run(..) now accepts DominatorTree* instead of DominatorTree& to avoid undefined behavior due to the possibility of DT == nullptr.
llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.cpp | ||
---|---|---|
342 | Great catch! How does it look now? |
llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.cpp | ||
---|---|---|
192–195 | When is DT null but PDT won't be? |
llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.cpp | ||
---|---|---|
192–195 | This pass depends on PDT so it can never be null. DT may be null if it does not need to be preserved, which is specified by -simplifycfg-require-and-preserve-domtree. |
It looks like only this class needs to be declared in the .h file. The others could stay in the .cpp file.