The patch adds a new TargetMachine member "registerPassBuilderCallbacks" for targets to add passes to the pass pipeline using the New Pass Manager (similar to adjustPassManager for the Legacy Pass Manager).
Details
Diff Detail
Event Timeline
I wanted to add the equivalent of adjustPassManager for Hexagon. Should I make that change in this same patch itself?
I think I'm still missing how exactly this will fit into the pipeline. As in where is registerPassBuilderCallbacks() going to be called?
And a test like llvm/test/CodeGen/AMDGPU/opt-pipeline.ll for the new pass manager route would be nice to make sure this actually works.
something like RUN: opt -passes='default<O0>' -mtriple=... -disable-output -disable-verify -debug-pass-manager. Maybe checking all the normal passes isn't necessary like in the existing opt-pipeline.ll isn't necessary, just need to check that the custom passes (e.g. HexagonVectorLoopCarriedReusePass) are actually added.
llvm/lib/Target/Hexagon/HexagonTargetMachine.cpp | ||
---|---|---|
278 | this is typically fully capitalized |
My understanding is that this API gives other applications chance to modify the pass pipeline. HexagonVectorLoopCarriedReuse pass in not included in the pass pipeline for any optimization level, but we used to add the pass using adjustPassManager in Halide. That is why I don't think I can use the suggested method to test if the pass is being added
HexagonTargetMachine::adjustPassManager() adds some Hexagon-specific passes to the optimization pipeline.
$ ./build/rel/bin/opt -disable-output -debug-pass=Structure -O1 /dev/null -mtriple=hexagon |& grep -i hexagon Pass Arguments: -targetlibinfo -tti -targetpassconfig -tbaa -scoped-noalias-aa -assumption-cache-tracker -profile-summary-info -forceattrs -inferattrs -ipsccp -called-value-propagation -globalopt -domtree -mem2reg -deadargelim -domtree -basic-aa -aa -loops -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -instcombine -simplifycfg -basiccg -globals-aa -prune-eh -always-inline -function-attrs -domtree -sroa -basic-aa -aa -memoryssa -early-cse-memssa -simplifycfg -domtree -basic-aa -aa -loops -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -instcombine -libcalls-shrinkwrap -loops -postdomtree -branch-prob -block-freq -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -pgo-memop-opt -simplifycfg -reassociate -domtree -loops -loop-simplify -lcssa-verification -lcssa -basic-aa -aa -scalar-evolution -loop-rotate -memoryssa -lazy-branch-prob -lazy-block-freq -licm -loop-unswitch -simplifycfg -domtree -basic-aa -aa -loops -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -instcombine -loop-simplify -lcssa-verification -lcssa -scalar-evolution -indvars -loop-idiom -hexagon-loop-idiom -domtree -loops -loop-simplify -lcssa-verification -lcssa -scalar-evolution -loop-deletion -loop-unroll -hexagon-vlcr -phi-values -basic-aa -aa -memdep -memcpyopt -sccp -demanded-bits -bdce -aa -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -instcombine -postdomtree -adce -simplifycfg -domtree -basic-aa -aa -loops -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -instcombine -barrier -basiccg -rpo-function-attrs -globalopt -globaldce -basiccg -globals-aa -domtree -float2int -lower-constant-intrinsics -domtree -loops -loop-simplify -lcssa-verification -lcssa -basic-aa -aa -scalar-evolution -loop-rotate -loop-accesses -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -loop-distribute -postdomtree -branch-prob -block-freq -scalar-evolution -basic-aa -aa -loop-accesses -demanded-bits -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -inject-tli-mappings -loop-vectorize -loop-simplify -scalar-evolution -aa -loop-accesses -lazy-branch-prob -lazy-block-freq -loop-load-elim -basic-aa -aa -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -instcombine -simplifycfg -domtree -vector-combine -basic-aa -aa -loops -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -instcombine -loop-simplify -lcssa-verification -lcssa -scalar-evolution -loop-unroll -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -instcombine -memoryssa -loop-simplify -lcssa-verification -lcssa -scalar-evolution -lazy-branch-prob -lazy-block-freq -licm -opt-remark-emitter -transform-warning -alignment-from-assumptions -strip-dead-prototypes -cg-profile -domtree -loops -postdomtree -branch-prob -block-freq -loop-simplify -lcssa-verification -lcssa -basic-aa -aa -scalar-evolution -block-freq -loop-sink -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -instsimplify -div-rem-pairs -simplifycfg -verify Recognize Hexagon-specific loop idioms Hexagon-specific loop carried reuse for HVX vectors
opt.cpp contains TM->adjustPassManager(Builder); for the legacy PM path.
The thing you're adding here I thought was the equivalent for the new pass manager. Am I missing something?
This is to allow target-specific passes to be inserted into the pass pipeline. This is the exact equivalent of adjustPassManager for the legacy manager. See EmitAssemblyHelper::CreatePasses in clang/lib/CodeGen/BackendUtil.cpp.
Perhaps EmitAssemblyHelper::EmitAssemblyWithNewPassManager in BackendUtil.cpp. PassBuilder actually has TM as a member, so maybe from somewhere inside of PassBuilder itself, but I'm not sure whether that's the right thing to do.
Anywhere adjustPassManager() is called should have a corresponding call to registerPassBuilderCallbacks() in the NPM path. So yes EmitAssemblyHelper::EmitAssemblyWithNewPassManager() in BackendUtil.cpp, but also since opt.cpp calls adjustPassManager(), the corresponding NPM path in NewPMDriver.cpp should also call registerPassBuilderCallbacks(). Both places have access to a TM.
This should be done either before or in this patch, or else we can't test it and make sure it works.
Sorry I think I misunderstood the usage of adjustPassManager earlier. I'll add the calls in both opt and clang and add the test case as well. Thanks!
This looks good, just a couple nits
llvm/lib/Target/Hexagon/HexagonTargetMachine.cpp | ||
---|---|---|
280 | We should probably thread the DebugPassManager bool into here as a new param to registerPassBuilderCallbacks(). | |
llvm/test/CodeGen/Hexagon/registerpassbuildercallbacks.ll | ||
4 | I think just the Running pass: ...Pass part is sufficient, the on Loop ... seems brittle | |
7 | How did you come up with this IR? I've typically seen something like the IR in llvm/test/Other/new-pm-defaults.ll to test this sort of stuff. |
llvm/test/CodeGen/Hexagon/registerpassbuildercallbacks.ll | ||
---|---|---|
7 | I generated the .ll through C source file. Is it better to add the testcase to llvm/test/Other/new-pm-defaults.ll itself? |
llvm/test/CodeGen/Hexagon/registerpassbuildercallbacks.ll | ||
---|---|---|
7 | I think copying that here would be better rather than using something generated by clang. That IR was specifically constructed for this sort of test. |
One nit, otherwise lgtm
Thanks for doing this!
llvm/include/llvm/Target/TargetMachine.h | ||
---|---|---|
300 | I'd prefer no default value so callers don't forget to pass it since there should always be a user option to enable pass manager logging, which is already somewhat enforced by no default value for debug logging in the constructors for the various pass managers. |
I think this, and similar recent commits, are causing the shared library builds to fail some tests if this code gets linked into libLLVM.so: https://bugs.llvm.org/show_bug.cgi?id=48181. I assume it might actually a bug in ld (GNU Binutils for Ubuntu 2.34), as I don't understand the linker behavior there?
I'd prefer no default value so callers don't forget to pass it since there should always be a user option to enable pass manager logging, which is already somewhat enforced by no default value for debug logging in the constructors for the various pass managers.