Fix full unrolling with new pass manager. Last we looked at this and couldn't come up with a reason to change it, but with a pragma for full loop unrolling we bypass every other loop unroll and then fail to fully unroll a loop when the pragma is set. Move the OnlyWhenForced out of the check and into the initialization of the full unroll pass in the new pass manager. This doesn't show up with the old pass manager. Add a new option to opt so that we can turn off loop unrolling manually since this is a difference between clang and opt. Tested with check-clang and check-llvm.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Unit tests: pass. 61011 tests passed, 0 failed and 728 were skipped.
clang-tidy: pass.
clang-format: pass.
Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml
Unit tests: pass. 61011 tests passed, 0 failed and 728 were skipped.
clang-tidy: pass.
clang-format: pass.
Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml
clang/test/Misc/loop-opt-setup.c | ||
---|---|---|
13 | The test seems to pass without the code change (llvm/lib/Passes/PassBuilder.cpp) below. |
llvm/lib/Passes/PassBuilder.cpp | ||
---|---|---|
514 | I would document that the normal unroll pass does not honor the forced unroll, it is surprising to me. |
Can we add an LLVM test w/ the metadata so that we have an entirely LLVM test flow that ensures the pass builder DTRT?
(I still would include the Clang side test which is also very useful to test integrating Clang w/ different flows through the pass manager.)
Fixed the clang test.
Tried to get something that I could reduce down and duplicate with just opt but it's been... difficult. Even the small clang testcase in isolation won't duplicate via something like:
clang -O0 -fexperimental-new-pass-manager foo.cc -S -o - -emit-llvm -mllvm -disable-llvm-optzns | opt -passes='default<O1>' -transform-warning -pass-remarks-missed=transform-warning -S -o -
Might be holding it wrong, but this isn't very discoverable if so :)
Wooot about finally having a test case! (Sorry for nit picking it a bit ....)
llvm/test/Transforms/LoopUnroll/FullUnroll.ll | ||
---|---|---|
4–6 | Make sure it is in the correct function at least, and maybe after the label for the loop header? | |
12–17 | Are both functions needed? | |
19–20 | Nit, but minimize and clean this function up a touch? At the least, removing all the target features seems valuable, and I'd give things stable names instead of numbered values. |
OK, ready again :)
llvm/test/Transforms/LoopUnroll/FullUnroll.ll | ||
---|---|---|
4–6 | Got it. There's not a lot of function left at the end: define void @walrus() local_unnamed_addr #0 { br label %for.cond.preheader for.cond.preheader: ; preds = %for.cond.preheader, %entry br label %for.cond.preheader } | |
12–17 | I thought so, but apparently not :) | |
19–20 | Got it. Most things have stable names, only a few temporaries have numbered names. |
clang/test/Misc/loop-opt-setup.c | ||
---|---|---|
12 | This is dead now that you have different prefixes... | |
32–33 | But for which function? I'd rework these to be more modern CHECK-LABEL and affirmative checks for the unrolling. | |
llvm/test/Transforms/LoopUnroll/FullUnroll.ll | ||
4–6 | Then switch to generated CHECK lines? The checks we have here will pass easily after incorrect edits that cause this test to not actually check what it intends to. I'd either check something that constructively shows unrolling or generate exact checks (if its small enough to be genuinely stable). | |
19–20 | Sure, but even those will make future edits to this really frustrating with irrelevant diff, etc. And this test case can still be minimized. As an example, the function is currently attributed as *optnone*. =/ I think you can just run this through the metarenamer pass? |
Done. New diff incoming.
clang/test/Misc/loop-opt-setup.c | ||
---|---|---|
12 | So it is. I'd actually missed this comment and saw it when fixing up other things. Thanks :) |
I'm looking at enabling the -enable-npm-optnone flag and FullUnroll.ll fails. I understand that loop unrolling should be forced when some metadata is present, but the FullUnroll.ll test seems to check for a lot more than that. It checks for (roughly) two unconditional branches and no conditional branches. Running the test under -debug-pass-manager and --print-after-all, , it looks like InstCombine does a lot of work removing instructions, and SimplifyCFG is ultimately what removes all the conditional branches. But InstCombine and SimplifyCFG shouldn't run on an optnone function, right?
This is dead now that you have different prefixes...