Current peeling implementation bails out in case of loop nests.
The patch introduces the field in TargetTransformInfo structure that
certain targets can use to relax the constraints if it's
profitable (disabled by default).
Also additional option is added to enable peeling manually for
experimenting and testing purposes.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
+1 to the general direction - we *should* be peeling more, as discussed in
https://bugs.llvm.org/show_bug.cgi?id=43910 https://bugs.llvm.org/show_bug.cgi?id=43924
but i'm not sure i'm qualified to review this.
@fhahn thanks for your comments, addressed!
Regarding the impact - it highly depends on further simplifications that can be applied to a code after the loop peeling (branch folding, better dependency analysis, reducing of register pressure inside the loop). And since the max number of iterations to be peeled is restricted - the codesize impact is manageable.
As a good case I can point to the nested loops in ImageMagick project (line 3346, https://github.com/ImageMagick/ImageMagick6/blob/master/magick/accelerate-kernels-private.h), I observed a significant gain on the kernel here.
llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp | ||
---|---|---|
158 | enable by default? or atleast for x86? |
llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp | ||
---|---|---|
158 | I don't think that would be good idea, without at least some analysis on the impact on code size on targets turning this on. The cost-model currently assumes innermost loops exclusively. |
llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp | ||
---|---|---|
158 | The change is intended to be enabled on per-target basis via TTI where it's profitable. If someone could supply enough benchmarking for e.g. x86 target we can proceed further with enabling it/adjusting cost-model. |
@ashlykov This is failing on EXPENSIVE_CHECKS builds - revert? http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-debian/builds/1493/
Hi @RKSimon!
I'm trying to figure that out, the change is supposed to be off by default except the added test case and shouldn't affect anyone functionally.
What's LLVM policy in such cases - revert?
Yes please revert if you don't have a fix, if the build stays red then later committers are less likely to notice if their commits cause extra builds fails.
@RKSimon thanks, reverted. The added test is failed with LLVM_ENABLE_EXPENSIVE_CHECKS on. Will re-commit after the fix.
Hi @xbolva00!
Could you please take a look on slightly updated changes? The previous version didn't preserve LoopInfo in case of new loops are created during a peeling, that causes a fail on EXPENSIVE_CHECKS builds. It's fixed now by recursively creating corresponding Loop objects by cloneLoop() utility function (moved from LoopUnswitch.cpp to LoopUtils.cpp to reuse).
enable by default?
or atleast for x86?