This is an archive of the discontinued LLVM Phabricator instance.

[Loop Peeling] Add possibility to enable peeling on loop nests.
ClosedPublic

Authored by ashlykov on Nov 15 2019, 4:56 AM.

Details

Summary

Current peeling implementation bails out in case of loop nests.
The patch introduces the field in TargetTransformInfo structure that
certain targets can use to relax the constraints if it's
profitable (disabled by default).
Also additional option is added to enable peeling manually for
experimenting and testing purposes.

Diff Detail

Event Timeline

ashlykov created this revision.Nov 15 2019, 4:56 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 15 2019, 4:56 AM

+1 to the general direction - we *should* be peeling more, as discussed in
https://bugs.llvm.org/show_bug.cgi?id=43910 https://bugs.llvm.org/show_bug.cgi?id=43924
but i'm not sure i'm qualified to review this.

fhahn added a comment.Nov 22 2019, 3:41 AM

Out of interest, do you have any performance & size numbers on the impact of this change?

llvm/lib/Transforms/Utils/LoopUnrollPeel.cpp
291–292

Update comment.

llvm/test/Transforms/LoopUnroll/peel-loop-conditions.ll
3

I think it would be better to add tests for this in a separate file.

ashlykov updated this revision to Diff 230665.Nov 25 2019, 7:58 AM

@fhahn thanks for your comments, addressed!

Regarding the impact - it highly depends on further simplifications that can be applied to a code after the loop peeling (branch folding, better dependency analysis, reducing of register pressure inside the loop). And since the max number of iterations to be peeled is restricted - the codesize impact is manageable.

As a good case I can point to the nested loops in ImageMagick project (line 3346, https://github.com/ImageMagick/ImageMagick6/blob/master/magick/accelerate-kernels-private.h), I observed a significant gain on the kernel here.

xbolva00 added inline comments.
llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp
158

enable by default?

or atleast for x86?

fhahn added inline comments.Nov 25 2019, 8:29 AM
llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp
158

I don't think that would be good idea, without at least some analysis on the impact on code size on targets turning this on. The cost-model currently assumes innermost loops exclusively.

ashlykov added inline comments.Nov 25 2019, 8:39 AM
llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp
158

The change is intended to be enabled on per-target basis via TTI where it's profitable. If someone could supply enough benchmarking for e.g. x86 target we can proceed further with enabling it/adjusting cost-model.

ashlykov marked 2 inline comments as done.Nov 27 2019, 4:09 AM

Hi @fhahn, do you have any more comments? Is it OK for trunk?

@fhahn Gentle reminder

xbolva00 accepted this revision.Jan 9 2020, 8:07 AM

Makes sense and no regressions expected since off by default.

This revision is now accepted and ready to land.Jan 9 2020, 8:07 AM
This revision was automatically updated to reflect the committed changes.

Hi @RKSimon!

I'm trying to figure that out, the change is supposed to be off by default except the added test case and shouldn't affect anyone functionally.
What's LLVM policy in such cases - revert?

Hi @RKSimon!

I'm trying to figure that out, the change is supposed to be off by default except the added test case and shouldn't affect anyone functionally.
What's LLVM policy in such cases - revert?

Yes please revert if you don't have a fix, if the build stays red then later committers are less likely to notice if their commits cause extra builds fails.

@RKSimon thanks, reverted. The added test is failed with LLVM_ENABLE_EXPENSIVE_CHECKS on. Will re-commit after the fix.

ashlykov updated this revision to Diff 240190.Jan 24 2020, 7:36 AM

Hi @xbolva00!

Could you please take a look on slightly updated changes? The previous version didn't preserve LoopInfo in case of new loops are created during a peeling, that causes a fail on EXPENSIVE_CHECKS builds. It's fixed now by recursively creating corresponding Loop objects by cloneLoop() utility function (moved from LoopUnswitch.cpp to LoopUtils.cpp to reuse).

ashlykov reopened this revision.Feb 3 2020, 4:20 AM

@fhahn, @xbolva00 Gentle reminder, the only change from the previously accepted revision is a fix of EXPENSIVE_CHECKS build failure (missing LoopInfo), change is off by default, can we land it?

This revision is now accepted and ready to land.Feb 3 2020, 4:20 AM

Hi @xbolva00, @fhahn! Gentle ping, can we land the change with the fix added?
For some reasons the revision is marked in Phabricator as accepted even after the modification, I would like to have explicit confirmation that I can commit it.

This revision was automatically updated to reflect the committed changes.