This is an archive of the discontinued LLVM Phabricator instance.

Intergerate Loop Peeling into Loop Fusion
ClosedPublic

Authored by sidbav on Jun 30 2020, 6:53 PM.

Details

Summary

This patch adds the ability to peel off iterations of the first loop in loop fusion. This can allow for both loops to have the same trip count, making it legal for them to be fused together.

Here is a simple scenario peeling can be used in loop fusion:

for (i = 0; i < 10; ++i)
  a[i] = a[i] + 3;
for (j = 1; j < 10; ++j)
  b[j] = b[j] + 5;

Here is we can make use of peeling, and then fuse the two loops together. We can peel off the 0th iteration of the loop i, and then combine loop i and j for i = 1 to 10.

a[0] = a[0] +3;
for (i = 1; i < 10; ++i) {
  a[i] = a[i] + 3;
  b[i] = b[i] + 5;
}

Currently peeling with loop fusion is only supported for loops with constant trip counts and a single exit point. Both unguarded and guarded loops are supported.

This patch is a followup to D80580. In that patch I separate the peeling specific parameters in the UnrollingPreferences struct and form a new struct called PeelingPreferences.

The diff of this patch was created assuming that D80580 is merged.

EDIT: I removed the Loop Peeling files since it is unrelated to fusion. The Loop Peeling changes revision is here: D83056

Diff Detail

Event Timeline

sidbav created this revision.Jun 30 2020, 6:53 PM

Could you update the patch with context (git diff -U999999)?

Meinersbur added inline comments.Jul 1 2020, 2:13 PM
llvm/lib/Transforms/Scalar/LoopFuse.cpp
704
712

Is it possible to return {false, None}? The pair type should be deducible by the compiler and it immediately shows that it returns.

727

Did you consider assigning it to a variable with a meaningful name first such as SameTripCount, then return {SameTripCount, diff}?

749
766
777

The call result is only used when asserts are enabled, please use #ifndef NDEBUG around the call.

790

[typo] exectue

953–955

[nit] Please avoid such spurious changes in the Differential

llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp
166

[nit] Please avoid unrelated changes

llvm/lib/Transforms/Utils/CMakeLists.txt
38

Did you consider making this rename a separate patch? It adds a lot of noise to this one.

sidbav updated this revision to Diff 275134.EditedJul 2 2020, 8:35 AM
sidbav edited the summary of this revision. (Show Details)

Address @Meinersbur comments. Remove the Loop Peeling files since it is unrelated to fusion. The Loop Peeling changes revision is here: D83056

sidbav marked 8 inline comments as done.Jul 2 2020, 8:38 AM
sidbav edited the summary of this revision. (Show Details)
bmahjour added inline comments.Jul 7 2020, 9:48 AM
llvm/lib/Transforms/Scalar/LoopFuse.cpp
124

We only really need one option to control peeling for loop fusion. You can remove loop-fusion-peel and change the default for this threshold to 0. I don't know what others feel about this, but I'd rather have shorter commands when possible.

758

nit: Loop -> loop

766

nit: Peel Loop -> peel loop.

814

nit: Peeled -> peeled

861

nit: comments should be full sentences ending with a period.

862

is able to be peeled -> can be peeled

863

the different -> different

867

Please change this to give up on such loops instead of asserting.

1252

This function no longer does what this comment says. Please update the comment.

1558

moveInstructionsToTheBeginning(FC0.Peeled ? *FC0ExitBlockSuccessor : *FC0.ExitBlock, *FC1.ExitBlock, DT, PDT, DI);

1583
BasicBlock *BBToUpdate = FC0.Peeled ? FC0ExitBlockSuccessor : FC0.ExitBlock;
BBToUpdate->getTerminator()->replaceUsesOfWith(FC1GuardBlock, FC1.Header);
sidbav marked an inline comment as done.Jul 7 2020, 2:00 PM
sidbav added inline comments.
llvm/lib/Transforms/Scalar/LoopFuse.cpp
124

I agree with this idea, setting up the option like this will allow for this option to serve both purposes, of enabling peeling for fusion, and specifying the max value for fusion.

sidbav updated this revision to Diff 276219.Jul 7 2020, 2:43 PM

Address Bardia's comments, major change is removing the loop-fusion-peel option.

sidbav marked 11 inline comments as done.Jul 7 2020, 2:45 PM
bmahjour added inline comments.Jul 8 2020, 6:07 AM
llvm/lib/Transforms/Scalar/LoopFuse.cpp
858

we are checking that the user specifies...

861

I don't think we should check for occurrence of the option on the command line. It suffices to just compare TCDifference against the FusionPeelMaxCount. When the option is not specified the value of FusionPeelMaxCount will be the default, which is currently zero and would prevent peeling).

Similarly FusionPeelMaxCount != 0 is redundant.

sidbav updated this revision to Diff 276419.Jul 8 2020, 7:08 AM

Address Bardia's comments.

sidbav marked 2 inline comments as done.Jul 8 2020, 7:08 AM

Ping @ reviewers! Thanks!

Added some minor comments. Overall looks good to me. I'll approve if there are no objections by EOD today.

llvm/lib/Transforms/Scalar/LoopFuse.cpp
1558

to the to the -> to the

llvm/test/Transforms/LoopFusion/guarded_peel.ll
20

CHECK -> CHECK-NEXT

24

CHECK -> CHECK-NEXT

26

CHECK -> CHECK-NEXT

27

Add: CHECK: for.first.entry.peel.newph

There are also a number of empty basic blocks that make the test less readable. One way to make it more readable is to put a CHECK-NEXT (instead of a CHECK) right after the CHECK for the label to indicate that there is no code in the basic block.

llvm/test/Transforms/LoopFusion/peel.ll
31

Please use CHECK-NEXT for all branches that immediately follow the label.

sidbav updated this revision to Diff 278831.Jul 17 2020, 10:17 AM

Address Bardia's comments

sidbav marked 6 inline comments as done.Jul 17 2020, 10:17 AM
bmahjour accepted this revision.Jul 20 2020, 6:38 AM
This revision is now accepted and ready to land.Jul 20 2020, 6:38 AM
This revision was automatically updated to reflect the committed changes.
rupprecht marked an inline comment as done.Jul 21 2020, 9:33 AM
rupprecht added a subscriber: rupprecht.
rupprecht added inline comments.
llvm/lib/Transforms/Scalar/LoopFuse.cpp
395–397

FYI, the comment here explains why PDT should not be extracted to a var, which this patch did. Perhaps a previous version of this patch needed it? Fixed in 1ee1da1ea5728c2fe07bbf18bb25728ac3512e92.

[1075/3350] Building CXX object lib/Transforms/Scalar/CMakeFiles/LLVMScalarOpts.dir/LoopFuse.cpp.o
/home/rupprecht/src/llvm-project/llvm/lib/Transforms/Scalar/LoopFuse.cpp:390:30: warning: unused variable 'PDT' [-Wunused-variable]
    const PostDominatorTree *PDT = LHS.PDT;
MaskRay added a subscriber: MaskRay.EditedJul 21 2020, 12:26 PM

I am sorry but I reverted this in 8a268bec1b02dd446fbc36e20d0a9af45d764f67.

3 check-llvm-transforms-loopfusion tests failed in an ASAN build.

BTW, your git commit contains extra Phabricator tags. You can drop Reviewers: Subscribers: Tags: and the text Summary: from the git commit with the following script:

arcfilter () {
        arc amend
        git log -1 --pretty=%B | awk '/Reviewers:|Subscribers:/{p=1} /Reviewed By:|Differential Revision:/{p=0} !p && !/^Summary:$/ {sub(/^Summary: /,"");print}' | git commit --amend --date=now -F -
}

Reviewed By: is considered important by some people. Please keep the tag. (--date=now is my personal preference (author dates are usually not useful. Using committer dates can make log almost monotonic in time))

llvm/utils/git/pre-push.py can validate the message does not include unneeded tags. Please clean up tags for future commits (arc land does not seem to work). Thanks.

llvm/lib/Transforms/Scalar/LoopFuse.cpp
796

This line may operate on a deleted BB.

MaskRay added inline comments.Jul 21 2020, 12:31 PM
llvm/test/Transforms/LoopFusion/guarded_unsafeblock_peel.ll
2

file-name.ll is more common than file_name.ll

11

Add some spaces after ; CHECK: so that they align with ; CHECK-NEXT:
I usually indent instructions more than labels.

; CHECK:      for.first.preheader:
; CHECK-NEXT:   br label %for.first
llvm/test/Transforms/LoopFusion/peel.ll
4

This tests or just Test

MaskRay added inline comments.Jul 21 2020, 12:34 PM
llvm/lib/Transforms/Scalar/LoopFuse.cpp
731

Consider const if not mutable

752
779

No need to add too many new lines in this function. It actually harms readability (people can't see too many lines in a screen)

1347

Rationale for an increase? Large inline count can consume more stack space.

MaskRay reopened this revision.Jul 21 2020, 12:34 PM
This revision is now accepted and ready to land.Jul 21 2020, 12:34 PM
MaskRay requested changes to this revision.Jul 21 2020, 12:34 PM
This revision now requires changes to proceed.Jul 21 2020, 12:34 PM
MaskRay added inline comments.Jul 21 2020, 12:41 PM
llvm/test/Transforms/LoopFusion/guarded_unsafeblock_peel.ll
10

Function names are unique. Use ; CHECK-LABEL: -> they improve FileCheck output when some instructions below don't match CHECK lines.

llvm/test/Transforms/LoopFusion/nonadjacent_peel.ll
56

The comments are not aligned.

sidbav updated this revision to Diff 280080.EditedJul 23 2020, 5:48 AM

Address the ASAN build failures (major change is around line 795 in LoopFuse.cpp file). Also address some issues @MaskRay pointed out in lit tests.

sidbav marked 12 inline comments as done.Jul 23 2020, 5:52 AM
sidbav added inline comments.
llvm/lib/Transforms/Scalar/LoopFuse.cpp
1347

Initially was planning to have the logic from lines 795-805 in this block, which is why I increased from 8 -> 16, but I forgot to change it back.

llvm/test/Transforms/LoopFusion/guarded_unsafeblock_peel.ll
2

The Loop Fusion lit tests follow the file_name.ll convention, so I will keep the files names like this.

bmahjour added inline comments.Jul 23 2020, 9:24 AM
llvm/lib/Transforms/Scalar/LoopFuse.cpp
804

Predecessors -> predecessors

806

Please use range-based for loops whenever possible.

MaskRay added inline comments.Jul 23 2020, 12:38 PM
llvm/lib/Transforms/Scalar/LoopFuse.cpp
806
llvm/test/Transforms/LoopFusion/peel.ll
21

It is usually more robust if you include ( - when you have multiple functions named function*, void @function( will still work.

107

Delete trailing empty line

sidbav updated this revision to Diff 280240.Jul 23 2020, 1:11 PM
sidbav marked 2 inline comments as done.

Address review comments

sidbav updated this revision to Diff 280241.Jul 23 2020, 1:14 PM

Missed one comment from Bardia, address that comment.

sidbav marked 5 inline comments as done.Jul 23 2020, 1:15 PM
MaskRay accepted this revision.Jul 23 2020, 1:48 PM

LGTM.

This revision is now accepted and ready to land.Jul 23 2020, 1:48 PM
This revision was automatically updated to reflect the committed changes.