This is an archive of the discontinued LLVM Phabricator instance.

[LoopUnroll] Peel iterations based on select conditions
ClosedPublic

Authored by caojoshua on May 21 2023, 2:32 PM.

Details

Summary

This also allows us to peel loops with a select:

for (int i = 0; i <= N; ++i);
  f3(i == 0 ? a : b); // select instruction

into:

f3(a); // peel one iteration
for (int i = 1; i <= N; ++i)
  f3(b);

Diff Detail

Event Timeline

caojoshua created this revision.May 21 2023, 2:32 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 21 2023, 2:32 PM
caojoshua updated this revision to Diff 524132.May 21 2023, 2:52 PM
caojoshua edited the summary of this revision. (Show Details)

More details examples in commit msg

caojoshua updated this revision to Diff 524133.May 21 2023, 2:55 PM
caojoshua edited the summary of this revision. (Show Details)

update commit msg

caojoshua published this revision for review.May 21 2023, 3:02 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 21 2023, 3:02 PM
nikic added a comment.May 22 2023, 1:21 PM

Please restrict this patch to handle select only. Let's handle and/or in a followup.

llvm/lib/Transforms/Utils/LoopPeel.cpp
352

Why is this code needed now?

452

You probably meant m_And/m_Or here. m_CombineAnd/m_CombineOr will put &I into both LHS and RHS, which is certainly not what you intended.

caojoshua added inline comments.May 22 2023, 1:33 PM
llvm/lib/Transforms/Utils/LoopPeel.cpp
352

Latch branch conditions can peel the entire loop.

for (i = 0; i < 4; ++i) {
  foo();
  // The backedge branch, i < 4, is known for all iterations, and could make us peel the entire loop
}

When peeling only branch instructions, we avoid this by ignoring the branch condition of the latch block. But with this patch, we don't ignore and/or's that are used by the latch branch.

I will take out and/or's from this patch as recommended, but I would still like to keep this code block here. It's not breaking and it avoids unnecessary transformations.

caojoshua updated this revision to Diff 524562.May 22 2023, 9:44 PM
caojoshua retitled this revision from [LoopUnroll] Peel iterations based on select, and, or conditions to [LoopUnroll] Peel iterations based on select conditions.
caojoshua edited the summary of this revision. (Show Details)

Restrict patch to only handle selects. Handle and/or in another patch.

nikic added inline comments.May 23 2023, 5:33 AM
llvm/lib/Transforms/Utils/LoopPeel.cpp
352

The getZExtValue() here can assert if we the BECount is larger than 64 bits. You can use getLimitedValue() instead.

When peeling only branch instructions, we avoid this by ignoring the branch condition of the latch block. But with this patch, we don't ignore and/or's that are used by the latch branch.

FWIW, I think that's a problem in your and/or handling: You shouldn't pick up random and/ors, but only those rooted at a br or select.

But anyway, I don't mind adding the check.

caojoshua updated this revision to Diff 524982.May 23 2023, 8:25 PM

use getLimitedValue() instead of getZExtValue()

caojoshua marked an inline comment as done.May 23 2023, 8:30 PM
caojoshua added inline comments.
llvm/lib/Transforms/Utils/LoopPeel.cpp
352

Thanks, using getLimitedValue() in newest changes. I added a test for peeling large loops BECounts (>2^64) to precommit tests locally.

FWIW, I think that's a problem in your and/or handling: You shouldn't pick up random and/ors, but only those rooted at a br or select.

Benefit of checking random and/ors is that allows peeling nested and/ors easily. We can also peel for and/ors not used in branches/selects, for example, we might pass a boolean to a function (probably rare in practice though).

nikic accepted this revision.May 24 2023, 12:31 AM

LGTM

llvm/lib/Transforms/Utils/LoopPeel.cpp
352

Benefit of checking random and/ors is that allows peeling nested and/ors easily. We can also peel for and/ors not used in branches/selects, for example, we might pass a boolean to a function (probably rare in practice though).

If we wanted to peel such cases, we would just look at all icmps, regardless of where they occur. But we specifically don't want to peel cases like that, because it's not useful. (It does not enable any meaningful follow-up simplification.)

This revision is now accepted and ready to land.May 24 2023, 12:31 AM
This revision was landed with ongoing or failed builds.May 24 2023, 12:58 AM
This revision was automatically updated to reflect the committed changes.
caojoshua marked an inline comment as done.