This is an archive of the discontinued LLVM Phabricator instance.

[LoopUnroll] Separate peeling from unrolling
ClosedPublic

Authored by nikic on May 29 2021, 10:00 AM.

Details

Summary

Loop peeling is currently performed as part of UnrollLoop(). The whole setup is somewhat odd, here is my current understanding of the situation:

Based on that reasoning (and suggested by @efriedma on the referenced review) I've moved the peeling code out of UnrollLoop() into tryToUnrollLoop(). An error is thrown if a test explicitly requests to perform both peeling and unrolling.

TBH the way peeling is implemented has left me very confused, so I'm not sure I'm doing the right thing here.

Diff Detail

Event Timeline

nikic created this revision.May 29 2021, 10:00 AM
nikic requested review of this revision.May 29 2021, 10:00 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 29 2021, 10:00 AM
lebedev.ri requested changes to this revision.May 29 2021, 10:05 AM
lebedev.ri added a subscriber: lebedev.ri.

This doesn't make sense to me.
So now if we ask to unroll loop, and peel stuff first, it will only do the peeling?
The tests even show as much.

This revision now requires changes to proceed.May 29 2021, 10:05 AM
nikic updated this revision to Diff 348653.May 29 2021, 11:37 AM
nikic edited the summary of this revision. (Show Details)

Yes, silently ignoring one option isn't good. I've changed the code to emit an error if both options are specified.

lebedev.ri resigned from this revision.May 30 2021, 2:26 PM

I was actually talking about not the explicit options, but how this works in the pipeline,
and i guess this somehow does not affect that use-case at all.
IMO that becomes even more confusing, but since this doesn't regress pipelines, i don't care.

I was actually talking about not the explicit options, but how this works in the pipeline,
and i guess this somehow does not affect that use-case at all.
IMO that becomes even more confusing, but since this doesn't regress pipelines, i don't care.

Yes, this does not affect the optimization pipeline, it only affects tests that specify explicit unroll counts.

I think it is worth noting that there are cases where performing both (non-PGO) peeling and unrolling could be sensible, namely when invariance peeling renders a previously non-analyzable loop analyzable. However, this requires that we first perform a peel and then re-analyze the loop for unrolling profitability from scratch, not that we perform a combined peel and unroll in a single step. As such, even if we want to support both peeling and unrolling a loop in the future, I believe this patch would still be the necessary first step in that direction.

I'm not sure we can drop the behavior of both peeling and unrolling. The tests use the command lines, but I believe there's a pragma/metadata mechanism which can be used to achieve the same effect. (Vague memory of prior conversations w/ @Meinersbur)

Supporting the existing behavior with the new code structure doesn't seem too hard, would you mind updating to continue supporting both? (Well, while fixing the miscompile noted in D103620.)

nikic added a comment.Jun 3 2021, 3:01 PM

I'm not sure we can drop the behavior of both peeling and unrolling. The tests use the command lines, but I believe there's a pragma/metadata mechanism which can be used to achieve the same effect. (Vague memory of prior conversations w/ @Meinersbur)

While there is indeed metadata to set an unroll count, I don't believe there is metadata to set a peel count. If an unroll count is set via metadata, we will not attempt to compute a peel count.

Supporting the existing behavior with the new code structure doesn't seem too hard, would you mind updating to continue supporting both? (Well, while fixing the miscompile noted in D103620.)

I don't think the existing behavior makes sense, so I would prefer not to preserve it. It seems pointless to go out of our way to allow a scenario in testing that cannot appear as part of the optimization pipeline.

nikic planned changes to this revision.Jun 4 2021, 12:12 PM

@nikic I found your last comment convincing.

reames accepted this revision.Jun 4 2021, 12:27 PM

LGTM

llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp
1189

In the future, we really should pull out a function with the other loop simplification, and use that here, but that's for the future.

nikic updated this revision to Diff 349952.Jun 4 2021, 1:14 PM

If an explicit PeelCount has been provided for testing, use it directly and don't try to go through other unroll heuristics. Otherwise we might try to use the explicit test PeelCount together with a full unroll heuristic. (Once again, this doesn't affect the optimization pipeline, just testing.)

Also preserve existing tests by removing explicit unroll counts from tests that already have an explicit peel count. The tests are still reasonable apart from that.

This revision is now accepted and ready to land.Jun 4 2021, 1:14 PM
reames added a comment.Jun 4 2021, 2:23 PM

LGTM still holds.

This revision was automatically updated to reflect the committed changes.