This is an archive of the discontinued LLVM Phabricator instance.

[LoopUnroll] Eliminate PreserveCondBr parameter and fix a bug in the process
ClosedPublic

Authored by reames on Jun 3 2021, 8:21 AM.

Details

Summary

This builds on D103584. The change eliminates the coupling between unroll heuristic and implementation w.r.t. knowing when the passed in trip count is an exact trip count or a max trip count. In theory the new code is slightly less powerful (since it relies on exact computable trip counts), but in practice, it appears to cover all the same cases. It can also be extended if needed.

The test change shows what appears to be a bug in the existing code around the interaction of peeling and unrolling. The original loop only ran 8 iterations. The previous output had the loop peeled by 2, and then an exact unroll of 8. This meant the loop ran a total of 10 iterations which appears to have been a miscompile.

Diff Detail

Event Timeline

reames created this revision.Jun 3 2021, 8:21 AM
reames requested review of this revision.Jun 3 2021, 8:21 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 3 2021, 8:21 AM
nikic added inline comments.Jun 3 2021, 1:00 PM
llvm/lib/Transforms/Utils/LoopUnroll.cpp
380

I don't like this. This not resulting in regressions just means we have bad test coverage. I've added an extra test in https://reviews.llvm.org/rG33e41eaecdd7 that should fail after this change.

It would be better to base this on the exact trip count of ExitingBlock (below) and then generalize from there.

772

I think it would be better to clamp ULO.Count to MaxTripCount upfront, and avoid these special cases for MaxTripCount and ExactTripCount. It both makes the code simpler, and the unrolling result simpler.

This is actually already done at Effectively "DCE" unrolled iterations above, but we should do it based on the MaxTripCount, not ULO.TripCount.

reames updated this revision to Diff 349673.Jun 3 2021, 1:31 PM

address review comment

reames added inline comments.Jun 3 2021, 1:33 PM
llvm/lib/Transforms/Utils/LoopUnroll.cpp
380

Good suggestion, incorporated.

JFYI, ExitingBI is on my list to kill, but not in this patch. :)

772

I'm not quite sure what you're asking for. It sounds like maybe you want me to change how many iterations we unroll when ULO.Count > MaxTripCount?

If so, I definitely request that be a separate patch. I'm trying to avoid large test diffs with each incremental patch and changing too much at once makes the test diffs really hard to understand.

nikic accepted this revision.Jun 3 2021, 1:57 PM

LGTM

llvm/lib/Transforms/Utils/LoopUnroll.cpp
415

nit: Space after comma.

760–774

nit: Stray semicolon.

770

It may make sense to move the if (j == 0) return false; case to the start, as it's always the same. With that done, you should be able to drop the separate check for ExactUnroll, as the ExactTripCount case below should cover it.

772

Fair enough, I'm happy to have that in a followup. I mainly suggested it here because we would not have to worry about j >= MaxTripCount situations.

This revision is now accepted and ready to land.Jun 3 2021, 1:57 PM
This revision was landed with ongoing or failed builds.Jun 3 2021, 2:10 PM
This revision was automatically updated to reflect the committed changes.
reames added inline comments.Jun 3 2021, 2:20 PM
llvm/lib/Transforms/Utils/LoopUnroll.cpp
770

Landed as is, then did this in cddcc4cf. This code is complicated enough that I want something easy to revert if this final style change turns out be buggy. :)

772

I'm definitely going to hold on doing that until after we split out peeling. I don't believe the current logic is even correct when there's a non-zero peel count. I don't want to build on it yet.