This is an archive of the discontinued LLVM Phabricator instance.

Bugfix for application of trivial loop optimization in LoopIdiomRecognize(Github Issue #50308)
Needs ReviewPublic

Authored by razetime on Jan 6 2022, 9:03 AM.

Details

Summary

Github issue: Trivial memset optimization not applied to loops under -Oz (LoopIdiomRecognize)

When opt -loop-idiom is run on an LLVM IR file containing a countable unrotated loop containing stores of a simple constant to a single array, a memset optimization is not applied under -Oz or -loop-idiom. The optimization is not applied since runOnLoopBlock stops before the memset optimization is applied on the loop.

runOnLoopBlock stops before the optimization because it checks whether each basic block dominates all exit blocks in the loop. For unrotated loops, this check causes an early return. To work around this, a boolean representing the rotation form of the loop is passed to runOnLoopBlock. Based on the given boolean, the check for domination of the exit blocks is done. Otherwise, the check is skipped and the optimization is applied directly.

Diff Detail

Event Timeline

razetime created this revision.Jan 6 2022, 9:03 AM
razetime requested review of this revision.Jan 6 2022, 9:03 AM
razetime edited the summary of this revision. (Show Details)Jan 6 2022, 9:08 AM
lebedev.ri added inline comments.
llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
628–636

I do not understand why we can just not do this check for unrotated loops?

razetime added inline comments.Jan 6 2022, 9:25 PM
llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
628–636

The check unconditionally failed on the unrotated test case I was using from https://llvm.godbolt.org/z/YrGqGhnE9.

The place where it failed was when BB =loop.latch and ExitBlock = exit. pre-rotating the loop would work, but it seemed unintuitive and costlier than skipping the check.

ChuanqiXu added inline comments.Jan 6 2022, 9:40 PM
llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
628–636

It might not be a good reason to remove a check simply if it fails on a false negative condition. Since the condition might do many other true negative checks.

llvm/test/Transforms/LoopIdiom/memset-unrotated-loop.ll
7–11

This transformation looks not good. Given %start is not no alias and %start == %end, then the original behavior would be no-op. However, it would set value 1 at %start in the transformed code. See https://llvm.godbolt.org/z/a9czn4z4b, which contains the original semantics.

ast resigned from this revision.Jan 6 2022, 9:54 PM
razetime added inline comments.Jan 7 2022, 1:41 AM
llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
628–636

This is the dominator tree for the given IR file:

The problem here is that loop.latch cannot dominate exit since they're on the same level i.e. this check returns false:

From llvm/include/llvm/Support/GenericDomTree.h

// A can only dominate B if it is higher in the tree.
if (A->getLevel() >= B->getLevel()) return false;

I'm not sure how this check can be modified to accept the IR for optimization. Maybe a completely different check should be added?

razetime added inline comments.Jan 7 2022, 1:47 AM
llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
628–636

Since the IR optimization is incorrect, is pre-rotating the loop a valid strategy here?

razetime marked 2 inline comments as not done.Jan 7 2022, 9:24 AM
ChuanqiXu added inline comments.Jan 10 2022, 5:50 PM
llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
628–636

Yeah, when we made a change, not only we need to consider how could we make the false negative case to pass, we need to consider if the change would affect other cases, too. You are talking about the first part and we are talking about the second part. For the concrete suggestion, maybe @fhahn could help?

fhahn added inline comments.Jan 18 2022, 4:00 AM
llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
628–636

So to support the loop-not-rotated-case here, we could check if it can trivially rotated (header is the only exiting block, all instructions in header can moved/duplicated).

Then proceed with the analysis. If we can replace the loop, we need to rotate the loop before we perform a transform. To start with, we should only rotate if the loop will go away entirely, i.e. it only contains memory operations that will be removed.

arsenm resigned from this revision.Nov 16 2022, 3:38 PM

I'm not the right person for loop transforms

Herald added a project: Restricted Project. · View Herald TranscriptNov 16 2022, 3:38 PM
Herald added a subscriber: StephenFan. · View Herald Transcript
lebedev.ri resigned from this revision.Jan 12 2023, 5:32 PM

This review may be stuck/dead, consider abandoning if no longer relevant.
Removing myself as reviewer in attempt to clean dashboard.