This is an archive of the discontinued LLVM Phabricator instance.

[LSR] Drop potentially invalid nowrap flags when switching to post-inc IV (PR46943)
ClosedPublic

Authored by nikic on Jan 23 2021, 5:52 AM.

Details

Summary

When LSR converts a branch on pre-inc IV into a branch on post-inc IV, the nowrap flags on the addition may no longer be valid. Previously, a poison result of the addition might have been ignored, in which case the program was well defined. After branching on the post-inc IV, we might be branching on poison, which is undefined behavior.

Fix this by discarding nowrap flags which are not present on the SCEV expression. Nowrap flags on the SCEV expression are proven by SCEV to always hold, independently of how the expression will be used. This is essentially the same fix we applied to IndVars LFTR, which also performs this kind of pre-inc to post-inc conversion.

I believe a similar problem can also exist for getelementptr inbounds, but I was not able to come up with a problematic test case. The inbounds case would have to be addressed in a differently anyway (as SCEV does not track this property).

Fixes https://bugs.llvm.org/show_bug.cgi?id=46943.

Diff Detail

Event Timeline

nikic created this revision.Jan 23 2021, 5:52 AM
nikic requested review of this revision.Jan 23 2021, 5:52 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 23 2021, 5:52 AM

Adding @dmgreen for the changes to the Thumb2 tests.

Hmm. These are the only tests that changed?

They are no longer producing hardware loops for what look like unrolled loops. Those are very old tests though and I don't believe we generate code quite like that any more.

@dmgreen Yeah, those are the only test changes. I believe the problem for those loops is that they have two induction variables, one that counts up and one that counts down. The branch is on the one that counts down (so this is the IV on which flags can be safely preserved), while there's a nuw flag on the one that counts up.

dmgreen accepted this revision.Jan 24 2021, 10:10 AM

Yeah OK. I ran some tests and did some experiments - none of which showed any similar problems. The correctness fix sounds valid to me so I'm inclined to say that the Thumb2 tests are OK (as in not worth worrying about too much) and if we run into similar problems later on we can try and do something about them then.

LGTM

This revision is now accepted and ready to land.Jan 24 2021, 10:10 AM
fhahn added inline comments.Jan 24 2021, 12:56 PM
lib/Transforms/Utils/ScalarEvolutionExpander.cpp
1446 ↗(On Diff #318751)

OverflowingBinaryOperator?

test/Transforms/LoopStrengthReduce/X86/pr46943.ll
48 ↗(On Diff #318751)

Is this transform actually still helpful, if we have to drop flags to do it?

nikic added inline comments.Jan 24 2021, 1:07 PM
test/Transforms/LoopStrengthReduce/X86/pr46943.ll
48 ↗(On Diff #318751)

As LSR runs in the backend, and the backend makes rather little use of nowrap flags, I would assume so. When we did the same change in LFTR (which runs in the middle of the pipeline where nowrap flags are more important), I don't think any performance regressions were reported.

nikic updated this revision to Diff 318864.Jan 24 2021, 1:10 PM

Use OverflowingBinaryOperator.

LGTM as well w/minor comments.

llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
1447

if (auto *op = dyn_cast<OBO>...)

e.g. you don't need the instruction, you can use the accessors on OBO.

1451

Any reason not to just copy the SCEV flags? Inferring stronger flags should be legal here.

llvm/test/Transforms/LoopStrengthReduce/X86/sibling-loops.ll
20

Given the nsw is present in the source, SCEV should know this is nsw. Any idea why it doesn't?

nikic added inline comments.Jan 25 2021, 12:56 AM
llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
1447

OBO can generally also be a constant expression, which is why it's not allowed to set nowrap flags directly on it. We need to go through Instruction to modify the flags.

1451

I don't think that's quite correct without additional checks. We're checking the flags on the post-inc addrec, which don't make any statement about overflow on the first iteration. If you have something like pre-inc {255,+,1} and {0,+,1} post-inc, then the latter would be nuw (assuming appropriate BE count), while the former would not be. The add can't be nuw in that case, due to the overflow on the first iteration.

llvm/test/Transforms/LoopStrengthReduce/X86/sibling-loops.ll
20

The %inc IV doesn't seem to ever be branched on, so there's no guarantee that %inc being poison would result in undefined behavior. Thus SCEV can't transfer poison flags from IR. There are some additional cases we could transfer using D92739 (for branches in non-latch exits), but I don't think that would help this case either.

fhahn added inline comments.Jan 25 2021, 4:28 AM
llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
1447

But can Result be a constant expression here? PN must be an AddRec for L, then the incoming value from the loop should be non-constant, otherwise it wouldn't be an AddRec?

nikic added inline comments.Jan 25 2021, 5:13 AM
llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
1447

Yes, it can't be a constant expression here, but it can be in general, thus the OBO API does not support setting nowrap flags. (This is why I was originally using BinaryOperator here.)

Or do you mean that this code should be using cast<> rather than dyn_cast<> for Instruction? I can change that.

nikic updated this revision to Diff 319004.Jan 25 2021, 7:21 AM

Use cast<> instead of dyn_cast<>.

fhahn added inline comments.Jan 26 2021, 4:14 AM
llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
1447

Or do you mean that this code should be using cast<> rather than dyn_cast<> for Instruction? I can change that.

Yep, thanks!