This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Fix worklist management when simplifying demanded bits (PR44541)
ClosedPublic

Authored by nikic on Jan 17 2020, 12:06 PM.

Details

Summary

When simplifying demanded bits, we currently only report the instruction on which SimplifyDemandedBits was called as changed. However, this is a recursive call, and the actually modified instruction will usually be further up the chain. Additionally, all the intermediate instructions should also be revisited, as additional combines may be possible after the demanded bits simplification. We fix this by explicitly adding them to the worklist.

I originally wrote this patch to address the excessive number of instcombine iterations in an existing test (drops from 5 to 3, which is still not optimal), but found that this also addresses https://bugs.llvm.org/show_bug.cgi?id=44541, though I'm not sure whether this is really a "fix". What happens there is that demanded bits simplifies

%zero = call i16 @passthru(i16 0)
%sub = sub nuw nsw i16 %arg, %zero
%cmp = icmp slt i16 %sub, 0
%ret = select i1 %cmp, i16 0, i16 %sub

to

%zero = call i16 @passthru(i16 0)
%sub = sub nuw nsw i16 %arg, 0
%cmp = icmp slt i16 %sub, 0
%ret = select i1 %cmp, i16 0, i16 %sub

without adding the sub to the worklist (which this patch fixes). Then %cmp = icmp slt i16 %sub, 0 sensibly becomes %cmp = icmp slt i16 %arg, 0. The select is then recognized as a minmax SPF and canonicalized to

%zero = call i16 @passthru(i16 0)
%sub = sub nuw nsw i16 %arg, 0
%cmp = icmp sgt i16 0, %sub
%ret = select i1 %cmp, i16 0, i16 %sub

At which point we enter an infinite combine loop. Adding the %sub to the worklist is sufficient to get it simplified and avoid the issue. I don't know if that's considered enough, or whether there is also a bug in SPF matching or canonicalization here, in the sense that it needs to special-case this type of pattern, on the off-chance that it could appear through another pathway.

Diff Detail

Event Timeline

nikic created this revision.Jan 17 2020, 12:06 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 17 2020, 12:06 PM
nikic marked an inline comment as done.Jan 17 2020, 12:08 PM
nikic added inline comments.
llvm/test/Transforms/InstCombine/logical-select.ll
553 ↗(On Diff #238842)

These changes are caused by https://bugs.llvm.org/show_bug.cgi?id=44521. It's one of two common spurious differences that arise when worklist order changes are made.

spatel accepted this revision.Jan 20 2020, 5:15 AM

This change LGTM.

To fix the infinite loop bug without this change, we could add a clause in canonicalizeMinMaxWithConstant() to avoid creating this:
%cmp = icmp sgt i16 0, %sub

Ie, if the LHS returned by matchSelectPattern() is a constant, swap LHS and RHS because we know we can't create a canonical compare that way. An alternative would be to do that swap within matchSelectPattern() itself. We already have a different LHS/RHS constraint for ABS/NABS on that analysis.

I'm not sure how we can expose that bug with this fix in place though, so might want to make that fix first to be safer?

This revision is now accepted and ready to land.Jan 20 2020, 5:15 AM
nikic added a comment.EditedJan 20 2020, 5:27 AM

To fix the infinite loop bug without this change, we could add a clause in canonicalizeMinMaxWithConstant() to avoid creating this:
%cmp = icmp sgt i16 0, %sub

Ie, if the LHS returned by matchSelectPattern() is a constant, swap LHS and RHS because we know we can't create a canonical compare that way. An alternative would be to do that swap within matchSelectPattern() itself. We already have a different LHS/RHS constraint for ABS/NABS on that analysis.

I'm not sure how we can expose that bug with this fix in place though, so might want to make that fix first to be safer?

I don't think this would help. The problem here is not that the icmp operand order is non-canonical (though it would of course be better if we directly created it in canonical order), but that the (SPF) canonical form is based on %sub rather than %arg. This is because SPF has special support for sub-based min/max in https://github.com/llvm-mirror/llvm/blob/2c4ca6832fa6b306ee6a7010bfb80a3f2596f824/lib/Analysis/ValueTracking.cpp#L4687-L4699. I think to avoid this issue interdependently of this patch, we'd have to special case that code to not match the degenerate case where the sub has a zero on the RHS.

nikic planned changes to this revision.Feb 2 2020, 4:45 AM

This should be using AddDeferred now. But if we do that, we run into https://bugs.llvm.org/show_bug.cgi?id=44754. Additionally, we don't get the desired number of iterations anymore until D73803 and followup work will shake out (a dangling "not" is not getting DCEd and blocks transforms)...

nikic updated this revision to Diff 244760.Feb 14 2020, 1:43 PM

Rebase over D74294.

This revision is now accepted and ready to land.Feb 14 2020, 1:43 PM
This revision was automatically updated to reflect the committed changes.