This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] generalize fold for mask-with-signbit-splat
ClosedPublic

Authored by spatel on Oct 8 2021, 7:44 AM.

Details

Summary

(iN X s>> (N-1)) & Y --> (X < 0) ? Y : 0

https://alive2.llvm.org/ce/z/qeYhdz

I was looking at a missing abs() transform and found my way to this generalization of an existing fold that was added with D67799. As discussed in that review, we want to make sure codegen handles this difference well, and for all of the targets/types that I spot-checked, it looks good.

There's a difference on an existing test that shows a potentially unnecessary use limit on an icmp fold.

That fold is in InstCombinerImpl::foldICmpSubConstant(), and IIRC there was some back-and-forth on it and similar folds because it could cause analysis/passes (SCEV, LSR?) to miss optimizations.

So I think this patch makes things more consistent. But if the more specific transform was here as a work-around, we could leave it in as a special-case before we do the more general fold (...but that would be pretty hacky).

Diff Detail

Event Timeline

spatel created this revision.Oct 8 2021, 7:44 AM
spatel requested review of this revision.Oct 8 2021, 7:44 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 8 2021, 7:44 AM
spatel added inline comments.Oct 8 2021, 7:46 AM
llvm/test/Transforms/InstCombine/sub-ashr-and-to-icmp-select.ll
134

This is the missing icmp fold - it checks hasOneUse(), but that is not strictly necessary.

lebedev.ri accepted this revision.Oct 8 2021, 7:50 AM

Hm, looks fine to me.
I guess there's also the 'commutative' variant: https://alive2.llvm.org/ce/z/t8iHTy

llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
2067
This revision is now accepted and ready to land.Oct 8 2021, 7:50 AM
spatel marked an inline comment as done.Oct 8 2021, 7:55 AM

Hm, looks fine to me.
I guess there's also the 'commutative' variant: https://alive2.llvm.org/ce/z/t8iHTy

Thanks! I'll create some tests and make that a follow-up.
I'll leave this up for a bit in case there are other comments about the existing fold.

RKSimon accepted this revision.Oct 8 2021, 10:06 AM

LGTM

So I think this patch makes things more consistent. But if the more specific transform was here as a work-around, we could leave it in as a special-case before we do the more general fold (...but that would be pretty hacky).

Maybe remove the current fold as a followup commit to make bisection a little easier if somebody was depending upon it?

LGTM

So I think this patch makes things more consistent. But if the more specific transform was here as a work-around, we could leave it in as a special-case before we do the more general fold (...but that would be pretty hacky).

Maybe remove the current fold as a followup commit to make bisection a little easier if somebody was depending upon it?

Yes - that will make it clearer. Also, I found the existing 'or' sibling to this was added with D67800, so I'll do the same sequence with that one.

The sibling 'or' pattern to this 'and' patch uncovered many potential regressions, so I have been fixing those and referencing back to this review. I think I'm down to the last 1 or 2 regressions on that one. I'll go ahead and push this in 2 parts and see if that uncovers any regressions that are not immediately visible.

This revision was landed with ongoing or failed builds.Oct 15 2021, 1:31 PM
This revision was automatically updated to reflect the committed changes.