We can use OR instead of BLEND if either the element we are not picking is zero (or masked away);
or the element we are picking overwhelms (e.g. it's all-ones) whatever the element we are not picking:
https://alive2.llvm.org/ce/z/RKejao
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Why couldn't this be put in more generic ISD::OR combines somewhere? Possibly TargetLowering::SimplifyDemandedVectorElts?
The insert tests look like we're just failing to recognize that we're OR'ing all-ones elements, which should allow the SimplifyDemanded call to realise that we're going to completely overwrite those elements, so we don't need to mask them. Its most of the logic you've written but it feels like it should be somewhere more generic than inside x86 shuffle combining.
That's a shame - D109927 seems to be useful, but not useful enough for our regression :(
I was kind of hoping we could avoid this - adding so much more shuffle code for so few regressions - but it seems to be the most straightforward fix atm, and will unblock the patch dependency tree you currently have - please can you add a TODO comment suggesting we review this in the future?
The only other possibility might be to perform OR/AND more aggressively for INSERT_VECTOR_ELT lowering for -1/0 elements?
llvm/lib/Target/X86/X86ISelLowering.cpp | ||
---|---|---|
36240 | APInt Mask = APInt::getOneBitSet(NumV1Elts, V1EltIdx) ? |
@RKSimon thank you for taking a look!
Rebased, refactored.
I do think it would be nice to handle this in general demandedelts,
but i'm currently just not seeing how we can do that.
llvm/lib/Target/X86/X86ISelLowering.cpp | ||
---|---|---|
36235 | break the if-else chain? |
break the if-else chain?