We already do this fold for splat constants, but not for plain (non-constant) values.
Also, undef cases are mostly non-functional.
https://bugs.llvm.org/show_bug.cgi?id=37603
https://rise4fun.com/Alive/cplX
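As a quick sanity check of the fold, here is a minimal Python sketch that exhaustively compares both forms at a small bit width. It assumes `>>` is a logical shift right (lshr) and that the shift amount is below the bit width (larger amounts produce poison in LLVM IR). The 8-bit width and the helper names are illustrative choices, not part of the patch.

```python
# Assumption: model an 8-bit integer type; any fixed width behaves the same way.
WIDTH = 8
MASK = (1 << WIDTH) - 1  # the all-ones value, i.e. -1 at this width


def shl_then_lshr(x, y):
    """Original pattern: (x << y) >> y, truncated to WIDTH bits after the shl."""
    return ((x << y) & MASK) >> y


def and_with_mask(x, y):
    """Folded pattern: x & (-1 >> y), with -1 modelled as the all-ones MASK."""
    return x & (MASK >> y)


def fold_holds_everywhere():
    """Compare both forms for every value and every in-range shift amount."""
    return all(
        shl_then_lshr(x, y) == and_with_mask(x, y)
        for x in range(1 << WIDTH)
        for y in range(WIDTH)
    )


if __name__ == "__main__":
    print(fold_holds_everywhere())
```

The Alive link above is the actual proof; this only illustrates why the mask is `-1 >> y`: the left shift drops the top `y` bits, and the logical right shift moves the remaining bits back into place.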
Differential D47980
[InstCombine] Fold (x << y) >> y -> x & (-1 >> y)
Authored by lebedev.ri on Jun 9 2018, 3:26 AM.
Event Timeline
Ok, that didn't go as planned :)
In general, for AMDGPU two shifts are better. Any shift immediate can be folded right into the shift instruction, while the rather large mask produced by this change would require either an extra 4 bytes in the encoding or, even worse, a move and a register. What's the rationale for the folding? In addition, as the tests suggest, we would expect the pattern to be folded into a bfe instruction, but D48005 shows it is at best "bfm" (with an extra register to hold the mask) and "and", i.e. it basically shows a regression for our target. There would probably be no concern if the sequence were converted to a bfe as expected.
Thanks!
As with D47981, we could consolidate this, but the constant version doesn't need the one-use check.
I think it's fine either way, but let's keep both patches consistent in their structure.