We already do this for splat constants, but not for plain (non-constant) values.
Also, undef cases are mostly non-functional.
https://bugs.llvm.org/show_bug.cgi?id=37603
https://rise4fun.com/Alive/cplX
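As a rough illustration of why the fold in the title is sound, the sketch below brute-forces the identity on a narrow integer width. It is not taken from the patch: the helper names (`shl_lshr`, `and_mask`, `check_fold`) and the 8-bit default width are hypothetical, and it assumes `>>` denotes a logical (zero-filling) right shift and that only shift amounts smaller than the bit width are meaningful.

```python
def shl_lshr(x, y, bits=8):
    """Compute (x << y) >> y on a fixed-width unsigned integer."""
    all_ones = (1 << bits) - 1
    shifted_left = (x << y) & all_ones  # truncate to the modelled width
    return shifted_left >> y            # logical shift: zeros come in from the top


def and_mask(x, y, bits=8):
    """Compute x & (-1 >> y), where -1 is the all-ones value of the width."""
    all_ones = (1 << bits) - 1
    return x & (all_ones >> y)


def check_fold(bits=8):
    """Exhaustively compare both forms for every value and every in-range shift."""
    return all(
        shl_lshr(x, y, bits) == and_mask(x, y, bits)
        for x in range(1 << bits)
        for y in range(bits)  # shift amounts >= bits are not considered
    )


if __name__ == "__main__":
    print(check_fold(8))
```

Both forms clear the top `y` bits of `x` and leave the rest untouched, which is why they agree. The Alive link above remains the authoritative proof; this is only a sanity check on a small width.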
Differential D47980
[InstCombine] Fold (x << y) >> y -> x & (-1 >> y)
Authored by lebedev.ri on Jun 9 2018, 3:26 AM.
Event Timeline
Comment Actions
Thank you for the review!
Comment Actions
Ok, that didn't go as planned :)
Comment Actions
In general, for AMDGPU two shifts are better. Any shift immediate can be folded right into the shift instruction, while the rather big mask produced by this change would require either an extra 4 bytes in the encoding or, even worse, a move and a register. What's the rationale for the folding? In addition, as the tests suggest, we would expect the pattern to be folded into a bfe instruction, but D48005 shows it is at best "bfm" (with an extra register to hold the mask) and "and", i.e. it basically shows a regression for our target. There would probably be no concern if the sequence were converted to a bfe as expected.
Comment Actions
Thanks!
BTW it's interesting to note that all these masks are not fine-grained, isn't it?
Alive says https://rise4fun.com/Alive/Yes (lol)
Though in practice, from what I have seen in the tests, the mask somehow seems to be adjusted later.