On SystemZ, shift and rotate instructions only use the bottom 6 bits of the shift/rotate amount. Therefore, if the amount is ANDed with an immediate mask that has all of the bottom 6 bits set, we can remove the AND operation entirely.
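To see why the AND is redundant, here is a minimal standalone sketch. It is plain C++, not LLVM code, and the function names are hypothetical, chosen only for illustration. It models a 64-bit shift that reads only the bottom 6 bits of its amount, then checks that a mask with all of those bits set leaves the result unchanged.

```cpp
#include <cassert>
#include <cstdint>

// Model of a 64-bit left shift that, like the SystemZ shift/rotate
// instructions described above, reads only the bottom 6 bits of the amount.
uint64_t shiftLeftLow6(uint64_t value, uint64_t amount) {
  return value << (amount & 63);
}

// True when ANDing the amount with `mask` cannot change the bits the
// shift actually reads, i.e. all of the bottom 6 bits of the mask are set.
bool maskIsRedundantForShift(uint64_t mask) {
  return (mask & 63) == 63;
}

// Compare masked and unmasked amounts over a small range of amounts.
bool sameResultForAllAmounts(uint64_t value, uint64_t mask) {
  for (uint64_t amount = 0; amount < 256; ++amount)
    if (shiftLeftLow6(value, amount & mask) != shiftLeftLow6(value, amount))
      return false;
  return true;
}
```

For instance, `sameResultForAllAmounts(1, 0xFF)` holds because 0xFF covers the bottom 6 bits, whereas a mask of 31 clears bit 5 of the amount and so must be kept.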
Details
- Reviewers
  - uweigand
- Commits
  - rG687691aeac10: Fix SystemZ compilation abort caused by negative AND mask
  - rGbc2cfc229121: [SystemZ] Remove AND mask of bottom 6 bits when result is used for shift/rotate
  - rL279105: Fix SystemZ compilation abort caused by negative AND mask
  - rL274650: [SystemZ] Remove AND mask of bottom 6 bits when result is used for shift/rotate
Event Timeline
I'm not sure that adding so many extra patterns in the .td file is the best way to implement this. Maybe it would be better to implement the transformation of the mask in PerformDAGCombine? This is done earlier, and might in theory expose more combination opportunities ...
Good point, thanks Ulrich. As suggested, I've moved the implementation to a more concise one in PerformDAGCombine.
Looks good in general. However, it seems you missed the recent refactoring to the PerformDAGCombine routine (rev. 274191), so your patch won't apply. Please update to current mainline (your code should now go into a subroutine like combineShift).
As a further enhancement, I'm wondering if it might be useful to tweak the ANDed constant even if the AND cannot be optimized away completely. I'm not sure that really makes a difference to real-world code, though.
Inline comment on lib/Target/SystemZ/SystemZISelLowering.cpp, line 5115:
In principle it ought to be possible to do that optimization even if there are multiple uses of the AND result (the AND would still stay there, just not used for the shift).
Ah, now it seems the CombineTo use is incorrect (this will replace *all* uses of the AND with its input value, which is incorrect if there is indeed another use). So I think we have to generate a new shift/rotate output node instead of using CombineTo here.
If the AND value has more than one use, generate a new shift/rotate node with the AND's first operand rather than removing the AND completely. Also add tests for this situation.
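The difference between replacing every use of the AND and building a new shift node can be sketched with a toy graph. This is a simplified model under stated assumptions: `Node`, `Dag`, and `bypassRedundantMask` are hypothetical names and are not the LLVM SelectionDAG API or the code from the patch. The sketch creates a new shift that reads the AND's first operand and leaves the AND in place for its other users.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy stand-in for a selection DAG; this is NOT the LLVM SelectionDAG API.
enum class Op { Input, AndImm, Shl, Add };

struct Node {
  Op op;
  int lhs;       // index of the first operand node, or -1
  int rhs;       // index of the second operand node, or -1
  uint64_t imm;  // immediate operand, used only by AndImm
};

struct Dag {
  std::vector<Node> nodes;

  // Append a node and return its index.
  int add(Node n) {
    nodes.push_back(n);
    return static_cast<int>(nodes.size()) - 1;
  }
};

// If the shift amount is an AND whose mask has the bottom 6 bits set,
// return the index of a new shift that reads the AND's first operand.
// The AND node itself is left untouched, so its other users are unaffected.
// Returns `shift` unchanged when the pattern does not match.
int bypassRedundantMask(Dag &dag, int shift) {
  assert(shift >= 0 && shift < static_cast<int>(dag.nodes.size()));
  const Node s = dag.nodes[shift];  // copy, since add() may reallocate
  if (s.op != Op::Shl || s.rhs < 0)
    return shift;
  const Node amt = dag.nodes[s.rhs];
  if (amt.op != Op::AndImm || (amt.imm & 63) != 63)
    return shift;
  return dag.add({Op::Shl, s.lhs, amt.lhs, 0});
}
```

In this model, an `Add` node that also consumes the AND keeps its operand after the rewrite, which is the property a replace-all-uses approach would break.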