This is an archive of the discontinued LLVM Phabricator instance.

[DAG] Enable combineShiftOfShiftedLogic folds after type legalization
ClosedPublic

Authored by RKSimon on Oct 16 2022, 10:56 AM.

Details

Summary

This was disabled to prevent regressions, which appear to be just occurring on AMDGPU (at least in our current lit tests).

Fixes #57872

Diff Detail

Event Timeline

RKSimon created this revision.Oct 16 2022, 10:56 AM
RKSimon requested review of this revision.Oct 16 2022, 10:56 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 16 2022, 10:56 AM

The SystemZ change is a clear improvement, so that part LGTM. Thanks!

Thanks. The change looks good to me from BPF perspective as it fixed the regression in https://github.com/llvm/llvm-project/issues/57872.

@foad What do you think about the AMDGPU diffs? Do you want me to focus on getting D136081 done first?

foad added a comment.Oct 18 2022, 6:18 AM

@foad What do you think about the AMDGPU diffs? Do you want me to focus on getting D136081 done first?

The diffs look OK to me, modulo one pretty minor regression noted inline.

As for the general approach of putting the smarts in AMDGPUTargetLowering::isDesirableToCommuteWithShift: does AMDGPU have special code to match or(shl(load_zext(),c), load_zext()) and merge it into a wider load? If so, could it be improved to match the case where both loads are shifted? On the other hand, since you've already coded and tested this approach, I think it is fine.

llvm/test/CodeGen/AMDGPU/sdiv.ll
1652 ↗(On Diff #468086)

Seems like we are losing a couple of bfe formations here, and D136081 does not fix it.

RKSimon updated this revision to Diff 468896.Oct 19 2022, 6:49 AM
RKSimon edited the summary of this revision. (Show Details)

Extend AMDGPUTargetLowering::isDesirableToCommuteWithShift to avoid losing SR*(SHL(X,C1),C2) patterns that can fold to BFE instructions

@foad I gave up on D136081 and instead just tried harder not to lose patterns that the existing BFE match code can handle.

RKSimon updated this revision to Diff 468922.Oct 19 2022, 8:14 AM

minor cleanup (remove unnecessary check for ISD::SHL)

ping? I think everything is covered now

yonghong-song accepted this revision.Oct 27 2022, 10:48 PM
This revision is now accepted and ready to land.Oct 27 2022, 10:48 PM
This revision was landed with ongoing or failed builds.Oct 29 2022, 4:30 AM
This revision was automatically updated to reflect the committed changes.
foad added a comment.Oct 31 2022, 2:31 AM

AMDGPU parts look fine, thanks, and sorry for the late review.