This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Try to recognize bswap pattern when calling funnel shifts
ClosedPublic

Authored by junaire on Mar 22 2023, 9:00 AM.

Diff Detail

Event Timeline

junaire created this revision.Mar 22 2023, 9:00 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 22 2023, 9:00 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
junaire requested review of this revision.Mar 22 2023, 9:00 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 22 2023, 9:00 AM
RKSimon added inline comments.Mar 22 2023, 9:52 AM
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
1780

Why not use test matchBSwapOrBitReverse here?

nikic requested changes to this revision.Mar 22 2023, 10:38 AM

As @RKSimon said, this transform is in the wrong place. It has nothing to do with shuffled intrinsic operands...

This revision now requires changes to proceed.Mar 22 2023, 10:38 AM
junaire updated this revision to Diff 507606.Mar 22 2023, 10:14 PM

Put the transform in the right place.

junaire retitled this revision from [InstCombine] Try to recognize bswap pattern when calling fshl to [InstCombine] Try to recognize bswap pattern when calling funnel shifts.Mar 22 2023, 10:14 PM
junaire edited the summary of this revision. (Show Details)
junaire updated this revision to Diff 507635.Mar 23 2023, 1:06 AM

Update existing tests

nikic added inline comments.Mar 23 2023, 1:21 AM
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
1797–1801

This code block can probably be dropped now, if it's already handled by the preceding?

junaire added inline comments.Mar 23 2023, 1:23 AM
llvm/test/Transforms/InstCombine/fsh.ll
677

Looks like this transform somehow breaks the existing optimization? Any thoughts about how to fix it? My idea is to add another default parameter called bool ForceTrunc = true and ask matchBSwapOrBitReverse not to truncate in this case. However, I don't know if this is appropriate and please let me know if you have better solutions...

nikic added inline comments.Mar 23 2023, 1:24 AM
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
1797–1801

Or you can move your code after the existing one, which will avoid the regression in fshl_mask_args_same2. It shouldn't be necessary to handle this specially, but clearly we're missing a zext(shl(trunc)) fold.

junaire added inline comments.Mar 23 2023, 1:57 AM
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
1797–1801

No that doesn't work since the existing transform is highly depend on the bit-width (only valid if it's 16 but in the test that's i32)

Or maybe we can leave it as is and add a zext(shl(trunc)) fold in the future?

nikic accepted this revision.Mar 23 2023, 4:05 AM

LGTM

llvm/test/Transforms/InstCombine/fsh.ll
677

We should just add support for folding this trunc+shl+zext pattern. I've filed https://github.com/llvm/llvm-project/issues/61650 to track.

This revision is now accepted and ready to land.Mar 23 2023, 4:05 AM
This revision was landed with ongoing or failed builds.Mar 23 2023, 7:51 PM
This revision was automatically updated to reflect the committed changes.