This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Expose __builtin_amdgcn_perm for v_perm_b32
ClosedPublic

Authored by rampitec on May 6 2021, 1:36 PM.

Event Timeline

rampitec created this revision. May 6 2021, 1:36 PM
rampitec requested review of this revision. May 6 2021, 1:36 PM
Herald added a project: Restricted Project. May 6 2021, 1:36 PM
Herald added a subscriber: wdng.
rampitec updated this revision to Diff 343496. May 6 2021, 1:57 PM

Fixed lowering.

arsenm added inline comments. May 6 2021, 3:24 PM
llvm/lib/Target/AMDGPU/VOP3Instructions.td
424–427 (On Diff #343496)

Can you avoid the second pattern by handling the two cases with a PatFrags as is done for other intrinsics? e.g.

def AMDGPUcos : PatFrags<(ops node:$src), [(int_amdgcn_cos node:$src),
                                           (AMDGPUcos_impl node:$src)]>;

b-sumner added inline comments. May 6 2021, 3:34 PM
llvm/include/llvm/IR/IntrinsicsAMDGPU.td
1719

This comment is incorrect

rampitec updated this revision to Diff 343521. May 6 2021, 4:03 PM
rampitec marked 2 inline comments as done.
arsenm accepted this revision. May 6 2021, 4:04 PM

LGTM

llvm/test/CodeGen/AMDGPU/llvm.amdgcn.perm.ll
14

Usually I add a few permutations with SGPRs and constants to make sure the constant bus restriction is properly respected with operand folding.
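
(Illustration, not part of the original comment or of the committed test.) The kind of operand permutations suggested above might be sketched as follows. The function names, the selector constant, and the llc invocation in the comment are assumptions chosen for this example. The sketch uses the opaque `ptr` syntax of current LLVM, whereas LLVM releases from the time of this review would need typed pointers such as `i32 addrspace(1)*`. No CHECK lines are given because the expected assembly is not reproduced here.

```llvm
; Illustrative sketch only; these are not the tests from this revision.
; Assumed invocation: llc -mtriple=amdgcn -mcpu=gfx900 < perm-sketch.ll
; In an amdgpu_ps function, 'inreg' arguments are passed in SGPRs and the
; remaining arguments in VGPRs, which makes it easy to vary operand kinds.

declare i32 @llvm.amdgcn.perm(i32, i32, i32)

; Baseline: all three operands come from VGPRs.
define amdgpu_ps void @perm_vvv(i32 %src0, i32 %src1, i32 %sel, ptr addrspace(1) %out) {
  %val = call i32 @llvm.amdgcn.perm(i32 %src0, i32 %src1, i32 %sel)
  store i32 %val, ptr addrspace(1) %out
  ret void
}

; Two SGPR sources and a VGPR selector: exercises the case where more than
; one operand would otherwise have to be read through the constant bus.
define amdgpu_ps void @perm_ssv(i32 inreg %src0, i32 inreg %src1, i32 %sel, ptr addrspace(1) %out) {
  %val = call i32 @llvm.amdgcn.perm(i32 %src0, i32 %src1, i32 %sel)
  store i32 %val, ptr addrspace(1) %out
  ret void
}

; One SGPR source combined with a constant selector. The value 67438087
; (0x04050607) is an arbitrary constant picked for the example.
define amdgpu_ps void @perm_vsc(i32 %src0, i32 inreg %src1, ptr addrspace(1) %out) {
  %val = call i32 @llvm.amdgcn.perm(i32 %src0, i32 %src1, i32 67438087)
  store i32 %val, ptr addrspace(1) %out
  ret void
}

; Constant in a source position with an SGPR selector.
define amdgpu_ps void @perm_cvs(i32 %src1, i32 inreg %sel, ptr addrspace(1) %out) {
  %val = call i32 @llvm.amdgcn.perm(i32 67438087, i32 %src1, i32 %sel)
  store i32 %val, ptr addrspace(1) %out
  ret void
}
```

Running such a file with -verify-machineinstrs is one way to have the machine verifier flag an instruction that ends up with more constant bus operands than the target allows after folding.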

This revision is now accepted and ready to land. May 6 2021, 4:04 PM
rampitec updated this revision to Diff 343527. May 6 2021, 4:24 PM
rampitec marked an inline comment as done.

Added more tests with different argument permutations.

This revision was landed with ongoing or failed builds. May 6 2021, 4:25 PM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. May 6 2021, 4:25 PM
Herald added a subscriber: cfe-commits.