This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Add intrinsics llvm.amdgcn.cvt.{pknorm.i16, pknorm.u16, pk.i16, pk.u16}
ClosedPublic

Authored by mareko on Jan 2 2018, 4:25 AM.

Event Timeline

mareko created this revision. Jan 2 2018, 4:25 AM
arsenm added inline comments. Jan 2 2018, 7:55 AM
include/llvm/IR/IntrinsicsAMDGPU.td
241–244

Since v2i16/v2f16 are legal on some subtargets, these intrinsics should probably use that return type directly. This will require a little more work in the custom lowering for the other subtargets.

Alternatively, aren't these all just pairs of convert (x / constant)? Can we just directly match that?

mareko added inline comments. Jan 2 2018, 8:43 AM
include/llvm/IR/IntrinsicsAMDGPU.td
241–244

Not sure how to add support for v2i16, but Mesa will never need v2i16 from these intrinsics.

The intrinsics are non-trivial. We are talking about 10 or so instructions when emulated.
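To give a sense of why the emulated form takes several instructions, here is a minimal C++ sketch of a signed normalized pack. It follows the GLSL packSnorm2x16 formula, round(clamp(v, -1, 1) * 32767), with the first operand in the low 16 bits. The helper names are hypothetical, and treating this as the exact behavior of the hardware instruction (rounding mode, NaN handling) is an assumption, not something stated in this review.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical helper: convert one float to a signed normalized 16-bit value
// using round(clamp(v, -1, 1) * 32767). NaN inputs are not handled here.
static int16_t floatToSnorm16(float v) {
    float clamped = std::min(std::max(v, -1.0f), 1.0f);
    return static_cast<int16_t>(std::lround(clamped * 32767.0f));
}

// Hypothetical helper: pack two converted values into one 32-bit word,
// first operand in the low half, second operand in the high half.
static uint32_t packSnorm2x16(float x, float y) {
    uint16_t lo = static_cast<uint16_t>(floatToSnorm16(x));
    uint16_t hi = static_cast<uint16_t>(floatToSnorm16(y));
    return static_cast<uint32_t>(lo) | (static_cast<uint32_t>(hi) << 16);
}
```

Each lane needs a clamp (two compares or a min/max pair), a multiply, and a rounding conversion, and the two lanes then need a mask, a shift, and an OR, which is roughly consistent with the "10 or so instructions" estimate.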

arsenm added inline comments. Jan 2 2018, 9:01 AM
include/llvm/IR/IntrinsicsAMDGPU.td
241–244

To add support, do the same thing that ReplaceNodeResults does for amdgcn_cvt_pkrtz. For targets without legal packed types, it just replaces the result type with i32 and casts back to the packed type.
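The idea of carrying a packed pair as a plain i32 and casting back can be illustrated outside LLVM with a small C++ sketch. This is only an analogy for the bitcast step, not the SelectionDAG code; the type alias and function names are hypothetical.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the v2i16 vector type.
using V2I16 = std::array<int16_t, 2>;

// Reinterpret the two 16-bit lanes as one 32-bit integer (a bitcast).
static uint32_t toI32(const V2I16 &v) {
    uint32_t bits = 0;
    std::memcpy(&bits, v.data(), sizeof(bits));
    return bits;
}

// Reinterpret a 32-bit integer as two 16-bit lanes (the cast back).
static V2I16 fromI32(uint32_t bits) {
    V2I16 v{};
    std::memcpy(v.data(), &bits, sizeof(bits));
    return v;
}
```

No bits change in either direction; only the type used to describe the 32-bit value differs, which is why the same result can be exposed as v2i16 even where the packed type is not legal.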

mareko updated this revision to Diff 128476. Jan 2 2018, 4:13 PM

Switched the return type to v2i16.

arsenm accepted this revision. Jan 3 2018, 10:23 AM

LGTM

This revision is now accepted and ready to land. Jan 3 2018, 10:23 AM
This revision was automatically updated to reflect the committed changes.