This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Add med3 intrinsics
AbandonedPublic

Authored by arsenm on Jan 27 2016, 4:21 PM.

Details

Reviewers
tstellarAMD

Diff Detail

Event Timeline

arsenm updated this revision to Diff 46197. Jan 27 2016, 4:21 PM
arsenm retitled this revision from to AMDGPU: Add med3 intrinsics.
arsenm updated this object.
arsenm added a reviewer: tstellarAMD.
arsenm added a subscriber: llvm-commits.

Why do we need intrinsics for these?

For the integer ones, we should always be able to get away with the pattern, even if it is sort of big, something like
max(min(x, y), min(max(x, y), z))
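
As a quick sanity check of the pattern above, here is a minimal sketch in plain C++ (not LLVM code; the name med3_pattern is made up for illustration). It only shows that the min/max expression does select the median for integer inputs, including when two inputs are equal.

```cpp
#include <algorithm>

// Median of three integers using only min/max, following the
// pattern quoted in the review: max(min(x, y), min(max(x, y), z)).
// med3_pattern is a hypothetical helper name, not an LLVM API.
constexpr int med3_pattern(int x, int y, int z) {
  return std::max(std::min(x, y), std::min(std::max(x, y), z));
}

// Compile-time checks: every ordering of {1, 2, 3} yields 2.
static_assert(med3_pattern(1, 2, 3) == 2, "order 1,2,3");
static_assert(med3_pattern(1, 3, 2) == 2, "order 1,3,2");
static_assert(med3_pattern(2, 1, 3) == 2, "order 2,1,3");
static_assert(med3_pattern(2, 3, 1) == 2, "order 2,3,1");
static_assert(med3_pattern(3, 1, 2) == 2, "order 3,1,2");
static_assert(med3_pattern(3, 2, 1) == 2, "order 3,2,1");

// Repeated values still give the middle element.
static_assert(med3_pattern(5, 5, 1) == 5, "two equal high values");
static_assert(med3_pattern(-4, 7, -4) == -4, "two equal low values");
```

Because integers are totally ordered, the result does not depend on which operand holds which value, which is what makes matching this expression shape safe for the integer case.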

For the fmed3 case, in case we ever care about signaling NaNs, we would have to be more conservative on the pattern. It would probably be better to try to implement the pattern for the integer ones and leave the FP one.
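
To illustrate why the floating-point case needs more care, here is a rough sketch that reuses the same expression shape with std::fmin/std::fmax. It assumes IEC 60559 (IEEE 754) semantics for those functions, a quiet NaN input, and no fast-math flags. The name fmed3_pattern is made up for illustration, and nothing here describes what the hardware instruction or any LLVM pattern actually does.

```cpp
#include <cmath>
#include <limits>

// Same expression shape as the integer pattern, but built from
// std::fmin/std::fmax, which return the non-NaN operand when exactly
// one operand is a quiet NaN (assuming IEC 60559 behavior).
// fmed3_pattern is a hypothetical helper name, not an LLVM API.
float fmed3_pattern(float x, float y, float z) {
  return std::fmax(std::fmin(x, y), std::fmin(std::fmax(x, y), z));
}

// Helper so callers do not have to spell out the numeric_limits call.
float quiet_nan() {
  return std::numeric_limits<float>::quiet_NaN();
}
```

Under those assumptions, fmed3_pattern(quiet_nan(), 1.0f, 2.0f) evaluates to 1.0f while fmed3_pattern(1.0f, 2.0f, quiet_nan()) evaluates to 2.0f, so the result depends on which operand carries the NaN. Signaling NaNs add further constraints on top of this, which this sketch does not attempt to model.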

Didn't you add the integer patterns in another patch?

arsenm abandoned this revision. Aug 12 2016, 11:11 AM