This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Add llvm.amdgcn.fmad.ftz intrinsic
ClosedPublic

Authored by rampitec on Jun 25 2018, 3:26 PM.

Details

Summary

This intrinsic selects v_mad_f32 regardless of fp32 denorm support.

Diff Detail

Event Timeline

rampitec created this revision.Jun 25 2018, 3:26 PM
arsenm added inline comments.Jun 25 2018, 11:58 PM
include/llvm/IR/IntrinsicsAMDGPU.td
364

Does v_mad_f16 flush denormals? I don't think it does

rampitec added inline comments.Jun 26 2018, 12:07 AM
include/llvm/IR/IntrinsicsAMDGPU.td
364

It does. It cannot preserve denormals. It is even lowered into FMAD_FTZ, which is lowered to mad. The new part is this intrinsic sitting above the existing SDNode.

arsenm added inline comments.Jun 26 2018, 1:47 AM
include/llvm/IR/IntrinsicsAMDGPU.td
364

In that case I would make this intrinsic type mangled and make it work for f16 as well

rampitec added inline comments.Jun 26 2018, 7:19 AM
include/llvm/IR/IntrinsicsAMDGPU.td
364

AFAIR f16 does not flush.

rampitec added inline comments.Jun 26 2018, 7:51 AM
include/llvm/IR/IntrinsicsAMDGPU.td
364

Sorry for confusion: v_mad_f32 flushes. v_mad_f16 does not. That is why it is not overloaded.

arsenm accepted this revision.Jun 26 2018, 12:17 PM

LGTM. A test showing the source modifiers are matched wouldn't hurt

This revision is now accepted and ready to land.Jun 26 2018, 12:17 PM
rampitec updated this revision to Diff 152942.Jun 26 2018, 12:37 PM

Added tests with source modifiers.

This revision was automatically updated to reflect the committed changes.

Actually according to the selection code, f16 mad does not support denormals the same, so the intrinsic should work with f16 if that is correct

Actually according to the selection code, f16 mad does not support denormals the same, so the intrinsic should work with f16 if that is correct

https://reviews.llvm.org/D48677

artem.tamazov reopened this revision.Jul 5 2018, 10:17 AM
artem.tamazov added a subscriber: artem.tamazov.

Perhaps fixes needed.

include/llvm/IR/IntrinsicsAMDGPU.td
364

According to SCDevUtil/SCMathengine, V_MAD_F16 always flushes HP denormals. Please double-check.

This revision is now accepted and ready to land.Jul 5 2018, 10:17 AM
arsenm closed this revision.Jul 5 2018, 10:18 AM

This was extended to f16 in r335866