This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU : Replace FMAD with FMA when denormals are enabled.
ClosedPublic

Authored by wdng on Feb 14 2017, 12:15 PM.

Diff Detail

Repository
rL LLVM

Event Timeline

wdng created this revision.Feb 14 2017, 12:15 PM
arsenm requested changes to this revision.Feb 14 2017, 12:16 PM
arsenm added a subscriber: llvm-commits.

You should not be attempting to lower fmad. It is not the same as FMA. You should be creating a new FMAD_FTZ node for use in the one specific case you are trying to fix, which will only be used if denormals are disabled.

This revision now requires changes to proceed.Feb 14 2017, 12:17 PM
wdng updated this revision to Diff 88572.Feb 15 2017, 10:12 AM
wdng edited edge metadata.

Address code reviews.

b-sumner edited edge metadata.Feb 15 2017, 10:34 AM
This comment was removed by b-sumner.
arsenm added inline comments.Feb 15 2017, 11:43 AM
lib/Target/AMDGPU/AMDGPUISelLowering.cpp
1296–1300

You only need to select between the opcode. You don't need 2 getNode calls

lib/Target/AMDGPU/AMDGPUInstrInfo.td
198

Extra newline

lib/Target/AMDGPU/SIISelLowering.cpp
365 ↗(On Diff #88572)

Not sure why this appears in the diff

lib/Target/AMDGPU/SIInstructions.td
500–505

You should be using the exact fmad pattern above. You should refactor the class to take the input node

test/CodeGen/AMDGPU/udiv.ll
8–9

You should only need one new run line, since the default is already tested. You should change the existing VI runline to explicitly set off. The r600 run line should also be last

194–196

You don't need both functions since you are using the global -mattr to do this. These also should have the same result with and without denormals

196

You should minimize this list

wdng updated this revision to Diff 88857.Feb 16 2017, 10:21 PM
wdng marked 4 inline comments as done.

As complex pattern causes pattern cannot be selected. So no complex pattern used for FMAD_FTZ pattern definition. Also the generated ISAs is different as fneg is being folded, so we have v_mac_f32 and v_mad_f32 respectively when denormals are enabled and disabled.

v_mac_f32 always flushes subnormal inputs just like v_mad_f32

wdng updated this revision to Diff 88908.Feb 17 2017, 9:27 AM

Change V_MAC_f32 to V_MAD_F32 to emit the same result with and without denormals.

arsenm added inline comments.Feb 17 2017, 4:42 PM
lib/Target/AMDGPU/AMDGPUISelLowering.cpp
1297

The casts are unnecessary

lib/Target/AMDGPU/SIInstructions.td
505

This won't fold the neg source modifier, so will still make code worse. There should be a 3-operand class with source modifiers you should use rather than calling it FMAC. This also can be a class, a multi class isn't necessary

test/CodeGen/AMDGPU/udiv.ll
167

Should check the neg source modifier is used

wdng marked an inline comment as done.Feb 21 2017, 3:57 PM
wdng added inline comments.
lib/Target/AMDGPU/AMDGPUISelLowering.cpp
1297

If cast is removed, there will be a "warning: enumeral mismatch in conditional expression: ‘llvm::AMDGPUISD::NodeType’ vs ‘llvm::ISD::NodeType’", shall we still remove it?

wdng updated this revision to Diff 89399.Feb 22 2017, 12:12 PM
wdng marked an inline comment as done.

Address code reviews.

arsenm added inline comments.Feb 22 2017, 1:43 PM
lib/Target/AMDGPU/SIInstructions.td
511

For now you might want to just stick emitting v_mad_f32 to not deviate from the denormal disabled output. Selecting v_mac_ with src0/src1 modifiers should be a separate patch

wdng updated this revision to Diff 89526.Feb 23 2017, 9:55 AM

Address code review feedback.

arsenm added inline comments.Feb 23 2017, 12:54 PM
lib/Target/AMDGPU/AMDGPUISelLowering.h
269

Needs a comment explaining what it is for

lib/Target/AMDGPU/SIInstructions.td
505

Extra space after VOP3Mods

wdng updated this revision to Diff 89553.Feb 23 2017, 1:05 PM
wdng marked 2 inline comments as done.

Address code review feedback.

arsenm added inline comments.Feb 23 2017, 2:53 PM
lib/Target/AMDGPU/AMDGPUISelLowering.h
269

This is the opposite. It is for emitting mad when f32 denormals are enabled because mac/mad always flush

wdng updated this revision to Diff 89682.Feb 24 2017, 9:23 AM

Update comments.

arsenm added inline comments.Feb 24 2017, 10:38 AM
lib/Target/AMDGPU/AMDGPUISelLowering.h
269

should refer to it as ISD::FMAD here

270

It's an illegal operation, not an illegal type

wdng updated this revision to Diff 89700.Feb 24 2017, 11:07 AM
wdng marked 2 inline comments as done.

Updated, thanks!

arsenm accepted this revision.Feb 24 2017, 1:24 PM

LGTM

This revision is now accepted and ready to land.Feb 24 2017, 1:24 PM
This revision was automatically updated to reflect the committed changes.