AMDGPU : Replace FMAD with FMA when denormals are enabled.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
You should not be attempting to lower fmad. It is not the same as FMA. You should be creating a new FMAD_FTZ node for use in the one specific case you are trying to fix, which will only be used if denormals are disabled.
lib/Target/AMDGPU/AMDGPUISelLowering.cpp | ||
---|---|---|
1296–1300 | You only need to select between the opcode. You don't need 2 getNode calls | |
lib/Target/AMDGPU/AMDGPUInstrInfo.td | ||
198 | Extra newline | |
lib/Target/AMDGPU/SIISelLowering.cpp | ||
365 ↗ | (On Diff #88572) | Not sure why this appears in the diff |
lib/Target/AMDGPU/SIInstructions.td | ||
500–505 | You should be using the exact fmad pattern above. You should refactor the class to take the input node | |
test/CodeGen/AMDGPU/udiv.ll | ||
7–8 | You should only need one new run line, since the default is already tested. You should change the existing VI runline to explicitly set off. The r600 run line should also be last | |
193–195 | You don't need both functions since you are using the global -mattr to do this. These also should have the same result with and without denormals | |
195 | You should minimize this list |
As complex pattern causes pattern cannot be selected. So no complex pattern used for FMAD_FTZ pattern definition. Also the generated ISAs is different as fneg is being folded, so we have v_mac_f32 and v_mad_f32 respectively when denormals are enabled and disabled.
lib/Target/AMDGPU/AMDGPUISelLowering.cpp | ||
---|---|---|
1297 | The casts are unnecessary | |
lib/Target/AMDGPU/SIInstructions.td | ||
505 | This won't fold the neg source modifier, so will still make code worse. There should be a 3-operand class with source modifiers you should use rather than calling it FMAC. This also can be a class, a multi class isn't necessary | |
test/CodeGen/AMDGPU/udiv.ll | ||
166 | Should check the neg source modifier is used |
lib/Target/AMDGPU/AMDGPUISelLowering.cpp | ||
---|---|---|
1297 | If cast is removed, there will be a "warning: enumeral mismatch in conditional expression: ‘llvm::AMDGPUISD::NodeType’ vs ‘llvm::ISD::NodeType’", shall we still remove it? |
lib/Target/AMDGPU/SIInstructions.td | ||
---|---|---|
511 | For now you might want to just stick emitting v_mad_f32 to not deviate from the denormal disabled output. Selecting v_mac_ with src0/src1 modifiers should be a separate patch |
lib/Target/AMDGPU/AMDGPUISelLowering.h | ||
---|---|---|
269 | This is the opposite. It is for emitting mad when f32 denormals are enabled because mac/mad always flush |
Needs a comment explaining what it is for