This patch adds support for combining patterns such as (FMUL(FADD(1.0, x), y)) and (FMUL(FSUB(x, 1.0), y)) to their FMA equivalents.
This is useful in particular for linear interpolation cases such as (FADD(FMUL(x, t), FMUL(y, FSUB(1.0, t))))
Paths
| Differential D13003
[DAGCombiner] Improve FMA support for interpolation patterns ClosedPublic Authored by RKSimon on Sep 20 2015, 11:00 AM.
Details Summary This patch adds support for combining patterns such as (FMUL(FADD(1.0, x), y)) and (FMUL(FSUB(x, 1.0), y)) to their FMA equivalents. This is useful in particular for linear interpolation cases such as (FADD(FMUL(x, t), FMUL(y, FSUB(1.0, t))))
Diff Detail
Event TimelineComment Actions This mostly LGTM. There aren't any tests stressing the FMAD path. AMDGPU seems to be only target using it still, and the one test change is in the expansion of an intrinsic which should be removed. If you can add some of those that would be good, otherwise I can try to do it after you commit
Comment Actions
Thanks Matt, I can add some FMAD tests for v_mad_f32 - is that the only instruction I should be testing for? Most of your comments about the preamble are just as relevant for the other FMA pattern combines (visitFADDForFMACombine, visitFSUBForFMACombine); given that I copied+pasted most of it from them should they be updated as well? Comment Actions
Yes. The fneg should be folded in as a source modifier that looks something like v_mad_f32 v0, v1, v2, -v3. Sometimes v_mac_f32 is used, although in these cases that shouldn't happen
Yes, probably RKSimon edited edge metadata. Comment ActionsUpdated all FMA combine helpers based on Matt's feedback. Added AMDGPU FMA/FMAD tests arsenm edited edge metadata. Comment ActionsLGTM, although I think you should split the renames in the other parts into a separate patch This revision is now accepted and ready to land.Sep 21 2015, 9:45 AM Closed by commit rL248210: [DAGCombiner] Improve FMA support for interpolation patterns (authored by RKSimon). · Explain WhySep 21 2015, 1:34 PM This revision was automatically updated to reflect the committed changes.
Revision Contents
Diff 35206 lib/CodeGen/SelectionDAG/DAGCombiner.cpp
test/CodeGen/AMDGPU/llvm.amdgpu.lrp.ll
test/CodeGen/X86/fma_patterns.ll
|
A better name would be AllowFusion or something like that