This is an archive of the discontinued LLVM Phabricator instance.

[DAGCombiner] Improve FMA support for interpolation patterns
ClosedPublic

Authored by RKSimon on Sep 20 2015, 11:00 AM.

Details

Summary

This patch adds support for combining patterns such as (FMUL(FADD(1.0, x), y)) and (FMUL(FSUB(x, 1.0), y)) to their FMA equivalents.

This is useful in particular for linear interpolation cases such as (FADD(FMUL(x, t), FMUL(y, FSUB(1.0, t))))

Diff Detail

Repository
rL LLVM

Event Timeline

RKSimon updated this revision to Diff 35206.Sep 20 2015, 11:00 AM
RKSimon retitled this revision from to [DAGCombiner] Improve FMA support for interpolation patterns.
RKSimon updated this object.
RKSimon added reviewers: hfinkel, arsenm, spatel, delena.
RKSimon set the repository for this revision to rL LLVM.
RKSimon added a subscriber: llvm-commits.
arsenm edited edge metadata.Sep 20 2015, 11:28 AM

This mostly LGTM.

There aren't any tests stressing the FMAD path. AMDGPU seems to be only target using it still, and the one test change is in the expansion of an intrinsic which should be removed. If you can add some of those that would be good, otherwise I can try to do it after you commit

lib/CodeGen/SelectionDAG/DAGCombiner.cpp
7937 ↗(On Diff #35206)

A better name would be AllowFusion or something like that

7944–7946 ↗(On Diff #35206)

I think the AllowFusion/UnsafeFPMath check should be first

7953 ↗(On Diff #35206)

Usually the int is omitted

7954 ↗(On Diff #35206)

It seems wrong to use this in the FMAD case, although AMDGPU happens to not care because enableAggressiveFMAFusion always reports true and it seems to be what is used already.

There aren't any tests stressing the FMAD path. AMDGPU seems to be only target using it still, and the one test change is in the expansion of an intrinsic which should be removed. If you can add some of those that would be good, otherwise I can try to do it after you commit

Thanks Matt, I can add some FMAD tests for v_mad_f32 - is that the only instruction I should be testing for?

Most of your comments about the preamble are just as relevant for the other FMA pattern combines (visitFADDForFMACombine, visitFSUBForFMACombine); given that I copied+pasted most of it from them should they be updated as well?

There aren't any tests stressing the FMAD path. AMDGPU seems to be only target using it still, and the one test change is in the expansion of an intrinsic which should be removed. If you can add some of those that would be good, otherwise I can try to do it after you commit

Thanks Matt, I can add some FMAD tests for v_mad_f32 - is that the only instruction I should be testing for?

Yes. The fneg should be folded in as a source modifier that looks something like v_mad_f32 v0, v1, v2, -v3. Sometimes v_mac_f32 is used, although in these cases that shouldn't happen

Most of your comments about the preamble are just as relevant for the other FMA pattern combines (visitFADDForFMACombine, visitFSUBForFMACombine); given that I copied+pasted most of it from them should they be updated as well?

Yes, probably

RKSimon updated this revision to Diff 35267.Sep 21 2015, 9:36 AM
RKSimon edited edge metadata.

Updated all FMA combine helpers based on Matt's feedback.

Added AMDGPU FMA/FMAD tests

arsenm accepted this revision.Sep 21 2015, 9:45 AM
arsenm edited edge metadata.

LGTM, although I think you should split the renames in the other parts into a separate patch

This revision is now accepted and ready to land.Sep 21 2015, 9:45 AM
This revision was automatically updated to reflect the committed changes.