This is an archive of the discontinued LLVM Phabricator instance.

[DAGCombiner] Initial support for the fast-math flag contract
ClosedPublic

Authored by anemet on Mar 20 2017, 8:13 PM.

Details

Summary

Now alternatively to the TargetOption.AllowFPOpFusion global flag, FMUL->FADD
can also use the per operation FMF to allow fusion.

The idea here is not to port everything to the new scheme (e.g. fused
multiply-and-sub will be ported later) but that this work all the way from
clang.

The transformation is conditionalized on *both* the FADD and the FMUL having
the FMF contract flag.

Diff Detail

Repository
rL LLVM

Event Timeline

anemet created this revision.Mar 20 2017, 8:13 PM
arsenm added a subscriber: arsenm.Mar 27 2017, 11:45 AM
arsenm added inline comments.
test/CodeGen/AArch64/neon-fma-FMF.ll
5–6 ↗(On Diff #92425)

Needs tests where only the fmul or fadd are marked contract

anemet updated this revision to Diff 93184.Mar 27 2017, 3:09 PM
anemet marked an inline comment as done.

Address Matt's comment

anemet updated this revision to Diff 93186.Mar 27 2017, 3:10 PM

Fix typo

Does this look OK now?

spatel added inline comments.Mar 29 2017, 3:18 PM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
8721–8725 ↗(On Diff #93186)

If the target has FMAD, then did we just bail out unnecessarily?

arsenm added inline comments.Mar 29 2017, 3:20 PM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
8721–8725 ↗(On Diff #93186)

Yes, the point of fmad is it is safe anyway

test/CodeGen/AArch64/neon-fma-FMF.ll
13 ↗(On Diff #93186)

Positive checks for the separate add and mul would be less fragile

spatel added inline comments.Mar 29 2017, 3:23 PM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
8721–8725 ↗(On Diff #93186)

Yikes...this means there are no regression tests for any of the folds in this function for an FMAD-capable target?

anemet added inline comments.Mar 29 2017, 3:35 PM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
8721–8725 ↗(On Diff #93186)

Good point. Let me do some digging.

anemet added inline comments.Mar 29 2017, 3:45 PM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
8721–8725 ↗(On Diff #93186)

Looks like there is: CodeGen/R600/mad-combine.ll. I just didn't help that target enabled :(. Thanks for catching this!

anemet updated this revision to Diff 93426.Mar 29 2017, 5:01 PM

Address review comments. Also test with all target enabled now. Sorry again
for the silly mistake.

spatel edited edge metadata.

See inline for one more possible test case. I have no other comments, but I'm cc'ing some people who have worked on this function recently for potential feedback.

lib/CodeGen/SelectionDAG/DAGCombiner.cpp
8753–8756 ↗(On Diff #93426)

I think this diff exposes a possibility that didn't exist before, so it's worth adding a test for it:

define float @cannot_choose_fma(float %f1, float %f2, float %f3, float %f4) {
  %mul1 = fmul contract float %f1, %f2
  %mul2 = fmul float %f3, %f4
  %add = fadd contract float %mul1, %mul2
  %second_use_of_mul1 = fdiv float %mul1, %add
  ret float %second_use_of_mul1
}

$ ./llc -o - fma.ll -mtriple=powerpc64-darwin 
  fmuls f0, f3, f4
  fmuls f13, f1, f2
  fmadds f0, f1, f2, f0
  fdivs f1, f13, f0
anemet marked 7 inline comments as done.Mar 30 2017, 9:42 AM
anemet added inline comments.
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
8753–8756 ↗(On Diff #93426)

Makes sense, I've added it. Thanks for all your feedback and sorry again about the testing mishap. I haven't worked on the backend for a while.

anemet updated this revision to Diff 93495.Mar 30 2017, 9:44 AM
anemet marked an inline comment as done.

Add a new test requested by Sanjay.

spatel accepted this revision.Mar 30 2017, 11:02 AM

LGTM - see a couple of small nits for the PPC test comment.

Also, you can use the script at utils/update_llc_test_checks.py to auto-generate exact checks for PPC (avoid CHECK-NOT whenever possible). Bonus points if you add aarch64 to the supported triples in that script and improve those checks too. :)

test/CodeGen/PowerPC/fma-aggr-FMF.ll
15 ↗(On Diff #93495)

"no use" -> "one use"

16 ↗(On Diff #93495)

"a use" -> "two uses"

This revision is now accepted and ready to land.Mar 30 2017, 11:02 AM
This revision was automatically updated to reflect the committed changes.
anemet marked an inline comment as done.

LGTM - see a couple of small nits for the PPC test comment.

Thanks!

Also, you can use the script at utils/update_llc_test_checks.py to auto-generate exact checks for PPC (avoid CHECK-NOT whenever possible).

Done. Please take a look at the committed version. The powerpc64le assembly looked correct to me but I don't work on PPC.

test/CodeGen/PowerPC/fma-aggr-FMF.ll
15 ↗(On Diff #93495)

I added the word "extra" use which is what I meant to write :)

Done. Please take a look at the committed version. The powerpc64le assembly looked correct to me but I don't work on PPC.

Seems right. PPC-LE is foreign to me too...much preferred the BE and Darwin syntax. :)

The counterpart of fused multiply-and-sub was committed under rL299572.