This is an archive of the discontinued LLVM Phabricator instance.

[x86] split FMA with fast-math-flags to avoid libcall
ClosedPublic

Authored by spatel on Jul 16 2020, 12:55 PM.

Details

Summary

fma reassoc A, B, C --> fadd (fmul A, B), C (when target has no FMA hardware)

C/C++ code may use explicit fma() calls (which become LLVM fma intrinsics in IR) but then gets compiled with -ffast-math or similar.
For targets that do not have FMA hardware, we don't want to go out to the math library for a precise but slow FMA result.

I tried this as a generic DAGCombine, but it caused infinite looping on more than 1 other target, so there's likely some over-reaching fma formation happening.

There's also a potential intersection of strict FP with fast-math here. I'm not sure who should win that fight, so just deferring to current behavior for that case.

Diff Detail

Event Timeline

spatel created this revision.Jul 16 2020, 12:55 PM
spatel updated this revision to Diff 278726.Jul 17 2020, 4:59 AM

Patch updated:
Added a minimal vector test to show larger diffs and confirm that doesn't conflict with existing transforms.

SGTM, but i'll review to someone more familiar with FP

This looks good besides the Piledriver checks.

Regarding StrictFP: an FMA is not the same as a MUL+ADD in that context. So going to a libcall is the right move.

llvm/test/CodeGen/X86/fma.ll
1612

These Piledriver checks disappeared. Was that deliberate?

spatel marked 2 inline comments as done.Jul 17 2020, 1:58 PM
spatel added inline comments.
llvm/test/CodeGen/X86/fma.ll
1612

They were subsumed into the common prefix of "FMACALL32". It's a bit strange that we don't have a unique prefix for this run line:
; RUN: llc < %s -mtriple=i386-apple-darwin10 -mattr=+avx,-fma,-fma4 -show-mc-encoding | FileCheck %s --check-prefix=FMACALL32

I can try to make that clearer with a cleanup patch.

cameron.mcinally accepted this revision.Jul 17 2020, 2:02 PM

LGTM

Ah, okay. I should have looked for overlap in the RUN lines. Sorry for the noise.

This revision is now accepted and ready to land.Jul 17 2020, 2:02 PM
This revision was automatically updated to reflect the committed changes.
spatel marked an inline comment as done.