This is an archive of the discontinued LLVM Phabricator instance.

[X86][FMA4] Prefer FMA4 to FMA
ClosedPublic

Authored by RKSimon on Nov 25 2015, 1:36 PM.

Details

Summary

We currently output FMA instructions on targets which support both FMA4 and FMA (i.e. the later Bulldozer CPUs: bdver2/bdver3/bdver4).

This patch flips this so FMA4 is preferred; this is for several reasons:

1 - FMA4 is non-destructive, reducing the need for mov instructions.
2 - It's more straightforward to commute and fold inputs (although the recent work on FMA has reduced this difference).
3 - All supported targets have FMA4 performance equal to or better than FMA - Piledriver (bdver2) in particular has half the throughput when executing FMA instructions.
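
To illustrate point 1, here is a minimal toy model in Python. It is only a sketch and not LLVM's actual lowering: the register names, the instruction tuples, and the helpers lower_fma3, lower_fma4 and run are all hypothetical. It assumes the result must go to a register distinct from the three inputs, all of which stay live. Real FMA3 has 132/213/231 operand orderings; the model uses a single destructive form for simplicity.

```python
# Toy model only: instruction tuples and helper names are hypothetical,
# not LLVM's real instruction selection.

def lower_fma3(dst, a, b, c):
    """Lower dst = a*b + c using a destructive 3-operand FMA.

    The destination doubles as a source, so `a` must be copied into
    `dst` first to keep the original `a` intact.
    """
    return [("mov", dst, a), ("fma3", dst, b, c)]  # fma3: dst = dst*b + c


def lower_fma4(dst, a, b, c):
    """Lower dst = a*b + c using a non-destructive 4-operand FMA."""
    return [("fma4", dst, a, b, c)]  # fma4: dst = a*b + c


def run(insts, regs):
    """Execute the toy instructions on a copy of the register file."""
    regs = dict(regs)
    for op, dst, *srcs in insts:
        if op == "mov":
            regs[dst] = regs[srcs[0]]
        elif op == "fma3":
            regs[dst] = regs[dst] * regs[srcs[0]] + regs[srcs[1]]
        elif op == "fma4":
            regs[dst] = regs[srcs[0]] * regs[srcs[1]] + regs[srcs[2]]
        else:
            raise ValueError(f"unknown op: {op}")
    return regs


regs = {"x0": 2.0, "x1": 3.0, "x2": 4.0, "x3": 0.0}
fma3_code = lower_fma3("x3", "x0", "x1", "x2")
fma4_code = lower_fma4("x3", "x0", "x1", "x2")
```

Under these assumptions both sequences compute the same value and leave the inputs untouched, but the FMA3 sequence needs an extra mov while the FMA4 sequence is a single instruction.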

It looks like no future AMD processor lines will support FMA4 after the Bulldozer series, so we're not causing problems for later CPUs.

(Craig - it looks like it only took 3 years to make this change after your original patch at r162454!)

Diff Detail

Repository
rL LLVM

Event Timeline

RKSimon updated this revision to Diff 41177. Nov 25 2015, 1:36 PM
RKSimon retitled this revision from to [X86][FMA4] Prefer FMA4 to FMA.
RKSimon updated this object.
RKSimon added reviewers: craig.topper, spatel.
RKSimon set the repository for this revision to rL LLVM.
RKSimon added a subscriber: llvm-commits.
spatel accepted this revision. Nov 29 2015, 1:43 PM
spatel edited edge metadata.

LGTM.

This revision is now accepted and ready to land. Nov 29 2015, 1:43 PM
This revision was automatically updated to reflect the committed changes.