This is an archive of the discontinued LLVM Phabricator instance.

[X86] Enable enableAggressiveFMAFusion to true for FMA capable targets (PR36826)
Changes Planned · Public

Authored by RKSimon on Apr 5 2022, 10:35 AM.

Details

Summary

If the X86 subtarget supports FMA, allow it to generate FMA nodes aggressively, even if that leaves us with duplicated mul(x,y) and fma(x,y,z) cases.

This demonstrates a likely flaw in the existing enableAggressiveFMAFusion folds - should we fold fadd(fmul(x,y), fmul(x,y)) -> fma(x,y,fmul(x,y)) ?
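To make the questioned fold concrete, here is a minimal sketch in Python (not LLVM code). The `fma` helper is a hypothetical stand-in that emulates a fused multiply-add with exact rational arithmetic and a single final rounding, and the sample inputs are arbitrary. It illustrates that the folded form still has to compute fmul(x,y) for the addend, so the fold only swaps the fadd for an fma.

```python
from fractions import Fraction


def fma(x: float, y: float, z: float) -> float:
    """Hypothetical helper: emulate fused multiply-add (x*y + z, rounded once)."""
    return float(Fraction(x) * Fraction(y) + Fraction(z))


def unfused(x: float, y: float) -> float:
    m = x * y            # fmul(x, y), shared by both fadd operands
    return m + m         # fadd(fmul(x, y), fmul(x, y))


def folded(x: float, y: float) -> float:
    m = x * y            # fmul(x, y) is still required as the addend
    return fma(x, y, m)  # fma(x, y, fmul(x, y))


# Arbitrary sample inputs, chosen only for illustration.
samples = [(1.5, 2.25), (0.1, 0.3), (1.0 + 2.0**-30, 1.0 + 2.0**-30), (3.0, 1.0 / 3.0)]
results = [(unfused(x, y), folded(x, y)) for x, y in samples]
```

For these sample inputs both forms produce the same value, so the open question about this fold is its cost (one fmul plus one fma instead of one fmul plus one fadd) rather than its result.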

Diff Detail

Unit Tests: Failed

Event Timeline

RKSimon created this revision. Apr 5 2022, 10:35 AM
Herald added a project: Restricted Project. Apr 5 2022, 10:35 AM
RKSimon requested review of this revision. Apr 5 2022, 10:35 AM
craig.topper added inline comments. Apr 5 2022, 11:25 AM
llvm/test/CodeGen/X86/dag-fmf-cse.ll
Line 6

This comment needs to be updated.

Line 12

I guess this was the flaw you were referring to?

This looks like a regression on Haswell and Broadwell.

Latencies

        HSW  BDW  SKL
vaddss    3    3    4
vmulss    5    3    4
fma       5    5    4
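The regression can be worked through with a small Python sketch that uses the latencies from the table above. It assumes a purely serial dependency chain (the fadd or fma waits on the fmul result) and ignores throughput and port pressure. The names `LATENCY`, `critical_path`, `UNFOLDED` and `FOLDED` are illustrative only.

```python
# Latencies copied from the table in the comment above.
LATENCY = {
    "HSW": {"vaddss": 3, "vmulss": 5, "fma": 5},
    "BDW": {"vaddss": 3, "vmulss": 3, "fma": 5},
    "SKL": {"vaddss": 4, "vmulss": 4, "fma": 4},
}

# fadd(fmul(x,y), fmul(x,y)): one multiply, then one add that depends on it.
UNFOLDED = ("vmulss", "vaddss")
# fma(x,y,fmul(x,y)): one multiply, then one fma that depends on it.
FOLDED = ("vmulss", "fma")


def critical_path(cpu: str, ops: tuple) -> int:
    """Sum the latencies of a serial chain of dependent operations."""
    return sum(LATENCY[cpu][op] for op in ops)


# Latency added by the fold on each CPU (positive means slower).
deltas = {
    cpu: critical_path(cpu, FOLDED) - critical_path(cpu, UNFOLDED)
    for cpu in LATENCY
}
```

Under this simple model the folded sequence is 2 units of latency longer on HSW (10 vs 8) and BDW (8 vs 6) and unchanged on SKL (8 vs 8), which matches the observation that the change looks like a regression on Haswell and Broadwell.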
RKSimon added inline comments. Apr 5 2022, 12:26 PM
llvm/test/CodeGen/X86/dag-fmf-cse.ll
Line 12

Yes, I'm not convinced DAGCombine should be folding the fadd(fmul(x,y),fmul(x,y)) case for any target tbh.

Matt added a subscriber: Matt. Apr 7 2022, 12:02 PM
RKSimon planned changes to this revision. Apr 19 2022, 8:22 AM