Page MenuHomePhabricator

[AArch64] Consider instruction-level contract FMFs in combiner patterns.
ClosedPublic

Authored by fhahn on Thu, Jul 30, 3:43 AM.

Details

Summary

Currently, instruction level fast math flags are not considered when
generating patterns for the machine combiner.

This currently leads to some missed opportunities to generate FMAs in
combination with #pragma clang fp contract (fast).

For example, when building the example below with -O3 for AArch64, no
FMADD is generated. If built with -O2 and the DAGCombiner is used
instead of the MachineCombiner for FMAs, an FMADD is generated.

With this patch, the same code is generated in both cases.

float madd_contract(float a, float b, float c) {
#pragma clang fp contract (fast)
  return (a * b) + c;
}

Diff Detail

Event Timeline

fhahn created this revision.Thu, Jul 30, 3:43 AM
fhahn requested review of this revision.Thu, Jul 30, 3:43 AM
dmgreen added inline comments.Thu, Jul 30, 7:22 AM
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
3919

-> We can only fuse ...

4122

Do both instructions need the fast-math flag, or just the add? I guess it's safer to check both, but the code would end up being simpler if it was just the add.

Reading DAGCombiner::visitFADDForFMACombine it's a bit messy, but seems like it only really considers the add:
https://godbolt.org/z/3ejrhE
Reading the code though, I'm not sure that's really what it is intending to do.

llvm/test/CodeGen/AArch64/machine-combiner-instr-fmf.mir
82

-> both

fhahn updated this revision to Diff 282628.Mon, Aug 3, 8:03 AM

Only require contract on FADD/FSUB for fusion.

fhahn added inline comments.Mon, Aug 3, 8:06 AM
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
4122

Indeed it seems like it should be sufficient on the FADD/FSUB. Given that is what DAGCOmbine does, it should be safe to do so in the machine combiner too (plus it ensures consistency between -O3 and other levels). It also makes the code much simpler.

I also found llvm/test/CodeGen/AArch64/neon-fma-FMF.ll, which tests with different contract flags and it specifically checks for fusion with contract on FADD and no fusion with contract on FMUL. I added -O3, which checks the machine-combiner path as well.

dmgreen accepted this revision.Tue, Aug 4, 1:12 AM

Thanks. LGTM, in that this work the same as DAGCombiner.

This revision is now accepted and ready to land.Tue, Aug 4, 1:12 AM