This adds fp16 variants of all the fma patterns in the ARM backend.
llvm/lib/Target/ARM/ARMInstrVFP.td | ||
---|---|---|
2215 | Do you think it would be worth doing some canonicalisation somewhere? With two fnegs I'm assuming this pattern is more expensive than the others if not caught. |
llvm/lib/Target/ARM/ARMInstrVFP.td | ||
---|---|---|
2215 | I gave this a go, but it appears that the opposite is sometimes true. AMDGPU has instructions that look like v_fma_f32 v0, -v0, v1, -v1, where each of the operands can be negated for free. If we instead negated the result of the whole fma, we would need to add reverse patterns for those targets. RISCV seems to have some cases with fnmadd.s that are made worse too. |
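
For reference, the two forms being weighed here, fma(fneg a, b, fneg c) and fneg(fma(a, b, c)), give the same value whenever the result is nonzero, which is why either could in principle be picked as the canonical one. The following is only a rough Python sketch of that equivalence, not backend code: the helper names are hypothetical, it models the fused operation on doubles with exact rationals rather than fp16, and it ignores signed zeros and NaNs.

```python
from fractions import Fraction


def fused_multiply_add(a: float, b: float, c: float) -> float:
    """Model a fused multiply-add: form a*b + c exactly, then round once.

    Assumption: this is an illustrative model only. It does not track the
    sign of a zero result, so signed-zero behaviour is out of scope here.
    """
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)  # single rounding to the nearest double


def two_fnegs(a: float, b: float, c: float) -> float:
    # fma(fneg a, b, fneg c): negations sit on two of the operands.
    return fused_multiply_add(-a, b, -c)


def outer_fneg(a: float, b: float, c: float) -> float:
    # fneg(fma(a, b, c)): a single negation applied to the result.
    return -fused_multiply_add(a, b, c)


# Sample inputs chosen so that the exact result is never zero.
samples = [
    (1.5, 2.0, 0.25),
    (0.1, 0.2, 0.3),
    (-3.0, 7.0, 1e-3),
    (1e16, 1.0, 1.0),
]
results = [(two_fnegs(*s), outer_fneg(*s)) for s in samples]
all_equal = all(x == y for x, y in results)
```

Which form is cheaper then depends on the target: one with free operand negation prefers the first, while one with a dedicated negated-fma instruction may prefer the second.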