In combineRepeatedFPDivisors, only scale num uses by splat if the
division can be converted into a scalar op.
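For readers unfamiliar with the combine, the following is a minimal C++ sketch of the general idea behind repeated-divisor folding: several divisions by the same value are replaced by one reciprocal and a multiply per numerator. It is an illustration only, not the DAGCombiner code. The helper names `divideEach` and `multiplyByReciprocal` are hypothetical, and the rewrite is only valid when reciprocal math is allowed, since the two forms can round differently in general.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper: the straightforward form, one division per numerator.
template <std::size_t N>
std::array<float, N> divideEach(const std::array<float, N> &nums, float d) {
  std::array<float, N> out{};
  for (std::size_t i = 0; i < N; ++i)
    out[i] = nums[i] / d; // one fdiv per element
  return out;
}

// Hypothetical helper: the folded form, a single division to form the
// reciprocal, then one multiply per numerator.
template <std::size_t N>
std::array<float, N> multiplyByReciprocal(const std::array<float, N> &nums,
                                          float d) {
  const float recip = 1.0f / d; // the only fdiv
  std::array<float, N> out{};
  for (std::size_t i = 0; i < N; ++i)
    out[i] = nums[i] * recip; // one fmul per element
  return out;
}
```

With a power-of-two divisor such as 4.0f both helpers return identical values; for other divisors the results may differ in the last bit, which is why the folded form needs relaxed floating-point semantics.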
Details
Diff Detail
Event Timeline
llvm/test/CodeGen/AArch64/fdiv-combine.ll, line 199:
To be honest, the original code looks faster to me, even with the extra fmul and fmov. The latency of an fmul is a lot lower than that of an fdiv, and the throughput of fdiv is terrible, whereas it's pretty good for fmul.