This patch updates the cost model for ordered reductions so that a call
to the llvm.fmuladd intrinsic is modelled as a normal fmul instruction
plus the cost of an ordered fadd reduction.
Depends on D111555
Differential D111630
[LoopVectorize][CostModel] Update cost model for fmuladd intrinsic
Authored by RosieSumpter on Oct 12 2021, 4:25 AM.
Comment Actions Created getFMulAddReductionCost. This moves the code which calculates the total cost of the fmuladd out of the vectorizer, but avoids changing the interface of getArithmeticReductionCost.
Comment Actions Thanks for the extra info @RosieSumpter. As I see it, @fhahn was just asking a question, which I personally would have answered "no", giving the same rationale as in my previous comment. I guess we'll have to wait for others to respond.
Comment Actions Thanks for the comments @paulwalker-arm and @david-arm. I've moved the fmul cost calculation back to the vectorizer since this seems like the more favourable option.
Comment Actions This cost calculation seems correct to me. For in-loop reductions it calculates the cost as a single fmul + fadd *reduction*. If this is not an in-loop reduction, or if fmuladd is not used in a reduction, it will follow the regular code-path to get the cost of this intrinsic (either as an FMA or separate fmul+fadd). So, LGTM! (Please address the minor nits before you commit)
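To make the two code paths described in the comment above concrete, here is a minimal, self-contained sketch. It is not the patch's code and does not use the LLVM TargetTransformInfo API: the `ToyCosts` struct, the `toyFMulAddCost` function and all cost values are hypothetical and exist only to illustrate the decision.

```cpp
#include <cstdint>

// Hypothetical per-operation costs. In the real cost model these would be
// queried from the target; here they are plain numbers chosen by the caller.
struct ToyCosts {
  uint64_t VectorFMul;           // cost of one vector fmul
  uint64_t OrderedFAddReduction; // cost of a strict, in-order fadd reduction
  uint64_t FMulAddIntrinsic;     // regular cost of an llvm.fmuladd call
                                 // (either an FMA or a separate fmul + fadd)
};

// Toy version of the rule discussed in the review:
//  - fmuladd used in an in-loop (ordered) reduction is costed as a normal
//    fmul plus an ordered fadd reduction;
//  - otherwise the regular intrinsic cost is used.
uint64_t toyFMulAddCost(const ToyCosts &C, bool IsInLoopReduction) {
  if (IsInLoopReduction)
    return C.VectorFMul + C.OrderedFAddReduction;
  return C.FMulAddIntrinsic;
}
```

With made-up costs of 2 for the fmul, 10 for the ordered reduction and 3 for the plain intrinsic, the in-loop case would be costed at 12 and every other use at 3.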
Perhaps I've misunderstood something, because this code looks more complicated (or perhaps just more verbose) than the previous patch. I guess I'm struggling to see why different versions of getArithmeticInstrCost are being called between the two. I just assumed you'd add something like: