getScalarizationOverhead() was duplicated in three different places. It also did not check the number of unique operands, so extract costs could be counted twice for two operands that use the same Value. In addition, in LoopVectorizer it could happen that for e.g. an Add instruction only the extracts for one operand were accounted for (this was true for all arithmetic instructions).
This patch improves on this:
- There is now a single function definition in BasicTTIImpl. It is public so that it can be called by TargetTransformInfo, which also gets a method with the same name so that LoopVectorizer can access it.
- Removed the duplicated X86 method, as well as the useless declarations of the method in the AArch64 and ARM derivations.
- Removed the duplicated LoopVectorizer method and changed the LoopVectorizer to query TTI instead. I assumed that the check for void Type is only relevant for the RetTy.
- Added a new method, getOperandsScalarizationOverhead(), that takes a list of operands and computes the cost of the extract operations needed for the unique Values among them. The default implementation of getArithmeticInstrCost() now uses this if operands are provided. If not, it keeps the current behavior by assuming just one operand.
- Improved LoopVectorizer by using the new TTI method, which removes one of the static wrapper functions it used before. getInstructionCost() should now also account for the right number of extracts for arithmetic instructions.
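To illustrate the unique-operand idea behind getOperandsScalarizationOverhead(), here is a minimal standalone sketch. It is not the code in the patch: `Value` is a hypothetical stand-in for llvm::Value, the flat per-element extract cost is an assumption, and the function names are made up for illustration. The real method would query the target for the extract cost of the operand's vector type.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for llvm::Value; only object identity matters here.
struct Value {};

// Assumption: every element extract costs the same flat amount.
constexpr unsigned ExtractCostPerElement = 1;

// Sum the extract cost over the *unique* operands only, so that an operand
// used twice (e.g. add %a, %a) is not charged twice.
unsigned operandsScalarizationOverhead(const std::vector<const Value *> &Args,
                                       unsigned VF) {
  assert(VF > 0 && "vectorization factor must be positive");
  std::unordered_set<const Value *> Seen;
  unsigned Cost = 0;
  for (const Value *A : Args) {
    // insert().second is false when A has already been seen.
    if (Seen.insert(A).second)
      Cost += VF * ExtractCostPerElement;
  }
  return Cost;
}

// Fallback when no operands are provided: assume just one operand.
unsigned singleOperandOverhead(unsigned VF) {
  assert(VF > 0 && "vectorization factor must be positive");
  return VF * ExtractCostPerElement;
}
```

Under these assumptions, with VF = 4 an add of two distinct operands is charged 8 extracts, an add of the same operand with itself is charged 4, and the no-operands fallback is also charged 4.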
Even in just LoopVectorize.cpp, there are more places to go over to see what would work best. I am, however, happy at this point to ask for feedback and suggestions. Does this seem to be going in the right direction?
Discussion on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2017-January/109382.html
Simon, apart from duplicate operands, you mentioned extracted immediates. Do you have an example or reference for how best to treat that case?