This patch simplifies the calculation of certain costs in
getInstructionCost when isScalarAfterVectorization() returns a true value.
There are a few places where we multiply a cost by a number N, i.e.
unsigned N = isScalarAfterVectorization(I, VF) ? VF.getKnownMinValue() : 1; return N * TTI.getArithmeticInstrCost(...
After some investigation it seems that there are only these cases that occur
in practice:
- VF is a scalar, in which case N = 1.
- VF is a vector. We can only get here if: a) the instruction is a GEP/bitcast with scalar uses, or b) this is an update to an induction variable that remains scalar.
I have changed the code so that N is assumed to always be 1. For GEPs
the cost is always 0, since this is calculated later on as part of the
load/store cost. For all other cases I have added an assert that none of the
users needs scalarising, which didn't fire in any unit tests.
Only one test required fixing and I believe the original cost for the scalar
add instruction to have been wrong, since only one copy remains after
vectorisation.
I'd prefer to either remove the ifndef here so that it's always defined (and the compiler can optimize it away if it isn't called), or implementing the lambda suggestion from Florian.