getIntrinsicInstrCost() used to compute scalarization cost based only on types. This patch checks the actual arguments when they are available, so that only unique, non-constant operands contribute to the scalarization cost.
getIntrinsicInstrCost("Types") has gotten a new parameter 'ScalarizationCostPassed', so that a caller can pass on a value for scalarization cost based on actual operands. If this is UINT_MAX, the Type based estimation is made as before.
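To illustrate the UINT_MAX convention described above, here is a minimal standalone sketch. It does not use the LLVM API: resolveScalarizationCost() and typeBasedScalarizationCost() are hypothetical names, and the type-based formula (one insert per result lane plus one extract per lane per argument) is an assumption made only for this example.

```cpp
#include <climits>

// Hypothetical type-based estimate (an assumption for this sketch):
// one insert per result lane plus one extract per lane per argument.
unsigned typeBasedScalarizationCost(unsigned NumArgs, unsigned NumLanes) {
  return NumLanes * (1 + NumArgs);
}

// If the caller passed a cost computed from the actual operands, use it.
// UINT_MAX is the sentinel meaning "no value passed, estimate from types".
unsigned resolveScalarizationCost(unsigned ScalarizationCostPassed,
                                  unsigned NumArgs, unsigned NumLanes) {
  if (ScalarizationCostPassed != UINT_MAX)
    return ScalarizationCostPassed;
  return typeBasedScalarizationCost(NumArgs, NumLanes);
}
```

With two arguments and four lanes, passing UINT_MAX falls back to the type-based estimate, while any other value is returned unchanged.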
getIntrinsicInstrCost("Args") has gotten a new parameter 'VF'. If VF > 1, the types involved are vectorized with it before the analysis begins (Args that are already vectors can however not be combined with VF > 1). This seemed like a good idea, since both SLPVectorizer and LoopVectorizer can use this.
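The VF handling can be pictured with a small standalone model. This is only a sketch under stated assumptions: Ty and widenTypes() are hypothetical stand-ins, not LLVM types or functions, and a type is reduced to its element count.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a type: only the element count is modelled.
// NumElts == 1 means a scalar type.
struct Ty {
  unsigned NumElts;
};

// Widen each argument type by VF before the cost analysis.
// Arguments that are already vectors are only accepted with VF == 1.
std::vector<Ty> widenTypes(const std::vector<Ty> &ArgTys, unsigned VF) {
  std::vector<Ty> Widened;
  for (const Ty &T : ArgTys) {
    assert((VF == 1 || T.NumElts == 1) &&
           "vector arguments cannot be combined with VF > 1");
    Widened.push_back(Ty{T.NumElts * VF});
  }
  return Widened;
}
```

Scalar arguments with VF = 4 become 4-element vectors; an 8-element vector argument with VF = 1 is left unchanged.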
getOperandsScalarizationOverhead() now also checks for Constants. It has also been extended to allow vector operands (in which case VF must be 1). I deduced this to be needed since BBVectorize calls getVecTypeForPair(), which also handles vector types as input. I am not that familiar with BBVectorize however, so if this is wrong in the sense that BBVectorize never further vectorizes a vector intrinsic, then this is not needed...
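The idea of counting only unique, non-constant operands can be sketched as follows. This is a standalone model, not the code in the patch: Operand, ExtractCostPerLane and operandsScalarizationOverhead() are hypothetical, and a flat cost of one per extracted lane is assumed.

```cpp
#include <set>
#include <vector>

// Hypothetical stand-in for an IR operand; not LLVM's Value class.
struct Operand {
  int Id;          // identity of the underlying value
  bool IsConstant; // constants (including undef) need no extracts
};

// Assumed flat cost of extracting one lane.
constexpr unsigned ExtractCostPerLane = 1;

// Sum the extract cost over unique, non-constant operands only.
unsigned operandsScalarizationOverhead(const std::vector<Operand> &Args,
                                       unsigned VF) {
  std::set<int> Seen;
  unsigned Cost = 0;
  for (const Operand &A : Args) {
    if (A.IsConstant)
      continue; // nothing to extract from a constant
    if (!Seen.insert(A.Id).second)
      continue; // the same value was already counted
    Cost += VF * ExtractCostPerLane;
  }
  return Cost;
}
```

For a call that uses the same non-constant value twice plus one constant, with VF = 4, only one set of four extracts is counted. A call whose operands are all constant or undef gets no operand scalarization cost.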
In BBVectorize, things got a bit tricky while handling merged arguments. Here, the scalarization cost is computed locally, by considering all the input operands of both instructions, plus the vectorized return type. Since vectorization is done by merging arguments, getOperandsScalarizationOverhead() is then called for all operands with a VF of 1. I hope this is right.
test/Analysis/CostModel/X86/arith-fp.ll has been updated to expect lower instruction costs, which should make sense since the calls have undef operands, so the scalarization cost for them (extracts) should not be added.
Please define what "it" is. This is the cost of scalarizing the arguments and the return value, right?