As a followup to D95291, getOperandsScalarizationOverhead was still using a VF as a vector factor if the arguments were scalar, and would assert on certain matrix intrinsics with differently sized vector arguments. This patch removes the VF arg, instead passing the Types through directly. This should allow it to more accurately compute the cost without having to guess at which operands will be vectorized, something difficult with more complex intrinsics.
Details
Diff Detail
Event Timeline
llvm/include/llvm/CodeGen/BasicTTIImpl.h | ||
---|---|---|
630–631 | nit: if(auto *VecTy = dyn_cast<VectorType>(Ty)) | |
647 | Is this not doing something different than it did before, i.e. it now always uses InTy for each of the arguments to get the cost, where it previously used the types from each of the arguments? | |
llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp | ||
751 | nit: clang-format? (over 80char I think) |
Thanks for taking a look.
llvm/include/llvm/CodeGen/BasicTTIImpl.h | ||
---|---|---|
647 | Yeah. This version of getScalarizationOverhead is only called from getArithmeticInstrCost (or ARMTTIImpl::getArithmeticInstrCost etc). So the arguments/return value are all likely to all be the same value, and this is making that assumption. It is probably at least good enough as a heuristic, when compared to the else below. I've added a comment too. |
llvm/include/llvm/CodeGen/BasicTTIImpl.h | ||
---|---|---|
647 | If this code assumes all arguments are of the same type, can you add an assert which checks this? |
llvm/include/llvm/CodeGen/BasicTTIImpl.h | ||
---|---|---|
647 | Unfortunately that isn't something that is always true. The type of Args may be incorrect, even the scalar type. The vectorizer can do bitwidth narrowing at the same time as vectorization, in which case the Ty will be the most accurate type available. If the original Arg type was an i32, but the vectorizer is vectorizing with it as an i8 to maximize the vector factor, a vXi8 is the correct type to use. That will be Ty for all the uses of this function. I could try to move this array creation to the callers of the function. That would involve duplicating this bit of code in a few places, but after e771614bae0a05 removed the only other use of it, it should ensure these types are correct from the outside. |
Thanks for the changes! I think moving the creation of the Type array to the callers is an improvement, it makes it more clear what types are used to calculate the cost.
Just a few more nits, and then I'm happy.
llvm/include/llvm/Analysis/TargetTransformInfo.h | ||
---|---|---|
726–727 | Comment needs updating. | |
llvm/include/llvm/CodeGen/BasicTTIImpl.h | ||
638 | nit: can you add a comment to describe the meaning of the arguments? Just my mental understanding:
| |
647 | Thanks for the explanation and changes! nit: This comment ("// Assume all arguments are of type Ty, ..") can now be removed. |
Comment needs updating.