In quite a few cases in LoopVectorize.cpp we call createStepForVF
with a step value of 0, which leads to unnecessary generation of
llvm.vscale intrinsic calls. I've optimised IRBuilder::CreateVScale
and createStepForVF to return 0 when attempting to multiply
vscale by 0.
Details
Diff Detail
Event Timeline
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp | ||
---|---|---|
1116 | Not that this is wrong, but would it be inappropriate to make CreateVScale detect a zero step and return zero? |
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp | ||
---|---|---|
1116 | Sorry I just realised that's what you've done. Do we need to make this change, then? |
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp | ||
---|---|---|
1116 | It's not really necessary as the end result is the same - I was just trying to avoid the compiler doing more work for the Step=0 case, i.e. the creation of a ConstantInt, etc. I noticed there are a few places in the vectoriser where we call this function with a Step of 0 - it happens when widening a PHI instruction or vectorising induction variables. If you prefer I can just revert this code? |
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp | ||
---|---|---|
1116 | I don't feel particularly strongly. I suppose my initial comment was trying to avoid repeating ourselves and duplicating logic. But I also realise that this is how compile-time regressions creep in. I don't know how impactful the creation of a zero ConstantInt is? Presumably it's cached more often than not? |
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp | ||
---|---|---|
1116 | The overhead of creating the ConstantInt is small, and this only happens on a small number of cases when the step value is actually zero. Personally I'd prefer this to be implemented only once in IRBuilder, and have this change removed to make the code simpler. |
Not that this is wrong, but would it be inappropriate to make CreateVScale detect a zero step and return zero?