This is a minor fix for LV cost model, where the cost for VF=2 is computed twice when the vectorization of the loop is forced without specifying a VF. It was reported by @xusx595 in the mailing list.
Details
Diff Detail
Event Timeline
LGTM. Thanks for this patch and having me as a reviewer. As my work on vectorization is not based on the llvm-trunk but still on top of VPlan, it is not straightforward for me to create patches.
lib/Transforms/Vectorize/LoopVectorize.cpp | ||
---|---|---|
5033 | Could have alternatively started from unsigned i = 2 * Width, as the condition above is essentially peeling the first iteration. Ideally computing expectedCost(1) would also be saved in this case. |
Thanks for the comments, Ayal! Please, let me know if you have any other concerns.
lib/Transforms/Vectorize/LoopVectorize.cpp | ||
---|---|---|
5033 | Good points!
I initially did that but then I had to replicate the LLVM_DEBUG line also in the peeled iteration to be consistent. For that reason I opted for this approach which doesn't need that replication.
Agreed. Unfortunately ScalarCost is necessary below. I also thought that it'd be interesting to have the debug information about the scalar cost for these cases |
lib/Transforms/Vectorize/LoopVectorize.cpp | ||
---|---|---|
5033 | Having a designated LLVM_DEBUG line explaining that the first VF considered is 2 instead of 1, might be helpful. If cost is to be computed just for supplying it as debug information, it should appear under LLVM_DEBUG. Anyway, mtcw, feel free to proceed. |
lib/Transforms/Vectorize/LoopVectorize.cpp | ||
---|---|---|
5033 | Thanks, Ayal.
Not only for debug. Look at line 5059. Not sure if it would be executed when vectorization is forced, though, but I'd prefer not to change the behavior w.r.t the scalar cost in this patch. I'll proceed then. |
Could have alternatively started from unsigned i = 2 * Width, as the condition above is essentially peeling the first iteration.
Ideally computing expectedCost(1) would also be saved in this case.