Optimize the common case of splat vector constant. For large vector going through all the elements is expensive. For splat/broadcast cases we can skip going through all the elements.
Details
Diff Detail
Event Timeline
llvm/include/llvm/IR/Constants.h | ||
---|---|---|
770 | I'd move this into the initializer. | |
llvm/lib/Analysis/ValueTracking.cpp | ||
179 | Is it OK to set DemandedLHS bit 0 when DemandedElts[0] is false? | |
llvm/lib/IR/ConstantFold.cpp | ||
64–67 | Please add a comment for this. | |
891–896 | Please add a comment here. | |
1385–1386 | This doesn't depend on C1. |
Thanks David.
llvm/include/llvm/IR/Constants.h | ||
---|---|---|
770 | Done. | |
llvm/lib/Analysis/ValueTracking.cpp | ||
179 | It is okay as long as DemandedElts has at least one bit set. With current code this function would never be called with an empty DemandedElts but I added an early check for this case to handle this case. | |
llvm/lib/IR/ConstantFold.cpp | ||
64–67 | Done. | |
891–896 | Done. | |
1385–1386 | Right, I pulled this part out of the if. |
llvm/include/llvm/IR/Constants.h | ||
---|---|---|
770 | This introduces a linear scan in the creation of the ConstantDataVector() class, I wonder if this might be wasteful if "isSplat()" never endsup to be called on a particular vector. I wonder if moving this scan in "isSplat()" and caching the result might be better (as isSplat() already was an expensive method, while the construction of ConstantDataVector was not) |
llvm/lib/Analysis/ValueTracking.cpp | ||
---|---|---|
177–178 | I believe we still need to set DemandedLHS and DemandedRHS to APInt::getNullValue(NumElts) so I cannot really move it up? | |
179 | DemandedLHS[0] is set if for any DemandedElts[i] Shuf->getMaskValue(i) is 0 or NumElts. |
llvm/include/llvm/IR/Constants.h | ||
---|---|---|
770 | Here is an alternative solution were we lazily set IsSplat. I assume it is going to be rare that we create a constant and not have any pass call isSplat on it but it could make sense to try to keep creation cheap. What do you think of this solution? |
llvm/include/llvm/IR/Constants.h | ||
---|---|---|
770 | Exactly what I was thinking about thanks! I think there are potentially cases where isSplat() is not called when a Constant is created. An example is serialization and deserialization of bitcode, where the bitcode is read into an LLVM representation is constructed and then it is just printed out (or viceversa). |
llvm/include/llvm/IR/Constants.h | ||
---|---|---|
770 | Sounds good, let's leave it that way then. |
llvm/include/llvm/IR/Constants.h | ||
---|---|---|
771 | You could save a byte by rolling your own optional. I think this is valuable because there may be many ConstantDataVector. Maybe something like: bool IsSplatSet : 1; bool IsSplat : 1; | |
llvm/lib/Analysis/ValueTracking.cpp | ||
177–178 | I was just thinking of floating this bit up: if (DemandedElts.isNullValue()) return true; | |
llvm/lib/IR/ConstantFold.cpp | ||
1389 | Now that these are lazy, maybe only do C1->getSplatValue() if C2Splat is true. | |
llvm/lib/IR/Constants.cpp | ||
2907 | Formatting. |
llvm/include/llvm/IR/Constants.h | ||
---|---|---|
771 | Done. | |
llvm/lib/Analysis/ValueTracking.cpp | ||
177–178 | What I meant is that DemandedLHS DemandedRHS are out parameters and do need to be set before return true. So we cannot move this above the APInt above. | |
llvm/lib/IR/ConstantFold.cpp | ||
1389 | Done. | |
llvm/lib/IR/Constants.cpp | ||
2907 | Done. |
LGTM
llvm/lib/IR/Instructions.cpp | ||
---|---|---|
1965 | Instead of resize, I'd reserve + push_back. resize fills the vector with values you will overwrite anyway. |
I'd move this into the initializer.