This patch fixes the logic in the DAGCombiner that folds an AND node according to rule:
(and (X (load V)), C) -> (X (load V))
An AND between a vector load 'X' and a constant build_vector 'C' can be folded into the load itself only if we can prove that the AND operation is redundant.
The algorithm implemented by 'visitAND' firstly computes the splat value 'S' from C, and then checks if S has the lower 'B' bits set (where B is the size in bits of the vector element type). The algorithm takes into account also the 'undef' bits in the splat mask.
Unfortunately, the algorithm only works under the assumption that the sizeof S is a multiple of the vector element type. This assumption is not always valid. For example:
%load = load <3 x i8>, <3 x i8>* %ptr %and = <3 x i8> %load, <i8 undef, i8 undef, i8 95>
Here, the splat value returned by method 'BuildVectorSDNode::isConstantSplat' is APInt(12b, 1520u, 1520s).
The undef bits of the splat mask are APInt(12b, 15u, 15s).
The unified splat plus undef mask 'SU' is ApInt(12b, 1535u, 1535s).
The size in bits of the vector element type is 8. However, the splat value has a size of 12.
Before this patch, the algorithm would have wrongly truncated SU from 12 to 8 bits. So, the resulting splat value was incorrectly propagated as constant 255. Since 255 is a mask with all-bits set, the AND was considered redundant and therefore wrongly removed.
With this patch, we conservatively avoid folding the AND if the splat bits are not compatible with the vector element type.
Added X86 test and-load-fold.ll
Please let me know if ok to submit.