This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Build correct vector filled with undef nodes
Needs ReviewPublic

Authored by David.Xu on Aug 31 2014, 10:35 PM.

Details

Summary

This patch generates correct vector when visiting AND and OR SDNodes in DAGCombiner. When a BuildVectore SDNode which is one of the operands of AND/OR SDNOdes is checked to be all zero/ones, it just return this SDNode. But in this SDNode there may be some undef elements. These undef elements should be replaced with real ZEROS/ONES, because the undef elements have been regard as ZEROS/ONES in ISD::isBuildVectorAllZeros()/ISD::isBuildVectorAllOnes() function.

If these undef elements are not replaced, It will cause some bugs. For example,
define <8 x i16> @aarch64_tree_tests_and(<8 x i16> %a) {
entry:

%and = and <8 x i16> <i16 0, i16 undef, i16 undef, i16 0, i16 0, i16 undef, i16 undef, i16 0>, %a
%ret = add <8 x i16> %and, <i16 -32768, i16 32767, i16 4664, i16 32767, i16 -32768, i16 -32768, i16 0, i16 0>
ret <8 x i16> %ret

}

After visitAND,
%and = <8 x i16> <i16 0, i16 undef, i16 undef, i16 0, i16 0, i16 undef, i16 undef, i16 0>
%ret = <8 x i16> <i16 -32768, i16 undef, i16 undef, i16 32767, i16 -32768, i16 undef, i16 undef, i16 0>
The undef elements in %ret are regard as ZEROS in the end, but it should be i16 32767, i16 4664, i16 -32768 and i16 0.

This bug is found in the emperor test. This bug indicates that we should not deal with AND/OR and ADD/SUB in the same way, because visitAND/visitOR returns the AllZEROS/ONES SDNodes, while visitADD/visitSUB returns the other SDNodes.

Diff Detail

Event Timeline

David.Xu updated this revision to Diff 13134.Aug 31 2014, 10:35 PM
David.Xu retitled this revision from to [AArch64] Build correct vector filled with undef nodes.
David.Xu updated this object.
David.Xu edited the test plan for this revision. (Show Details)
David.Xu added reviewers: t.p.northover, Jiangning.
David.Xu added a subscriber: Unknown Object (MLST).Aug 31 2014, 10:36 PM
t.p.northover edited edge metadata.Sep 1 2014, 5:28 AM

Hi David,

Thanks for working on this. Bet it was fun to track down! I've got a comment about the implementation...

lib/CodeGen/SelectionDAG/DAGCombiner.cpp
486

I think this might be more useful as a generic getConstantSplat function in SelectionDAG (following the SelectionDAG::getConstant functions).

It wouldn't significantly complicate the use-sites, would simplify the implementation (don't need to care about bitcasts or weird promotion rules or anything) and I could see other cases where it could be used.

Something like (untested)

SDValue getConstantSplat(uint64_t Val, SDLoc DL, EVT VT) {
  SmallVector<SDValue, 16> Ops(VT.getVectorNumElements(),
                               getConstant(Val, VT.getVectorElementType()));
  return getNode(ISD::BUILD_VECTOR, DL, VT, Ops);
}

Yes, it is more useful as a generic getConstantSplat(all of the elements of the vector are the same) function in SelectionDAG. But it may not be used frequently as getConstant function. If we want to present a bunch of APIs to generate const splat, more work is needed. For example, we should consider whether the data type is integer or float point. In SelectionDAG, there are 3 getConstant and 3 getConstantFP, thus 6 getConstantSplat functions should be added.

Here we just do the post-processing of the AllOnes/AllZeros BuildVector. The only disadvantage of my method is we should use ReplaceUndefFromZeroOrOneVector carefully. For this small case which is to fix a bug, I think we need not to add 6 getConstantSplat functions.

Hi David,

The only disadvantage of my method is we should use ReplaceUndefFromZeroOrOneVector carefully

There's the extra complexity in implementation and tight coupling to isBuildVectorAllOnes as well (if that function was hypothetically enhanced to handle extra cases, this function would break).

For this small case which is to fix a bug, I think we need not to add 6 getConstantSplat functions.

I agree. We have a real use for at least one though, so we should add it. Otherwise these functions would never appear, no matter how useful they are, as everyone says "it's not worth it just for this one case".

Cheers.

Tim.

David.Xu updated this revision to Diff 13508.Sep 9 2014, 6:27 PM
David.Xu edited edge metadata.

Actually, getConstant has been provided to build a BuildVector Node.

Please have a look.
David

Oh, well spotted! This one looks fine, thanks for updating it.

Tim.