This is an archive of the discontinued LLVM Phabricator instance.

[x86] Fix wrong lowering of vsetcc nodes (PR25080).
ClosedPublic

Authored by andreadb on Oct 12 2015, 8:58 AM.

Details

Summary

This patch fixes pr25080.

Function LowerVSETCC (in X86ISelLowering.cpp) works under the assumption that for non-AVX512 targets, the source type and destination type of a type-legalized setcc node are always the same type.

This assumption is unfortunately incorrect because the type legalizer is not always able to promote the return type of a setcc to the same type as the first operand of a setcc.

In the case of a vsetcc node, the legalizer firstly checks if the first input operand has a legal type. If so, then it would promote the return type of the vsetcc to that same type. Otherwise, the return type is promoted to the 'next legal type', which, for vectors of MVT::i1 is always a 128-bit integer vector type.

Example (-mattr=+avx):

%0 = trunc <8 x i32> %a to <8 x i23>
%1 = icmp eq <8 x i23> %0, zeroinitializer

The initial selection dag for the code above is:

v8i1 = setcc t5, t7, seteq:ch
  t5: v8i23 = truncate t2
    t2: v8i32,ch = CopyFromReg t0, Register:v8i32 %vreg1
    t7: v8i32 = build_vector of all zeroes.

The type legalizer would firstly check if 't5' has a legal type. If so, then it would reuse that same type to promote the return type of the setcc node. Unfortunately 't5' is of illegal type v8i23, and therefore it cannot be used to promote the return type of the setcc node. Consequently, the setcc return type is promoted to v8i16.
Later on, 't5' is promoted to v8i32 thus leading to the following dag node sequence:

v8i16 = setcc t32, t25, seteq:ch

where t32 and t25 are now values of type v8i32.

Before this patch, function LowerVSETCC would have wrongly expanded the setcc to a single X86ISD::PCMPEQ. Surprisingly, ISel was still able to match an instruction. In our case, ISel would have matched a VPCMPEQWrr:

t37: v8i16 = X86ISD::VPCMPEQWrr t36, t25

However, t36 and t25 are both VR256, while the result type is instead class VR128.
This inconsistency ended up causing the insertion of COPY instructions like these:

%vreg7<def> = COPY %vreg3; VR128:%vreg7 VR256:%vreg3

Which are invalid full copies (not sub register copies).
Eventually, the backend would have hit an UNREACHABLE "Cannot emit physreg copy instruction" in the attempt to expand the malformed pseudo COPY instructions.

This patch fixes the problem adding the missing logic in LowerVSETCC to handle the corner case of a setcc with 128-bit return type and 256-bit operand type.

This problem was originally reported by Dimitry as PR25080. It has been latent for a very long time. I have added the minimal reproducible from that bugzilla as test setcc-lowering.ll.

Please let me know if okay to submit.

Thanks,
Andrea

Diff Detail

Event Timeline

andreadb updated this revision to Diff 37116.Oct 12 2015, 8:58 AM
andreadb retitled this revision from to [x86] Fix wrong lowering of vsetcc nodes (PR25080)..
andreadb updated this object.
andreadb added reviewers: qcolombet, dim, RKSimon, spatel.
andreadb added a subscriber: llvm-commits.
emaste added a subscriber: emaste.Oct 12 2015, 9:03 AM
qcolombet accepted this revision.Oct 12 2015, 11:27 AM
qcolombet edited edge metadata.

Hi Andrea,

LGTM.
Could you double check if ARM/AArch64 target may suffer the same kind of assumptions?

Thanks,
-Quentin

This revision is now accepted and ready to land.Oct 12 2015, 11:27 AM

Hi Andrea,

LGTM.
Could you double check if ARM/AArch64 target may suffer the same kind of assumptions?

Thanks,
-Quentin

Hi Quentin,

ARM/AArch64 don't seem to suffer for this same problem.
The lowering logic in ARMISelLowering.cpp (function LowerVSETCC) uses the operand type to lower a setcc node. To avoid problems with mismatching return type, it explicitly calls

DAG.getSExtOrTrunc(Result, dl, VT);   // VT is the setcc return type.

So the transformation for ARM is perfectly okay. I also verified that the reproducible test case doesn't fail to compile using an arm triple.
A similar logic is used in AArch64ISelLowering.cpp (function LowerVSETCC). Even in this case, each call to function 'EmitVectorComparison' is post-dominated by a call to 'getSExtOrTrunc(dl, Cmp, Op.getValueType())'. Here 'Op' is the setcc node, so we are safe.

I hope this helps!

Thanks for the review Quentin!
Cheers,
Andrea

Thanks for double checking!

Cheers,
Q.

This revision was automatically updated to reflect the committed changes.