Page MenuHomePhabricator

[x86] bug fix for https://reviews.llvm.org/D64551
AbandonedPublic

Authored by yubing on Wed, Sep 4, 10:14 PM.

Details

Summary

bug fix for https://reviews.llvm.org/D64551
Prevent combining X86ISD::VZEXT_MOVL into wrong DAGs

https://reviews.llvm.org/D64551 has produced a bug in llvm:

the attached t.ll can reproduce this bug:
For t.ll, llvm without this patch produces the correct asm while llvm with this patch produces bad asm:
llvm with this patch:

vmovd   xmm0, dword ptr [rdi + 260] # xmm0 = mem[0],zero,zero,zero
vmovd   xmm1, dword ptr [rdi + 252] # xmm1 = mem[0],zero,zero,zero
vpunpcklqdq     xmm0, xmm1, xmm0 # xmm0 = xmm1[0],xmm0[0]
vxorps  xmm1, xmm1, xmm1
vinserti128     ymm0, ymm1, xmm0, 1
vmovdqa ymmword ptr [rsi + 672], ymm0

llvm without this patch:

vmovq   xmm0, qword ptr [rdi + 252] # xmm0 = mem[0],zero
movabs  rax, offset .LCPI0_0
vmovdqa xmm1, xmmword ptr [rax]
vpshufb xmm0, xmm0, xmm1
vmovd   xmm1, dword ptr [rdi + 260] # xmm1 = mem[0],zero,zero,zero
vpunpcklqdq     xmm0, xmm0, xmm1 # xmm0 = xmm0[0],xmm1[0]
vxorps  xmm1, xmm1, xmm1
vinserti128     ymm0, ymm1, xmm0, 1
vmovdqa ymmword ptr [rsi + 672], ymm0

When I compared the ‘llc -debug’ output, I found the procedure of Combining: t53: v4i32 = X86ISD::VZEXT_MOVL t43 are different:
llvm without this patch:

Legalizing: t53: v4i32 = X86ISD::VZEXT_MOVL t43
Legal node: nothing to do

Combining: t53: v4i32 = X86ISD::VZEXT_MOVL t43
Creating new node: t55: v2i64 = undef
Creating constant: t56: i8 = Constant<4>
Creating constant: t57: i8 = Constant<5>
Creating constant: t58: i8 = Constant<6>
Creating constant: t59: i8 = Constant<7>
Creating constant: t60: i8 = Constant<-1>

llvm with this patch:

Legalizing: t53: v4i32 = X86ISD::VZEXT_MOVL t43
Legal node: nothing to do

Combining: t53: v4i32 = X86ISD::VZEXT_MOVL t43

Then I used ‘dbg llc’ and found that EltsFromConsecutiveLoads will return SDValue() while llvm with this patch won’t.

The return SDValue() was deleted by your patch:

if (!ISD::isNON_EXTLoad(Elt.getNode()))
   return SDValue();

So, I added these two lines back to llvm with this patch, then output asm is correct and the same with llvm without this patch:

Diff Detail

Event Timeline

yubing created this revision.Wed, Sep 4, 10:14 PM
Herald added a project: Restricted Project. · View Herald TranscriptWed, Sep 4, 10:14 PM
yubing edited the summary of this revision. (Show Details)Wed, Sep 4, 10:16 PM

Please abandon this, it isn't a valid solution to the issue (raised at PR43227). I have a WIP fix that will address this correctly.

llvm/lib/Target/X86/X86ISelLowering.cpp
7681

This misses the entire point of D64551 which adds support for drilling down through other ops to find a NON_EXTLoad with a byte offset.

Please abandon this, it isn't a valid solution to the issue (raised at PR43227). I have a WIP fix that will address this correctly.

Hi Simon,
We meet some deadline for fixing the issue. We are eager to see your patch. :)

yubing added a comment.Thu, Sep 5, 7:39 AM

Please abandon this, it isn't a valid solution to the issue (raised at PR43227). I have a WIP fix that will address this correctly.

Thanks @RKSimon

rL371078 should fix this

rL371078 should fix this

Thanks Simon! We'll try it.

yubing abandoned this revision.Thu, Sep 5, 7:24 PM