This patch extends to handle OR vector reduction patterns where the result is compared against zero.
Fixes PR45378
Paths
| Differential D81547
[X86][SSE] MatchVectorAllZeroTest - handle OR vector reductions ClosedPublic Authored by RKSimon on Jun 10 2020, 3:49 AM.
Details Summary This patch extends to handle OR vector reduction patterns where the result is compared against zero. Fixes PR45378
Diff Detail
Event Timeline
RKSimon retitled this revision from [X86][SSE] combineVectorSizedSetCCEquality - handle OR vector reductions to [X86][SSE] LowerVectorAllZeroTest - handle OR vector reductions. Comment ActionsRefreshed the patch to be based off LowerVectorAllZeroTest instead of combineVectorSizedSetCCEquality, which already handles scalar reductions and with a suitable refactor can handle vector reductions as well. RKSimon retitled this revision from [X86][SSE] LowerVectorAllZeroTest - handle OR vector reductions to [X86][SSE] MatchVectorAllZeroTest - handle OR vector reductions. Comment ActionsRebased now that the LowerVectorAllZero refactor is complete
This revision is now accepted and ready to land.Jun 15 2020, 2:55 PM Closed by commit rG057c9c7ee00b: [X86][SSE] MatchVectorAllZeroTest - handle OR vector reductions (authored by RKSimon). · Explain WhyJun 16 2020, 2:09 AM This revision was automatically updated to reflect the committed changes. Comment Actions This seems to be causing Chromium test failures. Diffing the asm looks suspicious: Disassembly of section .text._ZN5glcts18ComputeShaderTestsC2ERN4deqp7ContextE: @@ -29795,7 +29795,7 @@ 11c4: 66 0f 63 c3 packsswb %xmm3,%xmm0 11c8: 66 0f d7 c0 pmovmskb %xmm0,%eax 11cc: 66 85 c0 test %ax,%ax - 11cf: 0f 84 65 01 00 00 je 133a <_ZN5glcts12_GLOBAL__N_128AdvancedPipelineComputeChain3RunEv+0x133a> + 11cf: 0f 85 65 01 00 00 jne 133a <_ZN5glcts12_GLOBAL__N_128AdvancedPipelineComputeChain3RunEv+0x133a> 11d5: 49 8b 45 20 mov 0x20(%r13),%rax 11d9: 48 8b 00 mov (%rax),%rax 11dc: 48 8b 40 10 mov 0x10(%rax),%rax @@ -29894,7 +29894,7 @@ 137d: 66 0f 63 c2 packsswb %xmm2,%xmm0 1381: 66 0f d7 c0 pmovmskb %xmm0,%eax 1385: 66 85 c0 test %ax,%ax - 1388: 0f 84 64 01 00 00 je 14f2 <_ZN5glcts12_GLOBAL__N_128AdvancedPipelineComputeChain3RunEv+0x14f2> + 1388: 0f 85 64 01 00 00 jne 14f2 <_ZN5glcts12_GLOBAL__N_128AdvancedPipelineComputeChain3RunEv+0x14f2> 138e: 49 8b 45 20 mov 0x20(%r13),%rax 1392: 48 8b 00 mov (%rax),%rax 1395: 48 8b 40 10 mov 0x10(%rax),%rax It seems the conditions on the jumps there are getting flipped, which really looks like a bug. I've posted a repro here: https://bugs.chromium.org/p/chromium/issues/detail?id=1097758#c9 I've reverted in 1357c065783e8d66c9db4be59aca389d1dc6c05f in the meantime.
Revision Contents
Diff 270987 llvm/lib/Target/X86/X86ISelLowering.cpp
llvm/test/CodeGen/X86/pr45378.ll
llvm/test/CodeGen/X86/vector-reduce-or-cmp.ll
|
Can we assert that CC is ISD::SETEQ or ISD::SETNE here?