Differential D81547
[X86][SSE] MatchVectorAllZeroTest - handle OR vector reductions
Closed, Public. Authored by RKSimon on Jun 10 2020, 3:49 AM.
Details
Summary
This patch extends MatchVectorAllZeroTest to handle OR vector reduction patterns where the result is compared against zero.
Fixes PR45378.
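For readers unfamiliar with the pattern, here is a minimal scalar sketch of what "an OR vector reduction compared against zero" means. It is not code from the patch: the `Vec4` type and the function names are hypothetical, and a four-lane 32-bit vector is assumed purely for illustration. The point is that the reduce-then-compare form and the "is every lane zero?" form give the same answer, which is what lets a backend treat the former as an all-zero test.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical 4 x i32 vector, chosen only for illustration.
using Vec4 = std::array<std::uint32_t, 4>;

// OR reduction: fold every lane into one scalar with bitwise OR.
std::uint32_t orReduce(const Vec4 &v) {
  std::uint32_t acc = 0;
  for (std::uint32_t lane : v)
    acc |= lane;
  return acc;
}

// The source-level pattern: reduce first, then compare the scalar with zero.
bool reductionIsZero(const Vec4 &v) { return orReduce(v) == 0; }

// The equivalent question asked directly of the vector.
bool allLanesZero(const Vec4 &v) {
  for (std::uint32_t lane : v)
    if (lane != 0)
      return false;
  return true;
}

// Sanity check that both formulations agree on a given input.
void checkEquivalent(const Vec4 &v) {
  assert(reductionIsZero(v) == allLanesZero(v));
}
```

Calling `checkEquivalent` on any input should never fire the assertion, since a bitwise OR of the lanes is zero exactly when every lane is zero.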
Diff Detail
Unit Tests: Failed
Event Timeline
RKSimon retitled this revision from "[X86][SSE] combineVectorSizedSetCCEquality - handle OR vector reductions" to "[X86][SSE] LowerVectorAllZeroTest - handle OR vector reductions".

Refreshed the patch to be based off LowerVectorAllZeroTest instead of combineVectorSizedSetCCEquality. LowerVectorAllZeroTest already handles scalar reductions and, with a suitable refactor, can handle vector reductions as well.

RKSimon retitled this revision from "[X86][SSE] LowerVectorAllZeroTest - handle OR vector reductions" to "[X86][SSE] MatchVectorAllZeroTest - handle OR vector reductions".

Rebased now that the LowerVectorAllZero refactor is complete.
This revision is now accepted and ready to land. Jun 15 2020, 2:55 PM

Closed by commit rG057c9c7ee00b: [X86][SSE] MatchVectorAllZeroTest - handle OR vector reductions (authored by RKSimon). Jun 16 2020, 2:09 AM

This revision was automatically updated to reflect the committed changes.

This seems to be causing Chromium test failures. Diffing the asm looks suspicious:

    Disassembly of section .text._ZN5glcts18ComputeShaderTestsC2ERN4deqp7ContextE:
    @@ -29795,7 +29795,7 @@
         11c4: 66 0f 63 c3          packsswb %xmm3,%xmm0
         11c8: 66 0f d7 c0          pmovmskb %xmm0,%eax
         11cc: 66 85 c0             test %ax,%ax
    -    11cf: 0f 84 65 01 00 00    je 133a <_ZN5glcts12_GLOBAL__N_128AdvancedPipelineComputeChain3RunEv+0x133a>
    +    11cf: 0f 85 65 01 00 00    jne 133a <_ZN5glcts12_GLOBAL__N_128AdvancedPipelineComputeChain3RunEv+0x133a>
         11d5: 49 8b 45 20          mov 0x20(%r13),%rax
         11d9: 48 8b 00             mov (%rax),%rax
         11dc: 48 8b 40 10          mov 0x10(%rax),%rax
    @@ -29894,7 +29894,7 @@
         137d: 66 0f 63 c2          packsswb %xmm2,%xmm0
         1381: 66 0f d7 c0          pmovmskb %xmm0,%eax
         1385: 66 85 c0             test %ax,%ax
    -    1388: 0f 84 64 01 00 00    je 14f2 <_ZN5glcts12_GLOBAL__N_128AdvancedPipelineComputeChain3RunEv+0x14f2>
    +    1388: 0f 85 64 01 00 00    jne 14f2 <_ZN5glcts12_GLOBAL__N_128AdvancedPipelineComputeChain3RunEv+0x14f2>
         138e: 49 8b 45 20          mov 0x20(%r13),%rax
         1392: 48 8b 00             mov (%rax),%rax
         1395: 48 8b 40 10          mov 0x10(%rax),%rax

It seems the conditions on the jumps there are getting flipped, which really looks like a bug.

I've posted a repro here: https://bugs.chromium.org/p/chromium/issues/detail?id=1097758#c9

I've reverted in 1357c065783e8d66c9db4be59aca389d1dc6c05f in the meantime.
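To make the reported flip concrete, the following is a small model of the `test %ax,%ax` plus conditional jump sequence from the disassembly. It is only an illustrative sketch: the `Jcc` enum and the function names are hypothetical and not part of LLVM. It relies on two facts about x86: `test` of a register against itself sets ZF exactly when the register is zero, and the near-jump encodings `0f 84` and `0f 85` are `je` and `jne` respectively, so the one-byte difference in the diff inverts which inputs take the branch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the two jump conditions seen in the diff.
enum class Jcc { JE, JNE, Other };

// Decode the second opcode byte of a two-byte near Jcc (first byte 0x0f).
Jcc decodeNearJcc(std::uint8_t secondByte) {
  switch (secondByte) {
  case 0x84:
    return Jcc::JE;  // jump if ZF == 1
  case 0x85:
    return Jcc::JNE; // jump if ZF == 0
  default:
    return Jcc::Other; // not modelled in this sketch
  }
}

// `test %ax,%ax` sets ZF exactly when the 16-bit value is zero.
bool zeroFlagAfterTest(std::uint16_t ax) { return ax == 0; }

// Whether the branch is taken for a given pmovmskb result in %ax.
bool branchTaken(Jcc cc, std::uint16_t ax) {
  assert(cc != Jcc::Other && "only je/jne are modelled");
  const bool zf = zeroFlagAfterTest(ax);
  return cc == Jcc::JE ? zf : !zf;
}
```

For any mask value, `branchTaken(Jcc::JE, m)` and `branchTaken(Jcc::JNE, m)` disagree, so swapping one for the other sends every input down the opposite path.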
Revision Contents
Diff 270761
llvm/lib/Target/X86/X86ISelLowering.cpp
llvm/test/CodeGen/X86/pr45378.ll
llvm/test/CodeGen/X86/vector-reduce-or-cmp.ll
Can we assert that CC is ISD::SETEQ or ISD::SETNE here?
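One way such an assertion might look is sketched below. This is not the code that landed: `CondCode`, `X86Cond`, and `allZeroTestCond` are hypothetical stand-ins for the LLVM types and for whatever helper consumes `CC`, reduced so the example compiles on its own. The idea is to reject any non-equality predicate up front, so that the mapping to an x86 condition cannot silently pick the wrong sense.

```cpp
#include <cassert>

// Hypothetical stand-ins for ISD::CondCode and X86::CondCode (illustration only).
enum class CondCode { SETEQ, SETNE, SETLT, SETGT };
enum class X86Cond { COND_E, COND_NE };

// Map the comparison predicate of an all-zero test to an x86 condition.
// Only equality predicates make sense here, so anything else is a caller bug.
X86Cond allZeroTestCond(CondCode CC) {
  assert((CC == CondCode::SETEQ || CC == CondCode::SETNE) &&
         "all-zero test expects SETEQ or SETNE");
  return CC == CondCode::SETEQ ? X86Cond::COND_E : X86Cond::COND_NE;
}
```

With assertions enabled, passing `CondCode::SETLT` would abort immediately instead of quietly being treated as a not-equal test.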