This patch extends to handle OR vector reduction patterns where the result is compared against zero.
Fixes PR45378
Differential D81547
[X86][SSE] MatchVectorAllZeroTest - handle OR vector reductions RKSimon on Jun 10 2020, 3:49 AM. Authored by
Details This patch extends to handle OR vector reduction patterns where the result is compared against zero. Fixes PR45378
Diff Detail
Event Timeline
Comment Actions Refreshed the patch to be based off LowerVectorAllZeroTest instead of combineVectorSizedSetCCEquality, which already handles scalar reductions and with a suitable refactor can handle vector reductions as well.
Comment Actions This seems to be causing Chromium test failures. Diffing the asm looks suspicious: Disassembly of section .text._ZN5glcts18ComputeShaderTestsC2ERN4deqp7ContextE: @@ -29795,7 +29795,7 @@ 11c4: 66 0f 63 c3 packsswb %xmm3,%xmm0 11c8: 66 0f d7 c0 pmovmskb %xmm0,%eax 11cc: 66 85 c0 test %ax,%ax - 11cf: 0f 84 65 01 00 00 je 133a <_ZN5glcts12_GLOBAL__N_128AdvancedPipelineComputeChain3RunEv+0x133a> + 11cf: 0f 85 65 01 00 00 jne 133a <_ZN5glcts12_GLOBAL__N_128AdvancedPipelineComputeChain3RunEv+0x133a> 11d5: 49 8b 45 20 mov 0x20(%r13),%rax 11d9: 48 8b 00 mov (%rax),%rax 11dc: 48 8b 40 10 mov 0x10(%rax),%rax @@ -29894,7 +29894,7 @@ 137d: 66 0f 63 c2 packsswb %xmm2,%xmm0 1381: 66 0f d7 c0 pmovmskb %xmm0,%eax 1385: 66 85 c0 test %ax,%ax - 1388: 0f 84 64 01 00 00 je 14f2 <_ZN5glcts12_GLOBAL__N_128AdvancedPipelineComputeChain3RunEv+0x14f2> + 1388: 0f 85 64 01 00 00 jne 14f2 <_ZN5glcts12_GLOBAL__N_128AdvancedPipelineComputeChain3RunEv+0x14f2> 138e: 49 8b 45 20 mov 0x20(%r13),%rax 1392: 48 8b 00 mov (%rax),%rax 1395: 48 8b 40 10 mov 0x10(%rax),%rax It seems the conditions on the jumps there are getting flipped, which really looks like a bug. I've posted a repro here: https://bugs.chromium.org/p/chromium/issues/detail?id=1097758#c9 I've reverted in 1357c065783e8d66c9db4be59aca389d1dc6c05f in the meantime. |
Code comments like this and the function name don't apply if we're matching a reduction, right?
I think we need to refactor and/or update comments to reduce confusion.