Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/test/CodeGen/X86/known-never-zero.ll | ||
---|---|---|
56 | I'm not sure. The transform happens in: visitCTTZ. See my comment on fshr, but I think there is some inter-dependency with the middle end that might be subtly affecting things. | |
67 | As an aside, this is a todo: Cmp(Cmov(CC, Zero, Non-Zero), Zero) should be (Cmp(CC)) | |
288 | I originally did this, but as I was working on the change I noticed we are also missing non-zero case for funnel-shift in the middle end. I filled in the middle-end case and found it was affecting the CodeGen test and figured it be best to avoid that kind of inter-dependency. Changes to ValueTracking.cpp shouldn't mess with a test in CodeGen/X86. Maybe different llc flags? |
llvm/test/CodeGen/X86/known-never-zero.ll | ||
---|---|---|
288 | CodeGenPrepare (despeculateCountZeros) is trying to prevent CTLZ/CTTZ zero-input branching from appearing, using ValueTracking's variant of isKnownNonZero I wonder if there's something other than CTTZ we could use? |
llvm/test/CodeGen/X86/known-never-zero.ll | ||
---|---|---|
288 | I see. We don't use it in many places. Let me post a patch to try and evaluate SetCC based on known bits, then we can use that. |
llvm/test/CodeGen/X86/known-never-zero.ll | ||
---|---|---|
288 |
Created: D149383, once that goes through I'll update tests to use compare against zero. |
Added fshl and fshr cases as well. But we do have:
define i32 @rotr_known_nonzero(i32 %xx, i32 %y) { ; CHECK-LABEL: rotr_known_nonzero: ; CHECK: # %bb.0: ; CHECK-NEXT: movl %esi, %ecx ; CHECK-NEXT: orl $256, %edi # imm = 0x100 ; CHECK-NEXT: # kill: def $cl killed $cl killed $ecx ; CHECK-NEXT: rorl %cl, %edi ; CHECK-NEXT: testl %edi, %edi ; CHECK-NEXT: je .LBB14_1 ; CHECK-NEXT: # %bb.2: # %cond.false ; CHECK-NEXT: rep bsfl %edi, %eax ; CHECK-NEXT: retq ; CHECK-NEXT: .LBB14_1: ; CHECK-NEXT: movl $32, %eax ; CHECK-NEXT: retq %x = or i32 %xx, 256 %shr = lshr i32 %x, %y %sub = sub i32 32, %y %shl = shl i32 %x, %sub %z = or i32 %shl, %shr %r = call i32 @llvm.cttz.i32(i32 %z, i1 false) ret i32 %r } define i32 @rotr_maybe_zero(i32 %x, i32 %y) { ; CHECK-LABEL: rotr_maybe_zero: ; CHECK: # %bb.0: ; CHECK-NEXT: movl %esi, %ecx ; CHECK-NEXT: # kill: def $cl killed $cl killed $ecx ; CHECK-NEXT: rorl %cl, %edi ; CHECK-NEXT: testl %edi, %edi ; CHECK-NEXT: je .LBB15_1 ; CHECK-NEXT: # %bb.2: # %cond.false ; CHECK-NEXT: rep bsfl %edi, %eax ; CHECK-NEXT: retq ; CHECK-NEXT: .LBB15_1: ; CHECK-NEXT: movl $32, %eax ; CHECK-NEXT: retq %shr = lshr i32 %x, %y %sub = sub i32 32, %y %shl = shl i32 %x, %sub %z = or i32 %shl, %shr %r = call i32 @llvm.cttz.i32(i32 %z, i1 false) ret i32 %r }
Which are generating the rotate instructions at the very least.
Any idea why we sometimes get branches and sometimes CMOV? IIRC CodeGenPrepare does attempt to determine if a CTTZ/CTLZ is non-zero?