This is an archive of the discontinued LLVM Phabricator instance.

[X86] Add tests for checking `isKnownNeverZero`; NFC
ClosedPublic

Authored by goldstein.w.n on Apr 26 2023, 2:49 PM.

Diff Detail

Unit TestsFailed

Event Timeline

goldstein.w.n created this revision.Apr 26 2023, 2:49 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 26 2023, 2:49 PM
Herald added a subscriber: pengfei. · View Herald Transcript
goldstein.w.n requested review of this revision.Apr 26 2023, 2:49 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 26 2023, 2:49 PM
RKSimon added inline comments.Apr 27 2023, 4:05 AM
llvm/test/CodeGen/X86/known-never-zero.ll
57

Any idea why we sometimes get branches and sometimes CMOV? IIRC CodeGenPrepare does attempt to determine if a CTTZ/CTLZ is non-zero?

289

use the fshl/fshr intrinsics here to help ensure you're testing the ROTL/ROTR paths

goldstein.w.n added inline comments.Apr 27 2023, 9:12 AM
llvm/test/CodeGen/X86/known-never-zero.ll
57

I'm not sure. The transform happens in: visitCTTZ. See my comment on fshr, but I think there is some inter-dependency with the middle end that might be subtly affecting things.

68

As an aside, this is a todo:

Cmp(Cmov(CC, Zero, Non-Zero), Zero) should be (Cmp(CC))

289

I originally did this, but as I was working on the change I noticed we are also missing non-zero case for funnel-shift in the middle end. I filled in the middle-end case and found it was affecting the CodeGen test and figured it be best to avoid that kind of inter-dependency. Changes to ValueTracking.cpp shouldn't mess with a test in CodeGen/X86.

Maybe different llc flags?

RKSimon added inline comments.Apr 27 2023, 9:55 AM
llvm/test/CodeGen/X86/known-never-zero.ll
289

CodeGenPrepare (despeculateCountZeros) is trying to prevent CTLZ/CTTZ zero-input branching from appearing, using ValueTracking's variant of isKnownNonZero

I wonder if there's something other than CTTZ we could use?

goldstein.w.n added inline comments.Apr 27 2023, 10:45 AM
llvm/test/CodeGen/X86/known-never-zero.ll
289

I see. We don't use it in many places. Let me post a patch to try and evaluate SetCC based on known bits, then we can use that.

goldstein.w.n added inline comments.Apr 27 2023, 2:34 PM
llvm/test/CodeGen/X86/known-never-zero.ll
289

CodeGenPrepare (despeculateCountZeros) is trying to prevent CTLZ/CTTZ zero-input branching from appearing, using ValueTracking's variant of isKnownNonZero

I wonder if there's something other than CTTZ we could use?

Created: D149383, once that goes through I'll update tests to use compare against zero.

LGTM (although I think you still need rotl/rotr coverage?)

RKSimon accepted this revision.Jul 10 2023, 2:59 AM
This revision is now accepted and ready to land.Jul 10 2023, 2:59 AM

LGTM (although I think you still need rotl/rotr coverage?)

Added fshl and fshr cases as well. But we do have:

define i32 @rotr_known_nonzero(i32 %xx, i32 %y) {
; CHECK-LABEL: rotr_known_nonzero:
; CHECK:       # %bb.0:
; CHECK-NEXT:    movl %esi, %ecx
; CHECK-NEXT:    orl $256, %edi # imm = 0x100
; CHECK-NEXT:    # kill: def $cl killed $cl killed $ecx
; CHECK-NEXT:    rorl %cl, %edi
; CHECK-NEXT:    testl %edi, %edi
; CHECK-NEXT:    je .LBB14_1
; CHECK-NEXT:  # %bb.2: # %cond.false
; CHECK-NEXT:    rep bsfl %edi, %eax
; CHECK-NEXT:    retq
; CHECK-NEXT:  .LBB14_1:
; CHECK-NEXT:    movl $32, %eax
; CHECK-NEXT:    retq
  %x = or i32 %xx, 256
  %shr = lshr i32 %x, %y
  %sub = sub i32 32, %y
  %shl = shl i32 %x, %sub
  %z = or i32 %shl, %shr
  %r = call i32 @llvm.cttz.i32(i32 %z, i1 false)
  ret i32 %r
}

define i32 @rotr_maybe_zero(i32 %x, i32 %y) {
; CHECK-LABEL: rotr_maybe_zero:
; CHECK:       # %bb.0:
; CHECK-NEXT:    movl %esi, %ecx
; CHECK-NEXT:    # kill: def $cl killed $cl killed $ecx
; CHECK-NEXT:    rorl %cl, %edi
; CHECK-NEXT:    testl %edi, %edi
; CHECK-NEXT:    je .LBB15_1
; CHECK-NEXT:  # %bb.2: # %cond.false
; CHECK-NEXT:    rep bsfl %edi, %eax
; CHECK-NEXT:    retq
; CHECK-NEXT:  .LBB15_1:
; CHECK-NEXT:    movl $32, %eax
; CHECK-NEXT:    retq
  %shr = lshr i32 %x, %y
  %sub = sub i32 32, %y
  %shl = shl i32 %x, %sub
  %z = or i32 %shl, %shr
  %r = call i32 @llvm.cttz.i32(i32 %z, i1 false)
  ret i32 %r
}

Which are generating the rotate instructions at the very least.

RKSimon accepted this revision.Jul 10 2023, 10:32 AM

LGTM - cheers

This revision was landed with ongoing or failed builds.Jul 12 2023, 3:18 PM
This revision was automatically updated to reflect the committed changes.