Details

Reviewers

craig.topper
RKSimon
wenlei
modimo
zino
spatel

Commits

rGbaae814377bc: Add tests for D121320

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

MatzeB created this revision. Mar 9 2022, 11:31 AM

Herald added a project: Restricted Project. Mar 9 2022, 11:31 AM

Herald added subscribers: pengfei, mcrosier.

MatzeB requested review of this revision. Mar 9 2022, 11:31 AM

Herald added a project: Restricted Project. Mar 9 2022, 11:31 AM

Herald added a subscriber: llvm-commits.

MatzeB added a child revision: D121320: X86ISelDAGToDAG: Transform TEST + MOV64ri to SHR + TEST. Mar 9 2022, 11:35 AM

MatzeB retitled this revision from Precommit tests to Precommit tests for D121320.

MatzeB retitled this revision from Precommit tests for D121320 to Tests for D121320.

Harbormaster completed remote builds in B153407: Diff 414171. Mar 9 2022, 12:09 PM

MatzeB updated this revision to Diff 414234. Mar 9 2022, 4:02 PM

Harbormaster completed remote builds in B153456: Diff 414234. Mar 9 2022, 4:49 PM

LGTM - if there's any overlap between D121320 and D121147, it will be easier to see if all tests are committed.

This revision is now accepted and ready to land. Mar 10 2022, 5:18 AM

I looked at the effect of the 2 proposed patches together, and we should add at least 2 more tests (can be added to this commit or separately):

define i1 @shifted_mask_testb_lower32(i64 %a) {
  %v0 = and i64 %a, 66846720  ; 0xff << 18
  %v1 = icmp ne i64 %v0, 0
  ret i1 %v1
}

define i1 @shifted_mask_testw_lower32(i64 %a) {
  %v0 = and i64 %a, 131072 ; 0xffff << 1
  %v1 = icmp ne i64 %v0, 0
  ret i1 %v1
}

The difference for these is that the shifted mask does not extend into the upper 32 bits of the value. We probably want to convert these to use shifts too, and we'll handle that with both patches applied.

In D121319#3372603, @spatel wrote:
> I looked at the effect of the 2 proposed patches together, and we should add at least 2 more tests (can be added to this commit or separately):
>
> define i1 @shifted_mask_testb_lower32(i64 %a) {
>   %v0 = and i64 %a, 66846720  ; 0xff << 18
>   %v1 = icmp ne i64 %v0, 0
>   ret i1 %v1
> }

This codegens as testl $66846720, %edi ; setne %al without any of our changes, which to me seems to be as good as it gets. Using a shift for the "shifted_mask" cases only makes sense when the constant value becomes so big that it no longer fits the 32-bit immediates on X86.

I will add the function anyway as an example of something that should not be optimized. Does that make sense?

> define i1 @shifted_mask_testw_lower32(i64 %a) {
>   %v0 = and i64 %a, 131072 ; 0xffff << 1

I guess this should have been 131070 :)

>   %v1 = icmp ne i64 %v0, 0
>   ret i1 %v1
> }

This ends up being just another variant of the other function, where it is best to just use a testl with an immediate, so I think I can skip this.

> The difference for these is that the shifted mask does not extend into the upper 32 bits of the value. We probably want to convert these to use shifts too, and we'll handle that with both patches applied.

I think those should keep using testl + immediate.
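
The reply above argues that a shift only pays off once the mask no longer fits a 32-bit immediate. A minimal Python sketch of that reasoning follows. The helper names (fits_in_low_32_bits, and_test, shift_test) are illustrative assumptions and do not correspond to code in D121320 or LLVM. The sketch checks that testing against a shifted mask gives the same answer as shifting the value first and testing the narrow mask, and that 0xff << 18 fits in the low 32 bits while 0xff << 50 does not.

```python
import random

LOW32 = 0xFFFFFFFF


def fits_in_low_32_bits(mask: int) -> bool:
    """True if the mask only touches bits 0..31, so a 32-bit test can encode it."""
    return 0 <= mask <= LOW32


def and_test(a: int, mask: int) -> bool:
    """Model of 'and' followed by 'icmp ne ..., 0'."""
    return (a & mask) != 0


def shift_test(a: int, shift: int, width: int) -> bool:
    """Model of shifting right first, then testing only the low 'width' bits."""
    return ((a >> shift) & ((1 << width) - 1)) != 0


# Compare both forms on fixed edge cases plus reproducible random 64-bit values.
rng = random.Random(0)
samples = [0, 1 << 49, 1 << 50, 1 << 57, 1 << 58, (1 << 64) - 1]
samples += [rng.getrandbits(64) for _ in range(1000)]
equivalent = all(and_test(a, 0xFF << 50) == shift_test(a, 50, 8) for a in samples)

print(equivalent)
print(fits_in_low_32_bits(0xFF << 18), fits_in_low_32_bits(0xFF << 50))
```

The model only covers the zero/non-zero result of the comparison; it says nothing about which instruction sequence is faster on a given CPU.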

MatzeB updated this revision to Diff 414405. Mar 10 2022, 9:22 AM

Harbormaster completed remote builds in B153585: Diff 414405. Mar 10 2022, 10:12 AM

In D121319#3372983, @MatzeB wrote:
> In D121319#3372603, @spatel wrote:
>> I looked at the effect of the 2 proposed patches together, and we should add at least 2 more tests (can be added to this commit or separately):
>>
>> define i1 @shifted_mask_testb_lower32(i64 %a) {
>>   %v0 = and i64 %a, 66846720  ; 0xff << 18
>>   %v1 = icmp ne i64 %v0, 0
>>   ret i1 %v1
>> }
>
> This codegens as testl $66846720, %edi ; setne %al without any of our changes, which to me seems to be as good as it gets. Using a shift for the "shifted_mask" cases only makes sense when the constant value becomes so big that it no longer fits the 32-bit immediates on X86.
>
> I will add the function anyway as an example of something that should not be optimized. Does that make sense?
>
>> define i1 @shifted_mask_testw_lower32(i64 %a) {
>>   %v0 = and i64 %a, 131072 ; 0xffff << 1
>
> I guess this should have been 131070 :)

Oops - yes, I missed dropping the last bit.

>> The difference for these is that the shifted mask does not extend into the upper 32 bits of the value. We probably want to convert these to use shifts too, and we'll handle that with both patches applied.
>
> I think those should keep using testl + immediate.

I'm not sure if there's a universal answer. As you mentioned, it may depend on throughput vs. saving on instruction size. Either way, the tests can be there to show current/expected codegen.
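
The constant mix-up acknowledged above (131072 where 0xffff << 1 was intended) can be checked mechanically. The following Python sketch decomposes a constant into a run of ones and a shift amount. The function name decompose_shifted_mask is a hypothetical helper for illustration, not an LLVM API.

```python
def decompose_shifted_mask(value: int):
    """Return (width, shift) if value is a contiguous run of ones shifted left, else None."""
    if value <= 0:
        return None
    shift = (value & -value).bit_length() - 1  # index of the lowest set bit
    run = value >> shift
    if run & (run + 1):  # a run of ones has the form 2**k - 1
        return None
    return run.bit_length(), shift


for constant in (66846720, 131072, 131070):
    print(constant, hex(constant), decompose_shifted_mask(constant))
```

In this model 66846720 decomposes into width 8 and shift 18 (0xff << 18), 131070 into width 16 and shift 1 (0xffff << 1), and 131072 into a single bit at position 17, which matches the correction made in the thread.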

Added test variants with more than 1 use for the constant/and.

Harbormaster completed remote builds in B154367: Diff 415490. Mar 15 2022, 10:23 AM

rebase

Harbormaster completed remote builds in B154368: Diff 415491. Mar 15 2022, 11:22 AM

This revision was landed with ongoing or failed builds. Mar 15 2022, 2:18 PM

Closed by commit rGbaae814377bc: Add tests for D121320 (authored by MatzeB).

This revision was automatically updated to reflect the committed changes.

MatzeB added a commit: rGbaae814377bc: Add tests for D121320.

Diff 415584

llvm/test/CodeGen/X86/cmp.ll

 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
 ; RUN: llc < %s -mtriple=x86_64-unknown-unknown -show-mc-encoding | FileCheck %s

 @d = dso_local global i8 0, align 1
+@d64 = dso_local global i64 0

 define i32 @test1(i32 %X, i32* %y) nounwind {
 ; CHECK-LABEL: test1:
 ; CHECK: # %bb.0: # %entry
 ; CHECK-NEXT: cmpl $0, (%rsi) # encoding: [0x83,0x3e,0x00]
 ; CHECK-NEXT: je .LBB0_2 # encoding: [0x74,A]
 ; CHECK-NEXT: # fixup A - offset: 1, value: .LBB0_2-1, kind: FK_PCRel_1
 ; CHECK-NEXT: # %bb.1: # %cond_true
 [... 576 lines not shown ...]
 ; CHECK-NEXT: setne %al # encoding: [0x0f,0x95,0xc0]
 ; CHECK-NEXT: retq # encoding: [0xc3]
   %and = and i32 %val, 31
   %cmp = icmp ne i32 %and, 0
   %ret = zext i1 %cmp to i32
   ret i32 %ret
 }

+define i1 @shifted_mask64_testb(i64 %a) {
+; CHECK-LABEL: shifted_mask64_testb:
+; CHECK: # %bb.0:
+; CHECK-NEXT: movabsq $287104476244869120, %rax # encoding: [0x48,0xb8,0x00,0x00,0x00,0x00,0x00,0x00,0xfc,0x03]
+; CHECK-NEXT: # imm = 0x3FC000000000000
+; CHECK-NEXT: testq %rax, %rdi # encoding: [0x48,0x85,0xc7]
+; CHECK-NEXT: setne %al # encoding: [0x0f,0x95,0xc0]
+; CHECK-NEXT: retq # encoding: [0xc3]
+  %v0 = and i64 %a, 287104476244869120 ; 0xff << 50
+  %v1 = icmp ne i64 %v0, 0
+  ret i1 %v1
+}

+define i1 @shifted_mask64_testw(i64 %a) {
+; CHECK-LABEL: shifted_mask64_testw:
+; CHECK: # %bb.0:
+; CHECK-NEXT: movabsq $562941363486720, %rax # encoding: [0x48,0xb8,0x00,0x00,0x00,0x00,0xfe,0xff,0x01,0x00]
+; CHECK-NEXT: # imm = 0x1FFFE00000000
+; CHECK-NEXT: testq %rax, %rdi # encoding: [0x48,0x85,0xc7]
+; CHECK-NEXT: setne %al # encoding: [0x0f,0x95,0xc0]
+; CHECK-NEXT: retq # encoding: [0xc3]
+  %v0 = and i64 %a, 562941363486720 ; 0xffff << 33
+  %v1 = icmp ne i64 %v0, 0
+  ret i1 %v1
+}

+define i1 @shifted_mask64_testl(i64 %a) {
+; CHECK-LABEL: shifted_mask64_testl:
+; CHECK: # %bb.0:
+; CHECK-NEXT: movabsq $549755813760, %rax # encoding: [0x48,0xb8,0x80,0xff,0xff,0xff,0x7f,0x00,0x00,0x00]
+; CHECK-NEXT: # imm = 0x7FFFFFFF80
+; CHECK-NEXT: testq %rax, %rdi # encoding: [0x48,0x85,0xc7]
+; CHECK-NEXT: sete %al # encoding: [0x0f,0x94,0xc0]
+; CHECK-NEXT: retq # encoding: [0xc3]
+  %v0 = and i64 %a, 549755813760 ; 0xffffffff << 7
+  %v1 = icmp eq i64 %v0, 0
+  ret i1 %v1
+}

+define i1 @shifted_mask64_extra_use_const(i64 %a) {
+; CHECK-LABEL: shifted_mask64_extra_use_const:
+; CHECK: # %bb.0:
+; CHECK-NEXT: movabsq $287104476244869120, %rcx # encoding: [0x48,0xb9,0x00,0x00,0x00,0x00,0x00,0x00,0xfc,0x03]
+; CHECK-NEXT: # imm = 0x3FC000000000000
+; CHECK-NEXT: testq %rcx, %rdi # encoding: [0x48,0x85,0xcf]
+; CHECK-NEXT: setne %al # encoding: [0x0f,0x95,0xc0]
+; CHECK-NEXT: movq %rcx, d64(%rip) # encoding: [0x48,0x89,0x0d,A,A,A,A]
+; CHECK-NEXT: # fixup A - offset: 3, value: d64-4, kind: reloc_riprel_4byte
+; CHECK-NEXT: retq # encoding: [0xc3]
+  %v0 = and i64 %a, 287104476244869120 ; 0xff << 50
+  %v1 = icmp ne i64 %v0, 0
+  store i64 287104476244869120, i64* @d64
+  ret i1 %v1
+}

+define i1 @shifted_mask64_extra_use_and(i64 %a) {
+; CHECK-LABEL: shifted_mask64_extra_use_and:
+; CHECK: # %bb.0:
+; CHECK-NEXT: movabsq $287104476244869120, %rcx # encoding: [0x48,0xb9,0x00,0x00,0x00,0x00,0x00,0x00,0xfc,0x03]
+; CHECK-NEXT: # imm = 0x3FC000000000000
+; CHECK-NEXT: andq %rdi, %rcx # encoding: [0x48,0x21,0xf9]
+; CHECK-NEXT: setne %al # encoding: [0x0f,0x95,0xc0]
+; CHECK-NEXT: movq %rcx, d64(%rip) # encoding: [0x48,0x89,0x0d,A,A,A,A]
+; CHECK-NEXT: # fixup A - offset: 3, value: d64-4, kind: reloc_riprel_4byte
+; CHECK-NEXT: retq # encoding: [0xc3]
+  %v0 = and i64 %a, 287104476244869120 ; 0xff << 50
+  %v1 = icmp ne i64 %v0, 0
+  store i64 %v0, i64* @d64
+  ret i1 %v1
+}

+define i1 @shifted_mask32_testl_immediate(i64 %a) {
+; CHECK-LABEL: shifted_mask32_testl_immediate:
+; CHECK: # %bb.0:
+; CHECK-NEXT: testl $66846720, %edi # encoding: [0xf7,0xc7,0x00,0x00,0xfc,0x03]
+; CHECK-NEXT: # imm = 0x3FC0000
+; CHECK-NEXT: setne %al # encoding: [0x0f,0x95,0xc0]
+; CHECK-NEXT: retq # encoding: [0xc3]
+  %v0 = and i64 %a, 66846720 ; 0xff << 18
+  %v1 = icmp ne i64 %v0, 0
+  ret i1 %v1
+}

+define i1 @shifted_mask32_extra_use_const(i64 %a) {
+; CHECK-LABEL: shifted_mask32_extra_use_const:
+; CHECK: # %bb.0:
+; CHECK-NEXT: testl $66846720, %edi # encoding: [0xf7,0xc7,0x00,0x00,0xfc,0x03]
+; CHECK-NEXT: # imm = 0x3FC0000
+; CHECK-NEXT: setne %al # encoding: [0x0f,0x95,0xc0]
+; CHECK-NEXT: movq $66846720, d64(%rip) # encoding: [0x48,0xc7,0x05,A,A,A,A,0x00,0x00,0xfc,0x03]
+; CHECK-NEXT: # fixup A - offset: 3, value: d64-8, kind: reloc_riprel_4byte
+; CHECK-NEXT: # imm = 0x3FC0000
+; CHECK-NEXT: retq # encoding: [0xc3]
+  %v0 = and i64 %a, 66846720 ; 0xff << 18
+  %v1 = icmp ne i64 %v0, 0
+  store i64 66846720, i64* @d64
+  ret i1 %v1
+}

+define i1 @shifted_mask32_extra_use_and(i64 %a) {
+; CHECK-LABEL: shifted_mask32_extra_use_and:
+; CHECK: # %bb.0:
+; CHECK-NEXT: andq $66846720, %rdi # encoding: [0x48,0x81,0xe7,0x00,0x00,0xfc,0x03]
+; CHECK-NEXT: # imm = 0x3FC0000
+; CHECK-NEXT: setne %al # encoding: [0x0f,0x95,0xc0]
+; CHECK-NEXT: movq %rdi, d64(%rip) # encoding: [0x48,0x89,0x3d,A,A,A,A]
+; CHECK-NEXT: # fixup A - offset: 3, value: d64-4, kind: reloc_riprel_4byte
+; CHECK-NEXT: retq # encoding: [0xc3]
+  %v0 = and i64 %a, 66846720 ; 0xff << 18
+  %v1 = icmp ne i64 %v0, 0
+  store i64 %v0, i64* @d64
+  ret i1 %v1
+}

 define { i64, i64 } @pr39968(i64, i64, i32) {
 ; CHECK-LABEL: pr39968:
 ; CHECK: # %bb.0:
 ; CHECK-NEXT: xorl %eax, %eax # encoding: [0x31,0xc0]
 ; CHECK-NEXT: testb $64, %dl # encoding: [0xf6,0xc2,0x40]
 ; CHECK-NEXT: cmovneq %rdi, %rsi # encoding: [0x48,0x0f,0x45,0xf7]
 ; CHECK-NEXT: cmovneq %rdi, %rax # encoding: [0x48,0x0f,0x45,0xc7]
 ; CHECK-NEXT: movq %rsi, %rdx # encoding: [0x48,0x89,0xf2]
 [... 9 lines not shown ...]

 ; Make sure we use a 32-bit comparison without an extend based on the input
 ; being pre-sign extended by caller.
 define i32 @pr42189(i16 signext %c) {
 ; CHECK-LABEL: pr42189:
 ; CHECK: # %bb.0: # %entry
 ; CHECK-NEXT: cmpl $32767, %edi # encoding: [0x81,0xff,0xff,0x7f,0x00,0x00]
 ; CHECK-NEXT: # imm = 0x7FFF
-; CHECK-NEXT: jne .LBB37_2 # encoding: [0x75,A]
-; CHECK-NEXT: # fixup A - offset: 1, value: .LBB37_2-1, kind: FK_PCRel_1
+; CHECK-NEXT: jne .LBB45_2 # encoding: [0x75,A]
+; CHECK-NEXT: # fixup A - offset: 1, value: .LBB45_2-1, kind: FK_PCRel_1
 ; CHECK-NEXT: # %bb.1: # %if.then
 ; CHECK-NEXT: jmp g@PLT # TAILCALL
 ; CHECK-NEXT: # encoding: [0xeb,A]
 ; CHECK-NEXT: # fixup A - offset: 1, value: g@PLT-1, kind: FK_PCRel_1
-; CHECK-NEXT: .LBB37_2: # %if.end
+; CHECK-NEXT: .LBB45_2: # %if.end
 ; CHECK-NEXT: jmp f@PLT # TAILCALL
 ; CHECK-NEXT: # encoding: [0xeb,A]
 ; CHECK-NEXT: # fixup A - offset: 1, value: f@PLT-1, kind: FK_PCRel_1
 entry:
   %cmp = icmp eq i16 %c, 32767
   br i1 %cmp, label %if.then, label %if.end

 if.then: ; preds = %entry
 [... 14 lines not shown ...]
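
The instruction-size side of the trade-off raised in the discussion can be read off the encoding annotations in the CHECK lines above. The following Python sketch counts the bytes in such an annotation. The helper encoding_length and the constant names are assumptions made for illustration, and the three input strings are copied from the diff.

```python
import re


def encoding_length(check_line: str) -> int:
    """Count the bytes listed in an llc '# encoding: [...]' annotation."""
    match = re.search(r"encoding: \[([^\]]*)\]", check_line)
    if match is None:
        raise ValueError("no encoding annotation found")
    # Fixup placeholders such as 'A' stand for one byte each.
    return len(match.group(1).split(","))


MOVABSQ = ("; CHECK-NEXT: movabsq $287104476244869120, %rax "
           "# encoding: [0x48,0xb8,0x00,0x00,0x00,0x00,0x00,0x00,0xfc,0x03]")
TESTQ = "; CHECK-NEXT: testq %rax, %rdi # encoding: [0x48,0x85,0xc7]"
TESTL = ("; CHECK-NEXT: testl $66846720, %edi "
         "# encoding: [0xf7,0xc7,0x00,0x00,0xfc,0x03]")

wide_bytes = encoding_length(MOVABSQ) + encoding_length(TESTQ)
narrow_bytes = encoding_length(TESTL)
print(wide_bytes, narrow_bytes)
```

For these lines the 64-bit mask costs 13 bytes (movabsq plus testq) and the 32-bit immediate form costs 6 bytes. Byte counts alone do not settle the throughput question mentioned in the thread.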

This is an archive of the discontinued LLVM Phabricator instance.

Tests for D121320
ClosedPublic
