This is an archive of the discontinued LLVM Phabricator instance.

[DAGCombiner] fold assertzexts separated by trunc
ClosedPublic

Authored by spatel on Aug 22 2017, 11:26 AM.


Details

Reviewers

aivchenk
craig.topper
aaboud
delena
zvi
qcolombet
andreadb
efriedma
RKSimon

Commits

rGf31b1a00ea4d: [DAGCombiner] fold assertzexts separated by trunc
rL313577: [DAGCombiner] fold assertzexts separated by trunc

Summary

If we have an AssertZext of a truncated value that has already been AssertZext'ed, we can assert on the wider source op to improve the zext-y knowledge.

This is an implementation of the change suggested in D36890 (and I think it would obsolete that patch). Actually, I don't think suggestion #1 that I made there is valid, but this achieves the same result in a safe way by matching a larger pattern.

The fact that this only affects x86 makes me suspicious that x86 is doing something suboptimal to create these AssertZexts in the first place, but I'm not sure.

Diff Detail

Event Timeline

spatel created this revision. Aug 22 2017, 11:26 AM

Herald added a subscriber: mcrosier. Aug 22 2017, 11:26 AM

Here's the effect on "select_C1_C2_zeroext" nodes in case that makes the transform clearer:

      t4: i32 = AssertZext t2, ValueType:ch:i8
    t5: i8 = truncate t4
  t7: i8 = AssertZext t5, ValueType:ch:i1
t8: i1 = truncate t7

-->

  t16: i32 = AssertZext t2, ValueType:ch:i1
t18: i1 = truncate t16
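
As a rough check of why the combined assertion is safe for these particular types, here is a minimal standalone sketch in plain C++ (not LLVM code). It models the i32 value as a `uint32_t`; the helper names `fitsInBits` and `truncTo` are made up for illustration. It compares "both original asserts hold" against "the single folded assert holds" over a range of inputs.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if 'x' has no bits set at or above position
// 'bits', which is what an AssertZext from a 'bits'-wide type promises.
bool fitsInBits(uint32_t x, unsigned bits) {
  return bits >= 32 || (x >> bits) == 0;
}

// Hypothetical helper: models 'truncate' by keeping only the low 'bits' bits.
uint32_t truncTo(uint32_t x, unsigned bits) {
  assert(bits > 0 && "cannot truncate to a zero-width type");
  return bits >= 32 ? x : (x & ((1u << bits) - 1u));
}

// Both asserts of the original pattern:
//   t4: i32 = AssertZext t2, i8
//   t5: i8  = truncate t4
//   t7: i8  = AssertZext t5, i1
bool originalAssertsHold(uint32_t x) {
  return fitsInBits(x, 8) && fitsInBits(truncTo(x, 8), 1);
}

// The single assert after the fold:
//   t16: i32 = AssertZext t2, i1
bool foldedAssertHolds(uint32_t x) {
  return fitsInBits(x, 1);
}

// Compares the two predicates for every x in [0, limit).
bool foldIsEquivalentUpTo(uint32_t limit) {
  for (uint32_t x = 0; x < limit; ++x)
    if (originalAssertsHold(x) != foldedAssertHolds(x))
      return false;
  return true;
}
```

Calling `foldIsEquivalentUpTo(1u << 16)` covers every i8 bit pattern plus values with higher bits set. The sketch only covers the i32/i8/i1 combination shown in the dump, not arbitrary type triples.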

aivchenk mentioned this in D36890: [X86] Emit testl instead of testb for select condition where possible. Aug 22 2017, 2:02 PM

I have one comment below.
By the way, I noticed that the double AssertZext occurs only for x86_64 (on i386 it does not happen).
It might be worth checking where it comes from, regardless of this patch.

lib/CodeGen/SelectionDAG/DAGCombiner.cpp
7925	What about the other case:
assert (trunc (assert X, i1) to iN), i8 --> trunc (assert X, i1) to iN
Don't we want to cover this as well? Or are we sure that such a case will never occur?

That's certainly a much better approach than the original. Couple of comments/questions:

We eliminate the "outer" assert by making the "inner" one stronger. In a sense, this moves the information about the extension further away from the use. Is there some sort of canonicalization or convention for that? The other way of removing the redundancy would be to just eliminate the weaker "inner" assert, though that would not solve the initial problem of handling AssertZext in the backend. But maybe the problem is that we don't handle AssertZext properly?
Should we implement the same idea for AssertSext?

In D37017#849197, @aaboud wrote:

I have one comment below.
By the way, I noticed that the double AssertZext occurs only for x86_64 (on i386 it does not happen).
It might be worth checking where it comes from, regardless of this patch.

i386 is (mostly?) immune from this because it passes function arguments on the stack. The first assertzext (the one with i32/i8 types in most of these examples) is generated by X86TargetLowering::LowerFormalArguments(). We always get the zext assert source value type as i8 for an i1 arg because that's what is specified in "RegisterTypeForVT", but I think that's the root cause of the bug. Should we be using the IR type (i1) at this stage when creating assertzext nodes and the subsequent truncate node?

Adding some more potential reviewers. This is my first look at function arg lowering. Is the intermediate trunc to i8 of the formal arg necessary for other reasons? The naive change to X86TargetLowering::LowerFormalArguments() doesn't work for weird vector types or AVX512 masks:

@@ -3015,7 +3015,7 @@
           // Promoting a mask type (v*i1) into a register of type i64/i32/i16/i8
           ArgValue = lowerRegToMasks(ArgValue, VA.getValVT(), RegVT, dl, DAG);
         } else
-          ArgValue = DAG.getNode(ISD::TRUNCATE, dl, VA.getValVT(), ArgValue);
+          ArgValue = DAG.getNode(ISD::TRUNCATE, dl, Ins[InsIndex].ArgVT, ArgValue);
       }
     } else {
       assert(VA.isMemLoc());

We're trying to avoid adding a DAG combine for this pattern by not creating the pattern in the first place:

      t4: i32 = AssertZext t2, ValueType:ch:i8
    t5: i8 = truncate t4
  t7: i8 = AssertZext t5, ValueType:ch:i1
t8: i1 = truncate t7

-->

  t16: i32 = AssertZext t2, ValueType:ch:i1
t18: i1 = truncate t16

spatel mentioned this in D37069: [x86] use the IR type of formal args to create assertzext/assertsext and scalar truncate nodes. Aug 23 2017, 9:13 AM

In D37017#849272, @aivchenk wrote:

Should we implement the same idea for AssertSext?

Yes, we should. But after discovering that this code almost already exists in the Mips backend, I'd like to make that a small follow-up just to reduce the patch size and risk.

lib/CodeGen/SelectionDAG/DAGCombiner.cpp
7925	I haven't seen that occur yet, but since it's easy to handle, I'll make the fold more general.
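
One way to picture the more general fold is as a choice of which width to assert on the wide value. The following is a minimal sketch in plain C++, not the code from the patch: the function name `foldedAssertBits` is hypothetical, and the extra condition on the truncate width is an assumption of this sketch based on which bits each assert actually constrains, not something stated in this review.

```cpp
// Hypothetical model of the generalized fold. All widths are in bits:
//   innerBits: type asserted by the inner AssertZext on the wide value X
//   truncBits: width iN of the truncate result
//   outerBits: type asserted by the outer AssertZext on the truncated value
// Returns the width to assert on X after folding, or 0 if this sketch
// declines to fold.
unsigned foldedAssertBits(unsigned innerBits, unsigned truncBits,
                          unsigned outerBits) {
  // The inner assert is at least as strong as the outer one, so the outer
  // assert adds no information and can simply be dropped.
  if (innerBits <= outerBits)
    return innerBits;

  // The outer assert is stronger, but it only constrains bits below
  // truncBits. The inner assert must cover every bit from truncBits upward
  // for the stronger claim to hold on all of X (assumption of this sketch).
  if (innerBits <= truncBits)
    return outerBits;

  return 0;
}
```

For the pattern in this review (inner i8, truncate to i8, outer i1) the sketch returns 1, and for the reversed case raised in the inline comment (inner i1, outer i8) it also returns 1.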

Patch updated:
After finding that the same functionality exists in trunk, I think it makes sense to go ahead with this patch. Effectively, we're just taking a general fold that was hiding in the Mips backend and applying it to all targets.

Also, given that there is already another fold for assertzext(assertzext(x)), I assume there's already some scenario under which it makes sense to have folds for these nodes besides the one we're looking at. Therefore, there's not much benefit to trying to stub this pattern out at the source. We just account for the potential inefficiency in argument lowering with this fold.

Herald added a subscriber: sdardis. Sep 1 2017, 11:43 AM

RKSimon added a reviewer: RKSimon. Sep 4 2017, 2:24 PM

Ping.

efriedma added a subscriber: hans. Sep 13 2017, 3:42 PM

efriedma added inline comments.

test/CodeGen/X86/negate-i1.ll
134	This seems kind of scary... is an "i1 zeroext" actually guaranteed to be zero-extended to 64 bits?

craig.topper added inline comments. Sep 13 2017, 3:49 PM

test/CodeGen/X86/negate-i1.ll
134	Nope. It hit PR28540 which should be fixed by D37729

spatel added inline comments. Sep 13 2017, 4:12 PM

test/CodeGen/X86/negate-i1.ll

134	Should I make this patch dependent on that fix? Are we correct when we go from:

        t4: i32 = AssertZext t2, ValueType:ch:i8
      t5: i8 = truncate t4
    t7: i8 = AssertZext t5, ValueType:ch:i1
  t8: i1 = truncate t7
t9: i64 = sign_extend t8

to:

    t16: i32 = AssertZext t2, ValueType:ch:i1
  t18: i64 = any_extend t16
t15: i64 = sign_extend_inreg t18, ValueType:ch:i1
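
At the level of DAG semantics, the two forms can be compared with a small standalone model. This is a sketch in plain C++ with hypothetical helper names. It treats the upper half produced by `any_extend` as arbitrary junk and checks that `sign_extend_inreg` from i1 still gives the same answer, because it reads only bit 0. It does not model register-level codegen, which is where the PR28540 concern applies.

```cpp
#include <cstdint>

// Models 'sign_extend (truncate x to i1) to i64' from the original DAG:
// bit 0 of x is replicated into all 64 result bits.
int64_t sextOfTruncI1(uint32_t x) {
  return -static_cast<int64_t>(x & 1u);
}

// Models 'any_extend' from i32 to i64. The upper 32 bits are unspecified,
// so the caller passes arbitrary 'junk' to stand in for them.
uint64_t anyExtend(uint32_t x, uint32_t junk) {
  return (static_cast<uint64_t>(junk) << 32) | x;
}

// Models 'sign_extend_inreg v, i1' on an i64: only bit 0 of v matters.
int64_t signExtendInRegI1(uint64_t v) {
  return -static_cast<int64_t>(v & 1u);
}
```

For example, `signExtendInRegI1(anyExtend(1, 0xDEADBEEF))` and `sextOfTruncI1(1)` both evaluate to -1, and both give 0 when the low bit is clear.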

PR28540 is fixed now.

In D37017#874229, @craig.topper wrote:

PR28540 is fixed now.

Thanks - so now for the questionable test, we'll get movl+negq rather than negq+movq (no changes to the output from this patch).

Any other known problems?

Patch updated:
No code changes, but test output updated after the fix from rL313563.

In D37017#874263, @spatel wrote:

In D37017#874229, @craig.topper wrote:

PR28540 is fixed now.

Thanks - so now for the questionable test, we'll get movl+negq rather than negq+movq (no changes to the output from this patch).

That comment wasn't clear. I meant we still get the same DAG nodes for the questionable test from this combine:

    t16: i32 = AssertZext t2, ValueType:ch:i1
  t18: i64 = any_extend t16
t15: i64 = sign_extend_inreg t18, ValueType:ch:i1

LGTM

This revision is now accepted and ready to land. Sep 18 2017, 2:25 PM

Closed by commit rL313577: [DAGCombiner] fold assertzexts separated by trunc (authored by spatel). Sep 18 2017, 3:07 PM

This revision was automatically updated to reflect the committed changes.

spatel mentioned this in rL315206: [DAG] combine assertsexts around a trunc. Oct 9 2017, 8:24 AM

nikic mentioned this in D126952: [DAGCombiner] Remove overzealous assertion when folding assert+trunc+assert (PR55846). Jun 3 2022, 2:22 AM

nikic mentioned this in rG5a64bc207ee0: [DAGCombiner] Remove overzealous assertion when folding assert+trunc+assert…. Jun 7 2022, 12:50 AM

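Revision Contents

Path                                              Size
lib/CodeGen/SelectionDAG/DAGCombiner.cpp          20 lines
test/CodeGen/X86/bool-zext.ll                     3 lines
test/CodeGen/X86/critical-edge-split-2.ll         2 lines
test/CodeGen/X86/fp128-select.ll                  4 lines
test/CodeGen/X86/illegal-bitfield-loadstore.ll    53 lines
test/CodeGen/X86/mask-negated-bool.ll             4 lines
test/CodeGen/X86/negate-i1.ll                     14 lines
test/CodeGen/X86/select_const.ll                  19 lines
test/CodeGen/X86/sext-i1.ll                       4 lines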

Diff 112204

lib/CodeGen/SelectionDAG/DAGCombiner.cpp

[7,901 lines not shown; context: SDValue DAGCombiner::visitANY_EXTEND(SDNode *N)]
   }

   return SDValue();
 }

 SDValue DAGCombiner::visitAssertZext(SDNode *N) {
   SDValue N0 = N->getOperand(0);
   SDValue N1 = N->getOperand(1);
-  EVT EVT = cast<VTSDNode>(N1)->getVT();
+  EVT AssertVT = cast<VTSDNode>(N1)->getVT();

   // fold (assertzext (assertzext x, vt), vt) -> (assertzext x, vt)
   if (N0.getOpcode() == ISD::AssertZext &&
-      EVT == cast<VTSDNode>(N0.getOperand(1))->getVT())
+      AssertVT == cast<VTSDNode>(N0.getOperand(1))->getVT())
     return N0;

+  if (N0.getOpcode() == ISD::TRUNCATE && N0.hasOneUse() &&
+      N0.getOperand(0).getOpcode() == ISD::AssertZext) {
+    // We have an assert, truncate, assert sandwich. If the later assert has a
+    // smaller asserting type, make the first assert stronger by asserting on
+    // that smaller type. This eliminates the later assert:
+    // assert (trunc (assert X, i8) to iN), i1 --> trunc (assert X, i1) to iN
+    SDValue BigAZ = N0.getOperand(0);
+    EVT BigAZVT = cast<VTSDNode>(BigAZ.getOperand(1))->getVT();
+    if (AssertVT.getSizeInBits() < BigAZVT.getSizeInBits()) {
+      SDLoc DL(N);
+      SDValue NewAZ = DAG.getNode(ISD::AssertZext, DL, BigAZ.getValueType(),
+                                  BigAZ.getOperand(0), N1);
+      return DAG.getNode(ISD::TRUNCATE, DL, N->getValueType(0), NewAZ);
+    }
+  }
+
   return SDValue();
 }

 /// If the result of a wider load is shifted to right of N bits and then
 /// truncated to a narrower type and where N is a multiple of number of bits of
 /// the narrower type, transform it to a narrower load from address + N / num of
 /// bits of new type. If the result is to be extended, also fold the extension
 /// to form a extending load.
[9,437 lines not shown]

test/CodeGen/X86/bool-zext.ll

 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
 ; RUN: llc < %s -mtriple=i686-unknown-unknown | FileCheck %s -check-prefix=X32
 ; RUN: llc < %s -mtriple=x86_64-unknown-unknown | FileCheck %s -check-prefix=X64

-; Check that the argument gets zero-extended before calling.
+; It's not necessary to zero-extend the arg because it is specified 'zeroext'.
 define void @bar1(i1 zeroext %v1) nounwind ssp {
 ; X32-LABEL: bar1:
 ; X32: # BB#0:
 ; X32-NEXT: movzbl {{[0-9]+}}(%esp), %eax
 ; X32-NEXT: pushl %eax
 ; X32-NEXT: calll foo1
 ; X32-NEXT: addl $4, %esp
 ; X32-NEXT: retl
 ;
 ; X64-LABEL: bar1:
 ; X64: # BB#0:
-; X64-NEXT: movzbl %dil, %edi
 ; X64-NEXT: xorl %eax, %eax
 ; X64-NEXT: jmp foo1 # TAILCALL
   %conv = zext i1 %v1 to i32
   %call = tail call i32 (...) @foo1(i32 %conv) nounwind
   ret void
 }

 ; Check that on x86-64 the arguments are simply forwarded.
[38 lines not shown]

test/CodeGen/X86/critical-edge-split-2.ll

 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
 ; RUN: llc < %s -mtriple=x86_64-unknown-unknown | FileCheck %s

 %0 = type <{ %1, %1 }>
 %1 = type { i8, i8, i8, i8 }

 @g_2 = global %0 zeroinitializer
 @g_4 = global %1 zeroinitializer, align 4

 ; PR8642
 define i16 @test1(i1 zeroext %C, i8** nocapture %argv) nounwind ssp {
 ; CHECK-LABEL: test1:
 ; CHECK: # BB#0: # %entry
 ; CHECK-NEXT: movw $1, %ax
-; CHECK-NEXT: testb %dil, %dil
+; CHECK-NEXT: testl %edi, %edi
 ; CHECK-NEXT: jne .LBB0_2
 ; CHECK-NEXT: # BB#1: # %cond.false.i
 ; CHECK-NEXT: movl $g_4, %eax
 ; CHECK-NEXT: movl $g_2+4, %ecx
 ; CHECK-NEXT: xorl %esi, %esi
 ; CHECK-NEXT: cmpq %rax, %rcx
 ; CHECK-NEXT: sete %sil
 ; CHECK-NEXT: movl $1, %eax
[17 lines not shown]

test/CodeGen/X86/fp128-select.ll

 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
 ; RUN: llc < %s -O2 -mtriple=x86_64-linux-android -mattr=+mmx \
 ; RUN: -enable-legalize-types-checking | FileCheck %s --check-prefix=MMX
 ; RUN: llc < %s -O2 -mtriple=x86_64-linux-gnu -mattr=+mmx \
 ; RUN: -enable-legalize-types-checking | FileCheck %s --check-prefix=MMX
 ; RUN: llc < %s -O2 -mtriple=x86_64-linux-android \
 ; RUN: -enable-legalize-types-checking | FileCheck %s
 ; RUN: llc < %s -O2 -mtriple=x86_64-linux-gnu \
 ; RUN: -enable-legalize-types-checking | FileCheck %s

 define void @test_select(fp128* %p, fp128* %q, i1 zeroext %c) {
 ; MMX-LABEL: test_select:
 ; MMX: # BB#0:
-; MMX-NEXT: testb %dl, %dl
+; MMX-NEXT: testl %edx, %edx
 ; MMX-NEXT: jne .LBB0_1
 ; MMX-NEXT: # BB#2:
 ; MMX-NEXT: movaps {{.*}}(%rip), %xmm0
 ; MMX-NEXT: movaps %xmm0, (%rsi)
 ; MMX-NEXT: retq
 ; MMX-NEXT: .LBB0_1:
 ; MMX-NEXT: movaps (%rdi), %xmm0
 ; MMX-NEXT: movaps %xmm0, (%rsi)
 ; MMX-NEXT: retq
 ;
 ; CHECK-LABEL: test_select:
 ; CHECK: # BB#0:
 ; CHECK-NEXT: xorl %eax, %eax
-; CHECK-NEXT: testb %dl, %dl
+; CHECK-NEXT: testl %edx, %edx
 ; CHECK-NEXT: cmovneq (%rdi), %rax
 ; CHECK-NEXT: movabsq $9223231299366420480, %rcx # imm = 0x7FFF800000000000
 ; CHECK-NEXT: cmovneq 8(%rdi), %rcx
 ; CHECK-NEXT: movq %rcx, 8(%rsi)
 ; CHECK-NEXT: movq %rax, (%rsi)
 ; CHECK-NEXT: retq
   %a = load fp128, fp128* %p, align 2
   %r = select i1 %c, fp128 %a, fp128 0xL00000000000000007FFF800000000000
   store fp128 %r, fp128* %q
   ret void
 }

test/CodeGen/X86/illegal-bitfield-loadstore.ll

[81 lines not shown]
 ; X86-NEXT: andl $16769023, %eax # imm = 0xFFDFFF
 ; X86-NEXT: orl %edx, %eax
 ; X86-NEXT: movw %ax, (%ecx)
 ; X86-NEXT: popl %esi
 ; X86-NEXT: retl
 ;
 ; X64-LABEL: i24_insert_bit:
 ; X64: # BB#0:
-; X64-NEXT: movzbl %sil, %eax
-; X64-NEXT: movzwl (%rdi), %ecx
-; X64-NEXT: movzbl 2(%rdi), %edx
-; X64-NEXT: movb %dl, 2(%rdi)
-; X64-NEXT: shll $16, %edx
-; X64-NEXT: orl %ecx, %edx
-; X64-NEXT: shll $13, %eax
-; X64-NEXT: andl $16769023, %edx # imm = 0xFFDFFF
-; X64-NEXT: orl %eax, %edx
-; X64-NEXT: movw %dx, (%rdi)
+; X64-NEXT: movzwl (%rdi), %eax
+; X64-NEXT: movzbl 2(%rdi), %ecx
+; X64-NEXT: movb %cl, 2(%rdi)
+; X64-NEXT: shll $16, %ecx
+; X64-NEXT: orl %eax, %ecx
+; X64-NEXT: shll $13, %esi
+; X64-NEXT: andl $16769023, %ecx # imm = 0xFFDFFF
+; X64-NEXT: orl %esi, %ecx
+; X64-NEXT: movw %cx, (%rdi)
 ; X64-NEXT: retq
   %extbit = zext i1 %bit to i24
   %b = load i24, i24* %a, align 1
   %extbit.shl = shl nuw nsw i24 %extbit, 13
   %c = and i24 %b, -8193
   %d = or i24 %c, %extbit.shl
   store i24 %d, i24* %a, align 1
   ret void
[72 lines not shown]
 ; X86-NEXT: movl $-8193, %edx # imm = 0xDFFF
 ; X86-NEXT: andl (%eax), %edx
 ; X86-NEXT: orl %ecx, %edx
 ; X86-NEXT: movl %edx, (%eax)
 ; X86-NEXT: retl
 ;
 ; X64-LABEL: i56_insert_bit:
 ; X64: # BB#0:
-; X64-NEXT: movzbl %sil, %eax
-; X64-NEXT: movzwl 4(%rdi), %ecx
-; X64-NEXT: movzbl 6(%rdi), %edx
-; X64-NEXT: movl (%rdi), %esi
-; X64-NEXT: movb %dl, 6(%rdi)
-; X64-NEXT: # kill: %EDX<def> %EDX<kill> %RDX<kill> %RDX<def>
-; X64-NEXT: shll $16, %edx
-; X64-NEXT: orl %ecx, %edx
-; X64-NEXT: shlq $32, %rdx
-; X64-NEXT: orq %rdx, %rsi
-; X64-NEXT: shlq $13, %rax
-; X64-NEXT: movabsq $72057594037919743, %rcx # imm = 0xFFFFFFFFFFDFFF
-; X64-NEXT: andq %rsi, %rcx
-; X64-NEXT: orq %rax, %rcx
-; X64-NEXT: movl %ecx, (%rdi)
-; X64-NEXT: shrq $32, %rcx
-; X64-NEXT: movw %cx, 4(%rdi)
+; X64-NEXT: # kill: %ESI<def> %ESI<kill> %RSI<def>
+; X64-NEXT: movzwl 4(%rdi), %eax
+; X64-NEXT: movzbl 6(%rdi), %ecx
+; X64-NEXT: movl (%rdi), %edx
+; X64-NEXT: movb %cl, 6(%rdi)
+; X64-NEXT: # kill: %ECX<def> %ECX<kill> %RCX<kill> %RCX<def>
+; X64-NEXT: shll $16, %ecx
+; X64-NEXT: orl %eax, %ecx
+; X64-NEXT: shlq $32, %rcx
+; X64-NEXT: orq %rcx, %rdx
+; X64-NEXT: shlq $13, %rsi
+; X64-NEXT: movabsq $72057594037919743, %rax # imm = 0xFFFFFFFFFFDFFF
+; X64-NEXT: andq %rdx, %rax
+; X64-NEXT: orq %rsi, %rax
+; X64-NEXT: movl %eax, (%rdi)
+; X64-NEXT: shrq $32, %rax
+; X64-NEXT: movw %ax, 4(%rdi)
 ; X64-NEXT: retq
   %extbit = zext i1 %bit to i56
   %b = load i56, i56* %a, align 1
   %extbit.shl = shl nuw nsw i56 %extbit, 13
   %c = and i56 %b, -8193
   %d = or i56 %c, %extbit.shl
   store i56 %d, i56* %a, align 1
   ret void
 }

test/CodeGen/X86/mask-negated-bool.ll

[10 lines not shown]
 ; CHECK-NEXT: retq
   %neg = sub i32 0, %ext
   %and = and i32 %neg, 1
   ret i32 %and
 }

 define i32 @mask_negated_zext_bool2(i1 zeroext %x) {
 ; CHECK-LABEL: mask_negated_zext_bool2:
 ; CHECK: # BB#0:
-; CHECK-NEXT: movzbl %dil, %eax
+; CHECK-NEXT: movl %edi, %eax
 ; CHECK-NEXT: retq
   %ext = zext i1 %x to i32
   %neg = sub i32 0, %ext
   %and = and i32 %neg, 1
   ret i32 %and
 }

 define <4 x i32> @mask_negated_zext_bool_vec(<4 x i1> %x) {
[17 lines not shown]
 ; CHECK-NEXT: retq
   %neg = sub i32 0, %ext
   %and = and i32 %neg, 1
   ret i32 %and
 }

 define i32 @mask_negated_sext_bool2(i1 zeroext %x) {
 ; CHECK-LABEL: mask_negated_sext_bool2:
 ; CHECK: # BB#0:
-; CHECK-NEXT: movzbl %dil, %eax
+; CHECK-NEXT: movl %edi, %eax
 ; CHECK-NEXT: retq
   %ext = sext i1 %x to i32
   %neg = sub i32 0, %ext
   %and = and i32 %neg, 1
   ret i32 %and
 }

 define <4 x i32> @mask_negated_sext_bool_vec(<4 x i1> %x) {
[10 lines not shown]

test/CodeGen/X86/negate-i1.ll

[52 lines not shown]
 ; X32-NEXT: retl
   %b = sext i1 %a to i16
   ret i16 %b
 }

 define i16 @select_i16_neg1_or_0_zeroext(i1 zeroext %a) {
 ; X64-LABEL: select_i16_neg1_or_0_zeroext:
 ; X64: # BB#0:
-; X64-NEXT: movzbl %dil, %eax
-; X64-NEXT: negl %eax
-; X64-NEXT: # kill: %AX<def> %AX<kill> %EAX<kill>
+; X64-NEXT: negl %edi
+; X64-NEXT: movl %edi, %eax
 ; X64-NEXT: retq
 ;
 ; X32-LABEL: select_i16_neg1_or_0_zeroext:
 ; X32: # BB#0:
 ; X32-NEXT: movzbl {{[0-9]+}}(%esp), %eax
 ; X32-NEXT: negl %eax
 ; X32-NEXT: # kill: %AX<def> %AX<kill> %EAX<kill>
 ; X32-NEXT: retl
[17 lines not shown]
 ; X32-NEXT: retl
   %b = sext i1 %a to i32
   ret i32 %b
 }

 define i32 @select_i32_neg1_or_0_zeroext(i1 zeroext %a) {
 ; X64-LABEL: select_i32_neg1_or_0_zeroext:
 ; X64: # BB#0:
-; X64-NEXT: movzbl %dil, %eax
-; X64-NEXT: negl %eax
+; X64-NEXT: negl %edi
+; X64-NEXT: movl %edi, %eax
 ; X64-NEXT: retq
 ;
 ; X32-LABEL: select_i32_neg1_or_0_zeroext:
 ; X32: # BB#0:
 ; X32-NEXT: movzbl {{[0-9]+}}(%esp), %eax
 ; X32-NEXT: negl %eax
 ; X32-NEXT: retl
   %b = sext i1 %a to i32
[18 lines not shown]
 ; X32-NEXT: retl
   %b = sext i1 %a to i64
   ret i64 %b
 }

 define i64 @select_i64_neg1_or_0_zeroext(i1 zeroext %a) {
 ; X64-LABEL: select_i64_neg1_or_0_zeroext:
 ; X64: # BB#0:
-; X64-NEXT: movzbl %dil, %eax
-; X64-NEXT: negq %rax
+; X64-NEXT: # kill: %EDI<def> %EDI<kill> %RDI<def>
+; X64-NEXT: negq %rdi
+; X64-NEXT: movq %rdi, %rax
 ; X64-NEXT: retq
 ;
 ; X32-LABEL: select_i64_neg1_or_0_zeroext:
 ; X32: # BB#0:
 ; X32-NEXT: movzbl {{[0-9]+}}(%esp), %eax
 ; X32-NEXT: negl %eax
 ; X32-NEXT: movl %eax, %edx
 ; X32-NEXT: retl
   %b = sext i1 %a to i64
   ret i64 %b
 }

test/CodeGen/X86/select_const.ll

[47 lines not shown]
 ; CHECK-NEXT: retq
   %sel = select i1 %cond, i32 1, i32 0
   ret i32 %sel
 }

 define i32 @select_1_or_0_zeroext(i1 zeroext %cond) {
 ; CHECK-LABEL: select_1_or_0_zeroext:
 ; CHECK: # BB#0:
-; CHECK-NEXT: movzbl %dil, %eax
+; CHECK-NEXT: movl %edi, %eax
 ; CHECK-NEXT: retq
   %sel = select i1 %cond, i32 1, i32 0
   ret i32 %sel
 }

 define i32 @select_1_or_0_signext(i1 signext %cond) {
 ; CHECK-LABEL: select_1_or_0_signext:
 ; CHECK: # BB#0:
[15 lines not shown]
 ; CHECK-NEXT: retq
   %sel = select i1 %cond, i32 0, i32 -1
   ret i32 %sel
 }

 define i32 @select_0_or_neg1_zeroext(i1 zeroext %cond) {
 ; CHECK-LABEL: select_0_or_neg1_zeroext:
 ; CHECK: # BB#0:
-; CHECK-NEXT: movzbl %dil, %eax
-; CHECK-NEXT: decl %eax
+; CHECK-NEXT: # kill: %EDI<def> %EDI<kill> %RDI<def>
+; CHECK-NEXT: leal -1(%rdi), %eax
 ; CHECK-NEXT: retq
   %sel = select i1 %cond, i32 0, i32 -1
   ret i32 %sel
 }

 define i32 @select_0_or_neg1_signext(i1 signext %cond) {
 ; CHECK-LABEL: select_0_or_neg1_signext:
 ; CHECK: # BB#0:
[16 lines not shown]
 ; CHECK-NEXT: retq
   %sel = select i1 %cond, i32 -1, i32 0
   ret i32 %sel
 }

 define i32 @select_neg1_or_0_zeroext(i1 zeroext %cond) {
 ; CHECK-LABEL: select_neg1_or_0_zeroext:
 ; CHECK: # BB#0:
-; CHECK-NEXT: movzbl %dil, %eax
-; CHECK-NEXT: negl %eax
+; CHECK-NEXT: negl %edi
+; CHECK-NEXT: movl %edi, %eax
 ; CHECK-NEXT: retq
   %sel = select i1 %cond, i32 -1, i32 0
   ret i32 %sel
 }

 define i32 @select_neg1_or_0_signext(i1 signext %cond) {
 ; CHECK-LABEL: select_neg1_or_0_signext:
 ; CHECK: # BB#0:
[14 lines not shown]
 ; CHECK-NEXT: retq
   %sel = select i1 %cond, i32 42, i32 41
   ret i32 %sel
 }

 define i32 @select_Cplus1_C_zeroext(i1 zeroext %cond) {
 ; CHECK-LABEL: select_Cplus1_C_zeroext:
 ; CHECK: # BB#0:
-; CHECK-NEXT: movzbl %dil, %eax
-; CHECK-NEXT: addl $41, %eax
+; CHECK-NEXT: # kill: %EDI<def> %EDI<kill> %RDI<def>
+; CHECK-NEXT: leal 41(%rdi), %eax
 ; CHECK-NEXT: retq
   %sel = select i1 %cond, i32 42, i32 41
   ret i32 %sel
 }

 define i32 @select_Cplus1_C_signext(i1 signext %cond) {
 ; CHECK-LABEL: select_Cplus1_C_signext:
 ; CHECK: # BB#0:
[16 lines not shown]
 ; CHECK-NEXT: retq
   %sel = select i1 %cond, i32 41, i32 42
   ret i32 %sel
 }

 define i32 @select_C_Cplus1_zeroext(i1 zeroext %cond) {
 ; CHECK-LABEL: select_C_Cplus1_zeroext:
 ; CHECK: # BB#0:
-; CHECK-NEXT: movzbl %dil, %ecx
 ; CHECK-NEXT: movl $42, %eax
-; CHECK-NEXT: subl %ecx, %eax
+; CHECK-NEXT: subl %edi, %eax
 ; CHECK-NEXT: retq
   %sel = select i1 %cond, i32 41, i32 42
   ret i32 %sel
 }

 define i32 @select_C_Cplus1_signext(i1 signext %cond) {
 ; CHECK-LABEL: select_C_Cplus1_signext:
 ; CHECK: # BB#0:
[209 lines not shown]
 ; CHECK-NEXT: retq
   %sel = select i1 %cond, i32 421, i32 42
   ret i32 %sel
 }

 define i32 @select_C1_C2_zeroext(i1 zeroext %cond) {
 ; CHECK-LABEL: select_C1_C2_zeroext:
 ; CHECK: # BB#0:
-; CHECK-NEXT: testb %dil, %dil
+; CHECK-NEXT: testl %edi, %edi
 ; CHECK-NEXT: movl $421, %ecx # imm = 0x1A5
 ; CHECK-NEXT: movl $42, %eax
 ; CHECK-NEXT: cmovnel %ecx, %eax
 ; CHECK-NEXT: retq
   %sel = select i1 %cond, i32 421, i32 42
   ret i32 %sel
 }

[84 lines not shown]

test/CodeGen/X86/sext-i1.ll

[138 lines not shown]
 ; X32-LABEL: select_0_or_1s_zeroext:
 ; X32: # BB#0:
 ; X32-NEXT: movzbl {{[0-9]+}}(%esp), %eax
 ; X32-NEXT: decl %eax
 ; X32-NEXT: retl
 ;
 ; X64-LABEL: select_0_or_1s_zeroext:
 ; X64: # BB#0:
-; X64-NEXT: movzbl %dil, %eax
-; X64-NEXT: decl %eax
+; X64-NEXT: # kill: %EDI<def> %EDI<kill> %RDI<def>
+; X64-NEXT: leal -1(%rdi), %eax
 ; X64-NEXT: retq
   %not = xor i1 %cond, 1
   %sext = sext i1 %not to i32
   ret i32 %sext
 }

 ; sext (xor Bool, -1) --> sub (zext Bool), 1

[20 lines not shown]
