This is an archive of the discontinued LLVM Phabricator instance.

[x86] Fix PR34377 by disabling cmov conversion when we relied on it performing a zext of a register.
Closed, Public

Authored by chandlerc on Sep 5 2017, 10:45 PM.

Details

Reviewers

craig.topper
aaboud

Commits

rG585bfc844318: [x86] Fix PR34377 by disabling cmov conversion when we relied on it performing…
rL312620: [x86] Fix PR34377 by disabling cmov conversion when we relied on it

Summary

On the PR there is discussion of how to more effectively handle this,
but this patch prevents us from miscompiling code.

Diff Detail

Build Status

Buildable 9928
Build 9928: arc lint + arc unit

Event Timeline

chandlerc created this revision. Sep 5 2017, 10:45 PM

Herald added subscribers: mcrosier, sanjoy. Sep 5 2017, 10:45 PM

LGTM

This revision is now accepted and ready to land. Sep 5 2017, 10:49 PM

Closed by commit rL312620: [x86] Fix PR34377 by disabling cmov conversion when we relied on it (authored by chandlerc). Sep 5 2017, 11:29 PM

This revision was automatically updated to reflect the committed changes.

aaboud added a subscriber: DavidKreitzer. Sep 6 2017, 1:52 AM

aaboud added inline comments.

llvm/trunk/lib/Target/X86/X86CmovConversion.cpp
297 (On Diff #113960)

Thanks, Chandler, for the quick fix. However, we can narrow the restriction to only the case where (1) we are compiling for 64-bit and (2) the CMOV destination is 32-bit. All other cases have no issue with the CMOV behavior. Do you agree?

chandlerc added inline comments. Sep 6 2017, 1:59 AM

llvm/trunk/lib/Target/X86/X86CmovConversion.cpp
297 (On Diff #113960)

Hmm. Probably, because that's where zext is present. However, aren't those the same conditions in which SUBREG_TO_REG will be introduced? If this is an assertion of zext-ing behavior, it can only show up due to there being zext-ing behavior of cmov itself, and that seems to be the same set of restrictions you outline.

Not sure how much more time we should spend on enhancing the dodge of a miscompile vs. the perhaps more interesting work to handle this case elegantly and effectively.

aaboud added inline comments. Sep 6 2017, 2:10 AM

llvm/trunk/lib/Target/X86/X86CmovConversion.cpp
297 (On Diff #113960)

> However, aren't those the same conditions in which SUBREG_TO_REG will be introduced? If this is an assertion of zext-ing behavior, it can only show up due to there being zext-ing behavior of cmov itself, and that seems to be the same set of restrictions you outline.

I think you are right. I was thinking about (CMOV16rr + zextTo32), but in that case the zext will not be removed, and we will not see the SUBREG_TO_REG. So this restriction is good enough.

> Not sure how much more time we should spend on enhancing the dodge of a miscompile vs. the perhaps more interesting work to handle this case elegantly and effectively.

Sure, I did not mean that we need to spend any more effort immediately. The top priority is to make sure the pass has no functionality issue, which you already did. Thinking forward, I want to be sure what more we can do. I also prepared a patch that adds the MOVrr instruction, as Dave suggested, but I need to run performance measurements before I suggest that direction. For now, I think we are fine with this solution, at least until we see a real performance issue.
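The zext-ing behavior discussed in this thread can be illustrated with a small register model. This is only a sketch and not LLVM code: `Reg64`, `cmov32`, and `branchy_select32` are hypothetical names. It assumes the x86-64 rule that a 32-bit register write clears bits 63:32, and that a 32-bit CMOV writes its destination whether or not the condition holds. The branch form stands in for what the conversion may produce when the not-taken path leaves the register unwritten.

```cpp
#include <cstdint>

// Hypothetical model of a 64-bit x86 register; not an LLVM type.
struct Reg64 {
  std::uint64_t bits = 0;

  // A 32-bit write on x86-64 clears bits 63:32 of the full register.
  void write32(std::uint32_t v) { bits = v; }

  std::uint32_t low32() const { return static_cast<std::uint32_t>(bits); }
};

// 32-bit CMOV: the destination is written on both outcomes, so the
// upper half ends up zero even when the condition is false.
inline void cmov32(Reg64 &dst, std::uint32_t src, bool cond) {
  dst.write32(cond ? src : dst.low32());
}

// Branch form: on the not-taken path nothing writes dst, so whatever
// was in the upper half before is still there afterwards.
inline void branchy_select32(Reg64 &dst, std::uint32_t src, bool cond) {
  if (cond)
    dst.write32(src);
}
```

A consumer that reads the full 64-bit register right after `cmov32` can rely on a zero upper half; after `branchy_select32` with a false condition it cannot, which is the kind of assumption a SUBREG_TO_REG user encodes.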

Revision Contents

Path	Size
lib/Target/X86/X86CmovConversion.cpp	10 lines
test/CodeGen/X86/cmov-into-branch.ll	26 lines

Diff 113958

lib/Target/X86/X86CmovConversion.cpp

[283 lines not shown; hunk context: for (auto &I : *MBB) {]
         if (I.mayLoad()) {
           if (MemOpCC == X86::COND_INVALID)
             // The first memory operand CMOV.
             MemOpCC = CC;
           else if (CC != MemOpCC)
             // Can't handle mixed conditions with memory operands.
             SkipGroup = true;
         }
+        // Check if we were relying on zero-extending behavior of the CMOV.
+        if (!SkipGroup &&
+            llvm::any_of(
+                MRI->use_nodbg_instructions(I.defs().begin()->getReg()),
+                [&](MachineInstr &UseI) {
+                  return UseI.getOpcode() == X86::SUBREG_TO_REG;
+                }))
+          // FIXME: We should model the cost of using an explicit MOV to handle
+          // the zero-extension rather than just refusing to handle this.
+          SkipGroup = true;
         continue;
       }
       // If Group is empty, keep looking for first CMOV in the range.
       if (Group.empty())
         continue;
 
       // We found a non X86::CMOVrr instruction.
       FoundNonCMOVInst = true;
[493 lines not shown]
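The shape of the added check can be sketched outside LLVM with standard-library types. Everything here is a hypothetical stand-in: `Opcode`, `UseInstr`, `reliesOnImplicitZext`, and `updateSkipGroup` are illustrative names, and a plain `std::vector` replaces the range returned by `MRI->use_nodbg_instructions(...)`. The sketch only mirrors the logic of scanning the users of the CMOV result for a SUBREG_TO_REG and folding the answer into the skip flag.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-ins for machine opcodes; not the LLVM definitions.
enum class Opcode { SubregToReg, Mov32rr, Add32rr };

// Hypothetical stand-in for an instruction that uses the CMOV result.
struct UseInstr {
  Opcode opcode;
};

// True when any user of the CMOV result is a SUBREG_TO_REG, i.e. some
// user relies on the CMOV having zero-extended its destination.
inline bool reliesOnImplicitZext(const std::vector<UseInstr> &uses) {
  return std::any_of(uses.begin(), uses.end(), [](const UseInstr &u) {
    return u.opcode == Opcode::SubregToReg;
  });
}

// Same effect as `if (!SkipGroup && any_of(...)) SkipGroup = true;`:
// a group that is already skipped stays skipped, otherwise it becomes
// skipped as soon as one user depends on the implicit zero-extension.
inline bool updateSkipGroup(bool skipGroup, const std::vector<UseInstr> &uses) {
  return skipGroup || reliesOnImplicitZext(uses);
}
```

Because the flag is only ever set, never cleared, a single zext-dependent user is enough to keep the whole CMOV group from being converted, which matches the conservative intent stated in the FIXME.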

test/CodeGen/X86/cmov-into-branch.ll

[55 lines not shown; hunk context: ; CHECK-NEXT: retq]
   %load = load i32, i32* %b, align 4
   %cmp = icmp ult i32 %load, %a
   %cmp1 = icmp ugt i32 %load, %a
   %cond = select i1 %cmp1, i32 %a, i32 %y
   %cond5 = select i1 %cmp, i32 %cond, i32 %x
   ret i32 %cond5
 }
 
+; Zero-extended select.
+define void @test6(i32 %a, i32 %x, i32* %y.ptr, i64* %z.ptr) {
+; CHECK-LABEL: test6:
+; CHECK: # BB#0: # %entry
+; CHECK-NEXT: # kill: %ESI<def> %ESI<kill> %RSI<def>
+; CHECK-NEXT: testl %edi, %edi
+; CHECK-NEXT: cmovnsl (%rdx), %esi
+; CHECK-NEXT: movq %rsi, (%rcx)
+; CHECK-NEXT: retq
+entry:
+  %y = load i32, i32* %y.ptr
+  %cmp = icmp slt i32 %a, 0
+  %z = select i1 %cmp, i32 %x, i32 %y
+  %z.ext = zext i32 %z to i64
+  store i64 %z.ext, i64* %z.ptr
+  ret void
+}
+
 ; If a select is not obviously predictable, don't turn it into a branch.
 define i32 @weighted_select1(i32 %a, i32 %b) {
 ; CHECK-LABEL: weighted_select1:
 ; CHECK: # BB#0:
 ; CHECK-NEXT: testl %edi, %edi
 ; CHECK-NEXT: cmovnel %edi, %esi
 ; CHECK-NEXT: movl %esi, %eax
 ; CHECK-NEXT: retq
   %cmp = icmp ne i32 %a, 0
   %sel = select i1 %cmp, i32 %a, i32 %b, !prof !0
   ret i32 %sel
 }
 
 ; If a select is obviously predictable, turn it into a branch.
 define i32 @weighted_select2(i32 %a, i32 %b) {
 ; CHECK-LABEL: weighted_select2:
 ; CHECK: # BB#0:
 ; CHECK-NEXT: testl %edi, %edi
-; CHECK-NEXT: jne .LBB5_2
+; CHECK-NEXT: jne .LBB6_2
 ; CHECK-NEXT: # BB#1: # %select.false
 ; CHECK-NEXT: movl %esi, %edi
-; CHECK-NEXT: .LBB5_2: # %select.end
+; CHECK-NEXT: .LBB6_2: # %select.end
 ; CHECK-NEXT: movl %edi, %eax
 ; CHECK-NEXT: retq
   %cmp = icmp ne i32 %a, 0
   %sel = select i1 %cmp, i32 %a, i32 %b, !prof !1
   ret i32 %sel
 }
 
 ; Note the reversed profile weights: it doesn't matter if it's
 ; obviously true or obviously false.
 ; Either one should become a branch rather than conditional move.
 ; TODO: But likely true vs. likely false should affect basic block placement?
 define i32 @weighted_select3(i32 %a, i32 %b) {
 ; CHECK-LABEL: weighted_select3:
 ; CHECK: # BB#0:
 ; CHECK-NEXT: testl %edi, %edi
-; CHECK-NEXT: je .LBB6_1
+; CHECK-NEXT: je .LBB7_1
 ; CHECK-NEXT: # BB#2: # %select.end
 ; CHECK-NEXT: movl %edi, %eax
 ; CHECK-NEXT: retq
-; CHECK-NEXT: .LBB6_1: # %select.false
+; CHECK-NEXT: .LBB7_1: # %select.false
 ; CHECK-NEXT: movl %esi, %edi
 ; CHECK-NEXT: movl %edi, %eax
 ; CHECK-NEXT: retq
   %cmp = icmp ne i32 %a, 0
   %sel = select i1 %cmp, i32 %a, i32 %b, !prof !2
   ret i32 %sel
 }
 
[18 lines not shown]
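For readers less used to LLVM IR, the new @test6 case corresponds roughly to the C++ below. This is a hand-written approximation and not the source the test was derived from; the parameter names simply follow the IR, and unsigned types are assumed for the values that get zero-extended.

```cpp
#include <cstdint>

// Hypothetical C++ counterpart of @test6 from cmov-into-branch.ll.
inline void test6(std::int32_t a, std::uint32_t x, const std::uint32_t *y_ptr,
                  std::uint64_t *z_ptr) {
  std::uint32_t y = *y_ptr;                // %y = load i32, i32* %y.ptr
  std::uint32_t z = (a < 0) ? x : y;       // select on icmp slt i32 %a, 0
  *z_ptr = static_cast<std::uint64_t>(z);  // zext i32 %z to i64, then store
}
```

The widening conversion on the last line is what the expected assembly leaves implicit: the CHECK lines store the full %rsi right after the 32-bit cmov, with no separate zero-extending instruction in between.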