This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Widen guards with conditions between
ClosedPublic

Authored by reames on Apr 27 2018, 12:04 PM.

Details

Reviewers

anna
mkazantsev

Commits

rG79e917d117b9: [InstCombine] Widen guards with conditions between
rL331935: [InstCombine] Widen guards with conditions between

Summary

The previous handling for guard widening in InstCombine was extremely restrictive. In particular, it didn't handle the common case where we had two guards separated by a single icmp. Handle this by scanning through a small fixed window of instructions to find the next guard if needed.
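As a rough illustration of the windowed scan described above, the following self-contained C++ sketch models a basic block as a vector of toy instructions. `ToyInst`, `findNextGuard`, and `widenGuard` are hypothetical names invented for this sketch, not LLVM APIs, and the `Speculatable` flag is an assumed stand-in for a query such as `isSafeToSpeculativelyExecute`. It is a simplified model under those assumptions, not the patch itself.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for an IR instruction (hypothetical, not an LLVM type).
struct ToyInst {
  bool IsGuard;      // true for a guard call
  bool Speculatable; // assumed: safe to execute above a guard
  std::string Name;  // for a guard: its condition; otherwise the value defined
};

// Returns the index of the next guard reachable from the guard at GuardIdx by
// skipping at most Window speculatable instructions, or -1 if there is none.
int findNextGuard(const std::vector<ToyInst> &Block, std::size_t GuardIdx,
                  unsigned Window) {
  assert(GuardIdx < Block.size() && Block[GuardIdx].IsGuard);
  std::size_t Next = GuardIdx + 1;
  for (unsigned I = 0; I < Window && Next < Block.size(); ++I) {
    // Stop at a guard or at anything that cannot be hoisted above a guard.
    if (Block[Next].IsGuard || !Block[Next].Speculatable)
      break;
    ++Next;
  }
  if (Next < Block.size() && Block[Next].IsGuard)
    return static_cast<int>(Next);
  return -1;
}

// Merges the guard at GuardIdx with the next guard inside the window.
// The intermediate instructions end up before the single merged guard.
bool widenGuard(std::vector<ToyInst> &Block, std::size_t GuardIdx,
                unsigned Window) {
  int Next = findNextGuard(Block, GuardIdx, Window);
  if (Next < 0)
    return false;
  if (Block[GuardIdx].Name == Block[Next].Name) {
    // Identical condition: the second guard is redundant.
    Block.erase(Block.begin() + Next);
    return true;
  }
  ToyInst Merged = Block[GuardIdx];
  Merged.Name = "(" + Merged.Name + " & " + Block[Next].Name + ")";
  Block.erase(Block.begin() + Next);     // drop the second guard
  Block.erase(Block.begin() + GuardIdx); // lift out the first guard
  Block.insert(Block.begin() + (Next - 1), Merged); // re-insert after the
                                                    // intermediate instructions
  return true;
}
```

In this model a window of 3 lets the scan skip up to three speculatable instructions, so a fourth intermediate instruction, or any non-speculatable one, leaves both guards in place.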

Diff Detail

Repository: rL LLVM

Event Timeline

reames created this revision. Apr 27 2018, 12:04 PM

Herald added subscribers: bollu, mcrosier. Apr 27 2018, 12:04 PM

reames added inline comments. Apr 27 2018, 2:58 PM

test/Transforms/InstCombine/call-guard.ll
Line 36 (On Diff #144378): Note to self: before submitting, this needs tests that show the negative cases, i.e. why the intermediate instructions have to be safe to execute.

I would suggest a slightly different approach to handle such situations. Whenever we see a guard followed by an instruction that is safe to speculate, we can just swap them. Doing so, we will collect all guards in one contiguous sequence after all the speculatable code, and then we just fold consecutive pairs. What do you think about it?
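The swap-then-fold idea suggested in this comment could be sketched on a toy model as follows. `SketchInst`, `sinkGuards`, and `foldAdjacentGuards` are hypothetical names made up for illustration, not LLVM APIs, and the `Speculatable` flag is an assumed stand-in for a real speculation-safety query. This is only a sketch of the suggestion under those assumptions, not code from the review.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for an IR instruction (hypothetical, not an LLVM type).
struct SketchInst {
  bool IsGuard;      // true for a guard call
  bool Speculatable; // assumed: safe to execute above a guard
  std::string Name;  // for a guard: its condition; otherwise the value defined
};

// Step 1: rewrite "guard; speculatable" into "speculatable; guard" until no
// such pair remains, so guards collect below the speculatable code.
void sinkGuards(std::vector<SketchInst> &Block) {
  bool Changed = true;
  while (Changed) {
    Changed = false;
    for (std::size_t I = 0; I + 1 < Block.size(); ++I) {
      if (Block[I].IsGuard && !Block[I + 1].IsGuard &&
          Block[I + 1].Speculatable) {
        std::swap(Block[I], Block[I + 1]);
        Changed = true;
      }
    }
  }
}

// Step 2: fold every run of adjacent guards into one guard on the
// conjunction of their conditions.
void foldAdjacentGuards(std::vector<SketchInst> &Block) {
  std::vector<SketchInst> Out;
  for (const SketchInst &I : Block) {
    if (I.IsGuard && !Out.empty() && Out.back().IsGuard) {
      if (Out.back().Name != I.Name)
        Out.back().Name = "(" + Out.back().Name + " & " + I.Name + ")";
      continue; // the current guard is absorbed into the previous one
    }
    Out.push_back(I);
  }
  Block = std::move(Out);
}
```

Unlike a fixed window, this model has no distance limit, but every swap moves an instruction above a guard.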

lib/Transforms/InstCombine/InstCombineCalls.cpp
Line 3631 (On Diff #144378): How about making this number an option rather than a hard-coded constant?

mkazantsev added inline comments. May 2 2018, 7:18 PM

lib/Transforms/InstCombine/InstCombineCalls.cpp
Line 3651 (On Diff #144378): We should assert that Temp is guaranteed to transfer execution to its successor; otherwise the move might be illegal.

In D46203#1085980, @mkazantsev wrote:

I would suggest a slightly different approach to handle such situations. Whenever we see a guard followed by an instruction that is safe to speculate, we can just swap them. Doing so, we will collect all guards in one contiguous sequence after all the speculatable code, and then we just fold consecutive pairs. What do you think about it?

I thought about this approach and I agree it's interesting. It does have one serious problem I haven't quite figured out yet, though: hoisting an instruction above the guard changes its placement in the final lowered IR. If we'd left it where it was, we might have been able to *sink* the instruction into a block with its only user.

My original plan was to do this patch, then spend more time thinking about the sinking problem. If we can solve that, then I agree, hoisting over guards by default is probably the right long term answer.

LGTM once comments are addressed.

This revision is now accepted and ready to land. May 6 2018, 11:44 PM

Closed by commit rL331935: [InstCombine] Widen guards with conditions between (authored by reames). May 9 2018, 4:00 PM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
llvm/trunk/lib/Transforms/InstCombine/InstCombineCalls.cpp	23 lines
llvm/trunk/test/Transforms/InstCombine/call-guard.ll	78 lines

Diff 146020

llvm/trunk/lib/Transforms/InstCombine/InstCombineCalls.cpp

[...]
 STATISTIC(NumSimplified, "Number of library calls simplified");

 static cl::opt<unsigned> UnfoldElementAtomicMemcpyMaxElements(
     "unfold-element-atomic-memcpy-max-elements",
     cl::init(16),
     cl::desc("Maximum number of elements in atomic memcpy the optimizer is "
              "allowed to unfold"));

+static cl::opt<unsigned> GuardWideningWindow(
+    "instcombine-guard-widening-window",
+    cl::init(3),
+    cl::desc("How wide an instruction window to bypass looking for "
+             "another guard"));
+
 /// Return the specified type promoted as it would be to pass though a va_arg
 /// area.
 static Type *getPromotedType(Type *Ty) {
   if (IntegerType* ITy = dyn_cast<IntegerType>(Ty)) {
     if (ITy->getBitWidth() < 32)
       return Type::getInt32Ty(Ty->getContext());
   }
   return Ty;
[...]
   case Intrinsic::experimental_gc_relocate: {
     // TODO: bitcast(relocate(p)) -> relocate(bitcast(p))
     // Canonicalize on the type from the uses to the defs

     // TODO: relocate((gep p, C, C2, ...)) -> gep(relocate(p), C, C2, ...)
     break;
   }

   case Intrinsic::experimental_guard: {
-    // Is this guard followed by another guard?
+    // Is this guard followed by another guard? We scan forward over a small
+    // fixed window of instructions to handle common cases with conditions
+    // computed between guards.
     Instruction *NextInst = II->getNextNode();
+    for (int i = 0; i < GuardWideningWindow; i++) {
+      // Note: Using context-free form to avoid compile time blow up
+      if (!isSafeToSpeculativelyExecute(NextInst))
+        break;
+      NextInst = NextInst->getNextNode();
+    }
     Value *NextCond = nullptr;
     if (match(NextInst,
               m_Intrinsic<Intrinsic::experimental_guard>(m_Value(NextCond)))) {
       Value *CurrCond = II->getArgOperand(0);

       // Remove a guard that it is immediately preceded by an identical guard.
       if (CurrCond == NextCond)
         return eraseInstFromFunction(*NextInst);

       // Otherwise canonicalize guard(a); guard(b) -> guard(a & b).
+      Instruction* MoveI = II->getNextNode();
+      while (MoveI != NextInst) {
+        auto *Temp = MoveI;
+        MoveI = MoveI->getNextNode();
+        Temp->moveBefore(II);
+      }
       II->setArgOperand(0, Builder.CreateAnd(CurrCond, NextCond));
       return eraseInstFromFunction(*NextInst);
     }
     break;
   }
   }
   return visitCallSite(II);
 }
[...]

llvm/trunk/test/Transforms/InstCombine/call-guard.ll

[...]
 ; CHECK-NEXT: %2 = and i1 %1, %C
 ; CHECK-NEXT: call void (i1, ...) @llvm.experimental.guard(i1 %2, i32 123) [ "deopt"() ]
 ; CHECK-NEXT: ret void
   call void(i1, ...) @llvm.experimental.guard( i1 %A, i32 123 )[ "deopt"() ]
   call void(i1, ...) @llvm.experimental.guard( i1 %B, i32 456 )[ "deopt"() ]
   call void(i1, ...) @llvm.experimental.guard( i1 %C, i32 789 )[ "deopt"() ]
   ret void
 }
+
+; This version tests for the common form where the conditions are
+; between the guards
+define void @test_guard_adjacent_diff_cond2(i32 %V1, i32 %V2) {
+; CHECK-LABEL: @test_guard_adjacent_diff_cond2(
+; CHECK-NEXT: %1 = and i32 %V1, %V2
+; CHECK-NEXT: %2 = icmp slt i32 %1, 0
+; CHECK-NEXT: %and = and i32 %V1, 255
+; CHECK-NEXT: %C = icmp ult i32 %and, 129
+; CHECK-NEXT: %3 = and i1 %2, %C
+; CHECK-NEXT: call void (i1, ...) @llvm.experimental.guard(i1 %3, i32 123) [ "deopt"() ]
+; CHECK-NEXT: ret void
+  %A = icmp slt i32 %V1, 0
+  call void(i1, ...) @llvm.experimental.guard( i1 %A, i32 123 )[ "deopt"() ]
+  %B = icmp slt i32 %V2, 0
+  call void(i1, ...) @llvm.experimental.guard( i1 %B, i32 456 )[ "deopt"() ]
+  %and = and i32 %V1, 255
+  %C = icmp sle i32 %and, 128
+  call void(i1, ...) @llvm.experimental.guard( i1 %C, i32 789 )[ "deopt"() ]
+  ret void
+}
+
+; Might not be legal to hoist the load above the first guard since the
+; guard might control dereferenceability
+define void @negative_load(i32 %V1, i32* %P) {
+; CHECK-LABEL: @negative_load
+; CHECK: @llvm.experimental.guard
+; CHECK: @llvm.experimental.guard
+  %A = icmp slt i32 %V1, 0
+  call void(i1, ...) @llvm.experimental.guard( i1 %A, i32 123 )[ "deopt"() ]
+  %V2 = load i32, i32* %P
+  %B = icmp slt i32 %V2, 0
+  call void(i1, ...) @llvm.experimental.guard( i1 %B, i32 456 )[ "deopt"() ]
+  ret void
+}
+
+define void @deref_load(i32 %V1, i32* dereferenceable(4) %P) {
+; CHECK-LABEL: @deref_load
+; CHECK-NEXT: %V2 = load i32, i32* %P, align 4
+; CHECK-NEXT: %1 = and i32 %V2, %V1
+; CHECK-NEXT: %2 = icmp slt i32 %1, 0
+; CHECK-NEXT: call void (i1, ...) @llvm.experimental.guard(i1 %2, i32 123) [ "deopt"() ]
+  %A = icmp slt i32 %V1, 0
+  call void(i1, ...) @llvm.experimental.guard( i1 %A, i32 123 )[ "deopt"() ]
+  %V2 = load i32, i32* %P
+  %B = icmp slt i32 %V2, 0
+  call void(i1, ...) @llvm.experimental.guard( i1 %B, i32 456 )[ "deopt"() ]
+  ret void
+}
+
+; The divide might fault above the guard
+define void @negative_div(i32 %V1, i32 %D) {
+; CHECK-LABEL: @negative_div
+; CHECK: @llvm.experimental.guard
+; CHECK: @llvm.experimental.guard
+  %A = icmp slt i32 %V1, 0
+  call void(i1, ...) @llvm.experimental.guard( i1 %A, i32 123 )[ "deopt"() ]
+  %V2 = udiv i32 %V1, %D
+  %B = icmp slt i32 %V2, 0
+  call void(i1, ...) @llvm.experimental.guard( i1 %B, i32 456 )[ "deopt"() ]
+  ret void
+}
+
+; Highlight the limit of the window in a case which would otherwise be mergable
+define void @negative_window(i32 %V1, i32 %a, i32 %b, i32 %c, i32 %d) {
+; CHECK-LABEL: @negative_window
+; CHECK: @llvm.experimental.guard
+; CHECK: @llvm.experimental.guard
+  %A = icmp slt i32 %V1, 0
+  call void(i1, ...) @llvm.experimental.guard( i1 %A, i32 123 )[ "deopt"() ]
+  %V2 = add i32 %a, %b
+  %V3 = add i32 %V2, %c
+  %V4 = add i32 %V3, %d
+  %B = icmp slt i32 %V4, 0
+  call void(i1, ...) @llvm.experimental.guard( i1 %B, i32 456 )[ "deopt"() ]
+  ret void
+}