Details

Reviewers

echristo
t.p.northover

Commits

rZORG23f5e7549435: [TargetLowering] Handle multi depth GEPs w/ inline asm constraints
rZORGeb3371a2bb45: [TargetLowering] Handle multi depth GEPs w/ inline asm constraints
rG23f5e7549435: [TargetLowering] Handle multi depth GEPs w/ inline asm constraints
rGeb3371a2bb45: [TargetLowering] Handle multi depth GEPs w/ inline asm constraints
rGc33f754e747b: [TargetLowering] Handle multi depth GEPs w/ inline asm constraints
rL360604: [TargetLowering] Handle multi depth GEPs w/ inline asm constraints

Summary

X86TargetLowering::LowerAsmOperandForConstraint had better support than
TargetLowering::LowerAsmOperandForConstraint for arbitrary-depth
getelementptr expressions used with the "i", "n", and "s" extended inline
assembly constraints. Hoist that support from the derived class into the
base class.

Link: https://github.com/ClangBuiltLinux/linux/issues/469
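
To illustrate the idea in the summary, here is a minimal sketch of peeling nested add/sub nodes until a symbol is reached, accumulating the constants along the way. The `Expr` type and the `makeGlobal`, `makeConst`, `makeBin`, and `peelOffsets` helpers are hypothetical names invented for this toy model; they are not LLVM's SDNode API, and the operand handling is a simplification rather than a copy of the patch.

```cpp
#include <cstdint>
#include <memory>
#include <string>

// Hypothetical toy expression tree (not LLVM's SDNode/SDValue classes).
struct Expr {
  enum Kind { Global, Constant, Add, Sub };
  Kind kind;
  std::string name;                // Global only: symbol name
  int64_t value;                   // Global: built-in offset; Constant: value
  std::shared_ptr<Expr> lhs, rhs;  // Add/Sub only
};
using ExprPtr = std::shared_ptr<Expr>;

ExprPtr makeGlobal(const std::string &name, int64_t offset = 0) {
  return std::make_shared<Expr>(Expr{Expr::Global, name, offset, nullptr, nullptr});
}
ExprPtr makeConst(int64_t v) {
  return std::make_shared<Expr>(Expr{Expr::Constant, "", v, nullptr, nullptr});
}
ExprPtr makeBin(Expr::Kind k, ExprPtr l, ExprPtr r) {
  return std::make_shared<Expr>(Expr{k, "", 0, l, r});
}

// Walks from the root toward the symbol, folding constant operands into one
// offset. Returns false when the tree is not "symbol plus/minus constants".
bool peelOffsets(ExprPtr op, std::string &name, uint64_t &offset) {
  uint64_t acc = 0;  // unsigned, so intermediate wraparound is well defined
  while (op) {
    if (op->kind == Expr::Global) {
      name = op->name;
      offset = acc + static_cast<uint64_t>(op->value);
      return true;
    }
    if (op->kind == Expr::Constant)
      return false;  // a bare constant has no symbol in this toy model
    const bool isAdd = (op->kind == Expr::Add);
    if (op->rhs->kind == Expr::Constant) {
      // (X + C) or (X - C): fold C with the matching sign, keep walking X.
      const uint64_t c = static_cast<uint64_t>(op->rhs->value);
      acc += isAdd ? c : (uint64_t(0) - c);
      op = op->lhs;
    } else if (isAdd && op->lhs->kind == Expr::Constant) {
      // (C + X): addition commutes, so the constant may be on the left.
      acc += static_cast<uint64_t>(op->lhs->value);
      op = op->rhs;
    } else {
      return false;  // includes (C - X), which is not X plus a constant
    }
  }
  return false;
}
```

For example, `((foo + 8) + 4)` folds to `foo` with offset 12, the same total the tests in this revision expect for `foo[1][1]`.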

Diff Detail

Repository

rG LLVM Github Monorepo

Build Status

Buildable 31686
Build 31685: arc lint + arc unit

Event Timeline

nickdesaulniers created this revision. May 4 2019, 11:03 PM

Herald added a project: Restricted Project. May 4 2019, 11:03 PM

Herald added subscribers: llvm-commits, jsji, hiraditya and 3 others.

Harbormaster completed remote builds in B31413: Diff 198160. May 4 2019, 11:03 PM

git-clang-format HEAD~

Harbormaster completed remote builds in B31414: Diff 198161. May 4 2019, 11:07 PM

nickdesaulniers added a subscriber: jyknight. May 4 2019, 11:41 PM

nickdesaulniers added subscribers: kees, E5ten. May 6 2019, 9:32 AM

add test case for arm32

Harbormaster completed remote builds in B31459: Diff 198296. May 6 2019, 9:55 AM

t.p.northover added a subscriber: t.p.northover. May 7 2019, 3:22 AM

t.p.northover added inline comments.

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
3488	I think this needs to be `uint64_t` since the arithmetic SDNodes naturally have 2s-complement behaviour. The final value will still be interpreted in a signed manner, but intermediate calculations need to be defined when wrapping.
3500	With the above suggestion, I think you'll need to make sure this extension happens properly. You probably do anyway: the code below for combining constants can do strange things to the high bits if the pointer size is less than i64.
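
As a small illustration of why an unsigned accumulator is suggested here: unsigned arithmetic wraps modulo 2^64 by definition, whereas overflowing an `int64_t` is undefined behaviour in C++. The helper names `accumulateOffsets` and `asSigned` below are made up for this sketch.

```cpp
#include <cstdint>

// Sums signed deltas in an unsigned 64-bit accumulator. Every intermediate
// step is well defined because unsigned arithmetic wraps modulo 2^64.
uint64_t accumulateOffsets(const int64_t *deltas, int count) {
  uint64_t acc = 0;
  for (int i = 0; i < count; ++i)
    acc += static_cast<uint64_t>(deltas[i]);
  return acc;
}

// Reinterprets the final value as signed. The sums used in this example fit
// in int64_t, so the conversion is value preserving.
int64_t asSigned(uint64_t v) { return static_cast<int64_t>(v); }
```

With the deltas `{INT64_MAX, 1, -2}`, the running total briefly exceeds `INT64_MAX` (which would be undefined with a signed accumulator), yet the final result is the expected `INT64_MAX - 1`.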

prefer uint64_t, as per @t.p.northover

Harbormaster completed remote builds in B31632: Diff 198696. May 8 2019, 10:30 AM

nickdesaulniers marked 2 inline comments as done. May 8 2019, 10:32 AM

nickdesaulniers added inline comments.

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
3500	Thanks for the review. I'm sorry, but I don't fully understand this comment. Can you please provide more information or clarification? Are you suggesting maybe returning early if `Offset + C->getSExtValue()` is greater than `INT_MAX`?

nickdesaulniers added a reviewer: t.p.northover. May 8 2019, 10:33 AM

nickdesaulniers marked an inline comment as not done.

t.p.northover added inline comments. May 8 2019, 1:50 PM

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
3500	All arithmetic is performed at 64 bits, and the final result is provided at that width. But if the original computation is 32 bits wide, I think bad things might happen. Consider `(add (sub X:i32, 1), -1)`: I think that leaves `Offset` as 0xfffffffe because of the odd combination of zero-extending the input and multiplying it by -1 for `ISD::SUB`. Effectively, some -1s are at 32 bits and some are at 64 bits. So I think you need to do a manual extension of `Offset` from `Op.getValueType().getSizeInBits()` (not looked up the API, but you get the idea) to i64 before creating the constant.
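
The arithmetic in this comment can be reproduced with plain integers. The sketch below assumes the constants are i32 values, that the root `add` is visited before the inner `sub`, and that the sign is applied by multiplying with -1 in 64-bit unsigned arithmetic; the function names are invented for illustration.

```cpp
#include <cstdint>

// Widen an i32 constant to 64 bits, either zero- or sign-extending.
uint64_t zext32(int32_t v) {
  return static_cast<uint64_t>(static_cast<uint32_t>(v));
}
uint64_t sext32(int32_t v) {
  return static_cast<uint64_t>(static_cast<int64_t>(v));
}

// Offset accumulated for (add (sub X, subC), addC) when constants are
// zero-extended: the root add contributes +addC, the sub contributes -subC.
uint64_t offsetWithZExt(int32_t subC, int32_t addC) {
  uint64_t offset = 0;
  offset += zext32(addC);
  offset += static_cast<uint64_t>(-1) * zext32(subC);
  return offset;
}

// Same walk, but with sign extension of each constant.
uint64_t offsetWithSExt(int32_t subC, int32_t addC) {
  uint64_t offset = 0;
  offset += sext32(addC);
  offset += static_cast<uint64_t>(-1) * sext32(subC);
  return offset;
}
```

For `subC = 1` and `addC = -1`, zero extension gives 0x00000000ffffffff - 1 = 0xfffffffe, while sign extension gives 0xfffffffffffffffe, which is -2 when read as a signed 64-bit value.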

nickdesaulniers added inline comments. May 8 2019, 2:37 PM

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
3500	> I think that leaves Offset as 0xfffffffe
	Can you please walk me through the arithmetic on that (I might be mixing up my 2's complement arithmetic, or order of operations)?

nickdesaulniers marked an inline comment as done. May 8 2019, 3:31 PM

nickdesaulniers added inline comments.

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
3500	Also, it looks like `SelectionDAG::FoldConstantArithmetic()` could help here.

nickdesaulniers marked an inline comment as not done. May 8 2019, 4:52 PM

nickdesaulniers added inline comments.

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
3500	OK, I agree that we should be sign-extending, not zero-extending. `SelectionDAG::FoldSymbolOffset` seems to agree (or have the same misunderstanding). @srhines also points out that I dropped the commutative handling of the pre-existing code, so I'll add that back in.

check both operands for the constant, use SExt

nickdesaulniers marked 7 inline comments as done. May 8 2019, 5:28 PM

nickdesaulniers added inline comments.

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
3500	OK, I've switched to SExt from ZExt (as per `SelectionDAG::FoldSymbolOffset`). @t.p.northover can you please triple check?

Harbormaster completed remote builds in B31646: Diff 198746. May 8 2019, 5:30 PM

nickdesaulniers marked 2 inline comments as done. May 8 2019, 5:40 PM

nickdesaulniers added inline comments.

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
3494	Maybe I should initialize `C` to `nullptr` and then keep the `C ? SDLoc(C) : SDLoc()`? I'm not very familiar with `SDLoc`, but it seems that I'm potentially messing up debug info. Should the previous code have been constructing empty `SDLoc`s (as opposed to `SDLoc(GA)`)?

nickdesaulniers marked an inline comment as not done. May 8 2019, 5:40 PM

t.p.northover added inline comments. May 9 2019, 2:11 AM

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
3500	Yep, I think that bit's OK now.
3509	Subtraction is not commutative, so you need to be more careful here.
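
The point about subtraction can be checked numerically. In this sketch (helper names are invented, and it makes no claim about the operand order used in the patch), the offset relative to the symbol is computed for both operand orders, using arbitrary example addresses for the symbol.

```cpp
#include <cstdint>

// Offset of (sym - c) relative to sym, modulo 2^64. This is always -c,
// no matter where the symbol ends up in memory.
uint64_t offsetOfSymMinusConst(uint64_t symAddr, uint64_t c) {
  return (symAddr - c) - symAddr;
}

// "Offset" of (c - sym) relative to sym. It depends on the symbol's address,
// so it cannot be expressed as sym plus a fixed constant.
uint64_t offsetOfConstMinusSym(uint64_t symAddr, uint64_t c) {
  return (c - symAddr) - symAddr;
}
```

With `c = 8`, `offsetOfSymMinusConst` returns the same value for a symbol at 0x1000 and at 0x2000, while `offsetOfConstMinusSym` returns different values. Only the `(sym - c)` form is therefore foldable into a symbol-plus-offset operand.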

subtraction is not commutative

Harbormaster completed remote builds in B31686: Diff 198849. May 9 2019, 9:28 AM

nickdesaulniers marked 3 inline comments as done. May 9 2019, 9:28 AM

nickdesaulniers added inline comments.

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
3509	Good catch, thank you. How's that look? Any thoughts on my SDLoc comment, above?

I think it looks reasonable now. Thanks for updating it!

This revision is now accepted and ready to land. May 10 2019, 4:12 AM

Closed by commit rL360604: [TargetLowering] Handle multi depth GEPs w/ inline asm constraints (authored by nickdesaulniers). May 13 2019, 10:26 AM

This revision was automatically updated to reflect the committed changes.

nickdesaulniers marked an inline comment as done.

Diff 198849

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp

     if (Op.getOpcode() == ISD::BasicBlock ||
         Op.getOpcode() == ISD::TargetBlockAddress) {
       Ops.push_back(Op);
       return;
     }
     LLVM_FALLTHROUGH;
   case 'i': // Simple Integer or Relocatable Constant
   case 'n': // Simple Integer
   case 's': { // Relocatable Constant
-    // These operands are interested in values of the form (GV+C), where C may
-    // be folded in as an offset of GV, or it may be explicitly added. Also, it
-    // is possible and fine if either GV or C are missing.
-    ConstantSDNode *C = dyn_cast<ConstantSDNode>(Op);
-    GlobalAddressSDNode *GA = dyn_cast<GlobalAddressSDNode>(Op);
-
-    // If we have "(add GV, C)", pull out GV/C
-    if (Op.getOpcode() == ISD::ADD) {
-      C = dyn_cast<ConstantSDNode>(Op.getOperand(1));
-      GA = dyn_cast<GlobalAddressSDNode>(Op.getOperand(0));
-      if (!C || !GA) {
-        C = dyn_cast<ConstantSDNode>(Op.getOperand(0));
-        GA = dyn_cast<GlobalAddressSDNode>(Op.getOperand(1));
-      }
-      if (!C || !GA) {
-        C = nullptr;
-        GA = nullptr;
-      }
-    }
-
-    // If we find a valid operand, map to the TargetXXX version so that the
-    // value itself doesn't get selected.
-    if (GA) { // Either &GV or &GV+C
-      if (ConstraintLetter != 'n') {
-        int64_t Offs = GA->getOffset();
-        if (C) Offs += C->getZExtValue();
-        Ops.push_back(DAG.getTargetGlobalAddress(GA->getGlobal(),
-                                                 C ? SDLoc(C) : SDLoc(),
-                                                 Op.getValueType(), Offs));
-      }
-      return;
-    }
-    if (C) { // just C, no GV.
-      // Simple constants are not allowed for 's'.
-      if (ConstraintLetter != 's') {
-        // gcc prints these as sign extended. Sign extend value to 64 bits
-        // now; without this it would get ZExt'd later in
-        // ScheduleDAGSDNodes::EmitNode, which is very generic.
-        Ops.push_back(DAG.getTargetConstant(C->getSExtValue(),
-                                            SDLoc(C), MVT::i64));
-      }
-      return;
-    }
+    GlobalAddressSDNode *GA;
+    ConstantSDNode *C;
+    uint64_t Offset = 0;
+
+    // Match (GA) or (C) or (GA+C) or (GA-C) or ((GA+C)+C) or (((GA+C)+C)+C),
+    // etc., since getelementpointer is variadic. We can't use
+    // SelectionDAG::FoldSymbolOffset because it expects the GA to be accessible
+    // while in this case the GA may be furthest from the root node which is
+    // likely an ISD::ADD.
+    while (1) {
+      if ((GA = dyn_cast<GlobalAddressSDNode>(Op)) && ConstraintLetter != 'n') {
+        Ops.push_back(DAG.getTargetGlobalAddress(GA->getGlobal(), SDLoc(Op),
+                                                 GA->getValueType(0),
+                                                 Offset + GA->getOffset()));
+        return;
+      } else if ((C = dyn_cast<ConstantSDNode>(Op)) &&
+                 ConstraintLetter != 's') {
+        Ops.push_back(DAG.getTargetConstant(Offset + C->getSExtValue(),
+                                            SDLoc(C), MVT::i64));
+        return;
+      } else {
+        const unsigned OpCode = Op.getOpcode();
+        if (OpCode == ISD::ADD || OpCode == ISD::SUB) {
+          if ((C = dyn_cast<ConstantSDNode>(Op.getOperand(0))))
+            Op = Op.getOperand(1);
+          // Subtraction is not commutative.
+          else if (OpCode == ISD::ADD &&
+                   (C = dyn_cast<ConstantSDNode>(Op.getOperand(1))))
+            Op = Op.getOperand(0);
+          else
+            return;
+          Offset += (OpCode == ISD::ADD ? 1 : -1) * C->getSExtValue();
+          continue;
+        }
+      }
+      return;
+    }
     break;
   }
   }
 }

llvm/lib/Target/X86/X86ISelLowering.cpp

       // In any sort of PIC mode addresses need to be computed at runtime by
       // adding in a register or some sort of table lookup. These can't
       // be used as immediates.
       if (Subtarget.isPICStyleGOT() || Subtarget.isPICStyleStubPIC())
         return;

       // If we are in non-pic codegen mode, we allow the address of a global (with
       // an optional displacement) to be used with 'i'.
-      GlobalAddressSDNode *GA = nullptr;
-      int64_t Offset = 0;
-
-      // Match either (GA), (GA+C), (GA+C1+C2), etc.
-      while (1) {
-        if ((GA = dyn_cast<GlobalAddressSDNode>(Op))) {
-          Offset += GA->getOffset();
-          break;
-        } else if (Op.getOpcode() == ISD::ADD) {
-          if (ConstantSDNode *C = dyn_cast<ConstantSDNode>(Op.getOperand(1))) {
-            Offset += C->getZExtValue();
-            Op = Op.getOperand(0);
-            continue;
-          }
-        } else if (Op.getOpcode() == ISD::SUB) {
-          if (ConstantSDNode *C = dyn_cast<ConstantSDNode>(Op.getOperand(1))) {
-            Offset += -C->getZExtValue();
-            Op = Op.getOperand(0);
-            continue;
-          }
-        }
-
-        // Otherwise, this isn't something we can handle, reject it.
-        return;
-      }
-
-      const GlobalValue *GV = GA->getGlobal();
-      // If we require an extra load to get this address, as in PIC mode, we
-      // can't accept it.
-      if (isGlobalStubReference(Subtarget.classifyGlobalReference(GV)))
-        return;
-
-      Result = DAG.getTargetGlobalAddress(GV, SDLoc(Op),
-                                          GA->getValueType(0), Offset);
+      if (auto *GA = dyn_cast<GlobalAddressSDNode>(Op))
+        // If we require an extra load to get this address, as in PIC mode, we
+        // can't accept it.
+        if (isGlobalStubReference(
+                Subtarget.classifyGlobalReference(GA->getGlobal())))
+          return;
       break;
     }
     }

     if (Result.getNode()) {
       Ops.push_back(Result);
       return;
     }

llvm/test/CodeGen/AArch64/inline-asm-multilevel-gep.ll

This file was added.

; RUN: llc < %s -mtriple aarch64-gnu-linux | FileCheck %s

; @foo is a 2d array of i32s, ex.
; i32 foo [2][2]
@foo = internal global [2 x [2 x i32]] zeroinitializer, align 4

define void @bar() {
; access foo[1][1]
; CHECK: // foo+12
  tail call void asm sideeffect "// ${0:c}", "i"(i32* getelementptr inbounds ([2 x [2 x i32]], [2 x [2 x i32]]* @foo, i64 0, i64 1, i64 1))
  ret void
}

llvm/test/CodeGen/ARM/inline-asm-multilevel-gep.ll

This file was added.

; RUN: llc < %s -mtriple armv7-linux-gnueabi | FileCheck %s

; @foo is a 2d array of i32s, ex.
; i32 foo [2][2]
@foo = internal global [2 x [2 x i32]] zeroinitializer, align 4

define void @bar() {
; access foo[1][1]
; CHECK: @ foo+12
  tail call void asm sideeffect "@ ${0:c}", "i"(i32* getelementptr inbounds ([2 x [2 x i32]], [2 x [2 x i32]]* @foo, i64 0, i64 1, i64 1))
  ret void
}

llvm/test/CodeGen/PowerPC/inline-asm-multilevel-gep.ll

This file was added.

; RUN: llc < %s -mtriple ppc32-- | FileCheck %s

; @foo is a 2d array of i32s, ex.
; i32 foo [2][2]
@foo = internal global [2 x [2 x i32]] zeroinitializer, align 4

define void @bar() {
; access foo[1][1]
; CHECK: # foo+12
  tail call void asm sideeffect "# ${0:c}", "i"(i32* getelementptr inbounds ([2 x [2 x i32]], [2 x [2 x i32]]* @foo, i64 0, i64 1, i64 1))
  ret void
}

llvm/test/CodeGen/X86/inline-asm-multilevel-gep.ll

This file was added.

; RUN: llc < %s -mtriple x86_64-gnu-linux | FileCheck %s

; @foo is a 2d array of i32s, ex.
; i32 foo [2][2]
@foo = internal global [2 x [2 x i32]] zeroinitializer, align 4

define void @bar() {
; access foo[1][1]
; CHECK: # foo+12
  tail call void asm sideeffect "# ${0:c}", "i"(i32* getelementptr inbounds ([2 x [2 x i32]], [2 x [2 x i32]]* @foo, i64 0, i64 1, i64 1))
  ret void
}
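
All four tests above expect `foo+12`. The following sketch shows where 12 comes from, assuming 4-byte `i32` elements and row-major layout for `[2 x [2 x i32]]`; `gepByteOffset` and `nativeOffset` are names made up for this illustration.

```cpp
#include <cstdint>

// Byte offset for the GEP indices (whole, row, col) into [2 x [2 x i32]].
// Each index is scaled by the size of the type it steps over.
uint64_t gepByteOffset(uint64_t whole, uint64_t row, uint64_t col) {
  const uint64_t elemSize = 4;               // i32
  const uint64_t rowSize = 2 * elemSize;     // [2 x i32]
  const uint64_t arraySize = 2 * rowSize;    // [2 x [2 x i32]]
  return whole * arraySize + row * rowSize + col * elemSize;
}

// Cross-check against the layout of an equivalent C++ array.
uint64_t nativeOffset() {
  static int32_t foo[2][2];
  return static_cast<uint64_t>(reinterpret_cast<const char *>(&foo[1][1]) -
                               reinterpret_cast<const char *>(&foo[0][0]));
}
```

For the indices `i64 0, i64 1, i64 1` used in the tests, this is 0 * 16 + 1 * 8 + 1 * 4 = 12. The sum has two non-zero terms, which is the kind of multi-level offset this revision is about.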

This is an archive of the discontinued LLVM Phabricator instance.

[TargetLowering] Handle multi depth GEPs w/ inline asm constraints
ClosedPublic
