This is an archive of the discontinued LLVM Phabricator instance.


Differential D69034

[Alignment] Use Align for TFI.getStackAlignment() in X86ISelLowering
ClosedPublic

Authored by gchatelet on Oct 16 2019, 6:35 AM.

Details

Reviewers

courbet
craig.topper
rnk

Commits

rG119b436da1c0: [Alignment] Use Align for TFI.getStackAlignment() in X86ISelLowering

Summary

This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

gchatelet created this revision.Oct 16 2019, 6:35 AM

Herald added a project: Restricted Project.Oct 16 2019, 6:36 AM

Herald added subscribers: llvm-commits, hiraditya.

Harbormaster completed remote builds in B39639: Diff 225208.Oct 16 2019, 6:36 AM

courbet requested changes to this revision.Oct 16 2019, 7:01 AM

courbet added inline comments.

llvm/lib/Target/X86/X86ISelLowering.cpp
4186–4192	This is not the same as the previous code for all values: https://godbolt.org/z/QTxj9U Could you at least add an assert if you don't intend to cover the same range of inputs?
22264	Let's refactor this in `Align` (in a separate patch).

This revision now requires changes to proceed.Oct 16 2019, 7:01 AM

gchatelet added a reviewer: craig.topper.Oct 18 2019, 7:08 AM

gchatelet marked 4 inline comments as done.Oct 18 2019, 7:15 AM

gchatelet added inline comments.

llvm/lib/Target/X86/X86ISelLowering.cpp
4186–4192	I traced the addition of this code back to rL42870 (2007). Since the author is probably not working on this any more, I'm redirecting to @craig.topper (from `CODE_OWNERS.TXT`). The original code can be rewritten as:

  unsigned GetAlignedArgumentStackSize(uint64_t StackSize,
                                       uint64_t StackAlignment,
                                       uint64_t SlotSize) {
    const uint64_t AlignMask = StackAlignment - 1;
    const uint64_t lowBits = StackSize & AlignMask;
    const uint64_t hiBits = StackSize & ~AlignMask;
    if (lowBits <= (StackAlignment - SlotSize))
      return StackSize - lowBits + StackAlignment - SlotSize;
    return StackAlignment + hiBits + StackAlignment - SlotSize;
  }

I believe the last line is wrong, or at least I can't wrap my head around it. Interestingly, the last line is never exercised in the tests, so I believe it is dead code. As such, I believe the patch cannot be `NFC` anymore.
22264	SGTM
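
The expression discussed at line 22264 replaces `-(uint64_t)Align` with `~(Alignment->value() - 1ULL)`. The following is a minimal sketch of why the two masks are interchangeable. The function names are hypothetical, and a plain `uint64_t` stands in for the `Align` type.

```cpp
#include <cassert>
#include <cstdint>

// Old form of the mask. Written as 0 - Alignment, which yields the same
// value as -(uint64_t)Alignment under unsigned (modular) arithmetic.
constexpr uint64_t maskByNegation(uint64_t Alignment) {
  return uint64_t{0} - Alignment;
}

// New form of the mask, as it appears in the patch.
constexpr uint64_t maskByComplement(uint64_t Alignment) {
  return ~(Alignment - 1ULL);
}

// Hypothetical helper: rounds Address down to a multiple of Alignment.
// Assumes Alignment is a power of two.
constexpr uint64_t alignDown(uint64_t Address, uint64_t Alignment) {
  return Address & maskByComplement(Alignment);
}
```

For an alignment of 16, both masks clear the low four bits, so `alignDown(0x1237, 16)` gives `0x1230`.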

gchatelet retitled this revision from [Alignment][NFC] Use Align for TFI.getStackAlignment() in X86ISelLowering to [Alignment] Use Align for TFI.getStackAlignment() in X86ISelLowering.Oct 18 2019, 7:40 AM

gchatelet marked 3 inline comments as done.Oct 25 2019, 1:52 PM

A gentle ping @craig.topper

@rnk do you know much about this code?

In D69034#1725662, @craig.topper wrote:

@rnk do you know much about this code?

Not previously, but from looking at it, I'd say the simplification is fine. This code is only called from the guaranteed TCO codepath, so I'm not surprised it's lightly tested.

llvm/lib/Target/X86/X86ISelLowering.cpp
4186–4192	I think the code only differs when StackAlignment is less than SlotSize, which should never happen in practice. I suspect we rely on that invariant in other places. I edited the godbolt program to avoid such test cases and now the algorithms always produce the same values: https://godbolt.org/z/MqPErL And, the only way to exercise the second case in the original code would be if StackSize is not a multiple of SlotSize, which seems unlikely. I suspect we align StackSize to SlotSize somewhere else. If you wanted to add test coverage for this case, I'd experiment with multiple one byte allocas. In the end, I think it's safe to take your simplification.
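
The equivalence described in this comment can be checked with a small standalone sketch. The names `oldAlignedSize`, `newAlignedSize`, `formulasAgree`, and the local `alignToPow2` helper (a stand-in for `llvm::alignTo`, assuming a power-of-two alignment) are hypothetical and exist only for illustration. The real code reads the stack alignment and slot size from the subtarget.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for llvm::alignTo; assumes a power-of-two Alignment.
constexpr uint64_t alignToPow2(uint64_t Value, uint64_t Alignment) {
  return (Value + Alignment - 1) & ~(Alignment - 1);
}

// Pre-patch logic, following the uint64_t rewrite quoted in the review.
constexpr uint64_t oldAlignedSize(uint64_t StackSize, uint64_t StackAlignment,
                                  uint64_t SlotSize) {
  const uint64_t AlignMask = StackAlignment - 1;
  const uint64_t LowBits = StackSize & AlignMask;
  const uint64_t HiBits = StackSize & ~AlignMask;
  if (LowBits <= StackAlignment - SlotSize)
    return StackSize - LowBits + StackAlignment - SlotSize;
  return StackAlignment + HiBits + StackAlignment - SlotSize;
}

// Post-patch logic: align (StackSize + SlotSize), then subtract SlotSize.
constexpr uint64_t newAlignedSize(uint64_t StackSize, uint64_t StackAlignment,
                                  uint64_t SlotSize) {
  return alignToPow2(StackSize + SlotSize, StackAlignment) - SlotSize;
}

// Returns true if both formulas agree for every StackSize below Limit.
// Precondition: SlotSize <= StackAlignment, and both are powers of two.
inline bool formulasAgree(uint64_t StackAlignment, uint64_t SlotSize,
                          uint64_t Limit) {
  for (uint64_t StackSize = 0; StackSize < Limit; ++StackSize)
    if (oldAlignedSize(StackSize, StackAlignment, SlotSize) !=
        newAlignedSize(StackSize, StackAlignment, SlotSize))
      return false;
  return true;
}
```

With a 16-byte alignment and 4-byte slots, both functions return 12 for a StackSize of 0 and 28 for a StackSize of 16, which matches the "16n + 12" comment in the source. The second branch of the old code is reached only for sizes that are not a multiple of the slot size, such as 13.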

Add an assert to make the invariant clear

Harbormaster completed remote builds in B40259: Diff 227051.Oct 30 2019, 2:21 AM

gchatelet marked 2 inline comments as done.Oct 30 2019, 2:24 AM

gchatelet added inline comments.

llvm/lib/Target/X86/X86ISelLowering.cpp
4186–4192	Thx a lot @rnk . I've added an assert to make the invariant clear.

This revision was not accepted when it landed; it landed in state Needs Review.Oct 30 2019, 2:40 AM

Closed by commit rG119b436da1c0: [Alignment] Use Align for TFI.getStackAlignment() in X86ISelLowering (authored by gchatelet).

This revision was automatically updated to reflect the committed changes.

gchatelet marked an inline comment as done.

Revision Contents

Path	Size
llvm/lib/Target/X86/X86ISelLowering.cpp	44 lines

Diff 227053

llvm/lib/Target/X86/X86ISelLowering.cpp

// (possible EBP)		// (possible EBP)
// ESI		// ESI
// EDI		// EDI
// local1 ..		// local1 ..

/// Make the stack size align e.g 16n + 12 aligned for a 16-byte align		/// Make the stack size align e.g 16n + 12 aligned for a 16-byte align
/// requirement.		/// requirement.
unsigned		unsigned
X86TargetLowering::GetAlignedArgumentStackSize(unsigned StackSize,		X86TargetLowering::GetAlignedArgumentStackSize(const unsigned StackSize,
SelectionDAG& DAG) const {		SelectionDAG &DAG) const {
const X86RegisterInfo *RegInfo = Subtarget.getRegisterInfo();		const Align StackAlignment(Subtarget.getFrameLowering()->getStackAlignment());
const TargetFrameLowering &TFI = *Subtarget.getFrameLowering();		const uint64_t SlotSize = Subtarget.getRegisterInfo()->getSlotSize();
unsigned StackAlignment = TFI.getStackAlignment();		assert(StackSize % SlotSize == 0 &&
uint64_t AlignMask = StackAlignment - 1;		"StackSize must be a multiple of SlotSize");
int64_t Offset = StackSize;		return alignTo(StackSize + SlotSize, StackAlignment) - SlotSize;
unsigned SlotSize = RegInfo->getSlotSize();
if ( (Offset & AlignMask) <= (StackAlignment - SlotSize) ) {
// Number smaller than 12 so just add the difference.
Offset += ((StackAlignment - SlotSize) - (Offset & AlignMask));
} else {
// Mask out lower bits, add stackalignment once plus the 12 bytes.
Offset = ((~AlignMask) & Offset) + StackAlignment +
(StackAlignment-SlotSize);
}
return Offset;
}		}

/// Return true if the given stack call argument is already available in the		/// Return true if the given stack call argument is already available in the
/// same position (relatively) of the caller's incoming argument stack.		/// same position (relatively) of the caller's incoming argument stack.
static		static
bool MatchingStackOffset(SDValue Arg, unsigned Offset, ISD::ArgFlagsTy Flags,		bool MatchingStackOffset(SDValue Arg, unsigned Offset, ISD::ArgFlagsTy Flags,
MachineFrameInfo &MFI, const MachineRegisterInfo *MRI,		MachineFrameInfo &MFI, const MachineRegisterInfo *MRI,
const X86InstrInfo *TII, const CCValAssign &VA) {		const X86InstrInfo *TII, const CCValAssign &VA) {
[...]	X86TargetLowering::LowerDYNAMIC_STACKALLOC(SDValue Op,
bool Lower = (Subtarget.isOSWindows() && !Subtarget.isTargetMachO()) ||		bool Lower = (Subtarget.isOSWindows() && !Subtarget.isTargetMachO()) ||
SplitStack || EmitStackProbe;		SplitStack || EmitStackProbe;
SDLoc dl(Op);		SDLoc dl(Op);

// Get the inputs.		// Get the inputs.
SDNode *Node = Op.getNode();		SDNode *Node = Op.getNode();
SDValue Chain = Op.getOperand(0);		SDValue Chain = Op.getOperand(0);
SDValue Size = Op.getOperand(1);		SDValue Size = Op.getOperand(1);
unsigned Align = Op.getConstantOperandVal(2);		MaybeAlign Alignment(Op.getConstantOperandVal(2));
EVT VT = Node->getValueType(0);		EVT VT = Node->getValueType(0);

// Chain the dynamic stack allocation so that it doesn't modify the stack		// Chain the dynamic stack allocation so that it doesn't modify the stack
// pointer when other instructions are using the stack.		// pointer when other instructions are using the stack.
Chain = DAG.getCALLSEQ_START(Chain, 0, 0, dl);		Chain = DAG.getCALLSEQ_START(Chain, 0, 0, dl);

bool Is64Bit = Subtarget.is64Bit();		bool Is64Bit = Subtarget.is64Bit();
MVT SPTy = getPointerTy(DAG.getDataLayout());		MVT SPTy = getPointerTy(DAG.getDataLayout());

SDValue Result;		SDValue Result;
if (!Lower) {		if (!Lower) {
const TargetLowering &TLI = DAG.getTargetLoweringInfo();		const TargetLowering &TLI = DAG.getTargetLoweringInfo();
unsigned SPReg = TLI.getStackPointerRegisterToSaveRestore();		unsigned SPReg = TLI.getStackPointerRegisterToSaveRestore();
assert(SPReg && "Target cannot require DYNAMIC_STACKALLOC expansion and"		assert(SPReg && "Target cannot require DYNAMIC_STACKALLOC expansion and"
" not tell us which reg is the stack pointer!");		" not tell us which reg is the stack pointer!");

SDValue SP = DAG.getCopyFromReg(Chain, dl, SPReg, VT);		SDValue SP = DAG.getCopyFromReg(Chain, dl, SPReg, VT);
Chain = SP.getValue(1);		Chain = SP.getValue(1);
const TargetFrameLowering &TFI = *Subtarget.getFrameLowering();		const TargetFrameLowering &TFI = *Subtarget.getFrameLowering();
unsigned StackAlign = TFI.getStackAlignment();		const Align StackAlign(TFI.getStackAlignment());
Result = DAG.getNode(ISD::SUB, dl, VT, SP, Size); // Value		Result = DAG.getNode(ISD::SUB, dl, VT, SP, Size); // Value
if (Align > StackAlign)		if (Alignment && Alignment > StackAlign)
Result = DAG.getNode(ISD::AND, dl, VT, Result,		Result =
DAG.getConstant(-(uint64_t)Align, dl, VT));		DAG.getNode(ISD::AND, dl, VT, Result,
		DAG.getConstant(~(Alignment->value() - 1ULL), dl, VT));
Chain = DAG.getCopyToReg(Chain, dl, SPReg, Result); // Output chain		Chain = DAG.getCopyToReg(Chain, dl, SPReg, Result); // Output chain
} else if (SplitStack) {		} else if (SplitStack) {
MachineRegisterInfo &MRI = MF.getRegInfo();		MachineRegisterInfo &MRI = MF.getRegInfo();

if (Is64Bit) {		if (Is64Bit) {
// The 64 bit implementation of segmented stacks needs to clobber both r10		// The 64 bit implementation of segmented stacks needs to clobber both r10
// r11. This makes it impossible to use it along with nested parameters.		// r11. This makes it impossible to use it along with nested parameters.
const Function &F = MF.getFunction();		const Function &F = MF.getFunction();
[...]	if (!Lower) {
Chain = DAG.getNode(X86ISD::WIN_ALLOCA, dl, NodeTys, Chain, Size);		Chain = DAG.getNode(X86ISD::WIN_ALLOCA, dl, NodeTys, Chain, Size);
MF.getInfo<X86MachineFunctionInfo>()->setHasWinAlloca(true);		MF.getInfo<X86MachineFunctionInfo>()->setHasWinAlloca(true);

const X86RegisterInfo *RegInfo = Subtarget.getRegisterInfo();		const X86RegisterInfo *RegInfo = Subtarget.getRegisterInfo();
Register SPReg = RegInfo->getStackRegister();		Register SPReg = RegInfo->getStackRegister();
SDValue SP = DAG.getCopyFromReg(Chain, dl, SPReg, SPTy);		SDValue SP = DAG.getCopyFromReg(Chain, dl, SPReg, SPTy);
Chain = SP.getValue(1);		Chain = SP.getValue(1);

if (Align) {		if (Alignment) {
SP = DAG.getNode(ISD::AND, dl, VT, SP.getValue(0),		SP = DAG.getNode(ISD::AND, dl, VT, SP.getValue(0),
DAG.getConstant(-(uint64_t)Align, dl, VT));		DAG.getConstant(~(Alignment->value() - 1ULL), dl, VT));
Chain = DAG.getCopyToReg(Chain, dl, SPReg, SP);		Chain = DAG.getCopyToReg(Chain, dl, SPReg, SP);
}		}

Result = SP;		Result = SP;
}		}

Chain = DAG.getCALLSEQ_END(Chain, DAG.getIntPtrConstant(0, dl, true),		Chain = DAG.getCALLSEQ_END(Chain, DAG.getIntPtrConstant(0, dl, true),
DAG.getIntPtrConstant(0, dl, true), SDValue(), dl);		DAG.getIntPtrConstant(0, dl, true), SDValue(), dl);
[...]	FLT_ROUNDS, on the other hand, expects the following:
3 Round to -inf		3 Round to -inf

To perform the conversion, we do:		To perform the conversion, we do:
(((((FPSR & 0x800) >> 11) | ((FPSR & 0x400) >> 9)) + 1) & 3)		(((((FPSR & 0x800) >> 11) | ((FPSR & 0x400) >> 9)) + 1) & 3)
*/		*/

MachineFunction &MF = DAG.getMachineFunction();		MachineFunction &MF = DAG.getMachineFunction();
const TargetFrameLowering &TFI = *Subtarget.getFrameLowering();		const TargetFrameLowering &TFI = *Subtarget.getFrameLowering();
unsigned StackAlignment = TFI.getStackAlignment();		const Align StackAlignment(TFI.getStackAlignment());
MVT VT = Op.getSimpleValueType();		MVT VT = Op.getSimpleValueType();
SDLoc DL(Op);		SDLoc DL(Op);

// Save FP Control Word to stack slot		// Save FP Control Word to stack slot
int SSFI = MF.getFrameInfo().CreateStackObject(2, StackAlignment, false);		int SSFI =
		MF.getFrameInfo().CreateStackObject(2, StackAlignment.value(), false);
SDValue StackSlot =		SDValue StackSlot =
DAG.getFrameIndex(SSFI, getPointerTy(DAG.getDataLayout()));		DAG.getFrameIndex(SSFI, getPointerTy(DAG.getDataLayout()));

MachineMemOperand *MMO =		MachineMemOperand *MMO =
MF.getMachineMemOperand(MachinePointerInfo::getFixedStack(MF, SSFI),		MF.getMachineMemOperand(MachinePointerInfo::getFixedStack(MF, SSFI),
MachineMemOperand::MOStore, 2, 2);		MachineMemOperand::MOStore, 2, 2);

SDValue Ops[] = { DAG.getEntryNode(), StackSlot };		SDValue Ops[] = { DAG.getEntryNode(), StackSlot };