This is an archive of the discontinued LLVM Phabricator instance.

Differential D57386

[SelectionDAG] Codesize: don't expand SHIFT to SHIFT_PARTS
Closed, Public

Authored by SjoerdMeijer on Jan 29 2019, 7:27 AM.

Details

Reviewers

samparker
efriedma
craig.topper
RKSimon
t.p.northover

Commits

rGf7cc34cae890: [SelectionDAG] Codesize: don't expand SHIFT to SHIFT_PARTS
rL352736: [SelectionDAG] Codesize: don't expand SHIFT to SHIFT_PARTS

Summary

When optimising for code size, don't expand SHIFT to SHIFT_PARTS and instead just generate a libcall. My motivating example on ARM was a simple:

  
shl i64 %A, %B

for which the code bloat is quite significant. For other targets that also
accept __int128/i128, such as AArch64 and X86, it also seems beneficial to
generate a libcall in these cases when optimising for minsize. On these 64-bit
targets, 64-bit shifts are of course unaffected because the SHIFT/SHIFT_PARTS
operation action is not set to custom/expand for them.
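
To make the size cost concrete, here is a rough sketch in plain C++ of what a variable 64-bit left shift has to compute when the value lives in two 32-bit halves, as on a 32-bit target. The names `Parts` and `shlParts` are hypothetical and are not LLVM APIs; the real SHL_PARTS lowering is target-specific and may look quite different.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical illustration only: a 64-bit value held as two 32-bit halves.
struct Parts {
  uint32_t Lo;
  uint32_t Hi;
};

// Shift the 64-bit value {Hi:Lo} left by Amt, assuming Amt < 64.
// Each case below corresponds to work that an inline expansion has to emit
// at every shift site; a libcall replaces all of it with a single call.
Parts shlParts(Parts V, unsigned Amt) {
  assert(Amt < 64 && "shift amount out of range");
  if (Amt == 0)
    return V; // avoids an undefined 32-bit shift by 32 below
  if (Amt >= 32)
    return {0u, V.Lo << (Amt - 32)}; // low half moves into the high half
  return {V.Lo << Amt, (V.Hi << Amt) | (V.Lo >> (32 - Amt))};
}
```

With a libcall, the compiler emits only the argument setup and one call per shift, which is the trade-off this patch makes under minsize.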

Diff Detail

Event Timeline

SjoerdMeijer created this revision. Jan 29 2019, 7:27 AM

Herald added subscribers: kristof.beyls, javed.absar. Jan 29 2019, 7:27 AM

efriedma added inline comments. Jan 29 2019, 12:46 PM

lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp:2775
The optForMinSize check should probably be in target-specific code; yes, the call is smaller on all the popular targets I can think of, but that's a function of the specific opcodes available, not a general rule.

Thanks for reviewing!

> The optForMinSize check should probably be in target-specific code

Agreed. I have created TLI.expandShift() to allow target-specific decision making.

lebedev.ri added a subscriber: lebedev.ri. Jan 30 2019, 5:19 AM

lebedev.ri added inline comments.

include/llvm/CodeGen/TargetLowering.h:648
It probably should be `shouldExpandShift` or something? Just `expandShift` reads as "call this function to expand the shift operation".

> It probably should be `shouldExpandShift` or something?

Yep, thanks, done.
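
The hook pattern discussed in this thread can be sketched with simplified stand-in types. This is only an illustration under stated assumptions: `FunctionInfo`, `TargetLoweringBase`, and `MinSizeAwareTarget` are hypothetical mock types, and the real hook in this revision is a `TargetLowering` method taking `SelectionDAG &` and `SDNode *`. Note that the diff shown on this page (Diff 184278) still uses the earlier name `expandShift`.

```cpp
// Hypothetical stand-ins for LLVM's function attributes and TargetLowering;
// these are simplified mock types, not the real classes.
struct FunctionInfo {
  bool MinSize = false; // models the 'minsize' function attribute
};

struct TargetLoweringBase {
  virtual ~TargetLoweringBase() = default;
  // Default: keep expanding shifts inline (SHIFT_PARTS), as before the patch.
  virtual bool shouldExpandShift(const FunctionInfo &F) const {
    (void)F;
    return true;
  }
};

// A target that prefers the library call when optimising for minimum size.
struct MinSizeAwareTarget : TargetLoweringBase {
  bool shouldExpandShift(const FunctionInfo &F) const override {
    return !F.MinSize;
  }
};
```

Keeping the default at `true` means targets that do not override the hook behave exactly as before, which matches the reviewer's point that the size trade-off is a per-target property.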

LGTM

This revision is now accepted and ready to land. Jan 30 2019, 11:28 AM

Closed by commit rL352736: [SelectionDAG] Codesize: don't expand SHIFT to SHIFT_PARTS (authored by SjoerdMeijer). Jan 31 2019, 12:09 AM

This revision was automatically updated to reflect the committed changes.

Hi, we recently found that this revision breaks the Linux kernel build (https://bugs.chromium.org/p/chromium/issues/detail?id=938985). Please advise us on how to solve it. Thanks!

Herald added a project: Restricted Project. Mar 8 2019, 10:57 AM

We generally expect that code built using clang will link against compiler-rt or libgcc, even when targeting a freestanding environment. We aren't going to restrict that to only use the subset of compiler-rt functions Linux 4.4 built with some other compiler would use.

Revision Contents

Path                                                 Size
include/llvm/CodeGen/TargetLowering.h                7 lines
lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp    10 lines
lib/Target/AArch64/AArch64ISelLowering.h             6 lines
lib/Target/ARM/ARMISelLowering.h                     6 lines
lib/Target/X86/X86ISelLowering.h                     6 lines
test/CodeGen/AArch64/shift_minsize.ll                122 lines
test/CodeGen/ARM/shift_minsize.ll                    32 lines
test/CodeGen/X86/shift_minsize.ll                    134 lines

Diff 184278

include/llvm/CodeGen/TargetLowering.h

@@ public:
   }

   /// Return the cost of the 'representative' register class for the specified
   /// value type.
   virtual uint8_t getRepRegClassCostFor(MVT VT) const {
     return RepRegClassCostForVT[VT.SimpleTy];
   }

+  /// Return true if SHIFT instructions should be expanded to SHIFT_PARTS
+  /// instructions, and false if a library call is preferred (e.g for code-size
+  /// reasons).
+  virtual bool expandShift(SelectionDAG &DAG, SDNode *N) const {
+    return true;
+  }
+
   /// Return true if the target has native support for the specified value type.
   /// This means that it has a register that directly holds it without
   /// promotions or expansions.
   bool isTypeLegal(EVT VT) const {
     assert(!VT.isSimple() ||
            (unsigned)VT.getSimpleVT().SimpleTy < array_lengthof(RegClassForVT));
     return VT.isSimple() && RegClassForVT[VT.getSimpleVT().SimpleTy] != nullptr;
   }

lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp

@@ void DAGTypeLegalizer::ExpandIntRes_Shift(SDNode *N,
   } else if (N->getOpcode() == ISD::SRL) {
     PartsOpc = ISD::SRL_PARTS;
   } else {
     assert(N->getOpcode() == ISD::SRA && "Unknown shift!");
     PartsOpc = ISD::SRA_PARTS;
   }

   // Next check to see if the target supports this SHL_PARTS operation or if it
-  // will custom expand it.
+  // will custom expand it. Don't lower this to SHL_PARTS when we optimise for
+  // size, but create a libcall instead.
   EVT NVT = TLI.getTypeToTransformTo(*DAG.getContext(), VT);
   TargetLowering::LegalizeAction Action = TLI.getOperationAction(PartsOpc, NVT);
-  if ((Action == TargetLowering::Legal && TLI.isTypeLegal(NVT)) ||
-      Action == TargetLowering::Custom) {
+  const bool LegalOrCustom =
+    (Action == TargetLowering::Legal && TLI.isTypeLegal(NVT)) ||
+    Action == TargetLowering::Custom;
+
+  if (LegalOrCustom && TLI.expandShift(DAG, N)) {
     // Expand the subcomponents.
     SDValue LHSL, LHSH;
     GetExpandedInteger(N->getOperand(0), LHSL, LHSH);
     EVT VT = LHSL.getValueType();

     // If the shift amount operand is coming from a vector legalization it may
     // have an illegal type. Fix that first by casting the operand, otherwise
     // the new SHL_PARTS operation would need further legalization.
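
The new condition in this hunk can be modelled in isolation. The sketch below is a hypothetical mock, not LLVM code: `ShiftOp`, `Lowering`, `chooseLowering`, and `libcallName` are illustrative names, and the helper symbols are simply the ones that appear in the CHECK lines of the tests added by this revision. How the real legalizer selects a libcall after this `if` is not shown on this page, so that part is an assumption.

```cpp
#include <string>

// Hypothetical mock of the decision made in ExpandIntRes_Shift.
enum class ShiftOp { Shl, Lshr, Ashr };
enum class Lowering { ShiftParts, LibCall };

// Mirrors the condition in the diff: use *_PARTS only when the target both
// supports it (legal or custom) and asks for the expansion.
Lowering chooseLowering(bool LegalOrCustom, bool TargetWantsExpansion) {
  return (LegalOrCustom && TargetWantsExpansion) ? Lowering::ShiftParts
                                                 : Lowering::LibCall;
}

// Helper names taken from the tests in this revision: 64-bit shifts on
// ARM EABI and 128-bit shifts on AArch64 and X86-64.
std::string libcallName(ShiftOp Op, unsigned Bits, bool IsArmEabi) {
  if (Bits == 64 && IsArmEabi) {
    switch (Op) {
    case ShiftOp::Shl:  return "__aeabi_llsl";
    case ShiftOp::Lshr: return "__aeabi_llsr";
    case ShiftOp::Ashr: return "__aeabi_lasr";
    }
  }
  if (Bits == 128) {
    switch (Op) {
    case ShiftOp::Shl:  return "__ashlti3";
    case ShiftOp::Lshr: return "__lshrti3";
    case ShiftOp::Ashr: return "__ashrti3";
    }
  }
  return ""; // combinations not exercised by the tests in this revision
}
```

For example, `chooseLowering(true, false)` models a minsize function on a target that overrides the hook: SHIFT_PARTS is available but the libcall is chosen anyway.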

lib/Target/AArch64/AArch64ISelLowering.h

@@ bool hasAndNot(SDValue Y) const override {
     EVT VT = Y.getValueType();

     if (!VT.isVector())
       return hasAndNotCompare(Y);

     return VT.getSizeInBits() >= 64; // vector 'bic'
   }

+  bool expandShift(SelectionDAG &DAG, SDNode *N) const override {
+    if (DAG.getMachineFunction().getFunction().optForMinSize())
+      return false;
+    return true;
+  }
+
   bool shouldTransformSignedTruncationCheck(EVT XVT,
                                             unsigned KeptBits) const override {
     // For vectors, we don't have a preference..
     if (XVT.isVector())
       return false;

     auto VTIsOk = [](EVT VT) -> bool {
       return VT == MVT::i8 || VT == MVT::i16 || VT == MVT::i32 ||

lib/Target/ARM/ARMISelLowering.h

@@ public:
   bool supportSwiftError() const override {
     return true;
   }

   bool hasStandaloneRem(EVT VT) const override {
     return HasStandaloneRem;
   }

+  bool expandShift(SelectionDAG &DAG, SDNode *N) const override {
+    if (DAG.getMachineFunction().getFunction().optForMinSize())
+      return false;
+    return true;
+  }
+
   CCAssignFn *CCAssignFnForCall(CallingConv::ID CC, bool isVarArg) const;
   CCAssignFn *CCAssignFnForReturn(CallingConv::ID CC, bool isVarArg) const;

   /// Returns true if \p VecTy is a legal interleaved access type. This
   /// function checks the vector element type and the overall width of the
   /// vector.
   bool isLegalInterleavedAccessType(VectorType *VecTy,
                                     const DataLayout &DL) const;

lib/Target/X86/X86ISelLowering.h

@@ shouldTransformSignedTruncationCheck(EVT XVT,
     };

     // We are ok with KeptBitsVT being byte/word/dword, what MOVS supports.
     // XVT will be larger than KeptBitsVT.
     MVT KeptBitsVT = MVT::getIntegerVT(KeptBits);
     return VTIsOk(XVT) && VTIsOk(KeptBitsVT);
   }

+  bool expandShift(SelectionDAG &DAG, SDNode *N) const override {
+    if (DAG.getMachineFunction().getFunction().optForMinSize())
+      return false;
+    return true;
+  }
+
   bool shouldSplatInsEltVarIndex(EVT VT) const override;

   bool convertSetCCLogicToBitwiseLogic(EVT VT) const override {
     return VT.isScalarInteger();
   }

   /// Vector-sized comparisons are fast using PCMPEQ + PMOVMSK or PTEST.
   MVT hasFastEqualityCompare(unsigned NumBits) const override;

test/CodeGen/AArch64/shift_minsize.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
				; RUN: llc < %s -mtriple=aarch64-unknown-unknown | FileCheck %s

				define i64 @f0(i64 %val, i64 %amt) minsize optsize {
				; CHECK-LABEL: f0:
				; CHECK: // %bb.0:
				; CHECK-NEXT: lsl x0, x0, x1
				; CHECK-NEXT: ret
				%res = shl i64 %val, %amt
				ret i64 %res
				}

				define i32 @f1(i64 %x, i64 %y) minsize optsize {
				; CHECK-LABEL: f1:
				; CHECK: // %bb.0:
				; CHECK-NEXT: lsl x0, x0, x1
				; CHECK-NEXT: // kill: def $w0 killed $w0 killed $x0
				; CHECK-NEXT: ret
				%a = shl i64 %x, %y
				%b = trunc i64 %a to i32
				ret i32 %b
				}

				define i32 @f2(i64 %x, i64 %y) minsize optsize {
				; CHECK-LABEL: f2:
				; CHECK: // %bb.0:
				; CHECK-NEXT: asr x0, x0, x1
				; CHECK-NEXT: // kill: def $w0 killed $w0 killed $x0
				; CHECK-NEXT: ret
				%a = ashr i64 %x, %y
				%b = trunc i64 %a to i32
				ret i32 %b
				}

				define i32 @f3(i64 %x, i64 %y) minsize optsize {
				; CHECK-LABEL: f3:
				; CHECK: // %bb.0:
				; CHECK-NEXT: lsr x0, x0, x1
				; CHECK-NEXT: // kill: def $w0 killed $w0 killed $x0
				; CHECK-NEXT: ret
				%a = lshr i64 %x, %y
				%b = trunc i64 %a to i32
				ret i32 %b
				}

				define dso_local { i64, i64 } @shl128(i64 %x.coerce0, i64 %x.coerce1, i8 signext %y) minsize optsize {
				; CHECK-LABEL: shl128:
				; CHECK: // %bb.0: // %entry
				; CHECK-NEXT: str x30, [sp, #-16]! // 8-byte Folded Spill
				; CHECK-NEXT: .cfi_def_cfa_offset 16
				; CHECK-NEXT: .cfi_offset w30, -16
				; CHECK-NEXT: // kill: def $w2 killed $w2 def $x2
				; CHECK-NEXT: bl __ashlti3
				; CHECK-NEXT: ldr x30, [sp], #16 // 8-byte Folded Reload
				; CHECK-NEXT: ret
				entry:
				%x.sroa.2.0.insert.ext = zext i64 %x.coerce1 to i128
				%x.sroa.2.0.insert.shift = shl nuw i128 %x.sroa.2.0.insert.ext, 64
				%x.sroa.0.0.insert.ext = zext i64 %x.coerce0 to i128
				%x.sroa.0.0.insert.insert = or i128 %x.sroa.2.0.insert.shift, %x.sroa.0.0.insert.ext
				%conv = sext i8 %y to i32
				%sh_prom = zext i32 %conv to i128
				%shl = shl i128 %x.sroa.0.0.insert.insert, %sh_prom
				%retval.sroa.0.0.extract.trunc = trunc i128 %shl to i64
				%retval.sroa.2.0.extract.shift = lshr i128 %shl, 64
				%retval.sroa.2.0.extract.trunc = trunc i128 %retval.sroa.2.0.extract.shift to i64
				%.fca.0.insert = insertvalue { i64, i64 } undef, i64 %retval.sroa.0.0.extract.trunc, 0
				%.fca.1.insert = insertvalue { i64, i64 } %.fca.0.insert, i64 %retval.sroa.2.0.extract.trunc, 1
				ret { i64, i64 } %.fca.1.insert
				}

				define dso_local { i64, i64 } @ashr128(i64 %x.coerce0, i64 %x.coerce1, i8 signext %y) minsize optsize {
				; CHECK-LABEL: ashr128:
				; CHECK: // %bb.0: // %entry
				; CHECK-NEXT: str x30, [sp, #-16]! // 8-byte Folded Spill
				; CHECK-NEXT: .cfi_def_cfa_offset 16
				; CHECK-NEXT: .cfi_offset w30, -16
				; CHECK-NEXT: // kill: def $w2 killed $w2 def $x2
				; CHECK-NEXT: bl __ashrti3
				; CHECK-NEXT: ldr x30, [sp], #16 // 8-byte Folded Reload
				; CHECK-NEXT: ret
				entry:
				%x.sroa.2.0.insert.ext = zext i64 %x.coerce1 to i128
				%x.sroa.2.0.insert.shift = shl nuw i128 %x.sroa.2.0.insert.ext, 64
				%x.sroa.0.0.insert.ext = zext i64 %x.coerce0 to i128
				%x.sroa.0.0.insert.insert = or i128 %x.sroa.2.0.insert.shift, %x.sroa.0.0.insert.ext
				%conv = sext i8 %y to i32
				%sh_prom = zext i32 %conv to i128
				%shr = ashr i128 %x.sroa.0.0.insert.insert, %sh_prom
				%retval.sroa.0.0.extract.trunc = trunc i128 %shr to i64
				%retval.sroa.2.0.extract.shift = lshr i128 %shr, 64
				%retval.sroa.2.0.extract.trunc = trunc i128 %retval.sroa.2.0.extract.shift to i64
				%.fca.0.insert = insertvalue { i64, i64 } undef, i64 %retval.sroa.0.0.extract.trunc, 0
				%.fca.1.insert = insertvalue { i64, i64 } %.fca.0.insert, i64 %retval.sroa.2.0.extract.trunc, 1
				ret { i64, i64 } %.fca.1.insert
				}

				define dso_local { i64, i64 } @lshr128(i64 %x.coerce0, i64 %x.coerce1, i8 signext %y) minsize optsize {
				; CHECK-LABEL: lshr128:
				; CHECK: // %bb.0: // %entry
				; CHECK-NEXT: str x30, [sp, #-16]! // 8-byte Folded Spill
				; CHECK-NEXT: .cfi_def_cfa_offset 16
				; CHECK-NEXT: .cfi_offset w30, -16
				; CHECK-NEXT: // kill: def $w2 killed $w2 def $x2
				; CHECK-NEXT: bl __lshrti3
				; CHECK-NEXT: ldr x30, [sp], #16 // 8-byte Folded Reload
				; CHECK-NEXT: ret
				entry:
				%x.sroa.2.0.insert.ext = zext i64 %x.coerce1 to i128
				%x.sroa.2.0.insert.shift = shl nuw i128 %x.sroa.2.0.insert.ext, 64
				%x.sroa.0.0.insert.ext = zext i64 %x.coerce0 to i128
				%x.sroa.0.0.insert.insert = or i128 %x.sroa.2.0.insert.shift, %x.sroa.0.0.insert.ext
				%conv = sext i8 %y to i32
				%sh_prom = zext i32 %conv to i128
				%shr = lshr i128 %x.sroa.0.0.insert.insert, %sh_prom
				%retval.sroa.0.0.extract.trunc = trunc i128 %shr to i64
				%retval.sroa.2.0.extract.shift = lshr i128 %shr, 64
				%retval.sroa.2.0.extract.trunc = trunc i128 %retval.sroa.2.0.extract.shift to i64
				%.fca.0.insert = insertvalue { i64, i64 } undef, i64 %retval.sroa.0.0.extract.trunc, 0
				%.fca.1.insert = insertvalue { i64, i64 } %.fca.0.insert, i64 %retval.sroa.2.0.extract.trunc, 1
				ret { i64, i64 } %.fca.1.insert
				}

test/CodeGen/ARM/shift_minsize.ll

This file was added.

				; RUN: llc -mtriple=arm-eabi %s -o - | FileCheck %s

				define i64 @f0(i64 %val, i64 %amt) minsize optsize {
				; CHECK-LABEL: f0:
				; CHECK: bl __aeabi_llsl
				%res = shl i64 %val, %amt
				ret i64 %res
				}

				define i32 @f1(i64 %x, i64 %y) minsize optsize {
				; CHECK-LABEL: f1:
				; CHECK: bl __aeabi_llsl
				%a = shl i64 %x, %y
				%b = trunc i64 %a to i32
				ret i32 %b
				}

				define i32 @f2(i64 %x, i64 %y) minsize optsize {
				; CHECK-LABEL: f2:
				; CHECK: bl __aeabi_lasr
				%a = ashr i64 %x, %y
				%b = trunc i64 %a to i32
				ret i32 %b
				}

				define i32 @f3(i64 %x, i64 %y) minsize optsize {
				; CHECK-LABEL: f3:
				; CHECK: bl __aeabi_llsr
				%a = lshr i64 %x, %y
				%b = trunc i64 %a to i32
				ret i32 %b
				}

test/CodeGen/X86/shift_minsize.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
				; RUN: llc < %s -mtriple=x86_64-unknown | FileCheck %s

				define i64 @f0(i64 %val, i64 %amt) minsize optsize {
				; CHECK-LABEL: f0:
				; CHECK: # %bb.0:
				; CHECK-NEXT: movq %rsi, %rcx
				; CHECK-NEXT: movq %rdi, %rax
				; CHECK-NEXT: # kill: def $cl killed $cl killed $rcx
				; CHECK-NEXT: shlq %cl, %rax
				; CHECK-NEXT: retq
				%res = shl i64 %val, %amt
				ret i64 %res
				}

				define i32 @f1(i64 %x, i64 %y) minsize optsize {
				; CHECK-LABEL: f1:
				; CHECK: # %bb.0:
				; CHECK-NEXT: movq %rsi, %rcx
				; CHECK-NEXT: movq %rdi, %rax
				; CHECK-NEXT: # kill: def $cl killed $cl killed $rcx
				; CHECK-NEXT: shlq %cl, %rax
				; CHECK-NEXT: # kill: def $eax killed $eax killed $rax
				; CHECK-NEXT: retq
				%a = shl i64 %x, %y
				%b = trunc i64 %a to i32
				ret i32 %b
				}

				define i32 @f2(i64 %x, i64 %y) minsize optsize {
				; CHECK-LABEL: f2:
				; CHECK: # %bb.0:
				; CHECK-NEXT: movq %rsi, %rcx
				; CHECK-NEXT: movq %rdi, %rax
				; CHECK-NEXT: # kill: def $cl killed $cl killed $rcx
				; CHECK-NEXT: sarq %cl, %rax
				; CHECK-NEXT: # kill: def $eax killed $eax killed $rax
				; CHECK-NEXT: retq
				%a = ashr i64 %x, %y
				%b = trunc i64 %a to i32
				ret i32 %b
				}

				define i32 @f3(i64 %x, i64 %y) minsize optsize {
				; CHECK-LABEL: f3:
				; CHECK: # %bb.0:
				; CHECK-NEXT: movq %rsi, %rcx
				; CHECK-NEXT: movq %rdi, %rax
				; CHECK-NEXT: # kill: def $cl killed $cl killed $rcx
				; CHECK-NEXT: shrq %cl, %rax
				; CHECK-NEXT: # kill: def $eax killed $eax killed $rax
				; CHECK-NEXT: retq
				%a = lshr i64 %x, %y
				%b = trunc i64 %a to i32
				ret i32 %b
				}

				define dso_local { i64, i64 } @shl128(i64 %x.coerce0, i64 %x.coerce1, i8 signext %y) minsize optsize {
				; CHECK-LABEL: shl128:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: pushq %rax
				; CHECK-NEXT: .cfi_def_cfa_offset 16
				; CHECK-NEXT: movzbl %dl, %edx
				; CHECK-NEXT: callq __ashlti3
				; CHECK-NEXT: popq %rcx
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				entry:
				%x.sroa.2.0.insert.ext = zext i64 %x.coerce1 to i128
				%x.sroa.2.0.insert.shift = shl nuw i128 %x.sroa.2.0.insert.ext, 64
				%x.sroa.0.0.insert.ext = zext i64 %x.coerce0 to i128
				%x.sroa.0.0.insert.insert = or i128 %x.sroa.2.0.insert.shift, %x.sroa.0.0.insert.ext
				%conv = sext i8 %y to i32
				%sh_prom = zext i32 %conv to i128
				%shl = shl i128 %x.sroa.0.0.insert.insert, %sh_prom
				%retval.sroa.0.0.extract.trunc = trunc i128 %shl to i64
				%retval.sroa.2.0.extract.shift = lshr i128 %shl, 64
				%retval.sroa.2.0.extract.trunc = trunc i128 %retval.sroa.2.0.extract.shift to i64
				%.fca.0.insert = insertvalue { i64, i64 } undef, i64 %retval.sroa.0.0.extract.trunc, 0
				%.fca.1.insert = insertvalue { i64, i64 } %.fca.0.insert, i64 %retval.sroa.2.0.extract.trunc, 1
				ret { i64, i64 } %.fca.1.insert
				}

				define dso_local { i64, i64 } @ashr128(i64 %x.coerce0, i64 %x.coerce1, i8 signext %y) minsize optsize {
				; CHECK-LABEL: ashr128:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: pushq %rax
				; CHECK-NEXT: .cfi_def_cfa_offset 16
				; CHECK-NEXT: callq __ashrti3
				; CHECK-NEXT: popq %rcx
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				entry:
				%x.sroa.2.0.insert.ext = zext i64 %x.coerce1 to i128
				%x.sroa.2.0.insert.shift = shl nuw i128 %x.sroa.2.0.insert.ext, 64
				%x.sroa.0.0.insert.ext = zext i64 %x.coerce0 to i128
				%x.sroa.0.0.insert.insert = or i128 %x.sroa.2.0.insert.shift, %x.sroa.0.0.insert.ext
				%conv = sext i8 %y to i32
				%sh_prom = zext i32 %conv to i128
				%shr = ashr i128 %x.sroa.0.0.insert.insert, %sh_prom
				%retval.sroa.0.0.extract.trunc = trunc i128 %shr to i64
				%retval.sroa.2.0.extract.shift = lshr i128 %shr, 64
				%retval.sroa.2.0.extract.trunc = trunc i128 %retval.sroa.2.0.extract.shift to i64
				%.fca.0.insert = insertvalue { i64, i64 } undef, i64 %retval.sroa.0.0.extract.trunc, 0
				%.fca.1.insert = insertvalue { i64, i64 } %.fca.0.insert, i64 %retval.sroa.2.0.extract.trunc, 1
				ret { i64, i64 } %.fca.1.insert
				}

				define dso_local { i64, i64 } @lshr128(i64 %x.coerce0, i64 %x.coerce1, i8 signext %y) minsize optsize {
				; CHECK-LABEL: lshr128:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: pushq %rax
				; CHECK-NEXT: .cfi_def_cfa_offset 16
				; CHECK-NEXT: movzbl %dl, %edx
				; CHECK-NEXT: callq __lshrti3
				; CHECK-NEXT: popq %rcx
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				entry:
				%x.sroa.2.0.insert.ext = zext i64 %x.coerce1 to i128
				%x.sroa.2.0.insert.shift = shl nuw i128 %x.sroa.2.0.insert.ext, 64
				%x.sroa.0.0.insert.ext = zext i64 %x.coerce0 to i128
				%x.sroa.0.0.insert.insert = or i128 %x.sroa.2.0.insert.shift, %x.sroa.0.0.insert.ext
				%conv = sext i8 %y to i32
				%sh_prom = zext i32 %conv to i128
				%shr = lshr i128 %x.sroa.0.0.insert.insert, %sh_prom
				%retval.sroa.0.0.extract.trunc = trunc i128 %shr to i64
				%retval.sroa.2.0.extract.shift = lshr i128 %shr, 64
				%retval.sroa.2.0.extract.trunc = trunc i128 %retval.sroa.2.0.extract.shift to i64
				%.fca.0.insert = insertvalue { i64, i64 } undef, i64 %retval.sroa.0.0.extract.trunc, 0
				%.fca.1.insert = insertvalue { i64, i64 } %.fca.0.insert, i64 %retval.sroa.2.0.extract.trunc, 1
				ret { i64, i64 } %.fca.1.insert
				}
