This is an archive of the discontinued LLVM Phabricator instance.

Use getStoreSize() in various places instead of BitSize >> 3
Needs Review · Public

Authored by jonpa on Nov 22 2017, 3:04 AM.

Details

Reviewers

hfinkel
patrik.h.hagglund
bjope

Summary

A bug (https://bugs.llvm.org/show_bug.cgi?id=35366) in DAGCombiner was discovered where loads/stores of i1 got the size of 0, due to the use of 'BitSize >> 3'. The proper way to get the memory access size is to use getStoreSize().

Since this is found also in other places, I have tried to fix a few more of them where it looks obvious.

Please review for correctness.
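To make the failure mode in the summary concrete, here is a minimal sketch of the two computations. The helper names are hypothetical stand-ins, not LLVM's EVT methods, and an 8-bit byte is assumed; the rounding-up version is meant to mirror what getStoreSize() is described as doing above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the 'BitSize >> 3' pattern: truncates, so any
// type narrower than 8 bits appears to occupy zero bytes.
constexpr uint64_t truncatedByteSize(uint64_t BitSize) { return BitSize >> 3; }

// Hypothetical stand-in for a store size: rounds up to whole 8-bit bytes.
constexpr uint64_t ceilByteSize(uint64_t BitSize) { return (BitSize + 7) / 8; }

// An i1 access is the case from the bug report.
static_assert(truncatedByteSize(1) == 0, "i1 appears to touch no memory");
static_assert(ceilByteSize(1) == 1, "i1 occupies one byte when stored");
```

For widths that are already a multiple of 8 the two helpers agree, which is why the difference only shows up for types such as i1.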

Diff Detail

Event Timeline

jonpa created this revision. Nov 22 2017, 3:04 AM

Herald added a subscriber: sdardis. Nov 22 2017, 3:04 AM

uabelho added a subscriber: uabelho. Nov 22 2017, 3:45 AM

As a side note: This also helps people like myself who live with an out-of-tree target with
non-8-bit bytes. We've changed getStoreSize() so it takes the byte size into account, so
using that instead of all these "/ 8" and ">> 3" all over the place makes life easier for us as
well. Thanks for that, Jonas!
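A rough sketch of why a single helper is easier to retarget than scattered "/ 8" and ">> 3" expressions. The function name and the BitsPerByte parameter are assumptions made for illustration; the out-of-tree change itself is not shown in this review.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of addressable units needed to store BitSize
// bits when one addressable unit ("byte") is BitsPerByte bits wide.
constexpr uint64_t storeSizeInUnits(uint64_t BitSize, uint64_t BitsPerByte = 8) {
  return (BitSize + BitsPerByte - 1) / BitsPerByte;
}

// On a target with 16-bit bytes an i32 occupies 2 units, while a hard-coded
// '32 >> 3' would claim 4.
static_assert(storeSizeInUnits(32, 16) == 2, "i32 on a 16-bit-byte target");
static_assert((32 >> 3) == 4, "hard-coded shift assumes 8-bit bytes");
```

With the byte width confined to one place, call sites do not need to change when the width does.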

fhahn added a subscriber: fhahn. Nov 22 2017, 9:14 AM

bjope added a subscriber: bjope. Nov 23 2017, 8:57 AM

bjope added inline comments. Nov 23 2017, 10:14 AM

lib/CodeGen/SelectionDAG/DAGCombiner.cpp
12623–12624	As far as I can see, all uses of ElementSizeBytes in this method are doing "ElementSizeBytes * 8". So I guess we can replace this by an int64_t ElementSizeBits = MemVT.getStoreSizeInBits(); and get rid of all those multiplications by 8.
lib/CodeGen/SelectionDAG/StatepointLowering.cpp
100	I wouldn't mind if this assert is changed into: assert(ValueType.getStoreSizeInBits() == ValueType.getSizeInBits() && "Size not equal to store size"); Or, if the assert is meant to check whether the size of ValueType is a multiple of the byte size, then it should be assert(ValueType.isByteSized() && "Size not in bytes?"); but despite the current assert string I doubt that is what we want to verify.
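One observation that may help with the StatepointLowering assert: if sizes are modeled as plain integers and a byte is 8 bits, the existing check, the store-size comparison, and a byte-sized test all seem to accept exactly the same widths, whether SpillSize is computed by truncation or by rounding up. The sketch below uses hypothetical helper names that stand in for the expressions in the comment; it does not call LLVM.

```cpp
#include <cassert>
#include <cstdint>

// Bits stands in for ValueType.getSizeInBits(); 8-bit bytes are assumed.
// Old code: SpillSize = Bits / 8, then assert(SpillSize * 8 == Bits).
constexpr bool truncatingCheck(uint64_t Bits) { return (Bits / 8) * 8 == Bits; }
// New code: SpillSize = ceil(Bits / 8), then the same assert.
constexpr bool storeSizeCheck(uint64_t Bits) { return ((Bits + 7) / 8) * 8 == Bits; }
// Model of a "multiple of the byte size" test.
constexpr bool byteSizedCheck(uint64_t Bits) { return (Bits & 7) == 0; }

// Returns true if all three checks give the same answer for 0..MaxBits.
inline bool checksAgreeUpTo(uint64_t MaxBits) {
  for (uint64_t Bits = 0; Bits <= MaxBits; ++Bits)
    if (truncatingCheck(Bits) != storeSizeCheck(Bits) ||
        storeSizeCheck(Bits) != byteSizedCheck(Bits))
      return false;
  return true;
}
```

Under these assumptions the choice between the three spellings is about which intent the assert should document rather than about which inputs it rejects.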

MergeStoresOfConstantsOrVecElts() updated to use getStoreSizeInBits() per review.

jonpa added inline comments. Nov 24 2017, 6:56 AM

lib/CodeGen/SelectionDAG/DAGCombiner.cpp
12623–12624	Seems right to me also - done.
lib/CodeGen/SelectionDAG/StatepointLowering.cpp
100	I'm not exactly sure either - unless someone confirms this, perhaps this can be changed at a later point after we get these fixes in...

I have no further comments. And it looks good from our out-of-tree-targets perspective.
In many situations I think this is NFC. For example the consecutive load/store optimizations have other checks verifying that the involved types are byte sized, so the end result will be the same.
But I haven't really reviewed the MIPS/Hexagon specific parts, and there are no test cases showing that something is more correct (I'm not sure all changes are NFC).

lib/CodeGen/SelectionDAG/StatepointLowering.cpp
100	Sure, I don't want this to be a stopper for your patch so you may ignore it. It was just something I discovered when reviewing (not directly connected to your patch).

The MIPS part LGTM.

LGTM

This revision is now accepted and ready to land. Nov 28 2017, 4:38 AM

Thanks for review. r319173.

BTW,

What about all the cases using Type* , like 'ByteSize = Ty->getSizeInBits() / 8;'... These should also be fixed, I presume, but I don't see a getStoreSize() method or similar in Type...

Also, with

diff --git a/include/llvm/IR/DebugInfoMetadata.h b/include/llvm/IR/DebugInfoMetadata.h
index c515f6d..a8aa195 100644
--- a/include/llvm/IR/DebugInfoMetadata.h
+++ b/include/llvm/IR/DebugInfoMetadata.h
@@ -594,6 +594,7 @@ public:
   unsigned getLine() const { return Line; }
   uint64_t getSizeInBits() const { return SizeInBits; }
   uint32_t getAlignInBits() const { return AlignInBits; }
+  uint64_t getSizeInBytes() const { return (SizeInBits < 8 ? 1 : SizeInBits >> 3); }
   uint32_t getAlignInBytes() const { return getAlignInBits() / CHAR_BIT; }
   uint64_t getOffsetInBits() const { return OffsetInBits; }
   DIFlags getFlags() const { return Flags; }
diff --git a/lib/CodeGen/AsmPrinter/DwarfUnit.cpp b/lib/CodeGen/AsmPrinter/DwarfUnit.cpp
index 911e462..8829886 100644
--- a/lib/CodeGen/AsmPrinter/DwarfUnit.cpp
+++ b/lib/CodeGen/AsmPrinter/DwarfUnit.cpp
@@ -813,14 +813,14 @@ void DwarfUnit::constructTypeDIE(DIE &Buffer, const DIBasicType *BTy) {
   addUInt(Buffer, dwarf::DW_AT_encoding, dwarf::DW_FORM_data1,
           BTy->getEncoding());
 
-  uint64_t Size = BTy->getSizeInBits() >> 3;
+  uint64_t Size = BTy->getSizeInBytes();
   addUInt(Buffer, dwarf::DW_AT_byte_size, None, Size);
 }
 
 void DwarfUnit::constructTypeDIE(DIE &Buffer, const DIDerivedType *DTy) {
   // Get core information.
   StringRef Name = DTy->getName();
-  uint64_t Size = DTy->getSizeInBits() >> 3;
+  uint64_t Size = DTy->getSizeInBytes();
   uint16_t Tag = Buffer.getTag();
 
   // Map to main type, void will not have a type.
@@ -907,7 +907,7 @@ void DwarfUnit::constructTypeDIE(DIE &Buffer, const DICompositeType *CTy) {
   // Add name if not anonymous or intermediate type.
   StringRef Name = CTy->getName();
 
-  uint64_t Size = CTy->getSizeInBits() >> 3;
+  uint64_t Size = CTy->getSizeInBytes();
   uint16_t Tag = Buffer.getTag();
 
   switch (Tag) {

I got

Failing Tests (3):
    LLVM :: DebugInfo/Generic/namespace.ll
    LLVM :: DebugInfo/Generic/tu-composite.ll
    LLVM :: DebugInfo/X86/sret.ll

Could this be an indicator that something *should* be handled here, or is this a bad fix for some reason?

In D40339#937625, @jonpa wrote:
Thanks for review. r319173.

BTW,

What about all the cases using Type* , like 'ByteSize = Ty->getSizeInBits() / 8;'... These should also be fixed, I presume, but I don't see a getStoreSize() method or similar in Type...

Also, with
diff --git a/include/llvm/IR/DebugInfoMetadata.h b/include/llvm/IR/DebugInfoMetadata.h
index c515f6d..a8aa195 100644
--- a/include/llvm/IR/DebugInfoMetadata.h
+++ b/include/llvm/IR/DebugInfoMetadata.h
@@ -594,6 +594,7 @@ public:
   unsigned getLine() const { return Line; }
   uint64_t getSizeInBits() const { return SizeInBits; }
   uint32_t getAlignInBits() const { return AlignInBits; }
+  uint64_t getSizeInBytes() const { return (SizeInBits < 8 ? 1 : SizeInBits >> 3); }

Your way of implementing getSizeInBytes() has some flaws:

For SizeInBits == 0 it returns 1. <--- Seems wrong to me.
For SizeInBits > 8 that is not a multiple of the byte size, it rounds down (whereas it rounds up for sizes below 8). <--- Inconsistent.

I think that a correct way of doing it would be

uint64_t getSizeInBytes() const { return (SizeInBits + (ByteSizeInBits - 1)) / ByteSizeInBits; }

where "ByteSizeInBits" is 8 for all in-tree-targets.
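The two flaws and the rounding-up alternative can be checked with a small standalone sketch. The free functions below are hypothetical copies of the two getSizeInBytes() bodies under discussion, taking SizeInBits as a parameter instead of reading a member. Note that the sum has to be parenthesized before the division; otherwise the expression just evaluates to SizeInBits.

```cpp
#include <cassert>
#include <cstdint>

// The first attempt from the patch: special-cases sizes below 8 bits.
constexpr uint64_t firstAttemptSizeInBytes(uint64_t SizeInBits) {
  return SizeInBits < 8 ? 1 : SizeInBits >> 3;
}

// Consistent rounding up; ByteSizeInBits is 8 for all in-tree targets.
constexpr uint64_t roundedUpSizeInBytes(uint64_t SizeInBits,
                                        uint64_t ByteSizeInBits = 8) {
  return (SizeInBits + (ByteSizeInBits - 1)) / ByteSizeInBits;
}

// Flaw 1: a zero-sized type is reported as one byte.
static_assert(firstAttemptSizeInBytes(0) == 1, "zero size reported as 1");
static_assert(roundedUpSizeInBytes(0) == 0, "zero size stays 0");
// Flaw 2: 12 bits rounds down to 1 byte although 1 bit rounds up to 1 byte.
static_assert(firstAttemptSizeInBytes(12) == 1, "rounds down above 8 bits");
static_assert(roundedUpSizeInBytes(12) == 2, "rounds up consistently");
```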

Your way of implementing getSizeInBytes() has some flaws

Aah, sorry - seems I was just handling i1. Thanks for the suggestion. Now all tests are still green...

(Sorry for hard-coding the ByteSizeInBits to 8 once again, but it should save you two instances in total ;-)

Herald added a subscriber: JDevlieghere. Nov 29 2017, 1:42 AM

jonpa requested review of this revision. Nov 30 2017, 7:29 AM

jonpa edited edge metadata.

bjope added inline comments. Nov 30 2017, 3:17 PM

include/llvm/IR/DebugInfoMetadata.h
597 ↗	(On Diff #124702)	The getSizeInBytes() in other classes often truncates to the number of whole bytes, whereas, for example, getStoreSizeInBits() does a ceiling operation. Maybe this method should take a bool to indicate whether ceiling or truncation is wanted when the size in bits is not a multiple of the byte size. That would make sure that new uses of this method require the author to choose between the two alternatives.
lib/CodeGen/AsmPrinter/DwarfUnit.cpp
816 ↗	(On Diff #124702)	Do we know if this is correct? Or was the old behavior of truncating intentional? Or are these constructTypeDIE overloads assuming that the size in bits is a multiple of 8? Maybe we even need to assert that SizeInBits is a multiple of the byte size here (or inside the getSizeInBytes method) to ensure correct behavior (in case it is incorrect to round either up or down when the size in bits isn't a multiple of the byte size). I think you need some test cases that prove that this is correct (if ceiling is needed). Or if you can't create such tests, nor prove that truncation is correct, then I think we should go for the asserts or llvm_unreachable. Such asserts would ensure that whenever, in the future, we end up here without having a multiple of the byte size, we need to determine what the correct solution is.

I have no idea about this...
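For reference, the options raised in the inline comments could look roughly like the sketch below. It uses an enum rather than the bool that was suggested, purely so that call sites read more clearly, and it adds a third mode for the assert variant. All names here are hypothetical, nothing like this exists in DebugInfoMetadata.h as far as this review shows, and 8-bit bytes are assumed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical rounding policy that every caller must pick explicitly.
enum class ByteRounding { Truncate, Ceil, RequireExact };

// Hypothetical free-function version of getSizeInBytes().
inline uint64_t sizeInBytes(uint64_t SizeInBits, ByteRounding Mode) {
  switch (Mode) {
  case ByteRounding::Truncate:
    return SizeInBits / 8; // old '>> 3' behavior
  case ByteRounding::Ceil:
    return (SizeInBits + 7) / 8; // store-size style rounding
  case ByteRounding::RequireExact:
    // Callers that cannot justify either rounding direction trap here.
    assert(SizeInBits % 8 == 0 && "Size is not a whole number of bytes");
    return SizeInBits / 8;
  }
  return 0; // not reached; keeps compilers quiet about missing returns
}
```

A call such as sizeInBytes(Bits, ByteRounding::RequireExact) would then document that the caller has not yet decided what a non-byte-sized type should mean.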

Revision Contents

Path	Size
lib/CodeGen/GlobalISel/IRTranslator.cpp	2 lines
lib/CodeGen/SelectionDAG/DAGCombiner.cpp	16 lines
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp	4 lines
lib/CodeGen/SelectionDAG/StatepointLowering.cpp	2 lines
lib/CodeGen/SelectionDAG/TargetLowering.cpp	6 lines
lib/Target/Hexagon/HexagonLoopIdiomRecognition.cpp	12 lines
lib/Target/Mips/MipsISelLowering.cpp	3 lines
lib/Transforms/Scalar/LoopIdiomRecognize.cpp	15 lines

Diff 124205

lib/CodeGen/GlobalISel/IRTranslator.cpp

@@ ... @@ bool IRTranslator::translateCall(const User &U, MachineIRBuilder &MIRBuilder) {
   const TargetLowering &TLI = *MF->getSubtarget().getTargetLowering();
   TargetLowering::IntrinsicInfo Info;
   // TODO: Add a GlobalISel version of getTgtMemIntrinsic.
   if (TLI.getTgtMemIntrinsic(Info, CI, ID)) {
     MachineMemOperand::Flags Flags =
         Info.vol ? MachineMemOperand::MOVolatile : MachineMemOperand::MONone;
     Flags |=
         Info.readMem ? MachineMemOperand::MOLoad : MachineMemOperand::MOStore;
-    uint64_t Size = Info.memVT.getSizeInBits() >> 3;
+    uint64_t Size = Info.memVT.getStoreSize();
     MIB.addMemOperand(MF->getMachineMemOperand(MachinePointerInfo(Info.ptrVal),
                                                Flags, Size, Info.align));
   }
 
   return true;
 }
 
 bool IRTranslator::translateInvoke(const User &U,

lib/CodeGen/SelectionDAG/DAGCombiner.cpp

@@ ... @@ SDValue DAGCombiner::CombineConsecutiveLoads(SDNode *N, EVT VT) {
   assert(N->getOpcode() == ISD::BUILD_PAIR);
 
   LoadSDNode *LD1 = dyn_cast<LoadSDNode>(getBuildPairElt(N, 0));
   LoadSDNode *LD2 = dyn_cast<LoadSDNode>(getBuildPairElt(N, 1));
   if (!LD1 || !LD2 || !ISD::isNON_EXTLoad(LD1) || !LD1->hasOneUse() ||
       LD1->getAddressSpace() != LD2->getAddressSpace())
     return SDValue();
   EVT LD1VT = LD1->getValueType(0);
-  unsigned LD1Bytes = LD1VT.getSizeInBits() / 8;
+  unsigned LD1Bytes = LD1VT.getStoreSize();
   if (ISD::isNON_EXTLoad(LD2) && LD2->hasOneUse() &&
       DAG.areNonVolatileConsecutiveLoads(LD2, LD1, LD1Bytes, 1)) {
     unsigned Align = LD1->getAlignment();
     unsigned NewAlign = DAG.getDataLayout().getABITypeAlignment(
         VT.getTypeForEVT(*DAG.getContext()));
 
     if (NewAlign <= Align &&
         (!LegalOperations || TLI.isOperationLegal(ISD::LOAD, VT)))
@@ ... @@ bool DAGCombiner::MergeStoresOfConstantsOrVecElts(
     SmallVectorImpl<MemOpLink> &StoreNodes, EVT MemVT, unsigned NumStores,
     bool IsConstantSrc, bool UseVector, bool UseTrunc) {
   // Make sure we have something to merge.
   if (NumStores < 2)
     return false;
 
   // The latest Node in the DAG.
   SDLoc DL(StoreNodes[0].MemNode);
 
-  int64_t ElementSizeBytes = MemVT.getSizeInBits() / 8;
-  unsigned SizeInBits = NumStores * ElementSizeBytes * 8;
+  int64_t ElementSizeBits = MemVT.getStoreSizeInBits();
+  unsigned SizeInBits = NumStores * ElementSizeBits;
   unsigned NumMemElts = MemVT.isVector() ? MemVT.getVectorNumElements() : 1;
 
   EVT StoreTy;
   if (UseVector) {
     unsigned Elts = NumStores * NumMemElts;
     // Get the type for the merged vector store.
     StoreTy = EVT::getVectorVT(*DAG.getContext(), MemVT.getScalarType(), Elts);
   } else
     StoreTy = EVT::getIntegerVT(*DAG.getContext(), SizeInBits);
 
   SDValue StoredVal;
   if (UseVector) {
     if (IsConstantSrc) {
       SmallVector<SDValue, 8> BuildVector;
       for (unsigned I = 0; I != NumStores; ++I) {
         StoreSDNode *St = cast<StoreSDNode>(StoreNodes[I].MemNode);
         SDValue Val = St->getValue();
         // If constant is of the wrong type, convert it now.
         if (MemVT != Val.getValueType()) {
           Val = peekThroughBitcast(Val);
           // Deal with constants of wrong size.
-          if (ElementSizeBytes * 8 != Val.getValueSizeInBits()) {
+          if (ElementSizeBits != Val.getValueSizeInBits()) {
             EVT IntMemVT =
                 EVT::getIntegerVT(*DAG.getContext(), MemVT.getSizeInBits());
             if (auto *CFP = dyn_cast<ConstantFPSDNode>(Val))
               Val = DAG.getConstant(
                   CFP->getValueAPF().bitcastToAPInt().zextOrTrunc(
-                      8 * ElementSizeBytes),
+                      ElementSizeBits),
                   SDLoc(CFP), IntMemVT);
             else if (auto *C = dyn_cast<ConstantSDNode>(Val))
               Val = DAG.getConstant(
-                  C->getAPIntValue().zextOrTrunc(8 * ElementSizeBytes),
+                  C->getAPIntValue().zextOrTrunc(ElementSizeBits),
                   SDLoc(C), IntMemVT);
           }
           // Make sure correctly size type is the correct type.
           Val = DAG.getBitcast(MemVT, Val);
         }
         BuildVector.push_back(Val);
       }
       StoredVal = DAG.getNode(MemVT.isVector() ? ISD::CONCAT_VECTORS
@@ ... @@ if (UseVector) {
     // Construct a single integer constant which is made of the smaller
     // constant inputs.
     bool IsLE = DAG.getDataLayout().isLittleEndian();
     for (unsigned i = 0; i < NumStores; ++i) {
       unsigned Idx = IsLE ? (NumStores - 1 - i) : i;
       StoreSDNode *St = cast<StoreSDNode>(StoreNodes[Idx].MemNode);
 
       SDValue Val = St->getValue();
-      StoreInt <<= ElementSizeBytes * 8;
+      StoreInt <<= ElementSizeBits;
       if (ConstantSDNode *C = dyn_cast<ConstantSDNode>(Val)) {
         StoreInt |= C->getAPIntValue().zextOrTrunc(SizeInBits);
       } else if (ConstantFPSDNode *C = dyn_cast<ConstantFPSDNode>(Val)) {
         StoreInt |= C->getValueAPF().bitcastToAPInt().zextOrTrunc(SizeInBits);
       } else {
         llvm_unreachable("Invalid constant element type");
       }
     }
@@ ... @@ bool DAGCombiner::checkMergeStoreCandidatesForDependencies(
   return true;
 }
 
 bool DAGCombiner::MergeConsecutiveStores(StoreSDNode *St) {
   if (OptLevel == CodeGenOpt::None)
     return false;
 
   EVT MemVT = St->getMemoryVT();
-  int64_t ElementSizeBytes = MemVT.getSizeInBits() / 8;
+  int64_t ElementSizeBytes = MemVT.getStoreSize();
   unsigned NumMemElts = MemVT.isVector() ? MemVT.getVectorNumElements() : 1;
 
   if (MemVT.getSizeInBits() * 2 > MaximumLegalStoreInBits)
     return false;
 
   bool NoVectors = DAG.getMachineFunction().getFunction()->hasFnAttribute(
       Attribute::NoImplicitFloat);
 

lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp

@@ ... @@ void SelectionDAGBuilder::visitAtomicLoad(const LoadInst &I) {
   AtomicOrdering Order = I.getOrdering();
   SyncScope::ID SSID = I.getSyncScopeID();
 
   SDValue InChain = getRoot();
 
   const TargetLowering &TLI = DAG.getTargetLoweringInfo();
   EVT VT = TLI.getValueType(DAG.getDataLayout(), I.getType());
 
-  if (I.getAlignment() < VT.getSizeInBits() / 8)
+  if (I.getAlignment() < VT.getStoreSize())
     report_fatal_error("Cannot generate unaligned atomic load");
 
   MachineMemOperand *MMO =
       DAG.getMachineFunction().
       getMachineMemOperand(MachinePointerInfo(I.getPointerOperand()),
                            MachineMemOperand::MOVolatile |
                            MachineMemOperand::MOLoad,
                            VT.getStoreSize(),
@@ ... @@ void SelectionDAGBuilder::visitAtomicStore(const StoreInst &I) {
   SyncScope::ID SSID = I.getSyncScopeID();
 
   SDValue InChain = getRoot();
 
   const TargetLowering &TLI = DAG.getTargetLoweringInfo();
   EVT VT =
       TLI.getValueType(DAG.getDataLayout(), I.getValueOperand()->getType());
 
-  if (I.getAlignment() < VT.getSizeInBits() / 8)
+  if (I.getAlignment() < VT.getStoreSize())
     report_fatal_error("Cannot generate unaligned atomic store");
 
   SDValue OutChain =
       DAG.getAtomic(ISD::ATOMIC_STORE, dl, VT,
                     InChain,
                     getValue(I.getPointerOperand()),
                     getValue(I.getValueOperand()),
                     I.getPointerOperand(), I.getAlignment(),

lib/CodeGen/SelectionDAG/StatepointLowering.cpp

@@ ... @@
 }
 
 SDValue
 StatepointLoweringState::allocateStackSlot(EVT ValueType,
                                            SelectionDAGBuilder &Builder) {
   NumSlotsAllocatedForStatepoints++;
   MachineFrameInfo &MFI = Builder.DAG.getMachineFunction().getFrameInfo();
 
-  unsigned SpillSize = ValueType.getSizeInBits() / 8;
+  unsigned SpillSize = ValueType.getStoreSize();
   assert((SpillSize * 8) == ValueType.getSizeInBits() && "Size not in bytes?");
 
   // First look for a previously created stack slot which is not in
   // use (accounting for the fact arbitrary slots may already be
   // reserved), or to create a new stack slot and use it.
 
   const size_t NumSlots = AllocatedStackSlots.size();
   assert(NextSlotToAllocate <= NumSlots && "Broken invariant");
 

lib/CodeGen/SelectionDAG/TargetLowering.cpp

@@ ... @@ if (isTypeLegal(intVT) && isTypeLegal(LoadedVT)) {
                            ISD::ANY_EXTEND, dl, VT, Result);
 
       return std::make_pair(Result, newLoad.getValue(1));
     }
 
     // Copy the value to a (aligned) stack slot using (unaligned) integer
     // loads and stores, then do a (aligned) load from the stack slot.
     MVT RegVT = getRegisterType(*DAG.getContext(), intVT);
-    unsigned LoadedBytes = LoadedVT.getSizeInBits() / 8;
+    unsigned LoadedBytes = LoadedVT.getStoreSize();
     unsigned RegBytes = RegVT.getSizeInBits() / 8;
     unsigned NumRegs = (LoadedBytes + RegBytes - 1) / RegBytes;
 
     // Make sure the stack slot is also aligned for the register type.
     SDValue StackBase = DAG.CreateStackTemporary(LoadedVT, RegVT);
     auto FrameIndex = cast<FrameIndexSDNode>(StackBase.getNode())->getIndex();
     SmallVector<SDValue, 8> Stores;
     SDValue StackPtr = StackBase;
@@ ... @@ if (ST->getMemoryVT().isFloatingPoint() ||
     // Do a (aligned) store to a stack slot, then copy from the stack slot
     // to the final destination using (unaligned) integer loads and stores.
     EVT StoredVT = ST->getMemoryVT();
     MVT RegVT =
       getRegisterType(*DAG.getContext(),
                       EVT::getIntegerVT(*DAG.getContext(),
                                         StoredVT.getSizeInBits()));
     EVT PtrVT = Ptr.getValueType();
-    unsigned StoredBytes = StoredVT.getSizeInBits() / 8;
+    unsigned StoredBytes = StoredVT.getStoreSize();
     unsigned RegBytes = RegVT.getSizeInBits() / 8;
     unsigned NumRegs = (StoredBytes + RegBytes - 1) / RegBytes;
 
     // Make sure the stack slot is also aligned for the register type.
     SDValue StackPtr = DAG.CreateStackTemporary(StoredVT, RegVT);
     auto FrameIndex = cast<FrameIndexSDNode>(StackPtr.getNode())->getIndex();
 
     // Perform the original store, only redirected to the stack slot.
@@ ... @@ if (IsCompressedMemory) {
     // Count '1's with POPCNT.
     Increment = DAG.getNode(ISD::CTPOP, DL, MaskIntVT, MaskInIntReg);
     Increment = DAG.getZExtOrTrunc(Increment, DL, AddrVT);
     // Scale is an element size in bytes.
     SDValue Scale = DAG.getConstant(DataVT.getScalarSizeInBits() / 8, DL,
                                     AddrVT);
     Increment = DAG.getNode(ISD::MUL, DL, AddrVT, Increment, Scale);
   } else
-    Increment = DAG.getConstant(DataVT.getSizeInBits() / 8, DL, AddrVT);
+    Increment = DAG.getConstant(DataVT.getStoreSize(), DL, AddrVT);
 
   return DAG.getNode(ISD::ADD, DL, AddrVT, Addr, Increment);
 }
 
 static SDValue clampDynamicVectorIndex(SelectionDAG &DAG,
                                        SDValue Idx,
                                        EVT VecVT,
                                        const SDLoc &dl) {

lib/Target/Hexagon/HexagonLoopIdiomRecognition.cpp

Show First 20 Lines • Show All 134 Lines • ▼ Show 20 Lines	void getAnalysisUsage(AnalysisUsage &AU) const override {
AU.addRequired<DominatorTreeWrapperPass>();		AU.addRequired<DominatorTreeWrapperPass>();
AU.addRequired<TargetLibraryInfoWrapperPass>();		AU.addRequired<TargetLibraryInfoWrapperPass>();
AU.addPreserved<TargetLibraryInfoWrapperPass>();		AU.addPreserved<TargetLibraryInfoWrapperPass>();
}		}

bool runOnLoop(Loop *L, LPPassManager &LPM) override;		bool runOnLoop(Loop *L, LPPassManager &LPM) override;

private:		private:
unsigned getStoreSizeInBytes(StoreInst *SI);
int getSCEVStride(const SCEVAddRecExpr *StoreEv);		int getSCEVStride(const SCEVAddRecExpr *StoreEv);
bool isLegalStore(Loop CurLoop, StoreInst SI);		bool isLegalStore(Loop CurLoop, StoreInst SI);
void collectStores(Loop CurLoop, BasicBlock BB,		void collectStores(Loop CurLoop, BasicBlock BB,
SmallVectorImpl<StoreInst*> &Stores);		SmallVectorImpl<StoreInst*> &Stores);
bool processCopyingStore(Loop CurLoop, StoreInst SI, const SCEV *BECount);		bool processCopyingStore(Loop CurLoop, StoreInst SI, const SCEV *BECount);
bool coverLoop(Loop L, SmallVectorImpl<Instruction> &Insts) const;		bool coverLoop(Loop L, SmallVectorImpl<Instruction> &Insts) const;
bool runOnLoopBlock(Loop CurLoop, BasicBlock BB, const SCEV *BECount,		bool runOnLoopBlock(Loop CurLoop, BasicBlock BB, const SCEV *BECount,
SmallVectorImpl<BasicBlock*> &ExitBlocks);		SmallVectorImpl<BasicBlock*> &ExitBlocks);
▲ Show 20 Lines • Show All 1,690 Lines • ▼ Show 20 Lines	bool PolynomialMultiplyRecognize::recognize() {
if (PM->getType() != PV.Res->getType())		if (PM->getType() != PV.Res->getType())
PM = IRBuilder<>(&*At).CreateIntCast(PM, PV.Res->getType(), false);		PM = IRBuilder<>(&*At).CreateIntCast(PM, PV.Res->getType(), false);

PV.Res->replaceAllUsesWith(PM);		PV.Res->replaceAllUsesWith(PM);
PV.Res->eraseFromParent();		PV.Res->eraseFromParent();
return true;		return true;
}		}

unsigned HexagonLoopIdiomRecognize::getStoreSizeInBytes(StoreInst *SI) {
uint64_t SizeInBits = DL->getTypeSizeInBits(SI->getValueOperand()->getType());
assert(((SizeInBits & 7) \|\| (SizeInBits >> 32) == 0) &&
"Don't overflow unsigned.");
return (unsigned)SizeInBits >> 3;
}

int HexagonLoopIdiomRecognize::getSCEVStride(const SCEVAddRecExpr *S) {		int HexagonLoopIdiomRecognize::getSCEVStride(const SCEVAddRecExpr *S) {
if (const SCEVConstant *SC = dyn_cast<SCEVConstant>(S->getOperand(1)))		if (const SCEVConstant *SC = dyn_cast<SCEVConstant>(S->getOperand(1)))
return SC->getAPInt().getSExtValue();		return SC->getAPInt().getSExtValue();
return 0;		return 0;
}		}

bool HexagonLoopIdiomRecognize::isLegalStore(Loop CurLoop, StoreInst SI) {		bool HexagonLoopIdiomRecognize::isLegalStore(Loop CurLoop, StoreInst SI) {
// Allow volatile stores if HexagonVolatileMemcpy is enabled.		// Allow volatile stores if HexagonVolatileMemcpy is enabled.
Show All 15 Lines	bool HexagonLoopIdiomRecognize::isLegalStore(Loop CurLoop, StoreInst SI) {
if (!StoreEv \|\| StoreEv->getLoop() != CurLoop \|\| !StoreEv->isAffine())		if (!StoreEv \|\| StoreEv->getLoop() != CurLoop \|\| !StoreEv->isAffine())
return false;		return false;

// Check to see if the stride matches the size of the store. If so, then we		// Check to see if the stride matches the size of the store. If so, then we
// know that every byte is touched in the loop.		// know that every byte is touched in the loop.
int Stride = getSCEVStride(StoreEv);		int Stride = getSCEVStride(StoreEv);
if (Stride == 0)		if (Stride == 0)
return false;		return false;
unsigned StoreSize = getStoreSizeInBytes(SI);		unsigned StoreSize = DL->getTypeStoreSize(SI->getValueOperand()->getType());
if (StoreSize != unsigned(std::abs(Stride)))		if (StoreSize != unsigned(std::abs(Stride)))
return false;		return false;

// The store must be feeding a non-volatile load.		// The store must be feeding a non-volatile load.
LoadInst *LI = dyn_cast<LoadInst>(SI->getValueOperand());		LoadInst *LI = dyn_cast<LoadInst>(SI->getValueOperand());
if (!LI || !LI->isSimple())		if (!LI || !LI->isSimple())
return false;		return false;

▲ Show 20 Lines • Show All 58 Lines • ▼ Show 20 Lines	bool HexagonLoopIdiomRecognize::processCopyingStore(Loop *CurLoop,
StoreInst *SI, const SCEV *BECount) {		StoreInst *SI, const SCEV *BECount) {
assert((SI->isSimple() || (SI->isVolatile() && HexagonVolatileMemcpy)) &&		assert((SI->isSimple() || (SI->isVolatile() && HexagonVolatileMemcpy)) &&
"Expected only non-volatile stores, or Hexagon-specific memcpy"		"Expected only non-volatile stores, or Hexagon-specific memcpy"
"to volatile destination.");		"to volatile destination.");

Value *StorePtr = SI->getPointerOperand();		Value *StorePtr = SI->getPointerOperand();
auto *StoreEv = cast<SCEVAddRecExpr>(SE->getSCEV(StorePtr));		auto *StoreEv = cast<SCEVAddRecExpr>(SE->getSCEV(StorePtr));
unsigned Stride = getSCEVStride(StoreEv);		unsigned Stride = getSCEVStride(StoreEv);
unsigned StoreSize = getStoreSizeInBytes(SI);		unsigned StoreSize = DL->getTypeStoreSize(SI->getValueOperand()->getType());
if (Stride != StoreSize)		if (Stride != StoreSize)
return false;		return false;

// See if the pointer expression is an AddRec like {base,+,1} on the current		// See if the pointer expression is an AddRec like {base,+,1} on the current
// loop, which indicates a strided load. If we have something else, it's a		// loop, which indicates a strided load. If we have something else, it's a
// random load we can't handle.		// random load we can't handle.
LoadInst *LI = dyn_cast<LoadInst>(SI->getValueOperand());		LoadInst *LI = dyn_cast<LoadInst>(SI->getValueOperand());
auto *LoadEv = cast<SCEVAddRecExpr>(SE->getSCEV(LI->getPointerOperand()));		auto *LoadEv = cast<SCEVAddRecExpr>(SE->getSCEV(LI->getPointerOperand()));
▲ Show 20 Lines • Show All 418 Lines • Show Last 20 Lines
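
For readers skimming the hunks above, here is a minimal standalone sketch (not LLVM code) of the difference between the two size computations. The helper names `truncatedSizeInBytes` and `roundedUpSizeInBytes` are hypothetical and exist only for illustration; the second one assumes 8-bit bytes and mirrors the round-up-to-whole-bytes behaviour that a store size is expected to have.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: the pattern this patch removes. A right shift by 3
// truncates, so any width below 8 bits yields a size of 0 bytes.
constexpr uint64_t truncatedSizeInBytes(uint64_t SizeInBits) {
  return SizeInBits >> 3;
}

// Hypothetical helper: round the bit width up to whole bytes, assuming
// 8-bit bytes. An i1 or i7 value then still occupies one byte in memory.
constexpr uint64_t roundedUpSizeInBytes(uint64_t SizeInBits) {
  return (SizeInBits + 7) / 8;
}

// Illustrative checks: the two helpers agree only for byte-sized widths.
inline void checkSizeHelpers() {
  assert(truncatedSizeInBytes(1) == 0);   // i1: the size-0 case from the summary
  assert(roundedUpSizeInBytes(1) == 1);
  assert(truncatedSizeInBytes(17) == 2);  // i17: one byte short
  assert(roundedUpSizeInBytes(17) == 3);
  assert(truncatedSizeInBytes(32) == roundedUpSizeInBytes(32));
}
```

For byte-sized types the two computations agree, which is likely why the shift went unnoticed in so many places.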

lib/Target/Mips/MipsISelLowering.cpp

Show First 20 Lines • Show All 2,806 Lines • ▼ Show 20 Lines	if (ValVT == MVT::f32) {
if (Reg2 == Mips::A1 || Reg2 == Mips::A3)		if (Reg2 == Mips::A1 || Reg2 == Mips::A3)
State.AllocateReg(IntRegs);		State.AllocateReg(IntRegs);
State.AllocateReg(IntRegs);		State.AllocateReg(IntRegs);
}		}
} else		} else
llvm_unreachable("Cannot handle this ValVT.");		llvm_unreachable("Cannot handle this ValVT.");

if (!Reg) {		if (!Reg) {
unsigned Offset = State.AllocateStack(ValVT.getSizeInBits() >> 3,		unsigned Offset = State.AllocateStack(ValVT.getStoreSize(), OrigAlign);
OrigAlign);
State.addLoc(CCValAssign::getMem(ValNo, ValVT, Offset, LocVT, LocInfo));		State.addLoc(CCValAssign::getMem(ValNo, ValVT, Offset, LocVT, LocInfo));
} else		} else
State.addLoc(CCValAssign::getReg(ValNo, ValVT, Reg, LocVT, LocInfo));		State.addLoc(CCValAssign::getReg(ValNo, ValVT, Reg, LocVT, LocInfo));

return false;		return false;
}		}

static bool CC_MipsO32_FP32(unsigned ValNo, MVT ValVT,		static bool CC_MipsO32_FP32(unsigned ValNo, MVT ValVT,
▲ Show 20 Lines • Show All 1,559 Lines • Show Last 20 Lines

lib/Transforms/Scalar/LoopIdiomRecognize.cpp

Show First 20 Lines • Show All 328 Lines • ▼ Show 20 Lines	for (auto *BB : CurLoop->getBlocks()) {
if (LI->getLoopFor(BB) != CurLoop)		if (LI->getLoopFor(BB) != CurLoop)
continue;		continue;

MadeChange |= runOnLoopBlock(BB, BECount, ExitBlocks);		MadeChange |= runOnLoopBlock(BB, BECount, ExitBlocks);
}		}
return MadeChange;		return MadeChange;
}		}

static unsigned getStoreSizeInBytes(StoreInst *SI, const DataLayout *DL) {
uint64_t SizeInBits = DL->getTypeSizeInBits(SI->getValueOperand()->getType());
assert(((SizeInBits & 7) || (SizeInBits >> 32) == 0) &&
"Don't overflow unsigned.");
return (unsigned)SizeInBits >> 3;
}

static APInt getStoreStride(const SCEVAddRecExpr *StoreEv) {		static APInt getStoreStride(const SCEVAddRecExpr *StoreEv) {
const SCEVConstant *ConstStride = cast<SCEVConstant>(StoreEv->getOperand(1));		const SCEVConstant *ConstStride = cast<SCEVConstant>(StoreEv->getOperand(1));
return ConstStride->getAPInt();		return ConstStride->getAPInt();
}		}

/// getMemSetPatternValue - If a strided store of the specified value is safe to		/// getMemSetPatternValue - If a strided store of the specified value is safe to
/// turn into a memset_pattern16, return a ConstantArray of 16 bytes that should		/// turn into a memset_pattern16, return a ConstantArray of 16 bytes that should
/// be passed in. Otherwise, return null.		/// be passed in. Otherwise, return null.
▲ Show 20 Lines • Show All 101 Lines • ▼ Show 20 Lines	if (!UnorderedAtomic && HasMemset && SplatValue &&
return LegalStoreKind::MemsetPattern;		return LegalStoreKind::MemsetPattern;
}		}

// Otherwise, see if the store can be turned into a memcpy.		// Otherwise, see if the store can be turned into a memcpy.
if (HasMemcpy) {		if (HasMemcpy) {
// Check to see if the stride matches the size of the store. If so, then we		// Check to see if the stride matches the size of the store. If so, then we
// know that every byte is touched in the loop.		// know that every byte is touched in the loop.
APInt Stride = getStoreStride(StoreEv);		APInt Stride = getStoreStride(StoreEv);
unsigned StoreSize = getStoreSizeInBytes(SI, DL);		unsigned StoreSize = DL->getTypeStoreSize(SI->getValueOperand()->getType());
if (StoreSize != Stride && StoreSize != -Stride)		if (StoreSize != Stride && StoreSize != -Stride)
return LegalStoreKind::None;		return LegalStoreKind::None;

// The store must be feeding a non-volatile load.		// The store must be feeding a non-volatile load.
LoadInst *LI = dyn_cast<LoadInst>(SI->getValueOperand());		LoadInst *LI = dyn_cast<LoadInst>(SI->getValueOperand());

// Only allow non-volatile loads		// Only allow non-volatile loads
if (!LI || LI->isVolatile())		if (!LI || LI->isVolatile())
▲ Show 20 Lines • Show All 122 Lines • ▼ Show 20 Lines	bool LoopIdiomRecognize::processLoopStores(SmallVectorImpl<StoreInst *> &SL,
for (unsigned i = 0, e = SL.size(); i < e; ++i) {		for (unsigned i = 0, e = SL.size(); i < e; ++i) {
assert(SL[i]->isSimple() && "Expected only non-volatile stores.");		assert(SL[i]->isSimple() && "Expected only non-volatile stores.");

Value *FirstStoredVal = SL[i]->getValueOperand();		Value *FirstStoredVal = SL[i]->getValueOperand();
Value *FirstStorePtr = SL[i]->getPointerOperand();		Value *FirstStorePtr = SL[i]->getPointerOperand();
const SCEVAddRecExpr *FirstStoreEv =		const SCEVAddRecExpr *FirstStoreEv =
cast<SCEVAddRecExpr>(SE->getSCEV(FirstStorePtr));		cast<SCEVAddRecExpr>(SE->getSCEV(FirstStorePtr));
APInt FirstStride = getStoreStride(FirstStoreEv);		APInt FirstStride = getStoreStride(FirstStoreEv);
unsigned FirstStoreSize = getStoreSizeInBytes(SL[i], DL);		unsigned FirstStoreSize = DL->getTypeStoreSize(SL[i]->getValueOperand()->getType());

// See if we can optimize just this store in isolation.		// See if we can optimize just this store in isolation.
if (FirstStride == FirstStoreSize || -FirstStride == FirstStoreSize) {		if (FirstStride == FirstStoreSize || -FirstStride == FirstStoreSize) {
Heads.insert(SL[i]);		Heads.insert(SL[i]);
continue;		continue;
}		}

Value *FirstSplatValue = nullptr;		Value *FirstSplatValue = nullptr;
▲ Show 20 Lines • Show All 76 Lines • ▼ Show 20 Lines	for (SetVector<StoreInst *>::iterator it = Heads.begin(), e = Heads.end();
unsigned StoreSize = 0;		unsigned StoreSize = 0;

// Collect the chain into a list.		// Collect the chain into a list.
while (Tails.count(I) || Heads.count(I)) {		while (Tails.count(I) || Heads.count(I)) {
if (TransformedStores.count(I))		if (TransformedStores.count(I))
break;		break;
AdjacentStores.insert(I);		AdjacentStores.insert(I);

StoreSize += getStoreSizeInBytes(I, DL);		StoreSize += DL->getTypeStoreSize(I->getValueOperand()->getType());
// Move to the next value in the chain.		// Move to the next value in the chain.
I = ConsecutiveChain[I];		I = ConsecutiveChain[I];
}		}

Value *StoredVal = HeadStore->getValueOperand();		Value *StoredVal = HeadStore->getValueOperand();
Value *StorePtr = HeadStore->getPointerOperand();		Value *StorePtr = HeadStore->getPointerOperand();
const SCEVAddRecExpr *StoreEv = cast<SCEVAddRecExpr>(SE->getSCEV(StorePtr));		const SCEVAddRecExpr *StoreEv = cast<SCEVAddRecExpr>(SE->getSCEV(StorePtr));
APInt Stride = getStoreStride(StoreEv);		APInt Stride = getStoreStride(StoreEv);
▲ Show 20 Lines • Show All 257 Lines • ▼ Show 20 Lines
/// for (i) A[i] = B[i];		/// for (i) A[i] = B[i];
bool LoopIdiomRecognize::processLoopStoreOfLoopLoad(StoreInst *SI,		bool LoopIdiomRecognize::processLoopStoreOfLoopLoad(StoreInst *SI,
const SCEV *BECount) {		const SCEV *BECount) {
assert(SI->isUnordered() && "Expected only non-volatile non-ordered stores.");		assert(SI->isUnordered() && "Expected only non-volatile non-ordered stores.");

Value *StorePtr = SI->getPointerOperand();		Value *StorePtr = SI->getPointerOperand();
const SCEVAddRecExpr *StoreEv = cast<SCEVAddRecExpr>(SE->getSCEV(StorePtr));		const SCEVAddRecExpr *StoreEv = cast<SCEVAddRecExpr>(SE->getSCEV(StorePtr));
APInt Stride = getStoreStride(StoreEv);		APInt Stride = getStoreStride(StoreEv);
unsigned StoreSize = getStoreSizeInBytes(SI, DL);		unsigned StoreSize = DL->getTypeStoreSize(SI->getValueOperand()->getType());
bool NegStride = StoreSize == -Stride;		bool NegStride = StoreSize == -Stride;

// The store must be feeding a non-volatile load.		// The store must be feeding a non-volatile load.
LoadInst *LI = cast<LoadInst>(SI->getValueOperand());		LoadInst *LI = cast<LoadInst>(SI->getValueOperand());
assert(LI->isUnordered() && "Expected only non-volatile non-ordered loads.");		assert(LI->isUnordered() && "Expected only non-volatile non-ordered loads.");

// See if the pointer expression is an AddRec like {base,+,1} on the current		// See if the pointer expression is an AddRec like {base,+,1} on the current
// loop, which indicates a strided load. If we have something else, it's a		// loop, which indicates a strided load. If we have something else, it's a
▲ Show 20 Lines • Show All 759 Lines • Show Last 20 Lines
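
The stride checks in the hunks above all compare the store size against the stride or its negation. A minimal standalone sketch of that comparison follows, using plain integers instead of `APInt`; the helper name `strideMatchesStoreSize` is hypothetical. It also shows why a store size of 0 can never match: a zero stride is rejected, and no nonzero stride has magnitude 0.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true when each iteration writes a range directly
// adjacent to the previous one, walking forwards or backwards, so that
// every byte in the covered region is touched.
inline bool strideMatchesStoreSize(int64_t Stride, uint64_t StoreSize) {
  if (Stride == 0)
    return false; // The same address is rewritten on every iteration.
  // Take the magnitude in unsigned arithmetic so INT64_MIN does not overflow.
  uint64_t AbsStride =
      Stride < 0 ? uint64_t(0) - uint64_t(Stride) : uint64_t(Stride);
  return AbsStride == StoreSize;
}

// Illustrative checks for forward, backward, and mismatched strides.
inline void checkStrideHelper() {
  assert(strideMatchesStoreSize(4, 4));   // forward i32 stores
  assert(strideMatchesStoreSize(-4, 4));  // backward i32 stores
  assert(!strideMatchesStoreSize(8, 4));  // gaps between stores
  assert(!strideMatchesStoreSize(1, 0));  // a size of 0 never matches
}
```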