This is an archive of the discontinued LLVM Phabricator instance.


Differential D10905

move DAGCombiner's allowableAlignment() helper function into the TLI
ClosedPublic

Authored by spatel on Jul 2 2015, 1:21 PM.


Details

Reviewers

jyknight
qcolombet
dsanders
bruno

Commits

rG0f9dcf8b90aa: move DAGCombiner's allowableAlignment() helper function into the TLI
rL243549: move DAGCombiner's allowableAlignment() helper function into the TLI

Summary

Making allowableAlignment() more accessible was suggested as a predecessor patch
for D10662, so I've pulled it into TargetLowering. This lets us remove 4 instances
of duplicate logic in LegalizeDAG.

There's a subtle functional change in the implementation: the existing
allowableAlignment() code was using getPrefTypeAlignment() when checking
alignment with the DataLayout and assumed that was fast. In this implementation,
we use getABITypeAlignment() and assume that is fast. See the TODO comment
for future improvements in this implementation (don't use the data layout at all).

There are no regression test changes from this difference, and I'm not sure how to
expose it via a test. I think we actually do want to provide the 'Fast' param when
checking this from DAGCombiner::MergeConsecutiveStores(). I.e., we shouldn't merge
stores if the new stores are not going to be fast. But that change will require
fixing allowsMisalignedMemoryAccess() overrides as noted in D10662.
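The functional change described above can be sketched in isolation. This is a toy model rather than the patch itself: `TypeAlignInfo`, `oldAllowableAlignment`, and `newAllowsMemoryAccess` are hypothetical names, and a plain `bool` stands in for the target's allowsMisalignedMemoryAccesses() answer.

```cpp
#include <cassert>

// Toy stand-in for the alignment data a DataLayout holds for one type.
// Hypothetical names; this is not LLVM API.
struct TypeAlignInfo {
  unsigned ABIAlign;  // ABI alignment in bytes
  unsigned PrefAlign; // preferred alignment in bytes
};

// Mirrors the removed DAGCombiner helper: falls back to preferred alignment.
inline bool oldAllowableAlignment(bool TargetAllowsMisaligned,
                                  const TypeAlignInfo &Info, unsigned Align) {
  if (TargetAllowsMisaligned)
    return true;
  return Align >= Info.PrefAlign;
}

// Mirrors the check described in the summary: falls back to ABI alignment.
inline bool newAllowsMemoryAccess(bool TargetAllowsMisaligned,
                                  const TypeAlignInfo &Info, unsigned Align) {
  assert(Info.ABIAlign <= Info.PrefAlign && "toy model expects ABI <= preferred");
  if (TargetAllowsMisaligned)
    return true;
  return Align >= Info.ABIAlign;
}
```

For a type whose ABI alignment is 2 bytes and whose preferred alignment is 4 bytes, a 2-byte-aligned access is rejected by the old helper and accepted by the new check, which is the only case where the two differ.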

Diff Detail

Event Timeline

spatel updated this revision to Diff 28968. Jul 2 2015, 1:21 PM

spatel retitled this revision to "move DAGCombiner's allowableAlignment() helper function into the TLI".

spatel updated this object.

spatel added reviewers: jyknight, qcolombet.

spatel added a subscriber: llvm-commits.

jyknight added inline comments. Jul 6 2015, 1:51 PM

lib/CodeGen/TargetLoweringBase.cpp
1547	This won't return the right result in Fast. E.g. allowsMisalignedMemoryAccesses might return with `{ Fast = false; return true; }`, which would then cause this function to claim that a properly aligned memory access is not fast.

That's obviously wrong, but I'm not really sure what Right is. What's the actual contract for the ABI alignment and Preferred alignment information in DataLayout?

I think there are three things that LLVM might want to know about alignment and memory accesses:

1. If I can choose any alignment for this data, what should it be? (getPrefTypeAlignment should return that, I think.)

2. Is this memory access guaranteed to work AT ALL, no matter the speed? I think that's a combination of getABITypeAlignment, and then allowsMisalignedMemoryAccesses without checking "Fast". (Although: is it always guaranteed that the ABI alignment for a type is okay to load/store from? It'd be sane for it to be, but it's not really totally crazy to have a target where that's not true: e.g. ABI alignment of 32 bits for an i64, but a requirement of 64-bit alignment for an i64 load/store. I guess that's not the case for any existing target, since LLVM doesn't support it.)

3. Is it worthwhile to combine some memory accesses into a larger memory access, given alignment of X? I'm not sure this is actually adequately expressed in LLVM right now. It could be that getPrefTypeAlignment and allowsMisalignedMemoryAccesses (with testing Fast==true) are supposed to be this. But are "preferred alignment" and "fast to access" supposed to always be the same thing?

Because, e.g., MIPS says i8:8:32-i16:16:32-i64:64 -- so both i8 and i16 have ABI alignment of their size, but preferred alignment of 32 bits. I don't know a lot about MIPS, so I don't know why it claims a preferred alignment of 32 bits, but I think it has perfectly good 1/2-byte load instructions that work at their natural alignment.

So on MIPS (even pre-MIPS32r6/MIPS64r6 which apparently made unaligned access allowed), it seems like it should always be an improvement to merge 8-bit loads which are 16-bit aligned into a 16-bit load. Yet, going by getPrefTypeAlignment would say that's not ok.
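The Fast problem raised in this comment can be reproduced with a small model. The hunk under review is not shown on this page, so the shape of `hookFirst` below is a guess reconstructed from the comment; `MisalignedHook`, `hookFirst`, and `slowButLegalHook` are hypothetical names, not LLVM API.

```cpp
#include <cassert>

// Hypothetical stand-in for a target's allowsMisalignedMemoryAccesses()
// override: returns whether the access is legal and reports speed via Fast.
using MisalignedHook = bool (*)(unsigned AlignInBytes, bool *Fast);

// A hook of the kind described in the comment: every access is legal, but it
// always reports "not fast", even when the access happens to be well aligned.
inline bool slowButLegalHook(unsigned /*AlignInBytes*/, bool *Fast) {
  if (Fast)
    *Fast = false;
  return true;
}

// Assumed shape of the reviewed check: ask the hook first, and only consult
// the ABI alignment if the hook says no.
inline bool hookFirst(MisalignedHook Hook, unsigned ABIAlign, unsigned Align,
                      bool *Fast = nullptr) {
  assert(Hook && "a hook is required");
  if (Hook(Align, Fast))
    return true; // Fast holds whatever the hook wrote, aligned or not
  if (Align >= ABIAlign) {
    if (Fast)
      *Fast = true;
    return true;
  }
  return false;
}
```

With an ABI alignment of 4 bytes and an actual alignment of 8 bytes, `hookFirst(slowButLegalHook, 4, 8, &Fast)` returns true but leaves `Fast` false, so a properly aligned access is reported as slow.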

spatel added reviewers: bruno, dsanders. Jul 7 2015, 11:08 AM

spatel added inline comments.

lib/CodeGen/TargetLoweringBase.cpp
1547	Hi James - Thanks for the close review - much appreciated!

I can certainly fix the Fast bug that you noted, but I also don't know what the right answer is. My guess is that the MIPS DataLayout is wrong. Let's see if we can find out...

The 32-bit preferred alignment for MIPS i8/i16 was introduced way back here:
http://reviews.llvm.org/rL53585

Adding Bruno (author) and Daniel (currently listed as owner of the MIPS backend).
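To make the MIPS layout string from this thread concrete, here is a deliberately simplified reader for its integer entries. It is a sketch, not LLVM's DataLayout parser: `IntAlign` and `parseIntAlignments` are hypothetical names, only entries of the form `i<size>:<abi>[:<pref>]` are handled, and a missing preferred alignment is taken to equal the ABI alignment.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// ABI and preferred alignment, in bits, for one integer width.
struct IntAlign {
  unsigned ABIBits;
  unsigned PrefBits;
};

// Reads '-'-separated entries and keeps only those that start with 'i'.
// Fields are assumed to be well-formed decimal numbers; std::stoul throws
// on anything else.
inline std::map<unsigned, IntAlign> parseIntAlignments(const std::string &Spec) {
  std::map<unsigned, IntAlign> Result;
  std::stringstream Entries(Spec);
  std::string Entry;
  while (std::getline(Entries, Entry, '-')) {
    if (Entry.size() < 2 || Entry[0] != 'i')
      continue; // not an integer entry
    std::stringstream Fields(Entry.substr(1));
    std::string Field;
    unsigned Values[3] = {0, 0, 0};
    unsigned Count = 0;
    while (Count < 3 && std::getline(Fields, Field, ':'))
      Values[Count++] = static_cast<unsigned>(std::stoul(Field));
    if (Count < 2)
      continue; // need at least a size and an ABI alignment
    const unsigned Pref = (Count == 3) ? Values[2] : Values[1];
    Result[Values[0]] = IntAlign{Values[1], Pref};
  }
  return Result;
}
```

Applied to `"i8:8:32-i16:16:32-i64:64"`, this yields ABI/preferred pairs of 8/32 for i8, 16/32 for i16, and 64/64 for i64, which is the mismatch between natural and preferred alignment being questioned here.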

dsanders added inline comments. Jul 7 2015, 1:24 PM

lib/CodeGen/TargetLoweringBase.cpp
1547	I'm afraid I'm not sure why we prefer 32-bit alignment for i8 and i16. My best guess is that it pays off in library calls by allowing better-optimised memcpy/strcpy/strcmp/etc.

"I don't know a lot about MIPS, so I don't know why it claims a preferred alignment of 32bits, but I think it has perfectly good 1/2-byte load instructions that work at their natural alignment."

That's right. As far as I know they're usually (possibly always) the same latency too.

"... (even pre-MIPS32r6/MIPS64r6 which apparently made unaligned access allowed) ..."

That's right. 32r6/64r6 dropped the unaligned load/store left/right instruction pairs in favour of a promise that something in the system will make unaligned load/store work without special instructions. Depending on the MIPS implementation, the 'something' could be full hardware support, or a software exception handler, or a hybrid such as hardware support for unaligned accesses inside a cache line and software otherwise.

"So on MIPS (even pre-MIPS32r6/MIPS64r6 which apparently made unaligned access allowed), it seems like it should always be an improvement to merge 8-bit loads which are 16-bit aligned into a 16-bit load. Yet, going by getPrefTypeAlignment would say that's not ok."

As far as the load is concerned, I agree it's always an improvement to merge like this. The risk is that manipulating the data might cost more than the saving. It's worth mentioning that some MIPS implementations will do this optimization in hardware without the risk of needing additional data manipulation.
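The trade-off mentioned above (one wider load versus the extra data manipulation afterwards) can be illustrated in portable C++. This is only an illustration of the idea, not of any MIPS code generation; `BytePair`, `loadSeparately`, and `loadMerged` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

struct BytePair {
  std::uint8_t First;
  std::uint8_t Second;
};

// Two separate 8-bit loads; the values are usable as loaded.
inline BytePair loadSeparately(const unsigned char *P) {
  assert(P && "pointer must be non-null");
  return BytePair{P[0], P[1]};
}

// One 16-bit load, followed by a mask and a shift to recover the two bytes.
inline BytePair loadMerged(const unsigned char *P) {
  assert(P && "pointer must be non-null");
  std::uint16_t Wide;
  std::memcpy(&Wide, P, sizeof(Wide)); // stands in for a single 16-bit load

  // Detect byte order at run time so the extraction below is portable.
  const std::uint16_t Probe = 1;
  unsigned char LowByteFirst;
  std::memcpy(&LowByteFirst, &Probe, 1);

  const std::uint8_t Low = static_cast<std::uint8_t>(Wide & 0xFF);
  const std::uint8_t High = static_cast<std::uint8_t>(Wide >> 8);
  return LowByteFirst ? BytePair{Low, High} : BytePair{High, Low};
}
```

Both functions return the same pair for the same input; the merged version saves a load but adds the mask and shift, which is the cost that may or may not outweigh the saving.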

nlewycky: chandler says you might be able to help answer the questions in the comment thread.

mehdi_amini added a subscriber: mehdi_amini. Jul 13 2015, 12:42 PM

mehdi_amini added inline comments.

lib/CodeGen/TargetLoweringBase.cpp
1553	I'm removing the getDataLayout() from TargetLowering soon (D11079), you'll have to find another way of getting it (argument to the function?).

jyknight added inline comments. Jul 14 2015, 8:02 AM

lib/CodeGen/TargetLoweringBase.cpp
1547	Even if there were no reason at all for MIPS to have preferred alignment set as it is, I think I've convinced myself that using "preferred" alignment in this function isn't correct.

The "Fast" output should be indicating whether it's faster to do a potentially-misaligned load/store, or to use multiple load/store instructions. But "preferred" alignment is indicating the overall BEST alignment choice, not just what is going to be faster-or-equal than separate loads.

E.g. consider an i64:32:64 alignment spec, with a target that doesn't support under-aligned memory access (it supports the ABI alignment of 32 bits, but not below). It still is quite possible that the target behavior is such that merging two 32-bit loads to a 64-bit load is a good idea -- e.g. that it's at worst the same speed as two 32-bit loads, but is still faster when there's 64-bit alignment.

So perhaps the DataLayout should not actually be used to determine what memory alignments are valid for load/store on a target at all? That is, I'm thinking that targets maybe ought to implement "allowsMemoryAccess" directly, describing the entirety of that target's rules for what memory alignments work, and are "fast", rather than having "allowsMisalignedMemoryAccess" which describes only half of them, the other half intuited from the DataLayout. That seems a potentially saner model -- leaving the "Data Layout" spec to just describe the layout, and the target to describe what the hardware/runtime-environment can actually do, independent of that.

Alternatively it might be okay for the ABI alignment to just be assumed to be fast. If that's not the case, that's certainly a suboptimal ABI, but it wouldn't surprise me if some ABIs were suboptimal in that way.
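One way to picture the proposal in this comment is a target-owned hook that encodes every alignment rule itself, with no DataLayout involved. The sketch below is hypothetical throughout: `TargetMemoryRules` and `ExampleTarget` are invented names, and the rule it encodes (natural alignment required up to 4 bytes, anything legal is fast) is just the i64:32:64 scenario from the comment.

```cpp
#include <cassert>

// Hypothetical interface: the target describes all of its alignment rules.
struct TargetMemoryRules {
  virtual ~TargetMemoryRules() = default;
  // Returns true if the access is legal; if so, *Fast (when non-null)
  // reports whether the access is also fast.
  virtual bool allowsMemoryAccess(unsigned SizeInBits, unsigned AlignInBytes,
                                  bool *Fast = nullptr) const = 0;
};

// Invented target: accesses need natural alignment, capped at 4 bytes, so a
// 64-bit access is legal (and assumed fast) at 4-byte alignment.
struct ExampleTarget final : TargetMemoryRules {
  bool allowsMemoryAccess(unsigned SizeInBits, unsigned AlignInBytes,
                          bool *Fast = nullptr) const override {
    assert(SizeInBits % 8 == 0 && "toy model handles whole bytes only");
    const unsigned SizeInBytes = SizeInBits / 8;
    const unsigned Required = SizeInBytes < 4 ? SizeInBytes : 4;
    if (AlignInBytes < Required)
      return false;
    if (Fast)
      *Fast = true;
    return true;
  }
};
```

Under these assumed rules, `ExampleTarget().allowsMemoryAccess(64, 4, &Fast)` succeeds with `Fast` set, while a 2-byte-aligned 64-bit access is rejected, and neither answer depends on what an ABI declares as preferred.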

spatel added inline comments. Jul 21 2015, 9:35 AM

lib/CodeGen/TargetLoweringBase.cpp
1547	I see your point about making a new target hook that implements all of 'allowsMemoryAccess' directly, but a quick look at the implementations of DataLayout::getAlignment() and DataLayout::getAlignmentInfo() makes me scared... Let me post an updated version of the patch that assumes the ABI alignment is fast and see what people think.

Patch updated:

Changed implementation of allowsMemoryAccess() to check ABI alignment first and assume that ABI alignment is fast.
Added DataLayout as parameter because it's no longer available from TargetLowering.
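The updated ordering can be modelled outside LLVM as follows. This is a sketch under assumptions: `MisalignedHook`, `slowButLegalHook`, and `abiFirst` are hypothetical names, and plain integers replace the EVT, address space, and DataLayout arguments of the real function.

```cpp
#include <cassert>

// Hypothetical stand-in for a target's allowsMisalignedMemoryAccesses().
using MisalignedHook = bool (*)(unsigned AlignInBytes, bool *Fast);

// Example hook: misaligned accesses are legal but reported as slow.
inline bool slowButLegalHook(unsigned /*AlignInBytes*/, bool *Fast) {
  if (Fast)
    *Fast = false;
  return true;
}

// ABI alignment is checked first and assumed fast; the target hook is only
// consulted for accesses that fall below the ABI alignment.
inline bool abiFirst(MisalignedHook Hook, unsigned ABIAlign, unsigned Align,
                     bool *Fast = nullptr) {
  assert(Hook && "a hook is required");
  if (Align >= ABIAlign) {
    if (Fast)
      *Fast = true;
    return true;
  }
  return Hook(Align, Fast); // genuinely misaligned: defer to the target
}
```

With this ordering, an 8-byte-aligned access against a 4-byte ABI alignment reports `Fast == true` regardless of the hook, and only a 1-byte-aligned access reaches the hook and inherits its `Fast == false`.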

qcolombet added inline comments. Jul 28 2015, 2:29 PM

lib/CodeGen/TargetLoweringBase.cpp
1547	I’d say that’s fine for the default behavior, but we could provide a way to change that at the target discretion by making the hook virtual.

qcolombet added inline comments. Jul 28 2015, 2:31 PM

lib/CodeGen/TargetLoweringBase.cpp
1547	Note: We could also do the virtual thing when we get an actual use case. I do not feel strongly either way.

I do still think it might be ultimately better and cleaner to not involve DataLayout at all, and to have the target hook describe the complete rules on what alignments can be loaded/stored, since DataLayout is really about the system's defined ABI, which can legitimately differ in different environments, even though the hardware's behavior does not. And I expect it'd actually be easier to understand that way as well.

But this is definitely a solid improvement, and is a good change both if considered as a step towards the above and as the final state, so LGTM. Please update the description to match the new revision before committing.

[I'll also note that there's still a few other places left that call allowsMisalignedMemoryAccesses after this; I bet most of those could also be updated to call allowsMemoryAccess instead.]

This revision is now accepted and ready to land. Jul 28 2015, 10:17 PM

In D10905#214097, @jyknight wrote:

I do still think it might be ultimately better and cleaner to not involve DataLayout at all, and to have the target hook describe the complete rules on what alignments can be loaded/stored, since DataLayout is really about the system's defined ABI, which can legitimately differ in different environments, even though the hardware's behavior does not. And I expect it'd actually be easier to understand that way as well.

Thanks, James and Quentin. I had considered Quentin's suggestion of making the hook virtual, but I'd prefer to do that when we have evidence that it's needed. That may come when I mess around with MergeConsecutiveStores() again. :)

I'll add a 'TODO' comment, so we have a reminder of this discussion in the code.

Closed by commit rL243549: move DAGCombiner's allowableAlignment() helper function into the TLI (authored by spatel). Jul 29 2015, 11:24 AM

This revision was automatically updated to reflect the committed changes.

spatel updated this object. Jul 29 2015, 11:32 AM

spatel edited edge metadata.

spatel mentioned this in D10662: [x86] fix allowsMisalignedMemoryAccess() implementation. Aug 12 2015, 2:37 PM

Revision Contents

Path                                        Size
include/llvm/Target/TargetLowering.h        10 lines
lib/CodeGen/SelectionDAG/DAGCombiner.cpp    64 lines
lib/CodeGen/SelectionDAG/LegalizeDAG.cpp    42 lines
lib/CodeGen/TargetLoweringBase.cpp          23 lines

Diff 28968

include/llvm/Target/TargetLowering.h

[869 unchanged lines elided; context: public:]
   /// alignment error (trap) on the target machine.
   virtual bool allowsMisalignedMemoryAccesses(EVT,
                                               unsigned AddrSpace = 0,
                                               unsigned Align = 1,
                                               bool * /*Fast*/ = nullptr) const {
     return false;
   }

+  /// Return true if the target supports a memory access of this type for the
+  /// given address space and alignment. This first checks if a misaligned
+  /// access of this type is allowed and if not, uses the data layout to
+  /// determine if the access is allowed based on the given alignment.
+  /// If the access is allowed, the optional final parameter returns if the
+  /// access is also fast (as defined by the target).
+  bool allowsMemoryAccess(LLVMContext &Context, EVT VT,
+                          unsigned AddrSpace = 0, unsigned Alignment = 1,
+                          bool *Fast = nullptr) const;
+
   /// Returns the target specific optimal type for load and store operations as
   /// a result of memset, memcpy, and memmove lowering.
   ///
   /// If DstAlign is zero that means it's safe to destination alignment can
   /// satisfy any constraint. Similarly if SrcAlign is zero it means there isn't
   /// a need to check it against alignment requirement, probably because the
   /// source does not need to be loaded. If 'IsMemset' is true, that means it's
   /// expanding a memset. If 'ZeroMemset' is true, that means it's a memset of
[1,947 unchanged lines elided]

lib/CodeGen/SelectionDAG/DAGCombiner.cpp


[10,730 unchanged lines elided; context: for (unsigned i = 0; i < NumElem ; ++i) {]
     while (!St->use_empty())
       DAG.ReplaceAllUsesWith(SDValue(St, 0), St->getChain());
     deleteAndRecombine(St);
   }

   return true;
 }

-static bool allowableAlignment(const SelectionDAG &DAG,
-                               const TargetLowering &TLI, EVT EVTTy,
-                               unsigned AS, unsigned Align) {
-  if (TLI.allowsMisalignedMemoryAccesses(EVTTy, AS, Align))
-    return true;
-
-  Type *Ty = EVTTy.getTypeForEVT(*DAG.getContext());
-  unsigned ABIAlignment = TLI.getDataLayout()->getPrefTypeAlignment(Ty);
-  return (Align >= ABIAlignment);
-}
-
 void DAGCombiner::getStoreMergeAndAliasCandidates(
     StoreSDNode* St, SmallVectorImpl<MemOpLink> &StoreNodes,
     SmallVectorImpl<LSBaseSDNode*> &AliasLoadNodes) {
   // This holds the base pointer, index, and the offset in bytes from the base
   // pointer.
   BaseIndexOffset BasePtr = BaseIndexOffset::match(St->getBasePtr());

   // We must have a base and an offset.
[149 unchanged lines elided; context: for (unsigned i = 0, e = StoreNodes.size(); i < e; ++i) {]
     // Mark this node as useful.
     LastConsecutiveStore = i;
   }

   // The node with the lowest store address.
   LSBaseSDNode *FirstInChain = StoreNodes[0].MemNode;
   unsigned FirstStoreAS = FirstInChain->getAddressSpace();
   unsigned FirstStoreAlign = FirstInChain->getAlignment();
+  LLVMContext &Context = *DAG.getContext();

   // Store the constants into memory as one consecutive store.
   if (IsConstantSrc) {
     unsigned LastLegalType = 0;
     unsigned LastLegalVectorType = 0;
     bool NonZero = false;
     for (unsigned i=0; i<LastConsecutiveStore+1; ++i) {
       StoreSDNode *St = cast<StoreSDNode>(StoreNodes[i].MemNode);
       SDValue StoredVal = St->getValue();

       if (ConstantSDNode *C = dyn_cast<ConstantSDNode>(StoredVal)) {
         NonZero |= !C->isNullValue();
       } else if (ConstantFPSDNode *C = dyn_cast<ConstantFPSDNode>(StoredVal)) {
         NonZero |= !C->getConstantFPValue()->isNullValue();
       } else {
         // Non-constant.
         break;
       }

       // Find a legal type for the constant store.
       unsigned SizeInBits = (i+1) * ElementSizeBytes * 8;
-      EVT StoreTy = EVT::getIntegerVT(*DAG.getContext(), SizeInBits);
+      EVT StoreTy = EVT::getIntegerVT(Context, SizeInBits);
       if (TLI.isTypeLegal(StoreTy) &&
-          allowableAlignment(DAG, TLI, StoreTy, FirstStoreAS,
+          TLI.allowsMemoryAccess(Context, StoreTy, FirstStoreAS,
                                  FirstStoreAlign)) {
         LastLegalType = i+1;
       // Or check whether a truncstore is legal.
-      } else if (TLI.getTypeAction(*DAG.getContext(), StoreTy) ==
+      } else if (TLI.getTypeAction(Context, StoreTy) ==
                  TargetLowering::TypePromoteInteger) {
         EVT LegalizedStoredValueTy =
-          TLI.getTypeToTransformTo(*DAG.getContext(), StoredVal.getValueType());
+          TLI.getTypeToTransformTo(Context, StoredVal.getValueType());
         if (TLI.isTruncStoreLegal(LegalizedStoredValueTy, StoreTy) &&
-            allowableAlignment(DAG, TLI, LegalizedStoredValueTy, FirstStoreAS,
-                               FirstStoreAlign)) {
+            TLI.allowsMemoryAccess(Context, LegalizedStoredValueTy,
+                                   FirstStoreAS, FirstStoreAlign)) {
           LastLegalType = i + 1;
         }
       }

       // Find a legal type for the vector store.
-      EVT Ty = EVT::getVectorVT(*DAG.getContext(), MemVT, i+1);
+      EVT Ty = EVT::getVectorVT(Context, MemVT, i+1);
       if (TLI.isTypeLegal(Ty) &&
-          allowableAlignment(DAG, TLI, Ty, FirstStoreAS, FirstStoreAlign)) {
+          TLI.allowsMemoryAccess(Context, Ty, FirstStoreAS, FirstStoreAlign)) {
         LastLegalVectorType = i + 1;
       }
     }


     // We only use vectors if the constant is known to be zero or the target
     // allows it and the function is not marked with the noimplicitfloat
     // attribute.
[27 unchanged lines elided; context: for (unsigned i = 0; i < LastConsecutiveStore + 1; ++i) {]
       // Bail out if any stored values are not elements extracted from a vector.
       // It should be possible to handle mixed sources, but load sources need
       // more careful handling (see the block of code below that handles
       // consecutive loads).
       if (StoredVal.getOpcode() != ISD::EXTRACT_VECTOR_ELT)
         return false;

       // Find a legal type for the vector store.
-      EVT Ty = EVT::getVectorVT(*DAG.getContext(), MemVT, i+1);
+      EVT Ty = EVT::getVectorVT(Context, MemVT, i+1);
       if (TLI.isTypeLegal(Ty) &&
-          allowableAlignment(DAG, TLI, Ty, FirstStoreAS, FirstStoreAlign))
+          TLI.allowsMemoryAccess(Context, Ty, FirstStoreAS, FirstStoreAlign))
         NumElem = i + 1;
     }

     return MergeStoresOfConstantsOrVecElts(StoreNodes, MemVT, NumElem,
                                            false, true);
   }

   // Below we handle the case of multiple consecutive stores that
[71 unchanged lines elided; context: if (LoadNodes[i].MemNode->getChain() != FirstChain)]
       break;

     int64_t CurrAddress = LoadNodes[i].OffsetFromBase;
     if (CurrAddress - StartAddress != (ElementSizeBytes * i))
       break;
     LastConsecutiveLoad = i;

     // Find a legal type for the vector store.
-    EVT StoreTy = EVT::getVectorVT(*DAG.getContext(), MemVT, i+1);
+    EVT StoreTy = EVT::getVectorVT(Context, MemVT, i+1);
     if (TLI.isTypeLegal(StoreTy) &&
-        allowableAlignment(DAG, TLI, StoreTy, FirstStoreAS, FirstStoreAlign) &&
-        allowableAlignment(DAG, TLI, StoreTy, FirstLoadAS, FirstLoadAlign)) {
+        TLI.allowsMemoryAccess(Context, StoreTy, FirstStoreAS,
+                               FirstStoreAlign) &&
+        TLI.allowsMemoryAccess(Context, StoreTy, FirstLoadAS, FirstLoadAlign)) {
       LastLegalVectorType = i + 1;
     }

     // Find a legal type for the integer store.
     unsigned SizeInBits = (i+1) * ElementSizeBytes * 8;
-    StoreTy = EVT::getIntegerVT(*DAG.getContext(), SizeInBits);
+    StoreTy = EVT::getIntegerVT(Context, SizeInBits);
     if (TLI.isTypeLegal(StoreTy) &&
-        allowableAlignment(DAG, TLI, StoreTy, FirstStoreAS, FirstStoreAlign) &&
-        allowableAlignment(DAG, TLI, StoreTy, FirstLoadAS, FirstLoadAlign))
+        TLI.allowsMemoryAccess(Context, StoreTy, FirstStoreAS,
+                               FirstStoreAlign) &&
+        TLI.allowsMemoryAccess(Context, StoreTy, FirstLoadAS, FirstLoadAlign))
       LastLegalIntegerType = i + 1;
     // Or check whether a truncstore and extload is legal.
-    else if (TLI.getTypeAction(*DAG.getContext(), StoreTy) ==
+    else if (TLI.getTypeAction(Context, StoreTy) ==
              TargetLowering::TypePromoteInteger) {
       EVT LegalizedStoredValueTy =
-        TLI.getTypeToTransformTo(*DAG.getContext(), StoreTy);
+        TLI.getTypeToTransformTo(Context, StoreTy);
       if (TLI.isTruncStoreLegal(LegalizedStoredValueTy, StoreTy) &&
           TLI.isLoadExtLegal(ISD::ZEXTLOAD, LegalizedStoredValueTy, StoreTy) &&
           TLI.isLoadExtLegal(ISD::SEXTLOAD, LegalizedStoredValueTy, StoreTy) &&
           TLI.isLoadExtLegal(ISD::EXTLOAD, LegalizedStoredValueTy, StoreTy) &&
-          allowableAlignment(DAG, TLI, LegalizedStoredValueTy, FirstStoreAS,
+          TLI.allowsMemoryAccess(Context, LegalizedStoredValueTy, FirstStoreAS,
                                  FirstStoreAlign) &&
-          allowableAlignment(DAG, TLI, LegalizedStoredValueTy, FirstLoadAS,
+          TLI.allowsMemoryAccess(Context, LegalizedStoredValueTy, FirstLoadAS,
                                  FirstLoadAlign))
         LastLegalIntegerType = i+1;
     }
   }

   // Only use vector types if the vector type is larger than the integer type.
   // If they are the same, use integers.
   bool UseVectorTy = LastLegalVectorType > LastLegalIntegerType && !NoVectors;
   unsigned LastLegalType = std::max(LastLegalVectorType, LastLegalIntegerType);
[18 unchanged lines elided; context: bool DAGCombiner::MergeConsecutiveStores(StoreSDNode* St) {]
   }

   LSBaseSDNode *LatestOp = StoreNodes[LatestNodeUsed].MemNode;

   // Find if it is better to use vectors or integers to load and store
   // to memory.
   EVT JointMemOpVT;
   if (UseVectorTy) {
-    JointMemOpVT = EVT::getVectorVT(*DAG.getContext(), MemVT, NumElem);
+    JointMemOpVT = EVT::getVectorVT(Context, MemVT, NumElem);
   } else {
     unsigned SizeInBits = NumElem * ElementSizeBytes * 8;
-    JointMemOpVT = EVT::getIntegerVT(*DAG.getContext(), SizeInBits);
+    JointMemOpVT = EVT::getIntegerVT(Context, SizeInBits);
   }

   SDLoc LoadDL(LoadNodes[0].MemNode);
   SDLoc StoreDL(StoreNodes[0].MemNode);

   SDValue NewLoad = DAG.getLoad(
       JointMemOpVT, LoadDL, FirstLoad->getChain(), FirstLoad->getBasePtr(),
       FirstLoad->getPointerInfo(), false, false, false, FirstLoadAlign);
[2,919 unchanged lines elided]

lib/CodeGen/SelectionDAG/LegalizeDAG.cpp

[714 unchanged lines elided; context: if (!ST->isTruncatingStore()) {]
 {
   SDValue Value = ST->getValue();
   MVT VT = Value.getSimpleValueType();
   switch (TLI.getOperationAction(ISD::STORE, VT)) {
   default: llvm_unreachable("This action is not supported yet!");
   case TargetLowering::Legal: {
     // If this is an unaligned store and the target doesn't support it,
     // expand it.
+    EVT MemVT = ST->getMemoryVT();
     unsigned AS = ST->getAddressSpace();
     unsigned Align = ST->getAlignment();
-    if (!TLI.allowsMisalignedMemoryAccesses(ST->getMemoryVT(), AS, Align)) {
-      Type *Ty = ST->getMemoryVT().getTypeForEVT(*DAG.getContext());
-      unsigned ABIAlignment= TLI.getDataLayout()->getABITypeAlignment(Ty);
-      if (Align < ABIAlignment)
+    if (!TLI.allowsMemoryAccess(*DAG.getContext(), MemVT, AS, Align))
       ExpandUnalignedStore(cast<StoreSDNode>(Node), DAG, TLI, this);
-    }
     break;
   }
   case TargetLowering::Custom: {
     SDValue Res = TLI.LowerOperation(SDValue(Node, 0), DAG);
     if (Res && Res != SDValue(Node, 0))
       ReplaceNode(SDValue(Node, 0), Res);
     return;
   }
[88 unchanged lines elided; context: if (!ST->isTruncatingStore()) {]
     // The order of the stores doesn't matter.
     SDValue Result = DAG.getNode(ISD::TokenFactor, dl, MVT::Other, Lo, Hi);
     ReplaceNode(SDValue(Node, 0), Result);
   } else {
     switch (TLI.getTruncStoreAction(ST->getValue().getSimpleValueType(),
                                     StVT.getSimpleVT())) {
     default: llvm_unreachable("This action is not supported yet!");
     case TargetLowering::Legal: {
+      EVT MemVT = ST->getMemoryVT();
       unsigned AS = ST->getAddressSpace();
       unsigned Align = ST->getAlignment();
       // If this is an unaligned store and the target doesn't support it,
       // expand it.
-      if (!TLI.allowsMisalignedMemoryAccesses(ST->getMemoryVT(), AS, Align)) {
-        Type *Ty = ST->getMemoryVT().getTypeForEVT(*DAG.getContext());
-        unsigned ABIAlignment= TLI.getDataLayout()->getABITypeAlignment(Ty);
-        if (Align < ABIAlignment)
+      if (!TLI.allowsMemoryAccess(*DAG.getContext(), MemVT, AS, Align))
         ExpandUnalignedStore(cast<StoreSDNode>(Node), DAG, TLI, this);
-      }
       break;
     }
     case TargetLowering::Custom: {
       SDValue Res = TLI.LowerOperation(SDValue(Node, 0), DAG);
       if (Res && Res != SDValue(Node, 0))
         ReplaceNode(SDValue(Node, 0), Res);
       return;
     }
[26 unchanged lines elided; context: void SelectionDAGLegalize::LegalizeLoadOps(SDNode *Node) {]
   if (ExtType == ISD::NON_EXTLOAD) {
     MVT VT = Node->getSimpleValueType(0);
     SDValue RVal = SDValue(Node, 0);
     SDValue RChain = SDValue(Node, 1);

     switch (TLI.getOperationAction(Node->getOpcode(), VT)) {
     default: llvm_unreachable("This action is not supported yet!");
     case TargetLowering::Legal: {
+      EVT MemVT = LD->getMemoryVT();
       unsigned AS = LD->getAddressSpace();
       unsigned Align = LD->getAlignment();
       // If this is an unaligned load and the target doesn't support it,
       // expand it.
-      if (!TLI.allowsMisalignedMemoryAccesses(LD->getMemoryVT(), AS, Align)) {
-        Type *Ty = LD->getMemoryVT().getTypeForEVT(*DAG.getContext());
-        unsigned ABIAlignment =
-          TLI.getDataLayout()->getABITypeAlignment(Ty);
-        if (Align < ABIAlignment){
+      if (!TLI.allowsMemoryAccess(*DAG.getContext(), MemVT, AS, Align))
         ExpandUnalignedLoad(cast<LoadSDNode>(Node), DAG, TLI, RVal, RChain);
-        }
-      }
       break;
     }
     case TargetLowering::Custom: {
       SDValue Res = TLI.LowerOperation(RVal, DAG);
       if (Res.getNode()) {
         RVal = Res;
         RChain = Res.getValue(1);
       }
▲ Show 20 Lines • Show All 167 Lines • ▼ Show 20 Lines	case TargetLowering::Legal: {

if (isCustom) {		if (isCustom) {
SDValue Res = TLI.LowerOperation(SDValue(Node, 0), DAG);		SDValue Res = TLI.LowerOperation(SDValue(Node, 0), DAG);
if (Res.getNode()) {		if (Res.getNode()) {
Value = Res;		Value = Res;
Chain = Res.getValue(1);		Chain = Res.getValue(1);
}		}
} else {		} else {
// If this is an unaligned load and the target doesn't support		// If this is an unaligned load and the target doesn't support it,
// it, expand it.		// expand it.
EVT MemVT = LD->getMemoryVT();		EVT MemVT = LD->getMemoryVT();
unsigned AS = LD->getAddressSpace();		unsigned AS = LD->getAddressSpace();
unsigned Align = LD->getAlignment();		unsigned Align = LD->getAlignment();
if (!TLI.allowsMisalignedMemoryAccesses(MemVT, AS, Align)) {		if (!TLI.allowsMemoryAccess(*DAG.getContext(), MemVT, AS, Align))
Type Ty = LD->getMemoryVT().getTypeForEVT(DAG.getContext());
unsigned ABIAlignment = TLI.getDataLayout()->getABITypeAlignment(Ty);
if (Align < ABIAlignment){
ExpandUnalignedLoad(cast<LoadSDNode>(Node), DAG, TLI, Value, Chain);		ExpandUnalignedLoad(cast<LoadSDNode>(Node), DAG, TLI, Value, Chain);
}		}
}
}
break;		break;
}		}
case TargetLowering::Expand:		case TargetLowering::Expand:
if (!TLI.isLoadExtLegal(ISD::EXTLOAD, Node->getValueType(0), SrcVT)) {		if (!TLI.isLoadExtLegal(ISD::EXTLOAD, Node->getValueType(0), SrcVT)) {
// If the source type is not legal, see if there is a legal extload to		// If the source type is not legal, see if there is a legal extload to
// an intermediate type that we can then extend further.		// an intermediate type that we can then extend further.
EVT LoadVT = TLI.getRegisterType(SrcVT.getSimpleVT());		EVT LoadVT = TLI.getRegisterType(SrcVT.getSimpleVT());
if (TLI.isTypeLegal(SrcVT) \|\| // Same as SrcVT == LoadVT?		if (TLI.isTypeLegal(SrcVT) \|\| // Same as SrcVT == LoadVT?

lib/CodeGen/TargetLoweringBase.cpp


	/// getByValTypeAlignment - Return the desired alignment for ByVal aggregate			/// getByValTypeAlignment - Return the desired alignment for ByVal aggregate
	/// function arguments in the caller parameter area. This is the actual			/// function arguments in the caller parameter area. This is the actual
	/// alignment, not its logarithm.			/// alignment, not its logarithm.
	unsigned TargetLoweringBase::getByValTypeAlignment(Type *Ty) const {			unsigned TargetLoweringBase::getByValTypeAlignment(Type *Ty) const {
	return getDataLayout()->getABITypeAlignment(Ty);			return getDataLayout()->getABITypeAlignment(Ty);
	}			}

				bool TargetLoweringBase::allowsMemoryAccess(LLVMContext &Context, EVT VT,
				unsigned AddrSpace,
				unsigned Alignment,
				bool *Fast) const {
				// If a misaligned access of this type is allowed, then assume that an
				// aligned access of this type is allowed too.
				if (allowsMisalignedMemoryAccesses(VT, AddrSpace, Alignment, Fast))
jyknight (inline comment):

This won't return the right result in Fast. E.g. allowsMisalignedMemoryAccesses might return with `{ Fast = false; return true; }`, which would then cause this function to claim that a properly aligned memory access is not fast.

That's obviously wrong, but I'm not really sure what Right is. What's the actual contract for the ABI alignment and Preferred alignment information in DataLayout?

I think there's three things that LLVM might want to know about alignment and memory accesses:

1. If I can choose any alignment for this data, what should it be? (getPrefTypeAlignment should return that, I think)

2. Is this memory access guaranteed to work AT ALL, no matter the speed. I think that's a combination of getABITypeAlignment, and then allowsMisalignedMemoryAccesses without checking "Fast". Although: is it always guaranteed that the ABI alignment for a type is always okay to load/store from? It'd be sane for it to be, but it's not really totally crazy to have a target where that's not true: e.g. ABI alignment of 32bits for an i64, but requires 64bit alignment for an i64 load/store? I guess that's not the case of any existing target, since LLVM doesn't support it.

3. Is it worthwhile to combine some memory accesses into a larger memory access, given alignment of X? I'm not sure this is actually adequately expressed in LLVM right now. Could be it's intended to be getPrefTypeAlignment and allowsMisalignedMemoryAccesses (with testing Fast==true) is supposed to be this. But, is "preferred alignment" and "fast to access" supposed to always be the same thing? Because, e.g. MIPS says i8:8:32-i16:16:32-i64:64 -- so both i8 and i16 have abi alignment of their size, but preferred alignment of 32 bits. I don't know a lot about MIPS, so I don't know why it claims a preferred alignment of 32bits, but I think it has a perfectly good 1/2-byte load instructions that work at their natural alignment.

So on MIPS (even pre-MIPS32r6/MIPS64r6 which apparently made unaligned access allowed), it seems like it should always be an improvement to merge 8-bit loads which are 16-bit aligned into a 16-bit load. Yet, going by getPrefTypeAlignment would say that's not ok.
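The Fast problem described in the comment above can be reproduced with a small standalone model. This is only a sketch: `ToyTarget` and `allowsMemoryAccessAsPosted` are hypothetical names invented for illustration, alignments are plain byte counts, and nothing here is the LLVM API. The control flow mirrors the function as posted in this revision.

```cpp
#include <cassert>

// Hypothetical stand-in for a target; NOT the LLVM TargetLowering API.
// Alignments are expressed in bytes.
struct ToyTarget {
  unsigned ABIAlign;
  unsigned PrefAlign;

  // Models an override that tolerates misaligned accesses but reports
  // them as slow: the `{ Fast = false; return true; }` case.
  bool allowsMisalignedMemoryAccesses(unsigned /*Alignment*/,
                                      bool *Fast) const {
    if (Fast)
      *Fast = false;
    return true;
  }
};

// Same control flow as the allowsMemoryAccess() posted in this revision.
bool allowsMemoryAccessAsPosted(const ToyTarget &T, unsigned Alignment,
                                bool *Fast = nullptr) {
  assert(Alignment != 0 && "alignment must be non-zero");
  // Early return: *Fast holds whatever the misaligned hook wrote, even
  // when Alignment already satisfies the ABI alignment.
  if (T.allowsMisalignedMemoryAccesses(Alignment, Fast))
    return true;
  if (Fast)
    *Fast = Alignment >= T.PrefAlign;
  return Alignment >= T.ABIAlign;
}
```

With a 4-byte ABI alignment and an 8-byte-aligned access, the model returns true but leaves Fast set to false, which is the misreport being pointed out.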
spatel (author, inline comment):

Hi James - Thanks for the close review - much appreciated!

I can certainly fix the Fast bug that you noted, but I also don't know what the right answer is. My guess is that the MIPS DataLayout is wrong. Let's see if we can find out...

The 32-bit preferred alignment for MIPS i8/i16 was introduced way back here:
http://reviews.llvm.org/rL53585

Adding Bruno (author) and Daniel (currently listed as owner of the MIPS backend).
dsanders (inline comment):

I'm afraid I'm not sure why we prefer 32-bit alignment for i8 and i16. My best guess is that it pays off in library calls by allowing better-optimised memcpy/strcpy/strcmp/etc.

> I don't know a lot about MIPS, so I don't know why it claims a preferred alignment of 32bits, but I think it has a perfectly good 1/2-byte load instructions that work at their natural alignment.

That's right. As far as I know they're usually (possibly always) the same latency too.

> ... (even pre-MIPS32r6/MIPS64r6 which apparently made unaligned access allowed) ...

That's right. 32r6/64r6 dropped the unaligned load/store left/right instruction pairs in favour of a promise that something in the system will make unaligned load/store work without special instructions. Depending on the MIPS implementation, the 'something' could be full hardware support, or a software exception handler, or a hybrid such as hardware-support for unaligned accesses inside a cache-line and software otherwise.

> So on MIPS (even pre-MIPS32r6/MIPS64r6 which apparently made unaligned access allowed), it seems like it should always be an improvement to merge 8-bit loads which are 16-bit aligned into a 16-bit load. Yet, going by getPrefTypeAlignment would say that's not ok.

As far as the load is concerned, I agree it's always an improvement to merge like this. The risk is that manipulating the data might cost more than the saving. It's worth mentioning that some MIPS implementations will do this optimization in hardware without the risk of needing additional data manipulation.
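To make the `i8:8:32-i16:16:32-i64:64` notation in this thread concrete: in an LLVM data layout string, an integer entry has the form `i<size>:<abi>[:<pref>]` with all values in bits, and the preferred alignment defaults to the ABI alignment when omitted. The sketch below parses only such integer entries and has no error handling; `AlignSpec`, `parseIntAlignments`, and `meetsAlignment` are hypothetical helpers written for this illustration and are not part of LLVM.

```cpp
#include <cassert>
#include <cctype>
#include <map>
#include <sstream>
#include <string>

// ABI and preferred alignment of one integer type, in bits.
struct AlignSpec {
  unsigned ABIBits;
  unsigned PrefBits;
};

// Parses entries of the form i<size>:<abi>[:<pref>] separated by '-'.
// Entries of any other kind are skipped. Illustration only.
std::map<unsigned, AlignSpec> parseIntAlignments(const std::string &Layout) {
  std::map<unsigned, AlignSpec> Result;
  std::istringstream Entries(Layout);
  std::string Entry;
  while (std::getline(Entries, Entry, '-')) {
    if (Entry.size() < 2 || Entry[0] != 'i' ||
        !std::isdigit(static_cast<unsigned char>(Entry[1])))
      continue;
    std::istringstream Fields(Entry.substr(1));
    std::string Field;
    unsigned Values[3] = {0, 0, 0};
    unsigned Count = 0;
    while (Count < 3 && std::getline(Fields, Field, ':'))
      Values[Count++] = static_cast<unsigned>(std::stoul(Field));
    if (Count < 2)
      continue;
    // A missing <pref> field means preferred == ABI alignment.
    Result[Values[0]] =
        AlignSpec{Values[1], Count == 3 ? Values[2] : Values[1]};
  }
  return Result;
}

// True if an access with the given alignment meets the threshold.
bool meetsAlignment(unsigned AlignBits, unsigned ThresholdBits) {
  assert(ThresholdBits != 0 && "threshold must be non-zero");
  return AlignBits >= ThresholdBits;
}
```

For the MIPS fragment, a 16-bit-aligned i16 access meets the ABI alignment (16) but not the preferred alignment (32), which is why a rule based on getPrefTypeAlignment would reject the merge discussed above.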
jyknight (inline comment):

Even if there were no reason at all for MIPS to have preferred alignment set as it is, I think I've convinced myself that using "preferred" alignment in this function isn't correct.

The "Fast" output should be indicating whether it's faster to do a potentially-misaligned load/store, or to use multiple load/store instructions. But "preferred" alignment is indicating the overall BEST alignment choice, not just what is going to be faster-or-equal than separate loads.

E.g. consider an i64:32:64 alignment spec, with a target that doesn't support under-aligned memory access (it supports the abi alignment of 32bits, but not below). It still is quite possible that the target behavior is such that merging two 32-bit loads to a 64bit load is a good idea -- e.g. that it's at worst the same speed as two 32-bit loads, but is still faster when there's 64bit alignment.

So perhaps the DataLayout should not actually be used to determine what memory alignments are valid for load/store on a target at all? That is, I'm thinking that targets maybe ought to implement "allowsMemoryAccess" directly, describing the entirety of that target's rules for what memory alignments work, and are "fast", rather than having "allowsMisalignedMemoryAccess" which describes only half of them, the other half intuited from the DataLayout.

That seems a potentially saner model -- leaving the "Data Layout" spec to just describe the layout, and the target to describe what the hardware/runtime-environment can actually do, independent of that.

Alternatively it might be okay for the ABI alignment to just be assumed to be fast. If that's not the case, that's certainly a suboptimal ABI, but it wouldn't surprise me if some ABIs were suboptimal in that way.
spatel (author, inline comment):

I see your point about making a new target hook that implements all of 'allowsMemoryAccess' directly, but a quick look at the implementations of DataLayout::getAlignment() and DataLayout::getAlignmentInfo() makes me scared...

Let me post an updated version of the patch that assumes the ABI alignment is fast and see what people think.
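One way the "assume the ABI alignment is fast" idea could be arranged is to check the ABI alignment first and only consult the misaligned-access hook for under-aligned accesses. The following is a standalone sketch under that assumption; `ToyTarget` and `allowsMemoryAccessABIFast` are hypothetical names, alignments are in bytes, and the updated patch itself is not shown in this part of the review.

```cpp
#include <cassert>

// Hypothetical stand-in for a target; NOT the LLVM TargetLowering API.
struct ToyTarget {
  unsigned ABIAlign;       // ABI alignment of the accessed type, in bytes
  bool MisalignedAllowed;  // does the target tolerate under-aligned access?
  bool MisalignedFast;     // if tolerated, is it considered fast?

  bool allowsMisalignedMemoryAccesses(unsigned /*Alignment*/,
                                      bool *Fast) const {
    if (MisalignedAllowed && Fast)
      *Fast = MisalignedFast;
    return MisalignedAllowed;
  }
};

// Sketch: an access that meets the ABI alignment is allowed and assumed
// fast; anything below that is left to the misaligned-access hook.
bool allowsMemoryAccessABIFast(const ToyTarget &T, unsigned Alignment,
                               bool *Fast = nullptr) {
  assert(Alignment != 0 && "alignment must be non-zero");
  if (Alignment >= T.ABIAlign) {
    if (Fast)
      *Fast = true;
    return true;
  }
  return T.allowsMisalignedMemoryAccesses(Alignment, Fast);
}
```

With this ordering the hook's Fast value can no longer leak into the answer for a properly aligned access, and the preferred alignment is not consulted at all.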
qcolombet (inline comment):

I’d say that’s fine for the default behavior, but we could provide a way to change that at the target discretion by making the hook virtual.
qcolombet (inline comment):

Note: We could also do the virtual thing when we get an actual use case. I do not feel strongly either way.
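As a rough illustration of the virtual-hook suggestion, the sketch below gives a base class a default policy and lets one target override it. All class names are hypothetical and sizes and alignments are in bytes; this is not LLVM code. The overriding target models the earlier hypothetical of an i64 whose ABI alignment is 32 bits while the hardware needs 64-bit alignment for a 64-bit access.

```cpp
#include <cassert>

// Hypothetical base class; NOT the LLVM TargetLoweringBase API.
struct ToyLoweringBase {
  virtual ~ToyLoweringBase() = default;

  // Default ABI alignment: the natural size of the type.
  virtual unsigned getABIAlign(unsigned SizeInBytes) const {
    return SizeInBytes;
  }

  // Default: no under-aligned accesses.
  virtual bool allowsMisalignedMemoryAccesses(unsigned /*SizeInBytes*/,
                                              unsigned /*Alignment*/,
                                              bool * /*Fast*/) const {
    return false;
  }

  // Default policy: ABI-aligned accesses are allowed and assumed fast.
  virtual bool allowsMemoryAccess(unsigned SizeInBytes, unsigned Alignment,
                                  bool *Fast = nullptr) const {
    assert(Alignment != 0 && "alignment must be non-zero");
    if (Alignment >= getABIAlign(SizeInBytes)) {
      if (Fast)
        *Fast = true;
      return true;
    }
    return allowsMisalignedMemoryAccesses(SizeInBytes, Alignment, Fast);
  }
};

// A target whose ABI aligns 8-byte values to only 4 bytes.
struct LooseABITarget : ToyLoweringBase {
  unsigned getABIAlign(unsigned SizeInBytes) const override {
    return SizeInBytes == 8 ? 4 : SizeInBytes;
  }
};

// Same ABI, but the hardware needs 8-byte alignment for 8-byte accesses,
// so the target overrides the whole hook instead of relying on the ABI.
struct StrictI64Target : LooseABITarget {
  bool allowsMemoryAccess(unsigned SizeInBytes, unsigned Alignment,
                          bool *Fast = nullptr) const override {
    if (SizeInBytes == 8 && Alignment < 8)
      return false;
    return LooseABITarget::allowsMemoryAccess(SizeInBytes, Alignment, Fast);
  }
};
```

Under the default policy, `LooseABITarget` accepts an 8-byte access at 4-byte alignment because the ABI says so, while `StrictI64Target` rejects it. Whether such an override is worth adding before a real target needs it is the open question in the two comments above.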
				return true;

				// If a misaligned access of this type is not allowed, see if the specified
				// alignment is sufficient based on the data layout.
				Type *Ty = VT.getTypeForEVT(Context);
				const DataLayout *DL = getDataLayout();
mehdi_amini (inline comment):

I'm removing the getDataLayout() from TargetLowering soon (D11079), you'll have to find another way of getting it (argument to the function?).

				// If we want to know if this access is fast, use the preferred alignment.
				if (Fast != nullptr)
				*Fast = (Alignment >= DL->getPrefTypeAlignment(Ty));

				// The access is allowed if its alignment meets the ABI-specified setting.
				return (Alignment >= DL->getABITypeAlignment(Ty));
				}


	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// TargetTransformInfo Helpers			// TargetTransformInfo Helpers
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	int TargetLoweringBase::InstructionOpcodeToISD(unsigned Opcode) const {			int TargetLoweringBase::InstructionOpcodeToISD(unsigned Opcode) const {
	enum InstructionOpcodes {			enum InstructionOpcodes {
	#define HANDLE_INST(NUM, OPCODE, CLASS) OPCODE = NUM,			#define HANDLE_INST(NUM, OPCODE, CLASS) OPCODE = NUM,
	#define LAST_OTHER_INST(NUM) InstructionOpcodesCount = NUM			#define LAST_OTHER_INST(NUM) InstructionOpcodesCount = NUM