This is an archive of the discontinued LLVM Phabricator instance.

Differential D31924

SROA: Allow eliminating addrspacecasted allocas
ClosedPublic

Authored by arsenm on Apr 10 2017, 8:52 PM.

Details

Reviewers

chandlerc
sanjoy
theraven

Summary

This is a resurrection of D10482 and D4501.

There is a circular dependency between SROA and InferAddressSpaces
today that requires running both multiple times in order to be able to
eliminate all simple allocas and addrspacecasts. InferAddressSpaces
can't remove addrspacecasts when written to memory, and SROA helps
move pointers out of memory.

This should avoid inserting new commuting addrspacecasts with GEPs,
since there are unresolved questions about pointer wrapping between
different address spaces.

For now, don't replace volatile operations that don't match the alloca
addrspace, as it would change the address space of the access. It may
be still OK to insert an addrspacecast from the new alloca, but be
more conservative for now.
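
The conservative volatile rule can be sketched in isolation. This is a minimal standalone model, not the SROA code: `MemAccess` and `mustAbortRewrite` are hypothetical names, and address spaces are modeled as plain integers.

```cpp
#include <cassert>

// Hypothetical stand-in for a load, store, or memory intrinsic operand.
struct MemAccess {
  bool IsVolatile;
  unsigned AddrSpace; // address space of the pointer operand
};

// Returns true when the access should be left alone: rewriting a volatile
// access to use the new alloca directly would change the address space in
// which the volatile access is performed.
constexpr bool mustAbortRewrite(MemAccess A, unsigned AllocaAddrSpace) {
  return A.IsVolatile && A.AddrSpace != AllocaAddrSpace;
}

// Non-volatile accesses through a cast pointer can still be rewritten.
static_assert(!mustAbortRewrite({false, 1}, 0), "non-volatile is fine");
// Volatile accesses already in the alloca address space can be rewritten.
static_assert(!mustAbortRewrite({true, 0}, 0), "same address space");
// Volatile accesses in another address space are left untouched.
static_assert(mustAbortRewrite({true, 1}, 0), "would change addrspace");
```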

Event Timeline

arsenm created this revision. Apr 10 2017, 8:52 PM

Herald added a subscriber: wdng. Apr 10 2017, 8:52 PM

efriedma added a subscriber: efriedma. Apr 11 2017, 11:42 AM

efriedma added inline comments.

docs/LangRef.rst
8452 ↗	(On Diff #94776)	Do we need to restrict this rule to inbounds indexing?

arsenm added inline comments. Apr 11 2017, 12:28 PM

docs/LangRef.rst
8452 ↗	(On Diff #94776)	I suppose this should specify for a defined result. The idea is only really to disallow implementations that can't round-trip, like in @sanjoy's example of implementing it with abs.

Specify defined results, and that the pointer must be able to round trip

efriedma added inline comments. Apr 14 2017, 10:45 AM

docs/LangRef.rst
8451 ↗	(On Diff #95311)	What does "and then indexed" mean for a gep that isn't inbounds? We clearly can't make all indexing equivalent: if you use a gep to increment a pointer by 2^33, that's clearly going to have a different result if you try to round-trip that value through a 32-bit pointer. I mean, I understand what you're getting at with the new text: the rule is essentially that the original and casted pointers point at the same memory allocation. LangRef really needs to be clear, though.
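
The 2^33 example can be made concrete with a small standalone sketch. It assumes, purely for illustration, that the cast between a 64-bit and a 32-bit address space truncates in one direction and zero-extends in the other; real targets may define the cast differently. All function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumed cast semantics: truncate (wide -> narrow), zero-extend (narrow -> wide).
constexpr uint32_t castToNarrow(uint64_t P) { return static_cast<uint32_t>(P); }
constexpr uint64_t castToWide(uint32_t P) { return P; }

// Non-inbounds indexing modeled as wrapping addition in the pointer's width.
constexpr uint64_t gepWide(uint64_t P, uint64_t Off) { return P + Off; }

constexpr uint64_t Base = 0x1000;
constexpr uint64_t Step = uint64_t{1} << 33;

// A pointer that fits in 32 bits survives the round trip...
static_assert(castToWide(castToNarrow(Base)) == Base, "round trip ok");
// ...but after indexing by 2^33 the high bits are lost in the narrow space,
// so the round trip lands back on Base instead of Base + 2^33.
static_assert(castToWide(castToNarrow(gepWide(Base, Step))) == Base,
              "offset dropped");
static_assert(castToWide(castToNarrow(gepWide(Base, Step))) !=
                  gepWide(Base, Step),
              "no round trip");
```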

Attempt simpler phrasing

efriedma added inline comments. Apr 18 2017, 12:30 PM

docs/LangRef.rst
8460 ↗	(On Diff #95441)	"The pointer conversion cannot be an arbitrarily complex value modification." is a bit vague... it'd be better if we could specifically say what transforms are allowed.
lib/Transforms/Scalar/SROA.cpp
1661	Why do we want to generate an address-space cast here, as opposed to performing the memory operation using the alloca's natural address-space?

Remove unnecessary change. The pointer already has the right address space

arsenm added inline comments. Apr 20 2017, 1:18 PM

docs/LangRef.rst
8460 ↗	(On Diff #95441)	Would just dropping that sentence work? The intent is more clear in the second sentence where it needs to be reversible

I'd like to see some testcases which involve mixing GEPs and addrspacecasts. What width of APInt do we use to accumulate the offset? How does overflow work when we mix GEPs in different address-spaces?

docs/LangRef.rst
8460 ↗	(On Diff #95441)	Yes, I think that's fine.

sanjoy added inline comments. Apr 20 2017, 1:41 PM

lib/Transforms/Scalar/SROA.cpp
1631	Are you ruling that `GEP(CAST(X), 1)` is the same as `CAST(GEP X, 1)`? If so, I'm not sure this is correct given your constraint on address space casts. For instance, if casting from address space N to M, with both spaces having the same pointer width, involves flipping the high and low halves of the pointer then `GEP(CAST(X), 1)` is not the same as `(CAST(GEP X, 1))`. Of course, this means that GEPs over pointers of address space M are different operations from GEPs over pointers of address space N, but that's allowed, AFAIK.
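
The half-flipping cast described above can be written out as a short standalone sketch. The cast and GEP models here are hypothetical: both address spaces are 32 bits wide, the cast swaps the 16-bit halves, and a GEP is wrapping addition. The cast is its own inverse, so every pointer round-trips, yet it does not commute with indexing.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical cast between two 32-bit address spaces: swap the high and
// low 16-bit halves. Applying it twice returns the original pointer.
constexpr uint32_t swapHalves(uint32_t P) { return (P << 16) | (P >> 16); }

// GEP by a byte offset, modeled as wrapping 32-bit addition.
constexpr uint32_t gep(uint32_t P, uint32_t Off) { return P + Off; }

constexpr uint32_t X = 0x00010000;

static_assert(swapHalves(swapHalves(X)) == X, "cast is reversible");
static_assert(gep(swapHalves(X), 1) == 0x00000002, "GEP(CAST(X), 1)");
static_assert(swapHalves(gep(X, 1)) == 0x00010001, "CAST(GEP(X, 1))");
```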

arsenm added inline comments. Apr 20 2017, 1:45 PM

lib/Transforms/Scalar/SROA.cpp
1631	Yes, those should be the same

sanjoy added inline comments. Apr 20 2017, 1:46 PM

lib/Transforms/Scalar/SROA.cpp
1631	Then you need to change the langref to disallow address space casts like the one above (flipping the high and low halves of the pointer).

Try to re-word langref again.

Re-add addrspacecast insertion. It can be necessary when the alloca isn't entirely eliminated.

Fix asserting when casting between different sized pointers
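
The final diff handles differently sized pointers in `adjustOffsetForGEP` by accumulating each GEP's constant offset at the GEP's index width and then sign-extending or truncating it to the width of the running offset. The arithmetic can be sketched without LLVM as below; `sextOrTrunc` and `addGepOffset` are hypothetical helpers that model the idea on 64-bit integers for widths from 1 to 64, not the `APInt` API itself.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend or truncate the low FromBits of V to ToBits. The result is
// returned sign-extended to 64 bits. Either direction reduces to
// sign-extending the low min(FromBits, ToBits) bits.
inline int64_t sextOrTrunc(int64_t V, unsigned FromBits, unsigned ToBits) {
  assert(FromBits >= 1 && FromBits <= 64 && ToBits >= 1 && ToBits <= 64);
  unsigned Bits = FromBits < ToBits ? FromBits : ToBits;
  if (Bits == 64)
    return V;
  uint64_t Mask = (uint64_t{1} << Bits) - 1;
  uint64_t Low = static_cast<uint64_t>(V) & Mask;
  uint64_t Sign = uint64_t{1} << (Bits - 1);
  // Flip the sign bit, then subtract it: this sign-extends Low.
  return static_cast<int64_t>((Low ^ Sign) - Sign);
}

// Add a GEP offset computed at GepIndexBits to a running offset kept at
// OffsetBits, wrapping in OffsetBits.
inline int64_t addGepOffset(int64_t Offset, unsigned OffsetBits,
                            int64_t GepOffset, unsigned GepIndexBits) {
  int64_t Adjusted = sextOrTrunc(GepOffset, GepIndexBits, OffsetBits);
  // Add as unsigned so that overflow wraps instead of being undefined.
  uint64_t Sum =
      static_cast<uint64_t>(Offset) + static_cast<uint64_t>(Adjusted);
  return sextOrTrunc(static_cast<int64_t>(Sum), OffsetBits, OffsetBits);
}
```

For example, a 32-bit GEP offset with bit pattern `0xFFFFFFFC` is treated as -4 when folded into a 64-bit running offset.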

sanjoy added a subscriber: theraven. Apr 20 2017, 8:03 PM

Restricting addrspace cast in this way seems... really hard to get right. I still have the question Eli asked: what does this mean in the absence of inbounds? What if the address spaces have different wrapping behavior even though they have the same number of bits?

Consider an address space where there are tag bits in the high bits and one where there aren't. These may appear to be the same type, but the GEP-ing rule you propose doesn't seem to generally hold.

And that's just one example. I'm not sure even restricting this to inbounds will really fix the issue.

What about approaching this more from the inference perspective? Could we embed the inference into the iteration of SROA without shifting the restrictions so much?

theraven added inline comments. Apr 21 2017, 1:11 AM

docs/LangRef.rst
8460 ↗	(On Diff #95441)	It would be nice to clarify what 'legal' means in this context. For us, the relationship between address spaces 0 and 200 relies on some run-time properties. Address space 200 is always a superset of address space 0, so it is always safe to cast from AS 0 to AS 200, but the converse might not be possible and will give either a valid value (without bounds information) or a null pointer. A cast from AS200 -> AS0 -> AS200 may result in a null pointer if the address is outside the range covered by AS0. The same would apply on microcontrollers with a 32-bit global address space and a 16-bit address space mapped within that: you could always cast from the 16-bit range to the 32-bit range and back, but casting from an arbitrary 32-bit range to the 16-bit range and back may not work.
lib/Analysis/PtrUseVisitor.cpp
37–43	No changes suggested here, but in our version we have queries on the data layout that differentiate between the type and the range of a pointer (ours are 128- or 256-bit sized, but with a 64-bit range).
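
The microcontroller case in the LangRef comment above (a 16-bit address space mapped inside a 32-bit one) can be sketched as follows. The window base and the convention that a failed cast yields null (0) are assumptions for illustration only; `toGlobal` and `toWindow` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout: the 16-bit space is a 64 KiB window starting at
// WindowBase in the 32-bit global space. Null is 0 in both spaces.
constexpr uint32_t WindowBase = 0x20000000;

// Narrow -> wide: always representable; null maps to null.
constexpr uint32_t toGlobal(uint16_t P) {
  return P == 0 ? 0 : WindowBase + P;
}

// Wide -> narrow: only addresses inside the window are representable;
// anything else becomes null.
constexpr uint16_t toWindow(uint32_t P) {
  return (P > WindowBase && P - WindowBase <= 0xFFFF)
             ? static_cast<uint16_t>(P - WindowBase)
             : 0;
}

// 16-bit -> 32-bit -> 16-bit always round-trips.
static_assert(toWindow(toGlobal(0x1234)) == 0x1234, "subset round trip");
// 32-bit -> 16-bit -> 32-bit fails for addresses outside the window.
static_assert(toGlobal(toWindow(0x30000000)) == 0, "out of range gives null");
```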

In D31924#733213, @chandlerc wrote:

Restricting addrspace cast in this way seems... really hard to get right. I still have the question Eli asked: what does this mean in the absence of inbounds? What if the address spaces have different wrapping behavior even though they have the same number of bits?

Consider an address space where there are tag bits in the high bits and one where there aren't. These may appear to be the same type, but the GEP-ing rule you propose doesn't seem to generally hold.

And that's just one example. I'm not sure even restricting this to inbounds will really fix the issue.

What about approaching this more from the inference perspective? Could we embed the inference into the iteration of SROA without shifting the restrictions so much?

I'm not sure exactly what you mean by this. Do you mean somehow merging InferAddressSpaces and SROA?

lib/Analysis/PtrUseVisitor.cpp
37–43	We will probably need this at some point for using pointers for resource descriptors

In D31924#735918, @arsenm wrote:

In D31924#733213, @chandlerc wrote:

What about approaching this more from the inference perspective? Could we embed the inference into the iteration of SROA without shifting the restrictions so much?

I'm not sure exactly what you mean by this. Do you mean somehow merging InferAddressSpaces and SROA?

In a limited form...

Essentially, expose utilities to infer address spaces which can be shared with the InferAddressSpaces pass but can also be used to infer address spaces for allocas as SROA promotes their uses into SSA registers.

Was there a decision reached here on what the correct semantics are? There are other places in LLVM (I found one in instcombine - there may be others) which do make the assumption that this change is proposing to introduce to the langref. Personally, I don't think this transformation should be allowed. I know there are architectures where different address spaces have different GEP behavior (though I'm not sure if this is the case for any in-tree backend). Nevertheless, if people do feel like this should be allowed (e.g. because such architectures should use something other than addrspacecast to convert between such address spaces), that's fine with me as well, but I think there should be a clear statement in the langref one way or the other. As is, different people read the langref differently.

In D31924#754467, @loladiro wrote:

Was there a decision reached here on what the correct semantics are? There are other places in LLVM (I found one in instcombine - there may be others) which do make the assumption that this change is proposing to introduce to the langref. Personally, I don't think this transformation should be allowed. I know there are architectures where different address spaces have different GEP behavior (though I'm not sure if this is the case for any in-tree backend). Nevertheless, if people do feel like this should be allowed (e.g. because such architectures should use something other than addrspacecast to convert between such address spaces), that's fine with me as well, but I think there should be a clear statement in the langref one way or the other. As is, different people read the langref differently.

Do you mean non-integral pointers? I don't think this changes the rules I was thinking of. You still should have a reversible result. AMDGPU will use some non-integral pointers eventually, although it doesn't need them for this particular case.

I think the way the non-integral pointer is worded now is to allow GC etc. to completely replace the pointer value, in which case eliminating it like this is probably not OK. Would it work to restrict this for only integral pointer address spaces?

There was some discussion about non-integral address spaces at EuroLLVM. The current restriction is too great, as not allowing ptrtoint and inttoptr makes it impossible to support C-like languages. We discussed refining the definition to be that optimisers should not introduce inttoptr or ptrtoint, but that they are allowed to be inserted by the front end in places where they are valid in the context of the source language.

This change still has the problem with requiring bijection between address spaces, which is not the case for us or for any platform where there is a subset relationship between address spaces. For example, casting from a 32-bit address to a 16-bit address and then back is not guaranteed to give the same address. On an ARM M-profile chip, casting between addresses in different MPU sections may encounter similar problems.

I'm very nervous of any optimisations that introduce address space casts, because they rely on far more knowledge of the relationship between address spaces than we currently provide with the data layout. Perhaps providing that information in the data layout should be a prerequisite for this.

Do you mean non-integral pointers?

No, I don't mean non-integral pointers (though there are problems there too). I apologize if I'm being vague here, but I read a lot of architecture specs and I can never remember what is and is not public. In any case, I think the easiest example here is virtual memory. Consider an architecture with primitives for both physical and virtual memory and instructions for converting between the two quickly. It seems perfectly plausible to want to express the conversion between the two kinds of pointers as an address space cast. However, GEPs and address space casts certainly don't commute here. Now, as I said, you might argue that such a crazy address space cast deserves a target-specific intrinsic, and I think that's a fine stance to take. I suppose the other alternative would be to add some information to the datalayout or to the address space cast itself to indicate whether the optimizer is allowed to introduce address space casts that weren't in the original program.
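
The virtual-memory example can be illustrated with a toy page table. Everything here is hypothetical (the four-entry table, the 4 KiB page size, the function names); the sketch only shows that translating an address and then indexing can differ from indexing and then translating once a page boundary is crossed.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t PageSize = 4096;

// Toy page table: virtual page number -> physical page number. Adjacent
// virtual pages are deliberately not adjacent physically.
constexpr uint64_t PageTable[4] = {7, 2, 9, 4};

// "Address space cast" from a virtual to a physical address. The virtual
// page number is reduced modulo 4 only to keep the toy table in range.
constexpr uint64_t toPhysical(uint64_t VAddr) {
  return PageTable[(VAddr / PageSize) % 4] * PageSize + VAddr % PageSize;
}

// GEP by a byte offset, modeled as plain addition.
constexpr uint64_t gep(uint64_t P, uint64_t Off) { return P + Off; }

constexpr uint64_t V = 4090; // virtual page 0, six bytes before its end

// Within one page, the cast and the GEP commute.
static_assert(toPhysical(gep(V, 4)) == gep(toPhysical(V), 4), "same page");
// Across a page boundary, they do not.
static_assert(toPhysical(gep(V, 8)) != gep(toPhysical(V), 8), "crosses page");
```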

In D31924#754613, @theraven wrote:

There was some discussion about non-integral address spaces at EuroLLVM. The current restriction is too great, as not allowing ptrtoint and inttoptr makes it impossible to support C-like languages. We discussed refining the definition to be that optimisers should not introduce inttoptr or ptrtoint, but that they are allowed to be inserted by the front end in places where they are valid in the context of the source language.

You're making me regret not making it to EuroLLVM even more. :)

We disallowed ptrtoint and inttoptr because these instructions (today) are arbitrarily speculatable; and changing that to be dependent on their types would introduce complexity.

Instead my plan is to add intrinsics to convert between ni pointers and integers with exactly the property you mentioned -- these intrinsics may have side effects so they can't be inserted by the optimizer or speculated, but the frontend may insert them when legal.

For us, speculation isn't a problem. ptrtoint is not guaranteed to give stable results in all run-time environments (i.e. if we enable a copying GC), but it doesn't break the memory safety guarantees. inttoptr only works in some execution environments (and will result in a null where it wouldn't work), and it's up to the C programmer to ensure that they don't use it when it wouldn't be sensible and other front ends won't emit it at all. Code works as expected, as long as optimisers don't try to add them.

loladiro mentioned this in D33361: [InstCombine] Fix inbounds gep for addrspacecasts. May 19 2017, 10:00 AM

Having pondered this some more, I wonder if what we're missing is an annotation on the addrspace cast itself that indicates whether or not GEPs may be commuted past it (could call it inbounds or something else). It seems like in many cases (including allocas), the frontend (or whoever else is doing language/target specific work) can often know whether the entire object is available in the target address space (e.g. because the entire stack always is).

Rebase and fix using the pointer type instead of the new indexing type. Don't introduce a new addrspacecast, since it's easily avoidable.

Since the addrspacecast is no longer inserted, I think that avoids some of the questions about pointer wrapping? The addrspacecasts are only eliminated, and ignored for computing the offset.

lib/Transforms/Scalar/SROA.cpp
1631	Does this only matter because of the newly introduced addrspacecast? This may change the pointer value, but we only care about number of bytes indexed off of the original object. Changing the representation in the middle shouldn't change the total number of bytes addressed from the original object?
1661	You're right, this is unnecessary
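
The point about only caring how many bytes are indexed off the original object can be modeled with a toy use chain. This is a sketch under stated assumptions, not the `PtrUseVisitor` implementation: `StepKind`, `Step`, and `offsetFromAlloca` are hypothetical, and each GEP is assumed to carry a known constant byte offset.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical pointer-deriving steps between an alloca and a memory access.
enum class StepKind { GEP, BitCast, AddrSpaceCast };

struct Step {
  StepKind Kind;
  int64_t ByteOffset; // only meaningful for GEP; 0 for casts
};

// Total byte offset from the start of the original object. Casts may change
// the pointer's type or representation but contribute no offset.
inline int64_t offsetFromAlloca(const std::vector<Step> &Chain) {
  int64_t Offset = 0;
  for (const Step &S : Chain) {
    assert(S.Kind == StepKind::GEP || S.ByteOffset == 0);
    if (S.Kind == StepKind::GEP)
      Offset += S.ByteOffset;
  }
  return Offset;
}
```

A chain such as addrspacecast, GEP by 8, bitcast, GEP by 4 yields the same offset of 12 as the two GEPs alone.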

jdoerfert added a subscriber: jdoerfert. Jun 3 2019, 1:43 PM

Re-apply change to insert a final addrspacecast in getAdjustedPtr, which is necessary in some cases.

Don't allow getAdjustedPtr to search through addrspacecast. This should prevent the questionable addrspacecast-GEP commuting behavior. The practical case that matters is an alloca immediately casted, and all addressing is done in the result address space, so getting the same GEP folds in all cases as a bitcast isn't critically important, though would be nice to have. Some new addrspacecasts may be introduced, but not followed by a GEP.

Fix changing the address space of volatile operations, although inserting a new addrspacecast should be OK in these cases. For now just leave these cases alone.

Also fix some missing test coverage.

arsenm marked an inline comment as done. Jun 10 2019, 1:45 PM

arsenm added inline comments.

lib/Transforms/Scalar/SROA.cpp
1661	This is actually necessary in some cases. (e.g. in select_addrspacecast_const_op, without this, the two operands of the select end up as different types)

I think this looks like it will improve codegen for us and not violate any of our C-level guarantees. Hopefully @arichardson can also take a look.

This revision is now accepted and ready to land. Jun 11 2019, 1:31 AM

In D31924#1537558, @theraven wrote:

I think this looks like it will improve codegen for us and not violate any of our C-level guarantees. Hopefully @arichardson can also take a look.

I just tried this on our fork and it looks good.

test/Transforms/SROA/basictest.ll
108	Use FileCheck captures for the variables in case the naming changes in the future?

r363462

Revision Contents

Path                                      Size
include/llvm/Analysis/PtrUseVisitor.h     4 lines
lib/Analysis/PtrUseVisitor.cpp            8 lines
lib/Transforms/Scalar/SROA.cpp            51 lines
test/Transforms/SROA/addrspacecast.ll     60 lines
test/Transforms/SROA/basictest.ll         110 lines
test/Transforms/SROA/phi-and-select.ll    50 lines

Diff 203889

include/llvm/Analysis/PtrUseVisitor.h

@@ ... @@ void visitStoreInst(StoreInst &SI) {
     if (SI.getValueOperand() == U->get())
       PI.setEscaped(&SI);
   }

   void visitBitCastInst(BitCastInst &BC) {
     enqueueUsers(BC);
   }

+  void visitAddrSpaceCastInst(AddrSpaceCastInst &ASC) {
+    enqueueUsers(ASC);
+  }
+
   void visitPtrToIntInst(PtrToIntInst &I) {
     PI.setEscaped(&I);
   }

   void visitGetElementPtrInst(GetElementPtrInst &GEPI) {
     if (GEPI.use_empty())
       return;

[...]

lib/Analysis/PtrUseVisitor.cpp

@@ ... @@ for (Use &U : I.uses()) {
     }
   }
 }

 bool detail::PtrUseVisitorBase::adjustOffsetForGEP(GetElementPtrInst &GEPI) {
   if (!IsOffsetKnown)
     return false;

-  return GEPI.accumulateConstantOffset(DL, Offset);
+  APInt TmpOffset(DL.getIndexTypeSizeInBits(GEPI.getType()), 0);
+  if (GEPI.accumulateConstantOffset(DL, TmpOffset)) {
+    Offset += TmpOffset.sextOrTrunc(Offset.getBitWidth());
+    return true;
+  }
+
+  return false;
 }

lib/Transforms/Scalar/SROA.cpp

Show First 20 Lines • Show All 707 Lines • ▼ Show 20 Lines	private:

void visitBitCastInst(BitCastInst &BC) {		void visitBitCastInst(BitCastInst &BC) {
if (BC.use_empty())		if (BC.use_empty())
return markAsDead(BC);		return markAsDead(BC);

return Base::visitBitCastInst(BC);		return Base::visitBitCastInst(BC);
}		}

		void visitAddrSpaceCastInst(AddrSpaceCastInst &ASC) {
		if (ASC.use_empty())
		return markAsDead(ASC);

		return Base::visitAddrSpaceCastInst(ASC);
		}

void visitGetElementPtrInst(GetElementPtrInst &GEPI) {		void visitGetElementPtrInst(GetElementPtrInst &GEPI) {
if (GEPI.use_empty())		if (GEPI.use_empty())
return markAsDead(GEPI);		return markAsDead(GEPI);

if (SROAStrictInbounds && GEPI.isInBounds()) {		if (SROAStrictInbounds && GEPI.isInBounds()) {
// FIXME: This is a manually un-factored variant of the basic code inside		// FIXME: This is a manually un-factored variant of the basic code inside
// of GEPs with checking of the inbounds invariant specified in the		// of GEPs with checking of the inbounds invariant specified in the
// langref in a very strict sense. If we ever want to enable		// langref in a very strict sense. If we ever want to enable
▲ Show 20 Lines • Show All 47 Lines • ▼ Show 20 Lines	private:

void visitLoadInst(LoadInst &LI) {		void visitLoadInst(LoadInst &LI) {
assert((!LI.isSimple() \|\| LI.getType()->isSingleValueType()) &&		assert((!LI.isSimple() \|\| LI.getType()->isSingleValueType()) &&
"All simple FCA loads should have been pre-split");		"All simple FCA loads should have been pre-split");

if (!IsOffsetKnown)		if (!IsOffsetKnown)
return PI.setAborted(&LI);		return PI.setAborted(&LI);

const DataLayout &DL = LI.getModule()->getDataLayout();		if (LI.isVolatile() &&
		LI.getPointerAddressSpace() != DL.getAllocaAddrSpace())
		return PI.setAborted(&LI);

uint64_t Size = DL.getTypeStoreSize(LI.getType());		uint64_t Size = DL.getTypeStoreSize(LI.getType());
return handleLoadOrStore(LI.getType(), LI, Offset, Size, LI.isVolatile());		return handleLoadOrStore(LI.getType(), LI, Offset, Size, LI.isVolatile());
}		}

void visitStoreInst(StoreInst &SI) {		void visitStoreInst(StoreInst &SI) {
Value *ValOp = SI.getValueOperand();		Value *ValOp = SI.getValueOperand();
if (ValOp == *U)		if (ValOp == *U)
return PI.setEscapedAndAborted(&SI);		return PI.setEscapedAndAborted(&SI);
if (!IsOffsetKnown)		if (!IsOffsetKnown)
return PI.setAborted(&SI);		return PI.setAborted(&SI);

const DataLayout &DL = SI.getModule()->getDataLayout();		if (SI.isVolatile() &&
		SI.getPointerAddressSpace() != DL.getAllocaAddrSpace())
		return PI.setAborted(&SI);

uint64_t Size = DL.getTypeStoreSize(ValOp->getType());		uint64_t Size = DL.getTypeStoreSize(ValOp->getType());

// If this memory access can be shown to statically extend outside the		// If this memory access can be shown to statically extend outside the
// bounds of the allocation, it's behavior is undefined, so simply		// bounds of the allocation, it's behavior is undefined, so simply
// ignore it. Note that this is more strict than the generic clamping		// ignore it. Note that this is more strict than the generic clamping
// behavior of insertUse. We also try to handle cases which might run the		// behavior of insertUse. We also try to handle cases which might run the
// risk of overflow.		// risk of overflow.
// FIXME: We should instead consider the pointer to have escaped if this		// FIXME: We should instead consider the pointer to have escaped if this
Show All 18 Lines	void visitMemSetInst(MemSetInst &II) {
if ((Length && Length->getValue() == 0) \|\|		if ((Length && Length->getValue() == 0) \|\|
(IsOffsetKnown && Offset.uge(AllocSize)))		(IsOffsetKnown && Offset.uge(AllocSize)))
// Zero-length mem transfer intrinsics can be ignored entirely.		// Zero-length mem transfer intrinsics can be ignored entirely.
return markAsDead(II);		return markAsDead(II);

if (!IsOffsetKnown)		if (!IsOffsetKnown)
return PI.setAborted(&II);		return PI.setAborted(&II);

		// Don't replace this with a store with a different address space. TODO:
		// Use a store with the casted new alloca?
		if (II.isVolatile() && II.getDestAddressSpace() != DL.getAllocaAddrSpace())
		return PI.setAborted(&II);

insertUse(II, Offset, Length ? Length->getLimitedValue()		insertUse(II, Offset, Length ? Length->getLimitedValue()
: AllocSize - Offset.getLimitedValue(),		: AllocSize - Offset.getLimitedValue(),
(bool)Length);		(bool)Length);
}		}

void visitMemTransferInst(MemTransferInst &II) {		void visitMemTransferInst(MemTransferInst &II) {
ConstantInt *Length = dyn_cast<ConstantInt>(II.getLength());		ConstantInt *Length = dyn_cast<ConstantInt>(II.getLength());
if (Length && Length->getValue() == 0)		if (Length && Length->getValue() == 0)
// Zero-length mem transfer intrinsics can be ignored entirely.		// Zero-length mem transfer intrinsics can be ignored entirely.
return markAsDead(II);		return markAsDead(II);

// Because we can visit these intrinsics twice, also check to see if the		// Because we can visit these intrinsics twice, also check to see if the
// first time marked this instruction as dead. If so, skip it.		// first time marked this instruction as dead. If so, skip it.
if (VisitedDeadInsts.count(&II))		if (VisitedDeadInsts.count(&II))
return;		return;

if (!IsOffsetKnown)		if (!IsOffsetKnown)
return PI.setAborted(&II);		return PI.setAborted(&II);

		// Don't replace this with a load/store with a different address space.
		// TODO: Use a store with the casted new alloca?
		if (II.isVolatile() &&
		(II.getDestAddressSpace() != DL.getAllocaAddrSpace() \|\|
		II.getSourceAddressSpace() != DL.getAllocaAddrSpace()))
		return PI.setAborted(&II);

// This side of the transfer is completely out-of-bounds, and so we can		// This side of the transfer is completely out-of-bounds, and so we can
// nuke the entire transfer. However, we also need to nuke the other side		// nuke the entire transfer. However, we also need to nuke the other side
// if already added to our partitions.		// if already added to our partitions.
// FIXME: Yet another place we really should bypass this when		// FIXME: Yet another place we really should bypass this when
// instrumenting for ASan.		// instrumenting for ASan.
if (Offset.uge(AllocSize)) {		if (Offset.uge(AllocSize)) {
SmallDenseMap<Instruction *, unsigned>::iterator MTPI =		SmallDenseMap<Instruction *, unsigned>::iterator MTPI =
MemTransferSliceMap.find(&II);		MemTransferSliceMap.find(&II);
▲ Show 20 Lines • Show All 91 Lines • ▼ Show 20 Lines	do {
Size = std::max(Size, DL.getTypeStoreSize(Op->getType()));		Size = std::max(Size, DL.getTypeStoreSize(Op->getType()));
continue;		continue;
}		}

if (GetElementPtrInst *GEP = dyn_cast<GetElementPtrInst>(I)) {		if (GetElementPtrInst *GEP = dyn_cast<GetElementPtrInst>(I)) {
if (!GEP->hasAllZeroIndices())		if (!GEP->hasAllZeroIndices())
return GEP;		return GEP;
} else if (!isa<BitCastInst>(I) && !isa<PHINode>(I) &&		} else if (!isa<BitCastInst>(I) && !isa<PHINode>(I) &&
!isa<SelectInst>(I)) {		!isa<SelectInst>(I) && !isa<AddrSpaceCastInst>(I)) {
return I;		return I;
}		}

for (User *U : I->users())		for (User *U : I->users())
if (Visited.insert(cast<Instruction>(U)).second)		if (Visited.insert(cast<Instruction>(U)).second)
Uses.push_back(std::make_pair(I, cast<Instruction>(U)));		Uses.push_back(std::make_pair(I, cast<Instruction>(U)));
} while (!Uses.empty());		} while (!Uses.empty());

▲ Show 20 Lines • Show All 595 Lines • ▼ Show 20 Lines	static Value getAdjustedPtr(IRBuilderTy &IRB, const DataLayout &DL, Value Ptr,
Value *OffsetPtr = nullptr;		Value *OffsetPtr = nullptr;
Value *OffsetBasePtr;		Value *OffsetBasePtr;

// Remember any i8 pointer we come across to re-use if we need to do a raw		// Remember any i8 pointer we come across to re-use if we need to do a raw
// byte offset.		// byte offset.
Value *Int8Ptr = nullptr;		Value *Int8Ptr = nullptr;
APInt Int8PtrOffset(Offset.getBitWidth(), 0);		APInt Int8PtrOffset(Offset.getBitWidth(), 0);

Type *TargetTy = PointerTy->getPointerElementType();		PointerType *TargetPtrTy = cast<PointerType>(PointerTy);
		Type *TargetTy = TargetPtrTy->getElementType();

do {		do {
// First fold any existing GEPs into the offset.		// First fold any existing GEPs into the offset.
while (GEPOperator *GEP = dyn_cast<GEPOperator>(Ptr)) {		while (GEPOperator *GEP = dyn_cast<GEPOperator>(Ptr)) {
APInt GEPOffset(Offset.getBitWidth(), 0);		APInt GEPOffset(Offset.getBitWidth(), 0);
if (!GEP->accumulateConstantOffset(DL, GEPOffset))		if (!GEP->accumulateConstantOffset(DL, GEPOffset))
break;		break;
Offset += GEPOffset;		Offset += GEPOffset;
Show All 24 Lines	do {
// Stash this pointer if we've found an i8*.		// Stash this pointer if we've found an i8*.
if (Ptr->getType()->isIntegerTy(8)) {		if (Ptr->getType()->isIntegerTy(8)) {
Int8Ptr = Ptr;		Int8Ptr = Ptr;
Int8PtrOffset = Offset;		Int8PtrOffset = Offset;
}		}

// Peel off a layer of the pointer and update the offset appropriately.		// Peel off a layer of the pointer and update the offset appropriately.
if (Operator::getOpcode(Ptr) == Instruction::BitCast) {		if (Operator::getOpcode(Ptr) == Instruction::BitCast) {
Ptr = cast<Operator>(Ptr)->getOperand(0);		Ptr = cast<Operator>(Ptr)->getOperand(0);
		sanjoyUnsubmitted Not Done Reply Inline Actions Are you ruling that `GEP(CAST(X), 1)` is the same as `CAST(GEP X, 1)`? If so, I'm not sure this is correct given your constraint on address space casts. For instance, if casting from address space N to M, with both spaces having the same pointer width, involves flipping the high and low halves of the pointer then `GEP(CAST(X), 1)` is not the same as `(CAST(GEP X, 1))`. Of course, this means that GEPs over pointers of address space M are different operations from GEPs over pointers of address space N, but that's allowed, AFAIK. sanjoy: Are you ruling that `GEP(CAST(X), 1)` is the same as `CAST(GEP X, 1)`? If so, I'm not sure…
		arsenmAuthorUnsubmitted Not Done Reply Inline Actions Yes, those should be the same arsenm: Yes, those should be the same
		sanjoyUnsubmitted Not Done Reply Inline Actions Then you need to change the langref to disallow address space casts as the above (flipping the high and low halfs of the pointer). sanjoy: Then you need to change the langref to disallow address space casts as the above (flipping the…
		arsenmAuthorUnsubmitted Done Reply Inline Actions Does this only matter because of the newly introduced addrspacecast? This may change the pointer value, but we only care about number of bytes indexed off of the original object. Changing the representation in the middle shouldn't change the total number of bytes addressed from the original object? arsenm: Does this only matter because of the newly introduced addrspacecast? This may change the…
} else if (GlobalAlias *GA = dyn_cast<GlobalAlias>(Ptr)) {		} else if (GlobalAlias *GA = dyn_cast<GlobalAlias>(Ptr)) {
if (GA->isInterposable())		if (GA->isInterposable())
break;		break;
Ptr = GA->getAliasee();		Ptr = GA->getAliasee();
} else {		} else {
break;		break;
}		}
assert(Ptr->getType()->isPointerTy() && "Unexpected operand type!");		assert(Ptr->getType()->isPointerTy() && "Unexpected operand type!");
Show All 11 Lines	OffsetPtr = Int8PtrOffset == 0
? Int8Ptr		? Int8Ptr
: IRB.CreateInBoundsGEP(IRB.getInt8Ty(), Int8Ptr,		: IRB.CreateInBoundsGEP(IRB.getInt8Ty(), Int8Ptr,
IRB.getInt(Int8PtrOffset),		IRB.getInt(Int8PtrOffset),
NamePrefix + "sroa_raw_idx");		NamePrefix + "sroa_raw_idx");
}		}
Ptr = OffsetPtr;		Ptr = OffsetPtr;

// On the off chance we were targeting i8*, guard the bitcast here.		// On the off chance we were targeting i8*, guard the bitcast here.
if (Ptr->getType() != PointerTy)		if (cast<PointerType>(Ptr->getType()) != TargetPtrTy) {
Ptr = IRB.CreateBitCast(Ptr, PointerTy, NamePrefix + "sroa_cast");		Ptr = IRB.CreatePointerBitCastOrAddrSpaceCast(Ptr,
		TargetPtrTy,
		efriedmaUnsubmitted Done Reply Inline Actions Why do we want to generate an address-space cast here, as opposed to performing the memory operation using the alloca's natural address-space? efriedma: Why do we want to generate an address-space cast here, as opposed to performing the memory…
		arsenmAuthorUnsubmitted Done Reply Inline Actions You're right, this is unnecessary arsenm: You're right, this is unnecessary
		arsenmAuthorUnsubmitted Done Reply Inline Actions This is actually necessary in some cases. (e.g. in select_addrspacecast_const_op, without this, the two operands of the select end up as different types) arsenm: This is actually necessary in some cases. (e.g. in select_addrspacecast_const_op, without this…
		NamePrefix + "sroa_cast");
		}

return Ptr;		return Ptr;
}		}

/// Compute the adjusted alignment for a load or store from an offset.		/// Compute the adjusted alignment for a load or store from an offset.
static unsigned getAdjustedAlignment(Instruction *I, uint64_t Offset,		static unsigned getAdjustedAlignment(Instruction *I, uint64_t Offset,
const DataLayout &DL) {		const DataLayout &DL) {
unsigned Alignment;		unsigned Alignment;
[1,434 lines not shown]
do {		do {
if (!StoreAlign) {		if (!StoreAlign) {
Value *Op = SI->getOperand(0);		Value *Op = SI->getOperand(0);
StoreAlign = DL.getABITypeAlignment(Op->getType());		StoreAlign = DL.getABITypeAlignment(Op->getType());
}		}
SI->setAlignment(std::min(StoreAlign, getSliceAlign()));		SI->setAlignment(std::min(StoreAlign, getSliceAlign()));
continue;		continue;
}		}

assert(isa<BitCastInst>(I) || isa<PHINode>(I) ||		assert(isa<BitCastInst>(I) || isa<AddrSpaceCastInst>(I) ||
isa<SelectInst>(I) || isa<GetElementPtrInst>(I));		isa<PHINode>(I) || isa<SelectInst>(I) ||
		isa<GetElementPtrInst>(I));
for (User *U : I->users())		for (User *U : I->users())
if (Visited.insert(cast<Instruction>(U)).second)		if (Visited.insert(cast<Instruction>(U)).second)
Uses.push_back(cast<Instruction>(U));		Uses.push_back(cast<Instruction>(U));
} while (!Uses.empty());		} while (!Uses.empty());
}		}
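
The loop above is a worklist walk over transitive users: loads and stores are handled directly, while pointer-forwarding instructions are looked through. The following standalone sketch shows that pattern with a toy `Node` graph in place of LLVM's instruction classes. All names in it (`Kind`, `Node`, `isPassThrough`, `countAccesses`, `demoAccesses`) are hypothetical, and the `AllowAddrSpaceCast` flag exists only to contrast the old and new behavior. Where the real code asserts on an unexpected instruction, the sketch just stops walking.

```cpp
#include <cassert>
#include <set>
#include <vector>

enum class Kind { Alloca, BitCast, AddrSpaceCast, GEP, Load, Store };

// Toy instruction: a kind plus the nodes that use its result.
struct Node {
  Kind K;
  std::vector<Node *> Users;
};

// Pointer-forwarding kinds whose users must be walked as well.
inline bool isPassThrough(Kind K, bool AllowAddrSpaceCast) {
  return K == Kind::BitCast || K == Kind::GEP ||
         (AllowAddrSpaceCast && K == Kind::AddrSpaceCast);
}

// Counts loads/stores reachable from Root through pass-through nodes,
// visiting each node at most once (the Visited/Uses pattern in the diff).
inline unsigned countAccesses(Node &Root, bool AllowAddrSpaceCast) {
  std::set<Node *> Visited;
  std::vector<Node *> Uses;
  unsigned Accesses = 0;
  for (Node *U : Root.Users)
    if (Visited.insert(U).second)
      Uses.push_back(U);
  while (!Uses.empty()) {
    Node *N = Uses.back();
    Uses.pop_back();
    if (N->K == Kind::Load || N->K == Kind::Store) {
      ++Accesses;
      continue;
    }
    if (!isPassThrough(N->K, AllowAddrSpaceCast))
      continue; // Simplification: the real code asserts here instead.
    for (Node *U : N->Users)
      if (Visited.insert(U).second)
        Uses.push_back(U);
  }
  return Accesses;
}

// alloca -> addrspacecast -> bitcast -> {store, load}, the shape of
// @alloca_addrspacecast_bitcast in the new test file.
inline unsigned demoAccesses(bool AllowAddrSpaceCast) {
  Node Store{Kind::Store, {}}, Load{Kind::Load, {}};
  Node BC{Kind::BitCast, {&Store, &Load}};
  Node ASC{Kind::AddrSpaceCast, {&BC}};
  Node A{Kind::Alloca, {&ASC}};
  return countAccesses(A, AllowAddrSpaceCast);
}
```

With the cast in the look-through set, `demoAccesses(true)` reaches both memory accesses. Without it, the walk never gets past the cast.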

bool visitPHINode(PHINode &PN) {		bool visitPHINode(PHINode &PN) {
LLVM_DEBUG(dbgs() << " original: " << PN << "\n");		LLVM_DEBUG(dbgs() << " original: " << PN << "\n");
[284 lines not shown]
bool visitStoreInst(StoreInst &SI) {		bool visitStoreInst(StoreInst &SI) {
return true;		return true;
}		}

bool visitBitCastInst(BitCastInst &BC) {		bool visitBitCastInst(BitCastInst &BC) {
enqueueUsers(BC);		enqueueUsers(BC);
return false;		return false;
}		}

		bool visitAddrSpaceCastInst(AddrSpaceCastInst &ASC) {
		enqueueUsers(ASC);
		return false;
		}

bool visitGetElementPtrInst(GetElementPtrInst &GEPI) {		bool visitGetElementPtrInst(GetElementPtrInst &GEPI) {
enqueueUsers(GEPI);		enqueueUsers(GEPI);
return false;		return false;
}		}

bool visitPHINode(PHINode &PN) {		bool visitPHINode(PHINode &PN) {
enqueueUsers(PN);		enqueueUsers(PN);
return false;		return false;
[1,207 lines not shown]

test/Transforms/SROA/addrspacecast.ll

; NOTE: Assertions have been autogenerated by utils/update_test_checks.py		; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
; RUN: opt < %s -sroa -S | FileCheck %s		; RUN: opt < %s -sroa -S | FileCheck %s
; RUN: opt < %s -passes=sroa -S | FileCheck %s		; RUN: opt < %s -passes=sroa -S | FileCheck %s

target datalayout = "e-p:64:64:64-p1:16:16:16-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-n8:16:32:64"		target datalayout = "e-p:64:64:64-p1:16:16:16-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-n8:16:32:64"

declare void @llvm.memcpy.p0i8.p1i8.i32(i8* nocapture writeonly, i8 addrspace(1)* nocapture readonly, i32, i1 immarg) #0		declare void @llvm.memcpy.p0i8.p1i8.i32(i8* nocapture writeonly, i8 addrspace(1)* nocapture readonly, i32, i1 immarg) #0
declare void @llvm.memcpy.p1i8.p0i8.i32(i8 addrspace(1)* nocapture writeonly, i8* nocapture readonly, i32, i1 immarg) #0		declare void @llvm.memcpy.p1i8.p0i8.i32(i8 addrspace(1)* nocapture writeonly, i8* nocapture readonly, i32, i1 immarg) #0

define i64 @alloca_addrspacecast_bitcast(i64 %X) {		define i64 @alloca_addrspacecast_bitcast(i64 %X) {
; CHECK-LABEL: @alloca_addrspacecast_bitcast(		; CHECK-LABEL: @alloca_addrspacecast_bitcast(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
; CHECK-NEXT: [[A:%.*]] = alloca [8 x i8]		; CHECK-NEXT: ret i64 [[X:%.*]]
; CHECK-NEXT: [[A_CAST:%.*]] = addrspacecast [8 x i8]* [[A]] to [8 x i8] addrspace(1)*
; CHECK-NEXT: [[B:%.*]] = bitcast [8 x i8] addrspace(1)* [[A_CAST]] to i64 addrspace(1)*
; CHECK-NEXT: store i64 [[X:%.*]], i64 addrspace(1)* [[B]]
; CHECK-NEXT: [[Z:%.*]] = load i64, i64 addrspace(1)* [[B]]
; CHECK-NEXT: ret i64 [[Z]]
;		;
entry:		entry:
%A = alloca [8 x i8]		%A = alloca [8 x i8]
%A.cast = addrspacecast [8 x i8]* %A to [8 x i8] addrspace(1)*		%A.cast = addrspacecast [8 x i8]* %A to [8 x i8] addrspace(1)*
%B = bitcast [8 x i8] addrspace(1)* %A.cast to i64 addrspace(1)*		%B = bitcast [8 x i8] addrspace(1)* %A.cast to i64 addrspace(1)*
store i64 %X, i64 addrspace(1)* %B		store i64 %X, i64 addrspace(1)* %B
%Z = load i64, i64 addrspace(1)* %B		%Z = load i64, i64 addrspace(1)* %B
ret i64 %Z		ret i64 %Z
}		}

define i64 @alloca_bitcast_addrspacecast(i64 %X) {		define i64 @alloca_bitcast_addrspacecast(i64 %X) {
; CHECK-LABEL: @alloca_bitcast_addrspacecast(		; CHECK-LABEL: @alloca_bitcast_addrspacecast(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
; CHECK-NEXT: [[A:%.*]] = alloca [8 x i8]		; CHECK-NEXT: ret i64 [[X:%.*]]
; CHECK-NEXT: [[A_CAST:%.*]] = bitcast [8 x i8]* [[A]] to i64*
; CHECK-NEXT: [[B:%.*]] = addrspacecast i64* [[A_CAST]] to i64 addrspace(1)*
; CHECK-NEXT: store i64 [[X:%.*]], i64 addrspace(1)* [[B]]
; CHECK-NEXT: [[Z:%.*]] = load i64, i64 addrspace(1)* [[B]]
; CHECK-NEXT: ret i64 [[Z]]
;		;
entry:		entry:
%A = alloca [8 x i8]		%A = alloca [8 x i8]
%A.cast = bitcast [8 x i8]* %A to i64*		%A.cast = bitcast [8 x i8]* %A to i64*
%B = addrspacecast i64* %A.cast to i64 addrspace(1)*		%B = addrspacecast i64* %A.cast to i64 addrspace(1)*
store i64 %X, i64 addrspace(1)* %B		store i64 %X, i64 addrspace(1)* %B
%Z = load i64, i64 addrspace(1)* %B		%Z = load i64, i64 addrspace(1)* %B
ret i64 %Z		ret i64 %Z
}		}

define i64 @alloca_addrspacecast_gep(i64 %X) {		define i64 @alloca_addrspacecast_gep(i64 %X) {
; CHECK-LABEL: @alloca_addrspacecast_gep(		; CHECK-LABEL: @alloca_addrspacecast_gep(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
; CHECK-NEXT: [[A_AS0:%.*]] = alloca [256 x i8], align 4		; CHECK-NEXT: ret i64 [[X:%.*]]
; CHECK-NEXT: [[GEPA_AS0:%.*]] = getelementptr [256 x i8], [256 x i8]* [[A_AS0]], i16 0, i16 32
; CHECK-NEXT: [[GEPA_AS0_BC:%.*]] = bitcast i8* [[GEPA_AS0]] to i64*
; CHECK-NEXT: store i64 [[X:%.*]], i64* [[GEPA_AS0_BC]], align 4
; CHECK-NEXT: [[A_AS1:%.*]] = addrspacecast [256 x i8]* [[A_AS0]] to [256 x i8] addrspace(1)*
; CHECK-NEXT: [[GEPA_AS1:%.*]] = getelementptr [256 x i8], [256 x i8] addrspace(1)* [[A_AS1]], i16 0, i16 32
; CHECK-NEXT: [[GEPA_AS1_BC:%.*]] = bitcast i8 addrspace(1)* [[GEPA_AS1]] to i64 addrspace(1)*
; CHECK-NEXT: [[Z:%.*]] = load i64, i64 addrspace(1)* [[GEPA_AS1_BC]], align 4
; CHECK-NEXT: ret i64 [[Z]]
;		;
entry:		entry:
%A.as0 = alloca [256 x i8], align 4		%A.as0 = alloca [256 x i8], align 4

%gepA.as0 = getelementptr [256 x i8], [256 x i8]* %A.as0, i16 0, i16 32		%gepA.as0 = getelementptr [256 x i8], [256 x i8]* %A.as0, i16 0, i16 32
%gepA.as0.bc = bitcast i8* %gepA.as0 to i64*		%gepA.as0.bc = bitcast i8* %gepA.as0 to i64*
store i64 %X, i64* %gepA.as0.bc, align 4		store i64 %X, i64* %gepA.as0.bc, align 4

%A.as1 = addrspacecast [256 x i8]* %A.as0 to [256 x i8] addrspace(1)*		%A.as1 = addrspacecast [256 x i8]* %A.as0 to [256 x i8] addrspace(1)*
%gepA.as1 = getelementptr [256 x i8], [256 x i8] addrspace(1)* %A.as1, i16 0, i16 32		%gepA.as1 = getelementptr [256 x i8], [256 x i8] addrspace(1)* %A.as1, i16 0, i16 32
%gepA.as1.bc = bitcast i8 addrspace(1)* %gepA.as1 to i64 addrspace(1)*		%gepA.as1.bc = bitcast i8 addrspace(1)* %gepA.as1 to i64 addrspace(1)*
%Z = load i64, i64 addrspace(1)* %gepA.as1.bc, align 4		%Z = load i64, i64 addrspace(1)* %gepA.as1.bc, align 4

ret i64 %Z		ret i64 %Z
}		}

define i64 @alloca_gep_addrspacecast(i64 %X) {		define i64 @alloca_gep_addrspacecast(i64 %X) {
; CHECK-LABEL: @alloca_gep_addrspacecast(		; CHECK-LABEL: @alloca_gep_addrspacecast(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
; CHECK-NEXT: [[A_AS0:%.*]] = alloca [256 x i8], align 4		; CHECK-NEXT: ret i64 [[X:%.*]]
; CHECK-NEXT: [[GEPA_AS0:%.*]] = getelementptr [256 x i8], [256 x i8]* [[A_AS0]], i16 0, i16 32
; CHECK-NEXT: [[GEPA_AS0_BC:%.*]] = bitcast i8* [[GEPA_AS0]] to i64*
; CHECK-NEXT: store i64 [[X:%.*]], i64* [[GEPA_AS0_BC]], align 4
; CHECK-NEXT: [[GEPA_AS1_BC:%.*]] = addrspacecast i64* [[GEPA_AS0_BC]] to i64 addrspace(1)*
; CHECK-NEXT: [[Z:%.*]] = load i64, i64 addrspace(1)* [[GEPA_AS1_BC]], align 4
; CHECK-NEXT: ret i64 [[Z]]
;		;
entry:		entry:
%A.as0 = alloca [256 x i8], align 4		%A.as0 = alloca [256 x i8], align 4

%gepA.as0 = getelementptr [256 x i8], [256 x i8]* %A.as0, i16 0, i16 32		%gepA.as0 = getelementptr [256 x i8], [256 x i8]* %A.as0, i16 0, i16 32
%gepA.as0.bc = bitcast i8* %gepA.as0 to i64*		%gepA.as0.bc = bitcast i8* %gepA.as0 to i64*
store i64 %X, i64* %gepA.as0.bc, align 4		store i64 %X, i64* %gepA.as0.bc, align 4

%gepA.as1.bc = addrspacecast i64* %gepA.as0.bc to i64 addrspace(1)*		%gepA.as1.bc = addrspacecast i64* %gepA.as0.bc to i64 addrspace(1)*
%Z = load i64, i64 addrspace(1)* %gepA.as1.bc, align 4		%Z = load i64, i64 addrspace(1)* %gepA.as1.bc, align 4
ret i64 %Z		ret i64 %Z
}		}

define i64 @alloca_gep_addrspacecast_gep(i64 %X) {		define i64 @alloca_gep_addrspacecast_gep(i64 %X) {
; CHECK-LABEL: @alloca_gep_addrspacecast_gep(		; CHECK-LABEL: @alloca_gep_addrspacecast_gep(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
; CHECK-NEXT: [[A_AS0:%.*]] = alloca [256 x i8], align 4		; CHECK-NEXT: ret i64 [[X:%.*]]
; CHECK-NEXT: [[GEPA_AS0:%.*]] = getelementptr [256 x i8], [256 x i8]* [[A_AS0]], i16 0, i16 32
; CHECK-NEXT: [[GEPA_AS0_BC:%.*]] = bitcast i8* [[GEPA_AS0]] to i64*
; CHECK-NEXT: store i64 [[X:%.*]], i64* [[GEPA_AS0_BC]], align 4
; CHECK-NEXT: [[GEPB_AS0:%.*]] = getelementptr [256 x i8], [256 x i8]* [[A_AS0]], i16 0, i16 16
; CHECK-NEXT: [[GEPB_AS1:%.*]] = addrspacecast i8* [[GEPB_AS0]] to i8 addrspace(1)*
; CHECK-NEXT: [[GEPC_AS1:%.*]] = getelementptr i8, i8 addrspace(1)* [[GEPB_AS1]], i16 16
; CHECK-NEXT: [[GEPC_AS1_BC:%.*]] = bitcast i8 addrspace(1)* [[GEPC_AS1]] to i64 addrspace(1)*
; CHECK-NEXT: [[Z:%.*]] = load i64, i64 addrspace(1)* [[GEPC_AS1_BC]], align 4
; CHECK-NEXT: ret i64 [[Z]]
;		;
entry:		entry:
%A.as0 = alloca [256 x i8], align 4		%A.as0 = alloca [256 x i8], align 4

%gepA.as0 = getelementptr [256 x i8], [256 x i8]* %A.as0, i16 0, i16 32		%gepA.as0 = getelementptr [256 x i8], [256 x i8]* %A.as0, i16 0, i16 32
%gepA.as0.bc = bitcast i8* %gepA.as0 to i64*		%gepA.as0.bc = bitcast i8* %gepA.as0 to i64*
store i64 %X, i64* %gepA.as0.bc, align 4		store i64 %X, i64* %gepA.as0.bc, align 4

[162 lines not shown]
entry:		entry:
%asc = addrspacecast i8* %ptr to i8 addrspace(1)*		%asc = addrspacecast i8* %ptr to i8 addrspace(1)*
call void @llvm.memcpy.p1i8.p0i8.i32(i8 addrspace(1)* %asc, i8* %src, i32 4, i1 true), !tbaa !0		call void @llvm.memcpy.p1i8.p0i8.i32(i8 addrspace(1)* %asc, i8* %src, i32 4, i1 true), !tbaa !0
call void @llvm.memcpy.p0i8.p1i8.i32(i8* %dst, i8 addrspace(1)* %asc, i32 4, i1 true), !tbaa !3		call void @llvm.memcpy.p0i8.p1i8.i32(i8* %dst, i8 addrspace(1)* %asc, i32 4, i1 true), !tbaa !3
ret void		ret void
}		}

define void @select_addrspacecast(i1 %a, i1 %b) {		define void @select_addrspacecast(i1 %a, i1 %b) {
; CHECK-LABEL: @select_addrspacecast(		; CHECK-LABEL: @select_addrspacecast(
; CHECK-NEXT: [[C:%.*]] = alloca i64, align 8
; CHECK-NEXT: [[P_0_C:%.*]] = select i1 undef, i64* [[C]], i64* [[C]]
; CHECK-NEXT: [[ASC:%.*]] = addrspacecast i64* [[P_0_C]] to i64 addrspace(1)*
; CHECK-NEXT: [[COND_IN:%.*]] = select i1 undef, i64 addrspace(1)* [[ASC]], i64 addrspace(1)* [[ASC]]
; CHECK-NEXT: [[COND:%.*]] = load i64, i64 addrspace(1)* [[COND_IN]], align 8
; CHECK-NEXT: ret void		; CHECK-NEXT: ret void
;		;
%c = alloca i64, align 8		%c = alloca i64, align 8
%p.0.c = select i1 undef, i64* %c, i64* %c		%p.0.c = select i1 undef, i64* %c, i64* %c
%asc = addrspacecast i64* %p.0.c to i64 addrspace(1)*		%asc = addrspacecast i64* %p.0.c to i64 addrspace(1)*

%cond.in = select i1 undef, i64 addrspace(1)* %asc, i64 addrspace(1)* %asc		%cond.in = select i1 undef, i64 addrspace(1)* %asc, i64 addrspace(1)* %asc
%cond = load i64, i64 addrspace(1)* %cond.in, align 8		%cond = load i64, i64 addrspace(1)* %cond.in, align 8
ret void		ret void
}		}

define void @select_addrspacecast_const_op(i1 %a, i1 %b) {		define void @select_addrspacecast_const_op(i1 %a, i1 %b) {
; CHECK-LABEL: @select_addrspacecast_const_op(		; CHECK-LABEL: @select_addrspacecast_const_op(
; CHECK-NEXT: [[C:%.*]] = alloca i64, align 8		; CHECK-NEXT: [[C:%.*]] = alloca i64, align 8
; CHECK-NEXT: [[P_0_C:%.*]] = select i1 undef, i64* [[C]], i64* [[C]]		; CHECK-NEXT: [[C_0_ASC_SROA_CAST:%.*]] = addrspacecast i64* [[C]] to i64 addrspace(1)*
; CHECK-NEXT: [[ASC:%.*]] = addrspacecast i64* [[P_0_C]] to i64 addrspace(1)*		; CHECK-NEXT: [[COND_IN:%.*]] = select i1 undef, i64 addrspace(1)* [[C_0_ASC_SROA_CAST]], i64 addrspace(1)* null
; CHECK-NEXT: [[COND_IN:%.*]] = select i1 undef, i64 addrspace(1)* [[ASC]], i64 addrspace(1)* null
; CHECK-NEXT: [[COND:%.*]] = load i64, i64 addrspace(1)* [[COND_IN]], align 8		; CHECK-NEXT: [[COND:%.*]] = load i64, i64 addrspace(1)* [[COND_IN]], align 8
; CHECK-NEXT: ret void		; CHECK-NEXT: ret void
;		;
%c = alloca i64, align 8		%c = alloca i64, align 8
%p.0.c = select i1 undef, i64* %c, i64* %c		%p.0.c = select i1 undef, i64* %c, i64* %c
%asc = addrspacecast i64* %p.0.c to i64 addrspace(1)*		%asc = addrspacecast i64* %p.0.c to i64 addrspace(1)*

%cond.in = select i1 undef, i64 addrspace(1)* %asc, i64 addrspace(1)* null		%cond.in = select i1 undef, i64 addrspace(1)* %asc, i64 addrspace(1)* null
%cond = load i64, i64 addrspace(1)* %cond.in, align 8		%cond = load i64, i64 addrspace(1)* %cond.in, align 8
ret void		ret void
}		}

@gv = external addrspace(1) global i64		@gv = external addrspace(1) global i64

define void @select_addrspacecast_gv(i1 %a, i1 %b) {		define void @select_addrspacecast_gv(i1 %a, i1 %b) {
; CHECK-LABEL: @select_addrspacecast_gv(		; CHECK-LABEL: @select_addrspacecast_gv(
; CHECK-NEXT: [[C:%.*]] = alloca i64, align 8		; CHECK-NEXT: [[COND_SROA_SPECULATE_LOAD_FALSE:%.*]] = load i64, i64 addrspace(1)* @gv, align 8
; CHECK-NEXT: [[P_0_C:%.*]] = select i1 undef, i64* [[C]], i64* [[C]]		; CHECK-NEXT: [[COND_SROA_SPECULATED:%.*]] = select i1 undef, i64 undef, i64 [[COND_SROA_SPECULATE_LOAD_FALSE]]
; CHECK-NEXT: [[ASC:%.*]] = addrspacecast i64* [[P_0_C]] to i64 addrspace(1)*
; CHECK-NEXT: [[COND_IN:%.*]] = select i1 undef, i64 addrspace(1)* [[ASC]], i64 addrspace(1)* @gv
; CHECK-NEXT: [[COND:%.*]] = load i64, i64 addrspace(1)* [[COND_IN]], align 8
; CHECK-NEXT: ret void		; CHECK-NEXT: ret void
;		;
%c = alloca i64, align 8		%c = alloca i64, align 8
%p.0.c = select i1 undef, i64* %c, i64* %c		%p.0.c = select i1 undef, i64* %c, i64* %c
%asc = addrspacecast i64* %p.0.c to i64 addrspace(1)*		%asc = addrspacecast i64* %p.0.c to i64 addrspace(1)*

%cond.in = select i1 undef, i64 addrspace(1)* %asc, i64 addrspace(1)* @gv		%cond.in = select i1 undef, i64 addrspace(1)* %asc, i64 addrspace(1)* @gv
%cond = load i64, i64 addrspace(1)* %cond.in, align 8		%cond = load i64, i64 addrspace(1)* %cond.in, align 8
ret void		ret void
}		}

!0 = !{!1, !1, i64 0, i64 1}		!0 = !{!1, !1, i64 0, i64 1}
!1 = !{!2, i64 1, !"type_0"}		!1 = !{!2, i64 1, !"type_0"}
!2 = !{!"root"}		!2 = !{!"root"}
!3 = !{!4, !4, i64 0, i64 1}		!3 = !{!4, !4, i64 0, i64 1}
!4 = !{!2, i64 1, !"type_3"}		!4 = !{!2, i64 1, !"type_3"}

test/Transforms/SROA/basictest.ll

[59 lines not shown]
entry:		entry:
store i64 %X, i64* %B		store i64 %X, i64* %B
br label %L2		br label %L2

L2:		L2:
%Z = load i64, i64* %B		%Z = load i64, i64* %B
ret i64 %Z		ret i64 %Z
}		}

		define i64 @test2_addrspacecast(i64 %X) {
		; CHECK-LABEL: @test2_addrspacecast(
		; CHECK-NOT: alloca
		; CHECK: ret i64 %X

		entry:
		%A = alloca [8 x i8]
		%B = addrspacecast [8 x i8]* %A to i64 addrspace(1)*
		store i64 %X, i64 addrspace(1)* %B
		br label %L2

		L2:
		%Z = load i64, i64 addrspace(1)* %B
		ret i64 %Z
		}

		define i64 @test2_addrspacecast_gep(i64 %X, i16 %idx) {
		; CHECK-LABEL: @test2_addrspacecast_gep(
		; CHECK-NOT: alloca
		; CHECK: ret i64 %X

		entry:
		%A = alloca [256 x i8]
		%B = addrspacecast [256 x i8]* %A to i64 addrspace(1)*
		%gepA = getelementptr [256 x i8], [256 x i8]* %A, i16 0, i16 32
		%gepB = getelementptr i64, i64 addrspace(1)* %B, i16 4
		store i64 %X, i64 addrspace(1)* %gepB, align 1
		br label %L2

		L2:
		%gepA.bc = bitcast i8* %gepA to i64*
		%Z = load i64, i64* %gepA.bc, align 1
		ret i64 %Z
		}

		; Avoid crashing when loading/storing at different offsets.
		define i64 @test2_addrspacecast_gep_offset(i64 %X) {
		; CHECK-LABEL: @test2_addrspacecast_gep_offset(
		; CHECK: %A.sroa.0 = alloca [10 x i8]
		; CHECK: %A.sroa.0.2.gepB.sroa_idx = getelementptr inbounds [10 x i8], [10 x i8]* %A.sroa.0, i16 0, i16 2
		; CHECK-NEXT: %A.sroa.0.2.gepB.sroa_cast = addrspacecast i8* %A.sroa.0.2.gepB.sroa_idx to i64 addrspace(1)*
		arichardson: Use FileCheck captures for the variables in case the naming changes in the future?
		; CHECK-NEXT: store i64 %X, i64 addrspace(1)* %A.sroa.0.2.gepB.sroa_cast, align 1
		; CHECK: br

		; CHECK: %A.sroa.0.0.gepA.bc.sroa_cast = bitcast [10 x i8]* %A.sroa.0 to i64*
		; CHECK: %A.sroa.0.0.A.sroa.0.30.Z = load i64, i64* %A.sroa.0.0.gepA.bc.sroa_cast, align 1
		; CHECK-NEXT: ret
		entry:
		%A = alloca [256 x i8]
		%B = addrspacecast [256 x i8]* %A to i64 addrspace(1)*
		%gepA = getelementptr [256 x i8], [256 x i8]* %A, i16 0, i16 30
		%gepB = getelementptr i64, i64 addrspace(1)* %B, i16 4
		store i64 %X, i64 addrspace(1)* %gepB, align 1
		br label %L2

		L2:
		%gepA.bc = bitcast i8* %gepA to i64*
		%Z = load i64, i64* %gepA.bc, align 1
		ret i64 %Z
		}

define void @test3(i8* %dst, i8* align 8 %src) {		define void @test3(i8* %dst, i8* align 8 %src) {
; CHECK-LABEL: @test3(		; CHECK-LABEL: @test3(

entry:		entry:
%a = alloca [300 x i8]		%a = alloca [300 x i8]
; CHECK-NOT: alloca		; CHECK-NOT: alloca
; CHECK: %[[test3_a1:.*]] = alloca [42 x i8]		; CHECK: %[[test3_a1:.*]] = alloca [42 x i8]
; CHECK-NEXT: %[[test3_a2:.*]] = alloca [99 x i8]		; CHECK-NEXT: %[[test3_a2:.*]] = alloca [99 x i8]
[345 lines not shown]
entry:		entry:
%fptr = bitcast [4 x i8]* %a to float*		%fptr = bitcast [4 x i8]* %a to float*
store float 0.0, float* %fptr		store float 0.0, float* %fptr
%ptr = getelementptr [4 x i8], [4 x i8]* %a, i32 0, i32 2		%ptr = getelementptr [4 x i8], [4 x i8]* %a, i32 0, i32 2
%iptr = bitcast i8* %ptr to i16*		%iptr = bitcast i8* %ptr to i16*
%val = load i16, i16* %iptr		%val = load i16, i16* %iptr
ret i16 %val		ret i16 %val
}		}

		define i16 @test5_multi_addrspace_access() {
		; CHECK-LABEL: @test5_multi_addrspace_access(
		; CHECK-NOT: alloca float
		; CHECK: %[[cast:.*]] = bitcast float 0.0{{.*}} to i32
		; CHECK-NEXT: %[[shr:.*]] = lshr i32 %[[cast]], 16
		; CHECK-NEXT: %[[trunc:.*]] = trunc i32 %[[shr]] to i16
		; CHECK-NEXT: ret i16 %[[trunc]]

		entry:
		%a = alloca [4 x i8]
		%fptr = bitcast [4 x i8]* %a to float*
		%fptr.as1 = addrspacecast float* %fptr to float addrspace(1)*
		store float 0.0, float addrspace(1)* %fptr.as1
		%ptr = getelementptr [4 x i8], [4 x i8]* %a, i32 0, i32 2
		%iptr = bitcast i8* %ptr to i16*
		%val = load i16, i16* %iptr
		ret i16 %val
		}

define i32 @test6() {		define i32 @test6() {
; CHECK-LABEL: @test6(		; CHECK-LABEL: @test6(
; CHECK: alloca i32		; CHECK: alloca i32
; CHECK-NEXT: store volatile i32		; CHECK-NEXT: store volatile i32
; CHECK-NEXT: load i32, i32*		; CHECK-NEXT: load i32, i32*
; CHECK-NEXT: ret i32		; CHECK-NEXT: ret i32

entry:		entry:
[383 lines not shown]
entry:		entry:
%cast1 = bitcast %opaque* %x to i8*		%cast1 = bitcast %opaque* %x to i8*
%cast2 = bitcast { i64, i8* }* %a to i8*		%cast2 = bitcast { i64, i8* }* %a to i8*
call void @llvm.memcpy.p0i8.p0i8.i32(i8* %cast2, i8* %cast1, i32 16, i1 false)		call void @llvm.memcpy.p0i8.p0i8.i32(i8* %cast2, i8* %cast1, i32 16, i1 false)
%gep = getelementptr inbounds { i64, i8* }, { i64, i8* }* %a, i32 0, i32 0		%gep = getelementptr inbounds { i64, i8* }, { i64, i8* }* %a, i32 0, i32 0
%val = load i64, i64* %gep		%val = load i64, i64* %gep
ret i32 undef		ret i32 undef
}		}

		declare void @llvm.memcpy.p0i8.p1i8.i32(i8* nocapture, i8 addrspace(1)* nocapture, i32, i32, i1) nounwind

		define i32 @test19_addrspacecast(%opaque* %x) {
		; This input will cause us to try to compute a natural GEP when rewriting
		; pointers in such a way that we try to GEP through the opaque type. Previously,
		; a check for an unsized type was missing and this crashed. Ensure it behaves
		; reasonably now.
		; CHECK-LABEL: @test19_addrspacecast(
		; CHECK-NOT: alloca
		; CHECK: ret i32 undef

		entry:
		%a = alloca { i64, i8* }
		%cast1 = addrspacecast %opaque* %x to i8 addrspace(1)*
		%cast2 = bitcast { i64, i8* }* %a to i8*
		call void @llvm.memcpy.p0i8.p1i8.i32(i8* %cast2, i8 addrspace(1)* %cast1, i32 16, i32 1, i1 false)
		%gep = getelementptr inbounds { i64, i8* }, { i64, i8* }* %a, i32 0, i32 0
		%val = load i64, i64* %gep
		ret i32 undef
		}

define i32 @test20() {		define i32 @test20() {
; Ensure we can track negative offsets (before the beginning of the alloca) and		; Ensure we can track negative offsets (before the beginning of the alloca) and
; negative relative offsets from offsets starting past the end of the alloca.		; negative relative offsets from offsets starting past the end of the alloca.
; CHECK-LABEL: @test20(		; CHECK-LABEL: @test20(
; CHECK-NOT: alloca		; CHECK-NOT: alloca
; CHECK: %[[sum1:.*]] = add i32 1, 2		; CHECK: %[[sum1:.*]] = add i32 1, 2
; CHECK: %[[sum2:.*]] = add i32 %[[sum1]], 3		; CHECK: %[[sum2:.*]] = add i32 %[[sum1]], 3
; CHECK: ret i32 %[[sum2]]		; CHECK: ret i32 %[[sum2]]
[313 lines not shown]
; CHECK-NEXT: getelementptr inbounds { [16 x i8] }, { [16 x i8] }* %ptr, i64 -1, i32 0, i64 0		; CHECK-NEXT: getelementptr inbounds { [16 x i8] }, { [16 x i8] }* %ptr, i64 -1, i32 0, i64 0
call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 %cast1, i8* align 8 %cast2, i32 16, i1 true)		call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 %cast1, i8* align 8 %cast2, i32 16, i1 true)
ret void		ret void
; CHECK: ret		; CHECK: ret
}		}

define void @PR14105_as1({ [16 x i8] } addrspace(1)* %ptr) {		define void @PR14105_as1({ [16 x i8] } addrspace(1)* %ptr) {
; Make sure the right address space pointer is used for type check.		; Make sure the right address space pointer is used for type check.
; CHECK-LABEL: @PR14105_as1(		; CHECK-LABEL: @PR14105_as1(
		; CHECK: alloca { [16 x i8] }, align 8
		; CHECK-NEXT: %gep = getelementptr inbounds { [16 x i8] }, { [16 x i8] } addrspace(1)* %ptr, i64 -1
		; CHECK-NEXT: %cast1 = bitcast { [16 x i8] } addrspace(1)* %gep to i8 addrspace(1)*
		; CHECK-NEXT: %cast2 = bitcast { [16 x i8] }* %a to i8*
		; CHECK-NEXT: call void @llvm.memcpy.p1i8.p0i8.i32(i8 addrspace(1)* align 8 %cast1, i8* align 8 %cast2, i32 16, i1 true)

entry:		entry:
%a = alloca { [16 x i8] }, align 8		%a = alloca { [16 x i8] }, align 8
; CHECK: alloca [16 x i8], align 8

%gep = getelementptr inbounds { [16 x i8] }, { [16 x i8] } addrspace(1)* %ptr, i64 -1		%gep = getelementptr inbounds { [16 x i8] }, { [16 x i8] } addrspace(1)* %ptr, i64 -1
; CHECK-NEXT: getelementptr inbounds { [16 x i8] }, { [16 x i8] } addrspace(1)* %ptr, i16 -1, i32 0, i16 0

%cast1 = bitcast { [16 x i8 ] } addrspace(1)* %gep to i8 addrspace(1)*		%cast1 = bitcast { [16 x i8 ] } addrspace(1)* %gep to i8 addrspace(1)*
%cast2 = bitcast { [16 x i8 ] }* %a to i8*		%cast2 = bitcast { [16 x i8 ] }* %a to i8*
call void @llvm.memcpy.p1i8.p0i8.i32(i8 addrspace(1)* align 8 %cast1, i8* align 8 %cast2, i32 16, i1 true)		call void @llvm.memcpy.p1i8.p0i8.i32(i8 addrspace(1)* align 8 %cast1, i8* align 8 %cast2, i32 16, i1 true)
ret void		ret void
; CHECK: ret		; CHECK: ret
}		}

define void @PR14465() {		define void @PR14465() {
[746 lines not shown]

test/Transforms/SROA/phi-and-select.ll

[47 lines not shown]
; CHECK-NOT: load		; CHECK-NOT: load
%cond = icmp sle i32 %v0, %v1		%cond = icmp sle i32 %v0, %v1
%select = select i1 %cond, i32* %a1, i32* %a0		%select = select i1 %cond, i32* %a1, i32* %a0
; CHECK: select i1 %{{.*}}, i32 1, i32 0		; CHECK: select i1 %{{.*}}, i32 1, i32 0

%result = load i32, i32* %select		%result = load i32, i32* %select
ret i32 %result		ret i32 %result
}		}

		; If bitcast isn't considered a safe phi/select use, the alloca
		; remains as an array.
		; FIXME: Why isn't this identical to test2?

		; CHECK-LABEL: @test2_bitcast(
		; CHECK: alloca i32
		; CHECK-NEXT: alloca i32

		; CHECK: %select = select i1 %cond, i32* %a.sroa.3, i32* %a.sroa.0
		; CHECK-NEXT: %select.bc = bitcast i32* %select to float*
		; CHECK-NEXT: %result = load float, float* %select.bc, align 4
		define float @test2_bitcast() {
		entry:
		%a = alloca [2 x i32]
		%a0 = getelementptr [2 x i32], [2 x i32]* %a, i64 0, i32 0
		%a1 = getelementptr [2 x i32], [2 x i32]* %a, i64 0, i32 1
		store i32 0, i32* %a0
		store i32 1, i32* %a1
		%v0 = load i32, i32* %a0
		%v1 = load i32, i32* %a1
		%cond = icmp sle i32 %v0, %v1
		%select = select i1 %cond, i32* %a1, i32* %a0
		%select.bc = bitcast i32* %select to float*
		%result = load float, float* %select.bc
		ret float %result
		}

		; CHECK-LABEL: @test2_addrspacecast(
		; CHECK: alloca i32
		; CHECK-NEXT: alloca i32

		; CHECK: %select = select i1 %cond, i32* %a.sroa.3, i32* %a.sroa.0
		; CHECK-NEXT: %select.asc = addrspacecast i32* %select to i32 addrspace(1)*
		; CHECK-NEXT: load i32, i32 addrspace(1)* %select.asc, align 4
		define i32 @test2_addrspacecast() {
		entry:
		%a = alloca [2 x i32]
		%a0 = getelementptr [2 x i32], [2 x i32]* %a, i64 0, i32 0
		%a1 = getelementptr [2 x i32], [2 x i32]* %a, i64 0, i32 1
		store i32 0, i32* %a0
		store i32 1, i32* %a1
		%v0 = load i32, i32* %a0
		%v1 = load i32, i32* %a1
		%cond = icmp sle i32 %v0, %v1
		%select = select i1 %cond, i32* %a1, i32* %a0
		%select.asc = addrspacecast i32* %select to i32 addrspace(1)*
		%result = load i32, i32 addrspace(1)* %select.asc
		ret i32 %result
		}

define i32 @test3(i32 %x) {		define i32 @test3(i32 %x) {
; CHECK-LABEL: @test3(		; CHECK-LABEL: @test3(
entry:		entry:
%a = alloca [2 x i32]		%a = alloca [2 x i32]
; CHECK-NOT: alloca		; CHECK-NOT: alloca

; Note that we build redundant GEPs here to ensure that having different GEPs		; Note that we build redundant GEPs here to ensure that having different GEPs
; into the same alloca partition continues to work with PHI speculation. This		; into the same alloca partition continues to work with PHI speculation. This
[583 lines not shown]