This is an archive of the discontinued LLVM Phabricator instance.


Differential D31924

SROA: Allow eliminating addrspacecasted allocas
Closed, Public

Authored by arsenm on Apr 10 2017, 8:52 PM.

Details

Reviewers

chandlerc
sanjoy
theraven

Summary

This is a resurrection of D10482 and D4501.

There is a circular dependency between SROA and InferAddressSpaces
today that requires running both multiple times in order to be able to
eliminate all simple allocas and addrspacecasts. InferAddressSpaces
can't remove addrspacecasts when written to memory, and SROA helps
move pointers out of memory.

This should avoid inserting new commuting addrspacecasts with GEPs,
since there are unresolved questions about pointer wrapping between
different address spaces.

For now, don't replace volatile operations that don't match the alloca
addrspace, as it would change the address space of the access. It may
be still OK to insert an addrspacecast from the new alloca, but be
more conservative for now.

Event Timeline

arsenm created this revision. Apr 10 2017, 8:52 PM

Herald added a subscriber: wdng. Apr 10 2017, 8:52 PM

efriedma added a subscriber: efriedma. Apr 11 2017, 11:42 AM

efriedma added inline comments.

docs/LangRef.rst
8465	Do we need to restrict this rule to inbounds indexing?

arsenm added inline comments. Apr 11 2017, 12:28 PM

docs/LangRef.rst
8465	I suppose this should specify for a defined result. The idea is really only to disallow implementations that can't round-trip, like in @sanjoy's example of implementing it with abs.

Specify defined results, and that the pointer must be able to round trip

efriedma added inline comments. Apr 14 2017, 10:45 AM

docs/LangRef.rst
8466	What does "and then indexed" mean for a gep that isn't inbounds? We clearly can't make all indexing equivalent: if you use a gep to increment a pointer by 2^33, that's clearly going to have a different result if you try to round-trip that value through a 32-bit pointer. I mean, I understand what you're getting at with the new text: the rule is essentially that the original and casted pointers point at the same memory allocation. LangRef really needs to be clear, though.
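
To make the 2^33 example concrete, here is a minimal sketch in plain C++. It assumes a hypothetical pair of address spaces, one 64-bit and one 32-bit, where the cast simply truncates or zero-extends the address. The function names are illustrative and not part of LLVM.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical casts: the 32-bit space is the low 4 GiB of the 64-bit space.
uint32_t castTo32(uint64_t P) { return static_cast<uint32_t>(P); }
uint64_t castTo64(uint32_t P) { return static_cast<uint64_t>(P); }

// Index in the 64-bit space first, then round-trip through the 32-bit space.
uint64_t indexThenRoundTrip(uint64_t Base, uint64_t Offset) {
  return castTo64(castTo32(Base + Offset));
}

// Round-trip the base pointer first, then index in the 64-bit space.
uint64_t roundTripThenIndex(uint64_t Base, uint64_t Offset) {
  // The base itself must be representable in the 32-bit space.
  assert(Base <= UINT32_MAX && "base must fit in the 32-bit space");
  return castTo64(castTo32(Base)) + Offset;
}
```

With a base of 0x1000 and an offset of 2^33, indexing before the round trip yields 0x1000 while indexing after it yields 0x200001000. In this model the two orders agree only while the indexed address stays within what the 32-bit space can represent.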

Attempt simpler phrasing

efriedma added inline comments. Apr 18 2017, 12:30 PM

docs/LangRef.rst
8465	"The pointer conversion cannot be an arbitrarily complex value modification." is a bit vague... it'd be better if we could specifically say what transforms are allowed.
lib/Transforms/Scalar/SROA.cpp
1588	Why do we want to generate an address-space cast here, as opposed to performing the memory operation using the alloca's natural address-space?

Remove unnecessary change. The pointer already has the right address space

arsenm added inline comments. Apr 20 2017, 1:18 PM

docs/LangRef.rst
8465	Would just dropping that sentence work? The intent is more clear in the second sentence where it needs to be reversible

I'd like to see some testcases which involve mixing GEPs and addrspacecasts. What width of APInt do we use to accumulate the offset? How does overflow work when we mix GEPs in different address-spaces?

docs/LangRef.rst
8465	Yes, I think that's fine.

sanjoy added inline comments. Apr 20 2017, 1:41 PM

lib/Transforms/Scalar/SROA.cpp
1557	Are you ruling that `GEP(CAST(X), 1)` is the same as `CAST(GEP X, 1)`? If so, I'm not sure this is correct given your constraint on address space casts. For instance, if casting from address space N to M, with both spaces having the same pointer width, involves flipping the high and low halves of the pointer then `GEP(CAST(X), 1)` is not the same as `(CAST(GEP X, 1))`. Of course, this means that GEPs over pointers of address space M are different operations from GEPs over pointers of address space N, but that's allowed, AFAIK.
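
The half-swap example can be sketched in plain C++. This is a toy model under stated assumptions: both address spaces use 32-bit pointers, the cast swaps the high and low 16-bit halves, and a GEP by one byte is plain wrapping addition. None of the names below come from LLVM.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical cast between address spaces N and M: swap the 16-bit halves.
uint32_t castNToM(uint32_t P) { return (P << 16) | (P >> 16); }

// Byte-wise GEP modeled as wrapping addition in either address space.
uint32_t gep(uint32_t P, uint32_t Offset) { return P + Offset; }

// GEP(CAST(X), 1)
uint32_t gepOfCast(uint32_t X) { return gep(castNToM(X), 1); }

// CAST(GEP X, 1)
uint32_t castOfGep(uint32_t X) { return castNToM(gep(X, 1)); }

// The cast is its own inverse, so it satisfies a pure round-trip rule.
void checkReversible(uint32_t X) { assert(castNToM(castNToM(X)) == X); }
```

For X = 0x00010002, `gepOfCast` gives 0x00020002 and `castOfGep` gives 0x00030001, even though the cast round-trips exactly. A reversibility requirement alone therefore does not imply that casts and GEPs commute.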

arsenm added inline comments. Apr 20 2017, 1:45 PM

lib/Transforms/Scalar/SROA.cpp
1557	Yes, those should be the same

sanjoy added inline comments. Apr 20 2017, 1:46 PM

lib/Transforms/Scalar/SROA.cpp
1557	Then you need to change the langref to disallow address space casts such as the above (flipping the high and low halves of the pointer).

Try to re-word langref again.

Re-add addrspacecast insertion. It can be necessary when the alloca isn't entirely eliminated.

Fix asserting when casting between different sized pointers

sanjoy added a subscriber: theraven. Apr 20 2017, 8:03 PM

Restricting addrspace cast in this way seems... really hard to get right. I still have the question Eli asked: what does this mean in the absence of inbounds? What if the address spaces have different wrapping behavior even though they have the same number of bits?

Consider an address space where there are tag bits in the high bits and one where there aren't. These may appear to be the same type, but the GEP-ing rule you propose doesn't seem to generally hold.

And that's just one example. I'm not sure even restricting this to inbounds will really fix the issue.

What about approaching this more from the inference perspective? Could we embed the inference into the iteration of SROA without shifting the restrictions so much?

theraven added inline comments. Apr 21 2017, 1:11 AM

docs/LangRef.rst
8465	It would be nice to clarify what 'legal' means in this context. For us, the relationship between address spaces 0 and 200 relies on some run-time properties. Address space 200 is always a superset of address space 0, so it is always safe to cast from AS 0 to AS 200, but the converse might not be possible and will give either a valid value (without bounds information) or a null pointer. A cast from AS200 -> AS0 -> AS200 may result in a null pointer if the address is outside the range covered by AS0. The same would apply on microcontrollers with a 32-bit global address space and a 16-bit address space mapped within that: you could always cast from the 16-bit range to the 32-bit range and back, but casting from an arbitrary 32-bit range to the 16-bit range and back may not work.
lib/Analysis/PtrUseVisitor.cpp
34	No changes suggested here, but in our version we have queries on the data layout that differentiate between the type and the range of a pointer (ours are 128- or 256-bit sized, but with a 64-bit range).
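
The subset relationship described in the LangRef comment above can be sketched as follows. This is a hypothetical layout, not any real target: a 16-bit address space is mapped as a 64 KiB window inside a 32-bit global space, null is 0 in both, and a wide-to-narrow cast of an address outside the window yields null.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical window placement inside the 32-bit global address space.
constexpr uint32_t WindowBase = 0x20000000u;
constexpr uint32_t WindowSize = 0x10000u;

// Narrow -> wide always succeeds; null maps to null.
uint32_t castToGlobal(uint16_t P) { return P == 0 ? 0u : WindowBase + P; }

// Wide -> narrow yields null (0) for anything outside the window.
uint16_t castToWindow(uint32_t P) {
  if (P <= WindowBase || P - WindowBase >= WindowSize)
    return 0;
  return static_cast<uint16_t>(P - WindowBase);
}

// Narrow -> wide -> narrow is the only direction that always round-trips.
void checkNarrowRoundTrip(uint16_t P) {
  assert(castToWindow(castToGlobal(P)) == P);
}
```

In this model `castToGlobal(castToWindow(0x30000000))` is 0 rather than the original address, so a rule that requires every cast to be invertible would exclude such a target.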

In D31924#733213, @chandlerc wrote:

Restricting addrspace cast in this way seems... really hard to get right. I still have the question Eli asked: what does this mean in the absence of inbounds? What if the address spaces have different wrapping behavior even though they have the same number of bits?

Consider an address space where there are tag bits in the high bits and one where there aren't. These may appear to be the same type, but the GEP-ing rule you propose doesn't seem to generally hold.

And that's just one example. I'm not sure even restricting this to inbounds will really fix the issue.

What about approaching this more from the inference perspective? Could we embed the inference into the iteration of SROA without shifting the restrictions so much?

I'm not sure exactly what you mean by this. Do you mean somehow merging InferAddressSpaces and SROA?

lib/Analysis/PtrUseVisitor.cpp
34	We will probably need this at some point for using pointers for resource descriptors

In D31924#735918, @arsenm wrote:

In D31924#733213, @chandlerc wrote:

What about approaching this more from the inference perspective? Could we embed the inference into the iteration of SROA without shifting the restrictions so much?

I'm not sure exactly what you mean by this. Do you mean somehow merging InferAddressSpaces and SROA?

In a limited form...

Essentially, expose utilities to infer address spaces which can be shared with the InferAddressSpaces pass but can also be used to infer address spaces for allocas as SROA promotes their uses into SSA registers.

Was there a decision reached here on what the correct semantics are? There are other places in LLVM (I found one in instcombine - there may be others) which do make the assumption that this change is proposing to introduce to the langref. Personally, I don't think this transformation should be allowed. I know there are architectures where different address spaces have different GEP behavior (though I'm not sure if this is the case for any in-tree backend). Nevertheless, if people do feel like this should be allowed (e.g. because such architectures should use something other than addrspacecast to convert between such address spaces), that's fine with me as well, but I think there should be a clear statement in the langref one way or the other. As is, different people read the langref differently.

In D31924#754467, @loladiro wrote:

Was there a decision reached here on what the correct semantics are? There are other places in LLVM (I found one in instcombine - there may be others) which do make the assumption that this change is proposing to introduce to the langref. Personally, I don't think this transformation should be allowed. I know there are architectures where different address spaces have different GEP behavior (though I'm not sure if this is the case for any in-tree backend). Nevertheless, if people do feel like this should be allowed (e.g. because such architectures should use something other than addrspacecast to convert between such address spaces), that's fine with me as well, but I think there should be a clear statement in the langref one way or the other. As is, different people read the langref differently.

Do you mean non-integral pointers? I don't think this changes the rules I was thinking. You still should have a reversible result. AMDGPU will use some non-integral pointers eventually, although it doesn't need them for this particular case.

I think the way the non-integral pointer is worded now is to allow GC etc. to completely replace the pointer value, in which case eliminating it like this is probably not OK. Would it work to restrict this for only integral pointer address spaces?

There was some discussion about non-integral address spaces at EuroLLVM. The current restriction is too great, as not allowing ptrtoint and inttoptr makes it impossible to support C-like languages. We discussed refining the definition to be that optimisers should not introduce inttoptr or ptrtoint, but that they are allowed to be inserted by the front end in places where they are valid in the context of the source language.

This change still has the problem with requiring bijection between address spaces, which is not the case for us or for any platform where there is a subset relationship between address spaces. For example, casting from a 32-bit address to a 16-bit address and then back is not guaranteed to give the same address. On an ARM M-profile chip, casting between addresses in different MPU sections may encounter similar problems.

I'm very nervous of any optimisations that introduce address space casts, because they rely on far more knowledge of the relationship between address spaces than we currently provide with the data layout. Perhaps providing that information in the data layout should be a prerequisite for this.

Do you mean non-integral pointers?

No, I don't mean non-integral pointers (though there are problems there too). I apologize if I'm being vague here, but I read a lot of architecture specs and I can never remember what is and is not public. In any case, I think the easiest example here is virtual memory. Consider an architecture with primitives for both physical and virtual memory and instructions for converting between the two quickly. It seems perfectly plausible to want to express the conversion between the two kinds of pointers as an address space cast. However, geps and address space casts certainly don't commute here. Now, as I said, you might argue that such a crazy address space cast deserves a target-specific intrinsic, and I think that's a fine stance to take. I suppose the other alternative would be to add some information to the datalayout or to the addrspacecast itself to indicate whether the optimizer is allowed to introduce addrspacecasts that weren't in the original program.
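
A toy model of the virtual/physical example, written as plain C++. The three-entry page table, the page size, and the function name are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t PageSize = 4096;
// Hypothetical page table: virtual page i is backed by frame FrameOf[i].
constexpr uint32_t FrameOf[] = {7, 2, 5};

// Models an address space cast from the virtual to the physical space.
uint32_t virtToPhys(uint32_t VA) {
  uint32_t Page = VA / PageSize;
  assert(Page < 3 && "virtual address outside the toy page table");
  return FrameOf[Page] * PageSize + VA % PageSize;
}
```

Inside one page the cast commutes with indexing: `virtToPhys(100 + 1)` equals `virtToPhys(100) + 1`. Across a page boundary it does not: `virtToPhys(4095 + 1)` is 8192, while `virtToPhys(4095) + 1` is 32768.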

In D31924#754613, @theraven wrote:

There was some discussion about non-integral address spaces at EuroLLVM. The current restriction is too great, as not allowing ptrtoint and inttoptr makes it impossible to support C-like languages. We discussed refining the definition to be that optimisers should not introduce inttoptr or ptrtoint, but that they are allowed to be inserted by the front end in places where they are valid in the context of the source language.

You're making me regret not making it to EuroLLVM even more. :)

We disallowed ptrtoint and inttoptr because these instructions (today) are arbitrarily speculatable; and changing that to be dependent on their types would introduce complexity.

Instead my plan is to add intrinsics to convert between ni pointers and integers with exactly the property you mentioned -- these intrinsics may have side effects so they can't be inserted by the optimizer or speculated, but the frontend may insert them when legal.

For us, speculation isn't a problem. ptrtoint is not guaranteed to give stable results in all run-time environments (i.e. if we enable a copying GC), but it doesn't break the memory safety guarantees. inttoptr only works in some execution environments (and will result in a null where it wouldn't work), and it's up to the C programmer to ensure that they don't use it when it wouldn't be sensible and other front ends won't emit it at all. Code works as expected, as long as optimisers don't try to add them.

loladiro mentioned this in D33361: [InstCombine] Fix inbounds gep for addrspacecasts. May 19 2017, 10:00 AM

Having pondered this some more, I wonder if what we're missing is an annotation on the addrspacecast itself that indicates whether or not GEPs may be commuted past it (could call it inbounds or something else). In many cases (including allocas), the frontend (or whoever else is doing language/target-specific work) can know whether the entire object is available in the target address space (e.g. because the entire stack always is).

Rebase and fix using the pointer type instead of the new indexing type. Don't introduce a new addrspacecast, since it's easily avoidable.

Since the addrspacecast is no longer inserted, I think that avoids some of the questions about pointer wrapping? The addrspacecasts are only eliminated, and ignored for computing the offset.

lib/Transforms/Scalar/SROA.cpp
1557	Does this only matter because of the newly introduced addrspacecast? This may change the pointer value, but we only care about number of bytes indexed off of the original object. Changing the representation in the middle shouldn't change the total number of bytes addressed from the original object?
1588	You're right, this is unnecessary

jdoerfert added a subscriber: jdoerfert. Jun 3 2019, 1:43 PM

Re-apply change to insert a final addrspacecast in getAdjustedPtr, which is necessary in some cases.

Don't allow getAdjustedPtr to search through addrspacecast. This should prevent the questionable addrspacecast-GEP commuting behavior. The practical case that matters is an alloca immediately casted, and all addressing is done in the result address space, so getting the same GEP folds in all cases as a bitcast isn't critically important, though would be nice to have. Some new addrspacecasts may be introduced, but not followed by a GEP.
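
The difference between looking through an addrspacecast and stopping at it can be sketched with a toy pointer chain. This is not the SROA implementation: `PtrNode`, `OpKind`, and `peelCasts` are hypothetical names, and the real getAdjustedPtr also tracks offsets and GEPs.

```cpp
#include <cassert>

enum class OpKind { Alloca, BitCast, AddrSpaceCast };

// One pointer-producing operation and the pointer it was derived from.
struct PtrNode {
  OpKind Kind;
  const PtrNode *Operand; // null for Alloca
};

// Walk up through bitcasts; only walk through addrspacecasts when asked to.
const PtrNode *peelCasts(const PtrNode *Ptr, bool PeelAddrSpaceCast) {
  assert(Ptr && "expected a pointer value");
  while (Ptr->Kind == OpKind::BitCast ||
         (PeelAddrSpaceCast && Ptr->Kind == OpKind::AddrSpaceCast))
    Ptr = Ptr->Operand;
  return Ptr;
}
```

For a chain alloca -> addrspacecast -> bitcast, calling `peelCasts` with `PeelAddrSpaceCast = false` stops at the addrspacecast, so any later indexing stays in the casted address space. With `true` it reaches the alloca, which is the behavior this update describes as questionable.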

Fix changing the address space of volatile operations, although inserting a new addrspacecast should be OK in these cases. For now just leave these cases alone.

Also fix some missing test coverage.

arsenm marked an inline comment as done. Jun 10 2019, 1:45 PM

arsenm added inline comments.

lib/Transforms/Scalar/SROA.cpp
1588	This is actually necessary in some cases. (e.g. in select_addrspacecast_const_op, without this, the two operands of the select end up as different types)

I think this looks like it will improve codegen for us and not violate any of our C-level guarantees. Hopefully @arichardson can also take a look.

This revision is now accepted and ready to land. Jun 11 2019, 1:31 AM

In D31924#1537558, @theraven wrote:

I think this looks like it will improve codegen for us and not violate any of our C-level guarantees. Hopefully @arichardson can also take a look.

I just tried this on our fork and it looks good.

test/Transforms/SROA/basictest.ll
108	Use FileCheck captures for the variables in case the naming changes in the future?

r363462

Revision Contents

Path	Size
docs/LangRef.rst	11 lines
include/llvm/Analysis/PtrUseVisitor.h	4 lines
lib/Analysis/PtrUseVisitor.cpp	8 lines
lib/Transforms/Scalar/SROA.cpp	23 lines
test/Transforms/SROA/basictest.ll	101 lines

Diff 96082

docs/LangRef.rst

	[... 8,453 lines not shown ...]

	Semantics:			Semantics:
	""""""""""			""""""""""

	The '``addrspacecast``' instruction converts the pointer value			The '``addrspacecast``' instruction converts the pointer value
	``ptrval`` to type ``pty2``. It can be a no-op cast or a complex			``ptrval`` to type ``pty2``. It can be a no-op cast or a complex
	value modification, depending on the target and the address space			value modification, depending on the target and the address space
	pair. Pointer conversions within the same address space must be			pair. Pointer conversions within the same address space must be
	performed with the ``bitcast`` instruction. Note that if the address space			performed with the ``bitcast`` instruction. Note that if the address
	conversion is legal then both result and operand refer to the same memory			space conversion is legal then both result and operand refer to the
	location.			same memory location. For a defined result, the inverse cast to the
				original address space should yield the original pointer value. A
				defined pointer computation derived from the casted value should be
				equivalent to an offset computed in the original address space and
				then casted. i.e. addrspacecast (getelementptr %ptr, %index) is
				equivalent to getelementptr (addrspacecast %ptr), %index.

	Example:			Example:
	""""""""			""""""""

	.. code-block:: llvm			.. code-block:: llvm

	%X = addrspacecast i32* %x to i32 addrspace(1)* ; yields i32 addrspace(1)*:%x			%X = addrspacecast i32* %x to i32 addrspace(1)* ; yields i32 addrspace(1)*:%x
	%Y = addrspacecast i32 addrspace(1)* %y to i64 addrspace(2)* ; yields i64 addrspace(2)*:%y			%Y = addrspacecast i32 addrspace(1)* %y to i64 addrspace(2)* ; yields i64 addrspace(2)*:%y
	[... 4,797 lines not shown ...]

include/llvm/Analysis/PtrUseVisitor.h

[... 236 lines not shown ...]	void visitStoreInst(StoreInst &SI) {
if (SI.getValueOperand() == U->get())		if (SI.getValueOperand() == U->get())
PI.setEscaped(&SI);		PI.setEscaped(&SI);
}		}

void visitBitCastInst(BitCastInst &BC) {		void visitBitCastInst(BitCastInst &BC) {
enqueueUsers(BC);		enqueueUsers(BC);
}		}

		void visitAddrSpaceCastInst(AddrSpaceCastInst &ASC) {
		enqueueUsers(ASC);
		}

void visitPtrToIntInst(PtrToIntInst &I) {		void visitPtrToIntInst(PtrToIntInst &I) {
PI.setEscaped(&I);		PI.setEscaped(&I);
}		}

void visitGetElementPtrInst(GetElementPtrInst &GEPI) {		void visitGetElementPtrInst(GetElementPtrInst &GEPI) {
if (GEPI.use_empty())		if (GEPI.use_empty())
return;		return;

[... 36 lines not shown ...]

lib/Analysis/PtrUseVisitor.cpp

[... 25 lines not shown ...]	for (Use &U : I.uses()) {
}		}
}		}
}		}

bool detail::PtrUseVisitorBase::adjustOffsetForGEP(GetElementPtrInst &GEPI) {		bool detail::PtrUseVisitorBase::adjustOffsetForGEP(GetElementPtrInst &GEPI) {
if (!IsOffsetKnown)		if (!IsOffsetKnown)
return false;		return false;

return GEPI.accumulateConstantOffset(DL, Offset);		APInt TmpOffset(DL.getPointerTypeSizeInBits(GEPI.getType()), 0);
		if (GEPI.accumulateConstantOffset(DL, TmpOffset)) {
		Offset += TmpOffset.sextOrTrunc(Offset.getBitWidth());
		return true;
		}

		return false;
}		}
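
The new adjustOffsetForGEP accumulates the GEP offset at the GEP's own pointer width and then sign-extends or truncates it to the width of the running offset. A plain-integer sketch of that width adjustment follows. It is a stand-in for `APInt::sextOrTrunc` limited to widths of 1 to 64 bits, and `addGepOffset` is a hypothetical helper, not code from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend or truncate the low SrcBits of Value to DstBits. The result is
// returned in the low DstBits of a uint64_t.
uint64_t sextOrTrunc(uint64_t Value, unsigned SrcBits, unsigned DstBits) {
  assert(SrcBits >= 1 && SrcBits <= 64 && DstBits >= 1 && DstBits <= 64);
  uint64_t SrcMask = SrcBits == 64 ? ~0ull : ((1ull << SrcBits) - 1);
  uint64_t DstMask = DstBits == 64 ? ~0ull : ((1ull << DstBits) - 1);
  Value &= SrcMask;
  // Replicate the sign bit into the upper bits before narrowing to DstBits.
  if (SrcBits < 64 && ((Value >> (SrcBits - 1)) & 1))
    Value |= ~SrcMask;
  return Value & DstMask;
}

// Add a GEP offset computed at GepBits to a running offset kept at
// OffsetBits, wrapping at OffsetBits.
uint64_t addGepOffset(uint64_t Offset, unsigned OffsetBits, uint64_t GepOffset,
                      unsigned GepBits) {
  uint64_t Mask = OffsetBits == 64 ? ~0ull : ((1ull << OffsetBits) - 1);
  return (Offset + sextOrTrunc(GepOffset, GepBits, OffsetBits)) & Mask;
}
```

For example, a 16-bit GEP offset of 0xFFFC (that is, -4) added to a 64-bit running offset of 32 gives 28, because the narrow offset is sign-extended before the addition.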

lib/Transforms/Scalar/SROA.cpp

[... 664 lines not shown ...]	private:

void visitBitCastInst(BitCastInst &BC) {		void visitBitCastInst(BitCastInst &BC) {
if (BC.use_empty())		if (BC.use_empty())
return markAsDead(BC);		return markAsDead(BC);

return Base::visitBitCastInst(BC);		return Base::visitBitCastInst(BC);
}		}

		void visitAddrSpaceCastInst(AddrSpaceCastInst &ASC) {
		if (ASC.use_empty())
		return markAsDead(ASC);

		return Base::visitAddrSpaceCastInst(ASC);
		}

void visitGetElementPtrInst(GetElementPtrInst &GEPI) {		void visitGetElementPtrInst(GetElementPtrInst &GEPI) {
if (GEPI.use_empty())		if (GEPI.use_empty())
return markAsDead(GEPI);		return markAsDead(GEPI);

if (SROAStrictInbounds && GEPI.isInBounds()) {		if (SROAStrictInbounds && GEPI.isInBounds()) {
// FIXME: This is a manually un-factored variant of the basic code inside		// FIXME: This is a manually un-factored variant of the basic code inside
// of GEPs with checking of the inbounds invariant specified in the		// of GEPs with checking of the inbounds invariant specified in the
// langref in a very strict sense. If we ever want to enable		// langref in a very strict sense. If we ever want to enable
[... 221 lines not shown ...]	do {
Size = std::max(Size, DL.getTypeStoreSize(Op->getType()));		Size = std::max(Size, DL.getTypeStoreSize(Op->getType()));
continue;		continue;
}		}

if (GetElementPtrInst *GEP = dyn_cast<GetElementPtrInst>(I)) {		if (GetElementPtrInst *GEP = dyn_cast<GetElementPtrInst>(I)) {
if (!GEP->hasAllZeroIndices())		if (!GEP->hasAllZeroIndices())
return GEP;		return GEP;
} else if (!isa<BitCastInst>(I) && !isa<PHINode>(I) &&		} else if (!isa<BitCastInst>(I) && !isa<PHINode>(I) &&
!isa<SelectInst>(I)) {		!isa<SelectInst>(I) && !isa<AddrSpaceCastInst>(I)) {
return I;		return I;
}		}

for (User *U : I->users())		for (User *U : I->users())
if (Visited.insert(cast<Instruction>(U)).second)		if (Visited.insert(cast<Instruction>(U)).second)
Uses.push_back(std::make_pair(I, cast<Instruction>(U)));		Uses.push_back(std::make_pair(I, cast<Instruction>(U)));
} while (!Uses.empty());		} while (!Uses.empty());

[... 622 lines not shown ...]	do {

// Stash this pointer if we've found an i8*.		// Stash this pointer if we've found an i8*.
if (Ptr->getType()->isIntegerTy(8)) {		if (Ptr->getType()->isIntegerTy(8)) {
Int8Ptr = Ptr;		Int8Ptr = Ptr;
Int8PtrOffset = Offset;		Int8PtrOffset = Offset;
}		}

// Peel off a layer of the pointer and update the offset appropriately.		// Peel off a layer of the pointer and update the offset appropriately.
if (Operator::getOpcode(Ptr) == Instruction::BitCast) {		unsigned Opc = Operator::getOpcode(Ptr);
		if (Opc == Instruction::BitCast \|\| Opc == Instruction::AddrSpaceCast) {
Ptr = cast<Operator>(Ptr)->getOperand(0);		Ptr = cast<Operator>(Ptr)->getOperand(0);
} else if (GlobalAlias *GA = dyn_cast<GlobalAlias>(Ptr)) {		} else if (GlobalAlias *GA = dyn_cast<GlobalAlias>(Ptr)) {
if (GA->isInterposable())		if (GA->isInterposable())
break;		break;
Ptr = GA->getAliasee();		Ptr = GA->getAliasee();
} else {		} else {
break;		break;
}		}
[... 12 lines not shown ...]	OffsetPtr = Int8PtrOffset == 0
? Int8Ptr		? Int8Ptr
: IRB.CreateInBoundsGEP(IRB.getInt8Ty(), Int8Ptr,		: IRB.CreateInBoundsGEP(IRB.getInt8Ty(), Int8Ptr,
IRB.getInt(Int8PtrOffset),		IRB.getInt(Int8PtrOffset),
NamePrefix + "sroa_raw_idx");		NamePrefix + "sroa_raw_idx");
}		}
Ptr = OffsetPtr;		Ptr = OffsetPtr;

// On the off chance we were targeting i8*, guard the bitcast here.		// On the off chance we were targeting i8*, guard the bitcast here.
if (Ptr->getType() != PointerTy)		if (Ptr->getType() != PointerTy) {
Ptr = IRB.CreateBitCast(Ptr, PointerTy, NamePrefix + "sroa_cast");		Ptr = IRB.CreatePointerBitCastOrAddrSpaceCast(Ptr, PointerTy,
		NamePrefix + "sroa_cast");
		}

return Ptr;		return Ptr;
}		}

/// \brief Compute the adjusted alignment for a load or store from an offset.		/// \brief Compute the adjusted alignment for a load or store from an offset.
static unsigned getAdjustedAlignment(Instruction *I, uint64_t Offset,		static unsigned getAdjustedAlignment(Instruction *I, uint64_t Offset,
const DataLayout &DL) {		const DataLayout &DL) {
unsigned Alignment;		unsigned Alignment;
[... 1,566 lines not shown ...]	bool visitStoreInst(StoreInst &SI) {
return true;		return true;
}		}

bool visitBitCastInst(BitCastInst &BC) {		bool visitBitCastInst(BitCastInst &BC) {
enqueueUsers(BC);		enqueueUsers(BC);
return false;		return false;
}		}

		bool visitAddrSpaceCastInst(AddrSpaceCastInst &ASC) {
		enqueueUsers(ASC);
		return false;
		}

bool visitGetElementPtrInst(GetElementPtrInst &GEPI) {		bool visitGetElementPtrInst(GetElementPtrInst &GEPI) {
enqueueUsers(GEPI);		enqueueUsers(GEPI);
return false;		return false;
}		}

bool visitPHINode(PHINode &PN) {		bool visitPHINode(PHINode &PN) {
enqueueUsers(PN);		enqueueUsers(PN);
return false;		return false;
[... 1,129 lines not shown ...]

test/Transforms/SROA/basictest.ll

[... 59 lines not shown ...]	entry:
store i64 %X, i64* %B		store i64 %X, i64* %B
br label %L2		br label %L2

L2:		L2:
%Z = load i64, i64* %B		%Z = load i64, i64* %B
ret i64 %Z		ret i64 %Z
}		}

		define i64 @test2_addrspacecast(i64 %X) {
		; CHECK-LABEL: @test2_addrspacecast(
		; CHECK-NOT: alloca
		; CHECK: ret i64 %X

		entry:
		%A = alloca [8 x i8]
		%B = addrspacecast [8 x i8]* %A to i64 addrspace(1)*
		store i64 %X, i64 addrspace(1)* %B
		br label %L2

		L2:
		%Z = load i64, i64 addrspace(1)* %B
		ret i64 %Z
		}

		define i64 @test2_addrspacecast_gep(i64 %X, i16 %idx) {
		; CHECK-LABEL: @test2_addrspacecast_gep(
		; CHECK-NOT: alloca
		; CHECK: ret i64 %X

		entry:
		%A = alloca [256 x i8]
		%B = addrspacecast [256 x i8]* %A to i64 addrspace(1)*
		%gepA = getelementptr [256 x i8], [256 x i8]* %A, i16 0, i16 32
		%gepB = getelementptr i64, i64 addrspace(1)* %B, i16 4
		store i64 %X, i64 addrspace(1)* %gepB, align 1
		br label %L2

		L2:
		%gepA.bc = bitcast i8* %gepA to i64*
		%Z = load i64, i64* %gepA.bc, align 1
		ret i64 %Z
		}

		; Avoid crashing when loading/storing at different offsets.
		define i64 @test2_addrspacecast_gep_offset(i64 %X, i16 %idx) {
		; CHECK-LABEL: @test2_addrspacecast_gep_offset(
		; CHECK: %A.sroa.0 = alloca [10 x i8]
		; CHECK: %A.sroa.0.2.gepB.sroa_idx = getelementptr inbounds [10 x i8], [10 x i8]* %A.sroa.0, i16 0, i16 2
		; CHECK: %A.sroa.0.2.gepB.sroa_cast = addrspacecast i8* %A.sroa.0.2.gepB.sroa_idx to i64 addrspace(1)*
		arichardson (inline comment): Use FileCheck captures for the variables in case the naming changes in the future?
		; CHECK: store i64 %X, i64 addrspace(1)* %A.sroa.0.2.gepB.sroa_cast, align 1
		; CHECK: br

		; CHECK: %A.sroa.0.0.gepA.bc.sroa_cast = bitcast [10 x i8]* %A.sroa.0 to i64*
		; CHECK: %A.sroa.0.0.A.sroa.0.30.Z = load i64, i64* %A.sroa.0.0.gepA.bc.sroa_cast, align 1
		; CHECK-NEXT: ret
		entry:
		%A = alloca [256 x i8]
		%B = addrspacecast [256 x i8]* %A to i64 addrspace(1)*
		%gepA = getelementptr [256 x i8], [256 x i8]* %A, i16 0, i16 30
		%gepB = getelementptr i64, i64 addrspace(1)* %B, i16 4
		store i64 %X, i64 addrspace(1)* %gepB, align 1
		br label %L2

		L2:
		%gepA.bc = bitcast i8* %gepA to i64*
		%Z = load i64, i64* %gepA.bc, align 1
		ret i64 %Z
		}
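
A minimal sketch of what arichardson's inline suggestion might look like for @test2_addrspacecast_gep_offset, assuming the same `%[[name:.*]]` capture style already used elsewhere in basictest.ll (for example `%[[test3_a1:.*]]`). The capture names ALLOCA, GEP, CAST, BC and LOAD are hypothetical and were not part of the submitted patch; the matched instructions are copied from the CHECK lines above.

```llvm
; CHECK-LABEL: @test2_addrspacecast_gep_offset(
; Bind each SSA name on its definition, then reuse the binding at each use,
; so the test no longer depends on SROA's exact value-naming scheme.
; CHECK: %[[ALLOCA:.*]] = alloca [10 x i8]
; CHECK: %[[GEP:.*]] = getelementptr inbounds [10 x i8], [10 x i8]* %[[ALLOCA]], i16 0, i16 2
; CHECK: %[[CAST:.*]] = addrspacecast i8* %[[GEP]] to i64 addrspace(1)*
; CHECK: store i64 %X, i64 addrspace(1)* %[[CAST]], align 1
; CHECK: br

; CHECK: %[[BC:.*]] = bitcast [10 x i8]* %[[ALLOCA]] to i64*
; CHECK: %[[LOAD:.*]] = load i64, i64* %[[BC]], align 1
; CHECK-NEXT: ret
```

With captures, the directives still pin down the instruction sequence and operand relationships (the addrspacecast must consume the GEP of the new alloca), but a rename such as `.sroa_idx` changing in a later SROA revision would not break the test.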

define void @test3(i8* %dst, i8* %src) {		define void @test3(i8* %dst, i8* %src) {
; CHECK-LABEL: @test3(		; CHECK-LABEL: @test3(

entry:		entry:
%a = alloca [300 x i8]		%a = alloca [300 x i8]
; CHECK-NOT: alloca		; CHECK-NOT: alloca
; CHECK: %[[test3_a1:.*]] = alloca [42 x i8]		; CHECK: %[[test3_a1:.*]] = alloca [42 x i8]
; CHECK-NEXT: %[[test3_a2:.*]] = alloca [99 x i8]		; CHECK-NEXT: %[[test3_a2:.*]] = alloca [99 x i8]
[...]
entry:		entry:
%fptr = bitcast [4 x i8]* %a to float*		%fptr = bitcast [4 x i8]* %a to float*
store float 0.0, float* %fptr		store float 0.0, float* %fptr
%ptr = getelementptr [4 x i8], [4 x i8]* %a, i32 0, i32 2		%ptr = getelementptr [4 x i8], [4 x i8]* %a, i32 0, i32 2
%iptr = bitcast i8* %ptr to i16*		%iptr = bitcast i8* %ptr to i16*
%val = load i16, i16* %iptr		%val = load i16, i16* %iptr
ret i16 %val		ret i16 %val
}		}

		define i16 @test5_multi_addrspace_access() {
		; CHECK-LABEL: @test5_multi_addrspace_access(
		; CHECK-NOT: alloca float
		; CHECK: %[[cast:.*]] = bitcast float 0.0{{.*}} to i32
		; CHECK-NEXT: %[[shr:.*]] = lshr i32 %[[cast]], 16
		; CHECK-NEXT: %[[trunc:.*]] = trunc i32 %[[shr]] to i16
		; CHECK-NEXT: ret i16 %[[trunc]]

		entry:
		%a = alloca [4 x i8]
		%fptr = bitcast [4 x i8]* %a to float*
		%fptr.as1 = addrspacecast float* %fptr to float addrspace(1)*
		store float 0.0, float addrspace(1)* %fptr.as1
		%ptr = getelementptr [4 x i8], [4 x i8]* %a, i32 0, i32 2
		%iptr = bitcast i8* %ptr to i16*
		%val = load i16, i16* %iptr
		ret i16 %val
		}

define i32 @test6() {		define i32 @test6() {
; CHECK-LABEL: @test6(		; CHECK-LABEL: @test6(
; CHECK: alloca i32		; CHECK: alloca i32
; CHECK-NEXT: store volatile i32		; CHECK-NEXT: store volatile i32
; CHECK-NEXT: load i32, i32*		; CHECK-NEXT: load i32, i32*
; CHECK-NEXT: ret i32		; CHECK-NEXT: ret i32

entry:		entry:
[...]
entry:		entry:
%cast1 = bitcast %opaque* %x to i8*		%cast1 = bitcast %opaque* %x to i8*
%cast2 = bitcast { i64, i8* }* %a to i8*		%cast2 = bitcast { i64, i8* }* %a to i8*
call void @llvm.memcpy.p0i8.p0i8.i32(i8* %cast2, i8* %cast1, i32 16, i32 1, i1 false)		call void @llvm.memcpy.p0i8.p0i8.i32(i8* %cast2, i8* %cast1, i32 16, i32 1, i1 false)
%gep = getelementptr inbounds { i64, i8* }, { i64, i8* }* %a, i32 0, i32 0		%gep = getelementptr inbounds { i64, i8* }, { i64, i8* }* %a, i32 0, i32 0
%val = load i64, i64* %gep		%val = load i64, i64* %gep
ret i32 undef		ret i32 undef
}		}

		declare void @llvm.memcpy.p0i8.p1i8.i32(i8* nocapture, i8 addrspace(1)* nocapture, i32, i32, i1) nounwind

		define i32 @test19_addrspacecast(%opaque* %x) {
		; This input will cause us to try to compute a natural GEP when rewriting
		; pointers in such a way that we try to GEP through the opaque type. Previously,
		; a check for an unsized type was missing and this crashed. Ensure it behaves
		; reasonably now.
		; CHECK-LABEL: @test19_addrspacecast(
		; CHECK-NOT: alloca
		; CHECK: ret i32 undef

		entry:
		%a = alloca { i64, i8* }
		%cast1 = addrspacecast %opaque* %x to i8 addrspace(1)*
		%cast2 = bitcast { i64, i8* }* %a to i8*
		call void @llvm.memcpy.p0i8.p1i8.i32(i8* %cast2, i8 addrspace(1)* %cast1, i32 16, i32 1, i1 false)
		%gep = getelementptr inbounds { i64, i8* }, { i64, i8* }* %a, i32 0, i32 0
		%val = load i64, i64* %gep
		ret i32 undef
		}

define i32 @test20() {		define i32 @test20() {
; Ensure we can track negative offsets (before the beginning of the alloca) and		; Ensure we can track negative offsets (before the beginning of the alloca) and
; negative relative offsets from offsets starting past the end of the alloca.		; negative relative offsets from offsets starting past the end of the alloca.
; CHECK-LABEL: @test20(		; CHECK-LABEL: @test20(
; CHECK-NOT: alloca		; CHECK-NOT: alloca
; CHECK: %[[sum1:.*]] = add i32 1, 2		; CHECK: %[[sum1:.*]] = add i32 1, 2
; CHECK: %[[sum2:.*]] = add i32 %[[sum1]], 3		; CHECK: %[[sum2:.*]] = add i32 %[[sum1]], 3
; CHECK: ret i32 %[[sum2]]		; CHECK: ret i32 %[[sum2]]