This is an archive of the discontinued LLVM Phabricator instance.

Differential D99135

[deref] Implement initial set of inference rules for deref-at-point
ClosedPublic

Authored by reames on Mar 22 2021, 7:42 PM.

Details

Reviewers

jdoerfert
apilipenko
nlewycky
bollu
nikic
nlopes

Commits

rG4054b8322fd8: [deref] Implement initial set of inference rules for deref-at-point

Summary

Building block towards D99100.

This implements the initial set of inference rules proposed in the llvm-dev thread "RFC: Decomposing deref(N) into deref(N) + nofree". I've already stumbled across a few more when examining test cases, but it's good to get this checked in and tested.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

reames created this revision. Mar 22 2021, 7:42 PM

Herald added a reviewer: bollu. Mar 22 2021, 7:42 PM

Herald added subscribers: dexonsmith, dantrushin, hiraditya, mcrosier.

reames requested review of this revision. Mar 22 2021, 7:42 PM

Herald added a project: Restricted Project. Mar 22 2021, 7:42 PM

reames mentioned this in D99138: [deref] Use readonly to infer global dereferenceability in a callee. Mar 22 2021, 8:19 PM

reames added a child revision: D99138: [deref] Use readonly to infer global dereferenceability in a callee.

reames added a child revision: D99100: [WIP] Implement RFC: Decomposing deref(N) into deref(N) + nofree.

Harbormaster completed remote builds in B95143: Diff 332501. Mar 22 2021, 8:22 PM

nlopes requested changes to this revision. Mar 23 2021, 4:19 AM

nlopes added inline comments.

llvm/lib/IR/Value.cpp
741	an Alloca can be killed with the lifetime intrinsics. Storing to an alloca after lifetime_end is UB.
756	I don't understand this argument of why hasNoSync isn't needed. All argument attributes are with respect to what the function does, not the environment. It would be impossible to infer attributes if the environment was considered. My understanding is that readonly implies nofree. Can't see why nofree would have those special semantics
769	FWIW missing byval case (can't be freed).

This revision now requires changes to proceed. Mar 23 2021, 4:19 AM

reames added inline comments. Mar 23 2021, 9:14 AM

llvm/lib/IR/Value.cpp
741	Right, but how does that affect dereferenceability? The memory is still dereferenceable. The contents may be undefined, but the access won't fault. p.s. Lifetime intrinsics are very badly specified, and are inconsistent across the optimizer - including in the existing deref code. I really don't want them to be a blocker here.
756	The focus here isn't on nofree implying nosync. It's noalias which implies nosync. In order for there to be a written (freed) copy of the pointer accessible in the other thread, the noalias fact must be false. If so, that's already UB.
769	Handled in follow up patch.

reames added inline comments. Mar 23 2021, 9:21 AM

llvm/lib/IR/Value.cpp
741	JFYI, the wording in the LangRef on semantics of a dead object (after lifetime end) was recently changed in c821ef4. I objected to the wording at the time, though the patch appears not to have been adjusted. This is why. lifetime intrinsics do not change dereferenceability. The relevant wording from c821ef4 needs removed from the LangRef. It is wrong. Both with current and proposed semantics.

nlopes added inline comments. Mar 23 2021, 9:58 AM

llvm/lib/IR/Value.cpp
741	I agree that doing a store after lifetime_end intrinsic doesn't cause a crash (ATM), but it may write to another object. For example:
	  %p = alloca i8
	  %q = alloca i8
	  lifetime_start %p
	  ...
	  lifetime_end %p
	  lifetime_start %q
	  store i8 1, %q
	  store i8 0, %p ; UB: %p is dead
	  %v = load i8, %q
	What's the value of %v? In practice, it's going to be 0 because %p & %q will end up allocated at the same address. And this is why storing after lifetime_end must be UB.
	So now it depends on how this analysis is used. My concern is that it may be used to move a store past lifetime_end and then we have a miscompilation. Because if a pointer is dereferenceable for the size of the store and it's guaranteed to be alive throughout the function, then we can conclude that loads & stores to that pointer can be moved freely.
	A simple solution is to check if the particular alloca is an argument to a lifetime_end intrinsic. More complex ones require an additional argument like a BB or specific instruction, such that the query becomes: is the object still alive at this BB/instruction?
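
The slot-sharing scenario in the comment above can be imitated with a small standalone C++ model. This is only a sketch under stated assumptions: `ToyFrame`, `assignSlot`, and `replayExample` are hypothetical names invented here, and the rule that two allocas with disjoint lifetimes are placed in one slot is a simplification for illustration, not LLVM's actual stack coloring.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Toy model of a stack frame (hypothetical, for illustration only).
// Allocas are mapped to slots; two allocas may be given the same slot.
struct ToyFrame {
  std::vector<std::uint8_t> Slots;            // backing storage
  std::map<std::string, std::size_t> SlotOf;  // alloca name -> slot index

  // Map an alloca to a slot. Reusing an existing index models slot sharing.
  void assignSlot(const std::string &Name, std::size_t Slot) {
    if (Slot >= Slots.size())
      Slots.resize(Slot + 1, 0);
    SlotOf[Name] = Slot;
  }
  void store(const std::string &Name, std::uint8_t V) {
    Slots[SlotOf.at(Name)] = V;
  }
  std::uint8_t load(const std::string &Name) const {
    return Slots[SlotOf.at(Name)];
  }
};

// Replays the sequence from the comment. The lifetimes of %p and %q are
// disjoint, so this model assumes both end up in slot 0.
inline std::uint8_t replayExample() {
  ToyFrame F;
  F.assignSlot("p", 0);
  F.assignSlot("q", 0);
  F.store("q", 1);
  F.store("p", 0);     // the store made after lifetime_end of %p
  return F.load("q");  // reads back whatever the last store left in slot 0
}
```

In this model the store through the dead `%p` does not fault, but it silently changes what a later load of `%q` observes, which is the distinction the thread is arguing about.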

reames added inline comments. Mar 23 2021, 10:05 AM

llvm/lib/IR/Value.cpp
741	This has turned into a discussion of the lifetime intrinsics. I assert that is unrelated to this change (as the implementation here exactly matches the existing deref implementation for this case). I ask that this discussion be removed from this review thread and taken elsewhere. I'll also note that you are commingling concerns in your discussion. Dereferenceability does NOT imply it's safe to insert a store. That's a higher level property with additional concerns (e.g. concurrency) which need to be satisfied.

nlopes added inline comments. Mar 23 2021, 11:26 AM

llvm/lib/IR/Value.cpp
756	Ok, I confess I'm less familiar with the noalias attribute. Though memcpy's signature is something like: declare void @llvm.memcpy.p0i8.p0i8.i64(i8* noalias nocapture writeonly, i8* noalias nocapture readonly, i64, i1 immarg) #1 The caller of memcpy may have had the src/dst pointers escaped to other threads. Even then, both arguments are marked as noalias because the C spec says the arguments cannot overlap. This contradicts the global definition of noalias you are suggesting.

reames added inline comments. Mar 23 2021, 11:44 AM

llvm/lib/IR/Value.cpp
756	I'm also fairly unfamiliar with noalias. I'm going off existing wording, and what the optimizer appears to actually do. I don't think it actually is contradictory. Your concern involves a writeable copy of the pointer. The wording of noalias talks about access. It doesn't appear to be UB to have a writeable copy of the pointer on another thread (or even within the callee). It does appear to be UB to actually write to it during the execution of the callee. Note that this definition does make noalias argument attributes very hard to infer. I'll note that we only have noalias return inference which might be exactly because of that. :) I'm not sure.

nlopes added inline comments. Mar 23 2021, 12:40 PM

llvm/lib/IR/Value.cpp
741	Unfortunately we need to settle on the lifetime semantics if we want to have code inferring properties about allocas here. BTW, my concern is about movement of load/stores, not of introduction. This `canFree` is called from `getPointerDereferenceableBytes()`, which will infer for how many bytes a pointer is dereferenceable in the whole function. Not just in a particular BB. Hence once the dereferenceable attribute is inferred, "free" movement of memory accesses becomes legal. But it's not, as we have to respect the boundaries of lifetime_start/end. (I'm even ignoring segmented stacks to avoid extra complication). Fundamentally, I think that if we go with a definition of dereferenceability per function, we are always going to have our hands tied. We may need to refine the abstraction and do it per BB, for example. Without knowing what every single user of `getPointerDereferenceableBytes()` does with the result of the analysis, I'm not comfortable declaring an alloca'ed object as always dereferenceable.
756	I see; thanks for the explanation. LLVM doesn't seem to assume that noalias pointers are distinct indeed: https://gcc.godbolt.org/z/KTfnzMMj1 So I'm good on this now :)

nikic added a subscriber: nikic. Mar 23 2021, 1:38 PM

nikic added inline comments.

llvm/lib/IR/Value.cpp
741	Movement of accesses across lifetime intrinsics is already prevented because lifetime intrinsics are modeled as clobbering the location. I really don't think we should get lifetime intrinsics caught up in this if we can avoid it. Requiring context to handle allocas complicates things a lot. Do you have some specific example of a miscompile that would occur if lifetime intrinsics are not modeled here?
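
The point about lifetime intrinsics being modeled as clobbers can be sketched with a toy legality check. Everything here is an assumption made for illustration: `ToyInst`, `clobbers`, and `canMoveAccess` are hypothetical names, and real LLVM passes answer this question through alias analysis rather than a linear scan like this one.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class Kind { Load, Store, LifetimeStart, LifetimeEnd, Other };

// One instruction in a toy basic block (hypothetical representation).
struct ToyInst {
  Kind K;
  std::string Loc;  // memory location touched; empty for Kind::Other
};

// In this model, stores and both lifetime markers clobber their location.
inline bool clobbers(const ToyInst &I, const std::string &Loc) {
  if (I.Loc != Loc)
    return false;
  return I.K == Kind::Store || I.K == Kind::LifetimeStart ||
         I.K == Kind::LifetimeEnd;
}

// Returns true if the access at index From may be moved to index To without
// crossing an instruction that clobbers its location.
// Precondition: From and To are valid indices into Block.
inline bool canMoveAccess(const std::vector<ToyInst> &Block, std::size_t From,
                          std::size_t To) {
  if (From == To)
    return true;
  const std::string &Loc = Block[From].Loc;
  std::size_t Lo = From < To ? From + 1 : To;
  std::size_t Hi = From < To ? To : From - 1;
  for (std::size_t Idx = Lo; Idx <= Hi; ++Idx)
    if (clobbers(Block[Idx], Loc))
      return false;
  return true;
}
```

Under this model a function-wide dereferenceability fact never enters the check: the move is rejected because of the intervening marker, independent of whether the pointer is known dereferenceable.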

nlopes added inline comments. Mar 24 2021, 4:38 AM

llvm/lib/IR/Value.cpp
741	That's a good point, thanks! Since the intrinsics clobber the objects then movement is already _likely_ prevented. I tried and couldn't trigger a miscompilation by hand in ~15 min of fiddling with IR. I'm still uncomfortable declaring allocas dereferenceable in the whole function when they are not. Who knows what the users of `getPointerDereferenceableBytes` will do with this information (now or in the future). We don't have documentation about these analyses. Anyway, to unblock this, I'll just suggest a comment is added to `getPointerDereferenceableBytes` regarding allocas. Meaning, just because an alloca'ed pointer isn't escaped and is "dereferenceable" for x bytes, it doesn't mean we can introduce a store to it (as one would suppose!). One needs to consider lifetime as well.
756	Just re-read LangRef a few times again regarding noalias (https://llvm.org/docs/LangRef.html#parameter-attributes). I can't interpret what's written there as anything related with pointers not being writable in another thread. It just says that if I access object o1, and that object is pointed to by a noalias argument %p, then object o1 can't be accessed through any other pointer that is not based on %p. There's no word about concurrent accesses. So either this code is incorrect or LangRef needs updating.

reames added inline comments. Mar 24 2021, 9:12 AM

llvm/lib/IR/Value.cpp
741	Nuno, I will not put an incorrect comment into the source code to reflect a mistaken understanding of the code. As explained by both myself and Nikita previously, you are commingling concerns of dereferenceability, concurrency, and memory aliasing. I have already asked you to move the discussion of lifetime intrinsics elsewhere. As I asserted before, the code as written here matches the existing implementation it's replacing. If you want to change that implementation or the semantics thereof, feel free to put forth proposals. I will not. I am frustrated by this sub-thread. TBH IMO, it seems like you are not discussing the issue in this patch in good faith. It seems like you're trying to insist I broaden the scope of work substantially. That bothers me.
756	I'm going to move this inference rule to its own review. We probably need input from Johannes or someone more knowledgeable on noalias than myself. I want to unblock the rest of the review.

nlopes resigned from this revision. Mar 24 2021, 9:24 AM

nlopes added inline comments.

llvm/lib/IR/Value.cpp
741	The whole point of your RFC & patches was to fix bugs in LLVM, not to match the implementation it is replacing AFAIU. Anyway, since apparently I'm acting in bad faith regarding this patch, I'll step down immediately.

This revision now requires review to proceed. Mar 24 2021, 9:24 AM

dexonsmith removed a subscriber: dexonsmith. Mar 24 2021, 12:11 PM

Strip out the argument attribute cases for now. I'll move them to their own review since they seem to need more discussion.

Can I get an LGTM on this? This is blocking a bunch of follow up patches.

nikic added inline comments. Mar 24 2021, 3:22 PM

llvm/lib/IR/Value.cpp
741	GlobalValue is a subtype of Constant, no need to check it separately.
746	nit: it's -> its
750	I don't really get why this code is specific to arguments. Shouldn't this hold for any SSA value inside the function? Sure, it can get freed after the function returns/unwinds, but not while the SSA value is active. Am I missing something here?

You might want to split out the GC part as well, or at least have someone else look at that -- I don't know anything about it.

I looked at the GC part and it looks good to me.

Address review comments, and make a stylistic adjustment to make it really clear that this patch doesn't affect alloca handling at all.

LGTM

llvm/lib/IR/Value.cpp
738	Probably doesn't make sense to keep this as a bullet point list if there's only the one element :)

This revision is now accepted and ready to land. Mar 24 2021, 4:10 PM

This revision was landed with ongoing or failed builds. Mar 24 2021, 4:21 PM

Closed by commit rG4054b8322fd8: [deref] Implement initial set of inference rules for deref-at-point (authored by reames).

This revision was automatically updated to reflect the committed changes.

reames added a commit: rG4054b8322fd8: [deref] Implement initial set of inference rules for deref-at-point.

aqjune mentioned this in D99303: Reword lifetime description to avoid contradicting long term implementation. Mar 24 2021, 9:13 PM

Harbormaster completed remote builds in B95594: Diff 333144. Mar 24 2021, 11:00 PM

Harbormaster completed remote builds in B95602: Diff 333163. Mar 24 2021, 11:34 PM

reames added inline comments. Mar 25 2021, 11:56 AM

llvm/lib/IR/Value.cpp
741	To close the loop here for third parties reading along, Nuno and I had an offline conversation on this sub-point today. This discussion did trigger a clarification response to the original RFC to be much more specific about the scope of the proposal. That seems to have been the original confusion which led us down this path of miscommunication. The conversation also triggered me to change my handling of a related discussion thread about changes which recently changed the semantics of lifetime intrinsics.

mkazantsev mentioned this in D99926: [GVN] Introduce loop load PRE. Apr 15 2021, 3:33 AM

mkazantsev mentioned this in D100551: [deref] No need to check nosync in addition to nofree. Apr 15 2021, 5:07 AM

mkazantsev mentioned this in D100559: [GC][NFC] Make getGCStrategy by name available in IR. Apr 15 2021, 6:52 AM

mkazantsev mentioned this in rGc5b63714b520: [GC][NFC] Make getGCStrategy by name available in IR. Aug 2 2021, 12:26 AM

Revision Contents

Path	Size
llvm/lib/IR/Value.cpp	61 lines
llvm/test/Analysis/ValueTracking/deref-abstract-gc.ll	3 lines
llvm/test/Analysis/ValueTracking/memory-dereferenceable.ll	6 lines

Diff 333172

llvm/lib/IR/Value.cpp

[722 lines not shown; context: const Value *Value::stripAndAccumulateConstantOffsets(]
   return V;
 }

 const Value *
 Value::stripInBoundsOffsets(function_ref<void(const Value *)> Func) const {
   return stripPointerCastsAndOffsets<PSK_InBounds>(this, Func);
 }

+// Return true if the memory object referred to by V can by freed in the scope
+// for which the SSA value defining the allocation is statically defined. E.g.
+// deallocation after the static scope of a value does not count.
+static bool canBeFreed(const Value *V) {
+  assert(V->getType()->isPointerTy());
+
+  // Cases that can simply never be deallocated
+  // *) Constants aren't allocated per se, thus not deallocated either.
+  if (isa<Constant>(V))
+    return false;
+
+  const Function *F = nullptr;
+  if (auto *I = dyn_cast<Instruction>(V))
+    F = I->getFunction();
+  if (auto *A = dyn_cast<Argument>(V))
+    F = A->getParent();
+
+  if (!F)
+    return true;
+
+  // A pointer to an object in a function which neither frees, nor can arrange
+  // for another thread to free on its behalf, can not be freed in the scope
+  // of the function.
+  if (F->doesNotFreeMemory() && F->hasNoSync())
+    return false;
+
+  // With garbage collection, deallocation typically occurs solely at or after
+  // safepoints. If we're compiling for a collector which uses the
+  // gc.statepoint infrastructure, safepoints aren't explicitly present
+  // in the IR until after lowering from abstract to physical machine model.
+  // The collector could chose to mix explicit deallocation and gc'd objects
+  // which is why we need the explicit opt in on a per collector basis.
+  if (!F->hasGC())
+    return true;
+
+  const auto &GCName = F->getGC();
+  const StringRef StatepointExampleName("statepoint-example");
+  if (GCName != StatepointExampleName)
+    return true;
+
+  auto *PT = cast<PointerType>(V->getType());
+  if (PT->getAddressSpace() != 1)
+    // For the sake of this example GC, we arbitrarily pick addrspace(1) as our
+    // GC managed heap. This must match the same check in
+    // RewriteStatepointsForGC (and probably needs better factored.)
+    return true;
+
+  // It is cheaper to scan for a declaration than to scan for a use in this
+  // function. Note that gc.statepoint is a type overloaded function so the
+  // usual trick of requesting declaration of the intrinsic from the module
+  // doesn't work.
+  for (auto &Fn : *F->getParent())
+    if (Fn.getIntrinsicID() == Intrinsic::experimental_gc_statepoint)
+      return true;
+  return false;
+}
+
+
 uint64_t Value::getPointerDereferenceableBytes(const DataLayout &DL,
                                                bool &CanBeNull,
                                                bool &CanBeFreed) const {
   assert(getType()->isPointerTy() && "must be pointer");

   uint64_t DerefBytes = 0;
   CanBeNull = false;
-  CanBeFreed = UseDerefAtPointSemantics;
+  CanBeFreed = UseDerefAtPointSemantics && canBeFreed(this);
   if (const Argument *A = dyn_cast<Argument>(this)) {
     DerefBytes = A->getDereferenceableBytes();
     if (DerefBytes == 0) {
       // Handle byval/byref/inalloca/preallocated arguments
       if (Type *ArgMemTy = A->getPointeeInMemoryValueType()) {
         if (ArgMemTy->isSized()) {
           // FIXME: Why isn't this the type alloc size?
           DerefBytes = DL.getTypeStoreSize(ArgMemTy).getKnownMinSize();
[46 lines not shown; context: if (!AI->isArrayAllocation()) {]
       CanBeFreed = false;
     }
   } else if (auto *GV = dyn_cast<GlobalVariable>(this)) {
     if (GV->getValueType()->isSized() && !GV->hasExternalWeakLinkage()) {
       // TODO: Don't outright reject hasExternalWeakLinkage but set the
       // CanBeNull flag.
       DerefBytes = DL.getTypeStoreSize(GV->getValueType()).getFixedSize();
       CanBeNull = false;
-      CanBeFreed = false;
     }
   }
   return DerefBytes;
 }

 Align Value::getPointerAlignment(const DataLayout &DL) const {
   assert(getType()->isPointerTy() && "must be pointer");
   if (auto *GO = dyn_cast<GlobalObject>(this)) {
[308 lines not shown]
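
The order of checks in `canBeFreed` from the diff above can be restated as a standalone model that compiles without LLVM headers. This is a sketch, not the committed code: `ToyValue`, `ToyFunction`, and `toyCanBeFreed` are hypothetical stand-ins, and the boolean fields replace the real attribute and module queries (`doesNotFreeMemory()`, `hasNoSync()`, the scan for `gc.statepoint` declarations).

```cpp
#include <cassert>
#include <string>

enum class ValueKind { Constant, Instruction, Argument, Other };

// Hypothetical stand-in for the function-level facts the rules consult.
struct ToyFunction {
  bool NoFree = false;                    // models F->doesNotFreeMemory()
  bool NoSync = false;                    // models F->hasNoSync()
  std::string GC;                         // empty models !F->hasGC()
  bool ModuleDeclaresStatepoint = false;  // models the module scan
};

// Hypothetical stand-in for a pointer-typed value.
struct ToyValue {
  ValueKind Kind = ValueKind::Other;
  const ToyFunction *Parent = nullptr;  // enclosing function, if any
  unsigned AddressSpace = 0;
};

// Mirrors the decision order of the diff: constants, then function
// attributes, then the opt-in rule for the "statepoint-example" collector.
inline bool toyCanBeFreed(const ToyValue &V) {
  if (V.Kind == ValueKind::Constant)
    return false;
  const ToyFunction *F = nullptr;
  if (V.Kind == ValueKind::Instruction || V.Kind == ValueKind::Argument)
    F = V.Parent;
  if (!F)
    return true;  // no enclosing function known: stay conservative
  if (F->NoFree && F->NoSync)
    return false;
  if (F->GC != "statepoint-example")
    return true;
  if (V.AddressSpace != 1)
    return true;  // only addrspace(1) is treated as the managed heap
  return F->ModuleDeclaresStatepoint;
}
```

Every path that cannot prove one of the rules returns true, so the model, like the diff, only ever strengthens the result when a rule positively applies.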

llvm/test/Analysis/ValueTracking/deref-abstract-gc.ll

 ; RUN: opt -print-memderefs -analyze -S < %s -enable-new-pm=0 -use-dereferenceable-at-point-semantics=1 | FileCheck %s --check-prefixes=CHECK
 ; RUN: opt -passes=print-memderefs -S < %s -disable-output -use-dereferenceable-at-point-semantics=1 2>&1 | FileCheck %s --check-prefixes=CHECK

 target datalayout = "e-i32:32:64"

 ; For the abstract machine model (before RS4GC), gc managed objects
 ; conceptually live forever. But there may be non-managed objects which are
 ; freed.
 ; CHECK-LABEL: 'abstract_model'
-; CHECK-NOT: %gc_ptr
+; CHECK: %gc_ptr
 ; CHECK-NOT: %other_ptr
 ; FIXME: Can infer the gc pointer case
 define void @abstract_model(i32 addrspace(1)* dereferenceable(8) %gc_ptr,
                             i32* dereferenceable(8) %other_ptr)
     gc "statepoint-example" {
-; CHECK: The following are dereferenceable:
 entry:
   call void @mayfree()
   load i32, i32 addrspace(1)* %gc_ptr
   load i32, i32* %other_ptr
   ret void
 }

 ; Can free any object accessible in memory
 declare void @mayfree()

llvm/test/Analysis/ValueTracking/memory-dereferenceable.ll

[239 lines not shown]
 }


 ; The most basic case showing the difference between legacy global deref
 ; attribute semantics and the new point-in-time semantics.
 ; CHECK-LABEL: 'negative'
 ; GLOBAL: %p
 ; POINT-NOT: %p
-define void @negative(i32* dereferenceable(8) %p) nofree nosync {
+define void @negative(i32* dereferenceable(8) %p) {
   call void @mayfree()
   %v = load i32, i32* %p
   ret void
 }

 ; CHECK-LABEL: 'infer_func_attrs1'
-; GLOBAL: %p
-; POINT-NOT: %p
-; FIXME: Can be inferred from attributes
+; CHECK: %p
 define void @infer_func_attrs1(i32* dereferenceable(8) %p) nofree nosync {
   call void @mayfree()
   %v = load i32, i32* %p
   ret void
 }

 ; CHECK-LABEL: 'infer_func_attrs2'
 ; GLOBAL: %p
[57 lines not shown]