This is an archive of the discontinued LLVM Phabricator instance.

llvm/lib/Transforms/IPO/Attributor.cpp
616	Isn't this almost the `stripAndAccumulate` code? How hard would it be to pass some kind of `AttributorInfo` object to that function (optionally) which triggers the lookup. Or more generically, a callback that takes a `Value` and returns a lower bound offset (plus indicates success/failure).
llvm/test/Transforms/Attributor/dereferenceable-1.ll
234	Isn't this the `fill_range` code?
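
The callback idea in the first comment can be sketched in plain C++ without any LLVM types. This is only an illustration under stated assumptions: `Index`, `LowerBoundFn`, and `accumulateOffset` are hypothetical names, indices are modeled as 64-bit integers rather than `APInt`, and overflow is ignored here.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <optional>
#include <vector>

// Hypothetical stand-in for one GEP index: a constant or an opaque value id.
struct Index {
  bool IsConstant;
  int64_t Constant; // Meaningful only when IsConstant is true.
  int ValueId;      // Meaningful only when IsConstant is false.
  int64_t ElemSize; // Size in bytes of the indexed element type.
};

// Callback that takes a value id and returns a lower bound for it,
// or std::nullopt to signal failure.
using LowerBoundFn = std::function<std::optional<int64_t>(int)>;

// Accumulates a byte offset. Non-constant indices are resolved through the
// optional callback; without a callback they make the function fail.
bool accumulateOffset(const std::vector<Index> &Indices, int64_t &Offset,
                      const LowerBoundFn &ExternalAnalysis = nullptr) {
  for (const Index &Idx : Indices) {
    assert(Idx.ElemSize > 0 && "element size must be positive");
    int64_t Value;
    if (Idx.IsConstant) {
      Value = Idx.Constant;
    } else {
      if (!ExternalAnalysis)
        return false;
      std::optional<int64_t> LowerBound = ExternalAnalysis(Idx.ValueId);
      if (!LowerBound)
        return false;
      Value = *LowerBound;
    }
    Offset += Value * Idx.ElemSize;
  }
  return true;
}
```

With this shape, callers that pass no callback keep the constant-only behaviour, while a caller such as the Attributor could pass a lambda that looks up a lower bound for the value.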

Thank you for working on this!!

As a high-level comment, please upload the diff with full context (`diff -U99999`).

llvm/lib/Transforms/IPO/Attributor.cpp
511	Add a comment here.
llvm/test/Transforms/Attributor/dereferenceable-1.ll
214	I think optimally `%ptr` is dereferenceable(30). Please add a FIXME here.

uenoku added a reviewer: baziotis. Mar 15 2020, 10:15 PM

Made the changes @jdoerfert asked for.

@jdoerfert

This patch is not complete yet.

Currently it can only use the known information. I would like to make it possible for it to use the assumed information as well.
The problem with that is:

We need to iterate over the uses and take the maximum of the dereferenceability indicated by each use.
This works fine for the known information. But since the lower bound of the assumed constant range can decrease over time,
taking the maximum with the existing state is a problem.

The way that AAFromMustBeExecutedContext is wired makes things a bit tricky, and
I would like to get your opinion on this before I proceed.
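
To make the concern concrete, here is a minimal sketch that models the problem outside of LLVM. The function names and the per-iteration vectors are hypothetical. The model assumes that a known lower bound only ever grows, while an assumed lower bound is optimistic and may shrink as the fixpoint iteration proceeds.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Each entry is the lower bound of the index reported in one fixpoint
// iteration. The access is modeled as a one-byte store to ptr[LowerBound].

// Sticky accumulation: once a value is merged with max, it is never retracted.
// This is sound only if no reported lower bound ever has to be taken back.
int64_t stickyMaxDerefBytes(const std::vector<int64_t> &LowerBounds) {
  int64_t Deref = 0;
  for (int64_t LowerBound : LowerBounds)
    Deref = std::max(Deref, LowerBound + 1);
  return Deref;
}

// Recomputing from the most recent lower bound only.
int64_t recomputedDerefBytes(const std::vector<int64_t> &LowerBounds) {
  assert(!LowerBounds.empty() && "need at least one iteration");
  return LowerBounds.back() + 1;
}
```

For a shrinking sequence such as 20, 15, 10 the sticky result stays at 21 bytes although only 11 are justified at the end, whereas for a growing sequence both functions agree.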

bbn added a subscriber: bbn. Mar 19 2020, 6:53 AM

Can you explain why we need to catch overflows now but not before? I mean, the values determined by the external analysis are valid lower bounds; how is that different from versioning on them and making them constant in one of the versions?

I think we should finish this one and then do the following:

We actually don't want the minimal offset but the maximal known offset, right? In addition to renaming the function we will check if the range from AAConstantRange is known, if so, we take the maximum instead.

llvm/include/llvm/IR/Operator.h
563 ↗	(On Diff #251274)	`= nullptr` is not sufficient? unfortunate. Nit: -`value` +`Offset`, or remove the names. Please describe what external analysis exactly does here. I like the solution of adding this callback though. We need more of these soon.
llvm/lib/IR/Value.cpp
644 ↗	(On Diff #251274)	Put the entire thing in an `if (ExternalAnalysis)` please. Or better: `!ExternalAnalysis`, then the old code, else this code. TBH, I'm not sure why we don't bail on overflow in the old code as well; maybe we should. Can you check if that would make any test fail?
llvm/lib/Transforms/IPO/Attributor.cpp
552	Style: `UseAssumed`, `Range`, `Value`, etc.

kuter marked 3 inline comments as done. Mar 19 2020, 9:36 PM

kuter added inline comments.

llvm/test/Transforms/Attributor/dereferenceable-1.ll
214	OK, I will. For us to be able to deduce a 30, there would have to be a separate mechanism that deduces the biggest number that the "Value" in question is guaranteed to take. Perhaps we could use SCEV?
234	No it is not. Currently this patch does not work with for loops. This is because with for loops AAFromMustBeExecutedContext looks at the branch at the top; calls the followUse() on both of the successors and it "and"s them together. This is probably the reason why only a single test is affected by this patch. I will address this issue with a separate patch that passes LoopInfo, DominatorTree and the PostDominatorTree to the MustBeExecutedContextExplorer.

In D76208#1932759, @jdoerfert wrote:

Can you explain why we need to catch overflows now but not before? I mean, the values determined by the external analysis are valid lower bounds; how is that different from versioning on them and making them constant in one of the versions?

I think we should finish this one and then do the following:

We actually don't want the minimal offset but the maximal known offset, right? In addition to renaming the function we will check if the range from AAConstantRange is known, if so, we take the maximum instead.

static void test(int index, char *ptrArg) { // internal linkage
  ptrArg[index] = 'A';
}

test(10, ptrA);
test(30, ptrB);
test(40, ptrC);

Based on the call sites, the index argument of the test function would be in the range [10, 40], right?
But marking ptrArg dereferenceable(41) would be wrong, wouldn't it?

The reason I made stripAndAccumulateConstantOffsets overflow-aware is that the range can be lower than what the value can be in reality.
I thought that these differences could result in unintended underflows.

For accumulateConstantOffset, I think overflows should be OK for non-inbounds GEPs,
but for stripAndAccumulateConstantOffsets, bailing out should probably be the default behaviour.
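
The call-site example can be worked through with a small sketch. It is a simplified model, not the patch's code: `Range`, `rangeOfCallSites`, `safeDerefBytes`, and `unsoundDerefBytes` are hypothetical names, and the access is assumed to go through an inbounds index so that everything below the accessed byte is dereferenceable as well.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

struct Range {
  int64_t Min;
  int64_t Max;
};

// Smallest range that contains every index passed at a call site.
Range rangeOfCallSites(const std::vector<int64_t> &Args) {
  assert(!Args.empty() && "need at least one call site");
  return Range{*std::min_element(Args.begin(), Args.end()),
               *std::max_element(Args.begin(), Args.end())};
}

// Every execution accesses at least index Min, so Min + AccessSize bytes are
// dereferenceable on all paths. A negative lower bound proves nothing.
int64_t safeDerefBytes(Range R, int64_t AccessSize) {
  return R.Min < 0 ? 0 : R.Min + AccessSize;
}

// Using the upper bound is wrong: only some callers reach it.
int64_t unsoundDerefBytes(Range R, int64_t AccessSize) {
  return R.Max + AccessSize;
}
```

For the three calls above this yields the range [10, 40], a safe answer of 11 bytes, and the unsound answer of 41 bytes that the comment warns about.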

In D76208#1932819, @kuter wrote:

static void test(int index, char *ptrArg) { // internal linkage
  ptrArg[index] = 'A';
}

test(10, ptrA);
test(30, ptrB);
test(40, ptrC);

Based on the call sites, the index argument of the test function would be in the range [10, 40], right?
But marking ptrArg dereferenceable(41) would be wrong, wouldn't it?

Correct. My first thought was: we need something special for the loop case so that we know we reach the upper bound. After all, that is probably the most important case to get.

The reason I made stripAndAccumulateConstantOffsets overflow-aware is that the range can be lower than what the value can be in reality.
I thought that these differences could result in unintended underflows.

I see. Agreed.

For accumulateConstantOffset, I think overflows should be OK for non-inbounds GEPs,
but for stripAndAccumulateConstantOffsets, bailing out should probably be the default behaviour.

OK.

llvm/test/Transforms/Attributor/dereferenceable-1.ll
234	Right. That reminds me of D64974, though I'm unsure if we still need it. What we need is to make the explorer aware of `CanProveNotTakenFirstIteration`. Similar to the existing use of that function, we can go to the non-exit block from a header in `getMustBeExecutedNextInstruction` (or `findForwardJoinPoint`) if we know that edge is taken at least once. To not lose the code after the loop, we should add a stack of unexplored edges. The exit edge goes there if the loop is known not to be endless (see `findForwardJoinPoint`). If we are out of forward instructions, we can pop an edge from the stack and continue. Alternatively, we could check whether the loop was not endless when we visit the header for the second time.

Fixed styling, added FIXME, separated new code.

kuter marked 2 inline comments as done. Mar 20 2020, 10:13 PM

kuter marked an inline comment as done. Mar 20 2020, 10:16 PM

uenoku added inline comments. Mar 21 2020, 4:19 AM

llvm/test/Transforms/Attributor/willreturn.ll
364	Why can't we get dereferenceable(4) for `p` in this case?

kuter marked an inline comment as done. Mar 21 2020, 5:40 AM

kuter added inline comments.

llvm/test/Transforms/Attributor/willreturn.ll
364	OK, for this example specifically, the variable `n` can be negative, so `ans += p[n]` may never be executed. But in general, this patch doesn't work with loops that are not `do {} while (cond)`. This is because `AAFromMustBeExecutedContext` is not aware of when the first iteration is always going to be run. This needs to be addressed with a separate patch and would probably improve many other deductions as well.

baziotis added inline comments. Mar 21 2020, 6:13 AM

llvm/test/Transforms/Attributor/willreturn.ll
364	IIUC, loop rotation can help here because it provides this guarantee.

uenoku added inline comments. Mar 21 2020, 7:32 AM

llvm/test/Transforms/Attributor/willreturn.ll
364	Oh, I thought `%n` is assumed to be positive:) Thanks.
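
The difference between the two loop shapes discussed in this thread can be illustrated at source level. This is a hand-written sketch with hypothetical function names, not output of LLVM's loop rotation pass: in the `for` form the body sits behind the header check, while in the rotated form the body is reached unconditionally once the guard has passed.

```cpp
#include <cassert>

// for-style loop: the header test runs first, so the body (and the load from
// P[I]) is not guaranteed to execute when the function is entered.
int sumFor(const int *P, int N) {
  assert((P || N <= 0) && "P may be null only if nothing is read");
  int Ans = 0;
  for (int I = 0; I < N; ++I)
    Ans += P[I];
  return Ans;
}

// Rotated form: a guard followed by do {} while (cond). Inside the guarded
// region the first iteration always runs, so the load from P[0] is executed
// whenever the guard is passed.
int sumRotated(const int *P, int N) {
  assert((P || N <= 0) && "P may be null only if nothing is read");
  int Ans = 0;
  if (N > 0) {
    int I = 0;
    do {
      Ans += P[I];
      ++I;
    } while (I < N);
  }
  return Ans;
}
```

Both functions compute the same result for every `N`; only the second exposes a point from which the first access is known to execute.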

uenoku added inline comments. Mar 21 2020, 7:46 AM

llvm/test/Transforms/Attributor/dereferenceable-1.ll
214	> For us to be able to deduce a 30, there would have to be a separate mechanism that deduces the biggest number that the "Value" in question is guaranteed to take.
Agreed.
> Perhaps we could use SCEV?
I think so. What I had in mind is this: assuming a loop is in the context and is guaranteed to proceed to its last iteration, we can use the largest value the index takes. For an inbounds GEP, we may take that number as the dereferenceable bytes; for a non-inbounds GEP, if the SCEV of the value is something like <0, +, 1>, we can also take the largest value as the dereferenceable bytes.

This is because AAFromMustBeExecutedContext is not aware of when the first iteration is always going to be run.
This needs to be addressed with a separate patch and would probably improve many other deductions as well.

Agreed and agreed.

Perhaps we could use SCEV ?

[...]

We could. What we want (regardless of how) is to track in AAConstantRange (conceptually) two ranges: (1) the potential value range, (2) the known value range. This is not the same as the assumed/known we track now. Right now both track the potential value range, that is, what value this llvm::Value can potentially have at runtime. I think tracking the second, that is, what value this llvm::Value is going to have at runtime (if it is executed), could be done in the same AA or a different one, depending on how much logic we can share. This is also a follow-up patch (or two).

I'll have to go through the logic in this patch tomorrow (=rested) but I think it looks pretty good.

In D76208#1936117, @jdoerfert wrote:

We could. What we want (regardless of how) is to track in AAConstantRange (conceptually) two ranges: (1) the potential value range, (2) the known value range. This is not the same as the assumed/known we track now. Right now both track the potential value range, that is, what value this llvm::Value can potentially have at runtime. I think tracking the second, that is, what value this llvm::Value is going to have at runtime (if it is executed), could be done in the same AA or a different one, depending on how much logic we can share. This is also a follow-up patch (or two).

Agreed.

Currently AAConstantRange tracks the range that an llvm::Value is guaranteed to be within at runtime.
But for loops, this results in information loss.

So I think what you suggest is that we track a specific range for loop-like situations.
For an actual loop, this would be the first and the last value of the counter.

When we propagate this range through PHI nodes, select instructions, and call sites (of internal functions), we would intersect the ranges.

For example:

static void test(char *ptr, int i); // internal linkage

void test_use(char *ptrA, char *ptrB) {
  for (int i = 10; i < 100; i++) {
    test(ptrA, i); // i is in the loop range [10, 99].
  }
  for (int i = 5; i < 150; i++) {
    test(ptrB, i); // i is in the loop range [5, 149].
  }
}

// i is in the loop range [10, 99] (the intersection of both call sites).
// ptr is dereferenceable(100).
static void test(char *ptr, int i) {
  ptr[i] = 'A';
}

We should finish this review and then focus on the next steps. I added more comments but I think the general logic is fine.

(The next step would be to rename the range in AAConstantRange to PossibleRange, or similar, and add a second range, e.g., ObservedRange, which we need to deduce. We can then use max(PossibleRange.minimum(), ObservedRange.maximum()) for dereferenceable deduction.)
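
As a rough illustration of that follow-up idea, the sketch below combines the two ranges for the `test_use` example from this thread. Everything here is an assumption made for illustration: the `Range` struct, `intersect`, and `derefBytes` are hypothetical, and the observed range is simply taken as given rather than deduced.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>

struct Range {
  int64_t Min;
  int64_t Max;
};

// Observed ranges from different call sites are intersected: only indices
// that every caller is known to reach survive.
std::optional<Range> intersect(Range A, Range B) {
  int64_t Lo = std::max(A.Min, B.Min);
  int64_t Hi = std::min(A.Max, B.Max);
  if (Lo > Hi)
    return std::nullopt;
  return Range{Lo, Hi};
}

// max(PossibleRange.minimum(), ObservedRange.maximum()) picks the largest
// index that is certain to be accessed; the bytes up to and including that
// element are then dereferenceable.
int64_t derefBytes(Range Possible, std::optional<Range> Observed,
                   int64_t ElemSize) {
  assert(ElemSize > 0 && "element size must be positive");
  int64_t Idx = Possible.Min;
  if (Observed)
    Idx = std::max(Idx, Observed->Max);
  return Idx < 0 ? 0 : (Idx + 1) * ElemSize;
}
```

With observed ranges [10, 99] and [5, 149], the intersection is [10, 99] and a one-byte element gives 100 dereferenceable bytes, matching the comment in the example; without an observed range only the possible minimum 5 can be used.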

llvm/lib/IR/Operator.cpp
79 ↗	(On Diff #251822)	Style: I think it would be easier to read if you split the above if-else cascade and use early exits:
    MinimalIndex = ConstOffset->getValue();
    continue;
  }
  // The operand is not constant, check if an external analysis was provided.
  if (!ExternalAnalysis)
    return false;
Do we need to track overflow before we use an external analysis? If not, we don't need 75/76. Below, we can then use `UsedExternalAnalysis` in 88 and remove it from 95. We should also be able to use a single overflow flag.
99 ↗	(On Diff #251822)	Put the simple case first, maybe add a continue to avoid the "else"
llvm/lib/IR/Value.cpp
649 ↗	(On Diff #251822)	Simple case first please.
llvm/lib/Transforms/IPO/Attributor.cpp
552	Some names are still starting with a lower case letter. Do we have a test where the assumed range minimum is negative?
3749	Merge in a single debug message and line.
3803	No need for the newline in the beginning. TBH, the entire message doesn't help much given that we see the update debug message already.

Eliminated redundant debug messages, style fixes, added a negative test case.
Don't use the external analysis if the GEP operand indexes a struct type.

kuter marked an inline comment as done. Mar 25 2020, 8:22 AM

kuter added inline comments.

llvm/lib/IR/Operator.cpp
79 ↗	(On Diff #251822)	We probably don't need to detect an overflow before the use of the external analysis; I was just being safe. I am not sure what you mean by the early-exit change. We need to detect overflows that happen after we use ExternalAnalysis, even if they were not caused by the value that ExternalAnalysis returned. So the value of the ConstantOffset needs to pass through something that detects an overflow if ExternalAnalysis is present and has been used. We could do it with a lambda like this:
    AddArrayIndex(ConstOffset->getValue());
    continue;
  }
Also, do we really need to have the old and the new code in this style:
  if (!ExternalAnalysis) {
    // old
  } else {
    // new
  }
The new code shouldn't really behave any differently, other than detecting overflows/underflows.

kuter marked 2 inline comments as done. Mar 25 2020, 9:59 AM

jdoerfert added inline comments. Mar 25 2020, 10:10 AM

llvm/lib/IR/Operator.cpp
79 ↗	(On Diff #251822)	We want to check `UsedExternal`, not `ExternalAnalysis`, to determine whether an overflow check is necessary. We should create the lambda you mentioned. Don't track the overflow outside of the lambda. In the lambda, check whether `UsedExternal` is set; only then do we track and act on overflows.
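
The lambda discussed here could look roughly like the following sketch. It models offsets as `int64_t` instead of `APInt`, uses the GCC/Clang overflow builtins in place of `smul_ov`/`sadd_ov`, and the names `Index`, `ExternalFn`, and `accumulateOffsetChecked` are hypothetical. The point it illustrates is that overflow is only tracked once a value from the external analysis has entered the sum.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <optional>
#include <vector>

// Hypothetical stand-in for one array-like GEP index.
struct Index {
  std::optional<int64_t> Constant; // Set for constant indices.
  int ValueId;                     // Used only for non-constant indices.
  int64_t ElemSize;                // Size in bytes of the indexed type.
};

using ExternalFn = std::function<std::optional<int64_t>(int)>;

bool accumulateOffsetChecked(const std::vector<Index> &Indices,
                             int64_t &Offset,
                             const ExternalFn &ExternalAnalysis = nullptr) {
  bool UsedExternal = false;

  // Adds Value * ElemSize to Offset. Overflow is checked only after the
  // external analysis has been used; before that, the old wrapping
  // behaviour is kept.
  auto AddArrayIndex = [&](int64_t Value, int64_t ElemSize) -> bool {
    if (!UsedExternal) {
      Offset = static_cast<int64_t>(static_cast<uint64_t>(Offset) +
                                    static_cast<uint64_t>(Value) *
                                        static_cast<uint64_t>(ElemSize));
      return true;
    }
    int64_t Scaled, Sum;
    if (__builtin_mul_overflow(Value, ElemSize, &Scaled) ||
        __builtin_add_overflow(Offset, Scaled, &Sum))
      return false;
    Offset = Sum;
    return true;
  };

  for (const Index &Idx : Indices) {
    if (Idx.Constant) {
      if (!AddArrayIndex(*Idx.Constant, Idx.ElemSize))
        return false;
      continue;
    }
    // The operand is not constant; an external analysis is required.
    if (!ExternalAnalysis)
      return false;
    std::optional<int64_t> Value = ExternalAnalysis(Idx.ValueId);
    if (!Value)
      return false;
    UsedExternal = true;
    if (!AddArrayIndex(*Value, Idx.ElemSize))
      return false;
  }
  return true;
}
```

Keeping the flag and the overflow handling inside one lambda means the loop body only has early exits, which is the shape asked for in the earlier style comment.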

Simplify accumulateConstantOffset

kuter marked 3 inline comments as done. Mar 25 2020, 2:15 PM

kuter marked an inline comment as done. Mar 26 2020, 6:32 AM

Two more minor comments, other than that the code looks good. Please update so I can commit it.

llvm/lib/IR/Operator.cpp
60 ↗	(On Diff #252668)	Nit: Don't call it `MinimalIndex` as there might be use cases that over-approximate the number. Index is just fine I think.
llvm/lib/Transforms/IPO/Attributor.cpp
566	Nit: `AllowNonInbounds` and `Value` above. Check other variable names as well.

This revision is now accepted and ready to land. Mar 26 2020, 9:35 AM

kuter updated this revision to Diff 253022. Mar 26 2020, 5:28 PM

Apologies for the delay, can you rebase this and provide me with "Firstname Lastname <email>" from you so I can attribute it to you?

Rebased.
Small logic change in GEPOperator::accumulateConstantOffset to bail out on scalable vector types,
except when the offset is zero.
Not allowing zero breaks @test_accumulate_constant_offset_vscale_zero.

see https://reviews.llvm.org/rGef64ba831194c7deac8882a325ea9bea64eb612a

In D76208#1956335, @jdoerfert wrote:

Apologies for the delay, can you rebase this and provide me with "Firstname Lastname <email>" from you so I can attribute it to you?

name, surname: Kuter Dinel
email: kuterdinel@gmail.com

Closed by commit rGe57807769b5c: [Attributor] Use AAValueConstantRange to infer dereferencability. (authored by kuter, committed by jdoerfert). May 13 2020, 3:17 PM

This revision was automatically updated to reflect the committed changes.

Herald added a subscriber: llvm-commits. May 13 2020, 3:17 PM

jdoerfert mentioned this in rG6045a804b94b: [Attributor] Check lines accidentally not committed with D76208. May 13 2020, 4:26 PM

Revision Contents

Path	Size
llvm/lib/Transforms/IPO/Attributor.cpp	154 lines
llvm/test/Transforms/Attributor/dereferenceable-1.ll	32 lines
llvm/test/Transforms/Attributor/willreturn.ll	2 lines

Diff 250458

llvm/lib/Transforms/IPO/Attributor.cpp

Context not available.
	llvm_unreachable("Expected enum or string attribute!");	llvm_unreachable("Expected enum or string attribute!");
	}	}

bool accumulateMinimalOffset(Attributor &A, const AbstractAttribute &QueryingAA,
                             const GEPOperator *gep, const DataLayout &DL,
                             APInt &Offset) {
  assert(Offset.getBitWidth() ==
             DL.getIndexSizeInBits(gep->getPointerAddressSpace()) &&
         "The offset bit width does not match DL specification.");

  for (gep_type_iterator GTI = gep_type_begin(gep), GTE = gep_type_end(gep);
       GTI != GTE; ++GTI) {
    Value *V = GTI.getOperand();
    APInt MinimalIndex;
    // Handle ConstantInt if possible.
    if (auto ConstOffset = dyn_cast<ConstantInt>(V)) {
      if (ConstOffset->isZero())
        continue;

      // Handle a struct index, which adds its field offset to the pointer.
      // For array or vector indices, scale the index by the size of the type.
      if (StructType *STy = GTI.getStructTypeOrNull()) {
        unsigned ElementIdx = ConstOffset->getZExtValue();
        const StructLayout *SL = DL.getStructLayout(STy);
        Offset += APInt(Offset.getBitWidth(), SL->getElementOffset(ElementIdx));
        continue;
      }
      MinimalIndex = ConstOffset->getValue();
    } else {
      // If the operand is not a ConstantInt try to use ValueConstantRangeAA.
      const IRPosition &Pos = IRPosition::value(*V);
      // No need to track dependence as long as we use the known info.
      const AAValueConstantRange &ValueConstantRangeAA =
          A.getAAFor<AAValueConstantRange>(QueryingAA, Pos,
                                           /* TrackDependence */ false);
      // We can only use the lower part of the range because the upper part can
      // be higher than what the value can really be.
      MinimalIndex = ValueConstantRangeAA.getKnown().getSignedMin();
    }
    APInt Index = MinimalIndex.sextOrTrunc(Offset.getBitWidth());
    APInt IndexedSize =
        APInt(Offset.getBitWidth(), DL.getTypeAllocSize(GTI.getIndexedType()));

    // We must avoid overflow / underflow.
    bool Overflow = false;
    APInt OffsetPlus = Index.smul_ov(IndexedSize, Overflow);
    if (Overflow)
      return false;
    Offset = Offset.sadd_ov(OffsetPlus, Overflow);
    if (Overflow)
      return false;
  }
  return true;
}

const Value *stripAndAccumulateMinimalOffsets(
    Attributor &A, const AbstractAttribute &QueryingAA, const Value *value,
    const DataLayout &DL, APInt &Offset, bool AllowNonInbounds) {
  if (!value->getType()->isPtrOrPtrVectorTy())
    return value;

  unsigned BitWidth = Offset.getBitWidth();
  assert(BitWidth == DL.getIndexTypeSizeInBits(value->getType()) &&
         "The offset bit width does not match the DL specification.");

  // Even though we don't look through PHI nodes, we could be called on an
  // instruction in an unreachable block, which may be on a cycle.
  SmallPtrSet<const Value *, 4> Visited;
  Visited.insert(value);
  const Value *V = value;
  do {
    if (auto *GEP = dyn_cast<GEPOperator>(V)) {
      // If in-bounds was requested, we do not strip non-in-bounds GEPs.
      if (!AllowNonInbounds && !GEP->isInBounds())
        return V;

      // If one of the values we have visited is an addrspacecast, then
      // the pointer type of this GEP may be different from the type
      // of the Ptr parameter which was passed to this function. This
      // means when we construct GEPOffset, we need to use the size
      // of GEP's pointer type rather than the size of the original
      // pointer type.
      APInt GEPOffset(DL.getIndexTypeSizeInBits(V->getType()), 0);
      if (!accumulateMinimalOffset(A, QueryingAA, GEP, DL, GEPOffset))
        return V;

      // Stop traversal if the pointer offset wouldn't fit in the bit-width
      // provided by the Offset argument. This can happen due to AddrSpaceCast
      // stripping.
      if (GEPOffset.getMinSignedBits() > BitWidth)
        return V;

      Offset += GEPOffset.sextOrTrunc(BitWidth);
      V = GEP->getPointerOperand();
    } else if (Operator::getOpcode(V) == Instruction::BitCast ||
               Operator::getOpcode(V) == Instruction::AddrSpaceCast) {
      V = cast<Operator>(V)->getOperand(0);
    } else if (auto *GA = dyn_cast<GlobalAlias>(V)) {
      if (!GA->isInterposable())
        V = GA->getAliasee();
    } else if (const auto *Call = dyn_cast<CallBase>(V)) {
      if (const Value *RV = Call->getReturnedArgOperand())
        V = RV;
    }
    assert(V->getType()->isPtrOrPtrVectorTy() && "Unexpected operand type!");
  } while (Visited.insert(V).second);

  return V;
}

static const Value *getMinimalBaseOfAccsesPointerOperand(
    Attributor &A, const AbstractAttribute &QueryingAA, const Instruction *I,
    int64_t &BytesOffset, const DataLayout &DL, bool allowNonInbounds = false) {
  const Value *Ptr = getPointerOperand(I, /* AllowVolatile */ false);
  if (!Ptr)
    return nullptr;
  APInt OffsetAPInt(DL.getIndexTypeSizeInBits(Ptr->getType()), 0);
  const Value *Base = stripAndAccumulateMinimalOffsets(
      A, QueryingAA, Ptr, DL, OffsetAPInt, allowNonInbounds);

  BytesOffset = OffsetAPInt.getSExtValue();
  return Base;
}

	static const Value *	static const Value *
	getBasePointerOfAccessPointerOperand(const Instruction *I, int64_t &BytesOffset,	getBasePointerOfAccessPointerOperand(const Instruction *I, int64_t &BytesOffset,
	const DataLayout &DL,	const DataLayout &DL,
Context not available.
	TrackUse = true;	TrackUse = true;
	return 0;	return 0;
	}	}
	if (auto *GEP = dyn_cast<GetElementPtrInst>(I))
	if (GEP->hasAllConstantIndices()) {
	TrackUse = true;
	return 0;
	}

		if (isa<GetElementPtrInst>(I)) {
		TrackUse = true;
		return 0;
		}
	int64_t Offset;	int64_t Offset;
	if (const Value *Base = getBasePointerOfAccessPointerOperand(I, Offset, DL)) {	const Value *Base =
		getMinimalBaseOfAccsesPointerOperand(A, QueryingAA, I, Offset, DL);
		if (Base) {
	if (Base == &AssociatedValue &&	if (Base == &AssociatedValue &&
	getPointerOperand(I, /* AllowVolatile */ false) == UseV) {	getPointerOperand(I, /* AllowVolatile */ false) == UseV) {
	int64_t DerefBytes =	int64_t DerefBytes =
Context not available.
	}	}

	/// Corner case when an offset is 0.	/// Corner case when an offset is 0.
	if (const Value *Base = getBasePointerOfAccessPointerOperand(	Base = getBasePointerOfAccessPointerOperand(I, Offset, DL,
	I, Offset, DL, /* AllowNonInbounds */ true)) {	/* AllowNonInbounds */ true);
		if (Base) {
	if (Offset == 0 && Base == &AssociatedValue &&	if (Offset == 0 && Base == &AssociatedValue &&
	getPointerOperand(I, /* AllowVolatile */ false) == UseV) {	getPointerOperand(I, /* AllowVolatile */ false) == UseV) {
	int64_t DerefBytes =	int64_t DerefBytes =
Context not available.
	int64_t DerefBytes = getKnownNonNullAndDerefBytesForUse(	int64_t DerefBytes = getKnownNonNullAndDerefBytesForUse(
	A, *this, getAssociatedValue(), U, I, IsNonNull, TrackUse);	A, *this, getAssociatedValue(), U, I, IsNonNull, TrackUse);

		LLVM_DEBUG(dbgs() << "[AADereferenceable] follow use called on " << *I
		<< "\n");
		LLVM_DEBUG(dbgs() << "[AADereferenceable] Deref bytes" << DerefBytes
		<< "\n");
	addAccessedBytesForUse(A, U, I);	addAccessedBytesForUse(A, U, I);
	takeKnownDerefBytesMaximum(DerefBytes);	takeKnownDerefBytesMaximum(DerefBytes);
	return TrackUse;	return TrackUse;
Context not available.
	ChangeStatus Change = Base::updateImpl(A);	ChangeStatus Change = Base::updateImpl(A);

	const DataLayout &DL = A.getDataLayout();	const DataLayout &DL = A.getDataLayout();
		LLVM_DEBUG(
	auto VisitValueCB = [&](Value &V, DerefState &T, bool Stripped) -> bool {	dbgs()
		<< "\n[AADereferenceableFloating] Trying to merge floating values");
		auto VisitValueCB = [&](const Value &V, DerefState &T,
		bool Stripped) -> bool {
		LLVM_DEBUG(dbgs() << "\n[AADereferenceableFloating] Looking at value"
		<< V);
	unsigned IdxWidth =	unsigned IdxWidth =
	DL.getIndexSizeInBits(V.getType()->getPointerAddressSpace());	DL.getIndexSizeInBits(V.getType()->getPointerAddressSpace());
	APInt Offset(IdxWidth, 0);	APInt Offset(IdxWidth, 0);
	const Value *Base =	const Value *Base =
	V.stripAndAccumulateInBoundsConstantOffsets(DL, Offset);	stripAndAccumulateMinimalOffsets(A, *this, &V, DL, Offset, false);

	const auto &AA =	const auto &AA =
	A.getAAFor<AADereferenceable>(this, IRPosition::value(Base));	A.getAAFor<AADereferenceable>(this, IRPosition::value(Base));
Context not available.

llvm/test/Transforms/Attributor/dereferenceable-1.ll

	; NOTE: Assertions have been autogenerated by utils/update_test_checks.py	; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
	; RUN: opt -attributor -attributor-manifest-internal --attributor-disable=false -attributor-max-iterations-verify -attributor-annotate-decl-cs -attributor-max-iterations=16 -S < %s | FileCheck %s --check-prefix=ATTRIBUTOR	; RUN: opt -attributor -attributor-manifest-internal --attributor-disable=false -attributor-max-iterations-verify -attributor-annotate-decl-cs -attributor-max-iterations=16 -S < %s | FileCheck %s --check-prefix=ATTRIBUTOR
		; RUN: opt -passes=attributor-cgscc --attributor-disable=false -attributor-manifest-internal -S < %s | FileCheck %s --check-prefixes=ATTRIBUTOR_NPM
	; FIXME: Figure out why we need 16 iterations here.	; FIXME: Figure out why we need 16 iterations here.

	declare void @deref_phi_user(i32* %a);	declare void @deref_phi_user(i32* %a);
Context not available.
	}	}

	; TEST 8	; TEST 8
	; Use Constant range in deereferenceable	; Use Constant range in dereferenceable
	; void g(int p, long long int range){	;Function Attrs: nounwind uwtable
	; int r = *range ; // [10, 99]	define void @test8(i8* %ptr) #0 {
	; fill_range(p, *range);	; ATTRIBUTOR_NPM: define void @test8(i8* nocapture nofree nonnull writeonly dereferenceable(21) %ptr)
	; }	br label %1
		1: ; preds = %5, %0
	; void fill_range(int* p, long long int start){	%i.0 = phi i32 [ 20, %0 ], [ %4, %5 ]
	; for(long long int i = start;i<start+10;i++){	%2 = sext i32 %i.0 to i64
	; // If p[i] is inbounds, p is dereferenceable(40) at least.	%3 = getelementptr inbounds i8, i8* %ptr, i64 %2
	; p[i] = i;	store i8 32, i8* %3, align 1
	; }	%4 = add nsw i32 %i.0, 1
	; }	br label %5
		5: ; preds = %1
		%6 = icmp slt i32 %4, 30
		br i1 %6, label %1, label %7

		7: ; preds = %5
		ret void
		}

	define internal void @fill_range_not_inbounds(i32* %p, i64 %start){	define internal void @fill_range_not_inbounds(i32* %p, i64 %start){
	; ATTRIBUTOR-LABEL: define {{[^@]+}}@fill_range_not_inbounds	; ATTRIBUTOR-LABEL: define {{[^@]+}}@fill_range_not_inbounds
Context not available.

llvm/test/Transforms/Attributor/willreturn.ll

Context not available.
	; FIXME: missing willreturn	; FIXME: missing willreturn
	; ATTRIBUTOR_MODULE: Function Attrs: nofree noinline nosync nounwind readonly uwtable	; ATTRIBUTOR_MODULE: Function Attrs: nofree noinline nosync nounwind readonly uwtable
	; ATTRIBUTOR_CGSCC: Function Attrs: nofree noinline norecurse nosync nounwind readonly uwtable	; ATTRIBUTOR_CGSCC: Function Attrs: nofree noinline norecurse nosync nounwind readonly uwtable
	; ATTRIBUTOR-NEXT: define i32 @loop_constant_trip_count(i32* nocapture nofree readonly %0)	; ATTRIBUTOR-NEXT: define i32 @loop_constant_trip_count(i32* nocapture nofree nonnull readonly dereferenceable(4) %0)
	define i32 @loop_constant_trip_count(i32* nocapture readonly %0) #0 {	define i32 @loop_constant_trip_count(i32* nocapture readonly %0) #0 {
	br label %3	br label %3

Context not available.

[Attributor] Use AAValueConstantRange to infer dereferencability. Closed, Public
