This is an archive of the discontinued LLVM Phabricator instance.

llvm/lib/Transforms/IPO/Attributor.cpp
616 ↗	(On Diff #250458)	Isn't this almost the `stripAndAccumulate` code? How hard would it be to pass some kind of `AttributorInfo` object to that function (optionally) which triggers the lookup. Or more generically, a callback that takes a `Value` and returns a lower bound offset (plus indicates success/failure).
llvm/test/Transforms/Attributor/dereferenceable-1.ll
234	Isn't this the `fill_range` code?

Thank you for working on this!!

As a high-level comment, please upload the diff with full context (diff -U99999).

llvm/lib/Transforms/IPO/Attributor.cpp
511 ↗	(On Diff #250458)	Add comment here
llvm/test/Transforms/Attributor/dereferenceable-1.ll
214	I think optimally %ptr is dereferenceable(30). Please add FIXME here.

uenoku added a reviewer: baziotis. Mar 15 2020, 10:15 PM

Made the changes @jdoerfert asked for.

@jdoerfert

This patch is not complete yet.

Currently it can only use the known information. I would like to make it possible for it to use the assumed information as well.
The problem with that is:

We need to iterate over the uses and take the maximum of the current dereferenceability that is indicated by each use.
This works fine for the known info. But since the lower bound of the assumed constant range can decrease over time,
taking the max with the existing state is a problem.

The way that the AAFromMustBeExecutedContext is wired makes things a bit tricky and
I would like to get your opinions on this before I proceed.
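
The problem described above can be sketched outside of LLVM. This is only an illustration under assumptions: `DerefState`, `accumulateAcrossIterations` and `recomputeFromLatest` are hypothetical names, not Attributor APIs, and the lower bounds are made-up values for successive fixpoint iterations.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a dereferenceability state that only ever grows.
struct DerefState {
  uint64_t Bytes = 0;
  void takeMaximum(uint64_t B) { Bytes = std::max(Bytes, B); }
};

// Accumulate into one long-lived state across fixpoint iterations.
// LowerBounds[i] is the assumed lower bound of the index in iteration i.
uint64_t accumulateAcrossIterations(const std::vector<int64_t> &LowerBounds) {
  DerefState S;
  for (int64_t LB : LowerBounds)
    if (LB >= 0)
      S.takeMaximum(static_cast<uint64_t>(LB) + 1); // one-byte access at ptr[LB]
  return S.Bytes;
}

// Recompute from scratch, using only the most recent assumed lower bound.
uint64_t recomputeFromLatest(const std::vector<int64_t> &LowerBounds) {
  DerefState S;
  if (!LowerBounds.empty() && LowerBounds.back() >= 0)
    S.takeMaximum(static_cast<uint64_t>(LowerBounds.back()) + 1);
  return S.Bytes;
}
```

If the assumed lower bound shrinks from 20 to 12 to 8, the accumulated maximum stays at 21 bytes even though only 9 bytes are justified by the final bound. That stale value is the issue with combining a max-only state with assumed information.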

bbn added a subscriber: bbn. Mar 19 2020, 6:53 AM

Can you explain why we need to catch overflows now but not before? I mean, the values determined by the external analysis are valid lower bounds; what is different from versioning on them and making them constant in one of the versions?

In D76208#1931081, @kuter wrote:

@jdoerfert

This patch is not complete yet.

Currently it can only use the known information. I would like to make it possible for it to use the assumed information as well.
The problem with that is:

We need to iterate over the uses and take the maximum of the current dereferenceability that is indicated by each use.
This works fine for the known info. But since the lower bound of the assumed constant range can decrease over time,
taking the max with the existing state is a problem.

The way that the AAFromMustBeExecutedContext is wired makes things a bit tricky and
I would like to get your opinions on this before I proceed.

I think we should finish this one and then do the following:

We actually don't want the minimal offset but the maximal known offset, right? In addition to renaming the function we will check if the range from AAConstantRange is known, if so, we take the maximum instead.

llvm/include/llvm/IR/Operator.h
563	`= nullptr` is not sufficient? unfortunate. Nit: -`value` +`Offset`, or remove the names. Please describe what external analysis exactly does here. I like the solution of adding this callback though. We need more of these soon.
llvm/lib/IR/Value.cpp
647	Put the entire thing in an `if (ExternalAnalysis)` please. Or better: `!ExternalAnalysis`, then the old code, else this code. TBH, I'm not sure why we don't bail on overflow in the old code as well; maybe we should. Can you check if that would make any test fail?
llvm/lib/Transforms/IPO/Attributor.cpp
552 ↗	(On Diff #251274)	Style: `UseAssumed`, `Range`, `Value`, etc.
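
The callback idea from the Operator.h comment above can be sketched without LLVM types. This is a toy model under assumptions: `Index`, `ExternalAnalysisFn` and `accumulateOffset` are hypothetical names, `std::function` stands in for `function_ref`, plain `int64_t` stands in for `APInt`, and overflow is deliberately ignored here.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Toy GEP index: either a constant or a named, non-constant value.
struct Index {
  bool IsConstant;
  int64_t Constant; // only meaningful when IsConstant is true
  std::string Name; // only meaningful when IsConstant is false
};

// Callback shape mirroring the review: returns true on success and writes an
// offset (for example, a lower bound) for the given value.
using ExternalAnalysisFn = std::function<bool(const std::string &, int64_t &)>;

// Sums Index * ElemSize into Offset. Non-constant indices are resolved through
// the optional callback; without a callback the routine gives up, and Offset
// may already be partially updated, as in the documented LLVM behaviour.
bool accumulateOffset(const std::vector<Index> &Indices, int64_t ElemSize,
                      int64_t &Offset,
                      const ExternalAnalysisFn &ExternalAnalysis = nullptr) {
  for (const Index &I : Indices) {
    if (I.IsConstant) {
      Offset += I.Constant * ElemSize;
      continue;
    }
    if (!ExternalAnalysis)
      return false;
    int64_t Bound = 0;
    if (!ExternalAnalysis(I.Name, Bound))
      return false;
    Offset += Bound * ElemSize;
  }
  return true;
}
```

Callers that pass no callback keep the old all-constant behaviour, which is why a defaulted parameter fits here.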

kuter marked 3 inline comments as done. Mar 19 2020, 9:36 PM

kuter added inline comments.

llvm/test/Transforms/Attributor/dereferenceable-1.ll
214	OK, I will. For us to be able to deduce a 30, there would have to be a separate mechanism that deduces the biggest number that the "Value" in question is guaranteed to take. Perhaps we could use SCEV?
234	No it is not. Currently this patch does not work with for loops. This is because with for loops AAFromMustBeExecutedContext looks at the branch at the top; calls the followUse() on both of the successors and it "and"s them together. This is probably the reason why only a single test is affected by this patch. I will address this issue with a separate patch that passes LoopInfo, DominatorTree and the PostDominatorTree to the MustBeExecutedContextExplorer.

In D76208#1932759, @jdoerfert wrote:

Can you explain why we need to catch overflows now but not before? I mean, the values determined by the external analysis are valid lower bounds; what is different from versioning on them and making them constant in one of the versions?

I think we should finish this one and then do the following:

We actually don't want the minimal offset but the maximal known offset, right? In addition to renaming the function we will check if the range from AAConstantRange is known, if so, we take the maximum instead.

static void test(int index, char *ptrArg) { // internal linkage
  ptrArg[index] = 'A';
}

test(10, ptrA);
test(30, ptrB);
test(40, ptrC);

Based on the call sites, the index argument of the test function would be in the range [10, 40], right?
But marking ptrArg dereferenceable(41) would be wrong, wouldn't it?

The reason I made stripAndAccumulateConstantOffsets overflow-aware is that the range can be lower than what the value can be in reality.
I thought that these differences could result in unintended underflows.

For accumulateConstantOffset, I think overflows should be OK for non-inbounds GEPs,
but for stripAndAccumulateConstantOffsets bailing out should probably be the default behaviour.
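
To make the `test(10, ptrA)` / `test(30, ptrB)` / `test(40, ptrC)` example concrete, here is a minimal sketch, assuming one-byte accesses and a hypothetical helper name (`safeDerefBytesForArgument`). Each call site only proves bytes for its own pointer, so the attribute on the shared argument can only use the smallest index.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Each entry is the constant index one call site passes for its own pointer.
// A one-byte store to ptr[Index] proves Index + 1 dereferenceable bytes for
// that pointer only, so the shared argument can only claim the minimum.
uint64_t safeDerefBytesForArgument(const std::vector<uint64_t> &CallSiteIndices) {
  if (CallSiteIndices.empty())
    return 0;
  uint64_t MinIndex =
      *std::min_element(CallSiteIndices.begin(), CallSiteIndices.end());
  return MinIndex + 1;
}
```

For the three call sites above this yields 11, not 41: nothing is known about `ptrA` beyond index 10.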

In D76208#1932819, @kuter wrote:
static void test(int index, char *ptrArg) { // internal linkage
  ptrArg[index] = 'A';
}

test(10, ptrA);
test(30, ptrB);
test(40, ptrC);
Based on the call sites, the index argument of the test function would be in the range [10, 40], right?
But marking ptrArg dereferenceable(41) would be wrong, wouldn't it?

Correct. My first thought was: we need something special for the loop case so we know we reach the upper bound. After all, that is probably the most important case to get.

The reason I made stripAndAccumulateConstantOffsets overflow-aware is that the range can be lower than what the value can be in reality.
I thought that these differences could result in unintended underflows.

I see. Agreed.

For accumulateConstantOffset, I think overflows should be OK for non-inbounds GEPs,
but for stripAndAccumulateConstantOffsets bailing out should probably be the default behaviour.

OK.

llvm/test/Transforms/Attributor/dereferenceable-1.ll
234	Right. That reminds me of D64974, though I'm unsure if we still need it. What we need is to make the explorer aware of `CanProveNotTakenFirstIteration`. Similar to the existing use of that function, we can go to the non-exit block from a header in `getMustBeExecutedNextInstruction` (or `findForwardJoinPoint`) if we know that edge is taken at least once. To not lose the code after the loop, we should add a stack of unexplored edges. The exit edge goes there if the loop is known not to be endless (see `findForwardJoinPoint`). If we are out of forward instructions, we can pop an edge from the stack and continue. Alternatively, we could check if the loop was not endless when we visit the header for the second time.

Fixed styling, added FIXME, separated new code.

kuter marked 2 inline comments as done. Mar 20 2020, 10:13 PM

kuter marked an inline comment as done. Mar 20 2020, 10:16 PM

uenoku added inline comments. Mar 21 2020, 4:19 AM

llvm/test/Transforms/Attributor/willreturn.ll
513	Why can't we get dereferenceable(4) for `p` in this case?

kuter marked an inline comment as done. Mar 21 2020, 5:40 AM

kuter added inline comments.

llvm/test/Transforms/Attributor/willreturn.ll
513	OK, for this example specifically, the variable `n` can be negative, so `ans += p[n]` may never be executed. But in general, this patch doesn't work with loops that are not `do {} while (cond)`. This is because `AAFromMustBeExecutedContext` is not aware of when the first iteration is always going to be run. This needs to be addressed in a separate patch and would probably improve many other deductions as well.

baziotis added inline comments. Mar 21 2020, 6:13 AM

llvm/test/Transforms/Attributor/willreturn.ll
513	IIUC, loop rotation can help here because it provides this guarantee.

uenoku added inline comments. Mar 21 2020, 7:32 AM

llvm/test/Transforms/Attributor/willreturn.ll
513	Oh, I thought `%n` is assumed to be positive:) Thanks.

uenoku added inline comments. Mar 21 2020, 7:46 AM

llvm/test/Transforms/Attributor/dereferenceable-1.ll
214	"For us to be able to deduce a 30, there would have to be a separate mechanism that deduces the biggest number that the 'Value' in question is guaranteed to take." Agreed. "Perhaps we could use SCEV?" I think so. What I thought once is: assume that a loop is in the context and it is guaranteed to proceed to the last iteration; then we can use the biggest number of the value. For an inbounds GEP, it is allowed to take that number as dereferenceable bytes, and for a non-inbounds GEP, if the SCEV of the value is something like <0, +, 1>, we can take the biggest number as dereferenceable bytes.

This is because AAFromMustBeExecutedContext is not aware of when the first iteration is always going to be run.
This needs to be addressed in a separate patch and would probably improve many other deductions as well.

Agreed and agreed.

Perhaps we could use SCEV ?

[...]

We could. What we want (regardless of how) is to track in AAConstantRange (conceptually) two ranges: (1) potential value range, (2) known value range. This is not the same as the assumed/known we track now. Right now both track the potential value range, that is, what value this llvm::Value can potentially have at runtime. I think tracking the second, that is, what value this llvm::Value is going to have at runtime (if it is executed), could be done in the same AA or a different one, depending on how much logic we can share. This is also a follow-up patch (or two).

I'll have to go through the logic in this patch tomorrow (=rested) but I think it looks pretty good.

In D76208#1936117, @jdoerfert wrote:

We could. What we want (regardless of how) is to track in AAConstantRange (conceptually) two ranges: (1) potential value range, (2) known value range. This is not the same as the assumed/known we track now. Right now both track the potential value range, that is, what value this llvm::Value can potentially have at runtime. I think tracking the second, that is, what value this llvm::Value is going to have at runtime (if it is executed), could be done in the same AA or a different one, depending on how much logic we can share. This is also a follow-up patch (or two).

Agreed.

Currently AAConstantRange tracks the range that an llvm::Value is guaranteed to be within at runtime.
But for loops, this results in information loss.

So I think what you suggest is that we track a specific range for loop-like situations.
For an actual loop this would be the first and the last index of the counter.

When we propagate this range from phi nodes, select instructions, and call sites (of internal functions), we would intersect the ranges.

For example:

static void test(char *ptr, int i); // internal linkage; defined below

void test_use(char *ptrA, char *ptrB) {
  for (int i = 10; i < 100; i++) {
    test(ptrA, i); // i is in the [10, 99] loop range.
  }
  for (int i = 5; i < 150; i++) {
    test(ptrB, i); // i is in the [5, 149] loop range.
  }
}

// i is in the [10, 99] loop range
// ptr is dereferenceable(100)
static void test(char *ptr, int i) {
  ptr[i] = 'A';
}

We should finish this review and then focus on the next steps. I added more comments but I think the general logic is fine.

(The next step would be to rename the range in AAConstantRange to PossibleRange, or similar, and add a second range, e.g., ObservedRange, which we need to deduce. We can then use max(PossibleRange.minimum(), ObservedRange.maximum()) for dereferenceable deduction.)
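
As a rough sketch of the two-range idea, assuming one-byte elements and inclusive bounds: `Range`, `intersect` and `derefBytes` are hypothetical names, and only `PossibleRange` / `ObservedRange` come from the comment above. The numbers reuse the two loops from the `test_use` example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Inclusive integer range; a stand-in for what AAConstantRange would track.
struct Range {
  int64_t Min;
  int64_t Max;
};

// Intersect two inclusive ranges (assumes they overlap).
Range intersect(Range A, Range B) {
  return {std::max(A.Min, B.Min), std::min(A.Max, B.Max)};
}

// max(PossibleRange.minimum(), ObservedRange.maximum()) is the largest index
// known to be accessed; add one to turn it into a byte count.
uint64_t derefBytes(Range Possible, Range Observed) {
  int64_t Idx = std::max(Possible.Min, Observed.Max);
  return Idx < 0 ? 0 : static_cast<uint64_t>(Idx) + 1;
}
```

With observed ranges [10, 99] and [5, 149] the intersection is [10, 99], so the result is 100 bytes, matching the `dereferenceable(100)` annotation in the example. Whether intersection is the right join for every propagation point is left open in the thread.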

llvm/lib/IR/Operator.cpp
94	Style: I think it would be easier to read if you split the above if-else-cascade and use early exits:

	    MinimalIndex = ConstOffset->getValue();
	    continue;
	  }
	  // The operand is not constant, check if an external analysis was provided.
	  if (!ExternalAnalysis)
	    return false;

	Do we need to track overflow before we use an external analysis? If not, we don't need 75/76. Below we can then use `UsedExternalAnalysis` in 88 and remove it from 95. We should also be able to use a single overflowed flag.
112	Put the simple case first, maybe add a continue to avoid the "else"
llvm/lib/IR/Value.cpp
652	Simple case first please.
llvm/lib/Transforms/IPO/Attributor.cpp
552 ↗	(On Diff #251822)	Some names are still starting with a lower case letter. Do we have a test where the assumed range minimum is negative?
3749 ↗	(On Diff #251822)	Merge in a single debug message and line.
3803 ↗	(On Diff #251822)	No need for the newline in the beginning. TBH, the entire message doesn't help much given that we see the update debug message already.

Eliminated redundant debug messages, style fixes, added negative test case.
Don't use external analysis if the GEP operand is a struct type.

kuter marked an inline comment as done. Mar 25 2020, 8:22 AM

kuter added inline comments.

llvm/lib/IR/Operator.cpp
94	We probably don't need to detect an overflow before the use of external analysis; I was just being safe. I am not sure what you mean by the early exit change. We need to detect overflows that happen after we use ExternalAnalysis, even if they were not caused by the value that ExternalAnalysis returned. So the value of the ConstantOffset needs to pass through something that would detect an overflow if ExternalAnalysis is present and has been used. We could do it with a lambda like this:

	    AddArrayIndex(ConstOffset->getValue());
	    continue;
	  }

	Also, do we really need to have the old and the new code in this style?

	  if (!ExternalAnalysis) {
	    // old
	  } else {
	    // new
	  }

	The new code shouldn't really behave any differently, other than detecting overflows/underflows.

kuter marked 2 inline comments as done. Mar 25 2020, 9:59 AM

jdoerfert added inline comments. Mar 25 2020, 10:10 AM

llvm/lib/IR/Operator.cpp
94	1) We want to check `UsedExternal`, not `ExternalAnalysis`, to determine whether an overflow check is necessary. 2) We should create the lambda you mentioned. Don't track the overflow flag outside of the lambda. In the lambda, check whether `UsedExternal` is set; only then do we track and act on overflows.
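
The lambda design agreed on here can be illustrated with a small stand-alone sketch. Assumptions: `OffsetAccumulator` is a hypothetical name, and the offset is narrowed to 32 bits purely so that the checks can be done portably in 64-bit arithmetic. The committed code uses `APInt::smul_ov` and `APInt::sadd_ov` instead.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Accumulates Index * Size into a 32-bit offset. Overflow is only tracked
// once an external analysis has contributed a value, mirroring the review.
struct OffsetAccumulator {
  int32_t Offset = 0;
  bool UsedExternalAnalysis = false;

  bool add(int32_t Index, int32_t Size) {
    // Both fit in 64 bits: |Index * Size| <= 2^62, and adding Offset stays in range.
    int64_t Product = static_cast<int64_t>(Index) * Size;
    int64_t Sum = static_cast<int64_t>(Offset) + Product;
    if (!UsedExternalAnalysis) {
      // Old behaviour: wrap silently (modular conversion; guaranteed since
      // C++20 and what mainstream compilers do before that).
      Offset = static_cast<int32_t>(static_cast<uint32_t>(Sum));
      return true;
    }
    const int64_t Lo = std::numeric_limits<int32_t>::min();
    const int64_t Hi = std::numeric_limits<int32_t>::max();
    // Bail out if either the scaled index or the running sum leaves the range.
    if (Product < Lo || Product > Hi || Sum < Lo || Sum > Hi)
      return false;
    Offset = static_cast<int32_t>(Sum);
    return true;
  }
};
```

Keeping the flag and the overflow handling inside one helper means the loop over GEP operands never has to carry an `Overflow` variable of its own.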

Simplify accumulateConstantOffset

kuter marked 3 inline comments as done. Mar 25 2020, 2:15 PM

kuter marked an inline comment as done. Mar 26 2020, 6:32 AM

Two more minor comments, other than that the code looks good. Please update so I can commit it.

llvm/lib/IR/Operator.cpp
81	Nit: Don't call it `MinimalIndex` as there might be use cases that over-approximate the number. Index is just fine I think.
llvm/lib/Transforms/IPO/Attributor.cpp
566 ↗	(On Diff #252668)	Nit: `AllowNonInbounds` and `Value` above. Check other variable names as well.

This revision is now accepted and ready to land. Mar 26 2020, 9:35 AM

kuter updated this revision to Diff 253022. Mar 26 2020, 5:28 PM

Apologies for the delay, can you rebase this and provide me with "Firstname Lastname <email>" from you so I can attribute it to you?

Rebased.
Small logic change in GEPOperator::accumulateConstantOffset to bail out on scalable vector types,
except when the offset is zero.
Not allowing zero breaks @test_accumulate_constant_offset_vscale_zero.

see https://reviews.llvm.org/rGef64ba831194c7deac8882a325ea9bea64eb612a

In D76208#1956335, @jdoerfert wrote:

Apologies for the delay, can you rebase this and provide me with "Firstname Lastname <email>" from you so I can attribute it to you?

name, surname: Kuter Dinel
email: kuterdinel@gmail.com

Closed by commit rGe57807769b5c: [Attributor] Use AAValueConstantRange to infer dereferencability. (authored by kuter, committed by jdoerfert). May 13 2020, 3:17 PM

This revision was automatically updated to reflect the committed changes.

Herald added a subscriber: llvm-commits. May 13 2020, 3:17 PM

jdoerfert mentioned this in rG6045a804b94b: [Attributor] Check lines accidentally not committed with D76208. May 13 2020, 4:26 PM

Revision Contents

Path                                                   Size
llvm/include/llvm/IR/Operator.h                        25 lines
llvm/include/llvm/IR/Value.h                           20 lines
llvm/lib/IR/Operator.cpp                               87 lines
llvm/lib/IR/Value.cpp                                  23 lines
llvm/lib/Transforms/IPO/AttributorAttributes.cpp       63 lines
llvm/test/Transforms/Attributor/dereferenceable-1.ll   26 lines
llvm/test/Transforms/Attributor/willreturn.ll          43 lines

Diff 263867

llvm/include/llvm/IR/Operator.h

[541 lines not shown; context: public:]
   unsigned countNonConstantIndices() const {
     return count_if(make_range(idx_begin(), idx_end()), [](const Use& use) {
         return !isa<ConstantInt>(*use);
       });
   }

   /// Accumulate the constant address offset of this GEP if possible.
   ///
-  /// This routine accepts an APInt into which it will accumulate the constant
-  /// offset of this GEP if the GEP is in fact constant. If the GEP is not
-  /// all-constant, it returns false and the value of the offset APInt is
-  /// undefined (it is not preserved!). The APInt passed into this routine
-  /// must be at exactly as wide as the IntPtr type for the address space of the
-  /// base GEP pointer.
-  bool accumulateConstantOffset(const DataLayout &DL, APInt &Offset) const;
+  /// This routine accepts an APInt into which it will try to accumulate the
+  /// constant offset of this GEP.
+  ///
+  /// If \p ExternalAnalysis is provided it will be used to calculate a offset
+  /// when a operand of GEP is not constant.
+  /// For example, for a value \p ExternalAnalysis might try to calculate a
+  /// lower bound. If \p ExternalAnalysis is successful, it should return true.
+  ///
+  /// If the \p ExternalAnalysis returns false or the value returned by \p
+  /// ExternalAnalysis results in a overflow/underflow, this routine returns
+  /// false and the value of the offset APInt is undefined (it is not
+  /// preserved!).
+  ///
+  /// The APInt passed into this routine must be at exactly as wide as the
+  /// IntPtr type for the address space of the base GEP pointer.
+  bool accumulateConstantOffset(
+      const DataLayout &DL, APInt &Offset,
+      function_ref<bool(Value &, APInt &)> ExternalAnalysis = nullptr) const;
 };

 class PtrToIntOperator
     : public ConcreteOperator<Operator, Instruction::PtrToInt> {
   friend class PtrToInt;
   friend class ConstantExpr;

 public:
[59 lines not shown]

llvm/include/llvm/IR/Value.h

[587 lines not shown; context: #include "llvm/IR/Value.def"]
   /// value, it returns 'this'.
   const Value *stripInBoundsConstantOffsets() const;
   Value *stripInBoundsConstantOffsets() {
     return const_cast<Value *>(
         static_cast<const Value *>(this)->stripInBoundsConstantOffsets());
   }

   /// Accumulate the constant offset this value has compared to a base pointer.
-  /// Only 'getelementptr' instructions (GEPs) with constant indices are
-  /// accumulated but other instructions, e.g., casts, are stripped away as
-  /// well. The accumulated constant offset is added to \p Offset and the base
+  /// Only 'getelementptr' instructions (GEPs) are accumulated but other
+  /// instructions, e.g., casts, are stripped away as well.
+  /// The accumulated constant offset is added to \p Offset and the base
   /// pointer is returned.
   ///
   /// The APInt \p Offset has to have a bit-width equal to the IntPtr type for
   /// the address space of 'this' pointer value, e.g., use
   /// DataLayout::getIndexTypeSizeInBits(Ty).
   ///
-  /// If \p AllowNonInbounds is true, constant offsets in GEPs are stripped and
+  /// If \p AllowNonInbounds is true, offsets in GEPs are stripped and
   /// accumulated even if the GEP is not "inbounds".
   ///
+  /// If \p ExternalAnalysis is provided it will be used to calculate a offset
+  /// when a operand of GEP is not constant.
+  /// For example, for a value \p ExternalAnalysis might try to calculate a
+  /// lower bound. If \p ExternalAnalysis is successful, it should return true.
+  ///
   /// If this is called on a non-pointer value, it returns 'this' and the
   /// \p Offset is not modified.
   ///
   /// Note that this function will never return a nullptr. It will also never
   /// manipulate the \p Offset in a way that would not match the difference
   /// between the underlying value and the returned one. Thus, if no constant
   /// offset was found, the returned value is the underlying one and \p Offset
   /// is unchanged.
-  const Value *stripAndAccumulateConstantOffsets(const DataLayout &DL,
-                                                 APInt &Offset,
-                                                 bool AllowNonInbounds) const;
+  const Value *stripAndAccumulateConstantOffsets(
+      const DataLayout &DL, APInt &Offset, bool AllowNonInbounds,
+      function_ref<bool(Value &Value, APInt &Offset)> ExternalAnalysis =
+          nullptr) const;
   Value *stripAndAccumulateConstantOffsets(const DataLayout &DL, APInt &Offset,
                                            bool AllowNonInbounds) {
     return const_cast<Value *>(
         static_cast<const Value *>(this)->stripAndAccumulateConstantOffsets(
             DL, Offset, AllowNonInbounds));
   }

   /// This is a wrapper around stripAndAccumulateConstantOffsets with the
[327 lines not shown]

llvm/lib/IR/Operator.cpp

[25 lines not shown]
 }

 Type *GEPOperator::getResultElementType() const {
   if (auto *I = dyn_cast<GetElementPtrInst>(this))
     return I->getResultElementType();
   return cast<GetElementPtrConstantExpr>(this)->getResultElementType();
 }

-bool GEPOperator::accumulateConstantOffset(const DataLayout &DL,
-                                           APInt &Offset) const {
+bool GEPOperator::accumulateConstantOffset(
+    const DataLayout &DL, APInt &Offset,
+    function_ref<bool(Value &, APInt &)> ExternalAnalysis) const {
   assert(Offset.getBitWidth() ==
              DL.getIndexSizeInBits(getPointerAddressSpace()) &&
          "The offset bit width does not match DL specification.");

+  bool UsedExternalAnalysis = false;
+  auto AccumulateOffset = [&](APInt Index, uint64_t Size) -> bool {
+    Index = Index.sextOrTrunc(Offset.getBitWidth());
+    APInt IndexedSize = APInt(Offset.getBitWidth(), Size);
+    // For array or vector indices, scale the index by the size of the type.
+    if (!UsedExternalAnalysis) {
+      Offset += Index * IndexedSize;
+    } else {
+      // External Analysis can return a result higher/lower than the value
+      // represents. We need to detect overflow/underflow.
+      bool Overflow = false;
+      APInt OffsetPlus = Index.smul_ov(IndexedSize, Overflow);
+      if (Overflow)
+        return false;
+      Offset = Offset.sadd_ov(OffsetPlus, Overflow);
+      if (Overflow)
+        return false;
+    }
+    return true;
+  };
+
   for (gep_type_iterator GTI = gep_type_begin(this), GTE = gep_type_end(this);
        GTI != GTE; ++GTI) {
-    ConstantInt *OpC = dyn_cast<ConstantInt>(GTI.getOperand());
-    if (!OpC)
-      return false;
-    if (OpC->isZero())
-      continue;
-
-    // Scalable vectors have are multiplied by a runtime constant.
+    // Scalable vectors are multiplied by a runtime constant.
+    bool ScalableType = false;
     if (isa<ScalableVectorType>(GTI.getIndexedType()))
-      return false;
+      ScalableType = true;

-    // Handle a struct index, which adds its field offset to the pointer.
-    if (StructType *STy = GTI.getStructTypeOrNull()) {
-      unsigned ElementIdx = OpC->getZExtValue();
-      const StructLayout *SL = DL.getStructLayout(STy);
-      Offset += APInt(Offset.getBitWidth(), SL->getElementOffset(ElementIdx));
+    Value *V = GTI.getOperand();
+    StructType *STy = GTI.getStructTypeOrNull();
+    // Handle ConstantInt if possible.
+    if (auto ConstOffset = dyn_cast<ConstantInt>(V)) {
+      if (ConstOffset->isZero())
+        continue;
+      // if the type is scalable and the constant is not zero (vscale * n * 0 =
+      // 0) bailout.
+      if (ScalableType)
+        return false;
+      // Handle a struct index, which adds its field offset to the pointer.
+      if (STy) {
+        unsigned ElementIdx = ConstOffset->getZExtValue();
+        const StructLayout *SL = DL.getStructLayout(STy);
+        // Element offset is in bytes.
+        if (!AccumulateOffset(
+                APInt(Offset.getBitWidth(), SL->getElementOffset(ElementIdx)),
+                1))
+          return false;
+        continue;
+      }
+      if (!AccumulateOffset(ConstOffset->getValue(),
+                            DL.getTypeAllocSize(GTI.getIndexedType())))
+        return false;
       continue;
     }

-    // For array or vector indices, scale the index by the size of the type.
-    APInt Index = OpC->getValue().sextOrTrunc(Offset.getBitWidth());
-    Offset += Index * APInt(Offset.getBitWidth(),
-                            DL.getTypeAllocSize(GTI.getIndexedType()));
+    // The operand is not constant, check if an external analysis was provided.
+    // External analsis is not applicable to a struct type.
+    if (!ExternalAnalysis || STy || ScalableType)
+      return false;
+    APInt AnalysisIndex;
+    if (!ExternalAnalysis(*V, AnalysisIndex))
+      return false;
+    UsedExternalAnalysis = true;
+    if (!AccumulateOffset(AnalysisIndex,
+                          DL.getTypeAllocSize(GTI.getIndexedType())))
+      return false;
   }
   return true;
 }
 }

llvm/lib/IR/Value.cpp

[593 lines not shown]
 const Value *Value::stripInBoundsConstantOffsets() const {
   return stripPointerCastsAndOffsets<PSK_InBoundsConstantIndices>(this);
 }

 const Value *Value::stripPointerCastsAndInvariantGroups() const {
   return stripPointerCastsAndOffsets<PSK_ZeroIndicesAndInvariantGroups>(this);
 }

-const Value *
-Value::stripAndAccumulateConstantOffsets(const DataLayout &DL, APInt &Offset,
-                                         bool AllowNonInbounds) const {
+const Value *Value::stripAndAccumulateConstantOffsets(
+    const DataLayout &DL, APInt &Offset, bool AllowNonInbounds,
+    function_ref<bool(Value &, APInt &)> ExternalAnalysis) const {
   if (!getType()->isPtrOrPtrVectorTy())
     return this;

   unsigned BitWidth = Offset.getBitWidth();
   assert(BitWidth == DL.getIndexTypeSizeInBits(getType()) &&
          "The offset bit width does not match the DL specification.");

   // Even though we don't look through PHI nodes, we could be called on an
[9 lines not shown; context: if (auto *GEP = dyn_cast<GEPOperator>(V)) {]

       // If one of the values we have visited is an addrspacecast, then
       // the pointer type of this GEP may be different from the type
       // of the Ptr parameter which was passed to this function. This
       // means when we construct GEPOffset, we need to use the size
       // of GEP's pointer type rather than the size of the original
       // pointer type.
       APInt GEPOffset(DL.getIndexTypeSizeInBits(V->getType()), 0);
if (!GEP->accumulateConstantOffset(DL, GEPOffset))		if (!GEP->accumulateConstantOffset(DL, GEPOffset, ExternalAnalysis))
return V;		return V;

// Stop traversal if the pointer offset wouldn't fit in the bit-width		// Stop traversal if the pointer offset wouldn't fit in the bit-width
// provided by the Offset argument. This can happen due to AddrSpaceCast		// provided by the Offset argument. This can happen due to AddrSpaceCast
// stripping.		// stripping.
if (GEPOffset.getMinSignedBits() > BitWidth)		if (GEPOffset.getMinSignedBits() > BitWidth)
return V;		return V;

Offset += GEPOffset.sextOrTrunc(BitWidth);		// External Analysis can return a result higher/lower than the value
		// represents. We need to detect overflow/underflow.
		APInt GEPOffsetST = GEPOffset.sextOrTrunc(BitWidth);
		if (!ExternalAnalysis) {
		Offset += GEPOffsetST;
		} else {
		bool Overflow = false;
		APInt OldOffset = Offset;
		Offset = Offset.sadd_ov(GEPOffsetST, Overflow);
		jdoerfert (Done): Put the entire thing in an `if (ExternalAnalysis)` please. Or better: `!ExternalAnalysis` then old code, else this code. TBH, I'm not sure why we don't bail on overflow in the old code as well; maybe we should. Can you check if that would make any test fail?
		if (Overflow) {
		Offset = OldOffset;
		return V;
		}
		}
		jdoerfert (Done): Simple case first please.
V = GEP->getPointerOperand();		V = GEP->getPointerOperand();
} else if (Operator::getOpcode(V) == Instruction::BitCast ||		} else if (Operator::getOpcode(V) == Instruction::BitCast ||
Operator::getOpcode(V) == Instruction::AddrSpaceCast) {		Operator::getOpcode(V) == Instruction::AddrSpaceCast) {
V = cast<Operator>(V)->getOperand(0);		V = cast<Operator>(V)->getOperand(0);
} else if (auto *GA = dyn_cast<GlobalAlias>(V)) {		} else if (auto *GA = dyn_cast<GlobalAlias>(V)) {
if (!GA->isInterposable())		if (!GA->isInterposable())
V = GA->getAliasee();		V = GA->getAliasee();
} else if (const auto *Call = dyn_cast<CallBase>(V)) {		} else if (const auto *Call = dyn_cast<CallBase>(V)) {
▲ Show 20 Lines • Show All 398 Lines • Show Last 20 Lines
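
The new `ExternalAnalysis` parameter lets a caller supply a bound for an index that is not a constant, and the traversal stops without changing `Offset` when adding that bound would overflow. The following self-contained sketch mimics that control flow on a toy pointer chain. `PtrNode`, `ExternalAnalysisFn` and this `stripAndAccumulate` are hypothetical and use `std::function` in place of `function_ref`.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <limits>

// Hypothetical miniature of a GEP chain; each node adds one index to Base.
struct PtrNode {
  const PtrNode *Base; // nullptr marks the underlying object
  bool HasConstIndex;  // true if ConstIndex is meaningful
  int64_t ConstIndex;  // byte offset when the index is a constant
  int VarId;           // identifies a non-constant index otherwise
};

// Callback: returns true and sets LowerBound if a bound is available.
using ExternalAnalysisFn = std::function<bool(int VarId, int64_t &LowerBound)>;

const PtrNode *
stripAndAccumulate(const PtrNode *V, int64_t &Offset,
                   const ExternalAnalysisFn &ExternalAnalysis = nullptr) {
  const int64_t Max = std::numeric_limits<int64_t>::max();
  const int64_t Min = std::numeric_limits<int64_t>::min();
  while (V->Base) {
    int64_t Step = 0;
    if (V->HasConstIndex)
      Step = V->ConstIndex;
    else if (!ExternalAnalysis || !ExternalAnalysis(V->VarId, Step))
      return V; // nothing can bound this index: stop here
    // Bail out without touching Offset if the sum would overflow.
    if ((Step > 0 && Offset > Max - Step) ||
        (Step < 0 && Offset < Min - Step))
      return V;
    Offset += Step;
    V = V->Base;
  }
  return V;
}
```

Without a callback the walk stops at the first non-constant index; with one it can continue to the underlying object.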

llvm/lib/Transforms/IPO/AttributorAttributes.cpp


Show First 20 Lines • Show All 343 Lines • ▼ Show 20 Lines	static bool genericValueTraversal(
// If we actually used liveness information so we have to record a dependence.		// If we actually used liveness information so we have to record a dependence.
if (AnyDead)		if (AnyDead)
A.recordDependence(*LivenessAA, QueryingAA, DepClassTy::OPTIONAL);		A.recordDependence(*LivenessAA, QueryingAA, DepClassTy::OPTIONAL);

// All values have been visited.		// All values have been visited.
return true;		return true;
}		}

		const Value *stripAndAccumulateMinimalOffsets(
		Attributor &A, const AbstractAttribute &QueryingAA, const Value *Val,
		const DataLayout &DL, APInt &Offset, bool AllowNonInbounds,
		bool UseAssumed = false) {

		auto AttributorAnalysis = [&](Value &V, APInt &ROffset) -> bool {
		const IRPosition &Pos = IRPosition::value(V);
		// Only track dependence if we are going to use the assumed info.
		const AAValueConstantRange &ValueConstantRangeAA =
		A.getAAFor<AAValueConstantRange>(QueryingAA, Pos,
		/* TrackDependence */ UseAssumed);
		ConstantRange Range = UseAssumed ? ValueConstantRangeAA.getAssumed()
		: ValueConstantRangeAA.getKnown();
		// We can only use the lower part of the range because the upper part can
		// be higher than what the value can really be.
		ROffset = Range.getSignedMin();
		return true;
		};

		return Val->stripAndAccumulateConstantOffsets(DL, Offset, AllowNonInbounds,
		AttributorAnalysis);
		}

		static const Value *getMinimalBaseOfAccsesPointerOperand(
		Attributor &A, const AbstractAttribute &QueryingAA, const Instruction *I,
		int64_t &BytesOffset, const DataLayout &DL, bool AllowNonInbounds = false) {
		const Value *Ptr = getPointerOperand(I, /* AllowVolatile */ false);
		if (!Ptr)
		return nullptr;
		APInt OffsetAPInt(DL.getIndexTypeSizeInBits(Ptr->getType()), 0);
		const Value *Base = stripAndAccumulateMinimalOffsets(
		A, QueryingAA, Ptr, DL, OffsetAPInt, AllowNonInbounds);

		BytesOffset = OffsetAPInt.getSExtValue();
		return Base;
		}
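
The comment in `AttributorAnalysis` says only the lower end of the constant range is usable. A small sketch of why, under the assumption that the index range is a closed signed interval: the lower end, scaled by the element size, is an offset every execution reaches, while the upper end may never be reached. `SignedRange`, `minimalByteOffset` and `isLowerBoundForAll` are hypothetical helpers, not Attributor API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical closed signed range [Min, Max] for a GEP index.
struct SignedRange {
  int64_t Min;
  int64_t Max;
};

// Byte offset that every possible index value reaches or exceeds.
int64_t minimalByteOffset(const SignedRange &Index, int64_t ElemSize) {
  assert(Index.Min <= Index.Max && ElemSize > 0);
  return Index.Min * ElemSize;
}

// True if Claimed is a valid lower bound for every index in the range.
// Intended for small ranges only; it enumerates each value.
bool isLowerBoundForAll(const SignedRange &Index, int64_t ElemSize,
                        int64_t Claimed) {
  for (int64_t I = Index.Min; I <= Index.Max; ++I)
    if (I * ElemSize < Claimed)
      return false;
  return true;
}
```

For the `[10, 99]` range mentioned in the test comments and a 4-byte element, the minimal offset is 40, whereas an offset derived from 99 would not hold for the other values.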

static const Value *		static const Value *
getBasePointerOfAccessPointerOperand(const Instruction *I, int64_t &BytesOffset,		getBasePointerOfAccessPointerOperand(const Instruction *I, int64_t &BytesOffset,
const DataLayout &DL,		const DataLayout &DL,
bool AllowNonInbounds = false) {		bool AllowNonInbounds = false) {
const Value *Ptr = getPointerOperand(I, /* AllowVolatile */ false);		const Value *Ptr = getPointerOperand(I, /* AllowVolatile */ false);
if (!Ptr)		if (!Ptr)
return nullptr;		return nullptr;

▲ Show 20 Lines • Show All 1,221 Lines • ▼ Show 20 Lines	static int64_t getKnownNonNullAndDerefBytesForUse(

// We need to follow common pointer manipulation uses to the accesses they		// We need to follow common pointer manipulation uses to the accesses they
// feed into. We can try to be smart to avoid looking through things we do not		// feed into. We can try to be smart to avoid looking through things we do not
// like for now, e.g., non-inbounds GEPs.		// like for now, e.g., non-inbounds GEPs.
if (isa<CastInst>(I)) {		if (isa<CastInst>(I)) {
TrackUse = true;		TrackUse = true;
return 0;		return 0;
}		}
if (auto *GEP = dyn_cast<GetElementPtrInst>(I))
if (GEP->hasAllConstantIndices()) {		if (isa<GetElementPtrInst>(I)) {
TrackUse = true;		TrackUse = true;
return 0;		return 0;
}		}

int64_t Offset;		int64_t Offset;
if (const Value *Base = getBasePointerOfAccessPointerOperand(I, Offset, DL)) {		const Value *Base =
		getMinimalBaseOfAccsesPointerOperand(A, QueryingAA, I, Offset, DL);
		if (Base) {
if (Base == &AssociatedValue &&		if (Base == &AssociatedValue &&
getPointerOperand(I, /* AllowVolatile */ false) == UseV) {		getPointerOperand(I, /* AllowVolatile */ false) == UseV) {
int64_t DerefBytes =		int64_t DerefBytes =
(int64_t)DL.getTypeStoreSize(PtrTy->getPointerElementType()) + Offset;		(int64_t)DL.getTypeStoreSize(PtrTy->getPointerElementType()) + Offset;

IsNonNull |= !NullPointerIsDefined;		IsNonNull |= !NullPointerIsDefined;
return std::max(int64_t(0), DerefBytes);		return std::max(int64_t(0), DerefBytes);
}		}
}		}

/// Corner case when an offset is 0.		/// Corner case when an offset is 0.
if (const Value *Base = getBasePointerOfAccessPointerOperand(		Base = getBasePointerOfAccessPointerOperand(I, Offset, DL,
I, Offset, DL, /*AllowNonInbounds*/ true)) {		/*AllowNonInbounds*/ true);
		if (Base) {
if (Offset == 0 && Base == &AssociatedValue &&		if (Offset == 0 && Base == &AssociatedValue &&
getPointerOperand(I, /* AllowVolatile */ false) == UseV) {		getPointerOperand(I, /* AllowVolatile */ false) == UseV) {
int64_t DerefBytes =		int64_t DerefBytes =
(int64_t)DL.getTypeStoreSize(PtrTy->getPointerElementType());		(int64_t)DL.getTypeStoreSize(PtrTy->getPointerElementType());
IsNonNull |= !NullPointerIsDefined;		IsNonNull |= !NullPointerIsDefined;
return std::max(int64_t(0), DerefBytes);		return std::max(int64_t(0), DerefBytes);
}		}
}		}
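
The two branches above reduce to a small piece of arithmetic: with an all-inbounds chain the dereferenceable bytes are the store size plus the accumulated offset, clamped at zero, and otherwise only an offset of exactly 0 is used. A sketch of that arithmetic, assuming a hypothetical `derefBytesForUse` helper and ignoring the `Base == &AssociatedValue` checks:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical condensation of the two branches; not the Attributor code.
int64_t derefBytesForUse(bool AllInBounds, int64_t StoreSize, int64_t Offset) {
  assert(StoreSize >= 0 && "store sizes are non-negative");
  if (AllInBounds)
    return std::max<int64_t>(0, StoreSize + Offset);
  // Corner case: a non-inbounds chain only counts when the offset is 0.
  if (Offset == 0)
    return StoreSize;
  return 0;
}
```

For example, a 1-byte store at a minimal offset of 20 would yield 21 bytes under this sketch.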
▲ Show 20 Lines • Show All 1,688 Lines • ▼ Show 20 Lines	struct AADereferenceableImpl : AADereferenceable {

/// See followUsesInMBEC		/// See followUsesInMBEC
bool followUseInMBEC(Attributor &A, const Use *U, const Instruction *I,		bool followUseInMBEC(Attributor &A, const Use *U, const Instruction *I,
AADereferenceable::StateType &State) {		AADereferenceable::StateType &State) {
bool IsNonNull = false;		bool IsNonNull = false;
bool TrackUse = false;		bool TrackUse = false;
int64_t DerefBytes = getKnownNonNullAndDerefBytesForUse(		int64_t DerefBytes = getKnownNonNullAndDerefBytesForUse(
A, *this, getAssociatedValue(), U, I, IsNonNull, TrackUse);		A, *this, getAssociatedValue(), U, I, IsNonNull, TrackUse);
		LLVM_DEBUG(dbgs() << "[AADereferenceable] Deref bytes: " << DerefBytes
		<< " for instruction " << *I << "\n");

addAccessedBytesForUse(A, U, I, State);		addAccessedBytesForUse(A, U, I, State);
State.takeKnownDerefBytesMaximum(DerefBytes);		State.takeKnownDerefBytesMaximum(DerefBytes);
return TrackUse;		return TrackUse;
}		}

/// See AbstractAttribute::manifest(...).		/// See AbstractAttribute::manifest(...).
ChangeStatus manifest(Attributor &A) override {		ChangeStatus manifest(Attributor &A) override {
Show All 32 Lines
struct AADereferenceableFloating : AADereferenceableImpl {		struct AADereferenceableFloating : AADereferenceableImpl {
AADereferenceableFloating(const IRPosition &IRP, Attributor &A)		AADereferenceableFloating(const IRPosition &IRP, Attributor &A)
: AADereferenceableImpl(IRP, A) {}		: AADereferenceableImpl(IRP, A) {}

/// See AbstractAttribute::updateImpl(...).		/// See AbstractAttribute::updateImpl(...).
ChangeStatus updateImpl(Attributor &A) override {		ChangeStatus updateImpl(Attributor &A) override {
const DataLayout &DL = A.getDataLayout();		const DataLayout &DL = A.getDataLayout();

auto VisitValueCB = [&](Value &V, const Instruction *, DerefState &T,		auto VisitValueCB = [&](const Value &V, const Instruction *, DerefState &T,
bool Stripped) -> bool {		bool Stripped) -> bool {
unsigned IdxWidth =		unsigned IdxWidth =
DL.getIndexSizeInBits(V.getType()->getPointerAddressSpace());		DL.getIndexSizeInBits(V.getType()->getPointerAddressSpace());
APInt Offset(IdxWidth, 0);		APInt Offset(IdxWidth, 0);
const Value *Base =		const Value *Base =
V.stripAndAccumulateInBoundsConstantOffsets(DL, Offset);		stripAndAccumulateMinimalOffsets(A, *this, &V, DL, Offset, false);

const auto &AA =		const auto &AA =
A.getAAFor<AADereferenceable>(this, IRPosition::value(Base));		A.getAAFor<AADereferenceable>(this, IRPosition::value(Base));
int64_t DerefBytes = 0;		int64_t DerefBytes = 0;
if (!Stripped && this == &AA) {		if (!Stripped && this == &AA) {
// Use IR information if we did not strip anything.		// Use IR information if we did not strip anything.
// TODO: track globally.		// TODO: track globally.
bool CanBeNull;		bool CanBeNull;
DerefBytes = Base->getPointerDereferenceableBytes(DL, CanBeNull);		DerefBytes = Base->getPointerDereferenceableBytes(DL, CanBeNull);
T.GlobalState.indicatePessimisticFixpoint();		T.GlobalState.indicatePessimisticFixpoint();
} else {		} else {
const DerefState &DS = static_cast<const DerefState &>(AA.getState());		const DerefState &DS = static_cast<const DerefState &>(AA.getState());
DerefBytes = DS.DerefBytesState.getAssumed();		DerefBytes = DS.DerefBytesState.getAssumed();
T.GlobalState &= DS.GlobalState;		T.GlobalState &= DS.GlobalState;
}		}

// TODO: Use `AAConstantRange` to infer dereferenceable bytes.

// For now we do not try to "increase" dereferenceability due to negative		// For now we do not try to "increase" dereferenceability due to negative
// indices as we first have to come up with code to deal with loops and		// indices as we first have to come up with code to deal with loops and
// for overflows of the dereferenceable bytes.		// for overflows of the dereferenceable bytes.
int64_t OffsetSExt = Offset.getSExtValue();		int64_t OffsetSExt = Offset.getSExtValue();
if (OffsetSExt < 0)		if (OffsetSExt < 0)
OffsetSExt = 0;		OffsetSExt = 0;

▲ Show 20 Lines • Show All 3,773 Lines • Show Last 20 Lines

llvm/test/Transforms/Attributor/dereferenceable-1.ll

	Show First 20 Lines • Show All 205 Lines • ▼ Show 20 Lines
	; CHECK-NEXT: br i1 [[C]], label [[IF_TRUE:%.*]], label [[IF_FALSE:%.*]]			; CHECK-NEXT: br i1 [[C]], label [[IF_TRUE:%.*]], label [[IF_FALSE:%.*]]
	; CHECK: if.true:			; CHECK: if.true:
	; CHECK-NEXT: [[C:%.*]] = tail call i32 @unkown_f(i32* nonnull align 4 dereferenceable(8) [[PTR]])			; CHECK-NEXT: [[C:%.*]] = tail call i32 @unkown_f(i32* nonnull align 4 dereferenceable(8) [[PTR]])
	; CHECK-NEXT: [[D:%.*]] = tail call i32 @unkown_f(i32* nonnull align 4 dereferenceable(8) [[PTR]])			; CHECK-NEXT: [[D:%.*]] = tail call i32 @unkown_f(i32* nonnull align 4 dereferenceable(8) [[PTR]])
	; CHECK-NEXT: [[E:%.*]] = tail call i32 @unkown_f(i32* nonnull align 4 dereferenceable(8) [[PTR]])			; CHECK-NEXT: [[E:%.*]] = tail call i32 @unkown_f(i32* nonnull align 4 dereferenceable(8) [[PTR]])
	; CHECK-NEXT: ret void			; CHECK-NEXT: ret void
	; CHECK: if.false:			; CHECK: if.false:
	; CHECK-NEXT: ret void			; CHECK-NEXT: ret void
	;			;
				uenoku (Done): I think optimally %ptr is dereferenceable(30). Please add a FIXME here.
				kuter (Author, Done): OK, I will. For us to be able to deduce a 30, there would have to be a separate mechanism that deduces the biggest number that the "Value" in question is going to be guaranteed to take. Perhaps we could use SCEV?
				uenoku (Not Done):
				> For us to be able to deduce a 30, there would have to be a separate mechanism that deduces the biggest number that the "Value" in question is going to be guaranteed to take.
				Agreed.
				> Perhaps we could use SCEV?
				I think so. What I thought once is: assume that a loop is in the context and it is guaranteed to proceed to the last iteration; then we can use the biggest number the value takes. For an inbounds GEP, it is allowed to take that number as dereferenceable bytes, and for a non-inbounds GEP, if the SCEV of the value is something like <0, +, 1>, we can take the biggest number as dereferenceable bytes.
	%A = tail call i32 @unkown_f(i32* %ptr)			%A = tail call i32 @unkown_f(i32* %ptr)
	%ptr.0 = load i32, i32* %ptr			%ptr.0 = load i32, i32* %ptr
	; deref 4 hold			; deref 4 hold
	; FIXME: this should be %B = tail call i32 @unkown_f(i32* nonnull dereferenceable(4) %ptr)			; FIXME: this should be %B = tail call i32 @unkown_f(i32* nonnull dereferenceable(4) %ptr)
	%B = tail call i32 @unkown_f(i32* dereferenceable(1) %ptr)			%B = tail call i32 @unkown_f(i32* dereferenceable(1) %ptr)
	br i1%c, label %if.true, label %if.false			br i1%c, label %if.true, label %if.false
	if.true:			if.true:
	%C = tail call i32 @unkown_f(i32* %ptr)			%C = tail call i32 @unkown_f(i32* %ptr)
	%D = tail call i32 @unkown_f(i32* dereferenceable(8) %ptr)			%D = tail call i32 @unkown_f(i32* dereferenceable(8) %ptr)
	%E = tail call i32 @unkown_f(i32* %ptr)			%E = tail call i32 @unkown_f(i32* %ptr)
	ret void			ret void
	if.false:			if.false:
	ret void			ret void
	}			}

	define void @f7_2(i1 %c) {			define void @f7_2(i1 %c) {
	; CHECK-LABEL: define {{[^@]+}}@f7_2			; CHECK-LABEL: define {{[^@]+}}@f7_2
	; CHECK-SAME: (i1 [[C:%.*]])			; CHECK-SAME: (i1 [[C:%.*]])
	; CHECK-NEXT: [[PTR:%.*]] = tail call nonnull align 4 dereferenceable(4) i32* @unkown_ptr()			; CHECK-NEXT: [[PTR:%.*]] = tail call nonnull align 4 dereferenceable(4) i32* @unkown_ptr()
	; CHECK-NEXT: [[A:%.*]] = tail call i32 @unkown_f(i32* nonnull align 4 dereferenceable(4) [[PTR]])			; CHECK-NEXT: [[A:%.*]] = tail call i32 @unkown_f(i32* nonnull align 4 dereferenceable(4) [[PTR]])
				jdoerfert (Not Done): Isn't this the `fill_range` code?
				kuter (Author, Done): No, it is not. Currently this patch does not work with `for` loops. This is because with `for` loops AAFromMustBeExecutedContext looks at the branch at the top, calls followUse() on both of the successors, and "and"s the results together. This is probably the reason why only a single test is affected by this patch. I will address this issue with a separate patch that passes LoopInfo, DominatorTree and PostDominatorTree to the MustBeExecutedContextExplorer.
				jdoerfert (Not Done): Right. That reminds me of D64974, though I'm unsure if we still need it. What we need is to make the explorer aware of `CanProveNotTakenFirstIteration`. Similar to the existing use of that function, we can go to the non-exit block from a header in `getMustBeExecutedNextInstruction` (or `findForwardJoinPoint`) if we know that edge is taken at least once. To not lose the code after the loop we should add a stack of unexplored edges. The exit edge goes there if the loop is known not to be endless (see `findForwardJoinPoint`). If we are out of forward instructions we can pop an edge from the stack and continue. Alternatively, we could check if the loop was not endless when we visit the header for the second time.
	; CHECK-NEXT: [[B:%.*]] = tail call i32 @unkown_f(i32* nonnull align 4 dereferenceable(4) [[PTR]])			; CHECK-NEXT: [[B:%.*]] = tail call i32 @unkown_f(i32* nonnull align 4 dereferenceable(4) [[PTR]])
	; CHECK-NEXT: br i1 [[C]], label [[IF_TRUE:%.*]], label [[IF_FALSE:%.*]]			; CHECK-NEXT: br i1 [[C]], label [[IF_TRUE:%.*]], label [[IF_FALSE:%.*]]
	; CHECK: if.true:			; CHECK: if.true:
	; CHECK-NEXT: [[C:%.*]] = tail call i32 @unkown_f(i32* nonnull align 4 dereferenceable(8) [[PTR]])			; CHECK-NEXT: [[C:%.*]] = tail call i32 @unkown_f(i32* nonnull align 4 dereferenceable(8) [[PTR]])
	; CHECK-NEXT: [[D:%.*]] = tail call i32 @unkown_f(i32* nonnull align 4 dereferenceable(8) [[PTR]])			; CHECK-NEXT: [[D:%.*]] = tail call i32 @unkown_f(i32* nonnull align 4 dereferenceable(8) [[PTR]])
	; CHECK-NEXT: [[E:%.*]] = tail call i32 @unkown_f(i32* nonnull align 4 dereferenceable(8) [[PTR]])			; CHECK-NEXT: [[E:%.*]] = tail call i32 @unkown_f(i32* nonnull align 4 dereferenceable(8) [[PTR]])
	; CHECK-NEXT: ret void			; CHECK-NEXT: ret void
	; CHECK: if.false:			; CHECK: if.false:
	▲ Show 20 Lines • Show All 50 Lines • ▼ Show 20 Lines

	; TEST 8			; TEST 8
	; Use Constant range in dereferenceable			; Use Constant range in dereferenceable
	; void g(int *p, long long int *range){			; void g(int *p, long long int *range){
	; int r = *range ; // [10, 99]			; int r = *range ; // [10, 99]
	; fill_range(p, *range);			; fill_range(p, *range);
	; }			; }

				; FIXME: %ptr should be dereferenceable(31)
				define void @test8(i8* %ptr) #0 {
				br label %1
				1: ; preds = %5, %0
				%i.0 = phi i32 [ 20, %0 ], [ %4, %5 ]
				%2 = sext i32 %i.0 to i64
				%3 = getelementptr inbounds i8, i8* %ptr, i64 %2
				store i8 32, i8* %3, align 1
				%4 = add nsw i32 %i.0, 1
				br label %5
				5: ; preds = %1
				%6 = icmp slt i32 %4, 30
				br i1 %6, label %1, label %7

				7: ; preds = %5
				ret void
				}

				; 8.2 (negative case)
				define void @test8_neg(i32 %i, i8* %ptr) #0 {
				%1 = sext i32 %i to i64
				%2 = getelementptr inbounds i8, i8* %ptr, i64 %1
				store i8 65, i8* %2, align 1
				ret void
				}
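
As a sanity check of the arithmetic behind `@test8`, the sketch below replays the loop's control flow in plain C++ and records the smallest and largest index stored to. It assumes each store is one byte through an inbounds GEP, so a store at index `Idx` implies `Idx + 1` bytes from the base. `IndexSpan`, `replayTest8` and `bytesImplied` are hypothetical names. Under that reading the lower-bound index gives 21 bytes and the largest index gives 30, the value suggested in the review comment on this file; this says nothing about what the Attributor actually emits.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

struct IndexSpan {
  int64_t Min;
  int64_t Max;
};

// Replays @test8: i starts at 20, the store runs, then i is incremented
// and the loop repeats while the incremented value is below 30.
IndexSpan replayTest8() {
  IndexSpan Span{20, 20};
  int64_t I = 20;
  do {
    Span.Min = std::min(Span.Min, I);
    Span.Max = std::max(Span.Max, I);
    ++I;
  } while (I < 30);
  return Span;
}

// A one-byte inbounds store at index Idx implies Idx + 1 bytes from the base.
int64_t bytesImplied(int64_t Idx) {
  assert(Idx >= 0 && "sketch only covers non-negative indices");
  return Idx + 1;
}
```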

	; void fill_range(int* p, long long int start){			; void fill_range(int* p, long long int start){
	; for(long long int i = start;i<start+10;i++){			; for(long long int i = start;i<start+10;i++){
	; // If p[i] is inbounds, p is dereferenceable(40) at least.			; // If p[i] is inbounds, p is dereferenceable(40) at least.
	; p[i] = i;			; p[i] = i;
	; }			; }
	; }			; }

	; NOTE: %p should not be dereferenceable			; NOTE: %p should not be dereferenceable
	▲ Show 20 Lines • Show All 413 Lines • Show Last 20 Lines

llvm/test/Transforms/Attributor/willreturn.ll

Show First 20 Lines • Show All 420 Lines • ▼ Show 20 Lines
; ans += p[i];		; ans += p[i];
; }		; }
; return ans;		; return ans;
; }		; }

; IS__TUNIT____: Function Attrs: argmemonly nofree noinline nosync nounwind readonly uwtable		; IS__TUNIT____: Function Attrs: argmemonly nofree noinline nosync nounwind readonly uwtable
; IS__CGSCC____: Function Attrs: argmemonly nofree noinline norecurse nosync nounwind readonly uwtable		; IS__CGSCC____: Function Attrs: argmemonly nofree noinline norecurse nosync nounwind readonly uwtable
define i32 @loop_constant_trip_count(i32* nocapture readonly %0) #0 {		define i32 @loop_constant_trip_count(i32* nocapture readonly %0) #0 {
; CHECK-LABEL: define {{[^@]+}}@loop_constant_trip_count		; IS________OPM-LABEL: define {{[^@]+}}@loop_constant_trip_count
; CHECK-SAME: (i32* nocapture nofree readonly [[TMP0:%.*]])		; IS________OPM-SAME: (i32* nocapture nofree readonly [[TMP0:%.*]])
; CHECK-NEXT: br label [[TMP3:%.*]]		; IS________OPM-NEXT: br label [[TMP3:%.*]]
; CHECK: 2:		; IS________OPM: 2:
; CHECK-NEXT: ret i32 [[TMP8:%.*]]		; IS________OPM-NEXT: ret i32 [[TMP8:%.*]]
; CHECK: 3:		; IS________OPM: 3:
; CHECK-NEXT: [[TMP4:%.*]] = phi i64 [ 0, [[TMP1:%.*]] ], [ [[TMP9:%.*]], [[TMP3]] ]		; IS________OPM-NEXT: [[TMP4:%.*]] = phi i64 [ 0, [[TMP1:%.*]] ], [ [[TMP9:%.*]], [[TMP3]] ]
; CHECK-NEXT: [[TMP5:%.*]] = phi i32 [ 0, [[TMP1]] ], [ [[TMP8]], [[TMP3]] ]		; IS________OPM-NEXT: [[TMP5:%.*]] = phi i32 [ 0, [[TMP1]] ], [ [[TMP8]], [[TMP3]] ]
; CHECK-NEXT: [[TMP6:%.*]] = getelementptr inbounds i32, i32* [[TMP0]], i64 [[TMP4]]		; IS________OPM-NEXT: [[TMP6:%.*]] = getelementptr inbounds i32, i32* [[TMP0]], i64 [[TMP4]]
; CHECK-NEXT: [[TMP7:%.*]] = load i32, i32* [[TMP6]], align 4		; IS________OPM-NEXT: [[TMP7:%.*]] = load i32, i32* [[TMP6]], align 4
; CHECK-NEXT: [[TMP8]] = add nsw i32 [[TMP7]], [[TMP5]]		; IS________OPM-NEXT: [[TMP8]] = add nsw i32 [[TMP7]], [[TMP5]]
; CHECK-NEXT: [[TMP9]] = add nuw nsw i64 [[TMP4]], 1		; IS________OPM-NEXT: [[TMP9]] = add nuw nsw i64 [[TMP4]], 1
; CHECK-NEXT: [[TMP10:%.*]] = icmp eq i64 [[TMP9]], 10		; IS________OPM-NEXT: [[TMP10:%.*]] = icmp eq i64 [[TMP9]], 10
; CHECK-NEXT: br i1 [[TMP10]], label [[TMP2:%.*]], label [[TMP3]]		; IS________OPM-NEXT: br i1 [[TMP10]], label [[TMP2:%.*]], label [[TMP3]]
		;
		; IS________NPM-LABEL: define {{[^@]+}}@loop_constant_trip_count
		; IS________NPM-SAME: (i32* nocapture nofree nonnull readonly dereferenceable(4) [[TMP0:%.*]])
		; IS________NPM-NEXT: br label [[TMP3:%.*]]
		; IS________NPM: 2:
		; IS________NPM-NEXT: ret i32 [[TMP8:%.*]]
		; IS________NPM: 3:
		; IS________NPM-NEXT: [[TMP4:%.*]] = phi i64 [ 0, [[TMP1:%.*]] ], [ [[TMP9:%.*]], [[TMP3]] ]
		; IS________NPM-NEXT: [[TMP5:%.*]] = phi i32 [ 0, [[TMP1]] ], [ [[TMP8]], [[TMP3]] ]
		; IS________NPM-NEXT: [[TMP6:%.*]] = getelementptr inbounds i32, i32* [[TMP0]], i64 [[TMP4]]
		; IS________NPM-NEXT: [[TMP7:%.*]] = load i32, i32* [[TMP6]], align 4
		; IS________NPM-NEXT: [[TMP8]] = add nsw i32 [[TMP7]], [[TMP5]]
		; IS________NPM-NEXT: [[TMP9]] = add nuw nsw i64 [[TMP4]], 1
		; IS________NPM-NEXT: [[TMP10:%.*]] = icmp eq i64 [[TMP9]], 10
		; IS________NPM-NEXT: br i1 [[TMP10]], label [[TMP2:%.*]], label [[TMP3]]
;		;
br label %3		br label %3

; <label>:2: ; preds = %3		; <label>:2: ; preds = %3
ret i32 %8		ret i32 %8

; <label>:3: ; preds = %3, %1		; <label>:3: ; preds = %3, %1
%4 = phi i64 [ 0, %1 ], [ %9, %3 ]		%4 = phi i64 [ 0, %1 ], [ %9, %3 ]
▲ Show 20 Lines • Show All 54 Lines • ▼ Show 20 Lines	; <label>:8: ; preds = %4, %8
%13 = load i32, i32* %12, align 4		%13 = load i32, i32* %12, align 4
%14 = add nsw i32 %13, %10		%14 = add nsw i32 %13, %10
%15 = add i32 %9, %3		%15 = add i32 %9, %3
%16 = icmp eq i32 %15, %1		%16 = icmp eq i32 %15, %1
br i1 %16, label %6, label %8		br i1 %16, label %6, label %8
}		}


; TEST 13 (positive case)		; TEST 13 (positive case)
uenoku (Not Done): Why can't we get dereferenceable(4) for `p` in this case?
kuter (Author, Done): OK, for this example specifically, the variable `n` can be negative, so `ans += p[n]` may never be executed. But in general, this patch doesn't work with loops that are not `do {} while (cond)`. This is because `AAFromMustBeExecutedContext` is not aware when the first iteration is always going to be run. This needs to be addressed with a separate patch and would probably improve many other deductions as well.
baziotis (Not Done): IIUC, loop rotation (https://llvm.org/docs/LoopTerminology.html#rotated-loops) can help here because it provides this guarantee.
uenoku (Not Done): Oh, I thought `%n` is assumed to be positive :) Thanks.
; Function Attrs: norecurse nounwind readonly uwtable		; Function Attrs: norecurse nounwind readonly uwtable
; int loop_trip_dec(int n, int *p){		; int loop_trip_dec(int n, int *p){
; int ans = 0;		; int ans = 0;
; for(;n >= 0;n--){		; for(;n >= 0;n--){
; ans += p[n];		; ans += p[n];
; }		; }
; return ans;		; return ans;
; }		; }
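
The point made in the thread can be seen by comparing the `for` form of `loop_trip_dec` with a rotated form: a guard followed by a `do`/`while`. In the `for` form the body may run zero times, so nothing about `p` follows from the loop alone; in the rotated form the body is known to run once the guard has passed. A minimal sketch counting body executions, with hypothetical function names:

```cpp
#include <cassert>

// Counts body executions of the original loop: for (; n >= 0; n--).
int bodyRunsFor(int n) {
  int Runs = 0;
  for (; n >= 0; n--)
    ++Runs;
  return Runs;
}

// Rotated form: a guard, then a do/while whose body always runs at least
// once after the guard has been passed.
int bodyRunsRotated(int n) {
  int Runs = 0;
  if (n >= 0) {
    do {
      assert(n >= 0 && "the index is non-negative on every body entry");
      ++Runs;
      n--;
    } while (n >= 0);
  }
  return Runs;
}
```

Both forms execute the body the same number of times for any `n`; only the rotated form exposes the "first iteration always runs" fact structurally.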
▲ Show 20 Lines • Show All 594 Lines • Show Last 20 Lines


[Attributor] Use AAValueConstantRange to infer dereferencability. (Closed, Public)

Diff 263867

llvm/include/llvm/IR/Operator.h

llvm/include/llvm/IR/Value.h

llvm/lib/IR/Operator.cpp

llvm/lib/IR/Value.cpp

llvm/lib/Transforms/IPO/AttributorAttributes.cpp

llvm/test/Transforms/Attributor/dereferenceable-1.ll

llvm/test/Transforms/Attributor/willreturn.ll
