This is an archive of the discontinued LLVM Phabricator instance.

llvm/lib/Transforms/IPO/Attributor.cpp
616	Isn't this almost the `stripAndAccumulate` code? How hard would it be to pass some kind of `AttributorInfo` object to that function (optionally) which triggers the lookup. Or more generically, a callback that takes a `Value` and returns a lower bound offset (plus indicates success/failure).
llvm/test/Transforms/Attributor/dereferenceable-1.ll
234	Isn't this the `fill_range` code?
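
The callback idea from the Attributor.cpp:616 comment can be sketched without any LLVM types. Everything below (`Operand`, `LowerBoundFn`, `accumulateOffset`) is a hypothetical stand-in, not the API this patch adds; it only illustrates an optional callback that supplies a lower bound for a non-constant index and reports success or failure.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical stand-in for one index operand of an address computation.
struct Operand {
  bool IsConstant;  // true if Constant holds a compile-time value
  int64_t Constant; // only meaningful when IsConstant is true
  int Id;           // identifies a non-constant value for the lookup
};

// Callback: writes a lower bound for Op into Bound and returns true, or
// returns false if nothing is known about Op.
using LowerBoundFn = std::function<bool(const Operand &Op, int64_t &Bound)>;

// Adds all indices to Offset. Non-constant indices are resolved through the
// optional callback; without one, or if it fails, the function returns false
// and Offset must be treated as garbage.
bool accumulateOffset(const std::vector<Operand> &Indices, int64_t &Offset,
                      const LowerBoundFn &Lookup = nullptr) {
  for (const Operand &Op : Indices) {
    int64_t Index = 0;
    if (Op.IsConstant)
      Index = Op.Constant;
    else if (!Lookup || !Lookup(Op, Index))
      return false;
    Offset += Index;
  }
  return true;
}

// Usage: one constant index and one index known only through the callback.
inline void accumulateOffsetExample() {
  std::vector<Operand> Indices = {{true, 4, 0}, {false, 0, 7}};
  int64_t Offset = 0;
  assert(!accumulateOffset(Indices, Offset)); // no callback: gives up

  Offset = 0;
  auto Lookup = [](const Operand &Op, int64_t &Bound) {
    if (Op.Id != 7)
      return false;
    Bound = 10; // pretend an analysis proved "value >= 10"
    return true;
  };
  assert(accumulateOffset(Indices, Offset, Lookup) && Offset == 14);
}
```

Defaulting the callback to "none" keeps existing callers unchanged, which is the point of making it optional.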

Thank you for working on this!!

As a high-level comment, please upload the diff with full context (`diff -U99999`).

llvm/lib/Transforms/IPO/Attributor.cpp
511	Add comment here
llvm/test/Transforms/Attributor/dereferenceable-1.ll
214	I think optimally `%ptr` is dereferenceable(30). Please add a FIXME here.

uenoku added a reviewer: baziotis. Mar 15 2020, 10:15 PM

Made the changes @jdoerfert asked for.

@jdoerfert

This patch is not complete yet.

Currently it can only use the known information. I would like to make it possible for it to use the assumed information as well.
The problem with that is:

We need to iterate over the uses and take the maximum of the current dereferenceability that is indicated by each use.
This works fine for the known info. But since the lower bound of the assumed constant range can decrease over time,
taking the max with the existing state is a problem.

The way that the AAFromMustBeExecutedContext is wired makes things a bit tricky and
I would like to get your opinions on this before I proceed.
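
The problem described above can be illustrated with a small model. This is a hypothetical sketch, not Attributor code: `bytesFromLowerBound` and `foldWithMax` are made-up names, and each vector entry stands for the assumed lower bound of an index in one fixpoint iteration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Bytes implied by a one-byte access at an index that is at least LowerBound.
uint64_t bytesFromLowerBound(uint64_t LowerBound) {
  assert(LowerBound < UINT64_MAX && "LowerBound + 1 must not wrap");
  return LowerBound + 1;
}

// Folds a sequence of lower bounds, one per fixpoint iteration, into a single
// state by taking the maximum. This is sound only if the bounds never shrink.
uint64_t foldWithMax(const std::vector<uint64_t> &LowerBoundPerIteration) {
  uint64_t State = 0;
  for (uint64_t LowerBound : LowerBoundPerIteration)
    State = std::max(State, bytesFromLowerBound(LowerBound));
  return State;
}
```

If the assumed lower bound is 10 in one iteration and drops to 5 in the next, `foldWithMax` keeps 11 bytes even though the final bound only justifies 6. Known bounds never weaken, so the same fold is fine for them.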

bbn added a subscriber: bbn. Mar 19 2020, 6:53 AM

Can you explain why we need to catch overflows now but not before? I mean, the values determined by the external analysis are valid lower bounds; what is different from versioning on them and making them constant in one of the versions?

In D76208#1931081, @kuter wrote:

@jdoerfert

This patch is not complete yet.

Currently it can only use the known information. I would like to make it possible for it to use the assumed information as well.
The problem with that is:

We need to iterate over the uses and take the maximum of the current dereferenceability that is indicated by each use.
This works fine for the known info. But since the lower bound of the assumed constant range can decrease over time,
taking the max with the existing state is a problem.

The way that the AAFromMustBeExecutedContext is wired makes things a bit tricky and
I would like to get your opinions on this before I proceed.

I think we should finish this one and then do the following:

We actually don't want the minimal offset but the maximal known offset, right? In addition to renaming the function we will check if the range from AAConstantRange is known, if so, we take the maximum instead.

llvm/include/llvm/IR/Operator.h
563	`= nullptr` is not sufficient? unfortunate. Nit: -`value` +`Offset`, or remove the names. Please describe what external analysis exactly does here. I like the solution of adding this callback though. We need more of these soon.
llvm/lib/IR/Value.cpp
644	Put the entire thing in an `if (ExternalAnalysis)` please. Or better: `!ExternalAnalysis` then old code, else this code. TBH, I'm not sure why we don't bail on overflow in the old code as well; maybe we should. Can you check if that would make any test fail?
llvm/lib/Transforms/IPO/Attributor.cpp
552	Style: `UseAssumed`, `Range`, `Value`, etc.

kuter marked 3 inline comments as done. Mar 19 2020, 9:36 PM

kuter added inline comments.

llvm/test/Transforms/Attributor/dereferenceable-1.ll
214	OK, I will. For us to be able to deduce a 30, there would have to be a separate mechanism that deduces the biggest number that the "Value" in question is guaranteed to take. Perhaps we could use SCEV?
234	No, it is not. Currently this patch does not work with for loops. This is because, with for loops, AAFromMustBeExecutedContext looks at the branch at the top, calls followUse() on both successors, and "and"s them together. This is probably the reason why only a single test is affected by this patch. I will address this issue with a separate patch that passes LoopInfo, DominatorTree and the PostDominatorTree to the MustBeExecutedContextExplorer.

In D76208#1932759, @jdoerfert wrote:

Can you explain why we need to catch overflows now but not before? I mean, the values determined by the external analysis are valid lower bounds; what is different from versioning on them and making them constant in one of the versions?

I think we should finish this one and then do the following:

We actually don't want the minimal offset but the maximal known offset, right? In addition to renaming the function we will check if the range from AAConstantRange is known, if so, we take the maximum instead.

static void test(int index, char *ptrArg) { // internal linkage
  ptrArg[index] = 'A';
}

test(10, ptrA);
test(30, ptrB);
test(40, ptrC);

Based on the call sites, the index argument of the test function would be in the range [10, 40], right?
But marking ptrArg dereferenceable(41) would be wrong, wouldn't it?

The reason that I made the stripAndAccumulateConstantOffsets overflow aware is that the range can be lower than what it can be in reality.
I thought that these differences can result in unintended underflows.

For accumulateConstantOffset, I think overflows should be OK for non-inbounds GEPs.
But for stripAndAccumulateConstantOffsets, bailing out should probably be the default behaviour.
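
The call-site example above can be worked through in a few lines. This is a hypothetical sketch following the reasoning in this thread; `IndexRange`, `rangeOverCallSites` and `guaranteedBytes` are illustrative names, and the store is assumed to be one byte wide.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

struct IndexRange {
  int64_t Min;
  int64_t Max;
};

// Smallest range that covers the index passed at every call site.
IndexRange rangeOverCallSites(const std::vector<int64_t> &Indices) {
  assert(!Indices.empty() && "need at least one call site");
  auto MinMax = std::minmax_element(Indices.begin(), Indices.end());
  return {*MinMax.first, *MinMax.second};
}

// Bytes of ptrArg that every call is known to reach through the one-byte
// store ptrArg[index]: only the lower bound of the range is guaranteed.
uint64_t guaranteedBytes(const IndexRange &R) {
  return R.Min < 0 ? 0 : static_cast<uint64_t>(R.Min) + 1;
}
```

For the indices 10, 30 and 40 the range is [10, 40], and the guaranteed byte count is 11, not 41, because the call with index 10 never touches anything beyond `ptrA[10]`.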

In D76208#1932819, @kuter wrote:
static void test(int index, char *ptrArg) { // internal linkage
  ptrArg[index] = 'A';
}

test(10, ptrA);
test(30, ptrB);
test(40, ptrC);
Based on the call sites, the index argument of the test function would be in the range [10, 40], right?
But marking ptrArg dereferenceable(41) would be wrong, wouldn't it?

Correct. My first thought was: we need something special for the loop case so we know we reach the upper bound. After all, that is probably the most important case to get.

The reason that I made the stripAndAccumulateConstantOffsets overflow aware is that the range can be lower than what it can be in reality.
I thought that these differences can result in unintended underflows.

I see. Agreed.

For accumulateConstantOffset, I think overflows should be OK for non-inbounds GEPs.
But for stripAndAccumulateConstantOffsets, bailing out should probably be the default behaviour.

OK.

llvm/test/Transforms/Attributor/dereferenceable-1.ll
234	Right. That reminds me of D64974, though I'm unsure if we still need it. What we need is to make the explorer aware of `CanProveNotTakenFirstIteration`. Similar to the existing use of that function, we can go to the non-exit block from a header in `getMustBeExecutedNextInstruction` (or `findForwardJoinPoint`) if we know that edge is taken at least once. To not lose the code after the loop, we should add a stack of unexplored edges. The exit edge goes there if the loop is known not to be endless (see `findForwardJoinPoint`). If we are out of forward instructions, we can pop an edge from the stack and continue. Alternatively, we could check if the loop was not endless when we visit the header for the second time.

Fixed styling, added FIXME, separated new code.

kuter marked 2 inline comments as done. Mar 20 2020, 10:13 PM

kuter marked an inline comment as done. Mar 20 2020, 10:16 PM

uenoku added inline comments. Mar 21 2020, 4:19 AM

llvm/test/Transforms/Attributor/willreturn.ll
364	Why can't we get dereferenceable(4) for `p` in this case?

kuter marked an inline comment as done. Mar 21 2020, 5:40 AM

kuter added inline comments.

llvm/test/Transforms/Attributor/willreturn.ll
364	OK, for this example specifically, the variable `n` can be negative, so `ans += p[n]` may never be executed. But in general, this patch doesn't work with loops that are not `do {} while (cond)`. This is because `AAFromMustBeExecutedContext` is not aware when the first iteration is always going to be run. This needs to be addressed with a separate patch and would probably improve many other deductions as well.

baziotis added inline comments. Mar 21 2020, 6:13 AM

llvm/test/Transforms/Attributor/willreturn.ll
364	IIUC, loop rotation can help here because it provides this guarantee.

uenoku added inline comments. Mar 21 2020, 7:32 AM

llvm/test/Transforms/Attributor/willreturn.ll
364	Oh, I thought `%n` is assumed to be positive:) Thanks.
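
The difference between the two loop shapes discussed in this thread can be seen in plain C++. This is a hypothetical illustration, not the code behind willreturn.ll; `sumFor` and `sumDoWhile` are made-up names.

```cpp
#include <cassert>

// Top-tested loop: for n <= 0 the body never runs, so p[0] is never read and
// nothing can be concluded about p from the loop body alone.
int sumFor(const int *p, int n) {
  int ans = 0;
  for (int i = 0; i < n; ++i)
    ans += p[i];
  return ans;
}

// Bottom-tested loop: the body runs at least once, so p[0] is always read.
// For 4-byte ints that read would justify dereferenceable(4) on p.
int sumDoWhile(const int *p, int n) {
  assert(p && "the first iteration always reads p[0]");
  int ans = 0;
  int i = 0;
  do {
    ans += p[i];
    ++i;
  } while (i < n);
  return ans;
}
```

Calling `sumFor(nullptr, 0)` is fine because the body is skipped, whereas `sumDoWhile` reads `p[0]` even for `n == 0`. Loop rotation, as mentioned above, turns the first shape into a guarded version of the second.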

uenoku added inline comments. Mar 21 2020, 7:46 AM

llvm/test/Transforms/Attributor/dereferenceable-1.ll
214	"For us to be able to deduce a 30, there would have to be a separate mechanism that deduces the biggest number that the "Value" in question is guaranteed to take." Agreed. "Perhaps we could use SCEV?" I think so. What I thought once is: assume that a loop is in the context and it is guaranteed to proceed to the last iteration; then we can use the biggest number of the value. For an inbounds GEP, it is allowed to take that number as dereferenceable bytes, and for a non-inbounds GEP, if the SCEV of the value is something like <0, +, 1>, we can take the biggest number as dereferenceable bytes.

This is because AAFromMustBeExecutedContext is not aware when the first iteration is always going to be run.
This needs to be addressed with a separate patch and would probably improve many other deductions as well.

Agreed and agreed.

Perhaps we could use SCEV ?

[...]

We could. What we want (regardless of how) is to track in AAConstantRange (conceptually) two ranges: (1) the potential value range, (2) the known value range. This is not the same as the assumed/known we track now. Right now both track the potential value range, that is, what value this llvm::Value can potentially have at runtime. I think tracking the second, that is, what value this llvm::Value is going to have at runtime (if it is executed), could be done in the same AA or a different one, depending on how much logic we can share. This is also a follow-up patch (or two).

I'll have to go through the logic in this patch tomorrow (=rested) but I think it looks pretty good.

In D76208#1936117, @jdoerfert wrote:

We could. What we want (regardless of how) is to track in AAConstantRange (conceptually) two ranges: (1) the potential value range, (2) the known value range. This is not the same as the assumed/known we track now. Right now both track the potential value range, that is, what value this llvm::Value can potentially have at runtime. I think tracking the second, that is, what value this llvm::Value is going to have at runtime (if it is executed), could be done in the same AA or a different one, depending on how much logic we can share. This is also a follow-up patch (or two).

Agreed.

Currently AAConstantRange tracks the range that an llvm::Value is guaranteed to be within at runtime.
But for loops, this results in information loss.

So I think what you suggest is that we track a specific range for loop-like situations.
For an actual loop this would be the first and the last index of the counter.

And when we are propagating this range from phi nodes, select instructions, and call sites (of internal functions), we would intersect them together.

For example:

void test_use(char *ptrA, char *ptrB) {
  for (int i = 10; i < 100; i++) {
    test(ptrA, i); // i is in the [10, 99] loop range.
  }
  for (int i = 5; i < 150; i++) {
    test(ptrB, i); // i is in the [5, 149] loop range.
  }
}

// i is in the [10, 99] loop range
// ptr is dereferenceable(100)
static void test(char *ptr, int i) { // internal linkage
  ptr[i] = 'A';
}
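
The intersection step in the example above can be written down directly. This is a hypothetical sketch of the proposed follow-up, not code from this patch; `LoopRange` and `intersect` are made-up names and ranges are inclusive.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

struct LoopRange {
  int64_t First; // first value the loop counter takes
  int64_t Last;  // last value the loop counter takes (inclusive)
};

// Intersects two inclusive ranges. Returns false if they do not overlap, in
// which case Out must not be used.
bool intersect(const LoopRange &A, const LoopRange &B, LoopRange &Out) {
  assert(A.First <= A.Last && B.First <= B.Last && "ranges must be non-empty");
  Out.First = std::max(A.First, B.First);
  Out.Last = std::min(A.Last, B.Last);
  return Out.First <= Out.Last;
}
```

Intersecting [10, 99] with [5, 149] gives [10, 99]; with a one-byte element the last index 99 corresponds to the dereferenceable(100) noted in the example.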

We should finish this review and then focus on the next steps. I added more comments but I think the general logic is fine.

(Next step would be to rename the range in AAConstantRange to PossibleRange, or similar, and add a second range, e.g., ObservedRange, which we need to deduce. We can then use max(PossibleRange.minimum(), ObservedRange.maximum()) for dereferenceable deduction.)
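
The formula in the comment above can be tried on the numbers from the loop example. This is a hypothetical sketch: `PossibleRange` and `ObservedRange` are the names proposed in the comment, while `Range`, `largestGuaranteedIndex` and `dereferenceableBytes` are made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

struct Range {
  int64_t Min;
  int64_t Max; // inclusive
};

// Largest index known to be accessed, following the formula in the comment:
// max(PossibleRange.minimum(), ObservedRange.maximum()).
int64_t largestGuaranteedIndex(const Range &PossibleRange,
                               const Range &ObservedRange) {
  return std::max(PossibleRange.Min, ObservedRange.Max);
}

// Dereferenceable bytes implied by an access of ElementSize bytes at Index.
uint64_t dereferenceableBytes(int64_t Index, uint64_t ElementSize) {
  assert(ElementSize > 0 && "element size must be positive");
  if (Index < 0)
    return 0;
  return (static_cast<uint64_t>(Index) + 1) * ElementSize;
}
```

With a possible range of [5, 149] and an observed range of [10, 99], the largest guaranteed index is 99, which gives 100 bytes for a one-byte element.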

llvm/lib/IR/Operator.cpp
79	Style: I think it would be easier to read if you split the above if-else-cascade and use early exits:
	    MinimalIndex = ConstOffset->getValue();
	    continue;
	  }
	  // The operand is not constant, check if an external analysis was provided.
	  if (!ExternalAnalysis)
	    return false;
	Do we need to track overflow before we use an external analysis? If not, we don't need 75/76. Below we can then use `UsedExternalAnalysis` in 88 and remove it from 95. We should also be able to use a single overflowed flag.
99	Put the simple case first, maybe add a continue to avoid the "else"
llvm/lib/IR/Value.cpp
649	Simple case first please.
llvm/lib/Transforms/IPO/Attributor.cpp
552	Some names are still starting with a lower case letter. Do we have a test where the assumed range minimum is negative?
3749	Merge in a single debug message and line.
3803	No need for the newline in the beginning. TBH, the entire message doesn't help much given that we see the update debug message already.

Eliminated redundant debug messages, style fixes, added negative test case.
Don't use external analysis if the GEP operand is a struct type.

kuter marked an inline comment as done. Mar 25 2020, 8:22 AM

kuter added inline comments.

llvm/lib/IR/Operator.cpp
79	We probably don't need to detect an overflow before the use of external analysis; I was just being safe. I am not sure what you mean by the early exit change. We need to detect overflows that happen after we use ExternalAnalysis even if they were not caused by the value that ExternalAnalysis has returned. So the value of the ConstantOffset needs to pass through something that would detect an overflow if ExternalAnalysis is present and has been used. We could do it with a lambda like this:
	    AddArrayIndex(ConstOffset->getValue());
	    continue;
	  }
	Also, do we really need to have the old and the new code in this style?
	  if (!ExternalAnalysis) {
	    // old
	  } else {
	    // new
	  }
	The new code shouldn't really behave any differently other than detecting overflows/underflows.

kuter marked 2 inline comments as done. Mar 25 2020, 9:59 AM

jdoerfert added inline comments. Mar 25 2020, 10:10 AM

llvm/lib/IR/Operator.cpp
79	1) We want to check `UsedExternal`, not `ExternalAnalysis`, in order to determine if an overflow check is necessary. 2) We should create the lambda you mentioned. Don't track overflows outside of the lambda. In the lambda, check if `UsedExternal` is set; only then do we track and act on overflows.
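
A minimal sketch of the lambda discussed above, on plain `int64_t` instead of `APInt`. It is not the committed implementation: `Term`, `wrappingAdd` and `accumulateTerms` are hypothetical names, and each term is assumed to be already scaled to bytes. Overflow only causes a bail-out once an externally supplied value has been used; before that the sum wraps as in the old code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One addend of the offset, already scaled to bytes.
struct Term {
  int64_t Bytes;
  bool FromExternal; // true if the value came from the external analysis
};

// Two's complement wrapping add that also reports signed overflow. The
// unsigned-to-signed conversion is modular on mainstream compilers (and
// guaranteed to be so since C++20).
bool wrappingAdd(int64_t A, int64_t B, int64_t &Sum) {
  Sum = static_cast<int64_t>(static_cast<uint64_t>(A) +
                             static_cast<uint64_t>(B));
  return ((A >= 0) == (B >= 0)) && ((Sum >= 0) != (A >= 0));
}

bool accumulateTerms(const std::vector<Term> &Terms, int64_t &Offset) {
  bool UsedExternal = false;
  // All overflow handling lives in this lambda. It fails only if an overflow
  // happens after an external value has been involved.
  auto AddTerm = [&](int64_t Bytes) {
    int64_t Sum = 0;
    bool Overflowed = wrappingAdd(Offset, Bytes, Sum);
    Offset = Sum;
    return !(UsedExternal && Overflowed);
  };
  for (const Term &T : Terms) {
    UsedExternal |= T.FromExternal;
    if (!AddTerm(T.Bytes))
      return false;
  }
  return true;
}
```

Keeping the flag and the check inside one lambda means the loop body has a single call per operand and no separate overflow bookkeeping.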

Simplify accumulateConstantOffset

kuter marked 3 inline comments as done. Mar 25 2020, 2:15 PM

kuter marked an inline comment as done. Mar 26 2020, 6:32 AM

Two more minor comments, other than that the code looks good. Please update so I can commit it.

llvm/lib/IR/Operator.cpp
60	Nit: Don't call it `MinimalIndex` as there might be use cases that over-approximate the number. Index is just fine I think.
llvm/lib/Transforms/IPO/Attributor.cpp
566	Nit: `AllowNonInbounds` and `Value` above. Check other variable names as well.

This revision is now accepted and ready to land. Mar 26 2020, 9:35 AM

kuter updated this revision to Diff 253022. Mar 26 2020, 5:28 PM

Apologies for the delay, can you rebase this and provide me with "Firstname Lastname <email>" from you so I can attribute it to you?

Rebased.
Small logic change in GEPOperator::accumulateConstantOffset to bail out on scalable vector types,
except when the offset is zero.
Not allowing zero breaks @test_accumulate_constant_offset_vscale_zero.

see https://reviews.llvm.org/rGef64ba831194c7deac8882a325ea9bea64eb612a

In D76208#1956335, @jdoerfert wrote:

Apologies for the delay, can you rebase this and provide me with "Firstname Lastname <email>" from you so I can attribute it to you?

name, surname: Kuter Dinel
email: kuterdinel@gmail.com

Closed by commit rGe57807769b5c: [Attributor] Use AAValueConstantRange to infer dereferencability. (authored by kuter, committed by jdoerfert). May 13 2020, 3:17 PM

This revision was automatically updated to reflect the committed changes.

Herald added a subscriber: llvm-commits. May 13 2020, 3:17 PM

jdoerfert mentioned this in rG6045a804b94b: [Attributor] Check lines accidentally not committed with D76208. May 13 2020, 4:26 PM

Revision Contents

Path	Size
llvm/include/llvm/IR/Operator.h	25 lines
llvm/include/llvm/IR/Value.h	20 lines
llvm/lib/IR/Operator.cpp	72 lines
llvm/lib/IR/Value.cpp	23 lines
llvm/lib/Transforms/IPO/Attributor.cpp	72 lines
llvm/test/Transforms/Attributor/dereferenceable-1.ll	27 lines
llvm/test/Transforms/Attributor/willreturn.ll	2 lines

Diff 251822

llvm/include/llvm/IR/Operator.h

@@ [...] @@ public:
   unsigned countNonConstantIndices() const {
     return count_if(make_range(idx_begin(), idx_end()), [](const Use& use) {
         return !isa<ConstantInt>(*use);
       });
   }

   /// Accumulate the constant address offset of this GEP if possible.
   ///
-  /// This routine accepts an APInt into which it will accumulate the constant
-  /// offset of this GEP if the GEP is in fact constant. If the GEP is not
-  /// all-constant, it returns false and the value of the offset APInt is
-  /// undefined (it is not preserved!). The APInt passed into this routine
-  /// must be at exactly as wide as the IntPtr type for the address space of the
-  /// base GEP pointer.
-  bool accumulateConstantOffset(const DataLayout &DL, APInt &Offset) const;
+  /// This routine accepts an APInt into which it will try to accumulate the
+  /// constant offset of this GEP.
+  ///
+  /// If \p ExternalAnalysis is provided it will be used to calculate a offset
+  /// when a operand of GEP is not constant.
+  /// For example, for a value \p ExternalAnalysis might try to calculate a
+  /// lower bound. If \p ExternalAnalysis is successful, it should return true.
+  ///
+  /// If the \p ExternalAnalysis returns false or the value returned by \p
+  /// ExternalAnalysis results in a overflow/underflow, this routine returns
+  /// false and the value of the offset APInt is undefined (it is not
+  /// preserved!).
+  ///
+  /// The APInt passed into this routine must be at exactly as wide as the
+  /// IntPtr type for the address space of the base GEP pointer.
+  bool accumulateConstantOffset(
+      const DataLayout &DL, APInt &Offset,
+      function_ref<bool(Value &, APInt &)> ExternalAnalysis = nullptr) const;
 };

 class PtrToIntOperator
     : public ConcreteOperator<Operator, Instruction::PtrToInt> {
   friend class PtrToInt;
   friend class ConstantExpr;

 public:
[...]

llvm/include/llvm/IR/Value.h

@@ [...] @@ #include "llvm/IR/Value.def"
   /// value, it returns 'this'.
   const Value *stripInBoundsConstantOffsets() const;
   Value *stripInBoundsConstantOffsets() {
     return const_cast<Value *>(
         static_cast<const Value *>(this)->stripInBoundsConstantOffsets());
   }

   /// Accumulate the constant offset this value has compared to a base pointer.
-  /// Only 'getelementptr' instructions (GEPs) with constant indices are
-  /// accumulated but other instructions, e.g., casts, are stripped away as
-  /// well. The accumulated constant offset is added to \p Offset and the base
+  /// Only 'getelementptr' instructions (GEPs) are accumulated but other
+  /// instructions, e.g., casts, are stripped away as well.
+  /// The accumulated constant offset is added to \p Offset and the base
   /// pointer is returned.
   ///
   /// The APInt \p Offset has to have a bit-width equal to the IntPtr type for
   /// the address space of 'this' pointer value, e.g., use
   /// DataLayout::getIndexTypeSizeInBits(Ty).
   ///
-  /// If \p AllowNonInbounds is true, constant offsets in GEPs are stripped and
+  /// If \p AllowNonInbounds is true, offsets in GEPs are stripped and
   /// accumulated even if the GEP is not "inbounds".
   ///
+  /// If \p ExternalAnalysis is provided it will be used to calculate a offset
+  /// when a operand of GEP is not constant.
+  /// For example, for a value \p ExternalAnalysis might try to calculate a
+  /// lower bound. If \p ExternalAnalysis is successful, it should return true.
+  ///
   /// If this is called on a non-pointer value, it returns 'this' and the
   /// \p Offset is not modified.
   ///
   /// Note that this function will never return a nullptr. It will also never
   /// manipulate the \p Offset in a way that would not match the difference
   /// between the underlying value and the returned one. Thus, if no constant
   /// offset was found, the returned value is the underlying one and \p Offset
   /// is unchanged.
-  const Value *stripAndAccumulateConstantOffsets(const DataLayout &DL,
-                                                 APInt &Offset,
-                                                 bool AllowNonInbounds) const;
+  const Value *stripAndAccumulateConstantOffsets(
+      const DataLayout &DL, APInt &Offset, bool AllowNonInbounds,
+      function_ref<bool(Value &Value, APInt &Offset)> ExternalAnalysis =
+          nullptr) const;
   Value *stripAndAccumulateConstantOffsets(const DataLayout &DL, APInt &Offset,
                                            bool AllowNonInbounds) {
     return const_cast<Value *>(
         static_cast<const Value *>(this)->stripAndAccumulateConstantOffsets(
             DL, Offset, AllowNonInbounds));
   }

   /// This is a wrapper around stripAndAccumulateConstantOffsets with the
[...]

llvm/lib/IR/Operator.cpp

@@ [...] @@
 }

 Type *GEPOperator::getResultElementType() const {
   if (auto *I = dyn_cast<GetElementPtrInst>(this))
     return I->getResultElementType();
   return cast<GetElementPtrConstantExpr>(this)->getResultElementType();
 }

-bool GEPOperator::accumulateConstantOffset(const DataLayout &DL,
-                                           APInt &Offset) const {
+bool GEPOperator::accumulateConstantOffset(
+    const DataLayout &DL, APInt &Offset,
+    function_ref<bool(Value &, APInt &)> ExternalAnalysis) const {
   assert(Offset.getBitWidth() ==
              DL.getIndexSizeInBits(getPointerAddressSpace()) &&
          "The offset bit width does not match DL specification.");

+  bool Overflowed = false, UsedExternalAnalysis = false;
   for (gep_type_iterator GTI = gep_type_begin(this), GTE = gep_type_end(this);
        GTI != GTE; ++GTI) {
-    ConstantInt *OpC = dyn_cast<ConstantInt>(GTI.getOperand());
-    if (!OpC)
-      return false;
-    if (OpC->isZero())
-      continue;
-
-    // Handle a struct index, which adds its field offset to the pointer.
-    if (StructType *STy = GTI.getStructTypeOrNull()) {
-      unsigned ElementIdx = OpC->getZExtValue();
-      const StructLayout *SL = DL.getStructLayout(STy);
-      Offset += APInt(Offset.getBitWidth(), SL->getElementOffset(ElementIdx));
-      continue;
-    }
-
-    // For array or vector indices, scale the index by the size of the type.
-    APInt Index = OpC->getValue().sextOrTrunc(Offset.getBitWidth());
-    Offset += Index * APInt(Offset.getBitWidth(),
-                            DL.getTypeAllocSize(GTI.getIndexedType()));
+    Value *V = GTI.getOperand();
+    APInt MinimalIndex;
+    // Handle ConstantInt if possible.
+    if (auto ConstOffset = dyn_cast<ConstantInt>(V)) {
+      if (ConstOffset->isZero())
+        continue;
+
+      // Handle a struct index, which adds its field offset to the pointer.
+      // For array or vector indices, scale the index by the size of the type.
+      if (StructType *STy = GTI.getStructTypeOrNull()) {
+        unsigned ElementIdx = ConstOffset->getZExtValue();
+        const StructLayout *SL = DL.getStructLayout(STy);
+        if (ExternalAnalysis) {
+          bool OpOverflow = false;
+          Offset = Offset.sadd_ov(
+              APInt(Offset.getBitWidth(), SL->getElementOffset(ElementIdx)),
+              OpOverflow);
+          Overflowed |= OpOverflow;
+          if (UsedExternalAnalysis && Overflowed)
+            return false;
+        } else {
+          Offset +=
+              APInt(Offset.getBitWidth(), SL->getElementOffset(ElementIdx));
+        }
+        continue;
+      }
+      MinimalIndex = ConstOffset->getValue();
+    } else if (ExternalAnalysis) {
+      if (!ExternalAnalysis(*V, MinimalIndex))
+        return false;
+      UsedExternalAnalysis = true;
+      if (Overflowed)
+        return false;
+    } else {
+      return false;
+    }
+
+    APInt Index = MinimalIndex.sextOrTrunc(Offset.getBitWidth());
+    // For array or vector indices, scale the index by the size of the type.
+    APInt IndexedSize =
+        APInt(Offset.getBitWidth(), DL.getTypeAllocSize(GTI.getIndexedType()));
+
+    // External Analysis can return a result higher/lower than the value
+    // represents. We need to detect overflow/underflow.
+    if (ExternalAnalysis) {
+      bool OpOverflow = false;
+      APInt OffsetPlus = Index.smul_ov(IndexedSize, OpOverflow);
+      Overflowed |= OpOverflow;
+      Offset = Offset.sadd_ov(OffsetPlus, OpOverflow);
+      Overflowed |= OpOverflow;
+
+      if (UsedExternalAnalysis && Overflowed)
+        return false;
+    } else {
+      Offset += Index * IndexedSize;
+    }
   }
   return true;
 }
 }

llvm/lib/IR/Value.cpp

Show First 20 Lines • Show All 590 Lines • ▼ Show 20 Lines
const Value *Value::stripInBoundsConstantOffsets() const {		const Value *Value::stripInBoundsConstantOffsets() const {
return stripPointerCastsAndOffsets<PSK_InBoundsConstantIndices>(this);		return stripPointerCastsAndOffsets<PSK_InBoundsConstantIndices>(this);
}		}

const Value *Value::stripPointerCastsAndInvariantGroups() const {		const Value *Value::stripPointerCastsAndInvariantGroups() const {
return stripPointerCastsAndOffsets<PSK_ZeroIndicesAndInvariantGroups>(this);		return stripPointerCastsAndOffsets<PSK_ZeroIndicesAndInvariantGroups>(this);
}		}

const Value *		const Value *Value::stripAndAccumulateConstantOffsets(
Value::stripAndAccumulateConstantOffsets(const DataLayout &DL, APInt &Offset,		const DataLayout &DL, APInt &Offset, bool AllowNonInbounds,
bool AllowNonInbounds) const {		function_ref<bool(Value &, APInt &)> ExternalAnalysis) const {
if (!getType()->isPtrOrPtrVectorTy())		if (!getType()->isPtrOrPtrVectorTy())
return this;		return this;

unsigned BitWidth = Offset.getBitWidth();		unsigned BitWidth = Offset.getBitWidth();
assert(BitWidth == DL.getIndexTypeSizeInBits(getType()) &&		assert(BitWidth == DL.getIndexTypeSizeInBits(getType()) &&
"The offset bit width does not match the DL specification.");		"The offset bit width does not match the DL specification.");

// Even though we don't look through PHI nodes, we could be called on an		// Even though we don't look through PHI nodes, we could be called on an
[...]	if (auto *GEP = dyn_cast<GEPOperator>(V)) {

// If one of the values we have visited is an addrspacecast, then		// If one of the values we have visited is an addrspacecast, then
// the pointer type of this GEP may be different from the type		// the pointer type of this GEP may be different from the type
// of the Ptr parameter which was passed to this function. This		// of the Ptr parameter which was passed to this function. This
// means when we construct GEPOffset, we need to use the size		// means when we construct GEPOffset, we need to use the size
// of GEP's pointer type rather than the size of the original		// of GEP's pointer type rather than the size of the original
// pointer type.		// pointer type.
APInt GEPOffset(DL.getIndexTypeSizeInBits(V->getType()), 0);		APInt GEPOffset(DL.getIndexTypeSizeInBits(V->getType()), 0);
if (!GEP->accumulateConstantOffset(DL, GEPOffset))		if (!GEP->accumulateConstantOffset(DL, GEPOffset, ExternalAnalysis))
return V;		return V;

// Stop traversal if the pointer offset wouldn't fit in the bit-width		// Stop traversal if the pointer offset wouldn't fit in the bit-width
// provided by the Offset argument. This can happen due to AddrSpaceCast		// provided by the Offset argument. This can happen due to AddrSpaceCast
// stripping.		// stripping.
if (GEPOffset.getMinSignedBits() > BitWidth)		if (GEPOffset.getMinSignedBits() > BitWidth)
return V;		return V;

Offset += GEPOffset.sextOrTrunc(BitWidth);		// External Analysis can return a result higher/lower than the value
		// represents. We need to detect overflow/underflow.
		APInt GEPOffsetST = GEPOffset.sextOrTrunc(BitWidth);
		if (ExternalAnalysis) {
		bool Overflow = false;
		APInt OldOffset = Offset;
		Offset = Offset.sadd_ov(GEPOffsetST, Overflow);
		if (Overflow) {
		Offset = OldOffset;
		jdoerfert [Done]: Put the entire thing in an `if (ExternalAnalysis)`, please. Or better: `!ExternalAnalysis` then the old code, else this code. TBH, I'm not sure why we don't bail on overflow in the old code as well; maybe we should. Can you check if that would make any test fail?
		return V;
		}
		} else {
		Offset += GEPOffsetST;
		}
		jdoerfert [Done]: Simple case first, please.
V = GEP->getPointerOperand();		V = GEP->getPointerOperand();
} else if (Operator::getOpcode(V) == Instruction::BitCast ||		} else if (Operator::getOpcode(V) == Instruction::BitCast ||
Operator::getOpcode(V) == Instruction::AddrSpaceCast) {		Operator::getOpcode(V) == Instruction::AddrSpaceCast) {
V = cast<Operator>(V)->getOperand(0);		V = cast<Operator>(V)->getOperand(0);
} else if (auto *GA = dyn_cast<GlobalAlias>(V)) {		} else if (auto *GA = dyn_cast<GlobalAlias>(V)) {
if (!GA->isInterposable())		if (!GA->isInterposable())
V = GA->getAliasee();		V = GA->getAliasee();
} else if (const auto *Call = dyn_cast<CallBase>(V)) {		} else if (const auto *Call = dyn_cast<CallBase>(V)) {

llvm/lib/Transforms/IPO/Attributor.cpp


[...]	if (Attr.isEnumAttribute()) {
Attribute::AttrKind Kind = Attr.getKindAsEnum();		Attribute::AttrKind Kind = Attr.getKindAsEnum();
if (Attrs.hasAttribute(AttrIdx, Kind))		if (Attrs.hasAttribute(AttrIdx, Kind))
if (isEqualOrWorse(Attr, Attrs.getAttribute(AttrIdx, Kind)))		if (isEqualOrWorse(Attr, Attrs.getAttribute(AttrIdx, Kind)))
return false;		return false;
Attrs = Attrs.addAttribute(Ctx, AttrIdx, Attr);		Attrs = Attrs.addAttribute(Ctx, AttrIdx, Attr);
return true;		return true;
}		}
if (Attr.isStringAttribute()) {		if (Attr.isStringAttribute()) {
StringRef Kind = Attr.getKindAsString();		StringRef Kind = Attr.getKindAsString();
		uenoku [Not Done]: Add a comment here.
if (Attrs.hasAttribute(AttrIdx, Kind))		if (Attrs.hasAttribute(AttrIdx, Kind))
if (isEqualOrWorse(Attr, Attrs.getAttribute(AttrIdx, Kind)))		if (isEqualOrWorse(Attr, Attrs.getAttribute(AttrIdx, Kind)))
return false;		return false;
Attrs = Attrs.addAttribute(Ctx, AttrIdx, Attr);		Attrs = Attrs.addAttribute(Ctx, AttrIdx, Attr);
return true;		return true;
}		}
if (Attr.isIntAttribute()) {		if (Attr.isIntAttribute()) {
Attribute::AttrKind Kind = Attr.getKindAsEnum();		Attribute::AttrKind Kind = Attr.getKindAsEnum();
if (Attrs.hasAttribute(AttrIdx, Kind))		if (Attrs.hasAttribute(AttrIdx, Kind))
if (isEqualOrWorse(Attr, Attrs.getAttribute(AttrIdx, Kind)))		if (isEqualOrWorse(Attr, Attrs.getAttribute(AttrIdx, Kind)))
return false;		return false;
Attrs = Attrs.removeAttribute(Ctx, AttrIdx, Kind);		Attrs = Attrs.removeAttribute(Ctx, AttrIdx, Kind);
Attrs = Attrs.addAttribute(Ctx, AttrIdx, Attr);		Attrs = Attrs.addAttribute(Ctx, AttrIdx, Attr);
return true;		return true;
}		}

llvm_unreachable("Expected enum or string attribute!");		llvm_unreachable("Expected enum or string attribute!");
}		}

		const Value *stripAndAccumulateMinimalOffsets(
		Attributor &A, const AbstractAttribute &QueryingAA, const Value *value,
		const DataLayout &DL, APInt &Offset, bool AllowNonInbounds,
		bool UseAssumed = false) {

		auto AttributorAnalysis = [&](Value &V, APInt &ROffset) -> bool {
		const IRPosition &Pos = IRPosition::value(V);
		// Only track dependence if we are going to use the assumed info.
		const AAValueConstantRange &ValueConstantRangeAA =
		A.getAAFor<AAValueConstantRange>(QueryingAA, Pos,
		/* TrackDependence */ UseAssumed);
		ConstantRange range = UseAssumed ? ValueConstantRangeAA.getAssumed()
		: ValueConstantRangeAA.getKnown();
		// We can only use the lower part of the range because the upper part can
		// be higher than what the value can really be.
		ROffset = range.getSignedMin();
		return true;
		};

		return value->stripAndAccumulateConstantOffsets(DL, Offset, AllowNonInbounds,
		AttributorAnalysis);
		}
		jdoerfert [Done]: Style: `UseAssumed`, `Range`, `Value`, etc.
		jdoerfert [Done]: Some names still start with a lowercase letter. Do we have a test where the assumed range minimum is negative?
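
The callback design used by `stripAndAccumulateMinimalOffsets` can be sketched outside of LLVM as follows. This is an illustration under stated assumptions, not the patch's code: `Step`, `LowerBoundFn`, and this `stripAndAccumulate` are hypothetical stand-ins (the patch itself passes a `function_ref<bool(Value &, APInt &)>`), and overflow checks are omitted because the example values are small. The walker handles constant indices itself and asks the optional callback for a lower bound when an index is not constant.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for one GEP index: a constant, or a named variable.
struct Step {
  std::optional<int64_t> Constant; // set for constant indices
  std::string Variable;            // name of the index value otherwise
  int64_t ElemSize;                // size in bytes of the indexed type
};

// Callback in the spirit of the review suggestion: given a value, write a
// lower bound for it and return true, or return false if nothing is known.
using LowerBoundFn = std::function<bool(const std::string &, int64_t &)>;

// Accumulates a minimal byte offset over Chain and returns how many steps
// were stripped. Stops at the first non-constant index that the callback
// cannot bound (or immediately, if no callback was supplied).
std::size_t stripAndAccumulate(const std::vector<Step> &Chain, int64_t &Offset,
                               LowerBoundFn ExternalAnalysis = nullptr) {
  std::size_t Stripped = 0;
  for (const Step &S : Chain) {
    assert(S.ElemSize > 0 && "indexed types are assumed to have a size");
    int64_t Index = 0;
    if (S.Constant)
      Index = *S.Constant;
    else if (!ExternalAnalysis || !ExternalAnalysis(S.Variable, Index))
      break;
    Offset += Index * S.ElemSize; // small values assumed; no overflow check
    ++Stripped;
  }
  return Stripped;
}
```

Without a callback the sketch behaves like a constant-only walk; with one, a variable index contributes its lower bound, which is why only the signed minimum of the range is used in the lambda above.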

		static const Value *getMinimalBaseOfAccsesPointerOperand(
		Attributor &A, const AbstractAttribute &QueryingAA, const Instruction *I,
		int64_t &BytesOffset, const DataLayout &DL, bool allowNonInbounds = false) {
		const Value *Ptr = getPointerOperand(I, /* AllowVolatile */ false);
		if (!Ptr)
		return nullptr;
		APInt OffsetAPInt(DL.getIndexTypeSizeInBits(Ptr->getType()), 0);
		const Value *Base = stripAndAccumulateMinimalOffsets(
		A, QueryingAA, Ptr, DL, OffsetAPInt, allowNonInbounds);

		BytesOffset = OffsetAPInt.getSExtValue();
		return Base;
		}
		jdoerfert [Not Done]: Nit: `AllowNonInbounds` and `Value` above. Check other variable names as well.

static const Value *		static const Value *
getBasePointerOfAccessPointerOperand(const Instruction *I, int64_t &BytesOffset,		getBasePointerOfAccessPointerOperand(const Instruction *I, int64_t &BytesOffset,
const DataLayout &DL,		const DataLayout &DL,
bool AllowNonInbounds = false) {		bool AllowNonInbounds = false) {
const Value *Ptr = getPointerOperand(I, /* AllowVolatile */ false);		const Value *Ptr = getPointerOperand(I, /* AllowVolatile */ false);
if (!Ptr)		if (!Ptr)
return nullptr;		return nullptr;

[...]	IRAttributeManifest::manifestAttrs(Attributor &A, const IRPosition &IRP,
case IRPosition::IRP_FLOAT:		case IRPosition::IRP_FLOAT:
return ChangeStatus::UNCHANGED;		return ChangeStatus::UNCHANGED;
case IRPosition::IRP_ARGUMENT:		case IRPosition::IRP_ARGUMENT:
case IRPosition::IRP_FUNCTION:		case IRPosition::IRP_FUNCTION:
case IRPosition::IRP_RETURNED:		case IRPosition::IRP_RETURNED:
Attrs = ScopeFn->getAttributes();		Attrs = ScopeFn->getAttributes();
break;		break;
case IRPosition::IRP_CALL_SITE:		case IRPosition::IRP_CALL_SITE:
case IRPosition::IRP_CALL_SITE_RETURNED:		case IRPosition::IRP_CALL_SITE_RETURNED:
		jdoerfert [Done]: Isn't this almost the `stripAndAccumulate` code? How hard would it be to pass some kind of `AttributorInfo` object to that function (optionally) which triggers the lookup? Or, more generically, a callback that takes a `Value` and returns a lower bound offset (plus indicates success/failure).
case IRPosition::IRP_CALL_SITE_ARGUMENT:		case IRPosition::IRP_CALL_SITE_ARGUMENT:
Attrs = ImmutableCallSite(&IRP.getAnchorValue()).getAttributes();		Attrs = ImmutableCallSite(&IRP.getAnchorValue()).getAttributes();
break;		break;
}		}

ChangeStatus HasChanged = ChangeStatus::UNCHANGED;		ChangeStatus HasChanged = ChangeStatus::UNCHANGED;
LLVMContext &Ctx = IRP.getAnchorValue().getContext();		LLVMContext &Ctx = IRP.getAnchorValue().getContext();
for (const Attribute &Attr : DeducedAttrs) {		for (const Attribute &Attr : DeducedAttrs) {
[...]	static int64_t getKnownNonNullAndDerefBytesForUse(

// We need to follow common pointer manipulation uses to the accesses they		// We need to follow common pointer manipulation uses to the accesses they
// feed into. We can try to be smart to avoid looking through things we do not		// feed into. We can try to be smart to avoid looking through things we do not
// like for now, e.g., non-inbounds GEPs.		// like for now, e.g., non-inbounds GEPs.
if (isa<CastInst>(I)) {		if (isa<CastInst>(I)) {
TrackUse = true;		TrackUse = true;
return 0;		return 0;
}		}
if (auto *GEP = dyn_cast<GetElementPtrInst>(I))
if (GEP->hasAllConstantIndices()) {		if (isa<GetElementPtrInst>(I)) {
TrackUse = true;		TrackUse = true;
return 0;		return 0;
}		}

int64_t Offset;		int64_t Offset;
if (const Value *Base = getBasePointerOfAccessPointerOperand(I, Offset, DL)) {		const Value *Base =
		getMinimalBaseOfAccsesPointerOperand(A, QueryingAA, I, Offset, DL);
		if (Base) {
if (Base == &AssociatedValue &&		if (Base == &AssociatedValue &&
getPointerOperand(I, /* AllowVolatile */ false) == UseV) {		getPointerOperand(I, /* AllowVolatile */ false) == UseV) {
int64_t DerefBytes =		int64_t DerefBytes =
(int64_t)DL.getTypeStoreSize(PtrTy->getPointerElementType()) + Offset;		(int64_t)DL.getTypeStoreSize(PtrTy->getPointerElementType()) + Offset;

IsNonNull |= !NullPointerIsDefined;		IsNonNull |= !NullPointerIsDefined;
return std::max(int64_t(0), DerefBytes);		return std::max(int64_t(0), DerefBytes);
}		}
}		}

/// Corner case when an offset is 0.		/// Corner case when an offset is 0.
if (const Value *Base = getBasePointerOfAccessPointerOperand(		Base = getBasePointerOfAccessPointerOperand(I, Offset, DL,
I, Offset, DL, /*AllowNonInbounds*/ true)) {		/*AllowNonInbounds*/ true);
		if (Base) {
if (Offset == 0 && Base == &AssociatedValue &&		if (Offset == 0 && Base == &AssociatedValue &&
getPointerOperand(I, /* AllowVolatile */ false) == UseV) {		getPointerOperand(I, /* AllowVolatile */ false) == UseV) {
int64_t DerefBytes =		int64_t DerefBytes =
(int64_t)DL.getTypeStoreSize(PtrTy->getPointerElementType());		(int64_t)DL.getTypeStoreSize(PtrTy->getPointerElementType());
IsNonNull |= !NullPointerIsDefined;		IsNonNull |= !NullPointerIsDefined;
return std::max(int64_t(0), DerefBytes);		return std::max(int64_t(0), DerefBytes);
}		}
}		}
[...]	struct AADereferenceableImpl : AADereferenceable {

/// See AAFromMustBeExecutedContext		/// See AAFromMustBeExecutedContext
bool followUse(Attributor &A, const Use *U, const Instruction *I,		bool followUse(Attributor &A, const Use *U, const Instruction *I,
AADereferenceable::StateType &State) {		AADereferenceable::StateType &State) {
bool IsNonNull = false;		bool IsNonNull = false;
bool TrackUse = false;		bool TrackUse = false;
int64_t DerefBytes = getKnownNonNullAndDerefBytesForUse(		int64_t DerefBytes = getKnownNonNullAndDerefBytesForUse(
A, *this, getAssociatedValue(), U, I, IsNonNull, TrackUse);		A, *this, getAssociatedValue(), U, I, IsNonNull, TrackUse);
		LLVM_DEBUG(dbgs() << "[AADereferenceable] follow use called on " << *I
		<< "\n");
		LLVM_DEBUG(dbgs() << "[AADereferenceable] Deref bytes" << DerefBytes
		<< "\n");
		jdoerfert [Done]: Merge into a single debug message and line.

addAccessedBytesForUse(A, U, I, State);		addAccessedBytesForUse(A, U, I, State);
State.takeKnownDerefBytesMaximum(DerefBytes);		State.takeKnownDerefBytesMaximum(DerefBytes);
return TrackUse;		return TrackUse;
}		}

/// See AbstractAttribute::manifest(...).		/// See AbstractAttribute::manifest(...).
ChangeStatus manifest(Attributor &A) override {		ChangeStatus manifest(Attributor &A) override {
[...]	using Base =
AAFromMustBeExecutedContext<AADereferenceable, AADereferenceableImpl>;		AAFromMustBeExecutedContext<AADereferenceable, AADereferenceableImpl>;
AADereferenceableFloating(const IRPosition &IRP) : Base(IRP) {}		AADereferenceableFloating(const IRPosition &IRP) : Base(IRP) {}

/// See AbstractAttribute::updateImpl(...).		/// See AbstractAttribute::updateImpl(...).
ChangeStatus updateImpl(Attributor &A) override {		ChangeStatus updateImpl(Attributor &A) override {
ChangeStatus Change = Base::updateImpl(A);		ChangeStatus Change = Base::updateImpl(A);

const DataLayout &DL = A.getDataLayout();		const DataLayout &DL = A.getDataLayout();
		LLVM_DEBUG(
auto VisitValueCB = [&](Value &V, DerefState &T, bool Stripped) -> bool {		dbgs()
		<< "\n[AADereferenceableFloating] Trying to merge floating values");
		jdoerfert [Done]: No need for the newline at the beginning. TBH, the entire message doesn't help much given that we already see the update debug message.
		auto VisitValueCB = [&](const Value &V, DerefState &T,
		bool Stripped) -> bool {
		LLVM_DEBUG(dbgs() << "\n[AADereferenceableFloating] Looking at value"
		<< V);
unsigned IdxWidth =		unsigned IdxWidth =
DL.getIndexSizeInBits(V.getType()->getPointerAddressSpace());		DL.getIndexSizeInBits(V.getType()->getPointerAddressSpace());
APInt Offset(IdxWidth, 0);		APInt Offset(IdxWidth, 0);
const Value *Base =		const Value *Base =
V.stripAndAccumulateInBoundsConstantOffsets(DL, Offset);		stripAndAccumulateMinimalOffsets(A, *this, &V, DL, Offset, false);

const auto &AA =		const auto &AA =
A.getAAFor<AADereferenceable>(this, IRPosition::value(Base));		A.getAAFor<AADereferenceable>(this, IRPosition::value(Base));
int64_t DerefBytes = 0;		int64_t DerefBytes = 0;
if (!Stripped && this == &AA) {		if (!Stripped && this == &AA) {
// Use IR information if we did not strip anything.		// Use IR information if we did not strip anything.
// TODO: track globally.		// TODO: track globally.
bool CanBeNull;		bool CanBeNull;
DerefBytes = Base->getPointerDereferenceableBytes(DL, CanBeNull);		DerefBytes = Base->getPointerDereferenceableBytes(DL, CanBeNull);
T.GlobalState.indicatePessimisticFixpoint();		T.GlobalState.indicatePessimisticFixpoint();
} else {		} else {
const DerefState &DS = static_cast<const DerefState &>(AA.getState());		const DerefState &DS = static_cast<const DerefState &>(AA.getState());
DerefBytes = DS.DerefBytesState.getAssumed();		DerefBytes = DS.DerefBytesState.getAssumed();
T.GlobalState &= DS.GlobalState;		T.GlobalState &= DS.GlobalState;
}		}

// TODO: Use `AAConstantRange` to infer dereferenceable bytes.

// For now we do not try to "increase" dereferenceability due to negative		// For now we do not try to "increase" dereferenceability due to negative
// indices as we first have to come up with code to deal with loops and		// indices as we first have to come up with code to deal with loops and
// for overflows of the dereferenceable bytes.		// for overflows of the dereferenceable bytes.
int64_t OffsetSExt = Offset.getSExtValue();		int64_t OffsetSExt = Offset.getSExtValue();
if (OffsetSExt < 0)		if (OffsetSExt < 0)
OffsetSExt = 0;		OffsetSExt = 0;

T.takeAssumedDerefBytesMinimum(		T.takeAssumedDerefBytesMinimum(

llvm/test/Transforms/Attributor/dereferenceable-1.ll

	; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --function-signature --scrub-attributes			; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --function-signature --scrub-attributes
	; RUN: opt -attributor -attributor-manifest-internal --attributor-disable=false -attributor-max-iterations-verify -attributor-annotate-decl-cs -attributor-max-iterations=16 -S < %s | FileCheck %s --check-prefix=ATTRIBUTOR			; RUN: opt -attributor -attributor-manifest-internal --attributor-disable=false -attributor-max-iterations-verify -attributor-annotate-decl-cs -attributor-max-iterations=16 -S < %s | FileCheck %s --check-prefix=ATTRIBUTOR
				; RUN: opt -passes=attributor-cgscc --attributor-disable=false -attributor-manifest-internal -S < %s | FileCheck %s --check-prefixes=ATTRIBUTOR_NPM


	; FIXME: Figure out why we need 16 iterations here.			; FIXME: Figure out why we need 16 iterations here.

	; UTC_ARGS: --disable			; UTC_ARGS: --disable

	declare void @deref_phi_user(i32* %a);			declare void @deref_phi_user(i32* %a);

	; TEST 1			; TEST 1
	; take mininimum of return values			; take mininimum of return values
	[...]

	define void @deref_or_null_and_nonnull(i32* dereferenceable_or_null(100) %0) {			define void @deref_or_null_and_nonnull(i32* dereferenceable_or_null(100) %0) {
	; ATTRIBUTOR: define void @deref_or_null_and_nonnull(i32* nocapture nofree nonnull writeonly dereferenceable(100) %0)			; ATTRIBUTOR: define void @deref_or_null_and_nonnull(i32* nocapture nofree nonnull writeonly dereferenceable(100) %0)
	store i32 1, i32* %0			store i32 1, i32* %0
	ret void			ret void
	}			}

	; UTC_ARGS: --enable			; UTC_ARGS: --enable

				uenoku [Done]: I think optimally %ptr is dereferenceable(30). Please add a FIXME here.
				kuter (author) [Done]: OK, I will. For us to be able to deduce 30, there would have to be a separate mechanism that deduces the largest value that the "Value" in question is guaranteed to take. Perhaps we could use SCEV?
				uenoku [Not Done]:
				> For us to be able to deduce a 30, there would have to be separate mechanism that deduces the biggest number that the "Value" in question is going to be guaranteed to take.
				Agreed.
				> Perhaps we could use SCEV ?
				I think so. What I thought once is: assume that a loop is in the context and is guaranteed to proceed to the last iteration; then we can use the largest value the index takes. For an inbounds GEP, it is allowed to take that number as dereferenceable bytes, and for a non-inbounds GEP, if the SCEV of the value is something like <0, +, 1>, we can take the largest value as dereferenceable bytes.
	; TEST 8			; TEST 8
	; Use Constant range in deereferenceable			; Use Constant range in dereferenceable

				;Function Attrs: nounwind uwtable
				define void @test8(i8* %ptr) #0 {
				; FIXME: %ptr should be dereferenceable(31)
				; ATTRIBUTOR_NPM: define void @test8(i8* nocapture nofree nonnull writeonly dereferenceable(21) %ptr)
				br label %1
				1: ; preds = %5, %0
				%i.0 = phi i32 [ 20, %0 ], [ %4, %5 ]
				%2 = sext i32 %i.0 to i64
				%3 = getelementptr inbounds i8, i8* %ptr, i64 %2
				store i8 32, i8* %3, align 1
				%4 = add nsw i32 %i.0, 1
				br label %5
				5: ; preds = %1
				%6 = icmp slt i32 %4, 30
				br i1 %6, label %1, label %7

				7: ; preds = %5
				jdoerfert [Not Done]: Isn't this the `fill_range` code?
				kuter (author) [Done]: No, it is not. Currently this patch does not work with for loops. This is because with for loops AAFromMustBeExecutedContext looks at the branch at the top, calls followUse() on both successors, and "and"s the results together. This is probably why only a single test is affected by this patch. I will address this issue in a separate patch that passes LoopInfo, DominatorTree, and PostDominatorTree to the MustBeExecutedContextExplorer.
				jdoerfert [Not Done]: Right. That reminds me of D64974, though I'm unsure if we still need it. What we need is to make the explorer aware of `CanProveNotTakenFirstIteration`. Similar to the existing use of that function, we can go to the non-exit block from a header in `getMustBeExecutedNextInstruction` (or `findForwardJoinPoint`) if we know that edge is taken at least once. To not lose the code after the loop, we should add a stack of unexplored edges. The exit edge goes there if the loop is known not to be endless (see `findForwardJoinPoint`). If we run out of forward instructions, we can pop an edge from the stack and continue. Alternatively, we could check whether the loop was not endless when we visit the header for the second time.
				ret void
				}
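
To make the numbers in TEST 8 concrete, here is a small standalone sketch. It assumes the loop above stores one byte at `%ptr + i` for each `i` the phi takes, and it uses hypothetical names (`derefBytesFromAccess`, `test8IndexRange`) that do not exist in LLVM. The helper mirrors the `store size + Offset`, clamped at zero, computation in `getKnownNonNullAndDerefBytesForUse`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: bytes known dereferenceable from the base pointer if an
// access of StoreSize bytes at byte offset Offset is known to execute.
int64_t derefBytesFromAccess(int64_t Offset, int64_t StoreSize) {
  assert(StoreSize >= 0 && "store size cannot be negative");
  return std::max<int64_t>(0, StoreSize + Offset);
}

struct IndexRange {
  int64_t Min;
  int64_t Max;
};

// Replays the control flow of TEST 8 (a do-while starting at 20 that exits
// once the incremented index reaches 30) and records the indices stored to.
IndexRange test8IndexRange() {
  IndexRange R{INT64_MAX, INT64_MIN};
  int32_t I = 20;
  do {
    R.Min = std::min<int64_t>(R.Min, I);
    R.Max = std::max<int64_t>(R.Max, I);
    ++I;
  } while (I < 30);
  return R;
}
```

With only the lower bound of the index range available, the helper yields 21 bytes for index 20, which is the `dereferenceable(21)` in the check line. Under the same assumptions the largest index stored to is 29, so using the maximum would yield the 30 bytes mentioned in the inline comment; that requires knowing the last iteration is reached, which a lower bound alone cannot establish.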


	; void g(int *p, long long int *range){			; void g(int *p, long long int *range){
	; int r = *range ; // [10, 99]			; int r = *range ; // [10, 99]
	; fill_range(p, *range);			; fill_range(p, *range);
	; }			; }

	; void fill_range(int* p, long long int start){			; void fill_range(int* p, long long int start){
	; for(long long int i = start;i<start+10;i++){			; for(long long int i = start;i<start+10;i++){
	; // If p[i] is inbounds, p is dereferenceable(40) at least.			; // If p[i] is inbounds, p is dereferenceable(40) at least.

llvm/test/Transforms/Attributor/willreturn.ll

; for(int i = 0;i<10;i++){		; for(int i = 0;i<10;i++){
; ans += p[i];		; ans += p[i];
; }		; }
; return ans;		; return ans;
; }		; }

; ATTRIBUTOR_MODULE: Function Attrs: argmemonly nofree noinline nosync nounwind readonly uwtable willreturn		; ATTRIBUTOR_MODULE: Function Attrs: argmemonly nofree noinline nosync nounwind readonly uwtable willreturn
; ATTRIBUTOR_CGSCC: Function Attrs: argmemonly nofree noinline norecurse nosync nounwind readonly uwtable willreturn		; ATTRIBUTOR_CGSCC: Function Attrs: argmemonly nofree noinline norecurse nosync nounwind readonly uwtable willreturn
; ATTRIBUTOR-NEXT: define i32 @loop_constant_trip_count(i32* nocapture nofree readonly %0)		; ATTRIBUTOR-NEXT: define i32 @loop_constant_trip_count(i32* nocapture nofree nonnull readonly dereferenceable(4) %0)
define i32 @loop_constant_trip_count(i32* nocapture readonly %0) #0 {		define i32 @loop_constant_trip_count(i32* nocapture readonly %0) #0 {
br label %3		br label %3

; <label>:2: ; preds = %3		; <label>:2: ; preds = %3
ret i32 %8		ret i32 %8

; <label>:3: ; preds = %3, %1		; <label>:3: ; preds = %3, %1
%4 = phi i64 [ 0, %1 ], [ %9, %3 ]		%4 = phi i64 [ 0, %1 ], [ %9, %3 ]
[...]	; <label>:8: ; preds = %4, %8
%13 = load i32, i32* %12, align 4		%13 = load i32, i32* %12, align 4
%14 = add nsw i32 %13, %10		%14 = add nsw i32 %13, %10
%15 = add i32 %9, %3		%15 = add i32 %9, %3
%16 = icmp eq i32 %15, %1		%16 = icmp eq i32 %15, %1
br i1 %16, label %6, label %8		br i1 %16, label %6, label %8
}		}


; TEST 13 (positive case)		; TEST 13 (positive case)
uenoku [Not Done]: Why can't we get dereferenceable(4) for `p` in this case?
kuter (author) [Done]: OK, for this example specifically, the variable `n` can be negative, so `ans += p[n]` may never be executed. But in general, this patch doesn't work with loops that are not `do {} while (cond)`. This is because `AAFromMustBeExecutedContext` is not aware of when the first iteration is always going to run. This needs to be addressed in a separate patch and would probably improve many other deductions as well.
baziotis [Not Done]: IIUC, loop rotation (https://llvm.org/docs/LoopTerminology.html#rotated-loops) can help here because it provides this guarantee.
uenoku [Not Done]: Oh, I thought `%n` is assumed to be positive :) Thanks.
; Function Attrs: norecurse nounwind readonly uwtable		; Function Attrs: norecurse nounwind readonly uwtable
; int loop_trip_dec(int n, int *p){		; int loop_trip_dec(int n, int *p){
; int ans = 0;		; int ans = 0;
; for(;n >= 0;n--){		; for(;n >= 0;n--){
; ans += p[n];		; ans += p[n];
; }		; }
; return ans;		; return ans;
; }		; }
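
The point made in the inline comments about `for` loops versus `do {} while (cond)` loops can be illustrated with the function from the test comment. This is a sketch in plain C++, not output of LLVM's loop rotation pass; `loopTripDecRotated` is a hypothetical name for a hand-rotated version.

```cpp
#include <cassert>

// The loop from the test comment. For n < 0 the body never runs, so no access
// to p is guaranteed to execute when the function is entered.
int loopTripDec(int n, const int *p) {
  int ans = 0;
  for (; n >= 0; n--)
    ans += p[n];
  return ans;
}

// Hand-rotated equivalent: a guard followed by a do-while. Once the guard has
// passed, the first access p[n] is guaranteed to execute, which is the
// property a must-be-executed analysis can exploit inside the guarded region.
int loopTripDecRotated(int n, const int *p) {
  int ans = 0;
  if (n >= 0) {
    assert(p && "the body dereferences p on its first iteration");
    do {
      ans += p[n];
      n--;
    } while (n >= 0);
  }
  return ans;
}
```

Both functions compute the same result. The difference is only in where the guarantee sits: in the rotated form the access is unconditional within the guarded block, while the function as a whole still cannot claim `dereferenceable(4)` for `p` because `n` may be negative.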


[Attributor] Use AAValueConstantRange to infer dereferencability.
Closed, Public


Diff 251822

llvm/include/llvm/IR/Operator.h

llvm/include/llvm/IR/Value.h

llvm/lib/IR/Operator.cpp

llvm/lib/IR/Value.cpp

llvm/lib/Transforms/IPO/Attributor.cpp

llvm/test/Transforms/Attributor/dereferenceable-1.ll

llvm/test/Transforms/Attributor/willreturn.ll
