This is an archive of the discontinued LLVM Phabricator instance.


Differential D138646

[AAPointerInfo] track multiple constant offsets for each use
ClosedPublic

Authored by sameerds on Nov 24 2022, 2:39 AM.


Details

Reviewers

jdoerfert
sstefan1

Commits

rG6a2305484e87: [AAPointerInfo] track multiple constant offsets for each use
rGc2a0baad1fbb: [AAPointerInfo] track multiple constant offsets for each use

Summary

An expression of the form gep(base, select(pred, const1, const2)) can result
in a set of offsets instead of just one. PointerInfo can now track these sets
instead of conservatively modeling them as Unknown.
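To make the summary concrete, here is a minimal sketch in plain C++ (it does not use the Attributor API) of why a select-fed GEP index yields a set of byte offsets rather than a single one. The helper `offsetsForSelectIndex` and its parameters are hypothetical names chosen for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of LLVM. It models
//   gep(base, select(pred, Const1, Const2))
// over elements of ElemSize bytes. The predicate is not known statically,
// so both indices are possible and the result is a set of byte offsets.
std::vector<int64_t> offsetsForSelectIndex(int64_t ElemSize, int64_t Const1,
                                           int64_t Const2) {
  std::vector<int64_t> Offsets = {Const1 * ElemSize, Const2 * ElemSize};
  // Keep the set sorted and free of duplicates.
  std::sort(Offsets.begin(), Offsets.end());
  Offsets.erase(std::unique(Offsets.begin(), Offsets.end()), Offsets.end());
  return Offsets;
}
```

With 4-byte elements and indices 1 and 3, the sketch reports the offsets 4 and 12; if both select arms are equal, the set collapses to one offset.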

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

sameerds created this revision. Nov 24 2022, 2:39 AM

Herald added a project: Restricted Project. Nov 24 2022, 2:39 AM

Herald added subscribers: kosarev, ormris, okura and 6 others.

sameerds requested review of this revision. Nov 24 2022, 2:39 AM

Herald added a reviewer: jdoerfert. Nov 24 2022, 2:39 AM

Herald added a reviewer: sstefan1.

Herald added a project: Restricted Project.

Herald added a subscriber: llvm-commits.

Harbormaster completed remote builds in B199368: Diff 477715. Nov 24 2022, 2:39 AM

sameerds added a parent revision: D138645: [AAPointerInfo] rearrange code in preparation for further changes. Nov 24 2022, 2:40 AM

Is this patch, and the ones prior in the stack, ready for review?

In D138646#3954187, @jdoerfert wrote:

Is this patch, and the ones prior in the stack, ready for review?

Yes this is ready for review. It lacks tests for function calls, which I intend to provide before submitting.

sameerds mentioned this in D138645: [AAPointerInfo] rearrange code in preparation for further changes. Nov 29 2022, 12:22 AM

Added some comments to the code
Removed a redundant CodeGen test
Improved readability in value-simplify-pointer-info.ll
Added tests in call-simplify-pointer-info.ll

sameerds added a child revision: D138991: [AAPointerInfo] handle multiple offsets in PHI. Nov 30 2022, 2:08 AM

Harbormaster completed remote builds in B200206: Diff 478855. Nov 30 2022, 2:21 AM

The only "issue" I have with this is the select traversal. We should rely on AAPotentialValues here, see the comment below. Everything else makes sense and looks pretty good.

llvm/include/llvm/Transforms/IPO/Attributor.h
5149	All assertions need messages please, also elsewhere.
llvm/lib/Transforms/IPO/AttributorAttributes.cpp
1483	I think we might want to call `Attributor::getPotentialValues` on the variable offsets. It should handle select and phi and more, e.g., loads. Maybe in addition to `Attributor::getAssumedConstant` we should have `Attributor::getAssumedConstantValues`. I want to avoid yet another traversal of some instructions in favor of common interfaces.
1508	SExt, I think. -1 is a fine offset.
1540	Nit: Unsure why we need two returns here.
1647	Same as above
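The "SExt" remark above presumably concerns how a narrow constant offset is widened to 64 bits. As a small illustration in plain C++ (the helper names are hypothetical and this is not the patch's code), zero extension turns an offset of -1 into a large positive number, while sign extension preserves it:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers: widen a 32-bit constant offset to 64 bits.

// Sign extension keeps negative offsets negative.
int64_t widenSigned(int32_t Offset) { return static_cast<int64_t>(Offset); }

// Zero extension reinterprets the bits as unsigned first, so -1 becomes
// 4294967295 instead of staying -1.
int64_t widenUnsigned(int32_t Offset) {
  return static_cast<int64_t>(static_cast<uint32_t>(Offset));
}
```

Since -1 is a legitimate offset, only the sign-extending variant gives the intended result here.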

Use AAPotentialConstantValues instead of tediously traversing SelectInst.

Can we have a test for phi and store-load propagation to verify it's working as expected (not only selects)?

llvm/lib/Transforms/IPO/AttributorAttributes.cpp
1393–1396	This doesn't make sense to me. We need to look at all VariableOffsets and decide. So `return` should only be present if we give up.
1403	I don't follow why we need two extra OffsetInfo objects here. We modify NewOI anyway, no?

Harbormaster completed remote builds in B201084: Diff 480067. Dec 5 2022, 8:46 AM

sameerds added inline comments. Dec 5 2022, 9:42 AM

llvm/lib/Transforms/IPO/AttributorAttributes.cpp
1393–1396	You might have missed the "not" in the condition. We want to give up and come again later if any information is not at fixed point. This is easily exercised by tests involving induction variables. The set of potential values of the induction variable keeps growing, but we should not use that set until it is fully enumerated. Any eager propagation of a non-fixed-point through PointerInfo at this point affects conclusions in other attributes that depend on it. I did not look for an example of correctness, but it does cause the attributor to retain stores that it would have otherwise removed in one existing test.
1403	On each iteration of the outer for loop over VariableOffsets, the expression is a product:

UnionOfAllCopies = NewOI x AssumedSet

CopyPerOffset is the temporary used by the inner loop to compute this product. We need UnionOfAllCopies because it must only contain modified copies of NewOI, but not NewOI itself. We can't merge the RHS into NewOI. We have to start with an empty set. The actual output of the function is UsrOI. We do not move NewOI into that if we exit early.

I realize there is no test for this yet. Working on that ... writing a GEP by hand is hard, when it involves nested aggregate types!
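The product described in this comment can be sketched in plain C++ as follows. This is only an illustration under stated assumptions: `OffsetSet`, `productWithAssumedSet`, and the `Scale` parameter are hypothetical, and `std::set` stands in for whatever container the patch actually uses. The variable names `NewOI`, `AssumedSet`, `CopyPerOffset`, and `UnionOfAllCopies` are taken from the comment.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

using OffsetSet = std::set<int64_t>;

// Hypothetical model of one outer-loop iteration over a variable GEP index:
// every offset in NewOI is combined with every assumed constant value of the
// index, scaled by the element size.
OffsetSet productWithAssumedSet(const OffsetSet &NewOI,
                                const OffsetSet &AssumedSet, int64_t Scale) {
  // Starts empty: it must hold only modified copies of NewOI, never NewOI
  // itself.
  OffsetSet UnionOfAllCopies;
  for (int64_t Const : AssumedSet) {
    // Temporary for this particular assumed constant.
    OffsetSet CopyPerOffset;
    for (int64_t Off : NewOI)
      CopyPerOffset.insert(Off + Const * Scale);
    UnionOfAllCopies.insert(CopyPerOffset.begin(), CopyPerOffset.end());
  }
  return UnionOfAllCopies;
}
```

For example, with `NewOI = {0, 16}`, `AssumedSet = {1, 2}`, and a scale of 4, the result is `{4, 8, 20, 24}`; the original offsets 0 and 16 are absent, which is why the union cannot be accumulated into `NewOI` directly.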

jdoerfert added inline comments. Dec 5 2022, 11:06 AM

llvm/lib/Transforms/IPO/AttributorAttributes.cpp
1393–1396	We cannot wait in one AA for another to find a fixpoint. That is not sound. That is not even always possible. Even if it would be, it won't work in the current algorithm. You need to update the AA state based on the state of the other AA, always. Then signal if something changed. That said, if we retain stores doing this properly we need to understand why.
1403	"I realize there is no test for this yet. Working on that ... writing a GEP by hand is hard, when it involves nested aggregate types!" Use clang.
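The update rule requested in the comment on lines 1393–1396 (always absorb the current state of the other attribute, then report whether anything changed) can be illustrated with a toy model. Everything here is a hypothetical stand-in: `ToyState`, `updateFrom`, and this local `ChangeStatus` enum are not the Attributor's own types.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Toy change indicator, defined locally for this sketch.
enum class ChangeStatus { Unchanged, Changed };

// Hypothetical stand-in for the state of an abstract attribute.
struct ToyState {
  std::set<int64_t> Values;
};

// One update step: absorb whatever the dependency currently assumes, even if
// the dependency has not reached a fixpoint, and report whether our own state
// grew as a result.
ChangeStatus updateFrom(ToyState &Self, const ToyState &Dependency) {
  const auto Before = Self.Values.size();
  Self.Values.insert(Dependency.Values.begin(), Dependency.Values.end());
  return Self.Values.size() != Before ? ChangeStatus::Changed
                                      : ChangeStatus::Unchanged;
}
```

Calling `updateFrom` repeatedly while the dependency keeps growing reports a change each time; once the dependency stops growing, the step reports no change, and the driver can stop without either side waiting on the other.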

Remove incorrect use of isAtFixpoint. Instead, Access::operator& no longer drops content when ranges are merged.
Add more tests, move tests to new file for better naming.

sameerds marked 6 inline comments as done. Dec 7 2022, 8:14 AM

sameerds added inline comments.

llvm/include/llvm/Transforms/IPO/Attributor.h
5149	Can fix this locally too.
llvm/lib/Transforms/IPO/AttributorAttributes.cpp
1393–1396	You're right. What I did here was plain wrong. The root cause was that when we merge ranges in operator&= for Access objects, we conservatively drop the contents. We don't need to be so conservative ... just combining contents from the two Access objects works well in case both happen to have the same contents.
1403	Test added.
1540	I missed this. If there are no other changes required, I can fix this locally before submitting.

jdoerfert added inline comments. Dec 7 2022, 8:22 AM

llvm/include/llvm/Transforms/IPO/Attributor.h
5225–5226	It was dropping the results for a reason before; just going back on it is probably not sound either. If we have:

range: [0-4]
value: i32 -42

and merge it with

range: [0-8]
value: i64 -42

we need to do something here. At least, we need to change must to may, but I think we cannot even keep the value if they are equal under zext. If we did, writing `i32 0` and `i64 0` would make us believe the former writes 8 bytes and they'll all be 0, which is not true.

When does this happen? Maybe we need to understand the problem better.
llvm/test/Transforms/Attributor/multiple-offsets-pointer-info.ll
337	AAPotentialConstantValues uses AAPotentialValues but the case to handle PHINodes is missing here: https://github.com/llvm/llvm-project/blob/72d76a2403459a38a1d6daae62de6945097db8f9/llvm/lib/Transforms/IPO/AttributorAttributes.cpp#L9222 We just need to go over the operands and call the fillSetWithConstantValues function. We can do that in a follow up or pre-patch, just commenting here to make sure we don't forget.
llvm/test/Transforms/Attributor/value-simplify-pointer-info.ll
3157	I think this is the same problem, we do not handle LoadInst and should just call fillSetWithConstantValues on the load value itself.

sameerds marked 2 inline comments as done. Dec 7 2022, 8:57 AM

sameerds added inline comments.

llvm/include/llvm/Transforms/IPO/Attributor.h
5225–5226	The Access objects being merged here originate from the same remote instruction. Doesn't that mean that the type is guaranteed to be the same? I tried putting an assert(Ty == R.Ty), and it did not fire for the lit tests. Instead of asserting, we can always check that before calling combineInValueLattice().
llvm/test/Transforms/Attributor/multiple-offsets-pointer-info.ll
337	Exactly what I had in mind. Would prefer to do this as a follow-up.

jdoerfert added inline comments. Dec 7 2022, 9:33 AM

llvm/include/llvm/Transforms/IPO/Attributor.h
5225–5226	I'm still confused. Even if it's the same value this is a problem, no?

range: [0-4]
value: i32 4
must write

merged with

range: [4-8]
value: i32 4
must write

will result in

range: [0-8]
value: i32 4
must write

which is not true. For one, we may only write 4 out of 8 bytes, and depending on which ones, it's not going to be `4` if you read the range [0-8].

Harbormaster completed remote builds in B201705: Diff 480918. Dec 7 2022, 5:47 PM

sameerds added inline comments. Dec 7 2022, 9:28 PM

llvm/include/llvm/Transforms/IPO/Attributor.h
5225–5226	Here is what I see about the creation of Access objects:

- Each Access is for a unique value.
- If the value is a MemIntrinsic, then the length is known, and all ranges for that Access will have that same length.
- Else, if the value is an argument, then the length is unknown and the Access has only one range (unknown).
- Else, the value is an instruction with an optionally known type (see handleAccess()). If the length is known, then all ranges have the same length, else it is a single unknown range.

This invariant is maintained even when looking through function calls.

If the above is correct, then it might be redundant to even track the size for every Range in a RangeList. But that is assuming Ranges are used only for PointerInfo::Access objects. If the Range should remain generic, then we should allow the possibility that all Ranges in a RangeList are not the same size. We could add a bool "AllRangesAreSameSize" and check this when merging an Access into another Access.

So if Ranges are not the same size, then the contents are unknown. Else, if the types are the same, then combine the contents. Else the contents are unknown.

sameerds marked an inline comment as not done. Dec 7 2022, 9:39 PM

sameerds added inline comments.

llvm/include/llvm/Transforms/IPO/Attributor.h
5225–5226	Correction in the third bullet ... the length may be unknown if it is not llvm::Argument. Otherwise the invariant about looking through function calls applies.

jdoerfert added inline comments. Dec 7 2022, 9:42 PM

llvm/include/llvm/Transforms/IPO/Attributor.h
5225–5226	I'm very confused. Apologies. So, if I understand this correctly now we will keep the value but mark the access as MAY if it has more than one range. That seems correct. My example above was merging the ranges, which we are not, and also keeping the MUST bit, which we are not. So, assuming I finally understand what is happening this should be fine.

LG, I think.

This revision is now accepted and ready to land. Dec 7 2022, 9:42 PM

sameerds marked an inline comment as not done. Dec 7 2022, 10:43 PM

sameerds added inline comments.

llvm/include/llvm/Transforms/IPO/Attributor.h
5225–5226	I am just glad to have your attention while I work through this! It's an important optimization for launching HIP programs. So yes, if there are multiple ranges in an Access, they are of the same size. We keep the contents if possible, but it is always a MAY access. I have put asserts in strategic places (isMayAccess and isMustAccess) to catch that.
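The resolution reached in this thread (ranges are kept separate rather than widened, and an access with more than one range is always a MAY access) can be illustrated with a toy model in plain C++. `ToyAccess` and the `MAY`/`MUST` bits below are hypothetical simplifications that only loosely mirror `Access`, `AK_MAY`, and `AK_MUST` from the patch; content tracking is omitted.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Toy access-kind bits, loosely mirroring AK_MAY / AK_MUST.
enum : unsigned { MAY = 1u << 0, MUST = 1u << 1 };

// Hypothetical, simplified access: a set of offsets that share one size.
struct ToyAccess {
  std::set<int64_t> Offsets;
  int64_t Size;
  unsigned Kind;

  // Merge another access to the same value. Ranges stay separate (a set
  // union, not a widened range), and more than one range forces MAY.
  ToyAccess &operator&=(const ToyAccess &R) {
    Offsets.insert(R.Offsets.begin(), R.Offsets.end());
    Kind |= R.Kind;
    if ((Kind & MAY) || Offsets.size() > 1) {
      Kind |= MAY;
      Kind &= ~MUST;
    }
    return *this;
  }
};
```

Merging a MUST write at offset 0 with a MUST write at offset 4 (both of size 4) leaves two 4-byte ranges and a MAY kind in this model, instead of a single 8-byte MUST range, which is the unsound outcome discussed earlier in the thread.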

FWIW, I landed https://reviews.llvm.org/rG1eab2d699e9581305f32473291e6afa47017d582 and you might need to verify the tests against it. Worst case, we need to recursively call getPotentialValues.

Rebased.
Added messages to assertions.
handleAccess expects a reference Type&, since that argument cannot be nullptr.

Herald added subscribers: foad, arsenm. Dec 8 2022, 10:25 PM

Harbormaster completed remote builds in B202143: Diff 481518. Dec 8 2022, 10:50 PM

Rebased.
Yielded to clang-format's insistence in a couple of places.

Harbormaster completed remote builds in B202490: Diff 481992. Dec 11 2022, 10:34 PM

Closed by commit rGc2a0baad1fbb: [AAPointerInfo] track multiple constant offsets for each use (authored by sameerds). Dec 12 2022, 12:09 AM

This revision was automatically updated to reflect the committed changes.

sameerds added a commit: rGc2a0baad1fbb: [AAPointerInfo] track multiple constant offsets for each use.

sameerds added a reverting change: rG2fdeb2779006: Revert "[AAPointerInfo] track multiple constant offsets for each use". Dec 12 2022, 2:09 AM

sameerds added a commit: rG6a2305484e87: [AAPointerInfo] track multiple constant offsets for each use. Dec 13 2022, 8:57 AM

Revision Contents

Path	Size
llvm/include/llvm/Transforms/IPO/Attributor.h	247 lines
llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp	2 lines
llvm/lib/Transforms/IPO/AttributorAttributes.cpp	296 lines
llvm/test/CodeGen/AMDGPU/attributor.ll	30 lines
	implicitarg-attributes.ll
	attributor.ll
llvm/test/Transforms/Attributor/call-simplify-pointer-info.ll	172 lines
llvm/test/Transforms/Attributor/multiple-offsets-pointer-info.ll	342 lines
llvm/test/Transforms/Attributor/value-simplify-pointer-info.ll	73 lines

Diff 482009

llvm/include/llvm/Transforms/IPO/Attributor.h

@@ RangeTy &operator&=(const RangeTy &R) {
     else if (Size == Unknown || R.Size == Unknown)
       Size = Unknown;
     else if (R.Size != Unassigned)
       Size = std::max(Size, R.Size);
     return *this;
   }
+  /// Comparison for sorting ranges by offset.
+  ///
+  /// Returns true if the offset \p L is less than that of \p R.
+  inline static bool OffsetLessThan(const RangeTy &L, const RangeTy &R) {
+    return L.Offset < R.Offset;
+  }
   /// Constants used to represent special offsets or sizes.
   /// - This assumes that Offset and Size are non-negative.
   /// - The constants should not clash with DenseMapInfo, such as EmptyKey
   ///   (INT64_MAX) and TombstoneKey (INT64_MIN).
   static constexpr int64_t Unassigned = -1;
   static constexpr int64_t Unknown = -2;
 };
+inline raw_ostream &operator<<(raw_ostream &OS, const RangeTy &R) {
+  OS << "[" << R.Offset << ", " << R.Size << "]";
+  return OS;
+}
 inline bool operator==(const RangeTy &A, const RangeTy &B) {
   return A.Offset == B.Offset && A.Size == B.Size;
 }
 inline bool operator!=(const RangeTy &A, const RangeTy &B) { return !(A == B); }
 /// Return the initial value of \p Obj with type \p Ty if that is a constant.
 Constant *getInitialValueForObj(Value &Obj, Type &Ty,
@@ enum AccessKind {
     AK_MAY_READ = AK_MAY | AK_R,
     AK_MAY_WRITE = AK_MAY | AK_W,
     AK_MAY_READ_WRITE = AK_MAY | AK_R | AK_W,
     AK_MUST_READ = AK_MUST | AK_R,
     AK_MUST_WRITE = AK_MUST | AK_W,
     AK_MUST_READ_WRITE = AK_MUST | AK_R | AK_W,
   };

+  /// A container for a list of ranges.
+  struct RangeList {
+    // The set of ranges rarely contains more than one element, and is unlikely
+    // to contain more than say four elements. So we find the middle-ground with
+    // a sorted vector. This avoids hard-coding a rarely used number like "four"
+    // into every instance of a SmallSet.
+    using RangeTy = AA::RangeTy;
+    using VecTy = SmallVector<RangeTy>;
+    using iterator = VecTy::iterator;
+    using const_iterator = VecTy::const_iterator;
+    VecTy Ranges;
+    RangeList(const RangeTy &R) { Ranges.push_back(R); }
+    RangeList(ArrayRef<int64_t> Offsets, int64_t Size) {
+      Ranges.reserve(Offsets.size());
+      for (unsigned i = 0, e = Offsets.size(); i != e; ++i) {
+        assert(((i + 1 == e) || Offsets[i] < Offsets[i + 1]) &&
+               "Expected strictly ascending offsets.");
+        Ranges.emplace_back(Offsets[i], Size);
+      }
+    }
+    RangeList() = default;
+    iterator begin() { return Ranges.begin(); }
+    iterator end() { return Ranges.end(); }
+    const_iterator begin() const { return Ranges.begin(); }
+    const_iterator end() const { return Ranges.end(); }
+    // Helpers required for std::set_difference
+    using value_type = RangeTy;
+    void push_back(const RangeTy &R) {
+      assert((Ranges.empty() || RangeTy::OffsetLessThan(Ranges.back(), R)) &&
+             "Ensure the last element is the greatest.");
+      Ranges.push_back(R);
+    }
+    /// Copy ranges from \p L that are not in \p R, into \p D.
+    static void set_difference(const RangeList &L, const RangeList &R,
+                               RangeList &D) {
+      std::set_difference(L.begin(), L.end(), R.begin(), R.end(),
+                          std::back_inserter(D), RangeTy::OffsetLessThan);
+    }
+    unsigned size() const { return Ranges.size(); }
+    bool operator==(const RangeList &OI) const { return Ranges == OI.Ranges; }
+    /// Merge the ranges in \p RHS into the current ranges.
+    /// - Merging a list of unknown ranges makes the current list unknown.
+    /// - Ranges with the same offset are merged according to RangeTy::operator&
+    /// \return true if the current RangeList changed.
+    bool merge(const RangeList &RHS) {
+      if (isUnknown())
+        return false;
+      if (RHS.isUnknown()) {
+        setUnknown();
+        return true;
+      }
+      if (Ranges.empty()) {
+        Ranges = RHS.Ranges;
+        return true;
+      }
+      bool Changed = false;
+      auto LPos = Ranges.begin();
+      for (auto &R : RHS.Ranges) {
+        auto Result = insert(LPos, R);
+        if (isUnknown())
+          return true;
+        LPos = Result.first;
+        Changed |= Result.second;
+      }
+      return Changed;
+    }
+    /// Insert \p R at the given iterator \p Pos, and merge if necessary.
+    ///
+    /// This assumes that all ranges before \p Pos are OffsetLessThan \p R, and
+    /// then maintains the sorted order for the suffix list.
+    ///
+    /// \return The place of insertion and true iff anything changed.
+    std::pair<iterator, bool> insert(iterator Pos, const RangeTy &R) {
+      if (isUnknown())
+        return std::make_pair(Ranges.begin(), false);
+      if (R.offsetOrSizeAreUnknown()) {
+        return std::make_pair(setUnknown(), true);
+      }
+      // Maintain this as a sorted vector of unique entries.
+      auto LB = std::lower_bound(Pos, Ranges.end(), R, RangeTy::OffsetLessThan);
+      if (LB == Ranges.end() || LB->Offset != R.Offset)
+        return std::make_pair(Ranges.insert(LB, R), true);
+      bool Changed = *LB != R;
+      *LB &= R;
+      if (LB->offsetOrSizeAreUnknown())
+        return std::make_pair(setUnknown(), true);
+      return std::make_pair(LB, Changed);
+    }
+    /// Insert the given range \p R, maintaining sorted order.
+    ///
+    /// \return The place of insertion and true iff anything changed.
+    std::pair<iterator, bool> insert(const RangeTy &R) {
+      return insert(Ranges.begin(), R);
+    }
+    /// Add the increment \p Inc to the offset of every range.
+    void addToAllOffsets(int64_t Inc) {
+      assert(!isUnassigned() &&
+             "Cannot increment if the offset is not yet computed!");
+      if (isUnknown())
+        return;
+      for (auto &R : Ranges) {
+        R.Offset += Inc;
+      }
+    }
+    /// Return true iff there is exactly one range and it is known.
+    bool isUnique() const {
+      return Ranges.size() == 1 && !Ranges.front().offsetOrSizeAreUnknown();
+    }
+    /// Return the unique range, assuming it exists.
+    const RangeTy &getUnique() const {

jdoerfert (inline comment): All assertions need messages please, also elsewhere.

  const RangeTy &getUnique() const {
- assert(isUnique());
+ assert(isUnique() && "Cannot return a unique range if there is no single one.");
  return Ranges.front();

sameerds (inline comment): Can fix this locally too.

+      assert(isUnique() && "No unique range to return!");
+      return Ranges.front();
+    }
+    /// Return true iff the list contains an unknown range.
+    bool isUnknown() const {
+      if (isUnassigned())
+        return false;
+      if (Ranges.front().offsetOrSizeAreUnknown()) {
+        assert(Ranges.size() == 1 && "Unknown is a singleton range.");
+        return true;
+      }
+      return false;
+    }
+    /// Discard all ranges and insert a single unknown range.
+    iterator setUnknown() {
+      Ranges.clear();
+      Ranges.push_back(RangeTy::getUnknown());
+      return Ranges.begin();
+    }
+    /// Return true if no ranges have been inserted.
+    bool isUnassigned() const { return Ranges.size() == 0; }
+  };

   /// An access description.
   struct Access {
     Access(Instruction *I, int64_t Offset, int64_t Size,
            std::optional<Value *> Content, AccessKind Kind, Type *Ty)
-        : LocalI(I), RemoteI(I), Content(Content), Range(Offset, Size),
+        : LocalI(I), RemoteI(I), Content(Content), Ranges(Offset, Size),
           Kind(Kind), Ty(Ty) {
       verify();
     }
+    Access(Instruction *LocalI, Instruction *RemoteI, const RangeList &Ranges,
+           std::optional<Value *> Content, AccessKind K, Type *Ty)
+        : LocalI(LocalI), RemoteI(RemoteI), Content(Content), Ranges(Ranges),
+          Kind(K), Ty(Ty) {
+      if (Ranges.size() > 1) {
+        Kind = AccessKind(Kind | AK_MAY);
+        Kind = AccessKind(Kind & ~AK_MUST);
+      }
+      verify();
+    }
     Access(Instruction *LocalI, Instruction *RemoteI, int64_t Offset,
            int64_t Size, std::optional<Value *> Content, AccessKind Kind,
            Type *Ty)
         : LocalI(LocalI), RemoteI(RemoteI), Content(Content),
-          Range(Offset, Size), Kind(Kind), Ty(Ty) {
+          Ranges(Offset, Size), Kind(Kind), Ty(Ty) {
       verify();
     }
     Access(const Access &Other) = default;
-    Access(const Access &&Other)
-        : LocalI(Other.LocalI), RemoteI(Other.RemoteI), Content(Other.Content),
-          Range(Other.Range), Kind(Other.Kind), Ty(Other.Ty) {}
     Access &operator=(const Access &Other) = default;
     bool operator==(const Access &R) const {
-      return LocalI == R.LocalI && RemoteI == R.RemoteI && Range == R.Range &&
+      return LocalI == R.LocalI && RemoteI == R.RemoteI && Ranges == R.Ranges &&
              Content == R.Content && Kind == R.Kind;
     }
     bool operator!=(const Access &R) const { return !(*this == R); }
     Access &operator&=(const Access &R) {
       assert(RemoteI == R.RemoteI && "Expected same instruction!");
       assert(LocalI == R.LocalI && "Expected same instruction!");
-      Kind = AccessKind(Kind | R.Kind);
-      auto Before = Range;
-      Range &= R.Range;
-      if (Before.isUnassigned() || Before == Range) {
+      // Note that every Access object corresponds to a unique Value, and only
+      // accesses to the same Value are merged. Hence we assume that all ranges
+      // are the same size. If ranges can be different size, then the contents
+      // must be dropped.
+      Ranges.merge(R.Ranges);
       Content =
           AA::combineOptionalValuesInAAValueLatice(Content, R.Content, Ty);
-      } else {
-        // Since the Range information changed, set a conservative state -- drop
-        // the contents, and assume MayAccess rather than MustAccess.
-        setWrittenValueUnknown();
+      // Combine the access kind, which results in a bitwise union.
+      // - If MAY is present in the union, then any MUST needs to be removed.
+      // - If there is more than one range, then this must be a MAY.
+      Kind = AccessKind(Kind | R.Kind);
+      if ((Kind & AK_MAY) || Ranges.size() > 1) {


         Kind = AccessKind(Kind | AK_MAY);
         Kind = AccessKind(Kind & ~AK_MUST);
       }
       verify();
       return *this;
     }
     void verify() {
       assert(isMustAccess() + isMayAccess() == 1 &&
              "Expect must or may access, not both.");
       assert(isAssumption() + isWrite() <= 1 &&
              "Expect assumption access or write access, never both.");
+      assert((isMayAccess() || Ranges.size() == 1) &&
+             "Cannot be a must access if there are multiple ranges.");
     }
     /// Return the access kind.
     AccessKind getKind() const { return Kind; }
     /// Return true if this is a read access.
     bool isRead() const { return Kind & AK_R; }
     /// Return true if this is a write access.
     bool isWrite() const { return Kind & AK_W; }
     /// Return true if this is a write access.
     bool isWriteOrAssumption() const { return isWrite() || isAssumption(); }
     /// Return true if this is an assumption access.
     bool isAssumption() const { return Kind == AK_ASSUMPTION; }
-    bool isMustAccess() const { return Kind & AK_MUST; }
-    bool isMayAccess() const { return Kind & AK_MAY; }
+    bool isMustAccess() const {
+      bool MustAccess = Kind & AK_MUST;
+      assert((!MustAccess || Ranges.size() < 2) &&
+             "Cannot be a must access if there are multiple ranges.");
+      return MustAccess;
+    }
+    bool isMayAccess() const {
+      bool MayAccess = Kind & AK_MAY;
+      assert((MayAccess || Ranges.size() < 2) &&
+             "Cannot be a must access if there are multiple ranges.");
+      return MayAccess;
+    }
     /// Return the instruction that causes the access with respect to the local
     /// scope of the associated attribute.
     Instruction *getLocalInst() const { return LocalI; }
     /// Return the actual instruction that causes the access.
     Instruction *getRemoteInst() const { return RemoteI; }
@@ Value *getWrittenValue() const {
              "Value needs to be determined before accessing it.");
       return *Content;
     }
     /// Return the written value which can be `llvm::null` if it is not yet
     /// determined.
     std::optional<Value *> getContent() const { return Content; }

/// Return the offset for this access. bool hasUniqueRange() const { return Ranges.isUnique(); }

int64_t getOffset() const { return Range.Offset; } const AA::RangeTy &getUniqueRange() const { return Ranges.getUnique(); }

/// Add a range accessed by this Access.

///

/// If there are multiple ranges, then this is a "may access".

void addRange(int64_t Offset, int64_t Size) {

Ranges.insert({Offset, Size});

if (!hasUniqueRange()) {

Kind = AccessKind(Kind | AK_MAY);

Kind = AccessKind(Kind & ~AK_MUST);

}

}

const RangeList &getRanges() const { return Ranges; }

/// Return the size for this access. using const_iterator = RangeList::const_iterator;

int64_t getSize() const { return Range.Size; } const_iterator begin() const { return Ranges.begin(); }

const_iterator end() const { return Ranges.end(); }

private: private:

/// The instruction responsible for the access with respect to the local /// The instruction responsible for the access with respect to the local

/// scope of the associated attribute. /// scope of the associated attribute.

Instruction *LocalI; Instruction *LocalI;

/// The instruction responsible for the access. /// The instruction responsible for the access.

Instruction *RemoteI; Instruction *RemoteI;

/// The value written, if any. `llvm::none` means "not known yet", `nullptr` /// The value written, if any. `llvm::none` means "not known yet", `nullptr`

/// cannot be determined. /// cannot be determined.

std::optional<Value *> Content; std::optional<Value *> Content;

/// The object accessed, in terms of an offset and size in bytes. /// Set of potential ranges accessed from the base pointer.

AA::RangeTy Range; RangeList Ranges;

/// The access kind, e.g., READ, as bitset (could be more than one). /// The access kind, e.g., READ, as bitset (could be more than one).

AccessKind Kind; AccessKind Kind;

/// The type of the content, thus the type read/written, can be null if not /// The type of the content, thus the type read/written, can be null if not

/// available. /// available.

Type *Ty; Type *Ty;

}; };
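The new `addRange()` ties the access kind to the number of ranges: once an access covers more than one range, it can no longer be a "must" access. A minimal standalone sketch of that invariant follows. It uses only the standard library; `ToyAccess` and the bit values in `AccessKindBits` are hypothetical stand-ins, not the types from `Attributor.h`.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <utility>

// Hypothetical stand-ins for the AccessKind bits used by the patch.
enum AccessKindBits : unsigned {
  AK_MUST = 1u << 0,
  AK_MAY = 1u << 1,
  AK_R = 1u << 2
};

// Toy model of an access that tracks a set of (offset, size) ranges.
struct ToyAccess {
  std::set<std::pair<int64_t, int64_t>> Ranges;
  unsigned Kind = AK_R | AK_MUST;

  bool hasUniqueRange() const { return Ranges.size() == 1; }

  // Adding a second distinct range demotes the access from MUST to MAY.
  void addRange(int64_t Offset, int64_t Size) {
    Ranges.insert({Offset, Size});
    if (!hasUniqueRange()) {
      Kind |= AK_MAY;
      Kind &= ~static_cast<unsigned>(AK_MUST);
    }
  }

  bool isMustAccess() const { return (Kind & AK_MUST) != 0; }
  bool isMayAccess() const { return (Kind & AK_MAY) != 0; }
};
```

Re-adding an existing range leaves the set, and therefore the kind, unchanged. Only a genuinely new range triggers the demotion.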

▲ Show 20 Lines • Show All 85 Lines • Show Last 20 Lines

llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp

Show First 20 Lines • Show All 753 Lines • ▼ Show 20 Lines	bool runOnModule(Module &M) override {
}		}

CallGraphUpdater CGUpdater;		CallGraphUpdater CGUpdater;
BumpPtrAllocator Allocator;		BumpPtrAllocator Allocator;
AMDGPUInformationCache InfoCache(M, AG, Allocator, nullptr, *TM);		AMDGPUInformationCache InfoCache(M, AG, Allocator, nullptr, *TM);
DenseSet<const char *> Allowed(		DenseSet<const char *> Allowed(
{&AAAMDAttributes::ID, &AAUniformWorkGroupSize::ID,		{&AAAMDAttributes::ID, &AAUniformWorkGroupSize::ID,
&AAPotentialValues::ID, &AAAMDFlatWorkGroupSize::ID, &AACallEdges::ID,		&AAPotentialValues::ID, &AAAMDFlatWorkGroupSize::ID, &AACallEdges::ID,
&AAPointerInfo::ID});		&AAPointerInfo::ID, &AAPotentialConstantValues::ID});

AttributorConfig AC(CGUpdater);		AttributorConfig AC(CGUpdater);
AC.Allowed = &Allowed;		AC.Allowed = &Allowed;
AC.IsModulePass = true;		AC.IsModulePass = true;
AC.DefaultInitializeLiveInternals = false;		AC.DefaultInitializeLiveInternals = false;

Attributor A(Functions, InfoCache, AC);		Attributor A(Functions, InfoCache, AC);

Show All 24 Lines

llvm/lib/Transforms/IPO/AttributorAttributes.cpp

Show First 20 Lines • Show All 813 Lines • ▼ Show 20 Lines	struct AA::PointerInfo::State : public AbstractState {
/// Add a new Access to the state at offset \p Offset and with size \p Size.		/// Add a new Access to the state at offset \p Offset and with size \p Size.
/// The access is associated with \p I, writes \p Content (if anything), and		/// The access is associated with \p I, writes \p Content (if anything), and
/// is of kind \p Kind. If an Access already exists for the same \p I and same		/// is of kind \p Kind. If an Access already exists for the same \p I and same
/// \p RemoteI, the two are combined, potentially losing information about		/// \p RemoteI, the two are combined, potentially losing information about
/// offset and size. The resulting access must now be moved from its original		/// offset and size. The resulting access must now be moved from its original
/// OffsetBin to the bin for its new offset.		/// OffsetBin to the bin for its new offset.
///		///
/// \Returns CHANGED, if the state changed, UNCHANGED otherwise.		/// \Returns CHANGED, if the state changed, UNCHANGED otherwise.
ChangeStatus addAccess(Attributor &A, int64_t Offset, int64_t Size,		ChangeStatus addAccess(Attributor &A, const AAPointerInfo::RangeList &Ranges,
Instruction &I, std::optional<Value *> Content,		Instruction &I, std::optional<Value *> Content,
AAPointerInfo::AccessKind Kind, Type *Ty,		AAPointerInfo::AccessKind Kind, Type *Ty,
Instruction *RemoteI = nullptr);		Instruction *RemoteI = nullptr);

using OffsetBinsTy = DenseMap<RangeTy, SmallSet<unsigned, 4>>;		using OffsetBinsTy = DenseMap<RangeTy, SmallSet<unsigned, 4>>;

using const_bin_iterator = OffsetBinsTy::const_iterator;		using const_bin_iterator = OffsetBinsTy::const_iterator;
const_bin_iterator begin() const { return OffsetBins.begin(); }		const_bin_iterator begin() const { return OffsetBins.begin(); }
Show All 29 Lines	bool forallInterferingAccesses(

for (const auto &It : OffsetBins) {		for (const auto &It : OffsetBins) {
AA::RangeTy ItRange = It.getFirst();		AA::RangeTy ItRange = It.getFirst();
if (!Range.mayOverlap(ItRange))		if (!Range.mayOverlap(ItRange))
continue;		continue;
bool IsExact = Range == ItRange && !Range.offsetOrSizeAreUnknown();		bool IsExact = Range == ItRange && !Range.offsetOrSizeAreUnknown();
for (auto Index : It.getSecond()) {		for (auto Index : It.getSecond()) {
auto &Access = AccessList[Index];		auto &Access = AccessList[Index];
if (!CB(Access, IsExact))		if (!CB(Access, IsExact && Access.hasUniqueRange()))
return false;		return false;
}		}
}		}
return true;		return true;
}		}

/// See AAPointerInfo::forallInterferingAccesses.		/// See AAPointerInfo::forallInterferingAccesses.
bool forallInterferingAccesses(		bool forallInterferingAccesses(
Instruction &I,		Instruction &I,
function_ref<bool(const AAPointerInfo::Access &, bool)> CB,		function_ref<bool(const AAPointerInfo::Access &, bool)> CB,
AA::RangeTy &Range) const {		AA::RangeTy &Range) const {
if (!isValidState())		if (!isValidState())
return false;		return false;

auto LocalList = RemoteIMap.find(&I);		auto LocalList = RemoteIMap.find(&I);
if (LocalList == RemoteIMap.end()) {		if (LocalList == RemoteIMap.end()) {
return true;		return true;
}		}

for (auto LI : LocalList->getSecond()) {		for (unsigned Index : LocalList->getSecond()) {
auto &Access = AccessList[LI];		for (auto &R : AccessList[Index]) {
Range &= {Access.getOffset(), Access.getSize()};		Range &= R;
		if (Range.offsetOrSizeAreUnknown())
		break;
		}
}		}
return forallInterferingAccesses(Range, CB);		return forallInterferingAccesses(Range, CB);
}		}

private:		private:
/// State to track fixpoint and validity.		/// State to track fixpoint and validity.
BooleanState BS;		BooleanState BS;
};		};

ChangeStatus AA::PointerInfo::State::addAccess(Attributor &A, int64_t Offset,		ChangeStatus AA::PointerInfo::State::addAccess(
int64_t Size, Instruction &I,		Attributor &A, const AAPointerInfo::RangeList &Ranges, Instruction &I,
std::optional<Value *> Content,		std::optional<Value *> Content, AAPointerInfo::AccessKind Kind, Type *Ty,
AAPointerInfo::AccessKind Kind,		Instruction *RemoteI) {
Type *Ty, Instruction *RemoteI) {
RemoteI = RemoteI ? RemoteI : &I;		RemoteI = RemoteI ? RemoteI : &I;
AAPointerInfo::Access Acc(&I, RemoteI, Offset, Size, Content, Kind, Ty);

// Check if we have an access for this instruction, if not, simply add it.		// Check if we have an access for this instruction, if not, simply add it.
auto &LocalList = RemoteIMap[RemoteI];		auto &LocalList = RemoteIMap[RemoteI];
bool AccExists = false;		bool AccExists = false;
unsigned AccIndex = AccessList.size();		unsigned AccIndex = AccessList.size();
for (auto Index : LocalList) {		for (auto Index : LocalList) {
auto &A = AccessList[Index];		auto &A = AccessList[Index];
if (A.getLocalInst() == &I) {		if (A.getLocalInst() == &I) {
AccExists = true;		AccExists = true;
AccIndex = Index;		AccIndex = Index;
break;		break;
}		}
}		}

		auto AddToBins = [&](const AAPointerInfo::RangeList &ToAdd) {
		LLVM_DEBUG(if (ToAdd.size()) dbgs()
		<< "[AAPointerInfo] Inserting access in new offset bins\n";);

		for (auto Key : ToAdd) {
		LLVM_DEBUG(dbgs() << " key " << Key << "\n");
		OffsetBins[Key].insert(AccIndex);
		}
		};

if (!AccExists) {		if (!AccExists) {
AccessList.push_back(Acc);		AccessList.emplace_back(&I, RemoteI, Ranges, Content, Kind, Ty);
		assert((AccessList.size() == AccIndex + 1) &&
		"New Access should have been at AccIndex");
LocalList.push_back(AccIndex);		LocalList.push_back(AccIndex);
} else {		AddToBins(AccessList[AccIndex].getRanges());
// The new one will be combined with the existing one.		return ChangeStatus::CHANGED;
		}

		// Combine the new Access with the existing Access, and then update the
		// mapping in the offset bins.
		AAPointerInfo::Access Acc(&I, RemoteI, Ranges, Content, Kind, Ty);
auto &Current = AccessList[AccIndex];		auto &Current = AccessList[AccIndex];
auto Before = Current;		auto Before = Current;
Current &= Acc;		Current &= Acc;
if (Current == Before)		if (Current == Before)
return ChangeStatus::UNCHANGED;		return ChangeStatus::UNCHANGED;

Acc = Current;		auto &ExistingRanges = Before.getRanges();
AA::RangeTy Key{Before.getOffset(), Before.getSize()};		auto &NewRanges = Current.getRanges();

		// Ranges that are in the old access but not the new access need to be removed
		// from the offset bins.
		AAPointerInfo::RangeList ToRemove;
		AAPointerInfo::RangeList::set_difference(ExistingRanges, NewRanges, ToRemove);
		LLVM_DEBUG(if (ToRemove.size()) dbgs()
		<< "[AAPointerInfo] Removing access from old offset bins\n";);

		for (auto Key : ToRemove) {
		LLVM_DEBUG(dbgs() << " key " << Key << "\n");
assert(OffsetBins.count(Key) && "Existing Access must be in some bin.");		assert(OffsetBins.count(Key) && "Existing Access must be in some bin.");
auto &Bin = OffsetBins[Key];		auto &Bin = OffsetBins[Key];
assert(Bin.count(AccIndex) &&		assert(Bin.count(AccIndex) &&
"Expected bin to actually contain the Access.");		"Expected bin to actually contain the Access.");
LLVM_DEBUG(dbgs() << "[AAPointerInfo] Removing Access "
<< AccessList[AccIndex] << " with key {" << Key.Offset
<< ',' << Key.Size << "}\n");
Bin.erase(AccIndex);		Bin.erase(AccIndex);
}		}

AA::RangeTy Key{Acc.getOffset(), Acc.getSize()};		// Ranges that are in the new access but not the old access need to be added
LLVM_DEBUG(dbgs() << "[AAPointerInfo] Inserting Access " << Acc		// to the offset bins.
<< " with key {" << Key.Offset << ',' << Key.Size << "}\n");		AAPointerInfo::RangeList ToAdd;
OffsetBins[Key].insert(AccIndex);		AAPointerInfo::RangeList::set_difference(NewRanges, ExistingRanges, ToAdd);
		AddToBins(ToAdd);
return ChangeStatus::CHANGED;		return ChangeStatus::CHANGED;
}		}

namespace {		namespace {

/// Helper struct, will support ranges eventually.		/// A helper containing a list of offsets computed for a Use. Ideally this
///		/// list should be strictly ascending, but we ensure that only when we
/// FIXME: Tracks a single Offset until we have proper support for a list of		/// actually translate the list of offsets to a RangeList.
/// RangeTy objects.
struct OffsetInfo {		struct OffsetInfo {
int64_t Offset = AA::RangeTy::Unassigned;		using VecTy = SmallVector<int64_t>;
		using const_iterator = VecTy::const_iterator;
		VecTy Offsets;

		const_iterator begin() const { return Offsets.begin(); }
		const_iterator end() const { return Offsets.end(); }

		bool operator==(const OffsetInfo &RHS) const {
		return Offsets == RHS.Offsets;
		}

		void insert(int64_t Offset) { Offsets.push_back(Offset); }
		bool isUnassigned() const { return Offsets.size() == 0; }

bool operator==(const OffsetInfo &OI) const { return Offset == OI.Offset; }		bool isUnknown() const {
		if (isUnassigned())
		return false;
		if (Offsets.size() == 1)
		return Offsets.front() == AA::RangeTy::Unknown;
		return false;
		}

		void setUnknown() {
		Offsets.clear();
		Offsets.push_back(AA::RangeTy::Unknown);
		}

		void addToAll(int64_t Inc) {
		for (auto &Offset : Offsets) {
		Offset += Inc;
		}
		}

		/// Copy offsets from \p R into the current list.
		///
		/// Ideally all lists should be strictly ascending, but we defer that to the
		/// actual use of the list. So we just blindly append here.
		void merge(const OffsetInfo &R) { Offsets.append(R.Offsets); }
};		};

		static raw_ostream &operator<<(raw_ostream &OS, const OffsetInfo &OI) {
		ListSeparator LS;
		OS << "[";
		for (auto Offset : OI) {
		OS << LS << Offset;
		}
		OS << "]";
		return OS;
		}

struct AAPointerInfoImpl		struct AAPointerInfoImpl
: public StateWrapper<AA::PointerInfo::State, AAPointerInfo> {		: public StateWrapper<AA::PointerInfo::State, AAPointerInfo> {
using BaseTy = StateWrapper<AA::PointerInfo::State, AAPointerInfo>;		using BaseTy = StateWrapper<AA::PointerInfo::State, AAPointerInfo>;
AAPointerInfoImpl(const IRPosition &IRP, Attributor &A) : BaseTy(IRP) {}		AAPointerInfoImpl(const IRPosition &IRP, Attributor &A) : BaseTy(IRP) {}

/// See AbstractAttribute::getAsStr().		/// See AbstractAttribute::getAsStr().
const std::string getAsStr() const override {		const std::string getAsStr() const override {
return std::string("PointerInfo ") +		return std::string("PointerInfo ") +
▲ Show 20 Lines • Show All 205 Lines • ▼ Show 20 Lines	for (const auto &It : State) {
if (IsByval && !RAcc.isRead())		if (IsByval && !RAcc.isRead())
continue;		continue;
bool UsedAssumedInformation = false;		bool UsedAssumedInformation = false;
AccessKind AK = RAcc.getKind();		AccessKind AK = RAcc.getKind();
auto Content = A.translateArgumentToCallSiteContent(		auto Content = A.translateArgumentToCallSiteContent(
RAcc.getContent(), CB, *this, UsedAssumedInformation);		RAcc.getContent(), CB, *this, UsedAssumedInformation);
AK = AccessKind(AK & (IsByval ? AccessKind::AK_R : AccessKind::AK_RW));		AK = AccessKind(AK & (IsByval ? AccessKind::AK_R : AccessKind::AK_RW));
AK = AccessKind(AK | (RAcc.isMayAccess() ? AK_MAY : AK_MUST));		AK = AccessKind(AK | (RAcc.isMayAccess() ? AK_MAY : AK_MUST));
Changed =
Changed | addAccess(A, It.first.Offset, It.first.Size, CB, Content,		Changed |= addAccess(A, RAcc.getRanges(), CB, Content, AK,
AK, RAcc.getType(), RAcc.getRemoteInst());		RAcc.getType(), RAcc.getRemoteInst());
}		}
}		}
return Changed;		return Changed;
}		}

ChangeStatus translateAndAddState(Attributor &A, const AAPointerInfo &OtherAA,		ChangeStatus translateAndAddState(Attributor &A, const AAPointerInfo &OtherAA,
int64_t Offset, CallBase &CB) {		const OffsetInfo &Offsets, CallBase &CB) {
using namespace AA::PointerInfo;		using namespace AA::PointerInfo;
if (!OtherAA.getState().isValidState() || !isValidState())		if (!OtherAA.getState().isValidState() || !isValidState())
return indicatePessimisticFixpoint();		return indicatePessimisticFixpoint();

const auto &OtherAAImpl = static_cast<const AAPointerInfoImpl &>(OtherAA);		const auto &OtherAAImpl = static_cast<const AAPointerInfoImpl &>(OtherAA);

// Combine the accesses bin by bin.		// Combine the accesses bin by bin.
ChangeStatus Changed = ChangeStatus::UNCHANGED;		ChangeStatus Changed = ChangeStatus::UNCHANGED;
const auto &State = OtherAAImpl.getState();		const auto &State = OtherAAImpl.getState();
for (const auto &It : State) {		for (const auto &It : State) {
AA::RangeTy Range = AA::RangeTy::getUnknown();
if (Offset != AA::RangeTy::Unknown &&
!It.first.offsetOrSizeAreUnknown()) {
Range = AA::RangeTy(It.first.Offset + Offset, It.first.Size);
}
for (auto Index : It.getSecond()) {		for (auto Index : It.getSecond()) {
const auto &RAcc = State.getAccess(Index);		const auto &RAcc = State.getAccess(Index);
AccessKind AK = RAcc.getKind();		for (auto Offset : Offsets) {
Changed = Changed | addAccess(A, Range.Offset, Range.Size, CB,		auto NewRanges = Offset == AA::RangeTy::Unknown
RAcc.getContent(), AK, RAcc.getType(),		? AA::RangeTy::getUnknown()
RAcc.getRemoteInst());		: RAcc.getRanges();
		if (!NewRanges.isUnknown()) {
		NewRanges.addToAllOffsets(Offset);
		}
		Changed |=
		addAccess(A, NewRanges, CB, RAcc.getContent(), RAcc.getKind(),
		RAcc.getType(), RAcc.getRemoteInst());
		}
}		}
}		}
return Changed;		return Changed;
}		}

/// Statistic tracking for all AAPointerInfo implementations.		/// Statistic tracking for all AAPointerInfo implementations.
/// See AbstractAttribute::trackStatistics().		/// See AbstractAttribute::trackStatistics().
void trackPointerInfoStatistics(const IRPosition &IRP) const {}		void trackPointerInfoStatistics(const IRPosition &IRP) const {}
Show All 23 Lines
struct AAPointerInfoFloating : public AAPointerInfoImpl {		struct AAPointerInfoFloating : public AAPointerInfoImpl {
using AccessKind = AAPointerInfo::AccessKind;		using AccessKind = AAPointerInfo::AccessKind;
AAPointerInfoFloating(const IRPosition &IRP, Attributor &A)		AAPointerInfoFloating(const IRPosition &IRP, Attributor &A)
: AAPointerInfoImpl(IRP, A) {}		: AAPointerInfoImpl(IRP, A) {}

/// Deal with an access and signal if it was handled successfully.		/// Deal with an access and signal if it was handled successfully.
bool handleAccess(Attributor &A, Instruction &I,		bool handleAccess(Attributor &A, Instruction &I,
std::optional<Value *> Content, AccessKind Kind,		std::optional<Value *> Content, AccessKind Kind,
int64_t Offset, ChangeStatus &Changed, Type &Ty) {		SmallVectorImpl<int64_t> &Offsets, ChangeStatus &Changed,
		Type &Ty) {
using namespace AA::PointerInfo;		using namespace AA::PointerInfo;
auto Size = AA::RangeTy::Unknown;		auto Size = AA::RangeTy::Unknown;
const DataLayout &DL = A.getDataLayout();		const DataLayout &DL = A.getDataLayout();
TypeSize AccessSize = DL.getTypeStoreSize(&Ty);		TypeSize AccessSize = DL.getTypeStoreSize(&Ty);
if (!AccessSize.isScalable())		if (!AccessSize.isScalable())
Size = AccessSize.getFixedSize();		Size = AccessSize.getFixedSize();
Changed = Changed | addAccess(A, Offset, Size, I, Content, Kind, &Ty);
		// Make a strictly ascending list of offsets as required by addAccess()
		llvm::sort(Offsets);
		auto Last = std::unique(Offsets.begin(), Offsets.end());
		Offsets.erase(Last, Offsets.end());

		Changed = Changed | addAccess(A, {Offsets, Size}, I, Content, Kind, &Ty);
return true;		return true;
};		};
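`handleAccess()` now normalizes the collected offsets before building the range list, because `OffsetInfo::merge()` appends blindly and may leave duplicates in arbitrary order. The same sort-then-unique idiom, sketched with `std::vector` and `std::sort` in place of `SmallVector` and `llvm::sort` (the function name is hypothetical):

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Turn an arbitrary list of offsets into a strictly ascending one:
// sort first, then drop adjacent duplicates.
void makeStrictlyAscending(std::vector<int64_t> &Offsets) {
  std::sort(Offsets.begin(), Offsets.end());
  auto Last = std::unique(Offsets.begin(), Offsets.end());
  Offsets.erase(Last, Offsets.end());
}
```

`std::unique` only removes consecutive duplicates, so the sort must come first.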

/// See AbstractAttribute::updateImpl(...).		/// See AbstractAttribute::updateImpl(...).
ChangeStatus updateImpl(Attributor &A) override;		ChangeStatus updateImpl(Attributor &A) override;

		void collectConstantsForGEP(Attributor &A, const DataLayout &DL,
		OffsetInfo &UsrOI, const OffsetInfo &PtrOI,
		const GEPOperator *GEP, bool &Follow);

/// See AbstractAttribute::trackStatistics()		/// See AbstractAttribute::trackStatistics()
void trackStatistics() const override {		void trackStatistics() const override {
AAPointerInfoImpl::trackPointerInfoStatistics(getIRPosition());		AAPointerInfoImpl::trackPointerInfoStatistics(getIRPosition());
}		}
};		};

		/// If the indices to the GEP can be traced to constants, incorporate all
		/// of these into UsrOI.
		void AAPointerInfoFloating::collectConstantsForGEP(
		Attributor &A, const DataLayout &DL, OffsetInfo &UsrOI,
		const OffsetInfo &PtrOI, const GEPOperator *GEP, bool &Follow) {
		unsigned BitWidth = DL.getIndexTypeSizeInBits(GEP->getType());
		MapVector<Value *, APInt> VariableOffsets;
		APInt ConstantOffset(BitWidth, 0);

		Follow = true;

		if (!GEP->collectOffset(DL, BitWidth, VariableOffsets, ConstantOffset)) {
		UsrOI.setUnknown();
		return;
		}

		UsrOI = PtrOI;
		if (VariableOffsets.empty()) {
		LLVM_DEBUG(dbgs() << "[AAPointerInfo] GEP offset is constant " << *GEP
		<< "\n");
		UsrOI.addToAll(ConstantOffset.getSExtValue());
		return;
		}

		auto Union = UsrOI;
		Union.addToAll(ConstantOffset.getSExtValue());

		// Each VI in VariableOffsets has a set of potential constant values. Every
		// combination of elements, picked one each from these sets, is separately
		// added to the original set of offsets, thus resulting in more offsets.
		for (const auto &VI : VariableOffsets) {
		auto &PotentialConstantsAA = A.getAAFor<AAPotentialConstantValues>(
		this, IRPosition::value(VI.first), DepClassTy::OPTIONAL);
		if (!PotentialConstantsAA.isValidState()) {
		LLVM_DEBUG(dbgs() << "[AAPointerInfo] GEP offset is not constant " << *GEP
		<< "\n");
		UsrOI.setUnknown();
		return;
		}
		OffsetInfo NewUnion;
		jdoerfert: This doesn't make sense to me. We need to look at all VariableOffsets and decide. So `return` should only be present if we give up.
		sameerds (author): You might have missed the "not" in the condition. We want to give up and come again later if any information is not at fixed point. This is easily exercised by tests involving induction variables. The set of potential values of the induction variable keeps growing, but we should not use that set until it is fully enumerated. Any eager propagation of a non-fixed-point through PointerInfo at this point affects conclusions in other attributes that depend on it. I did not look for an example of correctness, but it does cause the attributor to retain stores that it would have otherwise removed in one existing test.
		jdoerfert: We cannot wait in one AA for another to find a fixpoint. That is not sound. That is not even always possible. Even if it were, it won't work in the current algorithm. You need to update the AA state based on the state of the other AA, always. Then signal if something changed. That said, if we retain stores when doing this properly, we need to understand why.
		sameerds (author): You're right. What I did here was plain wrong. The root cause was that when we merge ranges in operator&= for Access objects, we conservatively drop the contents. We don't need to be so conservative ... just combining contents from the two Access objects works well in case both happen to have the same contents.
		for (const auto &ConstOffset : PotentialConstantsAA.getAssumedSet()) {
		auto CopyPerOffset = Union;
		CopyPerOffset.addToAll(ConstOffset.getSExtValue() *
		VI.second.getZExtValue());
		NewUnion.merge(CopyPerOffset);
		}
		Union = NewUnion;
		jdoerfert: I don't follow why we need two extra OffsetInfo objects here. We modify NewOI anyway, no?
		sameerds (author): On each iteration of the outer for loop over VariableOffsets, the expression is a product: UnionOfAllCopies = NewOI x AssumedSet. CopyPerOffset is the temporary used by the inner loop to compute this product. We need UnionOfAllCopies because it must only contain modified copies of NewOI, but not NewOI itself. We can't merge the RHS into NewOI. We have to start with an empty set. The actual output of the function is UsrOI. We do not move NewOI into that if we exit early. I realize there is no test for this yet. Working on that ... writing a GEP by hand is hard, when it involves nested aggregate types!
		jdoerfert: > I realize there is no test for this yet. Working on that ... writing a GEP by hand is hard, when it involves nested aggregate types!
		Use clang.
		sameerds (author): Test added.
		}

		UsrOI = std::move(Union);
		return;
		}
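The product described in the review thread above can be sketched without any LLVM types. Starting from the pointer's offsets plus the constant part of the GEP, each variable index multiplies the current set by its set of potential constant values, scaled by the index's stride. `VariableIndex` and `expandOffsets` are hypothetical names for illustration. The patch itself works on `OffsetInfo` and `AAPotentialConstantValues`.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// One variable GEP index: its stride in bytes and the constants it may take.
struct VariableIndex {
  int64_t Scale;
  std::vector<int64_t> PotentialValues;
};

// Compute every offset reachable by picking one potential value per index.
std::vector<int64_t> expandOffsets(const std::vector<int64_t> &PtrOffsets,
                                   int64_t ConstantOffset,
                                   const std::vector<VariableIndex> &Variables) {
  std::vector<int64_t> Union;
  for (int64_t O : PtrOffsets)
    Union.push_back(O + ConstantOffset);

  for (const VariableIndex &VI : Variables) {
    // Start from an empty set so the result holds only shifted copies.
    std::vector<int64_t> NewUnion;
    for (int64_t Value : VI.PotentialValues)
      for (int64_t O : Union)
        NewUnion.push_back(O + Value * VI.Scale);
    Union = std::move(NewUnion);
  }
  return Union;
}
```

For example, a base offset of 0, a constant part of 4, and two indices with (stride 8, values {0, 1}) and (stride 2, values {0, 3}) yield four offsets. The result is not sorted or deduplicated, which matches the patch's choice to defer that step until the offsets become a range list.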

ChangeStatus AAPointerInfoFloating::updateImpl(Attributor &A) {		ChangeStatus AAPointerInfoFloating::updateImpl(Attributor &A) {
using namespace AA::PointerInfo;		using namespace AA::PointerInfo;
ChangeStatus Changed = ChangeStatus::UNCHANGED;		ChangeStatus Changed = ChangeStatus::UNCHANGED;
		const DataLayout &DL = A.getDataLayout();
Value &AssociatedValue = getAssociatedValue();		Value &AssociatedValue = getAssociatedValue();

const DataLayout &DL = A.getDataLayout();
DenseMap<Value *, OffsetInfo> OffsetInfoMap;		DenseMap<Value *, OffsetInfo> OffsetInfoMap;
OffsetInfoMap[&AssociatedValue] = OffsetInfo{0};		OffsetInfoMap[&AssociatedValue].insert(0);

auto HandlePassthroughUser = [&](Value *Usr, const OffsetInfo &PtrOI,		auto HandlePassthroughUser = [&](Value *Usr, const OffsetInfo &PtrOI,
bool &Follow) {		bool &Follow) {
assert(PtrOI.Offset != AA::RangeTy::Unassigned &&		assert(!PtrOI.isUnassigned() &&
"Cannot pass through if the input Ptr was not visited!");		"Cannot pass through if the input Ptr was not visited!");
OffsetInfoMap[Usr] = PtrOI;		OffsetInfoMap[Usr] = PtrOI;
Follow = true;		Follow = true;
return true;		return true;
};		};

const auto *TLI =		const auto *TLI =
getAnchorScope()		getAnchorScope()
Show All 18 Lines	if (ConstantExpr *CE = dyn_cast<ConstantExpr>(Usr)) {
return false;		return false;
}		}
}		}
if (auto *GEP = dyn_cast<GEPOperator>(Usr)) {		if (auto *GEP = dyn_cast<GEPOperator>(Usr)) {
// Note the order here, the Usr access might change the map, CurPtr is		// Note the order here, the Usr access might change the map, CurPtr is
// already in it though.		// already in it though.
auto &UsrOI = OffsetInfoMap[Usr];		auto &UsrOI = OffsetInfoMap[Usr];
auto &PtrOI = OffsetInfoMap[CurPtr];		auto &PtrOI = OffsetInfoMap[CurPtr];
UsrOI = PtrOI;

// TODO: Use range information.
APInt GEPOffset(DL.getIndexTypeSizeInBits(GEP->getType()), 0);
if (PtrOI.Offset == AA::RangeTy::Unknown ||
!GEP->accumulateConstantOffset(DL, GEPOffset)) {
LLVM_DEBUG(dbgs() << "[AAPointerInfo] GEP offset not constant " << *GEP
<< "\n");
UsrOI.Offset = AA::RangeTy::Unknown;
Follow = true;		Follow = true;

		if (PtrOI.isUnknown()) {
		UsrOI.setUnknown();
return true;		return true;
}		}

LLVM_DEBUG(dbgs() << "[AAPointerInfo] GEP offset is constant " << *GEP		collectConstantsForGEP(A, DL, UsrOI, PtrOI, GEP, Follow);
<< "\n");
UsrOI.Offset = PtrOI.Offset + GEPOffset.getZExtValue();
Follow = true;
return true;		return true;
}		}
if (isa<PtrToIntInst>(Usr))		if (isa<PtrToIntInst>(Usr))
return false;		return false;
if (isa<CastInst>(Usr) || isa<SelectInst>(Usr) || isa<ReturnInst>(Usr))		if (isa<CastInst>(Usr) || isa<SelectInst>(Usr) || isa<ReturnInst>(Usr))
return HandlePassthroughUser(Usr, OffsetInfoMap[CurPtr], Follow);		return HandlePassthroughUser(Usr, OffsetInfoMap[CurPtr], Follow);

// For PHIs we need to take care of the recurrence explicitly as the value		// For PHIs we need to take care of the recurrence explicitly as the value
// might change while we iterate through a loop. For now, we give up if		// might change while we iterate through a loop. For now, we give up if
// the PHI is not invariant.		// the PHI is not invariant.
if (isa<PHINode>(Usr)) {		if (isa<PHINode>(Usr)) {
// Note the order here, the Usr access might change the map, CurPtr is		// Note the order here, the Usr access might change the map, CurPtr is
// already in it though.		// already in it though.
bool IsFirstPHIUser = !OffsetInfoMap.count(Usr);		bool IsFirstPHIUser = !OffsetInfoMap.count(Usr);
auto &UsrOI = OffsetInfoMap[Usr];		auto &UsrOI = OffsetInfoMap[Usr];
auto &PtrOI = OffsetInfoMap[CurPtr];		auto &PtrOI = OffsetInfoMap[CurPtr];

// Check if the PHI operand has already an unknown offset as we can't		// Check if the PHI operand has already an unknown offset as we can't
// improve on that anymore.		// improve on that anymore.
if (PtrOI.Offset == AA::RangeTy::Unknown) {		if (PtrOI.isUnknown()) {
		jdoerfert: I think we might want to call `Attributor::getPotentialValues` on the variable offsets. It should handle select and phi and more, e.g., loads. Maybe in addition to `Attributor::getAssumedConstant` we should have `Attributor::getAssumedConstantValues`. I want to avoid yet another traversal of some instructions in favor of common interfaces.
LLVM_DEBUG(dbgs() << "[AAPointerInfo] PHI operand offset unknown "		LLVM_DEBUG(dbgs() << "[AAPointerInfo] PHI operand offset unknown "
<< *CurPtr << " in " << *Usr << "\n");		<< *CurPtr << " in " << *Usr << "\n");
Follow = UsrOI.Offset != AA::RangeTy::Unknown;		Follow = !UsrOI.isUnknown();
UsrOI = PtrOI;		UsrOI.setUnknown();
return true;		return true;
}		}

// Check if the PHI is invariant (so far).		// Check if the PHI is invariant (so far).
if (UsrOI == PtrOI) {		if (UsrOI == PtrOI) {
assert(PtrOI.Offset != AA::RangeTy::Unassigned &&		assert(!PtrOI.isUnassigned() &&
"Cannot assign if the current Ptr was not visited!");		"Cannot assign if the current Ptr was not visited!");
LLVM_DEBUG(dbgs() << "[AAPointerInfo] PHI is invariant (so far)");		LLVM_DEBUG(dbgs() << "[AAPointerInfo] PHI is invariant (so far)");
return true;		return true;
}		}

// Check if the PHI operand is not dependent on the PHI itself.		// Check if the PHI operand is not dependent on the PHI itself.
APInt Offset(		APInt Offset(
DL.getIndexSizeInBits(CurPtr->getType()->getPointerAddressSpace()),		DL.getIndexSizeInBits(CurPtr->getType()->getPointerAddressSpace()),
0);		0);
Value *CurPtrBase = CurPtr->stripAndAccumulateConstantOffsets(		Value *CurPtrBase = CurPtr->stripAndAccumulateConstantOffsets(
DL, Offset, /* AllowNonInbounds */ true);		DL, Offset, /* AllowNonInbounds */ true);
auto It = OffsetInfoMap.find(CurPtrBase);		auto It = OffsetInfoMap.find(CurPtrBase);
if (It != OffsetInfoMap.end()) {		if (It != OffsetInfoMap.end()) {
Offset += It->getSecond().Offset;		auto BaseOI = It->getSecond();
if (IsFirstPHIUser || Offset == UsrOI.Offset)		BaseOI.addToAll(Offset.getZExtValue());
		jdoerfert: SExt, I think. -1 is a fine offset.
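The distinction raised in this comment matters whenever the index width is narrower than 64 bits: zero extension turns a negative offset into a large positive one. A small sketch with plain integers, assuming a 32-bit index width for illustration (the helper names are hypothetical, and the narrowing cast relies on two's-complement conversion, which C++20 guarantees):

```cpp
#include <cassert>
#include <cstdint>

// Interpret the low 32 bits as a signed offset (analogous to sign extension).
int64_t signExtend32(uint32_t Bits) {
  return static_cast<int64_t>(static_cast<int32_t>(Bits));
}

// Interpret the low 32 bits as an unsigned quantity (analogous to zero
// extension).
int64_t zeroExtend32(uint32_t Bits) { return static_cast<int64_t>(Bits); }
```

With the bit pattern `0xFFFFFFFF`, the signed reading gives -1 while the unsigned reading gives 4294967295, so adding the latter to every tracked offset would record the wrong location.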
		if (IsFirstPHIUser || BaseOI == UsrOI) {
		LLVM_DEBUG(dbgs() << "[AAPointerInfo] PHI is invariant " << *CurPtr
		<< " in " << *Usr << "\n");
return HandlePassthroughUser(Usr, PtrOI, Follow);		return HandlePassthroughUser(Usr, PtrOI, Follow);
		}
LLVM_DEBUG(		LLVM_DEBUG(
dbgs() << "[AAPointerInfo] PHI operand pointer offset mismatch "		dbgs() << "[AAPointerInfo] PHI operand pointer offset mismatch "
<< *CurPtr << " in " << *Usr << "\n");		<< *CurPtr << " in " << *Usr << "\n");
} else {		} else {
LLVM_DEBUG(dbgs() << "[AAPointerInfo] PHI operand is too complex "		LLVM_DEBUG(dbgs() << "[AAPointerInfo] PHI operand is too complex "
<< *CurPtr << " in " << *Usr << "\n");		<< *CurPtr << " in " << *Usr << "\n");
}		}

// TODO: Approximate in case we know the direction of the recurrence.		// TODO: Approximate in case we know the direction of the recurrence.
UsrOI = PtrOI;		UsrOI.setUnknown();
UsrOI.Offset = AA::RangeTy::Unknown;
Follow = true;		Follow = true;
return true;		return true;
}		}

if (auto *LoadI = dyn_cast<LoadInst>(Usr)) {		if (auto *LoadI = dyn_cast<LoadInst>(Usr)) {
// If the access is to a pointer that may or may not be the associated		// If the access is to a pointer that may or may not be the associated
// value, e.g. due to a PHI, we cannot assume it will be read.		// value, e.g. due to a PHI, we cannot assume it will be read.
AccessKind AK = AccessKind::AK_R;		AccessKind AK = AccessKind::AK_R;
if (getUnderlyingObject(CurPtr) == &AssociatedValue)		if (getUnderlyingObject(CurPtr) == &AssociatedValue)
AK = AccessKind(AK | AccessKind::AK_MUST);		AK = AccessKind(AK | AccessKind::AK_MUST);
else		else
AK = AccessKind(AK | AccessKind::AK_MAY);		AK = AccessKind(AK | AccessKind::AK_MAY);
if (!handleAccess(A, *LoadI, /* Content */ nullptr, AK,		if (!handleAccess(A, *LoadI, /* Content */ nullptr, AK,
OffsetInfoMap[CurPtr].Offset, Changed,		OffsetInfoMap[CurPtr].Offsets, Changed,
*LoadI->getType()))		*LoadI->getType()))
return false;		return false;

		jdoerfert: Nit: Unsure why we need two returns here.
		sameerds: I missed this. If there are no other changes required, I can fix this locally before submitting.
auto IsAssumption = [](Instruction &I) {		auto IsAssumption = [](Instruction &I) {
if (auto *II = dyn_cast<IntrinsicInst>(&I))		if (auto *II = dyn_cast<IntrinsicInst>(&I))
return II->isAssumeLikeIntrinsic();		return II->isAssumeLikeIntrinsic();
return false;		return false;
};		};

auto IsImpactedInRange = [&](Instruction *FromI, Instruction *ToI) {		auto IsImpactedInRange = [&](Instruction *FromI, Instruction *ToI) {
// Check if the assumption and the load are executed together without		// Check if the assumption and the load are executed together without
[60 lines not shown; enclosing context: if (auto *LoadI = dyn_cast<LoadInst>(Usr)) {]
return true;		return true;

LLVM_DEBUG(dbgs() << "[AAPointerInfo] Assumption found "		LLVM_DEBUG(dbgs() << "[AAPointerInfo] Assumption found "
<< *Assumption.second << ": " << *LoadI		<< *Assumption.second << ": " << *LoadI
<< " == " << *Assumption.first << "\n");		<< " == " << *Assumption.first << "\n");

return handleAccess(		return handleAccess(
A, *Assumption.second, Assumption.first, AccessKind::AK_ASSUMPTION,		A, *Assumption.second, Assumption.first, AccessKind::AK_ASSUMPTION,
OffsetInfoMap[CurPtr].Offset, Changed, *LoadI->getType());		OffsetInfoMap[CurPtr].Offsets, Changed, *LoadI->getType());
}		}

auto HandleStoreLike = [&](Instruction &I, Value *ValueOp, Type &ValueTy,		auto HandleStoreLike = [&](Instruction &I, Value *ValueOp, Type &ValueTy,
ArrayRef<Value *> OtherOps, AccessKind AK) {		ArrayRef<Value *> OtherOps, AccessKind AK) {
for (auto *OtherOp : OtherOps) {		for (auto *OtherOp : OtherOps) {
if (OtherOp == CurPtr) {		if (OtherOp == CurPtr) {
LLVM_DEBUG(		LLVM_DEBUG(
dbgs()		dbgs()
[9 lines not shown; enclosing context: auto HandleStoreLike = [&](Instruction &I, Value *ValueOp, Type &ValueTy,]
AK = AccessKind(AK | AccessKind::AK_MUST);		AK = AccessKind(AK | AccessKind::AK_MUST);
else		else
AK = AccessKind(AK | AccessKind::AK_MAY);		AK = AccessKind(AK | AccessKind::AK_MAY);
bool UsedAssumedInformation = false;		bool UsedAssumedInformation = false;
std::optional<Value *> Content = nullptr;		std::optional<Value *> Content = nullptr;
if (ValueOp)		if (ValueOp)
Content = A.getAssumedSimplified(		Content = A.getAssumedSimplified(
ValueOp, this, UsedAssumedInformation, AA::Interprocedural);		ValueOp, this, UsedAssumedInformation, AA::Interprocedural);
return handleAccess(A, I, Content, AK, OffsetInfoMap[CurPtr].Offset,		return handleAccess(A, I, Content, AK, OffsetInfoMap[CurPtr].Offsets,
Changed, ValueTy);		Changed, ValueTy);
};		};

if (auto *StoreI = dyn_cast<StoreInst>(Usr))		if (auto *StoreI = dyn_cast<StoreInst>(Usr))
		jdoerfert: Same as above
return HandleStoreLike(*StoreI, StoreI->getValueOperand(),		return HandleStoreLike(*StoreI, StoreI->getValueOperand(),
*StoreI->getValueOperand()->getType(),		*StoreI->getValueOperand()->getType(),
{StoreI->getValueOperand()}, AccessKind::AK_W);		{StoreI->getValueOperand()}, AccessKind::AK_W);
if (auto *RMWI = dyn_cast<AtomicRMWInst>(Usr))		if (auto *RMWI = dyn_cast<AtomicRMWInst>(Usr))
return HandleStoreLike(*RMWI, nullptr, *RMWI->getValOperand()->getType(),		return HandleStoreLike(*RMWI, nullptr, *RMWI->getValOperand()->getType(),
{RMWI->getValOperand()}, AccessKind::AK_RW);		{RMWI->getValOperand()}, AccessKind::AK_RW);
if (auto *CXI = dyn_cast<AtomicCmpXchgInst>(Usr))		if (auto *CXI = dyn_cast<AtomicCmpXchgInst>(Usr))
return HandleStoreLike(		return HandleStoreLike(
*CXI, nullptr, *CXI->getNewValOperand()->getType(),		*CXI, nullptr, *CXI->getNewValOperand()->getType(),
{CXI->getCompareOperand(), CXI->getNewValOperand()},		{CXI->getCompareOperand(), CXI->getNewValOperand()},
AccessKind::AK_RW);		AccessKind::AK_RW);

if (auto *CB = dyn_cast<CallBase>(Usr)) {		if (auto *CB = dyn_cast<CallBase>(Usr)) {
if (CB->isLifetimeStartOrEnd())		if (CB->isLifetimeStartOrEnd())
return true;		return true;
if (getFreedOperand(CB, TLI) == U)		if (getFreedOperand(CB, TLI) == U)
return true;		return true;
if (CB->isArgOperand(&U)) {		if (CB->isArgOperand(&U)) {
unsigned ArgNo = CB->getArgOperandNo(&U);		unsigned ArgNo = CB->getArgOperandNo(&U);
const auto &CSArgPI = A.getAAFor<AAPointerInfo>(		const auto &CSArgPI = A.getAAFor<AAPointerInfo>(
this, IRPosition::callsite_argument(CB, ArgNo),		this, IRPosition::callsite_argument(CB, ArgNo),
DepClassTy::REQUIRED);		DepClassTy::REQUIRED);
Changed = translateAndAddState(A, CSArgPI, OffsetInfoMap[CurPtr].Offset,		Changed = translateAndAddState(A, CSArgPI, OffsetInfoMap[CurPtr], *CB) |
*CB) |
Changed;		Changed;
return isValidState();		return isValidState();
}		}
LLVM_DEBUG(dbgs() << "[AAPointerInfo] Call user not handled " << *CB		LLVM_DEBUG(dbgs() << "[AAPointerInfo] Call user not handled " << *CB
<< "\n");		<< "\n");
// TODO: Allow some call uses		// TODO: Allow some call uses
return false;		return false;
}		}

LLVM_DEBUG(dbgs() << "[AAPointerInfo] User not handled " << *Usr << "\n");		LLVM_DEBUG(dbgs() << "[AAPointerInfo] User not handled " << *Usr << "\n");
return false;		return false;
};		};
auto EquivalentUseCB = [&](const Use &OldU, const Use &NewU) {		auto EquivalentUseCB = [&](const Use &OldU, const Use &NewU) {
assert(OffsetInfoMap.count(OldU) && "Old use should be known already!");		assert(OffsetInfoMap.count(OldU) && "Old use should be known already!");
if (OffsetInfoMap.count(NewU)) {		if (OffsetInfoMap.count(NewU)) {
LLVM_DEBUG({		LLVM_DEBUG({
if (!(OffsetInfoMap[NewU] == OffsetInfoMap[OldU])) {		if (!(OffsetInfoMap[NewU] == OffsetInfoMap[OldU])) {
dbgs() << "[AAPointerInfo] Equivalent use callback failed: "		dbgs() << "[AAPointerInfo] Equivalent use callback failed: "
<< OffsetInfoMap[NewU].Offset << " vs "		<< OffsetInfoMap[NewU] << " vs " << OffsetInfoMap[OldU]
<< OffsetInfoMap[OldU].Offset << "\n";		<< "\n";
}		}
});		});
return OffsetInfoMap[NewU] == OffsetInfoMap[OldU];		return OffsetInfoMap[NewU] == OffsetInfoMap[OldU];
}		}
OffsetInfoMap[NewU] = OffsetInfoMap[OldU];		OffsetInfoMap[NewU] = OffsetInfoMap[OldU];
return true;		return true;
};		};
if (!A.checkForAllUses(UsePred, *this, AssociatedValue,		if (!A.checkForAllUses(UsePred, *this, AssociatedValue,
[63 lines not shown; enclosing context: if (auto *MI = dyn_cast_or_null<MemIntrinsic>(getCtxI())) {]
if (ArgNo > 1) {		if (ArgNo > 1) {
LLVM_DEBUG(dbgs() << "[AAPointerInfo] Unhandled memory intrinsic "		LLVM_DEBUG(dbgs() << "[AAPointerInfo] Unhandled memory intrinsic "
<< *MI << "\n");		<< *MI << "\n");
return indicatePessimisticFixpoint();		return indicatePessimisticFixpoint();
} else {		} else {
auto Kind =		auto Kind =
ArgNo == 0 ? AccessKind::AK_MUST_WRITE : AccessKind::AK_MUST_READ;		ArgNo == 0 ? AccessKind::AK_MUST_WRITE : AccessKind::AK_MUST_READ;
Changed =		Changed =
Changed | addAccess(A, 0, LengthVal, *MI, nullptr, Kind, nullptr);		Changed | addAccess(A, {0, LengthVal}, *MI, nullptr, Kind, nullptr);
}		}
LLVM_DEBUG({		LLVM_DEBUG({
dbgs() << "Accesses by bin after update:\n";		dbgs() << "Accesses by bin after update:\n";
dumpState(dbgs());		dumpState(dbgs());
});		});

return Changed;		return Changed;
}		}
[21 lines not shown; enclosing context: if (!NoCaptureAA.isAssumedNoCapture())]
return indicatePessimisticFixpoint();		return indicatePessimisticFixpoint();

bool IsKnown = false;		bool IsKnown = false;
if (AA::isAssumedReadNone(A, getIRPosition(), *this, IsKnown))		if (AA::isAssumedReadNone(A, getIRPosition(), *this, IsKnown))
return ChangeStatus::UNCHANGED;		return ChangeStatus::UNCHANGED;
bool ReadOnly = AA::isAssumedReadOnly(A, getIRPosition(), *this, IsKnown);		bool ReadOnly = AA::isAssumedReadOnly(A, getIRPosition(), *this, IsKnown);
auto Kind =		auto Kind =
ReadOnly ? AccessKind::AK_MAY_READ : AccessKind::AK_MAY_READ_WRITE;		ReadOnly ? AccessKind::AK_MAY_READ : AccessKind::AK_MAY_READ_WRITE;
return addAccess(A, AA::RangeTy::Unknown, AA::RangeTy::Unknown, *getCtxI(),		return addAccess(A, AA::RangeTy::getUnknown(), *getCtxI(), nullptr, Kind,
nullptr, Kind, nullptr);		nullptr);
}		}

/// See AbstractAttribute::trackStatistics()		/// See AbstractAttribute::trackStatistics()
void trackStatistics() const override {		void trackStatistics() const override {
AAPointerInfoImpl::trackPointerInfoStatistics(getIRPosition());		AAPointerInfoImpl::trackPointerInfoStatistics(getIRPosition());
}		}
};		};

[9,487 lines not shown]

llvm/test/CodeGen/AMDGPU/attributor.ll

This file was moved to llvm/test/CodeGen/AMDGPU/implicitarg-attributes.ll.

llvm/test/CodeGen/AMDGPU/implicitarg-attributes.ll

This file was moved from llvm/test/CodeGen/AMDGPU/attributor.ll.

; RUN: llc < %s | FileCheck %s		; RUN: llc < %s | FileCheck %s

target triple = "amdgcn-amd-amdhsa"		target triple = "amdgcn-amd-amdhsa"

; The call to intrinsic implicitarg_ptr reaches a load through a phi. The		; The call to intrinsic implicitarg_ptr reaches a load through a phi. The
; offsets of the phi cannot be determined, and hence the attributor assumes that		; offsets of the phi cannot be determined, and hence the attributor assumes that
; hostcall is in use.		; hostcall is in use.

		; CHECK-LABEL: amdhsa.kernels:
; CHECK: .value_kind: hidden_hostcall_buffer		; CHECK: .value_kind: hidden_hostcall_buffer
; CHECK: .value_kind: hidden_multigrid_sync_arg		; CHECK: .value_kind: hidden_multigrid_sync_arg
		; CHECK-LABEL: .name: kernel_1

define amdgpu_kernel void @the_kernel(i32 addrspace(1)* %a, i64 %index1, i64 %index2, i1 %cond) {		define amdgpu_kernel void @kernel_1(i32 addrspace(1)* %a, i64 %index1, i64 %index2, i1 %cond) {
entry:		entry:
%tmp7 = tail call i8 addrspace(4)* @llvm.amdgcn.implicitarg.ptr()		%tmp7 = tail call i8 addrspace(4)* @llvm.amdgcn.implicitarg.ptr()
br i1 %cond, label %old, label %new		br i1 %cond, label %old, label %new

old: ; preds = %entry		old: ; preds = %entry
%tmp4 = getelementptr i8, i8 addrspace(4)* %tmp7, i64 %index1		%tmp4 = getelementptr i8, i8 addrspace(4)* %tmp7, i64 %index1
br label %join		br label %join

[9 lines not shown; enclosing context: join: ; preds = %new, %old]
%.in = load i16, i16 addrspace(4)* %.in.in, align 2		%.in = load i16, i16 addrspace(4)* %.in.in, align 2

%idx.ext = sext i16 %.in to i64		%idx.ext = sext i16 %.in to i64
%add.ptr3 = getelementptr inbounds i32, i32 addrspace(1)* %a, i64 %idx.ext		%add.ptr3 = getelementptr inbounds i32, i32 addrspace(1)* %a, i64 %idx.ext
%tmp16 = atomicrmw add i32 addrspace(1)* %add.ptr3, i32 15 syncscope("agent-one-as") monotonic, align 4		%tmp16 = atomicrmw add i32 addrspace(1)* %add.ptr3, i32 15 syncscope("agent-one-as") monotonic, align 4
ret void		ret void
}		}

		; The call to intrinsic implicitarg_ptr is combined with an offset produced by
		; select'ing between two constants, before it is eventually used in a GEP to
		; form the address of a load. This test ensures that AAPointerInfo can look
		; through the select to maintain a set of indices, so that it can precisely
		; determine that hostcall and other expensive implicit args are not in use.

		; CHECK-NOT: hidden_hostcall_buffer
		; CHECK-NOT: hidden_multigrid_sync_arg
		; CHECK-LABEL: .name: kernel_2

		define amdgpu_kernel void @kernel_2(i32 addrspace(1)* %a, i1 %cond) {
		entry:
		%tmp7 = tail call i8 addrspace(4)* @llvm.amdgcn.implicitarg.ptr()
		%tmp5 = select i1 %cond, i64 12, i64 18
		%tmp6 = getelementptr inbounds i8, i8 addrspace(4)* %tmp7, i64 %tmp5
		%tmp8 = bitcast i8 addrspace(4)* %tmp6 to i16 addrspace(4)*

		;;; THIS USE is where multiple offsets are possible, relative to implicitarg_ptr
		%tmp9 = load i16, i16 addrspace(4)* %tmp8, align 2

		%idx.ext = sext i16 %tmp9 to i64
		%add.ptr3 = getelementptr inbounds i32, i32 addrspace(1)* %a, i64 %idx.ext
		%tmp16 = atomicrmw add i32 addrspace(1)* %add.ptr3, i32 15 syncscope("agent-one-as") monotonic, align 4
		ret void
		}
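To make the intent of the kernel_2 test concrete, here is a minimal sketch of offset-set tracking for gep(base, select(pred, C1, C2)). It is a standalone model and not the patch's implementation: `OffsetSet` and `gepOfSelect` are hypothetical names, and `std::set` is used only for brevity.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <utility>

// Hypothetical model of the offsets a single use may have relative to the
// underlying base pointer. Not the data structure used in LLVM.
struct OffsetSet {
  std::set<int64_t> Offsets;

  // Shift every tracked offset by a constant, as a constant-index GEP would.
  void addToAll(int64_t Inc) {
    std::set<int64_t> Shifted;
    for (int64_t O : Offsets)
      Shifted.insert(O + Inc);
    Offsets = std::move(Shifted);
  }

  // Union with another set: either set of offsets may apply at run time.
  void merge(const OffsetSet &Other) {
    Offsets.insert(Other.Offsets.begin(), Other.Offsets.end());
  }

  bool operator==(const OffsetSet &RHS) const {
    return Offsets == RHS.Offsets;
  }
};

// Offsets reachable through gep(base, select(pred, C1, C2)), given the
// offsets already known for `base`. Each arm of the select contributes a
// shifted copy of the base set, and the result is their union.
OffsetSet gepOfSelect(const OffsetSet &Base, int64_t C1, int64_t C2) {
  OffsetSet TrueArm = Base;
  OffsetSet FalseArm = Base;
  TrueArm.addToAll(C1);
  FalseArm.addToAll(C2);
  TrueArm.merge(FalseArm);
  return TrueArm;
}
```

With a base set of {0} and the constants 12 and 18 from the test, this model yields {12, 18}, so the 2-byte load can touch only bytes 12-13 or 18-19 relative to the implicitarg pointer rather than an unknown range.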

declare i32 @llvm.amdgcn.workitem.id.x()		declare i32 @llvm.amdgcn.workitem.id.x()

declare align 4 i8 addrspace(4)* @llvm.amdgcn.implicitarg.ptr()		declare align 4 i8 addrspace(4)* @llvm.amdgcn.implicitarg.ptr()

declare i32 @llvm.amdgcn.workgroup.id.x()		declare i32 @llvm.amdgcn.workgroup.id.x()

llvm/test/Transforms/Attributor/call-simplify-pointer-info.ll

[11 lines not shown]
; CGSCC-NEXT: ret i8 [[L]]		; CGSCC-NEXT: ret i8 [[L]]
;		;
entry:		entry:
%l = load i8, i8* %p, align 1		%l = load i8, i8* %p, align 1
ret i8 %l		ret i8 %l
}		}

define internal i8 @read_arg_index(i8* %p, i64 %index) {		define internal i8 @read_arg_index(i8* %p, i64 %index) {
; TUNIT: Function Attrs: nofree norecurse nosync nounwind willreturn memory(argmem: read)
; TUNIT-LABEL: define {{[^@]+}}@read_arg_index
; TUNIT-SAME: (i8* nocapture nofree noundef nonnull readonly align 2 dereferenceable(1022) [[P:%.*]]) #[[ATTR0:[0-9]+]] {
; TUNIT-NEXT: entry:
; TUNIT-NEXT: [[L:%.*]] = load i8, i8* [[P]], align 2
; TUNIT-NEXT: ret i8 [[L]]
;
; CGSCC: Function Attrs: nofree norecurse nosync nounwind willreturn memory(argmem: read)		; CGSCC: Function Attrs: nofree norecurse nosync nounwind willreturn memory(argmem: read)
; CGSCC-LABEL: define {{[^@]+}}@read_arg_index		; CGSCC-LABEL: define {{[^@]+}}@read_arg_index
; CGSCC-SAME: (i8* nocapture nofree noundef nonnull readonly dereferenceable(1022) [[P:%.*]]) #[[ATTR0]] {		; CGSCC-SAME: (i8* nocapture nofree noundef nonnull readonly align 16 dereferenceable(1024) [[P:%.*]]) #[[ATTR0]] {
; CGSCC-NEXT: entry:		; CGSCC-NEXT: entry:
; CGSCC-NEXT: [[L:%.*]] = load i8, i8* [[P]], align 1		; CGSCC-NEXT: [[G:%.*]] = getelementptr inbounds i8, i8* [[P]], i64 2
		; CGSCC-NEXT: [[L:%.*]] = load i8, i8* [[G]], align 1
; CGSCC-NEXT: ret i8 [[L]]		; CGSCC-NEXT: ret i8 [[L]]
;		;
entry:		entry:
%g = getelementptr inbounds i8, i8* %p, i64 %index		%g = getelementptr inbounds i8, i8* %p, i64 %index
%l = load i8, i8* %g, align 1		%l = load i8, i8* %g, align 1
ret i8 %l		ret i8 %l
}		}

define i8 @call_simplifiable_1() {		define i8 @call_simplifiable_1() {
; TUNIT: Function Attrs: nofree norecurse nosync nounwind willreturn memory(none)		; TUNIT: Function Attrs: nofree norecurse nosync nounwind willreturn memory(none)
; TUNIT-LABEL: define {{[^@]+}}@call_simplifiable_1		; TUNIT-LABEL: define {{[^@]+}}@call_simplifiable_1
; TUNIT-SAME: () #[[ATTR1:[0-9]+]] {		; TUNIT-SAME: () #[[ATTR0:[0-9]+]] {
; TUNIT-NEXT: entry:		; TUNIT-NEXT: entry:
; TUNIT-NEXT: [[BYTES:%.*]] = alloca [1024 x i8], align 16		; TUNIT-NEXT: [[BYTES:%.*]] = alloca [1024 x i8], align 16
; TUNIT-NEXT: [[I0:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 2		; TUNIT-NEXT: [[I0:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 2
; TUNIT-NEXT: ret i8 2		; TUNIT-NEXT: ret i8 2
;		;
; CGSCC: Function Attrs: nofree nosync nounwind willreturn memory(none)		; CGSCC: Function Attrs: nofree nosync nounwind willreturn memory(none)
; CGSCC-LABEL: define {{[^@]+}}@call_simplifiable_1		; CGSCC-LABEL: define {{[^@]+}}@call_simplifiable_1
; CGSCC-SAME: () #[[ATTR1:[0-9]+]] {		; CGSCC-SAME: () #[[ATTR1:[0-9]+]] {
; CGSCC-NEXT: entry:		; CGSCC-NEXT: entry:
; CGSCC-NEXT: [[BYTES:%.*]] = alloca [1024 x i8], align 16		; CGSCC-NEXT: [[BYTES:%.*]] = alloca [1024 x i8], align 16
; CGSCC-NEXT: [[I0:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 2		; CGSCC-NEXT: [[I0:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 2
; CGSCC-NEXT: store i8 2, i8* [[I0]], align 2		; CGSCC-NEXT: store i8 2, i8* [[I0]], align 2
; CGSCC-NEXT: [[R:%.*]] = call i8 @read_arg(i8* nocapture nofree noundef nonnull readonly align 2 dereferenceable(1022) [[I0]]) #[[ATTR3:[0-9]+]]		; CGSCC-NEXT: [[R:%.*]] = call i8 @read_arg(i8* nocapture nofree noundef nonnull readonly align 2 dereferenceable(1022) [[I0]]) #[[ATTR4:[0-9]+]]
; CGSCC-NEXT: ret i8 [[R]]		; CGSCC-NEXT: ret i8 [[R]]
;		;
entry:		entry:
%Bytes = alloca [1024 x i8], align 16		%Bytes = alloca [1024 x i8], align 16
%i0 = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 2		%i0 = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 2
store i8 2, i8* %i0, align 1		store i8 2, i8* %i0, align 1
%r = call i8 @read_arg(i8* %i0)		%r = call i8 @read_arg(i8* %i0)
ret i8 %r		ret i8 %r
[13 lines not shown; enclosing context: entry:]
%l = load i8, i8* %p, align 1		%l = load i8, i8* %p, align 1
ret i8 %l		ret i8 %l
}		}

define internal i8 @sum_two_same_loads(i8* %p) {		define internal i8 @sum_two_same_loads(i8* %p) {
; CGSCC: Function Attrs: nofree nosync nounwind willreturn memory(argmem: read)		; CGSCC: Function Attrs: nofree nosync nounwind willreturn memory(argmem: read)
; CGSCC-LABEL: define {{[^@]+}}@sum_two_same_loads		; CGSCC-LABEL: define {{[^@]+}}@sum_two_same_loads
; CGSCC-SAME: (i8* nocapture nofree noundef nonnull readonly dereferenceable(1022) [[P:%.*]]) #[[ATTR2:[0-9]+]] {		; CGSCC-SAME: (i8* nocapture nofree noundef nonnull readonly dereferenceable(1022) [[P:%.*]]) #[[ATTR2:[0-9]+]] {
; CGSCC-NEXT: [[X:%.*]] = call i8 @read_arg_1(i8* nocapture nofree noundef nonnull readonly dereferenceable(1022) [[P]]) #[[ATTR4:[0-9]+]]		; CGSCC-NEXT: [[X:%.*]] = call i8 @read_arg_1(i8* nocapture nofree noundef nonnull readonly dereferenceable(1022) [[P]]) #[[ATTR5:[0-9]+]]
; CGSCC-NEXT: [[Y:%.*]] = call i8 @read_arg_1(i8* nocapture nofree noundef nonnull readonly dereferenceable(1022) [[P]]) #[[ATTR4]]		; CGSCC-NEXT: [[Y:%.*]] = call i8 @read_arg_1(i8* nocapture nofree noundef nonnull readonly dereferenceable(1022) [[P]]) #[[ATTR5]]
; CGSCC-NEXT: [[Z:%.*]] = add nsw i8 [[X]], [[Y]]		; CGSCC-NEXT: [[Z:%.*]] = add nsw i8 [[X]], [[Y]]
; CGSCC-NEXT: ret i8 [[Z]]		; CGSCC-NEXT: ret i8 [[Z]]
;		;
%x = call i8 @read_arg_1(i8* %p)		%x = call i8 @read_arg_1(i8* %p)
%y = call i8 @read_arg_1(i8* %p)		%y = call i8 @read_arg_1(i8* %p)
%z = add nsw i8 %x, %y		%z = add nsw i8 %x, %y
ret i8 %z		ret i8 %z
}		}

define i8 @call_simplifiable_2() {		define i8 @call_simplifiable_2() {
; TUNIT: Function Attrs: nofree norecurse nosync nounwind willreturn memory(none)		; TUNIT: Function Attrs: nofree norecurse nosync nounwind willreturn memory(none)
; TUNIT-LABEL: define {{[^@]+}}@call_simplifiable_2		; TUNIT-LABEL: define {{[^@]+}}@call_simplifiable_2
; TUNIT-SAME: () #[[ATTR1]] {		; TUNIT-SAME: () #[[ATTR0]] {
; TUNIT-NEXT: entry:		; TUNIT-NEXT: entry:
; TUNIT-NEXT: [[BYTES:%.*]] = alloca [1024 x i8], align 16		; TUNIT-NEXT: [[BYTES:%.*]] = alloca [1024 x i8], align 16
; TUNIT-NEXT: [[I0:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 2		; TUNIT-NEXT: [[I0:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 2
; TUNIT-NEXT: [[I1:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 3		; TUNIT-NEXT: [[I1:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 3
; TUNIT-NEXT: ret i8 4		; TUNIT-NEXT: ret i8 4
;		;
; CGSCC: Function Attrs: nofree nosync nounwind willreturn memory(none)		; CGSCC: Function Attrs: nofree nosync nounwind willreturn memory(none)
; CGSCC-LABEL: define {{[^@]+}}@call_simplifiable_2		; CGSCC-LABEL: define {{[^@]+}}@call_simplifiable_2
; CGSCC-SAME: () #[[ATTR1]] {		; CGSCC-SAME: () #[[ATTR1]] {
; CGSCC-NEXT: entry:		; CGSCC-NEXT: entry:
; CGSCC-NEXT: [[BYTES:%.*]] = alloca [1024 x i8], align 16		; CGSCC-NEXT: [[BYTES:%.*]] = alloca [1024 x i8], align 16
; CGSCC-NEXT: [[I0:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 2		; CGSCC-NEXT: [[I0:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 2
; CGSCC-NEXT: store i8 2, i8* [[I0]], align 2		; CGSCC-NEXT: store i8 2, i8* [[I0]], align 2
; CGSCC-NEXT: [[I1:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 3		; CGSCC-NEXT: [[I1:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 3
; CGSCC-NEXT: store i8 3, i8* [[I1]], align 1		; CGSCC-NEXT: store i8 3, i8* [[I1]], align 1
; CGSCC-NEXT: [[R:%.*]] = call i8 @sum_two_same_loads(i8* nocapture nofree noundef nonnull readonly align 2 dereferenceable(1022) [[I0]]) #[[ATTR3]]		; CGSCC-NEXT: [[R:%.*]] = call i8 @sum_two_same_loads(i8* nocapture nofree noundef nonnull readonly align 2 dereferenceable(1022) [[I0]]) #[[ATTR4]]
; CGSCC-NEXT: ret i8 [[R]]		; CGSCC-NEXT: ret i8 [[R]]
;		;
entry:		entry:
%Bytes = alloca [1024 x i8], align 16		%Bytes = alloca [1024 x i8], align 16
%i0 = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 2		%i0 = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 2
store i8 2, i8* %i0		store i8 2, i8* %i0
%i1 = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 3		%i1 = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 3
store i8 3, i8* %i1		store i8 3, i8* %i1
%r = call i8 @sum_two_same_loads(i8* %i0)		%r = call i8 @sum_two_same_loads(i8* %i0)
ret i8 %r		ret i8 %r
}		}

define i8 @call_not_simplifiable_1() {		define i8 @call_simplifiable_3() {
; TUNIT: Function Attrs: nofree norecurse nosync nounwind willreturn memory(none)		; TUNIT: Function Attrs: nofree norecurse nosync nounwind willreturn memory(none)
; TUNIT-LABEL: define {{[^@]+}}@call_not_simplifiable_1		; TUNIT-LABEL: define {{[^@]+}}@call_simplifiable_3
; TUNIT-SAME: () #[[ATTR1]] {		; TUNIT-SAME: () #[[ATTR0]] {
; TUNIT-NEXT: entry:		; TUNIT-NEXT: entry:
; TUNIT-NEXT: [[BYTES:%.*]] = alloca [1024 x i8], align 16		; TUNIT-NEXT: [[BYTES:%.*]] = alloca [1024 x i8], align 16
; TUNIT-NEXT: [[I0:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 2		; TUNIT-NEXT: [[I2:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 2
; TUNIT-NEXT: store i8 2, i8* [[I0]], align 2		; TUNIT-NEXT: ret i8 2
; TUNIT-NEXT: [[R:%.*]] = call i8 @read_arg_index(i8* nocapture nofree noundef nonnull readonly align 2 dereferenceable(1022) [[I0]]) #[[ATTR2:[0-9]+]]
; TUNIT-NEXT: ret i8 [[R]]
;		;
; CGSCC: Function Attrs: nofree nosync nounwind willreturn memory(none)		; CGSCC: Function Attrs: nofree nosync nounwind willreturn memory(none)
; CGSCC-LABEL: define {{[^@]+}}@call_not_simplifiable_1		; CGSCC-LABEL: define {{[^@]+}}@call_simplifiable_3
; CGSCC-SAME: () #[[ATTR1]] {		; CGSCC-SAME: () #[[ATTR1]] {
; CGSCC-NEXT: entry:		; CGSCC-NEXT: entry:
; CGSCC-NEXT: [[BYTES:%.*]] = alloca [1024 x i8], align 16		; CGSCC-NEXT: [[BYTES:%.*]] = alloca [1024 x i8], align 16
; CGSCC-NEXT: [[I0:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 2		; CGSCC-NEXT: [[I0:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 0
; CGSCC-NEXT: store i8 2, i8* [[I0]], align 2		; CGSCC-NEXT: [[I2:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 2
; CGSCC-NEXT: [[R:%.*]] = call i8 @read_arg_index(i8* nocapture nofree noundef nonnull readonly align 2 dereferenceable(1022) [[I0]]) #[[ATTR3]]		; CGSCC-NEXT: store i8 2, i8* [[I2]], align 2
		; CGSCC-NEXT: [[R:%.*]] = call i8 @read_arg_index(i8* nocapture nofree noundef nonnull readonly align 16 dereferenceable(1024) [[I0]]) #[[ATTR4]]
; CGSCC-NEXT: ret i8 [[R]]		; CGSCC-NEXT: ret i8 [[R]]
;		;
entry:		entry:
%Bytes = alloca [1024 x i8], align 16		%Bytes = alloca [1024 x i8], align 16
%i0 = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 2		%i0 = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 0
store i8 2, i8* %i0, align 1		%i2 = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 2
%r = call i8 @read_arg_index(i8* %i0, i64 0)		store i8 2, i8* %i2, align 1
		%r = call i8 @read_arg_index(i8* %i0, i64 2)
ret i8 %r		ret i8 %r
}		}

;;; Same as read_arg, but we need a copy to form distinct leaves in the callgraph.		;;; Same as read_arg, but we need a copy to form distinct leaves in the callgraph.

define internal i8 @read_arg_2(i8* %p) {		define internal i8 @read_arg_2(i8* %p) {
; TUNIT: Function Attrs: nofree norecurse nosync nounwind willreturn memory(argmem: read)		; TUNIT: Function Attrs: nofree norecurse nosync nounwind willreturn memory(argmem: read)
; TUNIT-LABEL: define {{[^@]+}}@read_arg_2		; TUNIT-LABEL: define {{[^@]+}}@read_arg_2
; TUNIT-SAME: (i8* nocapture nofree noundef nonnull readonly dereferenceable(1021) [[P:%.*]]) #[[ATTR0]] {		; TUNIT-SAME: (i8* nocapture nofree noundef nonnull readonly dereferenceable(971) [[P:%.*]]) #[[ATTR1:[0-9]+]] {
; TUNIT-NEXT: entry:		; TUNIT-NEXT: entry:
; TUNIT-NEXT: [[L:%.*]] = load i8, i8* [[P]], align 1		; TUNIT-NEXT: [[L:%.*]] = load i8, i8* [[P]], align 1
; TUNIT-NEXT: ret i8 [[L]]		; TUNIT-NEXT: ret i8 [[L]]
;		;
; CGSCC: Function Attrs: nofree norecurse nosync nounwind willreturn memory(argmem: read)		; CGSCC: Function Attrs: nofree norecurse nosync nounwind willreturn memory(argmem: read)
; CGSCC-LABEL: define {{[^@]+}}@read_arg_2		; CGSCC-LABEL: define {{[^@]+}}@read_arg_2
; CGSCC-SAME: (i8* nocapture nofree noundef nonnull readonly dereferenceable(1) [[P:%.*]]) #[[ATTR0]] {		; CGSCC-SAME: (i8* nocapture nofree noundef nonnull readonly dereferenceable(1) [[P:%.*]]) #[[ATTR0]] {
; CGSCC-NEXT: entry:		; CGSCC-NEXT: entry:
; CGSCC-NEXT: [[L:%.*]] = load i8, i8* [[P]], align 1		; CGSCC-NEXT: [[L:%.*]] = load i8, i8* [[P]], align 1
; CGSCC-NEXT: ret i8 [[L]]		; CGSCC-NEXT: ret i8 [[L]]
;		;
entry:		entry:
%l = load i8, i8* %p, align 1		%l = load i8, i8* %p, align 1
ret i8 %l		ret i8 %l
}		}

define internal i8 @sum_two_different_loads(i8* %p, i8* %q) {		define internal i8 @sum_two_different_loads(i8* %p, i8* %q) {
; TUNIT: Function Attrs: nofree norecurse nosync nounwind willreturn memory(argmem: read)		; TUNIT: Function Attrs: nofree norecurse nosync nounwind willreturn memory(argmem: read)
; TUNIT-LABEL: define {{[^@]+}}@sum_two_different_loads		; TUNIT-LABEL: define {{[^@]+}}@sum_two_different_loads
; TUNIT-SAME: (i8* nocapture nofree noundef nonnull readonly align 2 dereferenceable(1022) [[P:%.*]], i8* nocapture nofree noundef nonnull readonly dereferenceable(1021) [[Q:%.*]]) #[[ATTR0]] {		; TUNIT-SAME: (i8* nocapture nofree nonnull readonly dereferenceable(972) [[P:%.*]], i8* nocapture nofree noundef nonnull readonly dereferenceable(971) [[Q:%.*]]) #[[ATTR1]] {
; TUNIT-NEXT: [[X:%.*]] = call i8 @read_arg_2(i8* nocapture nofree noundef nonnull readonly align 2 dereferenceable(1022) [[P]]) #[[ATTR2]]		; TUNIT-NEXT: [[X:%.*]] = call i8 @read_arg_2(i8* nocapture nofree nonnull readonly dereferenceable(972) [[P]]) #[[ATTR3:[0-9]+]]
; TUNIT-NEXT: [[Y:%.*]] = call i8 @read_arg_2(i8* nocapture nofree noundef nonnull readonly dereferenceable(1021) [[Q]]) #[[ATTR2]]		; TUNIT-NEXT: [[Y:%.*]] = call i8 @read_arg_2(i8* nocapture nofree noundef nonnull readonly dereferenceable(971) [[Q]]) #[[ATTR3]]
; TUNIT-NEXT: [[Z:%.*]] = add nsw i8 [[X]], [[Y]]		; TUNIT-NEXT: [[Z:%.*]] = add nsw i8 [[X]], [[Y]]
; TUNIT-NEXT: ret i8 [[Z]]		; TUNIT-NEXT: ret i8 [[Z]]
;		;
; CGSCC: Function Attrs: nofree nosync nounwind willreturn memory(argmem: read)		; CGSCC: Function Attrs: nofree nosync nounwind willreturn memory(argmem: read)
; CGSCC-LABEL: define {{[^@]+}}@sum_two_different_loads		; CGSCC-LABEL: define {{[^@]+}}@sum_two_different_loads
; CGSCC-SAME: (i8* nocapture nofree noundef nonnull readonly dereferenceable(1022) [[P:%.*]], i8* nocapture nofree noundef nonnull readonly dereferenceable(1021) [[Q:%.*]]) #[[ATTR2]] {		; CGSCC-SAME: (i8* nocapture nofree noundef nonnull readonly dereferenceable(972) [[P:%.*]], i8* nocapture nofree noundef nonnull readonly dereferenceable(971) [[Q:%.*]]) #[[ATTR2]] {
; CGSCC-NEXT: [[X:%.*]] = call i8 @read_arg_2(i8* nocapture nofree noundef nonnull readonly dereferenceable(1022) [[P]]) #[[ATTR4]]		; CGSCC-NEXT: [[X:%.*]] = call i8 @read_arg_2(i8* nocapture nofree noundef nonnull readonly dereferenceable(972) [[P]]) #[[ATTR5]]
; CGSCC-NEXT: [[Y:%.*]] = call i8 @read_arg_2(i8* nocapture nofree noundef nonnull readonly dereferenceable(1021) [[Q]]) #[[ATTR4]]		; CGSCC-NEXT: [[Y:%.*]] = call i8 @read_arg_2(i8* nocapture nofree noundef nonnull readonly dereferenceable(971) [[Q]]) #[[ATTR5]]
; CGSCC-NEXT: [[Z:%.*]] = add nsw i8 [[X]], [[Y]]		; CGSCC-NEXT: [[Z:%.*]] = add nsw i8 [[X]], [[Y]]
; CGSCC-NEXT: ret i8 [[Z]]		; CGSCC-NEXT: ret i8 [[Z]]
;		;
%x = call i8 @read_arg_2(i8* %p)		%x = call i8 @read_arg_2(i8* %p)
%y = call i8 @read_arg_2(i8* %q)		%y = call i8 @read_arg_2(i8* %q)
%z = add nsw i8 %x, %y		%z = add nsw i8 %x, %y
ret i8 %z		ret i8 %z
}		}

define i8 @call_not_simplifiable_2() {		define i8 @call_partially_simplifiable_1() {
; TUNIT: Function Attrs: nofree norecurse nosync nounwind willreturn memory(none)		; TUNIT: Function Attrs: nofree norecurse nosync nounwind willreturn memory(none)
; TUNIT-LABEL: define {{[^@]+}}@call_not_simplifiable_2		; TUNIT-LABEL: define {{[^@]+}}@call_partially_simplifiable_1
; TUNIT-SAME: () #[[ATTR1]] {		; TUNIT-SAME: () #[[ATTR0]] {
; TUNIT-NEXT: entry:		; TUNIT-NEXT: entry:
; TUNIT-NEXT: [[BYTES:%.*]] = alloca [1024 x i8], align 16		; TUNIT-NEXT: [[BYTES:%.*]] = alloca [1024 x i8], align 16
; TUNIT-NEXT: [[I0:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 2		; TUNIT-NEXT: [[I2:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 2
; TUNIT-NEXT: store i8 2, i8* [[I0]], align 2		; TUNIT-NEXT: store i8 2, i8* [[I2]], align 2
; TUNIT-NEXT: [[I1:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 3		; TUNIT-NEXT: [[I3:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 3
; TUNIT-NEXT: store i8 3, i8* [[I1]], align 1		; TUNIT-NEXT: store i8 3, i8* [[I3]], align 1
; TUNIT-NEXT: [[BASE:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 0		; TUNIT-NEXT: [[I4:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 4
; TUNIT-NEXT: [[R:%.*]] = call i8 @sum_two_different_loads(i8* nocapture nofree noundef nonnull readonly align 2 dereferenceable(1022) [[I0]], i8* nocapture nofree noundef nonnull readonly dereferenceable(1021) [[I1]]) #[[ATTR2]]		; TUNIT-NEXT: [[R:%.*]] = call i8 @sum_two_different_loads(i8* nocapture nofree noundef nonnull readonly align 2 dereferenceable(1022) [[I2]], i8* nocapture nofree noundef nonnull readonly dereferenceable(1021) [[I3]]) #[[ATTR3]]
; TUNIT-NEXT: ret i8 [[R]]		; TUNIT-NEXT: ret i8 [[R]]
;		;
; CGSCC: Function Attrs: nofree nosync nounwind willreturn memory(none)		; CGSCC: Function Attrs: nofree nosync nounwind willreturn memory(none)
; CGSCC-LABEL: define {{[^@]+}}@call_not_simplifiable_2		; CGSCC-LABEL: define {{[^@]+}}@call_partially_simplifiable_1
; CGSCC-SAME: () #[[ATTR1]] {		; CGSCC-SAME: () #[[ATTR1]] {
; CGSCC-NEXT: entry:		; CGSCC-NEXT: entry:
; CGSCC-NEXT: [[BYTES:%.*]] = alloca [1024 x i8], align 16		; CGSCC-NEXT: [[BYTES:%.*]] = alloca [1024 x i8], align 16
; CGSCC-NEXT: [[I0:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 2		; CGSCC-NEXT: [[I2:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 2
; CGSCC-NEXT: store i8 2, i8* [[I0]], align 2		; CGSCC-NEXT: store i8 2, i8* [[I2]], align 2
; CGSCC-NEXT: [[I1:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 3		; CGSCC-NEXT: [[I3:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 3
; CGSCC-NEXT: store i8 3, i8* [[I1]], align 1		; CGSCC-NEXT: store i8 3, i8* [[I3]], align 1
; CGSCC-NEXT: [[BASE:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 0		; CGSCC-NEXT: [[I4:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 4
; CGSCC-NEXT: [[R:%.*]] = call i8 @sum_two_different_loads(i8* nocapture nofree noundef nonnull readonly align 2 dereferenceable(1022) [[I0]], i8* nocapture nofree noundef nonnull readonly dereferenceable(1021) [[I1]]) #[[ATTR3]]		; CGSCC-NEXT: store i8 4, i8* [[I4]], align 4
		; CGSCC-NEXT: [[R:%.*]] = call i8 @sum_two_different_loads(i8* nocapture nofree noundef nonnull readonly align 2 dereferenceable(1022) [[I2]], i8* nocapture nofree noundef nonnull readonly dereferenceable(1021) [[I3]]) #[[ATTR4]]
; CGSCC-NEXT: ret i8 [[R]]		; CGSCC-NEXT: ret i8 [[R]]
;		;
entry:		entry:
%Bytes = alloca [1024 x i8], align 16		%Bytes = alloca [1024 x i8], align 16
%i0 = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 2		%i2 = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 2
store i8 2, i8* %i0		store i8 2, i8* %i2
%i1 = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 3		%i3 = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 3
store i8 3, i8* %i1		store i8 3, i8* %i3
%base = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 0		%i4 = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 4
%r = call i8 @sum_two_different_loads(i8* %i0, i8* %i1)		;;; This store is redundant, hence removed.
		store i8 4, i8* %i4
		%r = call i8 @sum_two_different_loads(i8* %i2, i8* %i3)
		ret i8 %r
		}

		define i8 @call_partially_simplifiable_2(i1 %cond) {
		; TUNIT: Function Attrs: nofree norecurse nosync nounwind willreturn
		; TUNIT-LABEL: define {{[^@]+}}@call_partially_simplifiable_2
		; TUNIT-SAME: (i1 [[COND:%.*]]) #[[ATTR2:[0-9]+]] {
		; TUNIT-NEXT: entry:
		; TUNIT-NEXT: [[BYTES:%.*]] = alloca [1024 x i8], align 16
		; TUNIT-NEXT: [[I51:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 51
		; TUNIT-NEXT: [[I52:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 52
		; TUNIT-NEXT: store i8 2, i8* [[I52]], align 4
		; TUNIT-NEXT: [[I53:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 53
		; TUNIT-NEXT: store i8 3, i8* [[I53]], align 1
		; TUNIT-NEXT: [[I54:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 54
		; TUNIT-NEXT: [[SEL:%.*]] = select i1 [[COND]], i8* [[I51]], i8* [[I52]]
		; TUNIT-NEXT: [[R:%.*]] = call i8 @sum_two_different_loads(i8* nocapture nofree nonnull readonly dereferenceable(972) [[SEL]], i8* nocapture nofree noundef nonnull readonly dereferenceable(971) [[I53]]) #[[ATTR3]]
		; TUNIT-NEXT: ret i8 [[R]]
		;
		; CGSCC: Function Attrs: nofree nosync nounwind willreturn
		; CGSCC-LABEL: define {{[^@]+}}@call_partially_simplifiable_2
		; CGSCC-SAME: (i1 [[COND:%.*]]) #[[ATTR3:[0-9]+]] {
		; CGSCC-NEXT: entry:
		; CGSCC-NEXT: [[BYTES:%.*]] = alloca [1024 x i8], align 16
		; CGSCC-NEXT: [[I51:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 51
		; CGSCC-NEXT: [[I52:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 52
		; CGSCC-NEXT: store i8 2, i8* [[I52]], align 4
		; CGSCC-NEXT: [[I53:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 53
		; CGSCC-NEXT: store i8 3, i8* [[I53]], align 1
		; CGSCC-NEXT: [[I54:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 54
		; CGSCC-NEXT: store i8 4, i8* [[I54]], align 2
		; CGSCC-NEXT: [[SEL:%.*]] = select i1 [[COND]], i8* [[I51]], i8* [[I52]]
		; CGSCC-NEXT: [[R:%.*]] = call i8 @sum_two_different_loads(i8* nocapture nofree noundef nonnull readonly dereferenceable(972) [[SEL]], i8* nocapture nofree noundef nonnull readonly dereferenceable(971) [[I53]]) #[[ATTR4]]
		; CGSCC-NEXT: ret i8 [[R]]
		;
		entry:
		%Bytes = alloca [1024 x i8], align 16
		%i51 = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 51
		%i52 = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 52
		store i8 2, i8* %i52
		%i53 = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 53
		store i8 3, i8* %i53
		%i54 = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 54
		;;; This store is redundant, hence removed. Not affected by the select.
		store i8 4, i8* %i54
		%sel = select i1 %cond, i8* %i51, i8 *%i52
		%r = call i8 @sum_two_different_loads(i8* %sel, i8* %i53)
ret i8 %r		ret i8 %r
}		}

;.		;.
; TUNIT: attributes #[[ATTR0]] = { nofree norecurse nosync nounwind willreturn memory(argmem: read) }		; TUNIT: attributes #[[ATTR0]] = { nofree norecurse nosync nounwind willreturn memory(none) }
; TUNIT: attributes #[[ATTR1]] = { nofree norecurse nosync nounwind willreturn memory(none) }		; TUNIT: attributes #[[ATTR1]] = { nofree norecurse nosync nounwind willreturn memory(argmem: read) }
; TUNIT: attributes #[[ATTR2]] = { nofree nosync nounwind willreturn }		; TUNIT: attributes #[[ATTR2]] = { nofree norecurse nosync nounwind willreturn }
		; TUNIT: attributes #[[ATTR3]] = { nofree nosync nounwind willreturn }
;.		;.
; CGSCC: attributes #[[ATTR0]] = { nofree norecurse nosync nounwind willreturn memory(argmem: read) }		; CGSCC: attributes #[[ATTR0]] = { nofree norecurse nosync nounwind willreturn memory(argmem: read) }
; CGSCC: attributes #[[ATTR1]] = { nofree nosync nounwind willreturn memory(none) }		; CGSCC: attributes #[[ATTR1]] = { nofree nosync nounwind willreturn memory(none) }
; CGSCC: attributes #[[ATTR2]] = { nofree nosync nounwind willreturn memory(argmem: read) }		; CGSCC: attributes #[[ATTR2]] = { nofree nosync nounwind willreturn memory(argmem: read) }
; CGSCC: attributes #[[ATTR3]] = { willreturn }		; CGSCC: attributes #[[ATTR3]] = { nofree nosync nounwind willreturn }
; CGSCC: attributes #[[ATTR4]] = { willreturn memory(read) }		; CGSCC: attributes #[[ATTR4]] = { willreturn }
		; CGSCC: attributes #[[ATTR5]] = { willreturn memory(read) }
;.		;.

llvm/test/Transforms/Attributor/multiple-offsets-pointer-info.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --function-signature --check-attributes --check-globals
				; RUN: opt -aa-pipeline=basic-aa -passes=attributor -attributor-manifest-internal -attributor-max-iterations-verify -attributor-annotate-decl-cs -attributor-max-iterations=3 -S < %s \| FileCheck %s --check-prefixes=CHECK,TUNIT
				; RUN: opt -aa-pipeline=basic-aa -passes=attributor-cgscc -attributor-manifest-internal -attributor-annotate-decl-cs -S < %s \| FileCheck %s --check-prefixes=CHECK,CGSCC

				%struct.T = type { i32, [10 x [20 x i8]] }

				define i8 @select_offsets_simplifiable_1(i1 %cnd1, i1 %cnd2) {
				; CHECK: Function Attrs: nofree norecurse nosync nounwind willreturn memory(none)
				; CHECK-LABEL: define {{[^@]+}}@select_offsets_simplifiable_1
				; CHECK-SAME: (i1 [[CND1:%.*]], i1 [[CND2:%.*]]) #[[ATTR0:[0-9]+]] {
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[BYTES:%.*]] = alloca [1024 x i8], align 16
				; CHECK-NEXT: [[GEP23:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 23
				; CHECK-NEXT: store i8 23, i8* [[GEP23]], align 4
				; CHECK-NEXT: [[GEP29:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 29
				; CHECK-NEXT: store i8 29, i8* [[GEP29]], align 4
				; CHECK-NEXT: [[GEP7:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 7
				; CHECK-NEXT: store i8 7, i8* [[GEP7]], align 4
				; CHECK-NEXT: [[GEP31:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 31
				; CHECK-NEXT: [[SEL0:%.*]] = select i1 [[CND1]], i64 23, i64 29
				; CHECK-NEXT: [[SEL1:%.*]] = select i1 [[CND2]], i64 [[SEL0]], i64 7
				; CHECK-NEXT: [[GEP_SEL:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 [[SEL1]]
				; CHECK-NEXT: [[I:%.*]] = load i8, i8* [[GEP_SEL]], align 4
				; CHECK-NEXT: ret i8 [[I]]
				;
				entry:
				%Bytes = alloca [1024 x i8], align 16

				%gep23 = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 23
				store i8 23, i8* %gep23, align 4
				%gep29 = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 29
				store i8 29, i8* %gep29, align 4
				%gep7 = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 7
				store i8 7, i8* %gep7, align 4

				;; This store is redundant, hence removed.
				%gep31 = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 31
				store i8 42, i8* %gep31, align 4

				%sel0 = select i1 %cnd1, i64 23, i64 29
				%sel1 = select i1 %cnd2, i64 %sel0, i64 7
				%gep.sel = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 %sel1
				%i = load i8, i8* %gep.sel, align 4
				ret i8 %i
				}

				define i8 @select_offsets_simplifiable_2(i1 %cnd1, i1 %cnd2) {
				; CHECK: Function Attrs: nofree norecurse nosync nounwind willreturn memory(none)
				; CHECK-LABEL: define {{[^@]+}}@select_offsets_simplifiable_2
				; CHECK-SAME: (i1 [[CND1:%.*]], i1 [[CND2:%.*]]) #[[ATTR0]] {
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[BYTES:%.*]] = alloca [1024 x i8], align 16
				; CHECK-NEXT: [[GEP23:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 23
				; CHECK-NEXT: store i8 23, i8* [[GEP23]], align 4
				; CHECK-NEXT: [[GEP29:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 29
				; CHECK-NEXT: store i8 29, i8* [[GEP29]], align 4
				; CHECK-NEXT: [[GEP7:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 7
				; CHECK-NEXT: store i8 7, i8* [[GEP7]], align 4
				; CHECK-NEXT: [[GEP31:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 31
				; CHECK-NEXT: [[SEL0:%.*]] = select i1 [[CND1]], i64 20, i64 26
				; CHECK-NEXT: [[SEL1:%.*]] = select i1 [[CND2]], i64 [[SEL0]], i64 4
				; CHECK-NEXT: [[GEP_SEL:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 [[SEL1]]
				; CHECK-NEXT: [[GEP_PLUS:%.*]] = getelementptr inbounds i8, i8* [[GEP_SEL]], i64 3
				; CHECK-NEXT: [[I:%.*]] = load i8, i8* [[GEP_PLUS]], align 4
				; CHECK-NEXT: ret i8 [[I]]
				;
				entry:
				%Bytes = alloca [1024 x i8], align 16

				%gep23 = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 23
				store i8 23, i8* %gep23, align 4
				%gep29 = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 29
				store i8 29, i8* %gep29, align 4
				%gep7 = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 7
				store i8 7, i8* %gep7, align 4

				;; This store is redundant, hence removed.
				%gep31 = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 31
				store i8 42, i8* %gep31, align 4

				;; Adjust the offsets so that they match the stores after adding 3
				%sel0 = select i1 %cnd1, i64 20, i64 26
				%sel1 = select i1 %cnd2, i64 %sel0, i64 4
				%gep.sel = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 %sel1
				%gep.plus = getelementptr inbounds i8, i8* %gep.sel, i64 3
				%i = load i8, i8* %gep.plus, align 4
				ret i8 %i
				}

				define i8 @select_offsets_simplifiable_3(i1 %cnd1, i1 %cnd2) {
				; CHECK: Function Attrs: nofree norecurse nosync nounwind willreturn memory(none)
				; CHECK-LABEL: define {{[^@]+}}@select_offsets_simplifiable_3
				; CHECK-SAME: (i1 [[CND1:%.*]], i1 [[CND2:%.*]]) #[[ATTR0]] {
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[BUNDLE:%.*]] = alloca [[STRUCT_T:%.*]], align 64
				; CHECK-NEXT: [[SEL1:%.*]] = select i1 [[CND1]], i64 1, i64 3
				; CHECK-NEXT: [[SEL2:%.*]] = select i1 [[CND2]], i64 5, i64 11
				; CHECK-NEXT: [[GEP_SEL:%.*]] = getelementptr inbounds [[STRUCT_T]], %struct.T* [[BUNDLE]], i64 0, i32 1, i64 [[SEL1]], i64 [[SEL2]]
				; CHECK-NEXT: ret i8 100
				;
				entry:
				%bundle = alloca %struct.T, align 64
				%gep.fixed = getelementptr inbounds %struct.T, %struct.T* %bundle, i64 0, i32 1, i64 1, i64 1
				store i8 100, i8* %gep.fixed, align 4
				%sel1 = select i1 %cnd1, i64 1, i64 3
				%sel2 = select i1 %cnd2, i64 5, i64 11
				%gep.sel = getelementptr inbounds %struct.T, %struct.T* %bundle, i64 0, i32 1, i64 %sel1, i64 %sel2
				store i8 42, i8* %gep.sel, align 4
				%i = load i8, i8* %gep.fixed, align 4
				ret i8 %i
				}

				define i8 @select_offsets_not_simplifiable_1(i1 %cnd1, i1 %cnd2) {
				; CHECK: Function Attrs: nofree norecurse nosync nounwind willreturn memory(none)
				; CHECK-LABEL: define {{[^@]+}}@select_offsets_not_simplifiable_1
				; CHECK-SAME: (i1 [[CND1:%.*]], i1 [[CND2:%.*]]) #[[ATTR0]] {
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[BYTES:%.*]] = alloca [1024 x i8], align 16
				; CHECK-NEXT: [[SEL0:%.*]] = select i1 [[CND1]], i64 23, i64 29
				; CHECK-NEXT: [[SEL1:%.*]] = select i1 [[CND2]], i64 [[SEL0]], i64 7
				; CHECK-NEXT: [[GEP23:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 23
				; CHECK-NEXT: store i8 100, i8* [[GEP23]], align 4
				; CHECK-NEXT: [[GEP_SEL:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 [[SEL1]]
				; CHECK-NEXT: [[I:%.*]] = load i8, i8* [[GEP_SEL]], align 4
				; CHECK-NEXT: ret i8 [[I]]
				;
				entry:
				%Bytes = alloca [1024 x i8], align 16
				%sel0 = select i1 %cnd1, i64 23, i64 29
				%sel1 = select i1 %cnd2, i64 %sel0, i64 7
				%gep23 = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 23
				store i8 100, i8* %gep23, align 4
				%gep.sel = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 %sel1
				%i = load i8, i8* %gep.sel, align 4
				ret i8 %i
				}

				define i8 @select_offsets_not_simplifiable_2(i1 %cnd1, i1 %cnd2) {
				; CHECK: Function Attrs: nofree norecurse nosync nounwind willreturn memory(none)
				; CHECK-LABEL: define {{[^@]+}}@select_offsets_not_simplifiable_2
				; CHECK-SAME: (i1 [[CND1:%.*]], i1 [[CND2:%.*]]) #[[ATTR0]] {
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[BYTES:%.*]] = alloca [1024 x i8], align 16
				; CHECK-NEXT: [[SEL0:%.*]] = select i1 [[CND1]], i64 23, i64 29
				; CHECK-NEXT: [[SEL1:%.*]] = select i1 [[CND2]], i64 [[SEL0]], i64 7
				; CHECK-NEXT: [[GEP32:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 32
				; CHECK-NEXT: store i8 100, i8* [[GEP32]], align 16
				; CHECK-NEXT: [[GEP_SEL:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 [[SEL1]]
				; CHECK-NEXT: [[GEP_PLUS:%.*]] = getelementptr inbounds i8, i8* [[GEP_SEL]], i64 3
				; CHECK-NEXT: [[I:%.*]] = load i8, i8* [[GEP_PLUS]], align 4
				; CHECK-NEXT: ret i8 [[I]]
				;
				entry:
				%Bytes = alloca [1024 x i8], align 16
				%sel0 = select i1 %cnd1, i64 23, i64 29
				%sel1 = select i1 %cnd2, i64 %sel0, i64 7
				%gep32 = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 32
				store i8 100, i8* %gep32, align 4
				%gep.sel = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 %sel1
				%gep.plus = getelementptr inbounds i8, i8* %gep.sel, i64 3
				%i = load i8, i8* %gep.plus, align 4
				ret i8 %i
				}

				define i8 @select_offsets_not_simplifiable_3(i1 %cnd1, i1 %cnd2) {
				; CHECK: Function Attrs: nofree norecurse nosync nounwind willreturn memory(none)
				; CHECK-LABEL: define {{[^@]+}}@select_offsets_not_simplifiable_3
				; CHECK-SAME: (i1 [[CND1:%.*]], i1 [[CND2:%.*]]) #[[ATTR0]] {
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[BYTES:%.*]] = alloca [1024 x i8], align 16
				; CHECK-NEXT: [[SEL0:%.*]] = select i1 [[CND1]], i64 23, i64 29
				; CHECK-NEXT: [[SEL1:%.*]] = select i1 [[CND2]], i64 [[SEL0]], i64 7
				; CHECK-NEXT: [[GEP_SEL:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 [[SEL1]]
				; CHECK-NEXT: store i8 100, i8* [[GEP_SEL]], align 4
				; CHECK-NEXT: [[GEP29:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 29
				; CHECK-NEXT: [[I:%.*]] = load i8, i8* [[GEP29]], align 4
				; CHECK-NEXT: ret i8 [[I]]
				;
				entry:
				%Bytes = alloca [1024 x i8], align 16
				%sel0 = select i1 %cnd1, i64 23, i64 29
				%sel1 = select i1 %cnd2, i64 %sel0, i64 7
				%gep.sel = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 %sel1
				store i8 100, i8* %gep.sel, align 4
				%gep29 = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 29
				%i = load i8, i8* %gep29, align 4
				ret i8 %i
				}

				define i8 @select_offsets_not_simplifiable_4(i1 %cnd1, i1 %cnd2) {
				; CHECK: Function Attrs: nofree norecurse nosync nounwind willreturn memory(none)
				; CHECK-LABEL: define {{[^@]+}}@select_offsets_not_simplifiable_4
				; CHECK-SAME: (i1 [[CND1:%.*]], i1 [[CND2:%.*]]) #[[ATTR0]] {
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[BYTES:%.*]] = alloca [1024 x i8], align 16
				; CHECK-NEXT: [[SEL0:%.*]] = select i1 [[CND1]], i64 23, i64 29
				; CHECK-NEXT: [[SEL1:%.*]] = select i1 [[CND2]], i64 [[SEL0]], i64 7
				; CHECK-NEXT: [[GEP_SEL:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 [[SEL1]]
				; CHECK-NEXT: [[GEP_PLUS:%.*]] = getelementptr inbounds i8, i8* [[GEP_SEL]], i64 3
				; CHECK-NEXT: store i8 100, i8* [[GEP_PLUS]], align 4
				; CHECK-NEXT: [[GEP32:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 32
				; CHECK-NEXT: [[I:%.*]] = load i8, i8* [[GEP32]], align 16
				; CHECK-NEXT: ret i8 [[I]]
				;
				entry:
				%Bytes = alloca [1024 x i8], align 16
				%sel0 = select i1 %cnd1, i64 23, i64 29
				%sel1 = select i1 %cnd2, i64 %sel0, i64 7
				%gep.sel = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 %sel1
				%gep.plus = getelementptr inbounds i8, i8* %gep.sel, i64 3
				store i8 100, i8* %gep.plus, align 4
				%gep32 = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 32
				%i = load i8, i8* %gep32, align 4
				ret i8 %i
				}

				define i8 @select_offsets_not_simplifiable_5(i1 %cnd1, i1 %cnd2) {
				; CHECK: Function Attrs: nofree norecurse nosync nounwind willreturn memory(none)
				; CHECK-LABEL: define {{[^@]+}}@select_offsets_not_simplifiable_5
				; CHECK-SAME: (i1 [[CND1:%.*]], i1 [[CND2:%.*]]) #[[ATTR0]] {
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[BUNDLE:%.*]] = alloca [[STRUCT_T:%.*]], align 64
				; CHECK-NEXT: [[GEP_FIXED:%.*]] = getelementptr inbounds [[STRUCT_T]], %struct.T* [[BUNDLE]], i64 0, i32 1, i64 3, i64 5
				; CHECK-NEXT: store i8 100, i8* [[GEP_FIXED]], align 4
				; CHECK-NEXT: [[SEL1:%.*]] = select i1 [[CND1]], i64 1, i64 3
				; CHECK-NEXT: [[SEL2:%.*]] = select i1 [[CND2]], i64 5, i64 11
				; CHECK-NEXT: [[GEP_SEL:%.*]] = getelementptr inbounds [[STRUCT_T]], %struct.T* [[BUNDLE]], i64 0, i32 1, i64 [[SEL1]], i64 [[SEL2]]
				; CHECK-NEXT: store i8 42, i8* [[GEP_SEL]], align 4
				; CHECK-NEXT: [[I:%.*]] = load i8, i8* [[GEP_FIXED]], align 4
				; CHECK-NEXT: ret i8 [[I]]
				;
				entry:
				%bundle = alloca %struct.T, align 64
				%gep.fixed = getelementptr inbounds %struct.T, %struct.T* %bundle, i64 0, i32 1, i64 3, i64 5
				store i8 100, i8* %gep.fixed, align 4
				%sel1 = select i1 %cnd1, i64 1, i64 3
				%sel2 = select i1 %cnd2, i64 5, i64 11
				%gep.sel = getelementptr inbounds %struct.T, %struct.T* %bundle, i64 0, i32 1, i64 %sel1, i64 %sel2

				;; This store prevents the constant 100 from being propagated to ret
				store i8 42, i8* %gep.sel, align 4

				%i = load i8, i8* %gep.fixed, align 4
				ret i8 %i
				}

				define i8 @select_gep_simplifiable_1(i1 %cnd1, i1 %cnd2) {
				; CHECK: Function Attrs: nofree norecurse nosync nounwind willreturn memory(write)
				; CHECK-LABEL: define {{[^@]+}}@select_gep_simplifiable_1
				; CHECK-SAME: (i1 [[CND1:%.*]], i1 [[CND2:%.*]]) #[[ATTR1:[0-9]+]] {
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[BYTES:%.*]] = alloca [1024 x i8], align 16
				; CHECK-NEXT: [[GEP7:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 7
				; CHECK-NEXT: [[GEP23:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 23
				; CHECK-NEXT: [[SEL_PTR:%.*]] = select i1 [[CND1]], i8* [[GEP7]], i8* [[GEP23]]
				; CHECK-NEXT: store i8 42, i8* [[SEL_PTR]], align 4
				; CHECK-NEXT: ret i8 21
				;
				entry:
				%Bytes = alloca [1024 x i8], align 16
				%gep3 = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 3
				store i8 21, i8* %gep3, align 4
				%gep7 = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 7
				%gep23 = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 23
				%sel.ptr = select i1 %cnd1, i8* %gep7, i8* %gep23
				store i8 42, i8* %sel.ptr, align 4
				%i = load i8, i8* %gep3, align 4
				ret i8 %i
				}

				define i8 @select_gep_not_simplifiable_1(i1 %cnd1, i1 %cnd2) {
				; CHECK: Function Attrs: nofree norecurse nosync nounwind willreturn
				; CHECK-LABEL: define {{[^@]+}}@select_gep_not_simplifiable_1
				; CHECK-SAME: (i1 [[CND1:%.*]], i1 [[CND2:%.*]]) #[[ATTR2:[0-9]+]] {
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[BYTES:%.*]] = alloca [1024 x i8], align 16
				; CHECK-NEXT: [[GEP7:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 7
				; CHECK-NEXT: [[GEP23:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 23
				; CHECK-NEXT: [[SEL_PTR:%.*]] = select i1 [[CND1]], i8* [[GEP7]], i8* [[GEP23]]
				; CHECK-NEXT: store i8 42, i8* [[SEL_PTR]], align 4
				; CHECK-NEXT: [[I:%.*]] = load i8, i8* [[GEP7]], align 4
				; CHECK-NEXT: ret i8 [[I]]
				;
				entry:
				%Bytes = alloca [1024 x i8], align 16
				%gep7 = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 7
				%gep23 = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 23
				%sel.ptr = select i1 %cnd1, i8* %gep7, i8* %gep23
				store i8 42, i8* %sel.ptr, align 4
				%i = load i8, i8* %gep7, align 4
				ret i8 %i
				}

				; FIXME: This should be simplifiable. See comment inside.

				define i8 @phi_offsets_fixme(i1 %cnd1, i1 %cnd2) {
				; CHECK: Function Attrs: nofree norecurse nosync nounwind willreturn memory(none)
				; CHECK-LABEL: define {{[^@]+}}@phi_offsets_fixme
				; CHECK-SAME: (i1 [[CND1:%.*]], i1 [[CND2:%.*]]) #[[ATTR0]] {
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[BYTES:%.*]] = alloca [1024 x i8], align 16
				; CHECK-NEXT: [[GEP_FIXED:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 0
				; CHECK-NEXT: store i8 100, i8* [[GEP_FIXED]], align 16
				; CHECK-NEXT: br i1 [[CND1]], label [[THEN:%.*]], label [[ELSE:%.*]]
				; CHECK: then:
				; CHECK-NEXT: br label [[JOIN:%.*]]
				; CHECK: else:
				; CHECK-NEXT: br label [[JOIN]]
				; CHECK: join:
				; CHECK-NEXT: [[PHI:%.*]] = phi i64 [ 3, [[THEN]] ], [ 11, [[ELSE]] ]
				; CHECK-NEXT: [[GEP_PHI:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 [[PHI]]
				; CHECK-NEXT: store i8 42, i8* [[GEP_PHI]], align 4
				; CHECK-NEXT: [[I:%.*]] = load i8, i8* [[GEP_FIXED]], align 16
				; CHECK-NEXT: ret i8 [[I]]
				;
				entry:
				%Bytes = alloca [1024 x i8], align 16
				%gep.fixed = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 0
				store i8 100, i8* %gep.fixed, align 4
				br i1 %cnd1, label %then, label %else

				then:
				br label %join

				else:
				br label %join

				join:
				; FIXME: AAPotentialConstantValues does not detect the constant values for the
				; PHI below. It needs to rely on AAPotentialValues.
				%phi = phi i64 [ 3, %then ], [ 11, %else ]
				%gep.phi = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 %phi
				store i8 42, i8* %gep.phi, align 4
				%i = load i8, i8* %gep.fixed, align 4
				ret i8 %i
				}

				jdoerfert: AAPotentialConstantValues uses AAPotentialValues, but the case to handle PHINodes is missing here: https://github.com/llvm/llvm-project/blob/72d76a2403459a38a1d6daae62de6945097db8f9/llvm/lib/Transforms/IPO/AttributorAttributes.cpp#L9222 We just need to go over the operands and call the fillSetWithConstantValues function. We can do that in a follow-up or pre-patch; just commenting here to make sure we don't forget.
				sameerds (Author): Exactly what I had in mind. Would prefer to do this as a follow-up.
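As a rough illustration of the follow-up discussed in this thread (walk the operands and collect the constants each one can produce), here is a minimal, self-contained C++ sketch. It does not use the Attributor API. `Expr`, `collectConstants`, and `possibleOffsets` are hypothetical names invented for this example, and the real `fillSetWithConstantValues` may have a different signature and behavior.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <set>
#include <utility>
#include <vector>

// Hypothetical miniature expression tree; none of these names are LLVM APIs.
struct Expr;
using ExprPtr = std::shared_ptr<Expr>;

struct Expr {
  enum class Kind { Constant, Select, Phi };
  Kind kind = Kind::Constant;
  int64_t value = 0;             // meaningful only for Constant
  std::vector<ExprPtr> operands; // value operands of Select / Phi
};

ExprPtr makeConstant(int64_t v) {
  auto e = std::make_shared<Expr>();
  e->value = v;
  return e;
}

// The select condition is omitted: only the two value operands matter here.
ExprPtr makeSelect(ExprPtr ifTrue, ExprPtr ifFalse) {
  auto e = std::make_shared<Expr>();
  e->kind = Expr::Kind::Select;
  e->operands = {std::move(ifTrue), std::move(ifFalse)};
  return e;
}

ExprPtr makePhi(std::vector<ExprPtr> incoming) {
  auto e = std::make_shared<Expr>();
  e->kind = Expr::Kind::Phi;
  e->operands = std::move(incoming);
  return e;
}

// Collect every constant the expression may evaluate to by visiting
// the operands of selects and PHIs recursively.
void collectConstants(const ExprPtr &e, std::set<int64_t> &out) {
  assert(e && "expression must not be null");
  if (e->kind == Expr::Kind::Constant) {
    out.insert(e->value);
    return;
  }
  for (const ExprPtr &op : e->operands)
    collectConstants(op, out);
}

std::set<int64_t> possibleOffsets(const ExprPtr &index) {
  std::set<int64_t> offsets;
  collectConstants(index, offsets);
  return offsets;
}
```

For the nested selects in @select_offsets_simplifiable_1 this yields the offsets 7, 23 and 29, and for the PHI in @phi_offsets_fixme it yields 3 and 11. Real PHIs can be cyclic, which this tree-shaped sketch does not model.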
				;.
				; CHECK: attributes #[[ATTR0]] = { nofree norecurse nosync nounwind willreturn memory(none) }
				; CHECK: attributes #[[ATTR1]] = { nofree norecurse nosync nounwind willreturn memory(write) }
				; CHECK: attributes #[[ATTR2]] = { nofree norecurse nosync nounwind willreturn }
				;.
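The two %struct.T tests in this file differ only in where the fixed store lands relative to the set of offsets reachable through the selects. The sketch below works through that arithmetic in plain C++. It assumes a layout in which the i32 field occupies bytes 0..3 and the [10 x [20 x i8]] array follows without padding. `structTOffset`, `selectedOffsets`, and `mayClobber` are hypothetical helpers for illustration, not part of the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Byte offset of element [row][col] in the array field of %struct.T,
// under the assumed layout { i32 at 0..3, [10 x [20 x i8]] starting at 4 }.
int64_t structTOffset(int64_t row, int64_t col) {
  assert(row >= 0 && row < 10 && col >= 0 && col < 20);
  return 4 + row * 20 + col;
}

// Every byte offset reachable when each GEP index is chosen from a set.
std::set<int64_t> selectedOffsets(const std::set<int64_t> &rows,
                                  const std::set<int64_t> &cols) {
  std::set<int64_t> offsets;
  for (int64_t r : rows)
    for (int64_t c : cols)
      offsets.insert(structTOffset(r, c));
  return offsets;
}

// A one-byte store through the selected GEP may overwrite the fixed
// one-byte location only if that location is one of the possible offsets.
bool mayClobber(const std::set<int64_t> &offsets, int64_t fixedOffset) {
  return offsets.count(fixedOffset) != 0;
}
```

With rows {1, 3} and columns {5, 11} the possible offsets are 29, 35, 69 and 75. The fixed access at [1][1] (offset 25) in @select_offsets_simplifiable_3 is not in that set, so the load can fold to 100. The fixed access at [3][5] (offset 69) in @select_offsets_not_simplifiable_5 is in the set, so it cannot.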

llvm/test/Transforms/Attributor/value-simplify-pointer-info.ll

	; CGSCC-NEXT: ret void			; CGSCC-NEXT: ret void
	;			;
	%l = load i32, i32* %a			%l = load i32, i32* %a
	%sel = select i1 %c, i32 %l, i32 42			%sel = select i1 %c, i32 %l, i32 42
	store i32 %sel, i32* %a			store i32 %sel, i32* %a
	ret void			ret void
	}			}

	define i8 @multiple_offsets_not_simplifiable_1(i1 %cnd1, i1 %cnd2) {			define i8 @gep_index_from_binary_operator(i1 %cnd1, i1 %cnd2) {
	; TUNIT: Function Attrs: nofree norecurse nosync nounwind willreturn			; CHECK: Function Attrs: nofree norecurse nosync nounwind willreturn memory(none)
	; TUNIT-LABEL: define {{[^@]+}}@multiple_offsets_not_simplifiable_1			; CHECK-LABEL: define {{[^@]+}}@gep_index_from_binary_operator
	; TUNIT-SAME: (i1 [[CND1:%.*]], i1 [[CND2:%.*]]) #[[ATTR3]] {			; CHECK-SAME: (i1 [[CND1:%.*]], i1 [[CND2:%.*]]) #[[ATTR4]] {
	; TUNIT-NEXT: entry:			; CHECK-NEXT: entry:
	; TUNIT-NEXT: [[BYTES:%.*]] = alloca [1024 x i8], align 16			; CHECK-NEXT: [[BYTES:%.*]] = alloca [1024 x i8], align 16
	; TUNIT-NEXT: [[GEP7:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 7			; CHECK-NEXT: [[GEP_FIXED:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 12
	; TUNIT-NEXT: [[GEP23:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 23			; CHECK-NEXT: ret i8 100
	; TUNIT-NEXT: [[SEL_PTR:%.*]] = select i1 [[CND1]], i8* [[GEP7]], i8* [[GEP23]]
	; TUNIT-NEXT: store i8 42, i8* [[SEL_PTR]], align 4
	; TUNIT-NEXT: [[I:%.*]] = load i8, i8* [[GEP7]], align 4
	; TUNIT-NEXT: ret i8 [[I]]
	;
	; CGSCC: Function Attrs: nofree norecurse nosync nounwind willreturn
	; CGSCC-LABEL: define {{[^@]+}}@multiple_offsets_not_simplifiable_1
	; CGSCC-SAME: (i1 [[CND1:%.*]], i1 [[CND2:%.*]]) #[[ATTR5]] {
	; CGSCC-NEXT: entry:
	; CGSCC-NEXT: [[BYTES:%.*]] = alloca [1024 x i8], align 16
	; CGSCC-NEXT: [[GEP7:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 7
	; CGSCC-NEXT: [[GEP23:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 23
	; CGSCC-NEXT: [[SEL_PTR:%.*]] = select i1 [[CND1]], i8* [[GEP7]], i8* [[GEP23]]
	; CGSCC-NEXT: store i8 42, i8* [[SEL_PTR]], align 4
	; CGSCC-NEXT: [[I:%.*]] = load i8, i8* [[GEP7]], align 4
	; CGSCC-NEXT: ret i8 [[I]]
	;			;
	entry:			entry:
	%Bytes = alloca [1024 x i8], align 16			%Bytes = alloca [1024 x i8], align 16
	%gep7 = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 7			%offset = add i64 5, 7
	%gep23 = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 23			%gep.fixed = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 12
	; %phi.ptr = phi i8* [ %gep7, %then ], [ %gep23, %else ]			%gep.sum = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 %offset
	%sel.ptr = select i1 %cnd1, i8* %gep7, i8* %gep23			store i8 100, i8* %gep.fixed, align 4
	store i8 42, i8* %sel.ptr, align 4			%i = load i8, i8* %gep.sum, align 4
	%i = load i8, i8* %gep7, align 4
	ret i8 %i			ret i8 %i
	}			}

				; FIXME: This should be simplifiable. See comment inside.

				define i8 @gep_index_from_memory(i1 %cnd1, i1 %cnd2) {
				; CHECK: Function Attrs: nofree norecurse nosync nounwind willreturn memory(none)
				; CHECK-LABEL: define {{[^@]+}}@gep_index_from_memory
				; CHECK-SAME: (i1 [[CND1:%.*]], i1 [[CND2:%.*]]) #[[ATTR4]] {
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[BYTES:%.*]] = alloca [1024 x i8], align 16
				; CHECK-NEXT: [[GEP_FIXED:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 12
				; CHECK-NEXT: [[GEP_LOADED:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 12
				; CHECK-NEXT: store i8 100, i8* [[GEP_LOADED]], align 4
				; CHECK-NEXT: [[I:%.*]] = load i8, i8* [[GEP_FIXED]], align 4
				; CHECK-NEXT: ret i8 [[I]]
				;
				entry:
				%Bytes = alloca [1024 x i8], align 16
				%addr = alloca i64, align 16
				%gep.addr = getelementptr inbounds i64, i64* %addr, i64 0
				store i64 12, i64* %gep.addr, align 8
				%offset = load i64, i64* %gep.addr, align 8
				%gep.fixed = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 12

				; FIXME: AAPotentialConstantValues does not detect the constant offset being
				; passed to this GEP. It needs to rely on AAPotentialValues.
				%gep.loaded = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 %offset
				store i8 100, i8* %gep.loaded, align 4
				jdoerfert: I think this is the same problem: we do not handle LoadInst and should just call fillSetWithConstantValues on the load value itself.

				%i = load i8, i8* %gep.fixed, align 4
				ret i8 %i
				}

	!llvm.module.flags = !{!0, !1}			!llvm.module.flags = !{!0, !1}
	!llvm.ident = !{!2}			!llvm.ident = !{!2}

	!0 = !{i32 1, !"wchar_size", i32 4}			!0 = !{i32 1, !"wchar_size", i32 4}
	!1 = !{i32 7, !"uwtable", i32 1}			!1 = !{i32 7, !"uwtable", i32 1}
	!2 = !{!"clang version 13.0.0"}			!2 = !{!"clang version 13.0.0"}
	!3 = !{!4, !4, i64 0}			!3 = !{!4, !4, i64 0}

Revision Contents

Diff 482009
