This is an archive of the discontinued LLVM Phabricator instance.


Differential D138646

[AAPointerInfo] track multiple constant offsets for each use
Closed, Public

Authored by sameerds on Nov 24 2022, 2:39 AM.


Details

Reviewers

jdoerfert
sstefan1

Commits

rG6a2305484e87: [AAPointerInfo] track multiple constant offsets for each use
rGc2a0baad1fbb: [AAPointerInfo] track multiple constant offsets for each use

Summary

An expression of the form gep(base, select(pred, const1, const2)) can result
in a set of offsets instead of just one. PointerInfo can now track these sets
instead of conservatively modeling them as Unknown.
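
As an illustration of the summary, here is a minimal sketch of how a GEP whose index is a select between constants yields a small set of byte offsets rather than a single one. It is not the Attributor code: `possibleByteOffsets` and its parameters are hypothetical names, and the real analysis tracks ranges with sizes and an Unknown state that this sketch omits.

```cpp
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical helper (not part of LLVM): given the byte offset already
// accumulated for the base pointer, the element size of the GEP, and the
// constant indices that a select may produce, return every byte offset the
// use may address, sorted and without duplicates.
std::vector<int64_t> possibleByteOffsets(int64_t BaseOffset, int64_t ElemSize,
                                         const std::vector<int64_t> &Indices) {
  std::set<int64_t> Unique; // std::set keeps the offsets sorted and unique
  for (int64_t Idx : Indices)
    Unique.insert(BaseOffset + Idx * ElemSize);
  return std::vector<int64_t>(Unique.begin(), Unique.end());
}
```

For example, an `i32` GEP at base offset 0 with index `select(pred, 3, 1)` may address byte offsets 4 and 12.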

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

sameerds created this revision. Nov 24 2022, 2:39 AM

Herald added a project: Restricted Project. Nov 24 2022, 2:39 AM

Herald added subscribers: kosarev, ormris, okura and 6 others.

sameerds requested review of this revision. Nov 24 2022, 2:39 AM

Herald added a reviewer: jdoerfert. Nov 24 2022, 2:39 AM

Herald added a reviewer: sstefan1.

Herald added a project: Restricted Project.

Herald added a subscriber: llvm-commits.

Harbormaster completed remote builds in B199368: Diff 477715. Nov 24 2022, 2:39 AM

sameerds added a parent revision: D138645: [AAPointerInfo] rearrange code in preparation for further changes. Nov 24 2022, 2:40 AM

Is this patch, and the ones prior in the stack, ready for review?

In D138646#3954187, @jdoerfert wrote:

Is this patch, and the ones prior in the stack, ready for review?

Yes this is ready for review. It lacks tests for function calls, which I intend to provide before submitting.

sameerds mentioned this in D138645: [AAPointerInfo] rearrange code in preparation for further changes. Nov 29 2022, 12:22 AM

Added some comments to the code
Removed a redundant CodeGen test
Improved readability in value-simplify-pointer-info.ll
Added tests in call-simplify-pointer-info.ll

sameerds added a child revision: D138991: [AAPointerInfo] handle multiple offsets in PHI. Nov 30 2022, 2:08 AM

Harbormaster completed remote builds in B200206: Diff 478855. Nov 30 2022, 2:21 AM

The only "issue" I have with this is the select traversal. We should rely on AAPotentialValues here, see the comment below. Everything else makes sense and looks pretty good.

llvm/include/llvm/Transforms/IPO/Attributor.h
5142	All assertions need messages please, also elsewhere.
llvm/lib/Transforms/IPO/AttributorAttributes.cpp
1485	I think we might want to call `Attributor::getPotentialValues` on the variable offsets. It should handle select and phi and more, e.g., loads. Maybe in addition to `Attributor::getAssumedConstant` we should have `Attributor::getAssumedConstantValues`. I want to avoid yet another traversal of some instructions in favor of common interfaces.
1529	SExt, I think. -1 is a fine offset.
1561	Nit: Unsure why we need two returns here.
1590	Same as above
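
The remark on line 1529 ("SExt, I think. -1 is a fine offset.") can be illustrated with a small sketch. The function names below are hypothetical and unrelated to LLVM's `APInt` API; they only show why a negative constant offset must be sign-extended rather than zero-extended when it is widened to 64 bits.

```cpp
#include <cstdint>

// Zero-extension reinterprets the 32-bit pattern as unsigned before widening,
// so a negative offset turns into a large positive one.
int64_t zeroExtend32(int32_t V) {
  return static_cast<int64_t>(static_cast<uint32_t>(V));
}

// Sign-extension preserves the numeric value, including negative offsets.
int64_t signExtend32(int32_t V) { return static_cast<int64_t>(V); }
```

With these helpers, `signExtend32(-1)` stays -1, while `zeroExtend32(-1)` becomes 4294967295, which would describe a very different memory location.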

Use AAPotentialConstantValues instead of tediously traversing SelectInst.

Can we have a test for phi and store-load propagation to verify it's working as expected (not only selects)?

llvm/lib/Transforms/IPO/AttributorAttributes.cpp
1391–1394	This doesn't make sense to me. We need to look at all VariableOffsets and decide. So `return` should only be present if we give up.
1401	I don't follow why we need two extra OffsetInfo objects here. We modify NewOI anyway, no?

Harbormaster completed remote builds in B201084: Diff 480067. Dec 5 2022, 8:46 AM

sameerds added inline comments. Dec 5 2022, 9:42 AM

llvm/lib/Transforms/IPO/AttributorAttributes.cpp
1391–1394	You might have missed the "not" in the condition. We want to give up and come again later if any information is not at fixed point. This is easily exercised by tests involving induction variables. The set of potential values of the induction variable keeps growing, but we should not use that set until it is fully enumerated. Any eager propagation of a non-fixed-point through PointerInfo at this point affects conclusions in other attributes that depend on it. I did not look for an example of correctness, but it does cause the attributor to retain stores that it would have otherwise removed in one existing test.
1401	On each iteration of the outer for loop over VariableOffsets, the expression is a product: UnionOfAllCopies = NewOI x AssumedSet. CopyPerOffset is the temporary used by the inner loop to compute this product. We need UnionOfAllCopies because it must only contain modified copies of NewOI, but not NewOI itself. We can't merge the RHS into NewOI. We have to start with an empty set. The actual output of the function is UsrOI. We do not move NewOI into that if we exit early. I realize there is no test for this yet. Working on that ... writing a GEP by hand is hard, when it involves nested aggregate types!
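
The product described in the reply for line 1401 can be sketched as follows. This is only an illustration under simplifying assumptions: offsets are modeled as plain sorted sets of integers, `OffsetSet` and `addVariableOffsets` are hypothetical names, and the Unknown handling and early exits of the actual patch are left out. The variable names mirror the ones used in the comment.

```cpp
#include <cstdint>
#include <set>
#include <utility>
#include <vector>

using OffsetSet = std::set<int64_t>;

// AssumedSets holds, for each variable GEP index, the constants it may take
// (assumed to be already scaled to bytes). Each outer step replaces NewOI by
// the set {O + C : O in NewOI, C in AssumedSet}.
OffsetSet addVariableOffsets(OffsetSet NewOI,
                             const std::vector<OffsetSet> &AssumedSets) {
  for (const OffsetSet &AssumedSet : AssumedSets) {
    // Must start empty: only shifted copies of NewOI belong in the result,
    // never the unshifted NewOI itself.
    OffsetSet UnionOfAllCopies;
    for (int64_t C : AssumedSet) {
      OffsetSet CopyPerOffset; // NewOI shifted by one candidate constant
      for (int64_t O : NewOI)
        CopyPerOffset.insert(O + C);
      UnionOfAllCopies.insert(CopyPerOffset.begin(), CopyPerOffset.end());
    }
    NewOI = std::move(UnionOfAllCopies);
  }
  return NewOI;
}
```

Starting from the single offset 0, two variable indices with candidate sets {4, 8} and {0, 16} give the four offsets 4, 8, 20 and 24.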

jdoerfert added inline comments. Dec 5 2022, 11:06 AM

llvm/lib/Transforms/IPO/AttributorAttributes.cpp
1391–1394	We cannot wait in one AA for another to find a fixpoint. That is not sound. That is not even always possible. Even if it would be, it won't work in the current algorithm. You need to update the AA state based on the state of the other AA, always. Then signal if something changed. That said, if we retain stores doing this properly we need to understand why.
1401	"I realize there is no test for this yet. Working on that ... writing a GEP by hand is hard, when it involves nested aggregate types!" Use clang.

Remove incorrect use of isAtFixpoint. Instead, Access::operator& no longer drops content when ranges are merged.
Add more tests, move tests to new file for better naming.

sameerds marked 6 inline comments as done. Dec 7 2022, 8:14 AM

sameerds added inline comments.

llvm/include/llvm/Transforms/IPO/Attributor.h
5142	Can fix this locally too.
llvm/lib/Transforms/IPO/AttributorAttributes.cpp
1391–1394	You're right. What I did here was plain wrong. The root cause was that when we merge ranges in operator&= for Access objects, we conservatively drop the contents. We don't need to be so conservative ... just combining contents from the two Access objects works well in case both happen to have the same contents.
1401	Test added.
1561	I missed this. If there are no other changes required, I can fix this locally before submitting.

jdoerfert added inline comments. Dec 7 2022, 8:22 AM

llvm/include/llvm/Transforms/IPO/Attributor.h
5204–5211	It was dropping the results for a reason before; just going back on it is probably not sound either. If we have (range: [0-4], value: i32 -42) and merge it with (range: [0-8], value: i64 -42), we need to do something here. At least, we need to change must to may, but I think we cannot even keep the value if they are equal under zext. If we did, writing `i32 0` and `i64 0` would make us believe the former writes 8 bytes and they'll all be 0, which is not true. When does this happen? Maybe we need to understand the problem better.
llvm/test/Transforms/Attributor/multiple-offsets-pointer-info.ll
336 ↗	(On Diff #480918)	AAPotentialConstantValues uses AAPotentialValues but the case to handle PHINodes is missing here: https://github.com/llvm/llvm-project/blob/72d76a2403459a38a1d6daae62de6945097db8f9/llvm/lib/Transforms/IPO/AttributorAttributes.cpp#L9222 We just need to go over the operands and call the fillSetWithConstantValues function. We can do that in a follow up or pre-patch, just commenting here to make sure we don't forget.
llvm/test/Transforms/Attributor/value-simplify-pointer-info.ll
3221	I think this is the same problem, we do not handle LoadInst and should just call fillSetWithConstantValues on the load value itself.
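
The `i32 0` versus `i64 0` concern raised in the comment on lines 5204–5211 can be made concrete with a small byte-level sketch. The function name is hypothetical and the code has nothing to do with the Attributor itself; it only shows that a 4-byte store of zero does not make an 8-byte read return zero.

```cpp
#include <cstdint>
#include <cstring>

// Store a 32-bit zero at the start of an 8-byte buffer whose bytes were all
// 0xFF, then read the whole buffer back as a 64-bit integer.
uint64_t readAfterNarrowZeroStore() {
  unsigned char Buf[8];
  std::memset(Buf, 0xFF, sizeof(Buf));
  const uint32_t Zero32 = 0;
  std::memcpy(Buf, &Zero32, sizeof(Zero32)); // writes bytes [0, 4) only
  uint64_t Wide;
  std::memcpy(&Wide, Buf, sizeof(Wide)); // reads bytes [0, 8)
  return Wide;
}
```

The result is nonzero on both little-endian and big-endian hosts, because four of the eight bytes still hold 0xFF.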

sameerds marked 2 inline comments as done. Dec 7 2022, 8:57 AM

sameerds added inline comments.

llvm/include/llvm/Transforms/IPO/Attributor.h
5204–5211	The Access objects being merged here originate from the same remote instruction. Doesn't that mean that the type is guaranteed to be the same? I tried putting an assert(Ty == R.Ty), and it did not fire for the lit tests. Instead of asserting, we can always check that before calling combineInValueLattice().
llvm/test/Transforms/Attributor/multiple-offsets-pointer-info.ll
336 ↗	(On Diff #480918)	Exactly what I had in mind. Would prefer to do this as a follow-up.

jdoerfert added inline comments. Dec 7 2022, 9:33 AM

llvm/include/llvm/Transforms/IPO/Attributor.h
5204–5211	I'm still confused. Even if it's the same value this is a problem, no? Merging (range: [0-4], value: i32 4, must write) with (range: [4-8], value: i32 4, must write) will result in (range: [0-8], value: i32 4, must write), which is not true. For one, we may only write 4 out of 8 bytes, and depending on which ones, it's not going to be `4` if you read the range [0-8].

Harbormaster completed remote builds in B201705: Diff 480918. Dec 7 2022, 5:47 PM

sameerds added inline comments. Dec 7 2022, 9:28 PM

llvm/include/llvm/Transforms/IPO/Attributor.h
5204–5211	Here is what I see about the creation of Access objects:
- Each Access is for a unique value.
- If the value is a MemIntrinsic, then the length is known, and all ranges for that Access will have that same length.
- Else, if the value is an argument, then the length is unknown and the Access has only one range (unknown).
- Else, the value is an instruction with an optionally known type (see handleAccess()). If the length is known, then all ranges have the same length, else it is a single unknown range.
This invariant is maintained even when looking through function calls. If the above is correct, then it might be redundant to even track the size for every Range in a RangeList. But that is assuming Ranges are used only for PointerInfo::Access objects. If the Range should remain generic, then we should allow the possibility that all Ranges in a RangeList are not the same size. We could add a bool "AllRangesAreSameSize" and check this when merging an Access into another Access. So if Ranges are not the same size, then the contents are unknown. Else, if the types are the same, then combine the contents. Else the contents are unknown.

sameerds marked an inline comment as not done. Dec 7 2022, 9:39 PM

sameerds added inline comments.

llvm/include/llvm/Transforms/IPO/Attributor.h
5204–5211	Correction in the third bullet ... the length may be unknown if it is not llvm::Argument. Otherwise the invariant about looking through function calls applies.

jdoerfert added inline comments. Dec 7 2022, 9:42 PM

llvm/include/llvm/Transforms/IPO/Attributor.h
5204–5211	I'm very confused. Apologies. So, if I understand this correctly now we will keep the value but mark the access as MAY if it has more than one range. That seems correct. My example above was merging the ranges, which we are not, and also keeping the MUST bit, which we are not. So, assuming I finally understand what is happening this should be fine.

LG, I think.

This revision is now accepted and ready to land. Dec 7 2022, 9:42 PM

sameerds marked an inline comment as not done. Dec 7 2022, 10:43 PM

sameerds added inline comments.

llvm/include/llvm/Transforms/IPO/Attributor.h
5204–5211	I am just glad to have your attention while I work through this! It's an important optimization for launching HIP programs. So yes, if there are multiple ranges in an Access, they are of the same size. We keep the contents if possible, but it is always a MAY access. I have put asserts in strategic places (isMayAccess and isMustAccess) to catch that.
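
The rule the thread converges on (keep the ranges separate, and treat the access as MAY as soon as there is more than one range) can be sketched as below. This is a simplified model, not the patch itself: `Range`, `AccessSketch` and the `Must` flag are hypothetical stand-ins, and the Unknown and Unassigned sentinels of the real `RangeList` are omitted.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

struct Range {
  int64_t Offset;
  int64_t Size;
};

// Simplified model of an access that owns a sorted list of ranges.
struct AccessSketch {
  std::vector<Range> Ranges; // sorted by Offset, at most one entry per offset
  bool Must = true;          // false means "may access"

  void addRange(int64_t Offset, int64_t Size) {
    // Find the first range whose offset is not less than the new offset.
    auto LB = std::lower_bound(
        Ranges.begin(), Ranges.end(), Offset,
        [](const Range &R, int64_t Off) { return R.Offset < Off; });
    if (LB == Ranges.end() || LB->Offset != Offset)
      Ranges.insert(LB, Range{Offset, Size}); // new offset, keep sorted order
    else
      LB->Size = std::max(LB->Size, Size); // same offset, merge the sizes
    // With several candidate ranges, no single one is definitely accessed.
    if (Ranges.size() > 1)
      Must = false;
  }
};
```

Adding a second distinct offset flips the sketch from must to may, while adding an offset that is already present only merges sizes and leaves the number of ranges unchanged.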

FWIW, I landed https://reviews.llvm.org/rG1eab2d699e9581305f32473291e6afa47017d582 and you might need to verify the tests against it. Worst case, we need to recursively call getPotentialValues.

Rebased.
Added messages to assertions.
handleAccess expects a reference Type&, since that argument cannot be nullptr.

Herald added subscribers: foad, arsenm. Dec 8 2022, 10:25 PM

Harbormaster completed remote builds in B202143: Diff 481518. Dec 8 2022, 10:50 PM

Rebased.
Yielded to clang-format's insistence in a couple of places.

Harbormaster completed remote builds in B202490: Diff 481992. Dec 11 2022, 10:34 PM

Closed by commit rGc2a0baad1fbb: [AAPointerInfo] track multiple constant offsets for each use (authored by sameerds). Dec 12 2022, 12:09 AM

This revision was automatically updated to reflect the committed changes.

sameerds added a commit: rGc2a0baad1fbb: [AAPointerInfo] track multiple constant offsets for each use.

sameerds added a reverting change: rG2fdeb2779006: Revert "[AAPointerInfo] track multiple constant offsets for each use". Dec 12 2022, 2:09 AM

sameerds added a commit: rG6a2305484e87: [AAPointerInfo] track multiple constant offsets for each use. Dec 13 2022, 8:57 AM

Revision Contents

Path	Size
llvm/include/llvm/Transforms/IPO/Attributor.h	229 lines
llvm/lib/Transforms/IPO/AttributorAttributes.cpp	331 lines
llvm/test/CodeGen/AMDGPU/attributor-through-select.ll	43 lines
llvm/test/Transforms/Attributor/value-simplify-pointer-info.ll	153 lines

Diff 477715

llvm/include/llvm/Transforms/IPO/Attributor.h

[259 lines not shown; enclosing context: RangeTy &operator&=(const RangeTy &R) {]
      else if (Size == Unknown || R.Size == Unknown)
        Size = Unknown;
      else if (R.Size != Unassigned)
        Size = std::max(Size, R.Size);
      return *this;
    }

+   /// Comparison for sorting ranges by offset.
+   ///
+   /// Returns true if the offset \p L is less than that of \p R.
+   inline static bool OffsetLessThan(const RangeTy &L, const RangeTy &R) {
+     return L.Offset < R.Offset;
+   }
+
    /// Constants used to represent special offsets or sizes.
    /// - This assumes that Offset and Size are non-negative.
    /// - The constants should not clash with DenseMapInfo, such as EmptyKey
    /// (INT64_MAX) and TombstoneKey (INT64_MIN).
    static constexpr int64_t Unassigned = -1;
    static constexpr int64_t Unknown = -2;
  };

+ inline raw_ostream &operator<<(raw_ostream &OS, const RangeTy &R) {
+   OS << "[" << R.Offset << ", " << R.Size << "]";
+   return OS;
+ }
+
  inline bool operator==(const RangeTy &A, const RangeTy &B) {
    return A.Offset == B.Offset && A.Size == B.Size;
  }

  inline bool operator!=(const RangeTy &A, const RangeTy &B) { return !(A == B); }

  /// Return the initial value of \p Obj with type \p Ty if that is a constant.
  Constant *getInitialValueForObj(Value &Obj, Type &Ty,

[4,714 lines not shown; enclosing context: enum AccessKind {]
      AK_MAY_READ = AK_MAY | AK_R,
      AK_MAY_WRITE = AK_MAY | AK_W,
      AK_MAY_READ_WRITE = AK_MAY | AK_R | AK_W,
      AK_MUST_READ = AK_MUST | AK_R,
      AK_MUST_WRITE = AK_MUST | AK_W,
      AK_MUST_READ_WRITE = AK_MUST | AK_R | AK_W,
    };

+   /// A container for a list of ranges.
+   struct RangeList {
+     // The set of ranges rarely contains more than one element, and is unlikely
+     // to contain more than say four elements. So we find the middle-ground with
+     // a sorted vector. This avoids hard-coding a rarely used number like "four"
+     // into every instance of a SmallSet.
+     using RangeTy = AA::RangeTy;
+     using VecTy = SmallVector<RangeTy>;
+     using iterator = VecTy::iterator;
+     using const_iterator = VecTy::const_iterator;
+     VecTy Ranges;
+
+     RangeList(const RangeTy &R) { Ranges.push_back(R); }
+     RangeList(ArrayRef<int64_t> Offsets, int64_t Size) {
+       Ranges.reserve(Offsets.size());
+       for (unsigned i = 0, e = Offsets.size(); i != e; ++i) {
+         assert(((i + 1 == e) || Offsets[i] < Offsets[i + 1]) &&
+                "Expected strictly ascending offsets.");
+         Ranges.emplace_back(Offsets[i], Size);
+       }
+     }
+     RangeList() = default;
+
+     iterator begin() { return Ranges.begin(); }
+     iterator end() { return Ranges.end(); }
+     const_iterator begin() const { return Ranges.begin(); }
+     const_iterator end() const { return Ranges.end(); }
+
+     // Helpers required for std::set_difference
+     using value_type = RangeTy;
+     void push_back(const RangeTy &R) {
+       assert((Ranges.empty() || RangeTy::OffsetLessThan(Ranges.back(), R)) &&
+              "Ensure the last element is the greatest.");
+       Ranges.push_back(R);
+     }
+
+     /// Copy ranges from \p L that are not in \p R, into \p D.
+     static void set_difference(const RangeList &L, const RangeList &R,
+                                RangeList &D) {
+       std::set_difference(L.begin(), L.end(), R.begin(), R.end(),
+                           std::back_inserter(D), RangeTy::OffsetLessThan);
+     }
+
+     unsigned size() const { return Ranges.size(); }
+
+     bool operator==(const RangeList &OI) const { return Ranges == OI.Ranges; }
+
+     /// Merge the ranges in \p RHS into the current ranges.
+     /// - Merging a list of unknown ranges makes the current list unknown.
+     /// - Ranges with the same offset are merged according to RangeTy::operator&
+     /// \return true if the current RangeList changed.
+     bool merge(const RangeList &RHS) {
+       if (isUnknown())
+         return false;
+       if (RHS.isUnknown()) {
+         setUnknown();
+         return true;
+       }
+
+       if (Ranges.empty()) {
+         Ranges = RHS.Ranges;
+         return true;
+       }
+
+       bool Changed = false;
+       auto LPos = Ranges.begin();
+       for (auto &R : RHS.Ranges) {
+         auto Result = insert(LPos, R);
+         if (isUnknown())
+           return true;
+         LPos = Result.first;
+         Changed |= Result.second;
+       }
+       return Changed;
+     }
+
+     /// Insert \p R at the given iterator \p Pos, and merge if necessary.
+     ///
+     /// This assumes that all ranges before \p Pos are OffsetLessThan \p R, and
+     /// then maintains the sorted order for the suffix list.
+     ///
+     /// \return The place of insertion and true iff anything changed.
+     std::pair<iterator, bool> insert(iterator Pos, const RangeTy &R) {
+       if (isUnknown())
+         return std::make_pair(Ranges.begin(), false);
+       if (R.offsetOrSizeAreUnknown()) {
+         return std::make_pair(setUnknown(), true);
+       }
+
+       // Maintain this as a sorted vector of unique entries.
+       auto LB = std::lower_bound(Pos, Ranges.end(), R, RangeTy::OffsetLessThan);
+       if (LB == Ranges.end() || LB->Offset != R.Offset)
+         return std::make_pair(Ranges.insert(LB, R), true);
+       bool Changed = *LB != R;
+       *LB &= R;
+       if (LB->offsetOrSizeAreUnknown())
+         return std::make_pair(setUnknown(), true);
+       return std::make_pair(LB, Changed);
+     }
+
+     /// Insert the given range \p R, maintaining sorted order.
+     ///
+     /// \return The place of insertion and true iff anything changed.
+     std::pair<iterator, bool> insert(const RangeTy &R) {
+       return insert(Ranges.begin(), R);
+     }
+
+     /// Add the increment \p Inc to the offset of every range.
+     void addToAllOffsets(int64_t Inc) {
+       assert(!isUnassigned());
+       if (isUnknown())
+         return;
+       for (auto &R : Ranges) {
+         R.Offset += Inc;
+       }
+     }
+
+     /// Return true iff there is exactly one range and it is known.
+     bool isUnique() const {
+       return Ranges.size() == 1 && !Ranges.front().offsetOrSizeAreUnknown();
+     }
+
+     /// Return the unique range, assuming it exists.
+     const RangeTy &getUnique() const {
+       assert(isUnique());

jdoerfert (inline comment, marked done): All assertions need messages please, also elsewhere. Suggested edit:

const RangeTy &getUnique() const {
- assert(isUnique());
+ assert(isUnique() && "Cannot return a unique range if there is no single one.");
return Ranges.front();

sameerds (author, marked done): Can fix this locally too.

+       return Ranges.front();
+     }
+
+     /// Return true iff the list contains an unknown range.
+     bool isUnknown() const {
+       if (isUnassigned())
+         return false;
+       if (Ranges.front().offsetOrSizeAreUnknown()) {
+         assert(Ranges.size() == 1);
+         return true;
+       }
+       return false;
+     }
+
+     /// Discard all ranges and insert a single unknown range.
+     iterator setUnknown() {
+       Ranges.clear();
+       Ranges.push_back(RangeTy::getUnknown());
+       return Ranges.begin();
+     }
+
+     /// Return true if no ranges have been inserted.
+     bool isUnassigned() const { return Ranges.size() == 0; }
+   };

    /// An access description.
    struct Access {
      Access(Instruction *I, int64_t Offset, int64_t Size,
             Optional<Value *> Content, AccessKind Kind, Type *Ty)
-         : LocalI(I), RemoteI(I), Content(Content), Range(Offset, Size),
+         : LocalI(I), RemoteI(I), Content(Content), Ranges(Offset, Size),
            Kind(Kind), Ty(Ty) {
        verify();
      }
+     Access(Instruction *LocalI, Instruction *RemoteI, const RangeList &Ranges,
+            Optional<Value *> Content, AccessKind K, Type *Ty)
+         : LocalI(LocalI), RemoteI(RemoteI), Content(Content), Ranges(Ranges),
+           Kind(K), Ty(Ty) {
+       if (Ranges.size() > 1) {
+         Kind = AccessKind(Kind | AK_MAY);
+         Kind = AccessKind(Kind & ~AK_MUST);
+       }
+       verify();
+     }
      Access(Instruction *LocalI, Instruction *RemoteI, int64_t Offset,
             int64_t Size, Optional<Value *> Content, AccessKind Kind, Type *Ty)
          : LocalI(LocalI), RemoteI(RemoteI), Content(Content),
-           Range(Offset, Size), Kind(Kind), Ty(Ty) {
+           Ranges(Offset, Size), Kind(Kind), Ty(Ty) {
        verify();
      }
      Access(const Access &Other) = default;
-     Access(const Access &&Other)
-         : LocalI(Other.LocalI), RemoteI(Other.RemoteI), Content(Other.Content),
-           Range(Other.Range), Kind(Other.Kind), Ty(Other.Ty) {}

      Access &operator=(const Access &Other) = default;
      bool operator==(const Access &R) const {
-       return LocalI == R.LocalI && RemoteI == R.RemoteI && Range == R.Range &&
+       return LocalI == R.LocalI && RemoteI == R.RemoteI && Ranges == R.Ranges &&
               Content == R.Content && Kind == R.Kind;
      }
      bool operator!=(const Access &R) const { return !(*this == R); }

      Access &operator&=(const Access &R) {
        assert(RemoteI == R.RemoteI && "Expected same instruction!");
        assert(LocalI == R.LocalI && "Expected same instruction!");
        Kind = AccessKind(Kind | R.Kind);
-       auto Before = Range;
-       Range &= R.Range;
-       if (Before.isUnassigned() || Before == Range) {
+       bool Changed = Ranges.merge(R.Ranges);
+       if (!Changed) {
          Content =
              AA::combineOptionalValuesInAAValueLatice(Content, R.Content, Ty);
        } else {
          // Since the Range information changed, set a conservative state -- drop
          // the contents, and assume MayAccess rather than MustAccess.


          setWrittenValueUnknown();
          Kind = AccessKind(Kind | AK_MAY);
          Kind = AccessKind(Kind & ~AK_MUST);
        }
        verify();
        return *this;
      }

      void verify() {
        assert(isMustAccess() + isMayAccess() == 1 &&
               "Expect must or may access, not both.");
+       assert((isMayAccess() || Ranges.size() == 1) &&
+              "Cannot be a must access if there are multiple ranges.");
      }

      /// Return the access kind.
      AccessKind getKind() const { return Kind; }

      /// Return true if this is a read access.
      bool isRead() const { return Kind & AK_R; }

      /// Return true if this is a write access.
      bool isWrite() const { return Kind & AK_W; }

-     bool isMustAccess() const { return Kind & AK_MUST; }
-     bool isMayAccess() const { return Kind & AK_MAY; }
+     bool isMustAccess() const {
+       bool MustAccess = Kind & AK_MUST;
+       assert((!MustAccess || Ranges.size() == 1) &&
+              "Cannot be a must access if there are multiple ranges.");
+       return MustAccess;
+     }
+
+     bool isMayAccess() const {
+       bool MayAccess = Kind & AK_MAY;
+       assert((MayAccess || Ranges.size() == 1) &&
+              "Cannot be a must access if there are multiple ranges.");
+       return MayAccess;
+     }

      /// Return the instruction that causes the access with respect to the local
      /// scope of the associated attribute.
      Instruction *getLocalInst() const { return LocalI; }

      /// Return the actual instruction that causes the access.
      Instruction *getRemoteInst() const { return RemoteI; }

Show All 17 Lines Value *getWrittenValue() const {

"Value needs to be determined before accessing it."); "Value needs to be determined before accessing it.");

return *Content; return *Content;

} }

/// Return the written value which can be `llvm::null` if it is not yet /// Return the written value which can be `llvm::null` if it is not yet

/// determined. /// determined.

Optional<Value *> getContent() const { return Content; } Optional<Value *> getContent() const { return Content; }

/// Return the offset for this access. bool hasUniqueRange() const { return Ranges.isUnique(); }

int64_t getOffset() const { return Range.Offset; } const AA::RangeTy &getUniqueRange() const { return Ranges.getUnique(); }

/// Add a range accessed by this Access.

///

/// If there are multiple ranges, then this is a "may access".

void addRange(int64_t Offset, int64_t Size) {

Ranges.insert({Offset, Size});

if (!hasUniqueRange()) {

Kind = AccessKind(Kind | AK_MAY);

Kind = AccessKind(Kind & ~AK_MUST);

}

const RangeList &getRanges() const { return Ranges; }

/// Return the size for this access. using const_iterator = RangeList::const_iterator;

int64_t getSize() const { return Range.Size; } const_iterator begin() const { return Ranges.begin(); }

const_iterator end() const { return Ranges.end(); }

private: private:

/// The instruction responsible for the access with respect to the local /// The instruction responsible for the access with respect to the local

/// scope of the associated attribute. /// scope of the associated attribute.

Instruction *LocalI; Instruction *LocalI;

/// The instruction responsible for the access. /// The instruction responsible for the access.

Instruction *RemoteI; Instruction *RemoteI;

/// The value written, if any. `llvm::none` means "not known yet", `nullptr` /// The value written, if any. `llvm::none` means "not known yet", `nullptr`

/// cannot be determined. /// cannot be determined.

Optional<Value *> Content; Optional<Value *> Content;

/// The object accessed, in terms of an offset and size in bytes. /// Set of potential ranges accessed from the base pointer.

AA::RangeTy Range; RangeList Ranges;

/// The access kind, e.g., READ, as bitset (could be more than one). /// The access kind, e.g., READ, as bitset (could be more than one).

AccessKind Kind; AccessKind Kind;

/// The type of the content, thus the type read/written, can be null if not /// The type of the content, thus the type read/written, can be null if not

/// available. /// available.

Type *Ty; Type *Ty;

}; };

llvm/lib/Transforms/IPO/AttributorAttributes.cpp

//===- AttributorAttributes.cpp - Attributes for Attributor deduction -----===//		//===- AttributorAttributes.cpp - Attributes for Attributor deduction -----===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
// See the Attributor.h file comment and the class descriptions in that file for		// See the Attributor.h file comment and the class descriptions in that file for
[...]	struct AA::PointerInfo::State : public AbstractState {
/// Add a new Access to the state at offset \p Offset and with size \p Size.		/// Add a new Access to the state at offset \p Offset and with size \p Size.
/// The access is associated with \p I, writes \p Content (if anything), and		/// The access is associated with \p I, writes \p Content (if anything), and
/// is of kind \p Kind. If an Access already exists for the same \p I and same		/// is of kind \p Kind. If an Access already exists for the same \p I and same
/// \p RemoteI, the two are combined, potentially losing information about		/// \p RemoteI, the two are combined, potentially losing information about
/// offset and size. The resulting access must now be moved from its original		/// offset and size. The resulting access must now be moved from its original
/// OffsetBin to the bin for its new offset.		/// OffsetBin to the bin for its new offset.
///		///
/// \Returns CHANGED, if the state changed, UNCHANGED otherwise.		/// \Returns CHANGED, if the state changed, UNCHANGED otherwise.
ChangeStatus addAccess(Attributor &A, int64_t Offset, int64_t Size,		ChangeStatus addAccess(Attributor &A, const AAPointerInfo::RangeList &Ranges,
Instruction &I, Optional<Value *> Content,		Instruction &I, Optional<Value *> Content,
AAPointerInfo::AccessKind Kind, Type *Ty,		AAPointerInfo::AccessKind Kind, Type *Ty,
Instruction *RemoteI = nullptr);		Instruction *RemoteI = nullptr);

using OffsetBinsTy = DenseMap<RangeTy, SmallSet<unsigned, 4>>;		using OffsetBinsTy = DenseMap<RangeTy, SmallSet<unsigned, 4>>;

using const_bin_iterator = OffsetBinsTy::const_iterator;		using const_bin_iterator = OffsetBinsTy::const_iterator;
const_bin_iterator begin() const { return OffsetBins.begin(); }		const_bin_iterator begin() const { return OffsetBins.begin(); }
[...]	bool forallInterferingAccesses(

for (const auto &It : OffsetBins) {		for (const auto &It : OffsetBins) {
AA::RangeTy ItRange = It.getFirst();		AA::RangeTy ItRange = It.getFirst();
if (!Range.mayOverlap(ItRange))		if (!Range.mayOverlap(ItRange))
continue;		continue;
bool IsExact = Range == ItRange && !Range.offsetOrSizeAreUnknown();		bool IsExact = Range == ItRange && !Range.offsetOrSizeAreUnknown();
for (auto Index : It.getSecond()) {		for (auto Index : It.getSecond()) {
auto &Access = AccessList[Index];		auto &Access = AccessList[Index];
if (!CB(Access, IsExact))		if (!CB(Access, IsExact && Access.hasUniqueRange()))
return false;		return false;
}		}
}		}
return true;		return true;
}		}

/// See AAPointerInfo::forallInterferingAccesses.		/// See AAPointerInfo::forallInterferingAccesses.
bool forallInterferingAccesses(		bool forallInterferingAccesses(
Instruction &I,		Instruction &I,
function_ref<bool(const AAPointerInfo::Access &, bool)> CB,		function_ref<bool(const AAPointerInfo::Access &, bool)> CB,
AA::RangeTy &Range) const {		AA::RangeTy &Range) const {
if (!isValidState())		if (!isValidState())
return false;		return false;

auto LocalList = RemoteIMap.find(&I);		auto LocalList = RemoteIMap.find(&I);
if (LocalList == RemoteIMap.end()) {		if (LocalList == RemoteIMap.end()) {
return true;		return true;
}		}

for (auto LI : LocalList->getSecond()) {		for (unsigned Index : LocalList->getSecond()) {
auto &Access = AccessList[LI];		for (auto &R : AccessList[Index]) {
Range &= {Access.getOffset(), Access.getSize()};		Range &= R;
		if (Range.offsetOrSizeAreUnknown())
		break;
		}
}		}
return forallInterferingAccesses(Range, CB);		return forallInterferingAccesses(Range, CB);
}		}

private:		private:
/// State to track fixpoint and validity.		/// State to track fixpoint and validity.
BooleanState BS;		BooleanState BS;
};		};

ChangeStatus AA::PointerInfo::State::addAccess(Attributor &A, int64_t Offset,		ChangeStatus AA::PointerInfo::State::addAccess(
int64_t Size, Instruction &I,		Attributor &A, const AAPointerInfo::RangeList &Ranges, Instruction &I,
Optional<Value *> Content,		Optional<Value *> Content, AAPointerInfo::AccessKind Kind, Type *Ty,
AAPointerInfo::AccessKind Kind,		Instruction *RemoteI) {
Type *Ty, Instruction *RemoteI) {
RemoteI = RemoteI ? RemoteI : &I;		RemoteI = RemoteI ? RemoteI : &I;
AAPointerInfo::Access Acc(&I, RemoteI, Offset, Size, Content, Kind, Ty);

// Check if we have an access for this instruction, if not, simply add it.		// Check if we have an access for this instruction, if not, simply add it.
auto &LocalList = RemoteIMap[RemoteI];		auto &LocalList = RemoteIMap[RemoteI];
bool AccExists = false;		bool AccExists = false;
unsigned AccIndex = AccessList.size();		unsigned AccIndex = AccessList.size();
for (auto Index : LocalList) {		for (auto Index : LocalList) {
auto &A = AccessList[Index];		auto &A = AccessList[Index];
if (A.getLocalInst() == &I) {		if (A.getLocalInst() == &I) {
AccExists = true;		AccExists = true;
AccIndex = Index;		AccIndex = Index;
break;		break;
}		}
}		}

		auto AddToBins = [&](const AAPointerInfo::RangeList &ToAdd) {
		LLVM_DEBUG(
		if (ToAdd.size())
		dbgs() << "[AAPointerInfo] Inserting access in new offset bins\n";
		);

		for (auto Key : ToAdd) {
		LLVM_DEBUG(dbgs() << " key " << Key << "\n");
		OffsetBins[Key].insert(AccIndex);
		}
		};

if (!AccExists) {		if (!AccExists) {
AccessList.push_back(Acc);		AccessList.emplace_back(&I, RemoteI, Ranges, Content, Kind, Ty);
		assert((AccessList.size() == AccIndex + 1) &&
		"New Access should have been at AccIndex");
LocalList.push_back(AccIndex);		LocalList.push_back(AccIndex);
} else {		AddToBins(AccessList[AccIndex].getRanges());
// The new one will be combined with the existing one.		return ChangeStatus::CHANGED;
		}

		// Combine the new Access with the existing Access, and then update the
		// mapping in the offset bins.
		AAPointerInfo::Access Acc(&I, RemoteI, Ranges, Content, Kind, Ty);
auto &Current = AccessList[AccIndex];		auto &Current = AccessList[AccIndex];
auto Before = Current;		auto Before = Current;
Current &= Acc;		Current &= Acc;
if (Current == Before)		if (Current == Before)
return ChangeStatus::UNCHANGED;		return ChangeStatus::UNCHANGED;

Acc = Current;		auto &ExistingRanges = Before.getRanges();
AA::RangeTy Key{Before.getOffset(), Before.getSize()};		auto &NewRanges = Current.getRanges();

		// Ranges that are in the old access but not the new access need to be removed
		// from the offset bins.
		AAPointerInfo::RangeList ToRemove;
		AAPointerInfo::RangeList::set_difference(ExistingRanges, NewRanges, ToRemove);
		LLVM_DEBUG(
		if (ToRemove.size())
		dbgs() << "[AAPointerInfo] Removing access from old offset bins\n";
		);

		for (auto Key : ToRemove) {
		LLVM_DEBUG(dbgs() << " key " << Key << "\n");
assert(OffsetBins.count(Key) && "Existing Access must be in some bin.");		assert(OffsetBins.count(Key) && "Existing Access must be in some bin.");
auto &Bin = OffsetBins[Key];		auto &Bin = OffsetBins[Key];
assert(Bin.count(AccIndex) &&		assert(Bin.count(AccIndex) &&
"Expected bin to actually contain the Access.");		"Expected bin to actually contain the Access.");
LLVM_DEBUG(dbgs() << "[AAPointerInfo] Removing Access "
<< AccessList[AccIndex] << " with key {" << Key.Offset
<< ',' << Key.Size << "}\n");
Bin.erase(AccIndex);		Bin.erase(AccIndex);
}		}

AA::RangeTy Key{Acc.getOffset(), Acc.getSize()};		// Ranges that are in the new access but not the old access need to be added
LLVM_DEBUG(dbgs() << "[AAPointerInfo] Inserting Access " << Acc		// to the offset bins.
<< " with key {" << Key.Offset << ',' << Key.Size << "}\n");		AAPointerInfo::RangeList ToAdd;
OffsetBins[Key].insert(AccIndex);		AAPointerInfo::RangeList::set_difference(NewRanges, ExistingRanges, ToAdd);
		AddToBins(ToAdd);
return ChangeStatus::CHANGED;		return ChangeStatus::CHANGED;
}		}
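
The bin update at the end of addAccess() above only touches bins whose membership changes: ranges present before the merge but not after are removed, and ranges present after but not before are added. A minimal standalone sketch of that idea follows, using standard containers. `Range`, `RangeVec`, `Bins`, `rangeDifference` and `rebin` are hypothetical stand-ins for illustration, not the LLVM types or the RangeList API.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <map>
#include <set>
#include <utility>
#include <vector>

// Hypothetical stand-ins for AA::RangeTy and the offset bins; not LLVM types.
using Range = std::pair<int64_t, int64_t>; // {Offset, Size}
using RangeVec = std::vector<Range>;       // assumed sorted ascending
using Bins = std::map<Range, std::set<unsigned>>;

// Elements of A that are absent from B. Both inputs must be sorted.
RangeVec rangeDifference(const RangeVec &A, const RangeVec &B) {
  RangeVec Out;
  std::set_difference(A.begin(), A.end(), B.begin(), B.end(),
                      std::back_inserter(Out));
  return Out;
}

// Move access AccIndex from the bins of Before to the bins of After,
// touching only the bins whose membership actually changes.
void rebin(Bins &OffsetBins, unsigned AccIndex, const RangeVec &Before,
           const RangeVec &After) {
  for (const Range &Key : rangeDifference(Before, After))
    OffsetBins[Key].erase(AccIndex);
  for (const Range &Key : rangeDifference(After, Before))
    OffsetBins[Key].insert(AccIndex);
}
```

Computing the two differences separately leaves any bin shared by the old and new range sets untouched.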

namespace {		namespace {

/// Helper struct, will support ranges eventually.		/// A helper containing a list of offsets computed for a Use. Ideally this
///		/// list should be strictly ascending, but we ensure that only when we
/// FIXME: Tracks a single Offset until we have proper support for a list of		/// actually translate the list of offsets to a RangeList.
/// RangeTy objects.
struct OffsetInfo {		struct OffsetInfo {
int64_t Offset = AA::RangeTy::Unassigned;		using VecTy = SmallVector<int64_t>;
		using const_iterator = VecTy::const_iterator;
		VecTy Offsets;

		const_iterator begin() const { return Offsets.begin(); }
		const_iterator end() const { return Offsets.end(); }

		bool operator==(const OffsetInfo &RHS) const {
		return Offsets == RHS.Offsets;
		}

		void insert(int64_t Offset) { Offsets.push_back(Offset); }
		bool isUnassigned() const { return Offsets.size() == 0; }

bool operator==(const OffsetInfo &OI) const { return Offset == OI.Offset; }		bool isUnknown() const {
		if (isUnassigned())
		return false;
		if (Offsets.size() == 1)
		return Offsets.front() == AA::RangeTy::Unknown;
		return false;
		}

		void setUnknown() {
		Offsets.clear();
		Offsets.push_back(AA::RangeTy::Unknown);
		}

		void addToAll(int64_t Inc) {
		for (auto &Offset : Offsets) {
		Offset += Inc;
		}
		}

		/// Copy offsets from \p R into the current list.
		///
		/// Ideally all lists should be strictly ascending, but we defer that to the
		/// actual use of the list. So we just blindly append here.
		void merge(const OffsetInfo &R) { Offsets.append(R.Offsets); }
};		};

		static raw_ostream &operator<<(raw_ostream &OS, const OffsetInfo &OI) {
		ListSeparator LS;
		OS << "[";
		for (auto Offset : OI) {
		OS << LS << Offset;
		}
		OS << "]";
		return OS;
		}

struct AAPointerInfoImpl		struct AAPointerInfoImpl
: public StateWrapper<AA::PointerInfo::State, AAPointerInfo> {		: public StateWrapper<AA::PointerInfo::State, AAPointerInfo> {
using BaseTy = StateWrapper<AA::PointerInfo::State, AAPointerInfo>;		using BaseTy = StateWrapper<AA::PointerInfo::State, AAPointerInfo>;
AAPointerInfoImpl(const IRPosition &IRP, Attributor &A) : BaseTy(IRP) {}		AAPointerInfoImpl(const IRPosition &IRP, Attributor &A) : BaseTy(IRP) {}

/// See AbstractAttribute::getAsStr().		/// See AbstractAttribute::getAsStr().
const std::string getAsStr() const override {		const std::string getAsStr() const override {
return std::string("PointerInfo ") +		return std::string("PointerInfo ") +
[...]	ChangeStatus translateAndAddStateFromCallee(Attributor &A,
const auto &State = OtherAAImpl.getState();		const auto &State = OtherAAImpl.getState();
for (const auto &It : State) {		for (const auto &It : State) {
for (auto Index : It.getSecond()) {		for (auto Index : It.getSecond()) {
const auto &RAcc = State.getAccess(Index);		const auto &RAcc = State.getAccess(Index);
if (IsByval && !RAcc.isRead())		if (IsByval && !RAcc.isRead())
continue;		continue;
bool UsedAssumedInformation = false;		bool UsedAssumedInformation = false;
AccessKind AK = RAcc.getKind();		AccessKind AK = RAcc.getKind();
Optional<Value *> Content = RAcc.getContent();		Optional<Value *> Content = A.translateArgumentToCallSiteContent(
Content = A.translateArgumentToCallSiteContent(
RAcc.getContent(), CB, *this, UsedAssumedInformation);		RAcc.getContent(), CB, *this, UsedAssumedInformation);
AK = AccessKind(AK & (IsByval ? AccessKind::AK_R : AccessKind::AK_RW));		AK = AccessKind(AK & (IsByval ? AccessKind::AK_R : AccessKind::AK_RW));
AK = AccessKind(AK | (RAcc.isMayAccess() ? AK_MAY : AK_MUST));		AK = AccessKind(AK | (RAcc.isMayAccess() ? AK_MAY : AK_MUST));
Changed =
Changed | addAccess(A, It.first.Offset, It.first.Size, CB, Content,		Changed |= addAccess(A, RAcc.getRanges(), CB, Content, AK,
AK, RAcc.getType(), RAcc.getRemoteInst());		RAcc.getType(), RAcc.getRemoteInst());
}		}
}		}
return Changed;		return Changed;
}		}

ChangeStatus translateAndAddState(Attributor &A, const AAPointerInfo &OtherAA,		ChangeStatus translateAndAddState(Attributor &A, const AAPointerInfo &OtherAA,
int64_t Offset, CallBase &CB) {		const OffsetInfo &Offsets, CallBase &CB) {
using namespace AA::PointerInfo;		using namespace AA::PointerInfo;
if (!OtherAA.getState().isValidState() || !isValidState())		if (!OtherAA.getState().isValidState() || !isValidState())
return indicatePessimisticFixpoint();		return indicatePessimisticFixpoint();

const auto &OtherAAImpl = static_cast<const AAPointerInfoImpl &>(OtherAA);		const auto &OtherAAImpl = static_cast<const AAPointerInfoImpl &>(OtherAA);

// Combine the accesses bin by bin.		// Combine the accesses bin by bin.
ChangeStatus Changed = ChangeStatus::UNCHANGED;		ChangeStatus Changed = ChangeStatus::UNCHANGED;
const auto &State = OtherAAImpl.getState();		const auto &State = OtherAAImpl.getState();
for (const auto &It : State) {		for (const auto &It : State) {
AA::RangeTy Range = AA::RangeTy::getUnknown();
if (Offset != AA::RangeTy::Unknown &&
!It.first.offsetOrSizeAreUnknown()) {
Range = AA::RangeTy(It.first.Offset + Offset, It.first.Size);
}
for (auto Index : It.getSecond()) {		for (auto Index : It.getSecond()) {
const auto &RAcc = State.getAccess(Index);		const auto &RAcc = State.getAccess(Index);
AccessKind AK = RAcc.getKind();		for (auto Offset : Offsets) {
Optional<Value *> Content = RAcc.getContent();		auto NewRanges = Offset == AA::RangeTy::Unknown
Changed = Changed | addAccess(A, Range.Offset, Range.Size, CB, Content,		? AA::RangeTy::getUnknown()
AK, RAcc.getType(), RAcc.getRemoteInst());		: RAcc.getRanges();
		if (!NewRanges.isUnknown()) {
		NewRanges.addToAllOffsets(Offset);
		}
		Changed |=
		addAccess(A, NewRanges, CB, RAcc.getContent(), RAcc.getKind(),
		RAcc.getType(), RAcc.getRemoteInst());
		}
}		}
}		}
return Changed;		return Changed;
}		}

/// Statistic tracking for all AAPointerInfo implementations.		/// Statistic tracking for all AAPointerInfo implementations.
/// See AbstractAttribute::trackStatistics().		/// See AbstractAttribute::trackStatistics().
void trackPointerInfoStatistics(const IRPosition &IRP) const {}		void trackPointerInfoStatistics(const IRPosition &IRP) const {}
[...]

struct AAPointerInfoFloating : public AAPointerInfoImpl {		struct AAPointerInfoFloating : public AAPointerInfoImpl {
using AccessKind = AAPointerInfo::AccessKind;		using AccessKind = AAPointerInfo::AccessKind;
AAPointerInfoFloating(const IRPosition &IRP, Attributor &A)		AAPointerInfoFloating(const IRPosition &IRP, Attributor &A)
: AAPointerInfoImpl(IRP, A) {}		: AAPointerInfoImpl(IRP, A) {}

/// Deal with an access and signal if it was handled successfully.		/// Deal with an access and signal if it was handled successfully.
bool handleAccess(Attributor &A, Instruction &I, Optional<Value *> Content,		bool handleAccess(Attributor &A, Instruction &I, Optional<Value *> Content,
AccessKind Kind, int64_t Offset, ChangeStatus &Changed,		AccessKind Kind, SmallVectorImpl<int64_t> &Offsets,
Type *Ty) {		ChangeStatus &Changed, Type *Ty) {
using namespace AA::PointerInfo;		using namespace AA::PointerInfo;
auto Size = AA::RangeTy::Unknown;		auto Size = AA::RangeTy::Unknown;
const DataLayout &DL = A.getDataLayout();		const DataLayout &DL = A.getDataLayout();
TypeSize AccessSize = DL.getTypeStoreSize(Ty);		TypeSize AccessSize = DL.getTypeStoreSize(Ty);
if (!AccessSize.isScalable())		if (!AccessSize.isScalable())
Size = AccessSize.getFixedSize();		Size = AccessSize.getFixedSize();
Changed = Changed | addAccess(A, Offset, Size, I, Content, Kind, Ty);
		// Make a strictly ascending list of offsets as required by addAccess()
		llvm::sort(Offsets);
		auto Last = std::unique(Offsets.begin(), Offsets.end());
		Offsets.erase(Last, Offsets.end());

		Changed = Changed | addAccess(A, {Offsets, Size}, I, Content, Kind, Ty);
return true;		return true;
};		};
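
The sort-and-unique step in handleAccess() above is what turns the blindly appended offsets into the strictly ascending list that addAccess() expects. A minimal sketch of the same idiom with the standard library is shown here. `strictlyAscending` is a hypothetical helper name, and `std::sort` stands in for `llvm::sort`.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Sort the offsets and drop duplicates so the result is strictly ascending.
// Takes the vector by value so the caller's copy is left untouched.
std::vector<int64_t> strictlyAscending(std::vector<int64_t> Offsets) {
  std::sort(Offsets.begin(), Offsets.end());
  // std::unique only removes adjacent duplicates, hence the sort first.
  Offsets.erase(std::unique(Offsets.begin(), Offsets.end()), Offsets.end());
  return Offsets;
}
```

Deferring this normalization to the point of use keeps OffsetInfo::merge() a plain append.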

/// See AbstractAttribute::updateImpl(...).		/// See AbstractAttribute::updateImpl(...).
ChangeStatus updateImpl(Attributor &A) override;		ChangeStatus updateImpl(Attributor &A) override;

/// See AbstractAttribute::trackStatistics()		/// See AbstractAttribute::trackStatistics()
void trackStatistics() const override {		void trackStatistics() const override {
AAPointerInfoImpl::trackPointerInfoStatistics(getIRPosition());		AAPointerInfoImpl::trackPointerInfoStatistics(getIRPosition());
}		}
};		};

		static bool collectConstantOffsets(SmallVectorImpl<unsigned> &Offsets,
		SelectInst *Select) {
		SmallVector<SelectInst *> Stack;
		SmallSet<SelectInst *, 4> Visited;
		Stack.push_back(Select);
		Visited.insert(Select);

		while (!Stack.empty()) {
		auto *Select = Stack.pop_back_val();
		Visited.insert(Select);

		// Operands [1] and [2] are the True and False inputs.
		for (int i = 1; i != 3; ++i) {
		auto *Opnd = Select->getOperand(i);
		if (auto *Sel = dyn_cast<SelectInst>(Opnd)) {
		if (Visited.insert(Sel).second)
		Stack.push_back(Sel);
		continue;
		}
		if (auto *ConstantOp = dyn_cast<ConstantInt>(Opnd)) {
		Offsets.push_back(ConstantOp->getZExtValue());
		continue;
		}
		return false;
		}
		}

		return true;
		}

		static bool
		handleConstantSelect(OffsetInfo &UsrOI, unsigned ConstantOffset,
		const MapVector<Value *, APInt> &VariableOffsets) {
		auto NewOI = UsrOI;
		NewOI.addToAll(ConstantOffset);

		for (const auto &VI : VariableOffsets) {
		auto *Select = dyn_cast<SelectInst>(VI.first);
		if (!Select)
		return false;
		jdoerfert: This doesn't make sense to me. We need to look at all VariableOffsets and decide. So `return` should only be present if we give up.
		sameerds (Author): You might have missed the "not" in the condition. We want to give up and come again later if any information is not at a fixed point. This is easily exercised by tests involving induction variables. The set of potential values of the induction variable keeps growing, but we should not use that set until it is fully enumerated. Any eager propagation of a non-fixed-point through PointerInfo at this point affects conclusions in other attributes that depend on it. I did not look for an example of correctness, but it does cause the attributor to retain stores that it would have otherwise removed in one existing test.
		jdoerfert: We cannot wait in one AA for another to find a fixpoint. That is not sound. That is not even always possible. Even if it would be, it won't work in the current algorithm. You need to update the AA state based on the state of the other AA, always. Then signal if something changed. That said, if we retain stores doing this properly we need to understand why.
		sameerds (Author): You're right. What I did here was plain wrong. The root cause was that when we merge ranges in operator&= for Access objects, we conservatively drop the contents. We don't need to be so conservative ... just combining contents from the two Access objects works well in case both happen to have the same contents.
		SmallVector<unsigned> Offsets;
		if (!collectConstantOffsets(Offsets, Select))
		return false;

		OffsetInfo UnionOfAllCopies;
		for (auto ConstOffset : Offsets) {
		auto CopyPerOffset = NewOI;
		jdoerfert: I don't follow why we need two extra OffsetInfo objects here. We modify NewOI anyway, no?
		sameerds (Author): On each iteration of the outer for loop over VariableOffsets, the expression is a product: UnionOfAllCopies = NewOI x AssumedSet. CopyPerOffset is the temporary used by the inner loop to compute this product. We need UnionOfAllCopies because it must only contain modified copies of NewOI, but not NewOI itself. We can't merge the RHS into NewOI. We have to start with an empty set. The actual output of the function is UsrOI. We do not move NewOI into that if we exit early. I realize there is no test for this yet. Working on that ... writing a GEP by hand is hard, when it involves nested aggregate types!
		jdoerfert: "I realize there is no test for this yet. Working on that ... writing a GEP by hand is hard, when it involves nested aggregate types!" Use clang.
		sameerds (Author): Test added.
		CopyPerOffset.addToAll(ConstOffset);
		UnionOfAllCopies.merge(CopyPerOffset);
		}
		NewOI = UnionOfAllCopies;
		}

		UsrOI = std::move(NewOI);
		return true;
		}
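
The product described in the inline discussion above (UnionOfAllCopies = NewOI x AssumedSet) can be sketched without any LLVM types. The code below is an illustration under stated assumptions, not the patch's implementation: `OffsetVec` and `combineOffsets` are hypothetical names, and each entry of `Groups` stands for the constants collected from one variable GEP index.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

using OffsetVec = std::vector<int64_t>;

// Start from the incoming offsets shifted by the constant part of the GEP.
// Then, for each group of candidate constants, replace the current set with
// { C + K : C in Current, K in Group }.
OffsetVec combineOffsets(OffsetVec Current, int64_t ConstantOffset,
                         const std::vector<OffsetVec> &Groups) {
  for (int64_t &O : Current)
    O += ConstantOffset;
  for (const OffsetVec &Group : Groups) {
    // Must start empty: the unshifted Current is not part of the result.
    OffsetVec Union;
    for (int64_t K : Group)
      for (int64_t C : Current)
        Union.push_back(C + K);
    Current = std::move(Union);
  }
  return Current;
}
```

For example, incoming offsets {0} with a constant part of 4 and one select over {0, 8} yield {4, 12}. The result may contain duplicates and is not sorted, which matches the deferred normalization noted in the OffsetInfo comments.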

ChangeStatus AAPointerInfoFloating::updateImpl(Attributor &A) {		ChangeStatus AAPointerInfoFloating::updateImpl(Attributor &A) {
using namespace AA::PointerInfo;		using namespace AA::PointerInfo;
ChangeStatus Changed = ChangeStatus::UNCHANGED;		ChangeStatus Changed = ChangeStatus::UNCHANGED;
		const DataLayout &DL = A.getDataLayout();
Value &AssociatedValue = getAssociatedValue();		Value &AssociatedValue = getAssociatedValue();

const DataLayout &DL = A.getDataLayout();
DenseMap<Value *, OffsetInfo> OffsetInfoMap;		DenseMap<Value *, OffsetInfo> OffsetInfoMap;
OffsetInfoMap[&AssociatedValue] = OffsetInfo{0};		OffsetInfoMap[&AssociatedValue].insert(0);

auto HandlePassthroughUser = [&](Value *Usr, const OffsetInfo &PtrOI,		auto HandlePassthroughUser = [&](Value *Usr, const OffsetInfo &PtrOI,
bool &Follow) {		bool &Follow) {
assert(PtrOI.Offset != AA::RangeTy::Unassigned &&		assert(!PtrOI.isUnassigned() &&
"Cannot pass through if the input Ptr was not visited!");		"Cannot pass through if the input Ptr was not visited!");
OffsetInfoMap[Usr] = PtrOI;		OffsetInfoMap[Usr] = PtrOI;
Follow = true;		Follow = true;
return true;		return true;
};		};

const auto *TLI =		const auto *TLI =
getAnchorScope()		getAnchorScope()
[...]	if (ConstantExpr *CE = dyn_cast<ConstantExpr>(Usr)) {
return false;		return false;
}		}
}		}
if (auto *GEP = dyn_cast<GEPOperator>(Usr)) {		if (auto *GEP = dyn_cast<GEPOperator>(Usr)) {
// Note the order here, the Usr access might change the map, CurPtr is		// Note the order here, the Usr access might change the map, CurPtr is
// already in it though.		// already in it though.
auto &UsrOI = OffsetInfoMap[Usr];		auto &UsrOI = OffsetInfoMap[Usr];
auto &PtrOI = OffsetInfoMap[CurPtr];		auto &PtrOI = OffsetInfoMap[CurPtr];
UsrOI = PtrOI;

// TODO: Use range information.
APInt GEPOffset(DL.getIndexTypeSizeInBits(GEP->getType()), 0);
if (PtrOI.Offset == AA::RangeTy::Unknown ||
!GEP->accumulateConstantOffset(DL, GEPOffset)) {
LLVM_DEBUG(dbgs() << "[AAPointerInfo] GEP offset not constant " << *GEP
<< "\n");
UsrOI.Offset = AA::RangeTy::Unknown;
Follow = true;		Follow = true;

		if (PtrOI.isUnknown()) {
		UsrOI.setUnknown();
return true;		return true;
}		}

		unsigned BitWidth = DL.getIndexTypeSizeInBits(GEP->getType());
		MapVector<Value *, APInt> VariableOffsets;
		APInt ConstantOffset(BitWidth, 0);
		if (!GEP->collectOffset(DL, BitWidth, VariableOffsets, ConstantOffset)) {
		UsrOI.setUnknown();
		return true;
		}

		UsrOI = PtrOI;
		if (VariableOffsets.empty()) {
LLVM_DEBUG(dbgs() << "[AAPointerInfo] GEP offset is constant " << *GEP		LLVM_DEBUG(dbgs() << "[AAPointerInfo] GEP offset is constant " << *GEP
<< "\n");		<< "\n");
UsrOI.Offset = PtrOI.Offset + GEPOffset.getZExtValue();		UsrOI.addToAll(ConstantOffset.getZExtValue());
Follow = true;		} else if (!handleConstantSelect(UsrOI, ConstantOffset.getZExtValue(),
		VariableOffsets)) {
		LLVM_DEBUG(dbgs() << "[AAPointerInfo] GEP offset is not constant "
		<< *GEP << "\n");
		UsrOI.setUnknown();
		}

return true;		return true;
		jdoerfert: I think we might want to call `Attributor::getPotentialValues` on the variable offsets. It should handle select and phi and more, e.g., loads. Maybe in addition to `Attributor::getAssumedConstant` we should have `Attributor::getAssumedConstantValues`. I want to avoid yet another traversal of some instructions in favor of common interfaces.
}		}
if (isa<PtrToIntInst>(Usr))		if (isa<PtrToIntInst>(Usr))
return false;		return false;
if (isa<CastInst>(Usr) || isa<SelectInst>(Usr) || isa<ReturnInst>(Usr))		if (isa<CastInst>(Usr) || isa<SelectInst>(Usr) || isa<ReturnInst>(Usr))
return HandlePassthroughUser(Usr, OffsetInfoMap[CurPtr], Follow);		return HandlePassthroughUser(Usr, OffsetInfoMap[CurPtr], Follow);

// For PHIs we need to take care of the recurrence explicitly as the value		// For PHIs we need to take care of the recurrence explicitly as the value
// might change while we iterate through a loop. For now, we give up if		// might change while we iterate through a loop. For now, we give up if
// the PHI is not invariant.		// the PHI is not invariant.
if (isa<PHINode>(Usr)) {		if (isa<PHINode>(Usr)) {
// Note the order here, the Usr access might change the map, CurPtr is		// Note the order here, the Usr access might change the map, CurPtr is
// already in it though.		// already in it though.
bool IsFirstPHIUser = !OffsetInfoMap.count(Usr);		bool IsFirstPHIUser = !OffsetInfoMap.count(Usr);
auto &UsrOI = OffsetInfoMap[Usr];		auto &UsrOI = OffsetInfoMap[Usr];
auto &PtrOI = OffsetInfoMap[CurPtr];		auto &PtrOI = OffsetInfoMap[CurPtr];

// Check if the PHI operand has already an unknown offset as we can't		// Check if the PHI operand has already an unknown offset as we can't
// improve on that anymore.		// improve on that anymore.
if (PtrOI.Offset == AA::RangeTy::Unknown) {		if (PtrOI.isUnknown()) {
LLVM_DEBUG(dbgs() << "[AAPointerInfo] PHI operand offset unknown "		LLVM_DEBUG(dbgs() << "[AAPointerInfo] PHI operand offset unknown "
<< CurPtr << " in " << Usr << "\n");		<< CurPtr << " in " << Usr << "\n");
Follow = UsrOI.Offset != AA::RangeTy::Unknown;		Follow = !UsrOI.isUnknown();
UsrOI = PtrOI;		UsrOI.setUnknown();
return true;		return true;
}		}

// Check if the PHI is invariant (so far).		// Check if the PHI is invariant (so far).
if (UsrOI == PtrOI) {		if (UsrOI == PtrOI) {
assert(PtrOI.Offset != AA::RangeTy::Unassigned &&		assert(!PtrOI.isUnassigned() &&
"Cannot assign if the current Ptr was not visited!");		"Cannot assign if the current Ptr was not visited!");
LLVM_DEBUG(dbgs() << "[AAPointerInfo] PHI is invariant (so far)");		LLVM_DEBUG(dbgs() << "[AAPointerInfo] PHI is invariant (so far)");
return true;		return true;
}		}

// Check if the PHI operand is not dependent on the PHI itself.		// Check if the PHI operand is not dependent on the PHI itself.
APInt Offset(		APInt Offset(
DL.getIndexSizeInBits(CurPtr->getType()->getPointerAddressSpace()),		DL.getIndexSizeInBits(CurPtr->getType()->getPointerAddressSpace()),
0);		0);
Value *CurPtrBase = CurPtr->stripAndAccumulateConstantOffsets(		Value *CurPtrBase = CurPtr->stripAndAccumulateConstantOffsets(
DL, Offset, /* AllowNonInbounds */ true);		DL, Offset, /* AllowNonInbounds */ true);
auto It = OffsetInfoMap.find(CurPtrBase);		auto It = OffsetInfoMap.find(CurPtrBase);
if (It != OffsetInfoMap.end()) {		if (It != OffsetInfoMap.end()) {
Offset += It->getSecond().Offset;		auto BaseOI = It->getSecond();
if (IsFirstPHIUser || Offset == UsrOI.Offset)		BaseOI.addToAll(Offset.getZExtValue());
		jdoerfert: SExt, I think. -1 is a fine offset.
		if (IsFirstPHIUser || BaseOI == UsrOI) {
		LLVM_DEBUG(dbgs() << "[AAPointerInfo] PHI is invariant " << *CurPtr
		<< " in " << *Usr << "\n");
return HandlePassthroughUser(Usr, PtrOI, Follow);		return HandlePassthroughUser(Usr, PtrOI, Follow);
		}
LLVM_DEBUG(		LLVM_DEBUG(
dbgs() << "[AAPointerInfo] PHI operand pointer offset mismatch "		dbgs() << "[AAPointerInfo] PHI operand pointer offset mismatch "
<< CurPtr << " in " << Usr << "\n");		<< CurPtr << " in " << Usr << "\n");
} else {		} else {
LLVM_DEBUG(dbgs() << "[AAPointerInfo] PHI operand is too complex "		LLVM_DEBUG(dbgs() << "[AAPointerInfo] PHI operand is too complex "
<< CurPtr << " in " << Usr << "\n");		<< CurPtr << " in " << Usr << "\n");
}		}

// TODO: Approximate in case we know the direction of the recurrence.		// TODO: Approximate in case we know the direction of the recurrence.
UsrOI = PtrOI;		UsrOI.setUnknown();
UsrOI.Offset = AA::RangeTy::Unknown;
Follow = true;		Follow = true;
return true;		return true;
}		}
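
The reviewer's note on sign extension in the PHI handling above matters whenever the index type is narrower than 64 bits: zero-extending a negative offset turns it into a large positive one. The sketch below illustrates this for an assumed 32-bit index width using plain integers. `zext32` and `sext32` are hypothetical helpers standing in for the zero- and sign-extending reads of an APInt.

```cpp
#include <cstdint>

// Zero-extend a 32-bit pattern to 64 bits: the high bits are filled with 0.
int64_t zext32(uint32_t Raw) {
  return static_cast<int64_t>(static_cast<uint64_t>(Raw));
}

// Sign-extend a 32-bit pattern to 64 bits: bit 31 is copied into the high
// bits. Written with xor and subtract to avoid implementation-defined casts.
int64_t sext32(uint32_t Raw) {
  return static_cast<int64_t>(Raw ^ 0x80000000u) - 0x80000000LL;
}
```

With the bit pattern 0xFFFFFFFF, which encodes -1 in 32 bits, the zero-extended value is 4294967295 while the sign-extended value is -1. Both helpers agree for non-negative inputs.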

if (auto *LoadI = dyn_cast<LoadInst>(Usr)) {		if (auto *LoadI = dyn_cast<LoadInst>(Usr)) {
// If the access is to a pointer that may or may not be the associated		// If the access is to a pointer that may or may not be the associated
// value, e.g. due to a PHI, we cannot assume it will be read.		// value, e.g. due to a PHI, we cannot assume it will be read.
AccessKind AK = AccessKind::AK_R;		AccessKind AK = AccessKind::AK_R;
if (getUnderlyingObject(CurPtr) == &AssociatedValue)		if (getUnderlyingObject(CurPtr) == &AssociatedValue)
AK = AccessKind(AK | AccessKind::AK_MUST);		AK = AccessKind(AK | AccessKind::AK_MUST);
else		else
AK = AccessKind(AK | AccessKind::AK_MAY);		AK = AccessKind(AK | AccessKind::AK_MAY);
return handleAccess(A, *LoadI, /* Content */ nullptr, AK,		if (!handleAccess(A, *LoadI, /* Content */ nullptr, AK,
OffsetInfoMap[CurPtr].Offset, Changed,		OffsetInfoMap[CurPtr].Offsets, Changed,
LoadI->getType());		LoadI->getType()))
		return false;
		return true;
		jdoerfert: Nit: Unsure why we need two returns here.
		sameerds (Author): I missed this. If there are no other changes required, I can fix this locally before submitting.
}		}

auto HandleStoreLike = [&](Instruction &I, Value *ValueOp, Type &ValueTy,		auto HandleStoreLike = [&](Instruction &I, Value *ValueOp, Type &ValueTy,
ArrayRef<Value *> OtherOps, AccessKind AK) {		ArrayRef<Value *> OtherOps, AccessKind AK) {
for (auto *OtherOp : OtherOps) {		for (auto *OtherOp : OtherOps) {
if (OtherOp == CurPtr) {		if (OtherOp == CurPtr) {
LLVM_DEBUG(		LLVM_DEBUG(
dbgs()		dbgs()
Show All 9 Lines	auto HandleStoreLike = [&](Instruction &I, Value *ValueOp, Type &ValueTy,
AK = AccessKind(AK | AccessKind::AK_MUST);		AK = AccessKind(AK | AccessKind::AK_MUST);
else		else
AK = AccessKind(AK | AccessKind::AK_MAY);		AK = AccessKind(AK | AccessKind::AK_MAY);
bool UsedAssumedInformation = false;		bool UsedAssumedInformation = false;
Optional<Value *> Content = nullptr;		Optional<Value *> Content = nullptr;
if (ValueOp)		if (ValueOp)
Content = A.getAssumedSimplified(		Content = A.getAssumedSimplified(
ValueOp, this, UsedAssumedInformation, AA::Interprocedural);		ValueOp, this, UsedAssumedInformation, AA::Interprocedural);
return handleAccess(A, I, Content, AK, OffsetInfoMap[CurPtr].Offset,		if (!handleAccess(A, I, Content, AK, OffsetInfoMap[CurPtr].Offsets,
Changed, &ValueTy);		Changed, &ValueTy))
		return false;
		return true;
		jdoerfert: Same as above
};		};

if (auto *StoreI = dyn_cast<StoreInst>(Usr))		if (auto *StoreI = dyn_cast<StoreInst>(Usr))
return HandleStoreLike(*StoreI, StoreI->getValueOperand(),		return HandleStoreLike(*StoreI, StoreI->getValueOperand(),
*StoreI->getValueOperand()->getType(),		*StoreI->getValueOperand()->getType(),
{StoreI->getValueOperand()}, AccessKind::AK_W);		{StoreI->getValueOperand()}, AccessKind::AK_W);
if (auto *RMWI = dyn_cast<AtomicRMWInst>(Usr))		if (auto *RMWI = dyn_cast<AtomicRMWInst>(Usr))
return HandleStoreLike(*RMWI, nullptr, *RMWI->getValOperand()->getType(),		return HandleStoreLike(*RMWI, nullptr, *RMWI->getValOperand()->getType(),
Show All 9 Lines	if (auto *CB = dyn_cast<CallBase>(Usr)) {
return true;		return true;
if (getFreedOperand(CB, TLI) == U)		if (getFreedOperand(CB, TLI) == U)
return true;		return true;
if (CB->isArgOperand(&U)) {		if (CB->isArgOperand(&U)) {
unsigned ArgNo = CB->getArgOperandNo(&U);		unsigned ArgNo = CB->getArgOperandNo(&U);
const auto &CSArgPI = A.getAAFor<AAPointerInfo>(		const auto &CSArgPI = A.getAAFor<AAPointerInfo>(
*this, IRPosition::callsite_argument(*CB, ArgNo),		*this, IRPosition::callsite_argument(*CB, ArgNo),
DepClassTy::REQUIRED);		DepClassTy::REQUIRED);
Changed = translateAndAddState(A, CSArgPI, OffsetInfoMap[CurPtr].Offset,		Changed = translateAndAddState(A, CSArgPI, OffsetInfoMap[CurPtr], *CB) |
*CB) |
Changed;		Changed;
return isValidState();		return isValidState();
}		}
LLVM_DEBUG(dbgs() << "[AAPointerInfo] Call user not handled " << *CB		LLVM_DEBUG(dbgs() << "[AAPointerInfo] Call user not handled " << *CB
<< "\n");		<< "\n");
// TODO: Allow some call uses		// TODO: Allow some call uses
return false;		return false;
}		}

LLVM_DEBUG(dbgs() << "[AAPointerInfo] User not handled " << *Usr << "\n");		LLVM_DEBUG(dbgs() << "[AAPointerInfo] User not handled " << *Usr << "\n");
return false;		return false;
};		};
auto EquivalentUseCB = [&](const Use &OldU, const Use &NewU) {		auto EquivalentUseCB = [&](const Use &OldU, const Use &NewU) {
if (OffsetInfoMap.count(NewU)) {		if (OffsetInfoMap.count(NewU)) {
LLVM_DEBUG({		LLVM_DEBUG({
if (!(OffsetInfoMap[NewU] == OffsetInfoMap[OldU])) {		if (!(OffsetInfoMap[NewU] == OffsetInfoMap[OldU])) {
dbgs() << "[AAPointerInfo] Equivalent use callback failed: "		dbgs() << "[AAPointerInfo] Equivalent use callback failed: "
<< OffsetInfoMap[NewU].Offset << " vs "		<< OffsetInfoMap[NewU] << " vs " << OffsetInfoMap[OldU]
<< OffsetInfoMap[OldU].Offset << "\n";		<< "\n";
}		}
});		});
return OffsetInfoMap[NewU] == OffsetInfoMap[OldU];		return OffsetInfoMap[NewU] == OffsetInfoMap[OldU];
}		}
OffsetInfoMap[NewU] = OffsetInfoMap[OldU];		OffsetInfoMap[NewU] = OffsetInfoMap[OldU];
return true;		return true;
};		};
if (!A.checkForAllUses(UsePred, *this, AssociatedValue,		if (!A.checkForAllUses(UsePred, *this, AssociatedValue,
▲ Show 20 Lines • Show All 63 Lines • ▼ Show 20 Lines	if (auto *MI = dyn_cast_or_null<MemIntrinsic>(getCtxI())) {
if (ArgNo > 1) {		if (ArgNo > 1) {
LLVM_DEBUG(dbgs() << "[AAPointerInfo] Unhandled memory intrinsic "		LLVM_DEBUG(dbgs() << "[AAPointerInfo] Unhandled memory intrinsic "
<< *MI << "\n");		<< *MI << "\n");
return indicatePessimisticFixpoint();		return indicatePessimisticFixpoint();
} else {		} else {
auto Kind =		auto Kind =
ArgNo == 0 ? AccessKind::AK_MUST_WRITE : AccessKind::AK_MUST_READ;		ArgNo == 0 ? AccessKind::AK_MUST_WRITE : AccessKind::AK_MUST_READ;
Changed =		Changed =
Changed | addAccess(A, 0, LengthVal, *MI, nullptr, Kind, nullptr);		Changed | addAccess(A, {0, LengthVal}, *MI, nullptr, Kind, nullptr);
}		}
LLVM_DEBUG({		LLVM_DEBUG({
dbgs() << "Accesses by bin after update:\n";		dbgs() << "Accesses by bin after update:\n";
dumpState(dbgs());		dumpState(dbgs());
});		});

return Changed;		return Changed;
}		}
Show All 21 Lines	if (!NoCaptureAA.isAssumedNoCapture())
return indicatePessimisticFixpoint();		return indicatePessimisticFixpoint();

bool IsKnown = false;		bool IsKnown = false;
if (AA::isAssumedReadNone(A, getIRPosition(), *this, IsKnown))		if (AA::isAssumedReadNone(A, getIRPosition(), *this, IsKnown))
return ChangeStatus::UNCHANGED;		return ChangeStatus::UNCHANGED;
bool ReadOnly = AA::isAssumedReadOnly(A, getIRPosition(), *this, IsKnown);		bool ReadOnly = AA::isAssumedReadOnly(A, getIRPosition(), *this, IsKnown);
auto Kind =		auto Kind =
ReadOnly ? AccessKind::AK_MAY_READ : AccessKind::AK_MAY_READ_WRITE;		ReadOnly ? AccessKind::AK_MAY_READ : AccessKind::AK_MAY_READ_WRITE;
return addAccess(A, AA::RangeTy::Unknown, AA::RangeTy::Unknown, *getCtxI(),		return addAccess(A, AA::RangeTy::getUnknown(), *getCtxI(), nullptr, Kind,
nullptr, Kind, nullptr);		nullptr);
}		}

/// See AbstractAttribute::trackStatistics()		/// See AbstractAttribute::trackStatistics()
void trackStatistics() const override {		void trackStatistics() const override {
AAPointerInfoImpl::trackPointerInfoStatistics(getIRPosition());		AAPointerInfoImpl::trackPointerInfoStatistics(getIRPosition());
}		}
};		};

▲ Show 20 Lines • Show All 9,433 Lines • Show Last 20 Lines

llvm/test/CodeGen/AMDGPU/attributor-through-select.ll

This file was added.

				; RUN: llc < %s | FileCheck %s

				target triple = "amdgcn-amd-amdhsa"

				; The call to intrinsic implicitarg_ptr is combined with an offset produced by
				; select'ing between two constants, before it is eventually used in a GEP to
				; form the address of a load. This test ensures that AAPointerInfo can look
				; through the select to maintain a set of indices, so that it can precisely
				; determine that hostcall and other expensive implicit args are not in use.

				; CHECK: amdhsa.kernels:
				; CHECK-NOT: .value_kind: hidden_hostcall_buffer
				; CHECK-NOT: .value_kind: hidden_multigrid_sync_arg
				; CHECK: .name: the_kernel

				define protected amdgpu_kernel void @the_kernel(float addrspace(1)* nocapture noundef readonly %a.coerce, float addrspace(1)* nocapture noundef %d.coerce) local_unnamed_addr {
				entry:
				%0 = tail call i8 addrspace(4)* @llvm.amdgcn.implicitarg.ptr()
				%1 = tail call i32 @llvm.amdgcn.workgroup.id.x()
				%2 = bitcast i8 addrspace(4)* %0 to i32 addrspace(4)*
				%3 = load i32, i32 addrspace(4)* %2, align 4
				%4 = icmp ult i32 %1, %3
				%5 = select i1 %4, i64 12, i64 18
				%6 = getelementptr inbounds i8, i8 addrspace(4)* %0, i64 %5
				%7 = bitcast i8 addrspace(4)* %6 to i16 addrspace(4)*
				%8 = load i16, i16 addrspace(4)* %7, align 2
				%conv.i.i = zext i16 %8 to i32
				%mul = mul i32 %1, %conv.i.i
				%9 = tail call i32 @llvm.amdgcn.workitem.id.x()
				%add = add i32 %mul, %9
				%idx.ext = sext i32 %add to i64
				%add.ptr3 = getelementptr inbounds float, float addrspace(1)* %d.coerce, i64 %idx.ext
				%arrayidx4 = getelementptr inbounds float, float addrspace(1)* %a.coerce, i64 %idx.ext
				%10 = load float, float addrspace(1)* %arrayidx4, align 4
				%11 = atomicrmw fadd float addrspace(1)* %add.ptr3, float %10 syncscope("agent-one-as") monotonic, align 4
				ret void
				}

				declare i32 @llvm.amdgcn.workitem.id.x()

				declare align 4 i8 addrspace(4)* @llvm.amdgcn.implicitarg.ptr()

				declare i32 @llvm.amdgcn.workgroup.id.x()
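
As a rough illustration of the comment at the top of this test, the sketch below models how a set of constant offsets, combined with an access size, turns into a set of byte ranges that can be checked against another region of the implicit argument block. It is a standalone C++ model, not the Attributor API: ByteRange, accessedRanges, and mayTouch are hypothetical names, and the probed ranges in the note that follows are arbitrary.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <utility>
#include <vector>

// Hypothetical model, not LLVM's API: a half-open byte range [begin, end).
using ByteRange = std::pair<int64_t, int64_t>;

// Produce one byte range per candidate offset for an access of `size` bytes.
std::vector<ByteRange> accessedRanges(const std::set<int64_t> &offsets,
                                      int64_t size) {
  assert(size > 0 && "access size must be positive");
  std::vector<ByteRange> ranges;
  for (int64_t off : offsets)
    ranges.emplace_back(off, off + size);
  return ranges;
}

// Return true if any candidate range overlaps [begin, begin + size).
bool mayTouch(const std::vector<ByteRange> &ranges, int64_t begin,
              int64_t size) {
  for (const ByteRange &r : ranges)
    if (r.first < begin + size && begin < r.second)
      return true;
  return false;
}
```

For the i16 load above, accessedRanges({12, 18}, 2) yields the ranges [12, 14) and [18, 20), so a probe such as mayTouch(ranges, 14, 4) is false. With a single Unknown offset, every such probe would have to be answered conservatively.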

llvm/test/Transforms/Attributor/value-simplify-pointer-info.ll

Show First 20 Lines • Show All 3,101 Lines • ▼ Show 20 Lines
; CGSCC-NEXT: ret void		; CGSCC-NEXT: ret void
;		;
%l = load i32, i32* %a		%l = load i32, i32* %a
%sel = select i1 %c, i32 %l, i32 42		%sel = select i1 %c, i32 %l, i32 42
store i32 %sel, i32* %a		store i32 %sel, i32* %a
ret void		ret void
}		}

		define i8 @multiple_offsets_simplifiable_1(i1 %cnd1, i1 %cnd2) {
		; CHECK: Function Attrs: nofree norecurse nosync nounwind willreturn memory(none)
		; CHECK-LABEL: define {{[^@]+}}@multiple_offsets_simplifiable_1
		; CHECK-SAME: (i1 [[CND1:%.*]], i1 [[CND2:%.*]]) #[[ATTR4]] {
		; CHECK-NEXT: entry:
		; CHECK-NEXT: [[BYTES:%.*]] = alloca [1024 x i8], align 16
		; CHECK-NEXT: [[SEL0:%.*]] = select i1 [[CND1]], i64 23, i64 29
		; CHECK-NEXT: [[SEL1:%.*]] = select i1 [[CND2]], i64 [[SEL0]], i64 7
		; CHECK-NEXT: [[GEP31:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 31
		; CHECK-NEXT: [[GEP_SEL:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 [[SEL1]]
		; CHECK-NEXT: [[I:%.*]] = load i8, i8* [[GEP_SEL]], align 4
		; CHECK-NEXT: ret i8 [[I]]
		;
		entry:
		%Bytes = alloca [1024 x i8], align 16
		%sel0 = select i1 %cnd1, i64 23, i64 29
		%sel1 = select i1 %cnd2, i64 %sel0, i64 7
		%gep31 = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 31
		store i8 42, i8* %gep31, align 4
		%gep.sel = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 %sel1
		%i = load i8, i8* %gep.sel, align 4
		ret i8 %i
		}

		define i8 @multiple_offsets_simplifiable_2(i1 %cnd1, i1 %cnd2) {
		; CHECK: Function Attrs: nofree norecurse nosync nounwind willreturn memory(none)
		; CHECK-LABEL: define {{[^@]+}}@multiple_offsets_simplifiable_2
		; CHECK-SAME: (i1 [[CND1:%.*]], i1 [[CND2:%.*]]) #[[ATTR4]] {
		; CHECK-NEXT: entry:
		; CHECK-NEXT: [[BYTES:%.*]] = alloca [1024 x i8], align 16
		; CHECK-NEXT: [[SEL0:%.*]] = select i1 [[CND1]], i64 23, i64 29
		; CHECK-NEXT: [[SEL1:%.*]] = select i1 [[CND2]], i64 [[SEL0]], i64 7
		; CHECK-NEXT: [[GEP31:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 31
		; CHECK-NEXT: [[GEP_SEL:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 [[SEL1]]
		; CHECK-NEXT: [[GEP_PLUS:%.*]] = getelementptr inbounds i8, i8* [[GEP_SEL]], i64 3
		; CHECK-NEXT: [[I:%.*]] = load i8, i8* [[GEP_PLUS]], align 4
		; CHECK-NEXT: ret i8 [[I]]
		;
		entry:
		%Bytes = alloca [1024 x i8], align 16
		%sel0 = select i1 %cnd1, i64 23, i64 29
		%sel1 = select i1 %cnd2, i64 %sel0, i64 7
		%gep31 = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 31
		store i8 42, i8* %gep31, align 4
		%gep.sel = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 %sel1
		%gep.plus = getelementptr inbounds i8, i8* %gep.sel, i64 3
		%i = load i8, i8* %gep.plus, align 4
		ret i8 %i
		}

define i8 @multiple_offsets_not_simplifiable_1(i1 %cnd1, i1 %cnd2) {		define i8 @multiple_offsets_not_simplifiable_1(i1 %cnd1, i1 %cnd2) {
; TUNIT: Function Attrs: nofree norecurse nosync nounwind willreturn		; TUNIT: Function Attrs: nofree norecurse nosync nounwind willreturn
; TUNIT-LABEL: define {{[^@]+}}@multiple_offsets_not_simplifiable_1		; TUNIT-LABEL: define {{[^@]+}}@multiple_offsets_not_simplifiable_1
; TUNIT-SAME: (i1 [[CND1:%.*]], i1 [[CND2:%.*]]) #[[ATTR3]] {		; TUNIT-SAME: (i1 [[CND1:%.*]], i1 [[CND2:%.*]]) #[[ATTR3]] {
; TUNIT-NEXT: entry:		; TUNIT-NEXT: entry:
; TUNIT-NEXT: [[BYTES:%.*]] = alloca [1024 x i8], align 16		; TUNIT-NEXT: [[BYTES:%.*]] = alloca [1024 x i8], align 16
; TUNIT-NEXT: [[GEP7:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 7		; TUNIT-NEXT: [[GEP7:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 7
; TUNIT-NEXT: [[GEP23:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 23		; TUNIT-NEXT: [[GEP23:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 23
Show All 20 Lines	entry:
%gep23 = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 23		%gep23 = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 23
; %phi.ptr = phi i8* [ %gep7, %then ], [ %gep23, %else ]		; %phi.ptr = phi i8* [ %gep7, %then ], [ %gep23, %else ]
%sel.ptr = select i1 %cnd1, i8* %gep7, i8* %gep23		%sel.ptr = select i1 %cnd1, i8* %gep7, i8* %gep23
store i8 42, i8* %sel.ptr, align 4		store i8 42, i8* %sel.ptr, align 4
%i = load i8, i8* %gep7, align 4		%i = load i8, i8* %gep7, align 4
ret i8 %i		ret i8 %i
}		}

		define i8 @multiple_offsets_not_simplifiable_2(i1 %cnd1, i1 %cnd2) {
		; CHECK: Function Attrs: nofree norecurse nosync nounwind willreturn memory(none)
		; CHECK-LABEL: define {{[^@]+}}@multiple_offsets_not_simplifiable_2
		; CHECK-SAME: (i1 [[CND1:%.*]], i1 [[CND2:%.*]]) #[[ATTR4]] {
		; CHECK-NEXT: entry:
		; CHECK-NEXT: [[BYTES:%.*]] = alloca [1024 x i8], align 16
		; CHECK-NEXT: [[SEL0:%.*]] = select i1 [[CND1]], i64 23, i64 29
		; CHECK-NEXT: [[SEL1:%.*]] = select i1 [[CND2]], i64 [[SEL0]], i64 7
		; CHECK-NEXT: [[GEP23:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 23
		; CHECK-NEXT: store i8 100, i8* [[GEP23]], align 4
		; CHECK-NEXT: [[GEP_SEL:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 [[SEL1]]
		; CHECK-NEXT: [[I:%.*]] = load i8, i8* [[GEP_SEL]], align 4
		; CHECK-NEXT: ret i8 [[I]]
		;
		entry:
		%Bytes = alloca [1024 x i8], align 16
		%sel0 = select i1 %cnd1, i64 23, i64 29
		%sel1 = select i1 %cnd2, i64 %sel0, i64 7
		%gep23 = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 23
		store i8 100, i8* %gep23, align 4
		%gep.sel = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 %sel1
		%i = load i8, i8* %gep.sel, align 4
		ret i8 %i
		}

		define i8 @multiple_offsets_not_simplifiable_3(i1 %cnd1, i1 %cnd2) {
		jdoerfert: I think this is the same problem, we do not handle LoadInst and should just call fillSetWithConstantValues on the load value itself.
		; CHECK: Function Attrs: nofree norecurse nosync nounwind willreturn memory(none)
		; CHECK-LABEL: define {{[^@]+}}@multiple_offsets_not_simplifiable_3
		; CHECK-SAME: (i1 [[CND1:%.*]], i1 [[CND2:%.*]]) #[[ATTR4]] {
		; CHECK-NEXT: entry:
		; CHECK-NEXT: [[BYTES:%.*]] = alloca [1024 x i8], align 16
		; CHECK-NEXT: [[SEL0:%.*]] = select i1 [[CND1]], i64 23, i64 29
		; CHECK-NEXT: [[SEL1:%.*]] = select i1 [[CND2]], i64 [[SEL0]], i64 7
		; CHECK-NEXT: [[GEP32:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 32
		; CHECK-NEXT: store i8 100, i8* [[GEP32]], align 16
		; CHECK-NEXT: [[GEP_SEL:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 [[SEL1]]
		; CHECK-NEXT: [[GEP_PLUS:%.*]] = getelementptr inbounds i8, i8* [[GEP_SEL]], i64 3
		; CHECK-NEXT: [[I:%.*]] = load i8, i8* [[GEP_PLUS]], align 4
		; CHECK-NEXT: ret i8 [[I]]
		;
		entry:
		%Bytes = alloca [1024 x i8], align 16
		%sel0 = select i1 %cnd1, i64 23, i64 29
		%sel1 = select i1 %cnd2, i64 %sel0, i64 7
		%gep32 = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 32
		store i8 100, i8* %gep32, align 4
		%gep.sel = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 %sel1
		%gep.plus = getelementptr inbounds i8, i8* %gep.sel, i64 3
		%i = load i8, i8* %gep.plus, align 4
		ret i8 %i
		}

		define i8 @multiple_offsets_not_simplifiable_4(i1 %cnd1, i1 %cnd2) {
		; CHECK: Function Attrs: nofree norecurse nosync nounwind willreturn memory(none)
		; CHECK-LABEL: define {{[^@]+}}@multiple_offsets_not_simplifiable_4
		; CHECK-SAME: (i1 [[CND1:%.*]], i1 [[CND2:%.*]]) #[[ATTR4]] {
		; CHECK-NEXT: entry:
		; CHECK-NEXT: [[BYTES:%.*]] = alloca [1024 x i8], align 16
		; CHECK-NEXT: [[SEL0:%.*]] = select i1 [[CND1]], i64 23, i64 29
		; CHECK-NEXT: [[SEL1:%.*]] = select i1 [[CND2]], i64 [[SEL0]], i64 7
		; CHECK-NEXT: [[GEP_SEL:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 [[SEL1]]
		; CHECK-NEXT: store i8 100, i8* [[GEP_SEL]], align 4
		; CHECK-NEXT: [[GEP29:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 29
		; CHECK-NEXT: [[I:%.*]] = load i8, i8* [[GEP29]], align 4
		; CHECK-NEXT: ret i8 [[I]]
		;
		entry:
		%Bytes = alloca [1024 x i8], align 16
		%sel0 = select i1 %cnd1, i64 23, i64 29
		%sel1 = select i1 %cnd2, i64 %sel0, i64 7
		%gep.sel = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 %sel1
		store i8 100, i8* %gep.sel, align 4
		%gep29 = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 29
		%i = load i8, i8* %gep29, align 4
		ret i8 %i
		}

		define i8 @multiple_offsets_not_simplifiable_5(i1 %cnd1, i1 %cnd2) {
		; CHECK: Function Attrs: nofree norecurse nosync nounwind willreturn memory(none)
		; CHECK-LABEL: define {{[^@]+}}@multiple_offsets_not_simplifiable_5
		; CHECK-SAME: (i1 [[CND1:%.*]], i1 [[CND2:%.*]]) #[[ATTR4]] {
		; CHECK-NEXT: entry:
		; CHECK-NEXT: [[BYTES:%.*]] = alloca [1024 x i8], align 16
		; CHECK-NEXT: [[SEL0:%.*]] = select i1 [[CND1]], i64 23, i64 29
		; CHECK-NEXT: [[SEL1:%.*]] = select i1 [[CND2]], i64 [[SEL0]], i64 7
		; CHECK-NEXT: [[GEP_SEL:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 [[SEL1]]
		; CHECK-NEXT: [[GEP_PLUS:%.*]] = getelementptr inbounds i8, i8* [[GEP_SEL]], i64 3
		; CHECK-NEXT: store i8 100, i8* [[GEP_PLUS]], align 4
		; CHECK-NEXT: [[GEP32:%.*]] = getelementptr inbounds [1024 x i8], [1024 x i8]* [[BYTES]], i64 0, i64 32
		; CHECK-NEXT: [[I:%.*]] = load i8, i8* [[GEP32]], align 16
		; CHECK-NEXT: ret i8 [[I]]
		;
		entry:
		%Bytes = alloca [1024 x i8], align 16
		%sel0 = select i1 %cnd1, i64 23, i64 29
		%sel1 = select i1 %cnd2, i64 %sel0, i64 7
		%gep.sel = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 %sel1
		%gep.plus = getelementptr inbounds i8, i8* %gep.sel, i64 3
		store i8 100, i8* %gep.plus, align 4
		%gep32 = getelementptr inbounds [1024 x i8], [1024 x i8]* %Bytes, i64 0, i64 32
		%i = load i8, i8* %gep32, align 4
		ret i8 %i
		}
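
The candidate offsets exercised by the multiple_offsets_* tests above can be worked out by hand; the following sketch does the same arithmetic in standalone C++. It assumes single-byte accesses, as in the tests, and is only a model: OffsetSet, selectOffsets, addConstant, and storeMayReach are hypothetical helpers, not functions from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Hypothetical model: the set of constant byte offsets a pointer may have.
using OffsetSet = std::set<int64_t>;

// select(pred, a, b): the result may carry any offset of either operand.
OffsetSet selectOffsets(const OffsetSet &a, const OffsetSet &b) {
  OffsetSet out = a;
  out.insert(b.begin(), b.end());
  return out;
}

// A GEP by a constant shifts every candidate offset by that constant.
OffsetSet addConstant(const OffsetSet &base, int64_t delta) {
  OffsetSet out;
  for (int64_t off : base)
    out.insert(off + delta);
  return out;
}

// For single-byte accesses, a store at `storeOffset` can be observed by a
// load only if that offset is one of the load's candidates.
bool storeMayReach(const OffsetSet &loadOffsets, int64_t storeOffset) {
  assert(!loadOffsets.empty() && "expected at least one candidate offset");
  return loadOffsets.count(storeOffset) != 0;
}
```

With sel1 = {7, 23, 29}, a store at 31 cannot reach the load in multiple_offsets_simplifiable_1, while a store at 23 can (multiple_offsets_not_simplifiable_2). After the extra GEP by 3 the candidates become {10, 26, 32}, which misses 31 but hits 32 (multiple_offsets_not_simplifiable_3).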

!llvm.module.flags = !{!0, !1}		!llvm.module.flags = !{!0, !1}
!llvm.ident = !{!2}		!llvm.ident = !{!2}

!0 = !{i32 1, !"wchar_size", i32 4}		!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{i32 7, !"uwtable", i32 1}		!1 = !{i32 7, !"uwtable", i32 1}
!2 = !{!"clang version 13.0.0"}		!2 = !{!"clang version 13.0.0"}
!3 = !{!4, !4, i64 0}		!3 = !{!4, !4, i64 0}
▲ Show 20 Lines • Show All 142 Lines • Show Last 20 Lines
