This is an archive of the discontinued LLVM Phabricator instance.

[AAPointerInfo] track multiple constant offsets for each use
ClosedPublic

Authored by sameerds on Nov 24 2022, 2:39 AM.

Details

Summary

An expression of the form gep(base, select(pred, const1, const2)) can result
in a set of offsets instead of just one. PointerInfo can now track these sets
instead of conservatively modeling them as Unknown.

Diff Detail

Event Timeline

sameerds created this revision.Nov 24 2022, 2:39 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 24 2022, 2:39 AM
sameerds requested review of this revision.Nov 24 2022, 2:39 AM
Herald added a reviewer: sstefan1. · View Herald Transcript
Herald added a project: Restricted Project. · View Herald Transcript

Is this patch, and the ones prior in the stack, ready for review?

Is this patch, and the ones prior in the stack, ready for review?

Yes this is ready for review. It lacks tests for function calls, which I intend to provide before submitting.

sameerds updated this revision to Diff 478855.Nov 30 2022, 1:41 AM
  • Added some comments to the code
  • Removed a redundant CodeGen test
  • Improved readability in value-simplify-pointer-info.ll
  • Added tests in call-simplify-pointer-info.ll

The only "issue" I have with this is the select traversal. We should rely on AAPotentialValues here, see the comment below. Everything else makes sense and looks pretty good.

llvm/include/llvm/Transforms/IPO/Attributor.h
5148

All assertions need messages please, also elsewhere.

llvm/lib/Transforms/IPO/AttributorAttributes.cpp
1487

I think we might want to call Attributor::getPotentialValues on the variable offsets. It should handle select and phi and more, e.g., loads.
Maybe in addition to Attributor::getAssumedConstant we should have Attributor::getAssumedConstantValues.
I want to avoid yet another traversal of some instructions in favor of common interfaces.a

1512

SExt, I think. -1 is a fine offset.

1544

Nit: Unsure why we need two returns here.

1651

Same as above

sameerds updated this revision to Diff 480067.Dec 5 2022, 5:24 AM

Use AAPotentialConstantValues instead of tediously traversing SelectInst.

Can we have a test for phi and store-load propagation to verify it's working as expected (not only selects)?

llvm/lib/Transforms/IPO/AttributorAttributes.cpp
1397–1400

This doesn't make sense to me. We need to look at all VariableOffsets and decide. So return should only be present if we give up.

1407

I don;t follow why we need two extra OffsetInfo objects here. We modify NewOI anyway, no?

sameerds added inline comments.Dec 5 2022, 9:42 AM
llvm/lib/Transforms/IPO/AttributorAttributes.cpp
1397–1400

You might have missed the "not" in the condition. We want to give up and come again later if any information is not at fixed point. This is easily exercised by tests involving induction variables. The set of potential values of the induction variable keeps growing, but we should not use that set until it is fully enumerated. Any eager propagation of a non-fixed-point through PointerInfo at this point affects conclusions in other attributes that depend on it. I did not look for an example of correctness, but it does cause the attributor to retain stores that it would have otherwise removed in one existing test.

1407

On each iteration of the outer for loop over VariableOffsets, the expression is a product:

UnionOfAllCopies = NewOI x AssumedSet

CopyPerOffset is the temporary used by the inner loop to compute this product. We need UnionOfAllCopies because it must only contain modified copies of NewOI, but not NewOI itself. We can't merge the RHS into NewOI. We have to start with an empty set.

The actual output of the function is UsrOI. We do not move NewOI into that if we exit early.

I realize there is no test for this yet. Working on that ... writing a GEP by hand is hard, when it involves nested aggregate types!

jdoerfert added inline comments.Dec 5 2022, 11:06 AM
llvm/lib/Transforms/IPO/AttributorAttributes.cpp
1397–1400

We cannot wait in one AA for another to find a fixpoint. That is not sound. That is not even always possible. Even if it would be, it won't work in the current algorithm. You need to update the AA state based on the state of the other AA, always. Then signal if something changed.

That said, if we retain stores doing this properly we need to understand why.

1407

I realize there is no test for this yet. Working on that ... writing a GEP by hand is hard, when it involves nested aggregate types!

Use clang.

sameerds updated this revision to Diff 480918.Dec 7 2022, 8:07 AM
  • Remove incorrect use of isAtFixpoint. Instead, Access::operator& no longer drops content when ranges are merged.
  • Add more tests, move tests to new file for better naming.
sameerds marked 6 inline comments as done.Dec 7 2022, 8:14 AM
sameerds added inline comments.
llvm/include/llvm/Transforms/IPO/Attributor.h
5148

Can fix this locally too.

llvm/lib/Transforms/IPO/AttributorAttributes.cpp
1397–1400

You're right. What I did here was plain wrong. The root cause was that when me merge ranges in operator&= for Access objects, we conservatively drop the contents. We don't need to be so conservative ... just combining contents from the two Access objects works well in case both happen to have the same contents.

1407

Test added.

1544

I missed this. If there are no other changes required, I can fix this locally before submitting.

jdoerfert added inline comments.Dec 7 2022, 8:22 AM
llvm/include/llvm/Transforms/IPO/Attributor.h
5224–5225

It was dropping the results for a reason before, just going back on it is probably not sound either.

If we have:

range: [0-4]
value: i32 -42

and merge it with

range: [0-8]
value: i64 -42

we need to do something here.
At least, we need to change must to may but I think we cannot even keep the value if they are equal under under zext.
If we would, writing i32 0 and i64 0 would make us believe the former writes 8 bytes and they'll all be 0, which is not true.

When does this happen, maybe we need to understand the problem better.

llvm/test/Transforms/Attributor/multiple-offsets-pointer-info.ll
337

AAPotentialConstantValues uses AAPotentialValues but the case to handle PHINodes is missing here: https://github.com/llvm/llvm-project/blob/72d76a2403459a38a1d6daae62de6945097db8f9/llvm/lib/Transforms/IPO/AttributorAttributes.cpp#L9222
We just need to go over the operands and call the fillSetWithConstantValues function. We can do that in a follow up or pre-patch, just commenting here to make sure we don't forget.

llvm/test/Transforms/Attributor/value-simplify-pointer-info.ll
3154

I think this is the same problem, we do not handle LoadInst and should just call fillSetWithConstantValues on the load value itself.

sameerds marked 2 inline comments as done.Dec 7 2022, 8:57 AM
sameerds added inline comments.
llvm/include/llvm/Transforms/IPO/Attributor.h
5224–5225

The Access objects being merged here originate from the same remote instruction. Doesn't that mean that the type is guaranteed to be the same? I tried putting an assert(Ty == R.Ty), and it did not fire for the lit tests. Instead of asserting, we can always check that before calling combineInValueLattice().

llvm/test/Transforms/Attributor/multiple-offsets-pointer-info.ll
337

Exactly what I had in mind. Would prefer to do this as a follow-up.

jdoerfert added inline comments.Dec 7 2022, 9:33 AM
llvm/include/llvm/Transforms/IPO/Attributor.h
5224–5225

I'm still confused. Even if it's the same value this is a problem, no?

range: [0-4]
value: i32 4
must write

merged with

range: [4-8]
value: i32 4
must write

will result in

range: [0-8]
value: i32 4
must write

which is not true. For one, we may only write 4 out of 8 bytes, and depending which ones its not going to be 4 if you read the range [0-8].

sameerds added inline comments.Dec 7 2022, 9:28 PM
llvm/include/llvm/Transforms/IPO/Attributor.h
5224–5225

Here is what I see about the creation of Access objects:

  • Each Access is for a unique value.
  • If the value is a MemInstrinsic, then the length is known, and all ranges for that Access will have that same length.
  • Else, if the value is an argument, then the length is unknown and the Access has only one range (unknown).
  • Else, the value is an instruction with an optionally known type (see handleAccess()). If the length is known, then all ranges have the same length, else it is a single unknown range.

This invariant is maintained even when looking through function calls.

If the above is correct, then it might be redundant to even track the size for every Range in a RangeList. But that is assuming Ranges are used only for PointerInfo::Access objects. If the Range should remain generic, then we should allow the possibility that all Ranges in a RangeList are not the same size. We could add a bool "AllRangesAreSameSize" and check this when merging an Access into another Access.

So if Ranges are not the same size, then the contents are unknown. Else, if the types are the same, then combine the contents. Else the contents are unknown.

sameerds marked an inline comment as not done.Dec 7 2022, 9:39 PM
sameerds added inline comments.
llvm/include/llvm/Transforms/IPO/Attributor.h
5224–5225

Correction in the third bullet ... the length may be unknown if it is not llvm::Argument. Otherwise the invariant about looking through function calls applies.

jdoerfert added inline comments.Dec 7 2022, 9:42 PM
llvm/include/llvm/Transforms/IPO/Attributor.h
5224–5225

I'm very confused. Apologies.

So, if I understand this correctly now we will keep the value but mark the access as MAY if it has more than one range. That seems correct. My example above was merging the ranges, which we are not, and also keeping the MUST bit, which we are not. So, assuming I finally understand what is happening this should be fine.

jdoerfert accepted this revision.Dec 7 2022, 9:42 PM

LG, I think.

This revision is now accepted and ready to land.Dec 7 2022, 9:42 PM
sameerds marked an inline comment as not done.Dec 7 2022, 10:43 PM
sameerds added inline comments.
llvm/include/llvm/Transforms/IPO/Attributor.h
5224–5225

I am just glad to have your attention while I work through this! It's an important optimization for launching HIP programs.

So yes, if there are multiple ranges in an Access, they are of the same size. We keep the contents if possible, but it is always a MAY access. I have put asserts in strategic places (isMayAccess and isMustAccess) to catch that.

FWIW, I landed https://reviews.llvm.org/rG1eab2d699e9581305f32473291e6afa47017d582 and you might need to verify the tests against it. Worst case, we need to recursively call getPotentialValues.

sameerds updated this revision to Diff 481518.Dec 8 2022, 10:25 PM
sameerds marked 8 inline comments as done.
  • Rebased.
  • Added messages to assertions.
  • handleAccess expects a reference Type&, since that argument cannot be nullptr.
sameerds updated this revision to Diff 481992.Dec 11 2022, 9:50 PM
  • Rebased.
  • Yielded to clang-format's insistence in a couple of places.