Page MenuHomePhabricator

Please use GitHub pull requests for new patches. Phabricator shutdown timeline

[AggressiveInstCombine] folding load for constant global patterened arrays and structs by alignment
ClosedPublic

Authored by khei4 on Feb 20 2023, 8:10 PM.

Details

Summary

Fold loadInst for constant aggregate types, those load results are equivalent for the given alignments.

This revision will be part 1 of the following stages.

  1. alignment-based (This revision will be)
  2. GEP-based https://reviews.llvm.org/D146622
  • add ConstantFoldLoadFromPatternedAggregate method to AggressiveInstCombine

alive proofs: https://alive2.llvm.org/ce/z/qBGl72
Depends on: https://reviews.llvm.org/D145355
Fixes: https://github.com/rust-lang/rust/issues/107208.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

now ready for review again

llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
870–872

are there any better way to bitcast APInt?

901

I removed loadSize in the end

khei4 updated this revision to Diff 505377.Mar 14 2023, 11:12 PM

fix awkward indent

khei4 planned changes to this revision.Mar 16 2023, 2:16 AM

Sorry, I noticed test failed.

khei4 updated this revision to Diff 505733.Mar 16 2023, 2:20 AM

fix wrong argument

nikic requested changes to this revision.Mar 19 2023, 7:54 AM

Extracting the stride from GEPs turned out to be trickier than I expected...

llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
833

Should add a negative test using volatile.

844

This comment looks out of place, should be further down?

848

64K -> 4K

850

uint64_t

856

This is again creating a very, very wide integer :) The correct size to use here is DL.getIndexTypeSizeInBits(PtrOp->getType()) as you already do in the loop. In that case you also don't need the separate COff variable, you can let collectOffset() directly add onto this one.

858

This looks somewhat dubious -- I assume the intention here is that this is a "no-op" value for the first GreatestCommonDivisor call?

I think it would be clearer to make Stride an std::optional and then set it the first time, and use GCD for later iterations. (After the loop, if Stride is not set, it's okay to just bail out: The constant-offset case is already handled by other code, so we needn't bother with it.)

881

indeces -> indices

883

Hm, what happens if ConstOffset itself is negative? Would urem produce correct results in that case? Let's say ConstOffset = -3 and Stride = 3. Let's assume 8-bit address space to keep things simple, then ConstOffset = 253 in unsigned interpretation and 253 % 3 == 1.

Also I just realized another problem here: Just using the GCD isn't right if the GEP is not inbounds (i.e. multiplication can overflow) and the stride is not a power of two.

I believe the correct way to handle these cases is implemented in BasicAA, in particular https://github.com/llvm/llvm-project/blob/47aa1fe376a477939a1991ffe37504124af25f52/llvm/lib/Analysis/BasicAliasAnalysis.cpp#L1135-L1138 to only keep a power of two factor for non-inbounds and https://github.com/llvm/llvm-project/blob/47aa1fe376a477939a1991ffe37504124af25f52/llvm/lib/Analysis/BasicAliasAnalysis.cpp#L1168-L1170 for the final modulus.

884

A correctness check missing here is that after looking through all the GEPs, you must arrive back at GV. Otherwise there might be some kind of other operation sitting in between which preserves the underlying object, but where we can't determine the offset. As a test case, one could use a call to llvm.ptrmask for example.

893

This probably also needs to check that ConstOffset is zero? I don't think using load alignment as stride is right with a non-zero start offset.

897

It would be a bit simpler to keep using (and adding to) ConstOffset here, as we need an APInt anyway.

This revision now requires changes to proceed.Mar 19 2023, 7:54 AM
nikic added a comment.Mar 19 2023, 8:03 AM

Extracting the stride from GEPs turned out to be trickier than I expected...

If you like, we can go back to just the alignment handling for this patch, and then add the GEP handling in a separate one on top. Either way works for me.

khei4 planned changes to this revision.Mar 20 2023, 12:09 AM

Thank you for a lot of good catches!

Extracting the stride from GEPs turned out to be trickier than I expected...

If you like, we can go back to just the alignment handling for this patch, and then add the GEP handling in a separate one on top. Either way works for me.

Sounds good to me!
Then I want to separate this into following

  1. alignment-based (This revision will be)
  2. (all) inbounds GEP index-based
  3. non-inbounds included GEP index-based

TBH, I feel non-inbounds GEP handling is a little bit handful, but I'll read AA's implementation and try it first!

llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
856

Oh... I was confused with offset and the GV byte sizes! You're right!

858

This looks somewhat dubious -- I assume the intention here is that this is a "no-op" value for the first GreatestCommonDivisor call?

Yes!

I think it would be clearer to make Stride a std::optional and then set it the first time,

Sounds reasonable, I'll use that!

883

Ah, I mistakenly assumed inbounds! Without inbounds, ConstantOffset could be negative and could overflow!

884

Good catch! I haven't known underlying-object preserved operations! I'll add tests!

893

Good catch! Yeah, we should have considered GEP-stride-based and alignment-based separately also for ConstOffset.

khei4 updated this revision to Diff 506490.Mar 20 2023, 1:15 AM
khei4 retitled this revision from [AggressiveInstCombine] folding load for constant global patterened arrays and structs to (WIP) [AggressiveInstCombine] folding load for constant global patterened arrays and structs.
khei4 edited the summary of this revision. (Show Details)

update summary and address simple fixes.

khei4 updated this revision to Diff 506516.Mar 20 2023, 3:03 AM

addresss

  • ptrmasked pointer failures
  • not inbounds GEP

and refactoring

  • use APInt in loop
  • fix BitWidth for APInt
  • fix typo
nikic requested changes to this revision.Mar 21 2023, 7:33 AM

Nearly there, I think...

llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
831

Unnecessary/unusual newline.

838

getPointerOperand()` is a bit more elegant.

845

What you want to do is after the GEPOperator loop, check that the final PtrOpV is GV. This makes sure there were only GEPs.

854

"to avoid scanning too much memory" maybe -- we're not really allocating anything here.

864

This shouldn't be needed when using std::optional.

869

I don't really get what this comment is trying to say. The reason why non-inbounds GEP is tricky to handle is that there is an implicit modulo the address space size, which makes it harder to calculate the GCD.

871

You should probably extract the whole "get stride of GEP" logic into a separate function. That way, all these early bailouts don't apply for the alignment case (where we don't care about inbounds).

This revision now requires changes to proceed.Mar 21 2023, 7:33 AM
khei4 planned changes to this revision.EditedMar 21 2023, 9:55 PM

Thank you for the review! I may fuse the GEP-related plans I proposed! because I might have misunderstood non-inbounds GEP with correct alignment!

llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
845

Thanks! I'll check that before and after the loop!

864

I might have missed some patterns, but I guess it's necessary because GCD (self-referencing) calculations need its initial value as the first type size. :)

Stride = APIntOps::GreatestCommonDivisor(
    Stride,
    APInt(ConstOffset.getBitWidth(), IndexTypeSize.getZExtValue()));
871

(with above comment)
Ah, OK I might misunderstand the behavior of the load with non-inbounds GEP indices, but correct alignment!

khei4 retitled this revision from (WIP) [AggressiveInstCombine] folding load for constant global patterened arrays and structs to (WIP) [AggressiveInstCombine] folding load for constant global patterened arrays and structs by alignment.Mar 22 2023, 1:05 AM
khei4 edited the summary of this revision. (Show Details)Mar 22 2023, 1:23 AM
khei4 updated this revision to Diff 507300.Mar 22 2023, 3:30 AM
khei4 retitled this revision from (WIP) [AggressiveInstCombine] folding load for constant global patterened arrays and structs by alignment to [AggressiveInstCombine] folding load for constant global patterened arrays and structs by alignment.

rebase to be only alignment-based analysis. GEP-based revision will be created.

khei4 updated this revision to Diff 507309.Mar 22 2023, 3:59 AM

make ConstOffset be not optional

nikic added inline comments.Mar 22 2023, 4:09 AM
llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
860

These isa checks aren't needed in this patch: For the alignment case, we don't care what the operand is.

llvm/test/Transforms/AggressiveInstCombine/patterned-load.ll
86

I don't get how this one folds with just this patch. If align 1 we have stride 1, but constarray1 needs a stride of 2, no?

Allen added a subscriber: Allen.Mar 22 2023, 4:38 AM
khei4 added inline comments.Mar 22 2023, 4:48 AM
llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
860

Seems like, I could not get the point of ptrmask cares! (Precisely, it seems like we need to check alignment-based and GEP-based separately for completeness) Thanks!

llvm/test/Transforms/AggressiveInstCombine/patterned-load.ll
86

Ah, sorry seems like I just forgot to commit test changes...

khei4 updated this revision to Diff 507312.Mar 22 2023, 4:51 AM

update tests, remove redundant isa check

nikic added inline comments.Mar 22 2023, 7:02 AM
llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
855–880

Some simplifications.

I think as far as foldPatternedLoads is concerned, we don't need the std::optional around Stride, because a Stride of 1 is always conservatively correct. It's only relevant internally while calculating GEP stride in the next patch.

856
khei4 updated this revision to Diff 507385.Mar 22 2023, 9:09 AM

apply suggestions

khei4 added inline comments.Mar 22 2023, 9:12 AM
llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
855–880

Thanks! Yeah, simplicity for this patch matters!

nikic accepted this revision.Mar 23 2023, 4:40 AM

LGTM

This revision is now accepted and ready to land.Mar 23 2023, 4:40 AM
khei4 updated this revision to Diff 507735.Mar 23 2023, 7:28 AM

fix wrong arrow operator