This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] OptimizePointerDifference when a gep has phi ptr
Needs ReviewPublic

Authored by bipmis on Aug 15 2023, 6:15 AM.

Details

Summary

For scenarios like
phi(sub(prttoint(A), ptrint(B)), ...)
where A -> GEP(PHI(gep A, B)) and B is a ptr/GEP
if the use of this sub results in a cmp with 0, or(cmp with 0) and
the difference of Indexes b/w 2 GEP's is a positive index
the sub can essentially be folded.

Diff Detail

Event Timeline

bipmis created this revision.Aug 15 2023, 6:15 AM
bipmis requested review of this revision.Aug 15 2023, 6:15 AM
bipmis added a comment.Sep 1 2023, 5:43 AM

Gentle Ping.

Can you add alive2 links?

llvm/lib/Transforms/InstCombine/InstCombineInternal.h
96

nit: Can you put this decl a bit later in the file? (After the block of visit*)

llvm/test/Transforms/InstCombine/sub-gep.ll
472

Can you precommit tests?

bipmis updated this revision to Diff 555697.Sep 4 2023, 3:28 AM

Rebase patch after pre-committing tests. Address review comments.
Alive 2 links
https://alive2.llvm.org/ce/z/wZJfzE
https://alive2.llvm.org/ce/z/izA8ij

bipmis marked 2 inline comments as done.Sep 4 2023, 3:30 AM

Rebase patch after pre-committing tests. Address review comments.
Alive 2 links
https://alive2.llvm.org/ce/z/wZJfzE
https://alive2.llvm.org/ce/z/izA8ij

These can/should be simplified a great deal.
For example if I understand correctly youre trying to transform the following pattern:

define i1 @src(ptr %p, i64 %A, i64 %B, i64 %C, i1 %c) {
entry:
  br i1 %c, label %true, label %false

true:
  %p_gepA = getelementptr i8, ptr %p, i64 %A
  br label %false
false:
  %p_phi = phi ptr [ %p_gepA, %true ], [ %p, %entry ]
  %phi_gepC = getelementptr i8, ptr %p_phi, i64 %C

  %gepC_int = ptrtoint ptr %phi_gepC to i64
  %p_gepB = getelementptr i8, ptr %p, i64 %B
  %gepB_int = ptrtoint ptr %p_gepB to i64
  %sub = sub i64 %gepC_int, %gepB_int
  %r = icmp eq i64 %sub, 0
  ret i1 %r
}

Also in general this patch appears to be doing a lot more than the summary indicates.
Please update to summary to be more precise.

AFAICT Its really:

(phi, (sub (phi (phi GEPA), GEPB), GEPB), ...)
or
(phi, (ashr/lshr (sub (phi (phi GEPA), GEPB), GEPB), ...), ...)

Is that correct?

llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
1994

just cast.

1997

These need to be part of proof + tested.

Should probably also use match here but thats not particularly important.

2002

Should be split to a seperate function, otherwise early returns may interfere with other folds.

2012

What is this Or doing here?

bipmis edited the summary of this revision. (Show Details)Sep 11 2023, 3:35 AM
bipmis updated this revision to Diff 556413.Sep 11 2023, 3:40 AM

Updated the patch with review comments.

Rebase patch after pre-committing tests. Address review comments.
Alive 2 links
https://alive2.llvm.org/ce/z/wZJfzE
https://alive2.llvm.org/ce/z/izA8ij

These can/should be simplified a great deal.
For example if I understand correctly youre trying to transform the following pattern:

define i1 @src(ptr %p, i64 %A, i64 %B, i64 %C, i1 %c) {
entry:
  br i1 %c, label %true, label %false

true:
  %p_gepA = getelementptr i8, ptr %p, i64 %A
  br label %false
false:
  %p_phi = phi ptr [ %p_gepA, %true ], [ %p, %entry ]
  %phi_gepC = getelementptr i8, ptr %p_phi, i64 %C

  %gepC_int = ptrtoint ptr %phi_gepC to i64
  %p_gepB = getelementptr i8, ptr %p, i64 %B
  %gepB_int = ptrtoint ptr %p_gepB to i64
  %sub = sub i64 %gepC_int, %gepB_int
  %r = icmp eq i64 %sub, 0
  ret i1 %r
}

Also in general this patch appears to be doing a lot more than the summary indicates.
Please update to summary to be more precise.

AFAICT Its really:

(phi, (sub (phi (phi GEPA), GEPB), GEPB), ...)
or
(phi, (ashr/lshr (sub (phi (phi GEPA), GEPB), GEPB), ...), ...)

Is that correct?

Not exactly. This transform will benefit but what I am trying to address is the test case which has a loop in the given pattern. If the sub which is the result of the loop results in an icmp, or(icmp) and based on the pattern it can be deduced that we can are better off with just the difference of index of the 2 pointers. In our example it then
converts the sub to a sub(gep_index1, gep_index2).
It is also noted that other transforms in presence of TBAA can further reduce this reduction by eliminating the entire loop.

As per my understanding it is a
phi(sub(prttoint(A), ptrint(B)), ...)
where A -> GEP(PHI(gep A, B)) and B is a ptr/GEP

Rebase patch after pre-committing tests. Address review comments.
Alive 2 links
https://alive2.llvm.org/ce/z/wZJfzE
https://alive2.llvm.org/ce/z/izA8ij

These can/should be simplified a great deal.
For example if I understand correctly youre trying to transform the following pattern:

define i1 @src(ptr %p, i64 %A, i64 %B, i64 %C, i1 %c) {
entry:
  br i1 %c, label %true, label %false

true:
  %p_gepA = getelementptr i8, ptr %p, i64 %A
  br label %false
false:
  %p_phi = phi ptr [ %p_gepA, %true ], [ %p, %entry ]
  %phi_gepC = getelementptr i8, ptr %p_phi, i64 %C

  %gepC_int = ptrtoint ptr %phi_gepC to i64
  %p_gepB = getelementptr i8, ptr %p, i64 %B
  %gepB_int = ptrtoint ptr %p_gepB to i64
  %sub = sub i64 %gepC_int, %gepB_int
  %r = icmp eq i64 %sub, 0
  ret i1 %r
}

Also in general this patch appears to be doing a lot more than the summary indicates.
Please update to summary to be more precise.

AFAICT Its really:

(phi, (sub (phi (phi GEPA), GEPB), GEPB), ...)
or
(phi, (ashr/lshr (sub (phi (phi GEPA), GEPB), GEPB), ...), ...)

Is that correct?

Not exactly. This transform will benefit but what I am trying to address is the test case which has a loop in the given pattern. If the sub which is the result of the loop results in an icmp, or(icmp) and based on the pattern it can be deduced that we can are better off with just the difference of index of the 2 pointers. In our example it then
converts the sub to a sub(gep_index1, gep_index2).
It is also noted that other transforms in presence of TBAA can further reduce this reduction by eliminating the entire loop.

As per my understanding it is a
phi(sub(prttoint(A), ptrint(B)), ...)
where A -> GEP(PHI(gep A, B)) and B is a ptr/GEP

So:

(phi (sub (ptrtoint (gep (phi (gep A, (ptrtoint B)), ...)), (ptrtoint B), ...)

?

If so can you make the proofs / tests explicitly test that construct. As well if you are going to keep
the ashr/or stuff in also include that in the proofs/tests.

Even if the motivation is loops, the tests for the InstCombine codes shouldn't have to rely on that.
Its fine to keep some loop tests, but robust tests of all the cases in there simplest form (esp for the
proofs) makes it easier to review.

Also, can you split the tests to a seperate patch so we can see the diff generated by this patch?

Thanks for the review.
On further observation I see a lot of the optimization of similar kind happening in InstructionCombineCompares.
So a fold should allow for these optimzations to happen. I have made these in the patch https://reviews.llvm.org/D159499
Maybe it would be good to review this first. Thanks