Page MenuHomePhabricator

Please use GitHub pull requests for new patches. Phabricator shutdown timeline

[SCEV] Compute SCEV for ashr(add(shl(x, n), c), m) instr triplet
ClosedPublic

Authored by vedant-amd on Jun 6 2023, 8:14 AM.

Details

Summary

%x = shl i64 %w, n
%y = add i64 %x, c
%z = ashr i64 %y, m

The above given instruction triplet is seen many times in the generated
LLVM IR, but SCEV model is not able to compute the SCEV value of AShr
instruction in this case.

This patch models the two cases of the above instruction pattern using
the following expression:

> sext(add(mul(trunc(w), 2^(n-m)), c >> m))

  1. when n = m the expression reduces to sext(add(trunc(w), c >> n))

as n-m=0, and multiplying with 2^0 gives the same result.

  1. when n > m the expression works as given above.

It also adds several unittest to verify that SCEV is able to compute
the value.

$ opt sext-add-inreg.ll -passes="print<scalar-evolution>"

Comparing the snippets of the result of SCEV analysis:

  • SCEV of ashr before change ----------------------------

%idxprom = ashr exact i64 %sext, 32

-->  %idxprom U: [-2147483648,2147483648) S: [-2147483648,2147483648)
Exits: 8                LoopDispositions: { %for.body: Variant }
  • SCEV of ashr after change ---------------------------

%idxprom = ashr exact i64 %sext, 32

-->  {0,+,1}<nuw><nsw><%for.body> U: [0,9) S: [0,9)
Exits: 8                LoopDispositions: { %for.body: Computable }

LoopDisposition of the given SCEV was LoopVariant before, after adding
the new way to model the instruction, the LoopDisposition becomes
LoopComputable as it is able to compute the SCEV of the instruction.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

updated using clang-format

nikic added a comment.Jun 9 2023, 12:44 AM

I assume the motivation for this is the following InstCombine transform? https://llvm.godbolt.org/z/zd63TWKbW I don't think we can change that canonicalization, so undoing it in SCEV is fine.

llvm/lib/Analysis/ScalarEvolution.cpp
7879–7907

We already have code to handle the general shl+ashr pattern here, including for the case where n and m are not the same. We should reuse the same code. For your pattern, we'd just have an extra add at the start.

vedant-amd added inline comments.Jun 9 2023, 1:35 AM
llvm/lib/Analysis/ScalarEvolution.cpp
7879–7907

That could be done, but it would be a major refactoring. I will post a patch anyways.

I assume the motivation for this is the following InstCombine transform? https://llvm.godbolt.org/z/zd63TWKbW I don't think we can change that canonicalization, so undoing it in SCEV is fine.

Yup, that's correct. This canonicalization is used several times in LSR, so having it's SCEV might be beneficial.

vedant-amd updated this revision to Diff 529876.Jun 9 2023, 2:52 AM

Refactored code to reuse ashr + shl handling code

I assume the motivation for this is the following InstCombine transform? https://llvm.godbolt.org/z/zd63TWKbW I don't think we can change that canonicalization, so undoing it in SCEV is fine.

Hey @nikic, I did refactor the change, please take a look, all the SCEV test cases do pass with it. It look correct functionally.

I assume the motivation for this is the following InstCombine transform? https://llvm.godbolt.org/z/zd63TWKbW I don't think we can change that canonicalization, so undoing it in SCEV is fine.

Hey @nikic, I did refactor the change, please take a look, all the SCEV test cases do pass with it. It look correct functionally.

ping ! could you please give a +1 ? @nikic @efriedma

xbolva00 added a subscriber: xbolva00.EditedJun 11 2023, 4:35 AM

You should update checks in loopstrengthreduce::scaling-factor-incompat-type.ll

You should update checks in loopstrengthreduce::scaling-factor-incompat-type.ll

I was going to send another patch for that, testcase failure. Sure, will do so.

Ping ! I didn't add the LSR failing testcase fix here, because I honestly don't know if it uncovered a bug, or it's a valid transformation in LSR.

You have to update that file anyway. Pre-commit builds are clearly failing.

You have to update that file anyway. Pre-commit builds are clearly failing.

Right, is it fine if I mark the test case as XFAIL for now ?

nikic added a comment.Jun 13 2023, 6:07 AM

You have to update that file anyway. Pre-commit builds are clearly failing.

Right, is it fine if I mark the test case as XFAIL for now ?

No, you need to regenerate the test by running the update_test_checks.py script.

llvm/lib/Analysis/ScalarEvolution.cpp
7892

There should be no restriction on the shift amount.

7896

We should verify that the add constant has lowest n bits unset, to make reassociation valid. It doesn't matter if n==m, but it matters if n!=m.

7906

You can keep the truncate in the comment code by making this trunc(add()) instead of add(trunc()). Unless I'm missing something, they're equivalent in this context.

7916

Stray newline

llvm/test/Analysis/ScalarEvolution/sext-add-inreg.ll
7

You can test these cases with basic patterns using just the shl+add+ashr sequence, without a loop (though it's also ok to keep a loop test as a motivating case). Please also test the case where the shift amounts are different.

You have to update that file anyway. Pre-commit builds are clearly failing.

Right, is it fine if I mark the test case as XFAIL for now ?

No, you need to regenerate the test by running the update_test_checks.py script.

Will do so, I was not sure if the testcase will be still valid after the updated CHECKs.

updated failing testcase in LSR

vedant-amd marked an inline comment as done.Jun 13 2023, 11:12 PM

removed restriction on shift amount and the corresponding test case

vedant-amd marked an inline comment as done.Jun 13 2023, 11:19 PM
vedant-amd marked an inline comment as done.
vedant-amd added inline comments.Jun 13 2023, 11:54 PM
llvm/lib/Analysis/ScalarEvolution.cpp
7896

Do you mean to say check for unset bits from nth bit to the mth bit ? because anything lower than mth bit will anyways be irrelevant once we right shift by m amount.

vedant-amd marked an inline comment as not done.Jun 14 2023, 12:20 AM
vedant-amd added inline comments.Jun 14 2023, 4:49 AM
llvm/lib/Analysis/ScalarEvolution.cpp
7896

I tried to understand this, I am not able to make sense of it. Can you explain further what you mean to say ?

updated code to shift right the AddConstant by ShlAmt and flip add, trunc exprs

llvm/lib/Analysis/ScalarEvolution.cpp
7906

So, it should look like this ?

AddTruncateExpr = getTruncateExpr(getAddExpr(ShlOp0SCEV, AddConstant), TruncTy);

updated code to shift right the AddConstant by ShlAmt and flip add, trunc exprs

ping !! I have addressed majority of the comments.

vedant-amd added a comment.EditedJul 31 2023, 2:50 AM

ping !! @nikic can you take a look at this, thanks !

Updated LSR testcase and added new testcases for AShr SCEV model

Hey @nikic and @efriedma Sorry for the ping again, but I have addressed all the comments (added the testcases as well), can you please review once again, it's been stuck since a long time. Thanks for your time.

efriedma added inline comments.Aug 1 2023, 2:09 PM
llvm/test/Analysis/ScalarEvolution/sext-add-inreg-unequal.ll
13

How is this equivalent? Say %a is zero; the original function returns 1, this SCEV expression returns 0. (I think maybe the "ashr" of the constant is shifting by 10, instead of shifting by 8?)

vedant-amd added inline comments.Aug 3 2023, 1:03 AM
llvm/test/Analysis/ScalarEvolution/sext-add-inreg-unequal.ll
13

Yeah, the SCEV is wrong. I will fix this. This is what is should have been, but it's a bit opposite.

= (a*2^10 + 256)/2*8
= a*4 + 1

So, we need SCEV like this:

2^(shl_amt - ashr_amt) * a + c >> ashr_amt

I have updated my code, here's the correct SCEV:

(1 + (sext i56 (4 * (trunc i64 %a to i56)) to i64))<nuw><nsw> U: [1,-2) S: [-36028797018963967,36028797018963966)

Does it seem correct ?

vedant-amd updated this revision to Diff 546769.Aug 3 2023, 2:37 AM

Fixed issue with generated SCEV for ShlAmt > AshrAmt

Hey @efriedma Thanks for spotting the bug, I updated the SCEV for case when m > n, it seems to be correct now. Can you please review the patch, thanks !

CC: @nikic

efriedma added inline comments.Aug 3 2023, 8:43 AM
llvm/test/Analysis/ScalarEvolution/sext-add-inreg-unequal.ll
13

I suspect the updated code isn't right... can you add an example with a larger operand to the add? The add needs to happen before the sign-extend, but in this specific example, that doesn't matter because the two expressions are equivalent.

vedant-amd added inline comments.Aug 4 2023, 1:18 AM
llvm/test/Analysis/ScalarEvolution/sext-add-inreg-unequal.ll
13

I will add one, but I believe we can move the add out of the sext as the add constant is already shifted right by Ashr amount.

vedant-amd added inline comments.Aug 4 2023, 1:23 AM
llvm/test/Analysis/ScalarEvolution/sext-add-inreg-unequal.ll
13

Also, I am confused about one thing, the Type of MulExpr turns out to be i56 and that of the addConstant is i64, I am able to add them before the Sign extend. Adding the m > n part to this ashr model seems to buggy at this point, as I had proposed can this patch just address the case where m=n ? and I submit, the m > n thing later with thorough thinking. But, I think @nikic wants them in the same patch.

vedant-amd added inline comments.Aug 4 2023, 1:38 AM
llvm/test/Analysis/ScalarEvolution/sext-add-inreg-unequal.ll
13

Adding to the previous comment, the Ashr + Add + Shl with shlamt > ashramt quite rarely occur in LLVM IR. But, the case where the shlamt = ashramt is quite common in LLVM IR, notably because instcombine like passes do the transformation. I propose that this patch just address the latter and I send a patch for the m > n case at a later point. This patch is stuck from a long time, and just taking care of m=n case won't negatively affect as it'll safely exit in the m > n case.

Please let me know your thoughts about this, thanks ! cc: @nikic

I'm not sure why you're having so much trouble with the unequal shift math. The case of unequal shift amount is really just treating the shl as two shifts: one shift by an arbitrary amount, followed by one shift with shift amount equal to the ashr.

That said I'd be okay with a patch that bailed out on that case.

Update the SCEV code and add a new testcase

I'm not sure why you're having so much trouble with the unequal shift math. The case of unequal shift amount is really just treating the shl as two shifts: one shift by an arbitrary amount, followed by one shift with shift amount equal to the ashr.

I am having trouble with keeping the addition part inside the SextExpr thing, because it always returns back i64 int, but now came up with a better implementation. It seems correct to me, please take a look.

That said I'd be okay with a patch that bailed out on that case.

I'm not sure why you're having so much trouble with the unequal shift math. The case of unequal shift amount is really just treating the shl as two shifts: one shift by an arbitrary amount, followed by one shift with shift amount equal to the ashr.

That said I'd be okay with a patch that bailed out on that case.

updated with clang-format

added empty line at end of file in a test case

@efriedma I have fixed the SCEV part, it mostly seems correct. Also added the testcase with large add constant. Please take a look once ! thanks !

The revised logic makes sense.

llvm/lib/Analysis/ScalarEvolution.cpp
7907

Instead of checking if (ShlAmt > AShrAmt) here, can you just unconditionally do AddTruncateExpr = getTruncateExpr(ShlOp0SCEV, TruncTy);, then add an if (L->getOpcode() != Instruction::Shl) check to the equal shift amount case?

Or better, just unify the ShlAmt > AShrAmt and ShlAmt == AShrAmt cases; the logic for the ShlAmt > AShrAmt case should just work for the ShlAmt == AShrAmt case (it ends up multiplying by a constant 1, which simplifies to exactly the same thing as the existing ShlAmt == AShrAmt code).

vedant-amd marked an inline comment as done.

Simplified expression handling code, and cleaned up other things

llvm/lib/Analysis/ScalarEvolution.cpp
7907

I have implemented the second suggestion, refactored and cleaned up the comments. Now the code looks in good shape. Please give a final review for the same.

This revision is now accepted and ready to land.Aug 24 2023, 9:51 AM
vedant-amd edited the summary of this revision. (Show Details)

Updated the commit message

This revision was landed with ongoing or failed builds.Aug 24 2023, 10:47 PM
This revision was automatically updated to reflect the committed changes.