This is an archive of the discontinued LLVM Phabricator instance.

[SCEV] get more accurate range for AddExpr with NW flag
ClosedPublic

Authored by shchenz on Jul 17 2019, 7:31 AM.

Details

Summary

for AddExpr (1 + %a)<nsw>, if %a (LoopInvariant, such as function parameter) range is fullset, 1 range is [1, 2), for now, constant range add operation will return fullset is any of its operand is fullset.

For above AddExpr, (1 + %a)<nsw> has No-Signed-Wrap flag, so SINT_MIN is not in (1 + %a)<nsw> range. But now we get a conservative range, fullset.

Diff Detail

Event Timeline

shchenz created this revision.Jul 17 2019, 7:31 AM
reames requested changes to this revision.Jul 17 2019, 9:16 AM
reames added inline comments.
llvm/lib/Analysis/ScalarEvolution.cpp
5578–5579

Stylistically, it would be much cleaner to add a function to ConstantRange with the signature
CR add(CR, NoWrapKind)

5580

Your code is incorrect as implemented. There's no requirement that a SCEVAddExpr have only two operands.

This revision now requires changes to proceed.Jul 17 2019, 9:16 AM
nikic added inline comments.Jul 17 2019, 9:57 AM
llvm/lib/Analysis/ScalarEvolution.cpp
5578–5579

I was planning to add this ConstantRange functionality based on D61653.

reames added inline comments.Jul 17 2019, 10:06 AM
llvm/lib/Analysis/ScalarEvolution.cpp
5578–5579

That looks entirely reasonable. Land that, and rebase?

@reames Thanks for your comments. I will update this patch after @nikic lands his improvement in https://reviews.llvm.org/D61653

shchenz updated this revision to Diff 211507.Jul 24 2019, 8:04 AM

address @reames comments

I will commit another NFC patch to merge addWithNoSignedWrap and new added addWithNoWrap.

reames requested changes to this revision.Aug 6 2019, 1:26 PM
reames added inline comments.
llvm/lib/Analysis/ScalarEvolution.cpp
5579

I'd very strongly prefer if you structured this as so:
NoWrapKind = compute(Add);
for each operand:

sum += sum.addWithWrapping(Op, NoWrapKind)

This requires that NoWrapKind above be any of: none, signed, unsigned, and both.

llvm/lib/IR/ConstantRange.cpp
829

Why not allow both?

834

I believe this is impossible from the dominating check, please convert to an assert.

This revision now requires changes to proceed.Aug 6 2019, 1:26 PM
shchenz updated this revision to Diff 214575.Aug 11 2019, 10:40 PM
shchenz marked an inline comment as done.

address @reames comments

shchenz marked 2 inline comments as done.Aug 12 2019, 12:23 AM
shchenz added inline comments.
llvm/lib/IR/ConstantRange.cpp
829

I think we can not support both wrap type in one range type(unsigned/signed range). We have signed and unsigned range for one SCEV and we set nsw only based on signed range and nuw only based on unsigned range. For example, if one ADD SCEV's signed range is [-10, 20), and its unsigned range is [10, 20), in StrengthenNoWrapFlags, we will set this SCEV with nsw and nuw. But when we wants to get its range after nsw and nuw are set, MGNR unsigned wrap for [-10, 20) is [0, 1), after intersect with [10, 20), we get empty set for this SCEV for both signed range and unsigned range. This is not right. we should handle nsw for signed range and nuw for unsigned range.

Hi @reames Could you help to have another look at this? Thanks

reames accepted this revision.Aug 19 2019, 1:58 PM

LGTM

llvm/lib/Analysis/ScalarEvolution.cpp
5583

Move this if/else chain outside the loop. It's loop invariant, and confusing (but correct) as written.

This revision is now accepted and ready to land.Aug 19 2019, 1:58 PM
nikic requested changes to this revision.Aug 19 2019, 3:10 PM

makeGuaranteedNoWrapRegion() cannot be used to constrain input ranges. E.g. consider that the RHS is [UINT_MAX-1, UINT_MAX]. Then the GWNR is just [0, 0], even though 1 would be a legal value for LHS (assuming the RHS is UINT_MAX-1).

This revision now requires changes to proceed.Aug 19 2019, 3:10 PM
shchenz planned changes to this revision.Aug 20 2019, 6:22 PM

Agree with your comment. @nikic Now we may get a smaller range than its real one with this patch, it may leads to wrong opt. Get a bigger range like trunk is conservative, but right. I will seek for another change for this issue.

shchenz updated this revision to Diff 218590.Sep 3 2019, 10:15 PM

new implementation not based on makeGuaranteedNoWrapRegion

Hi @nikic @reames , could you help to have another look at the patch? Thanks in advance.

nikic added a comment.Sep 7 2019, 4:23 AM

Can you please split off the ConstantRange addition into a separate revision?

llvm/lib/IR/ConstantRange.cpp
853

You can use getNonEmpty(A, B) here to avoid the explicit check for full range. Similar below.

862

I think that this whole code can be somewhat simplified. I have an unfinished implementation of add nuw lying around that looks like this:

APInt LMin = L.getUnsignedMin(), LMax = L.getUnsignedMax();
APInt RMin = R.getUnsignedMin(), RMax = R.getUnsignedMax();
bool Overflow;
APInt NewMin = LMin.uadd_ov(RMin, Overflow);
if (Overflow)
  return ConstantRange::getEmpty(L.getBitWidth());
APInt NewMax = LMax.uadd_sat(RMax);
return ConstantRange::getNonEmpty(NewMin, std::move(NewMax) + 1);
shchenz updated this revision to Diff 219287.Sep 8 2019, 8:09 PM

address @nikic comments

Can you please split off the ConstantRange addition into a separate revision?

@nikic done. addWithNoWrap is added in patch https://reviews.llvm.org/D67339 Thanks for your good comments.

nikic added a comment.Oct 6 2019, 8:25 AM

The addWithNoWrap addition has landed, but I guess this is now blocked because D64868 has been reverted?

shchenz planned changes to this revision.Oct 7 2019, 6:50 PM

Yes, I plan to take a look at https://reviews.llvm.org/D64868 firstly.

Seems we can not simply add nsw|nuw for start + stride as I did in https://reviews.llvm.org/D64868. Since SCEV are shared with expressions in different blocks, we must ensure all expressions mapped to same SCEV can be added with nsw|nuw.

shchenz updated this revision to Diff 234678.Dec 19 2019, 1:32 AM
nikic added inline comments.Dec 19 2019, 1:02 PM
llvm/lib/Analysis/ScalarEvolution.cpp
5579

I'd suggest to also pass the RangeType here, though I'm not sure whether it makes any practical difference.

nikic added inline comments.Dec 19 2019, 1:14 PM
llvm/test/Transforms/IRCE/ranges_of_different_types.ll
87 ↗(On Diff #234678)

Can you get a range dump for this code? I'm not entirely clear on why we get an improvement here (assuming the metadata on the load is taken into account, I don't see where addWithNoWrap would produce a better result).

shchenz updated this revision to Diff 235097.Dec 22 2019, 11:38 PM
shchenz marked 2 inline comments as done.

address @nikic comments

llvm/test/Transforms/IRCE/ranges_of_different_types.ll
87 ↗(On Diff #234678)

Before the change, SCEV for exit.mainloop.at is (0 smax (101 smin ((-1 * (-13 smax (-2147483647 + %len)<nuw><nsw>))<nsw> + %len)<nuw><nsw>)), now we take nuw for add into account, so we get conclusion that (101 smin ((-1 * (-13 smax (-2147483647 + %len)<nuw><nsw>))<nsw> + %len)<nuw><nsw>) will not be smaller than 0. So after the change, SCEV for exit.mainloop.at is changed to (101 smin ((-1 * (-13 smax (-2147483647 + %len)<nuw><nsw>))<nsw> + %len)<nuw><nsw>)

nikic accepted this revision.Dec 23 2019, 2:13 AM

LGTM

llvm/test/Transforms/IRCE/ranges_of_different_types.ll
87 ↗(On Diff #234678)

Thanks. So the difference comes from the sub, which is really a add nuw nsw on '-1 * %smax1`.

This revision is now accepted and ready to land.Dec 23 2019, 2:13 AM
This revision was automatically updated to reflect the committed changes.