This is an archive of the discontinued LLVM Phabricator instance.

[SCEV] Introduce SCEVPtrToIntExpr (PR46786)
ClosedPublic

Authored by lebedev.ri on Oct 15 2020, 3:57 AM.

Details

Summary

And use it to model LLVM IR's ptrtoint cast.

This is essentially an alternative to D88806, but with no chance for
all the problems it caused due to having the cast as implicit there.
(see rG7ee6c402474a2f5fd21c403e7529f97f6362fdb3)

As we've established by now, there are at least two reasons why we want this:

  • It will allow SCEV to actually model the ptrtoint casts and their operands, instead of treating them as SCEVUnknown
  • It should help with initial problem of PR46786 - this should eventually allow us to not loose pointer-ness of an expression in more cases

As discussed in PR46786, in principle,
we could just extend SCEVUnknown with a is ptrtoint cast, because ScalarEvolution::getPtrToIntExpr()
should sink the cast as far down into the expression as possible,
so in the end we should always end up with SCEVPtrToIntExpr of SCEVUnknown.

But i think that it isn't the best solution, because it doesn't really matter
from memory consumption side - there probably won't be *that* many SCEVPtrToIntExprs
for it to matter, and it allows for much better discoverability.

This probably needs some more tests.
I'm not adding any rewrites right in this patch.

Diff Detail

Event Timeline

lebedev.ri created this revision.Oct 15 2020, 3:57 AM
lebedev.ri edited the summary of this revision. (Show Details)Oct 15 2020, 3:59 AM
lebedev.ri edited the summary of this revision. (Show Details)Oct 15 2020, 6:36 AM
lebedev.ri edited the summary of this revision. (Show Details)Oct 15 2020, 6:38 AM
efriedma added inline comments.Oct 15 2020, 11:01 AM
llvm/lib/Analysis/ScalarEvolution.cpp
3566–3569

Is this change necessary?

5520

zextOrTrunc shouldn't be necessary.

6124

I think it would be a useful invariant if SCEVConstant was never a pointer. We currently have a weird hack in SCEVExpander to deal with expanding pointer expressions that don't actually have any pointer operands.

I think we could just use getUnknown(V) here, though. I can't think of any reason to have a special SCEV expression for null pointers.

polly/test/Isl/CodeGen/ptrtoint_as_parameter.ll
3

I think it's just checking it doesn't crash.

More generally, I would guess your patch is interfering with polly's code to look through ptrtoint instructions. If you want, I can fix up the polly stuff after this patch is finished.

@efriedma thank you for taking a look!

llvm/lib/Analysis/ScalarEvolution.cpp
3566–3569

Yes, otherwise we end up in an endless recursion with createSCEV().

5520

Agree, but unfortunately it is, because ScalarEvolution::getEffectiveSCEVType() is broken.

// The only other support type is pointer.
assert(Ty->isPointerTy() && "Unexpected non-pointer non-integer type!");
return getDataLayout().getIndexType(Ty);

Index type can be, but is not always, the same type as the pointer-sized integer type.
There's existing test that crashes with assertion without that zextOrTrunc.

6124

Let's see..

lebedev.ri added inline comments.Oct 15 2020, 11:16 AM
llvm/lib/Analysis/ScalarEvolution.cpp
3566–3569
********************
FAIL: LLVM :: Analysis/ScalarEvolution/scalable-vector.ll (851 of 943)
******************** TEST 'LLVM :: Analysis/ScalarEvolution/scalable-vector.ll' FAILED ********************
Script:
--
: 'RUN: at line 1';   /builddirs/llvm-project/build-Clang11-unknown/bin/opt -scalar-evolution -analyze -enable-new-pm=0 < /repositories/llvm-project/llvm/test/Analysis/ScalarEvolution/scalable-vector.ll | /builddirs/llvm-project/build-Clang11-unknown/bin/FileCheck /repositories/llvm-project/llvm/test/Analysis/ScalarEvolution/scalable-vector.ll
: 'RUN: at line 2';   /builddirs/llvm-project/build-Clang11-unknown/bin/opt "-passes=print<scalar-evolution>" -disable-output < /repositories/llvm-project/llvm/test/Analysis/ScalarEvolution/scalable-vector.ll 2>&1 | /builddirs/llvm-project/build-Clang11-unknown/bin/FileCheck /repositories/llvm-project/llvm/test/Analysis/ScalarEvolution/scalable-vector.ll
--
Exit Code: 1

Command Output (stderr):
--
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:
0.      Program arguments: /builddirs/llvm-project/build-Clang11-unknown/bin/opt -scalar-evolution -analyze -enable-new-pm=0 
1.      Running pass 'Function Pass Manager' on module '<stdin>'.
2.      Running pass 'FunctionPass Printer: Scalar Evolution Analysis' on function '@a'
  #0 0x00007fea72fbf6b3 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /repositories/llvm-project/llvm/lib/Support/Unix/Signals.inc:563:13
  #1 0x00007fea72fbd5d2 llvm::sys::RunSignalHandlers() /repositories/llvm-project/llvm/lib/Support/Signals.cpp:72:18
  #2 0x00007fea72fbfbb5 SignalHandler(int) /repositories/llvm-project/llvm/lib/Support/Unix/Signals.inc:0:3
  #3 0x00007fea75dfc140 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x14140)
  #4 0x00007fea73adfc2a llvm::ScalarEvolution::getSCEV(llvm::Value*) /repositories/llvm-project/llvm/lib/Analysis/ScalarEvolution.cpp:3747:19
  #5 0x00007fea73aeb3c1 llvm::ScalarEvolution::createNodeForGEP(llvm::GEPOperator*) /repositories/llvm-project/llvm/lib/Analysis/ScalarEvolution.cpp:0:0
  #6 0x00007fea73ae42a9 llvm::ScalarEvolution::createSCEV(llvm::Value*) /repositories/llvm-project/llvm/lib/Analysis/ScalarEvolution.cpp:0:12
  #7 0x00007fea73adfc48 llvm::ScalarEvolution::getSCEV(llvm::Value*) /repositories/llvm-project/llvm/lib/Analysis/ScalarEvolution.cpp:3749:7
  #8 0x00007fea73ae449c llvm::isa_impl_cl<llvm::SCEVConstant, llvm::SCEV const*>::doit(llvm::SCEV const*) /repositories/llvm-project/llvm/include/llvm/Support/Casting.h:104:5
  #9 0x00007fea73ae449c llvm::isa_impl_wrap<llvm::SCEVConstant, llvm::SCEV const*, llvm::SCEV const*>::doit(llvm::SCEV const* const&) /repositories/llvm-project/llvm/include/llvm/Support/Casting.h:131:12
 #10 0x00007fea73ae449c llvm::isa_impl_wrap<llvm::SCEVConstant, llvm::SCEV const* const, llvm::SCEV const*>::doit(llvm::SCEV const* const&) /repositories/llvm-project/llvm/include/llvm/Support/Casting.h:121:12
 #11 0x00007fea73ae449c bool llvm::isa<llvm::SCEVConstant, llvm::SCEV const*>(llvm::SCEV const* const&) /repositories/llvm-project/llvm/include/llvm/Support/Casting.h:142:10
 #12 0x00007fea73ae449c llvm::ScalarEvolution::createSCEV(llvm::Value*) /repositories/llvm-project/llvm/lib/Analysis/ScalarEvolution.cpp:6373:9
 #13 0x00007fea73adfc48 llvm::ScalarEvolution::getSCEV(llvm::Value*) /repositories/llvm-project/llvm/lib/Analysis/ScalarEvolution.cpp:3749:7
 #14 0x00007fea73ae01b3 llvm::ScalarEvolution::getSizeOfExpr(llvm::Type*, llvm::Type*) /repositories/llvm-project/llvm/lib/Analysis/ScalarEvolution.cpp:3568:1
 #15 0x00007fea73adf970 llvm::ScalarEvolution::getGEPExpr(llvm::GEPOperator*, llvm::SmallVectorImpl<llvm::SCEV const*> const&) /repositories/llvm-project/llvm/lib/Analysis/ScalarEvolution.cpp:3361:19
 #16 0x00007fea73aeb426 llvm::SmallVectorTemplateCommon<llvm::SCEV const*, void>::isSmall() const /repositories/llvm-project/llvm/include/llvm/ADT/SmallVector.h:131:39
 #17 0x00007fea73aeb426 llvm::SmallVectorImpl<llvm::SCEV const*>::~SmallVectorImpl() /repositories/llvm-project/llvm/include/llvm/ADT/SmallVector.h:388:16
 #18 0x00007fea73aeb426 llvm::SmallVector<llvm::SCEV const*, 4u>::~SmallVector() /repositories/llvm-project/llvm/include/llvm/ADT/SmallVector.h:898:3
 #19 0x00007fea73aeb426 llvm::ScalarEvolution::createNodeForGEP(llvm::GEPOperator*) /repositories/llvm-project/llvm/lib/Analysis/ScalarEvolution.cpp:5297:1
 #20 0x00007fea73ae42a9 llvm::ScalarEvolution::createSCEV(llvm::Value*) /repositories/llvm-project/llvm/lib/Analysis/ScalarEvolution.cpp:0:12
<...>

createSCEV(): don't model nullptr constant as zero integer, but as SCEVUnknon.

lebedev.ri marked 4 inline comments as done.Oct 15 2020, 12:17 PM
lebedev.ri added inline comments.
polly/test/Isl/CodeGen/ptrtoint_as_parameter.ll
3

If you want, I can fix up the polly stuff after this patch is finished.

Yes, that would be awesome, thank you! I'm not at all familiar with polly.

lebedev.ri marked an inline comment as done.

Rebased, NFC.

efriedma added inline comments.Oct 15 2020, 1:25 PM
llvm/lib/Analysis/ScalarEvolution.cpp
3566–3569

Oh, that makes sense. Maybe worth adding a comment to explain.

5520

If the index width is narrower than pointer width, the range computed using zextOrTrunc is nonsense: the actual pointer value may not fit into the index type at all.

Off the top of my head, I'm not sure what the canonical IR looks like in cases like that; do we canonicalize the ptrtoint to the pointer width, or the index width?

6456

This shouldn't be necessary with the other change for null pointers.

Oh, also, are you planning to implement the SCEV simplifications in this patch, or a followup? I can see reasons to go either way, I guess. I'd weakly lean towards implementing them here, so we can avoid churn implementing handling for cases that can't happen after simplification.

lebedev.ri marked an inline comment as done and 2 inline comments as not done.Oct 15 2020, 1:38 PM

Oh, also, are you planning to implement the SCEV simplifications in this patch, or a followup? I can see reasons to go either way, I guess. I'd weakly lean towards implementing them here, so we can avoid churn implementing handling for cases that can't happen after simplification.

I would strongly prefer to do that in a follow-up patch, since that is more in-line with incremental development / iterative improvements.

llvm/lib/Analysis/ScalarEvolution.cpp
5520

Let me take one more look at what is going on there..

lebedev.ri marked 3 inline comments as done.

@efriedma thank you for taking a look!
Addressed review notes.

llvm/lib/Analysis/ScalarEvolution.cpp
5520

Addressed by D89540.

Based on disscussion in D89540, let's just not bother with the case where index size doesn't match pointer size.

I think this looks good. But I'd like a second opinion since this is a substantial change to SCEV modeling.

llvm/unittests/Transforms/Utils/ScalarEvolutionExpanderTest.cpp
957

Please reformat this to something readable.

lebedev.ri updated this revision to Diff 298748.EditedOct 16 2020, 1:52 PM

Reformat the snippet.

Looks like removal of nullptr constant special handling affected
polly tests and i didn't notice it, still need to update them.

I think this looks good. But I'd like a second opinion since this is a substantial change to SCEV modeling.

@efriedma thank you for taking a look!

But I'd like a second opinion since this is a substantial change to SCEV modeling.

@mkazantsev @fhahn thoughts? :)

lebedev.ri marked an inline comment as done.
lebedev.ri added reviewers: Meinersbur, grosser.

Mostly update polly tests.

I don't think that mixing null into AddRecs is correct and expected.

llvm/include/llvm/Analysis/ScalarEvolutionExpressions.h
100

Nit: better add it last, and keep it 1 per line.

llvm/test/Analysis/ScalarEvolution/max-backedge-taken-count-guard-info.ll
534

What's the type of the AddRec here? Is it an integral-type AddRec with pointer base? If so, this will be a source of various bugs because the code relies on fact that AddRec's type matches Base's type in many places.

mkazantsev requested changes to this revision.Oct 18 2020, 9:50 PM

Seems that this patch leads to worse code generation in Polly in many cases. And I don't see a single case where it does something useful. I think we should have fixes for all of them if we want to merge it.

polly/test/Isl/CodeGen/partial_write_impossible_restriction.ll
8 ↗(On Diff #298843)

TODO/FIXME?

polly/test/Isl/CodeGen/phi_after_error_block_outside_of_scop.ll
7 ↗(On Diff #298843)

Regression.

polly/test/Isl/CodeGen/pointer-type-expressions.ll
36 ↗(On Diff #298843)

Regression.

polly/test/ScopInfo/int2ptr_ptr2int.ll
33

Regression.

polly/test/ScopInfo/multidim_2d_with_modref_call_2.ll
56 ↗(On Diff #298843)

Regression.

This revision now requires changes to proceed.Oct 18 2020, 9:50 PM
efriedma added inline comments.Oct 18 2020, 10:33 PM
llvm/test/Analysis/ScalarEvolution/max-backedge-taken-count-guard-info.ll
534

Should be a pointer-type addrec; it's treating "null" as an opaque pointer.

polly/test/Isl/CodeGen/pointer-type-expressions.ll
36 ↗(On Diff #298843)

I suspect the change to null handling is changing the way the icmp is analyzed; should be easy to fix? I'll try to find time to look in the next couple days.

polly/test/ScopInfo/multidim_2d_with_modref_call_2.ll
56 ↗(On Diff #298843)

Is it? As far as I can tell, the llvm.gcread is actually loading from a null pointer.

I think we are conflating things here.
@efriedma while i agree that treating nullptr constant as zero seems, suboptimal?,
i'm not sure we have to do that change immediately together with the ptrtoint handling?
Can we do that later? I'll undo it for now..

Backout controversial nullptr constant handling changes - should be a separate patch.

mkazantsev added inline comments.Oct 19 2020, 3:40 AM
polly/test/ScopInfo/int2ptr_ptr2int_2.ll
33–36

This code is still worse than it used to be. What boons do we expect from this patch?

lebedev.ri marked an inline comment as done.Oct 19 2020, 6:24 AM
lebedev.ri added inline comments.
polly/test/ScopInfo/int2ptr_ptr2int_2.ll
33–36

E.g. see the example suggested by @efriedma - @pr46786_c26_char()/@pr46786_c26_int() in llvm/test/Analysis/ScalarEvolution/ptrtoint.ll.
If we don't model ptrtoint, we basically fail to analyze the address of the load/store there,
while it's actually quite simple after D89692.

lebedev.ri marked an inline comment as done.Oct 19 2020, 2:21 PM

Will rebase once i have proper test coverage for D89692 - still need tests for udiv/mul involving a pointer.

Rebased, addded proper test coverage.

mkazantsev added inline comments.Oct 23 2020, 3:53 AM
llvm/lib/Analysis/ScalarEvolution.cpp
8445

It doesn't look like the right way to check for loop invariance. getSCEVAtScope can potentially simplify your expression that is still invariant. Example: you have Op = A + B and loop is guarded by condition B = 0. getSCEVAtScope(Op) may return A which is still loop invariant, but not equal to Op.

What you might be looking for is isAvailableAtLoopEntry.

@mkazantsev thank you for taking a look!

llvm/lib/Analysis/ScalarEvolution.cpp
8445

Disclaimer: *i'm* not looking for loop invariance.
This chunk is a verbatum copy-paste from SCEVTruncateExpr/SCEVSignExtendExpr/SCEVZeroExtendExpr handling above.
If this isn't the right handling, then they are all wrong, and that's a preexisting problem.

mkazantsev accepted this revision.Oct 23 2020, 5:14 AM

I don't see any more problems, thanks.

llvm/lib/Analysis/ScalarEvolution.cpp
8445

Allright, maybe it's just a misleading comment then. Otherwise the code looks fine to me. I'll take a look later into these pieces.

This revision is now accepted and ready to land.Oct 23 2020, 5:14 AM
lebedev.ri marked 2 inline comments as done.Oct 23 2020, 5:19 AM

I don't see any more problems, thanks.

Thank you for the review!
I think it makes sense to land the simplifications (D89692) about at the same time,
so i'll wait until we are happy with that patch too.

Rebased, NFC.

Rebased, NFC.

This revision was landed with ongoing or failed builds.Oct 30 2020, 1:14 AM
This revision was automatically updated to reflect the committed changes.
nikic added a subscriber: nikic.Apr 18 2021, 7:16 AM

SCEVUnknown is implemented as a value handle that nulls out the SCEV expression on deletion, and whenever a SCEV expression for a Value is looked up, we check whether there are any nullptr SCEVUnknowns and discard them in that case.

As SCEVPtrToInt is really similar to SCEVUnknown, I've been wondering why we need this handling for SCEVUnknown but not for SCEVPtrToInt.

(I don't have any particular reason for this question, just confused while reading the code...)

SCEVUnknown is implemented as a value handle that nulls out the SCEV expression on deletion, and whenever a SCEV expression for a Value is looked up, we check whether there are any nullptr SCEVUnknowns and discard them in that case.

As SCEVPtrToInt is really similar to SCEVUnknown, I've been wondering why we need this handling for SCEVUnknown but not for SCEVPtrToInt.

(I don't have any particular reason for this question, just confused while reading the code...)

SCEVPtrToInt isn't really similar to SCEVUnknown, because former is a "proper" SCEV expression,
with a SCEV expression parent, while latter is a "root" expression, with LLVM IR Value* as a parent.
So i wouldn't see why it would need the same handling as SCEVUnknown.
Or to reword, shouldn't then every single SCEV expression have that same handling as SCEVUnknown?

nikic added a comment.Apr 18 2021, 7:31 AM

SCEVUnknown is implemented as a value handle that nulls out the SCEV expression on deletion, and whenever a SCEV expression for a Value is looked up, we check whether there are any nullptr SCEVUnknowns and discard them in that case.

As SCEVPtrToInt is really similar to SCEVUnknown, I've been wondering why we need this handling for SCEVUnknown but not for SCEVPtrToInt.

(I don't have any particular reason for this question, just confused while reading the code...)

SCEVPtrToInt isn't really similar to SCEVUnknown, because former is a "proper" SCEV expression,
with a SCEV expression parent, while latter is a "root" expression, with LLVM IR Value* as a parent.
So i wouldn't see why it would need the same handling as SCEVUnknown.
Or to reword, shouldn't then every single SCEV expression have that same handling as SCEVUnknown?

Oh, of course! I got really confused here. For some reason I thought that SCEVPtrToInt directly referenced an IR value. I have no idea where I got that impression in the first place, it doesn't make sense.

SCEVUnknown is implemented as a value handle that nulls out the SCEV expression on deletion, and whenever a SCEV expression for a Value is looked up, we check whether there are any nullptr SCEVUnknowns and discard them in that case.

As SCEVPtrToInt is really similar to SCEVUnknown, I've been wondering why we need this handling for SCEVUnknown but not for SCEVPtrToInt.

(I don't have any particular reason for this question, just confused while reading the code...)

SCEVPtrToInt isn't really similar to SCEVUnknown, because former is a "proper" SCEV expression,
with a SCEV expression parent, while latter is a "root" expression, with LLVM IR Value* as a parent.
So i wouldn't see why it would need the same handling as SCEVUnknown.
Or to reword, shouldn't then every single SCEV expression have that same handling as SCEVUnknown?

Oh, of course! I got really confused here. For some reason I thought that SCEVPtrToInt directly referenced an IR value. I have no idea where I got that impression in the first place, it doesn't make sense.

It does make sense, and that would be a totally valid model for SCEVPtrToInt, but i specifically choose *not* to use it.