If we've got an SCEVPtrToIntExpr(op), where op is not an SCEVUnknown,
we want to sink the SCEVPtrToIntExpr into an operand, so that the operation
is performed on integers, and eventually we end up with just an SCEVPtrToIntExpr(SCEVUnknown).
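The sinking idea can be illustrated with a minimal toy model in Python (the tuple encoding and helper names here are invented for this sketch; they are not LLVM's actual classes):

```python
# Toy expression nodes (hypothetical, not LLVM's SCEV classes):
#   ("unknown", name) - an opaque pointer-typed value (like SCEVUnknown)
#   ("const", n)      - an integer constant
#   ("add", a, b)     - an addition node
#   ("ptrtoint", e)   - the cast we want to sink

def sink_ptrtoint(expr):
    """Push a ptrtoint down so it only wraps 'unknown' leaves."""
    assert expr[0] == "ptrtoint"
    return _sink(expr[1])

def _sink(e):
    kind = e[0]
    if kind == "unknown":
        # Irreducible: keep an explicit cast of the opaque value.
        return ("ptrtoint", e)
    if kind == "const":
        # Already an integer; no cast needed.
        return e
    if kind == "add":
        # Sink the cast into the operands; the add itself is now integer-typed.
        return ("add", _sink(e[1]), _sink(e[2]))
    raise ValueError(f"unhandled node kind: {kind}")

# ptrtoint(%p + 8)  ==>  ptrtoint(%p) + 8
before = ("ptrtoint", ("add", ("unknown", "%p"), ("const", 8)))
print(sink_ptrtoint(before))
# ('add', ('ptrtoint', ('unknown', '%p')), ('const', 8))
```

After the rewrite, the only remaining cast wraps the opaque leaf, mirroring the goal of ending up with just SCEVPtrToIntExpr(SCEVUnknown).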
Diff Detail
- Repository
- rG LLVM Github Monorepo
Unit Tests
Time | Test
---|---
440 ms | linux > HWAddressSanitizer-x86_64.TestCases::sizes.cpp
Event Timeline
@efriedma does this look about right?
Speaking for myself, this isn't quite what i want to have in the end,
it should be closer to the following (just so i don't lose the code):
```cpp
const SCEV *ScalarEvolution::getPtrToIntExpr(const SCEV *Op, Type *Ty,
                                             unsigned Depth) {
  assert(Ty->isIntegerTy() && "Target type must be an integer type!");
  assert(Depth <= 1 && "getPtrToIntExpr() should self-recurse at most once.");

  // We could be called with an integer-typed operand during SCEV rewrites.
  // Since the operand is an integer already, just perform zext/trunc/self cast.
  if (!Op->getType()->isPointerTy())
    return getTruncateOrZeroExtend(Op, Ty);

  // What would be an ID for such a SCEV cast expression?
  FoldingSetNodeID ID;
  ID.AddInteger(scPtrToInt);
  ID.AddPointer(Op);

  void *IP = nullptr;

  // Is there already an expression for such a cast?
  if (const SCEV *S = UniqueSCEVs.FindNodeOrInsertPos(ID, IP))
    return getTruncateOrZeroExtend(S, Ty);

  // If not, is this expression something we can't reduce any further?
  if (isa<SCEVUnknown>(Op)) {
    // Create an explicit cast node.
    // We can reuse the existing insert position since if we get here,
    // we won't have made any changes which would invalidate it.
    Type *IntPtrTy = getDataLayout().getIntPtrType(Op->getType());
    assert(getDataLayout().getTypeSizeInBits(getEffectiveSCEVType(
               Op->getType())) ==
               getDataLayout().getTypeSizeInBits(IntPtrTy) &&
           "We can only model ptrtoint if SCEV's effective (integer) type is "
           "sufficiently wide to represent all possible pointer values.");
    SCEV *S = new (SCEVAllocator)
        SCEVPtrToIntExpr(ID.Intern(SCEVAllocator), Op, IntPtrTy);
    UniqueSCEVs.InsertNode(S, IP);
    addToLoopUseLists(S);
    return getTruncateOrZeroExtend(S, Ty);
  }

  assert(Depth == 0 &&
         "getPtrToIntExpr() should not self-recurse for non-SCEVUnknown's.");

  // Otherwise, we've got some expression that is more complex than just a
  // single SCEVUnknown. But we don't want to have a SCEVPtrToIntExpr of an
  // arbitrary expression, we want to have SCEVPtrToIntExpr of an SCEVUnknown
  // only, and the expressions must otherwise be integer-typed.
  // So sink the cast down to the SCEVUnknown's.

  /// The SCEVPtrToIntSinkingRewriter takes a scalar evolution expression,
  /// which computes a pointer-typed value, and rewrites the whole expression
  /// tree so that *all* the computations are done on integers, and the only
  /// pointer-typed operands in the expression are SCEVUnknown.
  class SCEVPtrToIntSinkingRewriter
      : public SCEVRewriteVisitor<SCEVPtrToIntSinkingRewriter> {
    using Base = SCEVRewriteVisitor<SCEVPtrToIntSinkingRewriter>;

  public:
    SCEVPtrToIntSinkingRewriter(ScalarEvolution &SE) : SCEVRewriteVisitor(SE) {}

    static const SCEV *rewrite(const SCEV *Scev, ScalarEvolution &SE) {
      SCEVPtrToIntSinkingRewriter Rewriter(SE);
      return Rewriter.visit(Scev);
    }

    const SCEV *visit(const SCEV *S) {
      Type *STy = S->getType();
      // If the expression is not pointer-typed, just keep it as-is.
      if (!STy->isPointerTy())
        return S;
      // Else, recursively sink the cast down into it.
      return Base::visit(S);
    }

    const SCEV *visitUnknown(const SCEVUnknown *Expr) {
      Type *ExprPtrTy = Expr->getType();
      assert(ExprPtrTy->isPointerTy() &&
             "Should only reach pointer-typed SCEVUnknown's.");
      Type *ExprIntPtrTy = SE.getDataLayout().getIntPtrType(ExprPtrTy);
      return SE.getPtrToIntExpr(Expr, ExprIntPtrTy, /*Depth=*/1);
    }
  };

  // And actually perform the cast sinking.
  const SCEV *IntOp = SCEVPtrToIntSinkingRewriter::rewrite(Op, *this);
  assert(IntOp->getType()->isIntegerTy() &&
         "We must have succeeded in sinking the cast, "
         "and ending up with an integer-typed expression!");
  return getTruncateOrZeroExtend(IntOp, Ty);
}
```
... but that requires SCEVRewriteVisitor changes so it doesn't lose no-wrap flags on add/mul,
(unlike AddRec, where it already preserves them),
but i would imagine the approach to that refactoring to be non-obvious,
so i would prefer not to block this on that, and do that as a follow-up..
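The flag-loss problem alluded to here can be shown with a toy rewriter sketch (hypothetical names, not LLVM's SCEVRewriteVisitor): rebuilding an add/mul node from its rewritten operands silently drops the node's no-wrap flags unless they are explicitly carried over.

```python
# Toy model (not LLVM code) of why a rewrite visitor must copy no-wrap flags.

class Add:
    def __init__(self, ops, flags=frozenset()):
        self.ops = ops
        self.flags = flags  # e.g. {"nuw", "nsw"}

def rewrite_dropping_flags(node, visit_op):
    # Naive rebuild: forgets node.flags, pessimizing later analysis.
    return Add([visit_op(o) for o in node.ops])

def rewrite_keeping_flags(node, visit_op):
    # Fixed rebuild: propagates the original flags onto the new node.
    return Add([visit_op(o) for o in node.ops], node.flags)

a = Add(["x", "y"], frozenset({"nuw"}))
identity = lambda op: op
print(rewrite_dropping_flags(a, identity).flags)  # frozenset()
print(rewrite_keeping_flags(a, identity).flags)   # frozenset({'nuw'})
```

This matches the remark that AddRec rewriting already preserves its flags, while the rebuilt add/mul nodes do not.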
llvm/test/Analysis/ScalarEvolution/ptrtoint.ll:501
Regression in range estimates.
@mkazantsev thank you for taking a look!
llvm/test/Analysis/ScalarEvolution/ptrtoint.ll:501
Originally we had a pointer-typed AddRec there: {%arg,+,4}<nuw><%bb6>, which is:

```
; X32-NEXT:  %i7 = phi i32* [ %arg, %bb3 ], [ %i15, %bb6 ]
; X32-NEXT:  --> {%arg,+,4}<nuw><%bb6> U: full-set S: full-set Exits: ((4 * ((-4 + (-1 * %arg) + %arg1) /u 4))<nuw> + %arg) LoopDispositions: { %bb6: Computable }
```

Its constant range is the 32-bit full-set, because we don't have any range knowledge for %arg. Then we had a cast of it: (zext i32 (ptrtoint i32* {%arg,+,4}<nuw><%bb6> to i32) to i64). But now we start with zext i32 (ptrtoint i32* %arg to i32) to i64, which is [0,4294967296). So that's it, i believe. ptrtoint is essentially a red herring here.
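As a quick arithmetic check (plain Python, not LLVM's ConstantRange class): zero-extending an unknown i32 value to i64 yields exactly the [0,4294967296) range quoted above, since zext preserves the unsigned value.

```python
# Toy unsigned-range arithmetic illustrating the ranges discussed above.
# Not LLVM code; a range is an inclusive (lo, hi) pair of unsigned values.

def full_set(bits):
    # An unknown value of the given width can be anything.
    return (0, 2**bits - 1)

def zext_range(rng, to_bits):
    # Zero-extension preserves the unsigned value, so the range is unchanged
    # (but in the wider type it is no longer the full set).
    lo, hi = rng
    assert hi < 2**to_bits
    return (lo, hi)

i32_unknown = full_set(32)                           # U: full-set for %arg
print(zext_range(i32_unknown, 64))                   # (0, 4294967295)
print(zext_range(i32_unknown, 64) == full_set(64))   # False: sharper than i64 full-set
```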
llvm/test/Analysis/ScalarEvolution/ptrtoint.ll:501
Edit: what i forgot to add: SCEVPtrToIntCast is intentionally not bitwidth-changing itself, and the zext doesn't widen the AddRec indiscriminately:

```cpp
// If the input value is a chrec scev, and we can prove that the value
// did not overflow the old, smaller, value, we can zero extend all of the
// operands (often constants). This allows analysis of something like
// this:  for (unsigned char X = 0; X < 100; ++X) { int Y = X; }
if (const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(Op))
```

Clearly, there was NUW on that SCEVAddRecExpr, so the heuristic passed. But obviously an i32 SCEVAddRecExpr w/NUW and an i64 SCEVAddRecExpr w/NUW result in …
llvm/test/Analysis/ScalarEvolution/ptrtoint.ll:501
Thanks for clarifying this. Actually that looks familiar. Just curious, could you please check if the problem goes away with https://reviews.llvm.org/D89381 and flag scalar-evolution-use-expensive-range-sharpening turned on?
llvm/lib/Analysis/ScalarEvolution.cpp
- 1048: If you check that it's either Nary or unknown, additionally checking that it is NOT something else is overkill. :)
- 1058: nit: could go with `SmallVector<const SCEV *, 2> NewOps(NaryExpr->getNumOperands())`
- 1064: Please init with nullptr to make it easier to track further possible mistakes.
- 1085: Is it commented on purpose?
@mkazantsev thank you for taking a look!
Addressing review comments.
llvm/lib/Analysis/ScalarEvolution.cpp
- 1058: That doesn't reserve, though; it creates a vector with that many nullptrs, and then one would need:

```cpp
for (int I = 0; I != NaryExpr->getNumOperands(); ++I) {
  const auto &Op = NaryExpr->getOperand(I);
  NewOps[I] = Op->getType()->isPointerTy()
                  ? getPtrToIntExpr(Op, IntPtrTy, Depth + 1)
                  : Op;
}
```

which seems strictly worse to me. So i'll keep this as-is.
- 1085: Mainly because i was unsuccessful with acquiring a test case. But for now, let's just err on the safe side, and have dead/untested code.
llvm/test/Analysis/ScalarEvolution/ptrtoint.ll:501
I just tried, and that (the patch + flag) does not help here.
Hello. This increased codesize by a fair chunk on one of the testcases we track, part of csibe/zlib. I've put a reproducer in https://godbolt.org/z/o58s7d. I've not really looked any deeper, other than that the output is quite a bit longer even without outlining; it happens on the few architectures I tried, and possibly something is going wrong in LSR? It was (comparatively) a fairly big chunk of codesize across zlib, not just this file.
Hmm. In that snippet, after the middle-end, the IR is identical.
But if i llvm-reduce that snippet based on the criteria that the final assembly differs, i get
And if we compare llc -debug -print-before-all logs,
indeed, it would seem like that allowed LSR to change pointer-typed PHI into an integer-typed one,
which ends up being bad for the final code size.