The SCEVExpander cannot deal with all SCEVs Polly allows in all kinds of expressions. To this end we introduce a ScopExpander that handles the additional expressions separatly and falls back to the SCEVExpander for everything else.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
Some comments.
lib/CodeGen/BlockGenerators.cpp | ||
---|---|---|
387 ↗ | (On Diff #32253) | This is needed as we otherwise think the new instructions are escape users, however I think we have to create enforce a new and initially empty entering block now. |
lib/Support/ScopHelper.cpp | ||
286 ↗ | (On Diff #32253) | Not is missing. |
349 ↗ | (On Diff #32253) | left over |
The SCEVExpander cannot deal with all SCEVs Polly allows in all kinds
of expressions. To this end we introduce a ScopExpander that handles
the additional expressions separatly and falls back to the
SCEVExpander for everything else.
Shouldn't you just enhance SCEVExpander?
The SCEVExpander cannot deal with all SCEVs Polly allows in all kinds
of expressions. To this end we introduce a ScopExpander that handlesthe additional expressions separatly and falls back to theSCEVExpander for everything else.Shouldn't you just enhance SCEVExpander?
Hi Hal,
thanks for jumping in. The point is that Polly starts to support thinks that go beyond what we
expression in SCEV generally. The current expressions we are facing, sDiv and sRem, could probably be added
to SCEV and consequently the SCEVExpander as well, but the next step will be piecewise expressions
such as a < b ? c : d. I am less convinced this really needs to be added to SCEV. Or at least,
I feel that we better keep this in Polly for a while, gain some experience, and move it if the
added value of this becomes really visible. In general I am trying to push as much of the stuff we
are doing directly into LLVM, but at the same time I feel we need to make sure we don't create too
much code that is only useful for Polly ATM.
I would be interested if you agree/disagree with this reasoning.
Best,
Tobias
No problem ;)
The point is that Polly starts to support thinks that go beyond what we
expression in SCEV generally. The current expressions we are facing, sDiv and sRem, could probably be added
to SCEV and consequently the SCEVExpander as well, but the next step will be piecewise expressions
such as a < b ? c : d. I am less convinced this really needs to be added to SCEV. Or at least,
I feel that we better keep this in Polly for a while, gain some experience, and move it if the
added value of this becomes really visible. In general I am trying to push as much of the stuff we
are doing directly into LLVM, but at the same time I feel we need to make sure we don't create too
much code that is only useful for Polly ATM.I would be interested if you agree/disagree with this reasoning.
How are you representing these new SCEV-like nodes? This code seems to look though SCEV unknowns that happen to be sdiv/srem instructions and re-expand the operands. I don't understand why the operands would not otherwise be available (unless these are uninserted instructions -- in which case, I'd think that adding more SCEV subclasses would be a better design -- but maybe I'm overlooking some downside to that).
Best,
Tobias
lib/Support/ScopHelper.cpp | ||
---|---|---|
292 ↗ | (On Diff #32253) | I don't grok how Polly uses SCEVs, but won't LHS be identical to Inst->getOperand(0) here? Likewise for RHS and Inst->getOperand(1)? Why can't this function just return E->getValue() (after which ScopExpander::expandCodeFor should become equivalent to SCEVExpander::expandCodeFor)? |
Let me try to comment on both questions (by Hal as well as Sanjoy) at once.
@hfinkel regarding how we model these additions to SCEV:
We use SCEVs as an intermediate step from IR to isl expressions (more precicly isl piecewise affince functions [isl_pw_aff]). However, these piecewise affine functions can express things SCEVs cannot and vice versa. With SDiv and SRem (where the right hand side is a constant) we allowed Polly to represent more than SCEVs as piecewise affine functions. To this end we have to check the SCEVUnknowns each time and decide if they are parameters of the region we analyze (then we treat them as unknown values) or if we can model them. The latter is only supported for the SDiv and SRem at the moment but will be extended at some point to more (e.g. there is a patch that allows bitwise operations etc.).
@sanjoy & @hfinkel regarding this patch and its purpose:
So far the modeling as described above was sufficent to represent SDiv and SRem operations in the mathematical model and to generate a new, optimized code from it. However, recently we noticed a bug in our code generation if the SDiv or SRem operation are part of a parameter that we assume to be copyable. That means we think we can just copy the (side effect free!) computation of the parameter in front of our optimized code version without worrying where the orginal computation was. But if there was a SDiv or SRem in the computation and these did not dominate the analyzed region but where part of it, we generated invalid code when we expanded the parameter SCEV in front of the optimized code region. It was invalid because it referenced the SDiv or SRem in the original region that did not dominate the optimized region. [Note that during our analysis we looked through the SDiv and SRem instruction but SCEVExpander stopped with them]. To this end we introduced the ScopExpander which will expand SCEVs as the SCEVExpander but in case of SDiv or SRem instructions move the expansions in front of the SCoP and expand the SDiv or SRem there instead of just using the original instruction.
AFAIU you always insert the new instructions into the entering block. What if the argument of SDiv/SRem depends of an inner loop induction variable (An point raised by Tobias in a mailing list discussion)?
lib/Support/ScopHelper.cpp | ||
---|---|---|
261 ↗ | (On Diff #32271) | ScopExpander doesn't need the Scop, only the Region. |
294 ↗ | (On Diff #32271) | Introduce getStartIP() which sets StartIP if nullptr ? |
Hi Johannes,
this patch goes in the right direction. I have a couple of comments, but most of them are minor. Can you submit another version, such that I can have a final look. (mostly interested in the handleOutsideUsers test case).
Thank you,
Tobias
include/polly/CodeGen/IslExprBuilder.h | ||
---|---|---|
96 ↗ | (On Diff #32271) | Expander is not a parameter any more. |
include/polly/Support/ScopHelper.h | ||
82 ↗ | (On Diff #32271) | "all SCEV additions"? you mean extended "to handle Polly specific operations not handled by SCEVExpander"? |
84 ↗ | (On Diff #32271) | internally |
89 ↗ | (On Diff #32271) | It might still make sense to document the parameters briefly. At least I do not know them by heart. |
lib/CodeGen/BlockGenerators.cpp | ||
127 ↗ | (On Diff #32271) | Do we happen to have a test case covering this use? Meaning a loop-variant srem node with data-dependences accross basic-blocks that would crash without this addition? |
387 ↗ | (On Diff #32271) | If I drop this change, all test cases pass. Could you possibly add a test case that illustrates why this change is needed. (It is not 100% clear to me). |
lib/Support/ScopHelper.cpp | ||
258 ↗ | (On Diff #32271) | Maybe add a comment that explains why we have to extend the SCEVExpander (we can probably resue the replies to Hal and Sanjay). |
268 ↗ | (On Diff #32271) | Why do we not use the ScopExpander inside the region. Even in the region we may generate code at a location which is not dominated by all SCEVUnknown expressions in the SCEV. |
270 ↗ | (On Diff #32271) | Why don't we use the ScopExpander in the region? Even inside the region we |
Comments and the new version will follow shortly.
include/polly/Support/ScopHelper.h | ||
---|---|---|
82 ↗ | (On Diff #32271) | Yes, I rephrased. |
lib/CodeGen/BlockGenerators.cpp | ||
127 ↗ | (On Diff #32271) | No and I do not think we would crash here anyway. I think only the SDiv/SRem is a parameter of the SCoP case can crash. However, I changed all Expander locations for two reasons:
If you think it is better to only use the new interface where it is necessary I am almost certain this change and the ones in the IslExprBuilder can be avoided. |
387 ↗ | (On Diff #32271) | I am unsure because I cannot reproduce it atm. I will remove it run lnt and if it passes I will just keep this in mind. |
lib/Support/ScopHelper.cpp | ||
270 ↗ | (On Diff #32271) | Like I said above. I don't think so. Even if that might be the case, the current handling might not work then. For now I see 3 cases we expands SCEVs:
If the above observation is correct the code should be too. If not or we extend/change Polly at some point the new structure should allows us to modify the behavior of the SCEV expansion pretty easily. |
294 ↗ | (On Diff #32271) | Done in the constructor. |
jdoerfert wrote:
grosser wrote:
Do we happen to have a test case covering this use? Meaning a loop-variant srem node with data-dependences accross basic-blocks that would crash without this addition?
No and I do not think we would crash here anyway. I think only the SDiv/SRem is a parameter of the SCoP case can crash. However, I changed all Expander locations for two reasons:
- We have a clear interface now that hides how we actually expand code.
- We have one location where we need to handle new features later on.
OK, that's fine.
Comment at: lib/Support/ScopHelper.cpp:270
@@ +269,3 @@
+ if (!S.getRegion().contains(I))
+ E = visit(E);+ return Expander.expandCodeFor(E, Ty, I);
grosser wrote:
Why don't we use the ScopExpander in the region? Even inside the region we
may work at places where I does not dominate all SCEVUnknowns, no?Like I said above. I don't think so. Even if that might be the case, the current handling might not work then. For now I see 3 cases we expands SCEVs:
- IslExprBuilder: Here we create (mainly) loop bounds etc. (inside the region) and the problem should not occure since we generate the full expression (except the parameters) from scratch.
- BlockGenerator: Here we synthezise values during code generation (inside the region) but again, everything except the parameters are generated from scratch (afaik).
- IslNodeBuilder: Here we generate code for the parameters (in front of the region) and we can crash. However, only if the parameters contains instructions in the region.
If the above observation is correct the code should be too. If not or we extend/change Polly at some point the new structure should allows us to modify the behavior of the SCEV expansion pretty easily.
A test case for this is test/Isl/CodeGen/srem-in-other-bb.ll.
It works today as we IndependentBlocks just moves the full operand tree into the
right location. If we disable independent blocks entirely, scalar dependences
are introduced and the values are propagated via memory. So it seems the
way we are currently handling such cases is save, but could be improved in the future
by possibly dropping independent blocks entirely, not model the memory dependences and
then fully expand these extended SCEV expressions during code generation.
However, I believe this is out of scope for this patch.
Feel free to commit if this patch passes LNT for you.
Best,
Tobias
jdoerfert updated this revision to Diff 32394.
jdoerfert marked 4 inline comments as done.
jdoerfert added a comment.Added test cases and modified according to the review.
http://reviews.llvm.org/D12066
Files:
include/polly/CodeGen/IslExprBuilder.h include/polly/CodeGen/IslNodeBuilder.h include/polly/Support/ScopHelper.h lib/CodeGen/BlockGenerators.cpp lib/CodeGen/IslExprBuilder.cpp lib/CodeGen/IslNodeBuilder.cpp lib/Support/ScopHelper.cpp test/Isl/CodeGen/inner_scev.ll test/Isl/CodeGen/inner_scev_2.ll test/Isl/CodeGen/inner_scev_sdiv_1.ll test/Isl/CodeGen/inner_scev_sdiv_2.ll test/Isl/CodeGen/inner_scev_sdiv_3.ll test/Isl/CodeGen/inner_scev_sdiv_in_lb.ll test/Isl/CodeGen/inner_scev_sdiv_in_lb_invariant.ll test/Isl/CodeGen/inner_scev_sdiv_in_rtc.ll
+/ The SCEVExpander will not generate any code for an existing SDiv/SRem
+/ instruction but just use it, if it is references as a SCEVUnknown. We want
referenced
+/ however to generate new code if the instruction is in the analyzed region
+/ and we generate code outside/infront of that region. Hence, we generate the
+/ code for the SDiv/SRem operands in front of the analyzed region and then
+/ create a new SDiv/SRem operation there too.
+struct ScopExpander : SCEVVisitor<ScopExpander, const SCEV *> {
+ friend struct SCEVVisitor<ScopExpander, const SCEV *>;
+
+ explicit ScopExpander(const Region &R, ScalarEvolution &SE,
+ const DataLayout &DL, const char *Name)
+ : Expander(SCEVExpander(SE, DL, Name)), SE(SE), Name(Name), R(R) {}
+
+ Value *expandCodeFor(const SCEV *E, Type *Ty, Instruction *I) {
+ If we generate code in the region we will immediately fall back to the
+ SCEVExpander, otherwise we will stop at all unknowns in the SCEV and if
+ needed replace them by copies computed in the entering block.
+ if (!R.contains(I))
+ E = visit(E);
+ return Expander.expandCodeFor(E, Ty, I);
+ }
+
+private:
+ SCEVExpander Expander;
+ ScalarEvolution &SE;
+ const char *Name;
+ const Region &R;
+
+ const SCEV *visitUnknown(const SCEVUnknown *E) {
+ Instruction *Inst = dyn_cast<Instruction>(E->getValue());
+ if (!Inst || (Inst->getOpcode() != Instruction::SRem &&
+ Inst->getOpcode() != Instruction::SDiv))
+ return E;
+
+ if (!R.contains(Inst))
+ return E;
+
+ Instruction *StartIP = R.getEnteringBlock()->getTerminator();
+
+ const SCEV *LHSScev = visit(SE.getSCEV(Inst->getOperand(0)));
+ const SCEV *RHSScev = visit(SE.getSCEV(Inst->getOperand(1)));
+
+ Value *LHS = Expander.expandCodeFor(LHSScev, E->getType(), StartIP);
+ Value *RHS = Expander.expandCodeFor(RHSScev, E->getType(), StartIP);
+
+ Inst = BinaryOperator::Create((Instruction::BinaryOps)Inst->getOpcode(),
+ LHS, RHS, Inst->getName() + Name, StartIP);
+ return SE.getSCEV(Inst);
+ }
+
+ / The following functions will just traverse the SCEV and rebuild it with
+ / the new operands returned by the traversal.
+ /
+ /{
+ const SCEV *visitConstant(const SCEVConstant *E) { return E; }
+ const SCEV *visitTruncateExpr(const SCEVTruncateExpr *E) {
+ return SE.getTruncateExpr(visit(E->getOperand()), E->getType());
+ }
+ const SCEV *visitZeroExtendExpr(const SCEVZeroExtendExpr *E) {
+ return SE.getZeroExtendExpr(visit(E->getOperand()), E->getType());
+ }
+ const SCEV *visitSignExtendExpr(const SCEVSignExtendExpr *E) {
+ return SE.getSignExtendExpr(visit(E->getOperand()), E->getType());
+ }
+ const SCEV *visitUDivExpr(const SCEVUDivExpr *E) {
+ return SE.getUDivExpr(visit(E->getLHS()), visit(E->getRHS()));
+ }
+ const SCEV *visitAddExpr(const SCEVAddExpr *E) {
+ SmallVector<const SCEV *, 4> NewOps;
+ for (const SCEV *Op : E->operands())
+ NewOps.push_back(visit(Op));
+ return SE.getAddExpr(NewOps);
+ }
+ const SCEV *visitMulExpr(const SCEVMulExpr *E) {
+ SmallVector<const SCEV *, 4> NewOps;
+ for (const SCEV *Op : E->operands())
+ NewOps.push_back(visit(Op));
+ return SE.getMulExpr(NewOps);
+ }
+ const SCEV *visitUMaxExpr(const SCEVUMaxExpr *E) {
+ SmallVector<const SCEV *, 4> NewOps;
+ for (const SCEV *Op : E->operands())
+ NewOps.push_back(visit(Op));
+ return SE.getUMaxExpr(NewOps);
+ }
+ const SCEV *visitSMaxExpr(const SCEVSMaxExpr *E) {
+ SmallVector<const SCEV *, 4> NewOps;
+ for (const SCEV *Op : E->operands())
+ NewOps.push_back(visit(Op));
+ return SE.getSMaxExpr(NewOps);
+ }
+ const SCEV *visitAddRecExpr(const SCEVAddRecExpr *E) {
+ SmallVector<const SCEV *, 4> NewOps;
+ for (const SCEV *Op : E->operands())
+ NewOps.push_back(visit(Op));
+ return SE.getAddRecExpr(NewOps, E->getLoop(), E->getNoWrapFlags());
+ }
+ /}
+};
+/// This wrapper will internally call the SCEVExpander but also make sure that
makes sure
Tobias