Before this change only the memory accesses belonging to the roots of the
operand trees were removed, this resulted in some redundant reads, but more
importantly also caused crashes in the code generation, as instructions for
which a read access still existed where modeled as inter-statement use, which
caused polly-simplify's mark-and-sweep algorithm to not reach these statements,
which resulted in dropped instructions and consequently broken code.
Details
- Reviewers
Meinersbur bollu singam-sanjay gareevroman - Commits
- rGd6e0679c4e3a: [ForwardOp] Remove read accesses for all instructions that have been moved
rPLO312456: [ForwardOp] Remove read accesses for all instructions that have been moved
rL312456: [ForwardOp] Remove read accesses for all instructions that have been moved
Diff Detail
- Build Status
Buildable 9873 Build 9873: arc lint + arc unit
Event Timeline
Forwarded instructions may have uses remaining in the same statement, removing the MemoryAccess will leave it without a value to use and crash in codegen. This is why removing access has to be deferred to -polly-simplify.
The test-case forward_load_tripleuse doesn't crash in codegen, even with a -polly-simplify inserted. However, the result for func2 looks strange. I'll take a look.
Hi Michael,
I forgot to add the failing test case. It breaks with -polly-optree -polly-simplify -polly-codegen.
Best,
Tobias
Hi Tobias.
thank for uploading the test case.
This fixes the problem:
diff --git a/lib/Transform/ForwardOpTree.cpp b/lib/Transform/ForwardOpTree.cpp index dffd3cb1..78ff481f 100644 --- a/lib/Transform/ForwardOpTree.cpp +++ b/lib/Transform/ForwardOpTree.cpp @@ -419,7 +419,7 @@ public: // available to all instructions following in the instruction list), but // do not add another MemoryAccess. MemoryAccess *Access = TargetStmt->getArrayAccessOrNULLFor(LI); - if (Access && !DoIt) + if (Access && TargetStmt != UseStmt && !DoIt) return FD_CanForwardLeaf; if (DoIt)
We could improve the robustness of VirtualUse by giving priority to whether the value is in the instruction list (over looking for input MemoryAccesses). It would allow ignoring the input MemoryAccess when the value is available as intra- and inter-access. The explicit instruction list did not exist when VirtualUse was designed.
The entire return FD_ForwardTree might be useless for forwarding loads, we always want to forward a load if it is accessed through its scalar (which is always the case, forwardOperandTrees only ever tries to forward scalars). This might be the better patch:
diff --git a/lib/Transform/ForwardOpTree.cpp b/lib/Transform/ForwardOpTree.cpp index 894f8097..8f44cf33 100644 --- a/lib/Transform/ForwardOpTree.cpp +++ b/lib/Transform/ForwardOpTree.cpp @@ -420,7 +420,7 @@ public: // do not add another MemoryAccess. MemoryAccess *Access = TargetStmt->getArrayAccessOrNULLFor(LI); if (Access && !DoIt) - return FD_CanForwardLeaf; + return FD_CanForwardTree; if (DoIt) TargetStmt->prependInstruction(LI);
This seems to work, but my commit message is not very good yet. Can you suggest a commit message that describes the reason you prefer this solution?
I assume that not everyone subscribed to llvm-commits knows that "ForwardOpTree"/"ForwardOp" is a component of Polly. The tag "[Polly]" might still be useful.
Before this patch, OpTree did not consider forwarding an operand tree consisting of only single LoadInst as useful. The motivation was that, like an access to a read-only variable, it would just replace one MemoryAccess by another. However, in contrast to read-only accesses, this would replace a scalar access by an array access, which is something worth doing.
In addition, leaving scalar MemoryAccess is problematic in that VirtualUse prioritizes inter-Stmt use over intra-Stmt. It was possible that the same LLVM value has a MemoryAccess for accessing the remote Stmt's LoadInst as well as having the same LoadInst in its own instruction list (due to being forwarded from another operand tree).
With this patch we ensure that if a LoadInst is forwarded is any operand tree, also the operand tree containing just the LoadInst is forwarded as well, which effectively removes the scalar MemoryAccess such that only the array access remains, not both.