When SelectionDAG performs CSE it is possible that the context's source location is different from that of the selected node. This can lead to incorrect line number records. This patch proposes to update the debug location to the one that occurs earlier in the instruction sequence. This is in order to address PR21006 (oscillating line number records due to CSE performed on constant pool loads).
Details
Diff Detail
Event Timeline
Hi,
Are there any reasons to implement the check in separate function instead of
changing 3-argument overload of SelectionDAG::FindNodeOrInsertPos to handle
nodes of all kinds (that would require updating its comment a bit)? If I
counted correctly, the changes cover 20 of 27 uses of that overload and it might
make sense to keep debug location erasure in one place to make it less
error-prone, unless there are reasons not to do this.
Regards,
Sergey
This is what I intended to do at first, but there are several contexts where the call to FindNodeOrInsertPos() is used as a query rather than with an intent to eliminate common subexpressions (e.g. all 3 versions of FindModifiedNodeSlot(), and getNodeIfExists()). It would be undesirable to possibly null out the debugLoc in those contexts, and indeed one llvm test broke by degrading the line number info when I did it this way.
That's not to say filtering the debugLoc couldn't be centralized but it would probably require some rework of the Constant node handling as well. I'm certainly open to that.
The other reason is more stylistic. I'm uncomfortable with having a function called "FindNodeOrInsertPos()", which is meant to retrieve something, have a significant side effect such as nulling out the debug location, when it's not well-defined inside the function how the result is used.
Not sure about getNodeIfExists(), here is part of DAGCombiner:
if (const auto *BinNode = dyn_cast<BinaryWithFlagsSDNode>(N)) { CSENode = DAG.getNodeIfExists(N->getOpcode(), N->getVTList(), Ops, &BinNode->Flags); } else { CSENode = DAG.getNodeIfExists(N->getOpcode(), N->getVTList(), Ops); }
Regarding FindModifiedNodeSlot():
// See if the modified node already exists. void *InsertPos = nullptr; if (SDNode *Existing = FindModifiedNodeSlot(N, Op, InsertPos)) return Existing;
So, if new node already exists, reuse it in different place. Looks like we need to drop old debug location here as well.
I guess the rest are from functions that create constant nodes.
As for the failing test, maybe it's relying on something it shouldn't? In general, cached nodes are queried for reuse, I'm not sure there are other valid cases, that's why I'm a bit suspicious about leaving some cases unhandled.
The other reason is more stylistic. I'm uncomfortable with having a function called "FindNodeOrInsertPos()", which is meant to retrieve something, have a significant side effect such as nulling out the debug location, when it's not well-defined inside the function how the result is used.
I agree, changing naming shouldn't be a problem, if you have something suitable on your mind. The interface will differ a bit for different queries, but that's not an issue as for me.
Thank you for your comments, Sergey. If find that if I outfit getIfExists() with a debugLoc() parameter and pass it through, the failure goes away, so I have revised the patch to update the node from within the FindNode helper, dispensing with the FilterDebugLoc() routine. I've renamed the helper routine to reflect that it updates the node and took the liberty to adjust the comment a bit.
Thanks for the update, LGTM with returned first line of SelectionDAG.cpp.
lib/CodeGen/SelectionDAG/SelectionDAG.cpp | ||
---|---|---|
1 | You probably removed this by accident. |
Oops, thank you for noticing that and thank you for the review.
Would it be possible for you to commit this change for me? I don't have commit access yet.
I've made an inline comment on this, if you could reply that would be good. Please don't commit this yet.
Thanks!
-eric
lib/CodeGen/SelectionDAG/SelectionDAG.cpp | ||
---|---|---|
997–1001 | This is an interesting perspective. I'm not sure I agree though. Things like CSE'ing of nodes can cause attribution of expressions to a line that does an equivalent computation, but this change is going to associate the expression to a line that doesn't do the expression at all in the general case. How is this better than the current status quo? |
lib/CodeGen/SelectionDAG/SelectionDAG.cpp | ||
---|---|---|
997–1001 | I have to agree that it's not a good solution, and the only reason I prefer it over the status quo is that I haven't been able to come up with a case where an expression is actually attributed to a line that doesn't compute it (likely the preceding line), whereas there's at least one case that suffers from the status quo (PR21006). Anecdotal evidence, I know. |
Given the concerns about nulling out the debug location here is a different attempt: We determine from the context whether the place where the node is reused is located earlier in the sequence of instructions. To this end we're using the SDNodes' IROrder which reflects the sequence of IR instructions. This required adjustment of the interface of FindNodeOrInsertPos() because we need an SDLoc instead of just a DebugLoc.
Sergey, I wasn't sure if we can subsume Constant nodes under this change, but it seemed that it's still better to null out the DebugLocation in those cases.
So basically among all uses of the same CSEed node this picks debug location from the earliest one? And original issue could be rephrased as: debug location of CSEed nodes is picked non-deterministically. If that's the case, this seems to be really nice change.
Sergey, I wasn't sure if we can subsume Constant nodes under this change, but it seemed that it's still better to null out the DebugLocation in those cases.
I agree, and it's safer to leave it as is, at least until counter example is provided.
Could you do an upload with full context please? It's a bit hard to pick out bits of the selection dag changes. In general I think the idea is fine, but the default: case could use a comment while you're at it.
Thanks!
-eric
One inline comment.
Thanks!
-eric
lib/CodeGen/SelectionDAG/SelectionDAG.cpp | ||
---|---|---|
853 | I'm unclear why we have a default SDLoc here? |
Adressed review comment on default SDLoc() in getNodeIfExists(). This routine is used in an attempt to take advantage of existing commutative binary nodes by the DAG combiner. No reason to not update the debug location if the node is reused.
lib/CodeGen/SelectionDAG/SelectionDAG.cpp | ||
---|---|---|
853 | I didn't author the original code, but I see no reason to not update the DebugLoc. This routine is called from DAGCombine in an attempt to reuse a binary commutative node. Thanks for pointing this out. |
Updated diff against r265709. No additional functional changes. There is no need to patch DAGCombiner.cpp any more because of an interface change to getNodeIfExists(). Apologies for leaving this dormant for a while.