This is an archive of the discontinued LLVM Phabricator instance.

Avoid the propagation of debug locations in SelectionDAG via CSE
ClosedPublic

Authored by wolfgangp on Aug 17 2015, 6:02 PM.

Details

Summary

When SelectionDAG performs CSE it is possible that the context's source location is different from that of the selected node. This can lead to incorrect line number records. This patch proposes to update the debug location to the one that occurs earlier in the instruction sequence. This is in order to address PR21006 (oscillating line number records due to CSE performed on constant pool loads).

Diff Detail

Repository
rL LLVM

Event Timeline

wolfgangp updated this revision to Diff 32358.Aug 17 2015, 6:02 PM
wolfgangp retitled this revision from to Avoid the propagation of debug locations in SelectionDAG via CSE.
wolfgangp updated this object.
wolfgangp added reviewers: echristo, sdmitrouk.
wolfgangp added a subscriber: llvm-commits.
sdmitrouk edited edge metadata.Aug 18 2015, 3:02 AM

Hi,

Are there any reasons to implement the check in separate function instead of
changing 3-argument overload of SelectionDAG::FindNodeOrInsertPos to handle
nodes of all kinds (that would require updating its comment a bit)? If I
counted correctly, the changes cover 20 of 27 uses of that overload and it might
make sense to keep debug location erasure in one place to make it less
error-prone, unless there are reasons not to do this.

Regards,
Sergey

asl added a subscriber: asl.Aug 18 2015, 6:50 AM

Hi,

Are there any reasons to implement the check in separate function instead of
changing 3-argument overload of SelectionDAG::FindNodeOrInsertPos to handle
nodes of all kinds (that would require updating its comment a bit)? If I
counted correctly, the changes cover 20 of 27 uses of that overload and it might
make sense to keep debug location erasure in one place to make it less
error-prone, unless there are reasons not to do this.

Regards,
Sergey

This is what I intended to do at first, but there are several contexts where the call to FindNodeOrInsertPos() is used as a query rather than with an intent to eliminate common subexpressions (e.g. all 3 versions of FindModifiedNodeSlot(), and getNodeIfExists()). It would be undesirable to possibly null out the debugLoc in those contexts, and indeed one llvm test broke by degrading the line number info when I did it this way.
That's not to say filtering the debugLoc couldn't be centralized but it would probably require some rework of the Constant node handling as well. I'm certainly open to that.

The other reason is more stylistic. I'm uncomfortable with having a function called "FindNodeOrInsertPos()", which is meant to retrieve something, have a significant side effect such as nulling out the debug location, when it's not well-defined inside the function how the result is used.

This is what I intended to do at first, but there are several contexts where the call to FindNodeOrInsertPos() is used as a query rather than with an intent to eliminate common subexpressions (e.g. all 3 versions of FindModifiedNodeSlot(), and getNodeIfExists()). It would be undesirable to possibly null out the debugLoc in those contexts, and indeed one llvm test broke by degrading the line number info when I did it this way.
That's not to say filtering the debugLoc couldn't be centralized but it would probably require some rework of the Constant node handling as well. I'm certainly open to that.

Not sure about getNodeIfExists(), here is part of DAGCombiner:

if (const auto *BinNode = dyn_cast<BinaryWithFlagsSDNode>(N)) {
  CSENode = DAG.getNodeIfExists(N->getOpcode(), N->getVTList(), Ops,
                                &BinNode->Flags);
} else {
  CSENode = DAG.getNodeIfExists(N->getOpcode(), N->getVTList(), Ops);
}

Regarding FindModifiedNodeSlot():

// See if the modified node already exists.
void *InsertPos = nullptr;
if (SDNode *Existing = FindModifiedNodeSlot(N, Op, InsertPos))
  return Existing;

So, if new node already exists, reuse it in different place. Looks like we need to drop old debug location here as well.

I guess the rest are from functions that create constant nodes.

As for the failing test, maybe it's relying on something it shouldn't? In general, cached nodes are queried for reuse, I'm not sure there are other valid cases, that's why I'm a bit suspicious about leaving some cases unhandled.

The other reason is more stylistic. I'm uncomfortable with having a function called "FindNodeOrInsertPos()", which is meant to retrieve something, have a significant side effect such as nulling out the debug location, when it's not well-defined inside the function how the result is used.

I agree, changing naming shouldn't be a problem, if you have something suitable on your mind. The interface will differ a bit for different queries, but that's not an issue as for me.

wolfgangp updated this revision to Diff 32779.Aug 20 2015, 5:54 PM
wolfgangp edited edge metadata.

Thank you for your comments, Sergey. If find that if I outfit getIfExists() with a debugLoc() parameter and pass it through, the failure goes away, so I have revised the patch to update the node from within the FindNode helper, dispensing with the FilterDebugLoc() routine. I've renamed the helper routine to reflect that it updates the node and took the liberty to adjust the comment a bit.

sdmitrouk accepted this revision.Aug 21 2015, 5:39 AM
sdmitrouk edited edge metadata.

Thanks for the update, LGTM with returned first line of SelectionDAG.cpp.

lib/CodeGen/SelectionDAG/SelectionDAG.cpp
1 ↗(On Diff #32779)

You probably removed this by accident.

This revision is now accepted and ready to land.Aug 21 2015, 5:39 AM
wolfgangp updated this revision to Diff 32841.Aug 21 2015, 10:14 AM
wolfgangp edited edge metadata.

Added back the first line of SelectionDAG.cpp, which was accidentally removed.

Thanks for the update, LGTM with returned first line of SelectionDAG.cpp.

Oops, thank you for noticing that and thank you for the review.

Would it be possible for you to commit this change for me? I don't have commit access yet.

echristo requested changes to this revision.Aug 21 2015, 11:27 AM
echristo edited edge metadata.

I've made an inline comment on this, if you could reply that would be good. Please don't commit this yet.

Thanks!

-eric

lib/CodeGen/SelectionDAG/SelectionDAG.cpp
997–1001 ↗(On Diff #32841)

This is an interesting perspective. I'm not sure I agree though. Things like CSE'ing of nodes can cause attribution of expressions to a line that does an equivalent computation, but this change is going to associate the expression to a line that doesn't do the expression at all in the general case. How is this better than the current status quo?

This revision now requires changes to proceed.Aug 21 2015, 11:27 AM
wolfgangp added inline comments.Aug 21 2015, 2:09 PM
lib/CodeGen/SelectionDAG/SelectionDAG.cpp
997–1001 ↗(On Diff #32841)

I have to agree that it's not a good solution, and the only reason I prefer it over the status quo is that I haven't been able to come up with a case where an expression is actually attributed to a line that doesn't compute it (likely the preceding line), whereas there's at least one case that suffers from the status quo (PR21006). Anecdotal evidence, I know.

wolfgangp updated this revision to Diff 33837.Sep 2 2015, 11:55 AM
wolfgangp edited edge metadata.

Given the concerns about nulling out the debug location here is a different attempt: We determine from the context whether the place where the node is reused is located earlier in the sequence of instructions. To this end we're using the SDNodes' IROrder which reflects the sequence of IR instructions. This required adjustment of the interface of FindNodeOrInsertPos() because we need an SDLoc instead of just a DebugLoc.

Sergey, I wasn't sure if we can subsume Constant nodes under this change, but it seemed that it's still better to null out the DebugLocation in those cases.

We determine from the context whether the place where the node is reused is located earlier in the sequence of instructions.

So basically among all uses of the same CSEed node this picks debug location from the earliest one? And original issue could be rephrased as: debug location of CSEed nodes is picked non-deterministically. If that's the case, this seems to be really nice change.

Sergey, I wasn't sure if we can subsume Constant nodes under this change, but it seemed that it's still better to null out the DebugLocation in those cases.

I agree, and it's safer to leave it as is, at least until counter example is provided.

wolfgangp updated this revision to Diff 40995.Nov 23 2015, 4:03 PM
wolfgangp edited edge metadata.

Ping...

Updated the test case to catch up with recent IR changes.

Could you do an upload with full context please? It's a bit hard to pick out bits of the selection dag changes. In general I think the idea is fine, but the default: case could use a comment while you're at it.

Thanks!

-eric

wolfgangp updated this revision to Diff 41707.Dec 2 2015, 6:19 PM
wolfgangp updated this object.

Thanks, Eric. Uploaded full diff and added a comment to the default: case.

One inline comment.

Thanks!

-eric

lib/CodeGen/SelectionDAG/SelectionDAG.cpp
853 ↗(On Diff #41707)

I'm unclear why we have a default SDLoc here?

wolfgangp updated this revision to Diff 41795.Dec 3 2015, 1:26 PM

Adressed review comment on default SDLoc() in getNodeIfExists(). This routine is used in an attempt to take advantage of existing commutative binary nodes by the DAG combiner. No reason to not update the debug location if the node is reused.

wolfgangp added inline comments.Dec 3 2015, 1:30 PM
lib/CodeGen/SelectionDAG/SelectionDAG.cpp
853 ↗(On Diff #41795)

I didn't author the original code, but I see no reason to not update the DebugLoc. This routine is called from DAGCombine in an attempt to reuse a binary commutative node. Thanks for pointing this out.

wolfgangp updated this revision to Diff 52984.Apr 7 2016, 5:35 PM

Updated diff against r265709. No additional functional changes. There is no need to patch DAGCombiner.cpp any more because of an interface change to getNodeIfExists(). Apologies for leaving this dormant for a while.

echristo accepted this revision.May 1 2016, 10:31 PM
echristo edited edge metadata.

LGTM. Thanks!

-eric

This revision is now accepted and ready to land.May 1 2016, 10:31 PM
This revision was automatically updated to reflect the committed changes.