This is an archive of the discontinued LLVM Phabricator instance.

[Utils] Skip RemoveRedundantDbgInstrs in MergeBlockIntoPredecessor (PR47746)
ClosedPublic

Authored by vsk on Oct 6 2020, 2:33 PM.

Details

Summary

This patch changes MergeBlockIntoPredecessor to skip the call to
RemoveRedundantDbgInstrs, in effect partially reverting D71480 due to
some compile-time issues spotted in LoopUnroll and SimplifyCFG.

The call to RemoveRedundantDbgInstrs appears to have changed the
worst-case behavior of the merging utility. Loosely speaking, it seems
to have gone from O(#phis) to O(#insts).

It might not be possible to mitigate this by scanning a block to
determine whether there are any debug intrinsics to remove, since such a
scan costs O(#insts).

So: skip the call to RemoveRedundantDbgInstrs. There's surprisingly
little fallout from this, and most of it can be addressed by doing
RemoveRedundantDbgInstrs later. The exception is (the block-local
version of) SimplifyCFG, where it might just be too expensive to call
RemoveRedundantDbgInstrs.

Diff Detail

Event Timeline

vsk created this revision.Oct 6 2020, 2:33 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 6 2020, 2:33 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
vsk requested review of this revision.Oct 6 2020, 2:33 PM
fhahn added a comment.Oct 7 2020, 1:57 AM

So: skip the call to RemoveRedundantDbgInstrs. There's surprisingly little fallout from this, and most of it can be addressed by doing RemoveRedundantDbgInstrs later.

As an alternative, would it make sense to have a dedicated pass to eliminate redundant debug instructions, that's run late in the pipeline? not sure if running it once, but for every function will be better or worse in the end. It should be easy to skip for functions without debug info though.

bjope added a comment.Oct 7 2020, 2:18 AM

So: skip the call to RemoveRedundantDbgInstrs. There's surprisingly little fallout from this, and most of it can be addressed by doing RemoveRedundantDbgInstrs later.

As an alternative, would it make sense to have a dedicated pass to eliminate redundant debug instructions, that's run late in the pipeline? not sure if running it once, but for every function will be better or worse in the end. It should be easy to skip for functions without debug info though.

I don't remember exactly, but isn't one goal with removing the redundant dbg instructions early to speed up compilation (when -g is used). It's like saying that we shouldn't remove dead code until late in the pipeline to me (spending time on optimizing the dead code all the way through opt).

Looks like it has been unfortunate to do the RemoveRedundantDbgInstrs inside MergeBlockIntoPredecessor for the CodeGenPrepare use-case, as it can end up merging blocks in several steps (resulting in bad complexity). So for example having an option to MergeBlockIntoPredecessor (or a wrapper) to be able to postpone the cleanup to afterwards in such use-cases seems like a good idea to me.
There actually is a pass that only is used in lit tests (afaik) that is doing this kind of cleanup (RedundantDbgInstElimination). But if we want to cleanup along the way, we might end up wanting to run that pass after every pass that may introduce redundant dbg instrs. In the end that might be more costly, since we wouldn't know if the earlier passes actually did any transformations or not. The currently implementation tries to limit the amount of cleanup by doing it at transforms that are known to introduce redundant dbg instrs.

One thing I realize is that we probaly spend time looking for dbg instrs also in non-debug builds, right? I'm not sure if there is a quick way to check if the IR has dbg instrs or not, to make an early return in RemoveRedundantDbgInstrs if the module/function/basicblock is known to not having any dbg instrs. Any ideas on how to do that?

vsk added a comment.EditedOct 7 2020, 10:15 AM

So: skip the call to RemoveRedundantDbgInstrs. There's surprisingly little fallout from this, and most of it can be addressed by doing RemoveRedundantDbgInstrs later.

As an alternative, would it make sense to have a dedicated pass to eliminate redundant debug instructions, that's run late in the pipeline? not sure if running it once, but for every function will be better or worse in the end. It should be easy to skip for functions without debug info though.

I don't remember exactly, but isn't one goal with removing the redundant dbg instructions early to speed up compilation (when -g is used). It's like saying that we shouldn't remove dead code until late in the pipeline to me (spending time on optimizing the dead code all the way through opt).

Right.

Looks like it has been unfortunate to do the RemoveRedundantDbgInstrs inside MergeBlockIntoPredecessor for the CodeGenPrepare use-case, as it can end up merging blocks in several steps (resulting in bad complexity). So for example having an option to MergeBlockIntoPredecessor (or a wrapper) to be able to postpone the cleanup to afterwards in such use-cases seems like a good idea to me.

An obstacle here is that code that calls MergeBlockIntoPredecessor tends not to have enough context to pass in a flag to enable RemoveRedundantDbgInstrs at the right time. I can think of two examples:

  • SimplifyCFGOpt::simplifyOnce(BB) can do block merging repeatedly. It seems better to call RemoveRedundantDbgInstrs just once in SimplifyCFGOpt::run() after all changes are done.
  • CodeGenPrepare::eliminateFallthrough() can also do block merging repeatedly, but it seems better to call RemoveRedundantDbgInstrs after all the merging is done.

There actually is a pass that only is used in lit tests (afaik) that is doing this kind of cleanup (RedundantDbgInstElimination). But if we want to cleanup along the way, we might end up wanting to run that pass after every pass that may introduce redundant dbg instrs. In the end that might be more costly, since we wouldn't know if the earlier passes actually did any transformations or not. The currently implementation tries to limit the amount of cleanup by doing it at transforms that are known to introduce redundant dbg instrs.

One thing I realize is that we probaly spend time looking for dbg instrs also in non-debug builds, right? I'm not sure if there is a quick way to check if the IR has dbg instrs or not, to make an early return in RemoveRedundantDbgInstrs if the module/function/basicblock is known to not having any dbg instrs. Any ideas on how to do that?

I don't think there's a quick way to check whether a function/basicblock has debug instructions (e.g. you can always inline dbg intrinsics into a function without a !dbg attachment). We could change the IR to make that kind of lookup fast. Strawman idea: we could add a NumDebugInstrs bit to BasicBlock or Function, just like llvm::Value has a IsUsedByMD bit, and keep it up-to-date whenever the BasicBlock/Function ilist changes. But I think that idea fails two tests: it's not very general (seems very specific to this problem), and it moves us away from the high-level direction of removing debug insts from the instruction stream.

Assuming I understand the history correctly, this patch in itself won't introduce any inaccuracy into debug-info, and has a motivating performance example, so LGTM.

As an aside, Vedant wrote:

I don't think there's a quick way to check whether a function/basicblock has debug instructions (e.g. you can always inline dbg intrinsics into a function without a !dbg attachment).

(Genuine question:) wouldn't dbg intrinsics be useless without a DISubprogram attached to the Function though, seeing how there wouldn't be a DW_AT_subprogram in the dwarf for variables to be placed in?

vsk added a comment.Oct 12 2020, 10:27 AM

I don't think there's a quick way to check whether a function/basicblock has debug instructions (e.g. you can always inline dbg intrinsics into a function without a !dbg attachment).

(Genuine question:) wouldn't dbg intrinsics be useless without a DISubprogram attached to the Function though, seeing how there wouldn't be a DW_AT_subprogram in the dwarf for variables to be placed in?

Yes, unless the dbg intrinsic is inlined into a function that does have a DISubprogram (and its scope is correctly fixed up).

I pushed D89818, which is based on the patch that I submitted to my downstream to fix this issue.

I don't know if this patch as-is exhaustively removes redundant debug instructions in all cases it did before, or if we care, so I can't approve it, but I am OK with this approach.

I would like to see this issue addressed sooner than later though. If this patch is merged, then D89818 should be abandoned.

vsk added a comment.Oct 20 2020, 4:33 PM

I don't know if this patch as-is exhaustively removes redundant debug instructions in all cases it did before, or if we care, so I can't approve it, but I am OK with this approach.

I would like to see this issue addressed sooner than later though. If this patch is merged, then D89818 should be abandoned.

@ctetreau thanks for taking a look. This patch changes llvm to remove redundant dbg insts in fewer instances than it did before, but I think that's ok. Looking through the history:

  • D71480 replaced some hand-rolled duplicated dbg inst removal code with a call to RemoveRedundantDbgInstrs
  • The original removal code was added for llvm.org/PR35113 ; I've fixed up LoopRotationUtils.cpp to avoid breaking the associated test

As I mentioned in D89818, I think the RemoveRedundantDbgInstrs behavior should be opt-in as it has a compile-time cost.

I was hoping for another +1 after Jeremy lgtm'd this change to be safe. Seeing as this appears to be time-sensitive, if there aren't any concerns/objections I'll land this by end of day tomorrow (PST).

vsk edited the summary of this revision. (Show Details)Oct 20 2020, 4:34 PM
In D88928#2343175, @vsk wrote:

I was hoping for another +1 after Jeremy lgtm'd this change to be safe. Seeing as this appears to be time-sensitive, if there aren't any concerns/objections I'll land this by end of day tomorrow (PST).

It's not *that* time sensitive, it can wait a few days. I just don't want it to fall through the cracks, and I pushed my patch because progress seemed to have stalled (and this patch was billed as being exploratory).

ctetreau accepted this revision.Oct 27 2020, 9:40 AM

Looks good to me.

This revision is now accepted and ready to land.Oct 27 2020, 9:40 AM
This revision was landed with ongoing or failed builds.Oct 27 2020, 10:13 AM
This revision was automatically updated to reflect the committed changes.