Page MenuHomePhabricator

[globalisel] Add lost debug locations verifier
ClosedPublic

Authored by dsanders on Apr 6 2020, 11:21 AM.

Details

Summary

This verifier tries to ensure that DebugLoc's don't just disappear as
we transform the MIR. It observes the instructions created, erased, and
changed and at checkpoints chosen by the client algorithm verifies the
locations affected by those changes.

In particular, it verifies that:

  • Every DebugLoc for an erased/changing instruction is still present on at least one new/changed instruction
  • Failing that, that there is a line-0 location in the new/changed instructions. It's not possible to confirm which locations were merged so it conservatively assumes all unaccounted for locations are accounted for by any line-0 location to avoid false positives.

If that fails, it prints the lost locations in the debug output along with
the instructions that should have accounted for them.

In theory, this is usable by the legalizer, combiner, selector and any other
pass that performs incremental changes to the MIR. However, it has so far
only really been tested on the legalizer (not including the artifact
combiner) where it has caught lots of lost locations, particularly in Custom
legalizations. The tests in this patch are those the verifier detected a problem in after applying it to all the .mir tests for AArch64 and AMDGPU.

Depends on D77575, D77446

Diff Detail

Event Timeline

dsanders created this revision.Apr 6 2020, 11:21 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 6 2020, 11:21 AM
dsanders marked an inline comment as done.Apr 6 2020, 11:28 AM
dsanders added inline comments.
llvm/lib/CodeGen/GlobalISel/Legalizer.cpp
169

The addition of this argument is rather unfortunate given the AuxObservers list. The reason is the LocObserver needs to be more tightly integrated into the algorithm than most observers. The fewer changes that are made between checkpoints, the more accurately it can detect lossage, particularly given the conservative behaviour for line-0 locations.

In the end I settled on a separate argument over finding the observer within the AuxObservers list as we currently don't have isa/dyn_cast support in observers

paquette added inline comments.Apr 6 2020, 11:32 AM
llvm/lib/CodeGen/GlobalISel/Legalizer.cpp
369

Can this use NV?

E.g. something like

R << "lost " << NV("NumLostDebugLocs", LocObserver.getNumLostDebugLocs()) << " debug locations during pass";

I think that would make optimization records easier to parse, if you choose to save them.

It would also be nice to have a testcase for the remarks.

arsenm added inline comments.Apr 6 2020, 11:32 AM
llvm/include/llvm/CodeGen/GlobalISel/LostDebugLocObserver.h
2

Missing C++ mode comment

37–40

Defined to no-op calls in debug build?

llvm/lib/CodeGen/GlobalISel/Legalizer.cpp
64

Missing static const?

370

Missing newline?

llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-fdiv.mir
3 ↗(On Diff #255416)

Do we really need to double the number of run lines in every test?

paquette added inline comments.Apr 6 2020, 11:41 AM
llvm/lib/CodeGen/GlobalISel/Legalizer.cpp
369

Although that being said, any testcase that exposes this is a bug we should fix, so... maybe it's not reasonable to test it. :)

It would be nice to know what the opt remark output looks like though.

vsk added inline comments.Apr 6 2020, 12:42 PM
llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-fdiv.mir
3 ↗(On Diff #255416)

There are some alternatives:

  • Get rid of the non-mir debugify check lines. This is fine by me, but some pass authors might not like this.
  • Run the tests with llvm-lit -Dopt='opt -debugify -Dllc='llc -run-pass=mir-debugify'`. That means only a subset of users/bots are test with debug info.
dsanders marked 5 inline comments as done.Apr 6 2020, 5:54 PM
dsanders added inline comments.
llvm/include/llvm/CodeGen/GlobalISel/LostDebugLocObserver.h
37–40

I assume you mean the non-debug build.

I suppose we could do as there's a small code-size saving from it. It shouldn't affect execution time though as we just decline to install the observer for that case

llvm/lib/CodeGen/GlobalISel/Legalizer.cpp
369

Can this use NV?

I didn't know about NV. Fixed in my working copy

It would also be nice to have a testcase for the remarks.

A test case is tricky because as you say, we should fix all the cases where this actually happens.

It would be nice to know what the opt remark output looks like though.

I'll add an example in a comment

370

As far as I know, it's not needed for remarks. The other two above don't have a newline at least

llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-fdiv.mir
3 ↗(On Diff #255416)

Do we really need to double the number of run lines in every test?

I kept it to the ones that actually reveal something but I agree it's not great to have the extra run lines. Not least because they don't actually do anything different to the main ones on non-asserts builds because the LostDebugLocObserver doesn't get installed. For that build, they're just wasting time.

I want to end up at a -verify-mir-debug-locations similar to -verify-machineinstrs (however that spelling is a bit misleading because each pass must cooperate to verify anything) or -mir-debugify-before-all -mir-strip-debug-after-all or something along those lines. At that point injecting the options via lit makes a lot of sense.

dsanders edited the summary of this revision. (Show Details)Apr 6 2020, 5:56 PM
dsanders updated this revision to Diff 255563.Apr 6 2020, 6:45 PM
dsanders marked an inline comment as done.

C++ mode comment
Use ore::NV
Fix build errors in legalizer unit test

dsanders updated this revision to Diff 256622.Apr 10 2020, 11:17 AM

Removed test changes, now covered by llvm-lit -Dllc='llc --debugify-and-strip-all-safe --debugify-level=locations'

vsk added inline comments.Apr 10 2020, 11:32 AM
llvm/lib/CodeGen/GlobalISel/LostDebugLocObserver.cpp
44

Why not delete HasUnseenLineZeroLoc and return early here?

84

Please have this return early, ideally like 'if (MI.isConstant()) return;'

I don't mean to suggest that you make tangential changes, it's just that it seems like there's a need for such a predicate (e.g. https://llvm.org/doxygen/AMDGPUInstructionSelector_8cpp_source.html#l01470 defines isConstant for MIs but leaves out G_FCONSTANT).

92

Ditto.

dsanders updated this revision to Diff 256720.Apr 10 2020, 5:43 PM
dsanders marked 4 inline comments as done.

Add missing opcodes that never have locations
Predicate
Early returns

llvm/lib/CodeGen/GlobalISel/LostDebugLocObserver.cpp
84

I looked at this again and found I'd missed G_IMPLICIT_DEF and G_GLOBAL_VALUE which also have their debug locations removed by the IRTranslator to avoid 'jumpy' behaviour in the debugger. I've added a predicate so we don't have two copies of the code in this file but it's not really meaningful for other code.

vsk added a comment.Apr 13 2020, 9:31 AM

I think this is in good shape overall, but I'll defer to someone with more GISel experience to lgtm this patch.

llvm/include/llvm/CodeGen/GlobalISel/LostDebugLocObserver.h
29

Can analyzeDebugLocations be private?

33

The checkpoint method could use some doxygen (when exactly should clients call it, and when should CheckDebugLocs be false)?

llvm/lib/CodeGen/GlobalISel/Legalizer.cpp
64

This is still missing.

dsanders updated this revision to Diff 257036.Apr 13 2020, 11:16 AM

static const
Doxygen comment
Make a function private

dsanders marked 4 inline comments as done.Apr 13 2020, 11:17 AM
aprantl added inline comments.Apr 13 2020, 12:21 PM
llvm/lib/CodeGen/GlobalISel/LostDebugLocObserver.cpp
2

Please don't put -*- C++ -*- markers into .cpp files, that's redundant. They are only needed for .h files, where it is ambiguous whether it's C or C++.

paquette accepted this revision.Apr 13 2020, 12:58 PM

After you address @aprantl's comment, I think that this looks fine.

This revision is now accepted and ready to land.Apr 13 2020, 12:58 PM

Hi Daniel,

What is the compile time impact of this change?

Is there a way to make the callers agnostic to the kind of observers we are dealing with?

Basically, I like the observer idea for checking the debug information, but I don't like that:

  1. We have to add a bunch of checkpoint calls all over the place
  2. We have to specifically carry this observer around instead of having it being with the other observers (AuxObservers argument)

Bottom line, it feels intrusive and if we have to do that everywhere I am afraid we won't be able to scale.

I am fine with moving forward with that patch for the time being, but I think we have to rethink our strategy long term.

Cheers,
-Quentin

Hi Daniel,

What is the compile time impact of this change?

That depends on VerifyDebugLocs. When it's None (default for release), there shouldn't be a measurable impact as the observer doesn't get installed. I haven't measured it for Locations (current default for asserts) yet as I've been more focused on fixing the bugs it finds in our out-of-tree target so far. I don't expect it to be noticable as it's a fairly simple matcher on small search spaces but if it is we can change the default to None at the cost of needing to use an extra option in out llvm-lit -Dllc='llc ...' command.

Is there a way to make the callers agnostic to the kind of observers we are dealing with?

In general, not really as each observer implementation will have different requirements depending on what they're for. That said, two of the three observers so far wrap logical changes (transactions except we don't really have that concept) to the IR and the third just follows the events as they happen. So while we can't really solve it in general, we could add another event to observer interface. There's no guarantee we won't need something else in the future though that causes the observers to diverge again though.

Basically, I like the observer idea for checking the debug information, but I don't like that:

  1. We have to add a bunch of checkpoint calls all over the place

There's not really any avoiding these calls. The verifier needs to check often to minimize the times it just gives up but not so often that it checks while the IR is expected to be in a broken state. Only the algorithm knows when it's safe to have the verifier check the locations. Maybe a transaction-complete style event is common enough that we can add it to the observer interface and leave it very vague as to what a transaction is.

  1. We have to specifically carry this observer around instead of having it being with the other observers (AuxObservers argument)

FWIW, it's 'as well as it being with the other observers' but that makes no real difference to your point. I dislike this too both because AuxObservers is trying to force common behaviour where it doesn't really exist but also because if it's going to be there then there shouldn't need to be separate arguments as well. It's possible to fix this but only if we design the observer interface to suit the observers that exist today rather than what they are in principle.

dsanders updated this revision to Diff 257546.Apr 14 2020, 3:56 PM

One more last-minute fix: If there was a line-0 location in the input
and the output it was matching them up. If another location folded
into that line-0 location it would falsely be considered lost despite
being handled correctly because the code to treat line-0 locations
as covering everything was not reached.

Hi Daniel,

What is the compile time impact of this change?

That depends on VerifyDebugLocs. When it's None (default for release), there shouldn't be a measurable impact as the observer doesn't get installed. I haven't measured it for Locations (current default for asserts) yet as I've been more focused on fixing the bugs it finds in our out-of-tree target so far. I don't expect it to be noticable as it's a fairly simple matcher on small search spaces but if it is we can change the default to None at the cost of needing to use an extra option in out llvm-lit -Dllc='llc ...' command.

Running CTMark with -j20 (because I'm not looking at individual tests, just the rough total), I get:

  • X86 -fglobal-isel -g: 3215s without this patch, 3209s with it
  • X86 -fglobal-isel: 3133s without this patch, 3133s with it
  • AArch64 -fglobal-isel -g: 3065s without this patch, 3053s with it
  • AArch64 -fglobal-isel: 2972s without this patch, 2979s with it

So I'd say that leaving it on by default for asserts has no significant impact on compile time

I did run into an interesting issue in some other testing I was doing though. A verifier failure can cause GlobalISel to fall back on DAGISel when fallbacks are enabled, changing codegen. I'm going to have to tweak how failures are reported to fix that

This revision was automatically updated to reflect the committed changes.