Page MenuHomePhabricator

[Debugify] Support checking Machine IR debug info
ClosedPublic

Authored by xiangzhangllvm on Nov 17 2020, 12:58 AM.

Details

Summary

Add mir-check-debug pass to check MIR-level debug info.

For IR-level, currently, LLVM have debugify + check-debugify to generate and check debug IR.
Much like the IR-level pass debugifymir-debugify inserts sequentially increasing line locations to each MachineInstr in a Module,
But there is no equivalent MIR-level check-debugify pass, So now we support it at "mir-check-debug".

Currently (our first step), what mir-check-debug do is similar with check-debugify, to check the integrity of line info in DILocation and DILocalVariable
which mir-debugify generated before.

The 3 tests in patch show how this tools work.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
llvm/lib/CodeGen/MachineDebugify.cpp
138

I'll fix them, thank you!

llvm/test/CodeGen/Generic/MIRDebugify/check-line-and-variables.ll
2

Terminator IR ret is too simple to check the debug info.
Currently I check two MIR pass will pass this debug info check.
And I think it is better for this *.ll file to keep "same" with its MIR file check-line-and-variables.mir.
tks!

djtodoro added inline comments.Nov 20 2020, 12:53 AM
llvm/lib/CodeGen/MachineCheckDebug.cpp
58 ↗(On Diff #305941)

Yes, but that is why it is being reported as warning...
Use case: For example, we use debugify and realize that a bunch of add instructions don't have a dbg location attached after a pass X - we investigate pass X, and find out that there is a creating of the instruction add, but we have missed to add I->setDebugLoc(). It makes sense when dealing with non-dbg instructions only, since the locations of these are used during debugging sessions, since only these instructions end up in the final code.

Please consider the code from trunk (IR Debugify.cpp):

// Find missing lines.
for (Instruction &I : instructions(F)) {
  if (isa<DbgValueInst>(&I) || isa<PHINode>(&I))
    continue;

  auto DL = I.getDebugLoc();
  if (DL && DL.getLine() != 0) {
    MissingLines.reset(DL.getLine() - 1);
    continue;
  }

  if (!DL) {
    dbg() << "WARNING: Instruction with empty DebugLoc in function ";
    dbg() << F.getName() << " --";
    I.print(dbg());
    dbg() << "\n";
  }
}

This gives us a chance to investigate if a pass drops the location accidentally or it should have done it, but only for non-debug ones, as:

if (isa<DbgValueInst>(&I) || isa<PHINode>(&I))
  continue;
xiangzhangllvm added inline comments.Nov 20 2020, 1:03 AM
llvm/lib/CodeGen/MachineCheckDebug.cpp
58 ↗(On Diff #305941)

Thanks for the explanation, I'll check the non-debug MIR.

xiangzhangllvm marked 6 inline comments as done.Nov 20 2020, 7:56 PM

Thanks!

llvm/lib/CodeGen/MachineCheckDebugify.cpp
57

I'm not against adding two loops -- one for finding missing lines, and the other one for finding missing variables.

59

Not sure we want to report missing dbgLoc for meta instructions other than DBG_VALUE (e.g., DBG_LABEL does not end up in the final code as well..)

68

Early continue?

xiangzhangllvm added inline comments.Nov 23 2020, 4:35 PM
llvm/lib/CodeGen/MachineCheckDebugify.cpp
57

Yes, two loops is clear

59

For missing line, we should check all the MIRs which contribute the line number in MachineDebugify -- (non DBG_VALUEs), so here we just need to check MI.isDebugValue().

68

Still need to do line 73, I'll let it clear by using 2 loops soon. thank you!

72

Duplicated "\n" with MI.print(), remove here.

xiangzhangllvm added a reviewer: djtodoro.
xiangzhangllvm marked 3 inline comments as done.

Nice! (few nits inline)

llvm/lib/CodeGen/MachineCheckDebugify.cpp
10

checks

11

checks

59

I see, but that should be fixed at MachineDebugify side first. At least, we can put a TODO (or FIXME) marker, saying "Avoid meta instructions other than dbg_val"

66

Nit: please add \n here.

72

Nit: please add \n here.

xiangzhangllvm marked 5 inline comments as done.Nov 24 2020, 4:34 PM

Thanks very much for your reviews!!

Thanks for addressing the comments! We are almost there.

llvm/docs/HowToUpdateDebugInfo.rst
412

The name of option -debugify-and-check-strip-all-safe seems a bit confusing ...

llvm/lib/CodeGen/MachineCheckDebugify.cpp
53–56

We don't need MaybeMF,

MachineFunction *MF = MMI.getMachineFunction(F);
if (!MF)
  continue;

and use *MF.

llvm/lib/CodeGen/TargetPassConfig.cpp
126

The description should be more precise.

xiangzhangllvm added inline comments.Nov 25 2020, 9:30 PM
llvm/lib/CodeGen/MachineCheckDebugify.cpp
53–56

Yes, I'll refine here, thank you again : )

llvm/lib/CodeGen/TargetPassConfig.cpp
126

The name and description of "debugify-and-check-strip-all-safe" refer to upper "debugify-and-strip-all-safe", Maybe It is a little different for our non-English speaking people to name or descript well. Do you have good suggestion of its name ?
How about I refine the description to "Debugify MIR before each pass except those known to be unsafe when debug info is present, then run mir-check-debugify for them, and finally strip the debug info with mir-strip-debug" ?

djtodoro added inline comments.Nov 25 2020, 11:49 PM
llvm/lib/CodeGen/TargetPassConfig.cpp
126

The name and description of "debugify-and-check-strip-all-safe" refer to upper "debugify-and-strip-all-safe"

I think "debugify-check-and-strip-all-safe" would be more appropriate, wdyt?

(DebugifyAndCheckStripAll should be DebugifyCheckAndStripAll)

How about I refine the description to "Debugify MIR before each pass except those known to be unsafe when debug info is present, then run mir-check-debugify for them, and finally strip the debug info with mir-strip-debug" ?

Hmm... we are checking/stripping after each pass right?

"Debugify MIR before, by checking and stripping the debug info after, each pass except those known to be unsafe when debug info is present" ?

xiangzhangllvm added inline comments.Nov 26 2020, 2:57 AM
llvm/lib/CodeGen/TargetPassConfig.cpp
126

think "debugify-check-and-strip-all-safe" would be more appropriate, wdyt?

Looks good!

Hmm... we are checking/stripping after each pass right?

Yes, same with "debugify-and-strip-all-safe".

"Debugify MIR before, by checking and stripping the debug info after, each pass except those known to be unsafe when debug info is present"

That's better, thank you!!

xiangzhangllvm marked 3 inline comments as done.
djtodoro accepted this revision.Nov 26 2020, 6:46 AM

LGTM, thanks!

(please leave some time (a few days) to other reviewers to take a look)

llvm/lib/CodeGen/MachineCheckDebugify.cpp
77

Please add a TODO marker for DBG_INSTR_REF as well (it is still under an experimental option, but we should be aware it is coming).

This revision is now accepted and ready to land.Nov 26 2020, 6:46 AM
xiangzhangllvm added inline comments.Nov 26 2020, 3:58 PM
llvm/lib/CodeGen/MachineCheckDebugify.cpp
77

OK, I'll mark it when I commit it (next week).
For DBG_INSTR_REF it is a good idea to split the value and location for llvm.dbg.value.
Thanks for all your works !!

vsk added inline comments.Nov 30 2020, 12:05 PM
llvm/docs/HowToUpdateDebugInfo.rst
346

nit, s/with/to the/

llvm/lib/CodeGen/MachineCheckDebugify.cpp
34

Why does this need to be named differently from the IR-level named debugify metadata?

41

I don't think it's a good idea to introduce a copy of the IR-level check-debugify logic, since this makes it possible for the IR & MIR-level checking behaviors to diverge.

Can you find a way to factor out and reuse the checking logic? One way to do it might be to introduce a generic DebugifyChecker which parses the named metadata and operates on DebugLoc.

xiangzhangllvm added inline comments.Nov 30 2020, 9:45 PM
llvm/docs/HowToUpdateDebugInfo.rst
346

Hi @vsk Do you mean refine "is similar with IR-level" to "is similar to IR-level" ?

llvm/lib/CodeGen/MachineCheckDebugify.cpp
34

The MIR Check base on the MIR debugify, the "llvm.mir.debugify" will save the line number and variable number in MIR debugify.
Please refer llvm/lib/CodeGen/MachineDebugify.cpp line 160,162

41

It is not get/check IR-level metadata info, as explained before, here use the getDebugifyOperand to get "llvm.mir.debugify" info which generated by MIR debugify.

vsk added inline comments.Dec 1 2020, 1:56 PM
llvm/docs/HowToUpdateDebugInfo.rst
346

Yep!

llvm/lib/CodeGen/MachineCheckDebugify.cpp
34

Yes, I understand that. What's not clear to me is why the named metadata needs to be different from the IR-level named metadata ('llvm.debugify').

41

Hm, I don't think we're on the same page. My concern is that this patch copies logic from checkDebugifyMetadata (in lib/Transforms/Utils/Debugify.cpp), such as the logic to set up bitvectors to track missing lines. Could you please find a way to implement the warnings in a generic way, and reuse that facility in both the IR & MIR check-debugify passes?

xiangzhangllvm added inline comments.Dec 1 2020, 4:42 PM
llvm/docs/HowToUpdateDebugInfo.rst
346

I'll fix it soon, thank you!

llvm/lib/CodeGen/MachineCheckDebugify.cpp
34

There is 2 reasons:

  1. 'llvm.debugify' collected IR debug info (line + var number), if we "overwrite" the MIR debug info into it, it is confused to read it.
  2. In fact, Machine debugify is called from (IR level) debugify. If we write the MIR debug info into 'llvm.debugify' metadata, it will be "overwrite" by IR debug info at last.
41

en..., In fact, we intentionally make the logic same with IR debugify, it is reasonable to check MIR similar with IR. (It has already help me to find some debug info problems in some MIR optimizations). It is trouble to reuse that facility in both the IR & MIR. The logic of MIR-debugify is some with IR-debugify too, but MIR-debugify also implemented in a independent pass.

Mark TODO for DBG_INSTR_REF and refine docs/HowToUpdateDebugInfo.rst

xiangzhangllvm marked 2 inline comments as done.Dec 1 2020, 7:15 PM
xiangzhangllvm added inline comments.
llvm/lib/CodeGen/MachineCheckDebugify.cpp
41

en..., In fact, we intentionally make the logic same with IR debugify, it is reasonable to check MIR similar with IR. (It has already help me to find some debug info problems in some MIR optimizations). It is trouble to reuse that facility in both the IR & MIR. The logic of MIR-debugify is some with IR-debugify too, but MIR-debugify also implemented in a independent pass.

vsk added inline comments.Dec 2 2020, 10:04 AM
llvm/lib/CodeGen/MachineCheckDebugify.cpp
34

Ah, got it. The reason the named metadata nodes are kept separate is so the IR in a .mir file can be checked, and so there's no confusion about whether the metadata describes IR/MIR.

One small clarification -- it looks like it is the other way around: MachineDebugify calls the IR-level debugify (in DebugifyMachineModule::runOnModule).

41

I think we're still talking a bit past each other. Could you elaborate on this point:

It is trouble to reuse that facility in both the IR & MIR. The logic of MIR-debugify is some with IR-debugify too

The way I see it, the IR-level and MIR-level checking is very similar. I'm simply asking for the similar pieces to be factored out into a generic implementation.

xiangzhangllvm added inline comments.Dec 2 2020, 4:29 PM
llvm/lib/CodeGen/MachineCheckDebugify.cpp
34

One small clarification -- it looks like it is the other way around: MachineDebugify calls the IR-level debugify (in DebugifyMachineModule::runOnModule).

Yes, That's more accurate.

41

I thought your point is: reuse the same code (Debugify.cpp: checkDebugifyMetadata) to check both IR and MIR. right?
The troubles are

  1. We must complex current clear checkDebugifyMetadata to fit the MIR check,(the line/variable numbers are different, the IR/MIR iterator is different, all need to fit) and call relations. What's more, Current key code in mir-check-debugify just tens of lines, very small.

2)Current mir-check-debugify is a simple and dependent pass, easy to extend later, if reuse the Debugify.cpp code, it at last need to call IR-level debugify as mir-debugify. (We mostly use -run-pass=mir-check-debugify to run it.)

xiangzhangllvm added inline comments.Dec 2 2020, 4:38 PM
llvm/lib/CodeGen/MachineCheckDebugify.cpp
41

The logic of MIR-debugify is same with IR-debugify too

Both of them mainly add debug IR/MIR for very instructions, and MIR-debugify really called IR-debugify's code, but it just want to use IR-debugify to create metadata.

vsk added inline comments.Dec 3 2020, 10:51 AM
llvm/lib/CodeGen/MachineCheckDebugify.cpp
41

I thought your point is: reuse the same code (Debugify.cpp: checkDebugifyMetadata) to check both IR and MIR. right?

Ah, no, not the exact same code, no. There are certain similarities between the checking logic in checkDebugifyMetadata, and the checking logic introduced in this patch. Instead of having two copies of the same checking logic that can diverge over time, I'm suggesting that you factor out the similar pieces into a single generic implementation. The parts of the logic that are IR or MIR specific, of course, can/should remain separate.

xiangzhangllvm added inline comments.Dec 3 2020, 4:45 PM
llvm/lib/CodeGen/MachineCheckDebugify.cpp
41

Yes, merging same logic is the general rule of coding.
But if it makes things complex or down the readability, I prefer not merging.
Currently, both IR/MIR check is very simple and clear, merging won't take profit. If we do merging, we at least need to do:

  1. Adding function call, let both passes can call it.
  2. Adding pass Line/variable numbers and others to the function.
  3. Distinguish IR/MIR (Function/MachineFunction, BB/MBB, I/MI) in the function.
  4. Distinguish "isa<DbgValueInst>(&I)" with "MI.isDebugValue()" and so on.
This revision was automatically updated to reflect the committed changes.

Apologies, I reverted this because else reverting D78135 (which breaks check-llvm) creates merge conflicts that I can't quickly figure out.

Apologies, I reverted this because else reverting D78135 (which breaks check-llvm) creates merge conflicts that I can't quickly figure out.

D78135 was innocent, but this change here apparently caused these debugify test failures:
http://lab.llvm.org:8011/#/builders/109/builds/4730
http://45.33.8.238/linux/35299/step_12.txt

These tests stopped failing after the revert of this change here landed. So please check these tests before relanding.

Apologies, I reverted this because else reverting D78135 https://reviews.llvm.org/D78135 which breaks check-llvm) creates merge conflicts that I can't quickly figure out.

Yes, it cause fail in the tests of D78135, I'll update it. thank you.

Hi @xiangzhangllvm, please wait for @vsk to approve as well, since he is author of the original debugify IR patch and he is very involved in the area.
Thanks a lot for your patience and work.

OK, thanks for reminding.

xiangzhangllvm reopened this revision.Dec 15 2020, 12:28 AM
This revision is now accepted and ready to land.Dec 15 2020, 12:28 AM

Update affected tests: AArch64/GlobalISel/constant-mir-debugify.mir and phi-mir-debugify.mir
Because this patch add metadata "llvm.mir.debugify", it affected the index of some old test.
For e.g. change Checking "debug-location !10" to Checking "debug-location !11"

Hello @vsk ,could you help accept it ?

vsk resigned from this revision.Dec 15 2020, 11:49 AM

Hi @xiangzhangllvm, apologies for the delay.

But if it makes things complex or down the readability, I prefer not merging.
Currently, both IR/MIR check is very simple and clear, merging won't take profit.

Imo this overstates the difficulty of code sharing in this context, and underweights the potential long-term effects of having divergent code paths that are supposed to do the same thing. I urge you to reconsider :).

I'm not sure I'll have time to continue with this review, so I'll let other voices take over.

Imo this overstates the difficulty of code sharing in this context, and underweights the potential long-term effects of having divergent code paths that are supposed to do the same thing. I urge you to reconsider :).

Hi @vsk, I am fine to merge if you still feel the following explanation doesn't hold water:

  1. The logic here not 100% same, currently, mir-check doesn't check the phi node. (which will eliminate at back end)
  2. I still feel it is trouble to "Distinguish isa<DbgValueInst>(&I) with MI.isDebugValue() and so on"
  3. In my eyes, the small logic of ir/mir check here is just happen to be same/similar, not must (or no need) to be, make them not affect each other is better.

I'm not sure I'll have time to continue with this review, so I'll let other voices take over.

Thank you for your continuous reviews, I wish we can reach an agreement on the "merge" : )

As this review has last so long, (1 month), It has accepted before and I fixed all reviewers' comments except @vsk 's "merging" suggestion.
I want to let this patch in first.
About @vsk 's "merging" suggestion, I am almost sure that the code will be some ugly compared with current code,
So I'll use another special patch to handle his "merging" suggestion, let/wait @vsk to accept it or not.
tks!

This revision was automatically updated to reflect the committed changes.