This is an archive of the discontinued LLVM Phabricator instance.

Add a stripNonLineTableDebugInfo() helper that downgrades -g debug info to what -gline-tables-only would have produced.
ClosedPublic

Authored by aprantl on Oct 5 2016, 10:05 AM.

Details

Summary

This adds a new function to DebugInfo.cpp that takes an llvm::Module as input and removes all debug info metadata that is not directly needed for line tables, thus effectively stripping all type and variable information from the module.

The primary motivation for this feature was the bitcode work flow (cf. http://lists.llvm.org/pipermail/llvm-dev/2016-June/100643.html for more background). This is not wired up yet, but will be in subsequent patches.
For testing, the new functionality is exposed to `opt`` with a `-strip-nonlinetable-debuginfo`` option.

The secondary use-case (and one that works right now!) is as a reduction pass in bugpoint. I added two new bugpoint options (`-disable-strip-debuginfo`` and `-disable-strip-debug-types``) to control the new features. By default it will first attempt to remove all debug information, then only the type info, and then proceed to hack at any remaining MDNodes.

Diff Detail

Event Timeline

aprantl updated this revision to Diff 73659.Oct 5 2016, 10:05 AM
aprantl retitled this revision from to Add a stripNonLineTableDebugInfo() helper that downgrades -g debug info to what -gline-tables-only would have produced..
aprantl updated this object.
aprantl added a subscriber: steven_wu.
aprantl updated this object.Oct 5 2016, 10:06 AM
dblaikie edited edge metadata.Oct 5 2016, 11:00 AM

More descriptive names for test cases would be nice.

lib/IR/DebugInfo.cpp
383–384

Two map lookups could be one?

496–502

Duplicate lookup (count+insert) - should be able to unconditionally insert, check the result of insert to see if it actually inserted?

542–545

What other named debug metadata is there that needs removing?

test/Transforms/Util/strip-nonlinetable-debuginfo.ll
18–21 ↗(On Diff #73659)

Might be worth explaining why this would be stripped? (& why the first CU was not stripped)

I'm not sure I understand the need for the interesting handle of indistinct disubprograms - aren't all disubprograms distinct now? (the ones that represent llvm function definitions, at least)

test/Transforms/Util/strip-nonlinetable-debuginfo2.ll
1 ↗(On Diff #73659)

Is that arm flag necessary?

6 ↗(On Diff #73659)

Should this be checking that the llvm.dbg.declare was removed?

test/Transforms/Util/strip-nonlinetable-debuginfo4.ll
1 ↗(On Diff #73659)

This is a very long chunk of bitcode - is it all really necessary? What's it intended to test?

Thanks! Some quick inline replies while I'm working on the next iteration of the patch.

lib/IR/DebugInfo.cpp
542–545

Historically LLVM stored a list of variables in llvm.dbg.var.

test/Transforms/Util/strip-nonlinetable-debuginfo4.ll
1 ↗(On Diff #73659)

I added this to provide coverage a crash while setting the scope of a DICompositeType, but I'm sure I can further reduce this by at least getting rid of all the code in the testcase.

aprantl updated this revision to Diff 73698.Oct 5 2016, 3:12 PM
aprantl marked 7 inline comments as done.
aprantl edited edge metadata.

Address review comments.

milseman added inline comments.Oct 5 2016, 11:58 PM
lib/Transforms/Utils/CMakeLists.txt
46

And where is this source file?

aprantl updated this revision to Diff 73805.Oct 6 2016, 9:02 AM

Add missing file.

milseman edited edge metadata.Oct 6 2016, 11:10 AM

r283473

Sorry if I didn't get the commit hooked up to this review properly. What is the next step to get it hooked up?

aprantl updated this revision to Diff 73942.Oct 7 2016, 10:33 AM
aprantl edited edge metadata.

Renamed the testcases to be more specific and added better commentary on what they are testing.

aprantl updated this revision to Diff 73943.Oct 7 2016, 10:34 AM

Fix a typo.

dblaikie added inline comments.Oct 7 2016, 10:45 AM
test/Transforms/Util/strip-nonlinetable-debuginfo-containingtypes.ll
7

Sentence fragment? I'm guessing that was meant to keep saying something?

10–14

This test case would probably be simpler if the ctor (_ZN1BC1Ev) didn't need to be emitted. It would remove a lot of the metadata and IR.

Using -fstandalone-debug and "B *b;" rather than "B b" might help there?

(& is the inheritance necessary from A to B necessary, rather than just using type A directly?)

aprantl added inline comments.Oct 7 2016, 3:10 PM
test/Transforms/Util/strip-nonlinetable-debuginfo-containingtypes.ll
10–14

I fear that both are necessary.

The test is for stripping the containing type of a (distinct) DISubprogram. We need at least one function to have a DISubprogram in the metadata that is not part of the type hierarchy.

The parent class is trickier:

  • In order to deal with cycles we only replace nodes in the outer loop in DebugTypeInfoRemoval::traverse() and not recursively while visiting other nodes.
  • An earlier version of the code replaced removed nodes with an empty MDNode !{}.
  • At some point we made the IR stricter to only accept nullptrs instead of empty nodes for empty fields.

This caused an assertion to go off when:

  1. A DIComposite Type was remapped to an empty MDNode.
  2. DebugTypeInfoRemoval::getReplacementSubprogram() tries to create a DITypeRef from that remapped empty node.

In order to trigger this situation the base class is necessary because only then the DICompositeType will be visited and remapped before the DISubprogram.

aprantl updated this revision to Diff 73993.Oct 7 2016, 3:10 PM

Fix another double lookup.

aprantl updated this object.Oct 11 2016, 10:48 AM
aprantl added a subscriber: test.
aprantl updated this revision to Diff 74273.Oct 11 2016, 10:50 AM
aprantl updated this object.

Use a more idiomatic way to strip debug metadata from global variables.

dblaikie accepted this revision.Oct 12 2016, 9:11 AM
dblaikie edited edge metadata.
dblaikie added inline comments.
lib/IR/DebugInfo.cpp
498–499

Could roll the insert into the if condition and drop the "Result" variable that's only used once here - but up to you. I realize styles differ (I tend to write, perhaps excessively, compact code)

516–532

Remove duplication - could use a local lambda called twice for example. You can pass in the string name of the function, and then cast to Instruction rather than DbgValue/DeclareInst, I think?

auto RemoveUses = [&](StringRef Name) {
  if (auto *DbgVal = M.getFunction(Name)) {
    while (!DbgVal->use_empty())
      cast<Instruction>(DbgVal->user_back())->eraseFromParent();
    DbgVal->eraseFromParent();
    Changed = true;
  }
};
RemoveUses("llvm.dbg.declare");
RemoveUses("llvm.dbg.value");

Or I guess you could use a range-for loop:

for (StringRef Name : {"llvm.dbg.declare", "llvm.dbg.value"})
  ...
553–554

Roll declaration into if condition

568

If you make this inlinedAt variable MDNode * rather than auto, then the remap scope and inlinedAt code become teh same (there's no need for the cast) & can be commoned similar to the suggestion above. (DebugLoc::get just takes InlinedAt as an MDNode* anyway - though perhaps it should take it as DILocation?)

586–587

Worth checking if either Scope or InlinedAt actually changed & skip setting the DebugLoc if no change was needed? (not sure how much it costs - perhaps not enough to worry about)

597–599

Might even be nice to have these 3 lines out in a lambda (used here and for DISubprogram, Scope, and InlinedAt), or as a little utility/wrapper function in/with the Mapper?

603

Optionally reduce indentation by inverting this:

if (!Changed)
  continue;
604–607

I'm not sure how this works/what this does - maybe a comment would be useful.

Specifically it looks like it's dropping all references /to/ the named metadata, then adding operands to the node. Wouldn't that result in the named metadata containing the old (unmapped) values, as well as the newly mapped values - while also dropping the named metadata from the module?

It sounds like rather than NMD.dropAllReferences, what's desired is to drop all operands? then replace them with the newly mapped ones. But I guess I'm missing something.

tools/bugpoint/CrashDebugger.cpp
1133–1155

Looks mostly duplicate except for the function it's calling - could common this into a lambda or for loop.

(if you prefer it written out separately, I'd at least suggest combining the two if conditions around both of these (again, can do so separately) into:

if (!No.... && !BugpointIsInterrupted) {

Though perhaps that's just the idiom here... *shrug*
)

This revision is now accepted and ready to land.Oct 12 2016, 9:11 AM
aprantl marked 6 inline comments as done.Oct 24 2016, 10:30 AM
aprantl added inline comments.
lib/IR/DebugInfo.cpp
604–607
void dropAllReferences() {
  for (Use &U : operands())
    U.set(nullptr);
}

This deletes the NMD's operands and not the references to the NMD.

aprantl updated this revision to Diff 75611.Oct 24 2016, 10:31 AM
aprantl edited edge metadata.

Address all remaining comments.

Still a couple of comments that look like they got lost in the noise, maybe - but no big deal, commit when ready.

test/Transforms/Util/strip-nonlinetable-debuginfo-containingtypes.ll
7

ping

tools/bugpoint/CrashDebugger.cpp
1133–1155

ping

aprantl updated this revision to Diff 75613.Oct 24 2016, 10:48 AM

Completely missed Duncan's suggestion.

aprantl updated this revision to Diff 75619.Oct 24 2016, 11:57 AM
aprantl marked 3 inline comments as done.

Refactored the bugpoint code to use a lambda and unique_ptr.

aprantl closed this revision.Oct 25 2016, 1:14 PM