This is an archive of the discontinued LLVM Phabricator instance.

[bugpoint] Add a named metadata (+their operands) reducer
ClosedPublic

Authored by loladiro on Oct 24 2015, 5:08 PM.

Details

Summary

We frequently run bugpoint on a linked-together module of all modules we jit while compiling the julia standard library. As a result, this module has a very large number of compile units (10000+) in llvm.dbg.cu, which didn't get reduced at all, requiring manual post processing. This is an attempt to
have bugpoint go through and attempt to reduce the number of global named metadata nodes as well as their operands, to cut down the number of roots for such metadata.

Diff Detail

Repository
rL LLVM

Event Timeline

loladiro updated this revision to Diff 38333.Oct 24 2015, 5:08 PM
loladiro retitled this revision from to [bugpoint] Add a named metadata (+their operands) reducer.
loladiro updated this object.
loladiro added a reviewer: dexonsmith.
loladiro set the repository for this revision to rL LLVM.
loladiro added a subscriber: llvm-commits.
loladiro updated this revision to Diff 38337.Oct 24 2015, 5:13 PM

Forgot to clang-format (now fixed!)

reames added a subscriber: reames.Nov 3 2015, 6:03 PM

Test cases?

tools/bugpoint/CrashDebugger.cpp
530 ↗(On Diff #38337)

Please use a shared_ptr or something. Raw pointers should be avoided.

533 ↗(On Diff #38337)

min(10, size)

549 ↗(On Diff #38337)

Possibly: ToDelete.reserve(NamedMD.size())?

611 ↗(On Diff #38337)

shared_ptr

615 ↗(On Diff #38337)

Do you need to worry about iterator invalidation here?

Would it be easier to just remove the operands or set them to undef?

816 ↗(On Diff #38337)

Please add an option to easily disable these passes.

821 ↗(On Diff #38337)

insert(end, md_begin, md_end)

831 ↗(On Diff #38337)

insert(end, op_begin, op_end)

dexonsmith edited edge metadata.Nov 3 2015, 8:43 PM
dexonsmith added a subscriber: dexonsmith.

This sounds awesome. I won't have a chance to look until at least tomorrow, but I know Adrian and Pete both had ideas around bugpoint and debug info so I'm CCing them.

  • dpnes
loladiro added inline comments.Nov 3 2015, 10:40 PM
tools/bugpoint/CrashDebugger.cpp
530 ↗(On Diff #38337)

I copied that from above. Happy to change the API throughout, but that should be a separate change.

549 ↗(On Diff #38337)

Sure

615 ↗(On Diff #38337)

No, the iterators are in the original module, but the ones we're dropping are in the new one. I didn't see an obvious way to just remove operands. I agree that would be preferable.

816 ↗(On Diff #38337)

Will do.

821 ↗(On Diff #38337)

Needs to convert from named md to its name

loladiro updated this revision to Diff 39167.Nov 3 2015, 10:43 PM
loladiro edited edge metadata.

Address review comments and add test case.

reames accepted this revision.Nov 4 2015, 11:30 AM
reames added a reviewer: reames.

LGTM

This revision is now accepted and ready to land.Nov 4 2015, 11:30 AM
pete accepted this revision.Nov 4 2015, 11:30 AM
pete added a reviewer: pete.
pete added a subscriber: pete.

Philip has been doing most of the reviewing, so I'll defer to him for a final LGTM. But FWIW, this LGTM given the work to address his feedback.

BTW, my approach was a little different. You are reducing the number of CU's which is great. I should have tried that first.

My approach was to reduce the number of types, global variables, imported entities, and subprograms in an individual CU. Given that these tend to also be referenced from llvm.dbg.* intrinsics, many dbg info entries would remain in the metadata, even after being removed from the CU. From there, I would then strip dbg info from instructions and see if there was a reduced set which would still crash. Eventually I'd get the smallest possible CU in terms of types, subprograms, etc, which demonstrate the bug.

Unfortunately, this didn't seem to be 100% reliable. I kept encountering crashes in llc when it tried to emit the debug info as it thought entries were missing, or weren't in states it wanted them to be in. So I may have actually found real bugs, i'm not sure, but those bugs made it impossible to run llc in bug point and know that the crashes were the original bug and not what i was causing with reduced debug info.

Anyway, i still have the patch lying around if its ever of interest. Its about 6 months old though so likely needs fixed to work on ToT.

Yes, I think that would be a great step further. As I mentioned, this patch arose from the very specific situation of having tons of otherwise unreferenced CUs left over, but there is certainly room for more.

(Just in case you're waiting for me: I still haven't looked, but if I find anything I can review post-commit.)

This revision was automatically updated to reflect the committed changes.