Page MenuHomePhabricator

[WIP][lld] Implement crash reproducer
Needs ReviewPublic

Authored by haowei on May 11 2021, 8:47 PM.

Details

Summary

This is still work in progress to test my ideas. Not ready for review yet.

In this patch, I tried to use CrashRecoveryContext to re-run lldMain with '--reproduce' when the first run crashes. Flag 'reproduce-on-crash' is added to optionally turn on this feature. Flag '--debug-crash' to allow lld crash on purpose to test crash reproducer.

The reason I choose this approach is that it has a few advantages:

  • It is quite simple.
  • It works with every lld driver which already support '--reproduce'.

However, during testing, I saw a few issues:

  • If the crash happens before the driver initialize the TarWriter for reproducer, this implementation will not catch it.
  • lldMain is not designed to run twice internally. It will write unexpected errors to stderr. (e.g. I saw 'lld/ELF/InputSection.cpp:1402: void lld::elf::MergeInputSection::splitIntoPieces(): Assertion `pieces.empty()' failed.' error in tests)

Therefore, I am wondering if I should consider another approach: reimplement the reproducer each driver (start with ELF driver) to make it independent from the lld driver. In this case the reproducer can be directly invoked from CrashRecoveryContext in the lld/tools/lld/lld.cpp instead of being invoked from the rerun lld driver.

Discussion thread: https://lists.llvm.org/pipermail/llvm-dev/2021-April/149853.html

Diff Detail

Event Timeline

haowei created this revision.May 11 2021, 8:47 PM
haowei requested review of this revision.May 11 2021, 8:47 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 11 2021, 8:47 PM

I don't understand the purpose. Didn't we agree that implementing this on a different process is a correct approach?

(it'd be good to link the description to the discussion thread on the issues, whatever way this goes)

haowei edited the summary of this revision. (Show Details)May 20 2021, 5:34 PM

I don't understand the purpose. Didn't we agree that implementing this on a different process is a correct approach?

Sorry for the confusion, I created this change so I can share the code with other people. It's not in a reviewable state and I mainly use it as a baseline for testing.

If I understood correctly, we agreed to implement a lld crash reproducer in a similar way like clang with "-fintegrated-cc1" as the first step, which relies on CrashRecoveryContext instead of relying on a second process. The crash reproducer can be further improved by implementing a reproducer that works on a different process if the first step is not good enough.

D102304 added a createCrashReproduceTar to the lld elf driver, which generates a reproducer tarball without re-invoke the lldMain and will be used by CrashRecoveryContext in case lld recovered from a crash. The principle is to record the file names used by lld and use them to generate the reproducer in case lld crashes. I tried the other approach which makes the reproducer to determine the file needs to be included but it turns out very easy to miss some files and hard to maintain. So I just record the filenames, which also seems to be easier to be extended to other drivers. Please let me know your opinions on this.