This is an archive of the discontinued LLVM Phabricator instance.

Optionally print symbolizer markup backtraces.
ClosedPublic

Authored by mysterymath on Dec 9 2022, 4:47 PM.

Details

Summary

When the environment LLVM_ENABLE_SYMBOLIZER_MARKUP is set, if
llvm-symbolizer fails or is disabled, this change will print a backtrace
in llvm-symbolizer markup instead of falling back to in-process
symbolization mechanisms.

This allows llvm-symbolizer to be run on the output later to produce a
high quality backtrace, even for fully-stripped LLVM utilities.

Diff Detail

Event Timeline

mysterymath created this revision.Dec 9 2022, 4:47 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 9 2022, 4:47 PM
mysterymath requested review of this revision.Dec 9 2022, 4:48 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 9 2022, 4:48 PM

Add negative test.

I thought we agreed that the desirable semantics to keep things simple is that if markup is requested then markup is always printed, period. It shouldn't have anything to do with whether llvm-symbolizer works.

I thought we agreed that the desirable semantics to keep things simple is that if markup is requested then markup is always printed, period. It shouldn't have anything to do with whether llvm-symbolizer works.

I agree, I think it's fine if we keep markup as a fallback in the case when llvm-symbolizer fails or is disabled, but there should also be a way to force the use markup without trying anything else.

I thought we agreed that the desirable semantics to keep things simple is that if markup is requested then markup is always printed, period. It shouldn't have anything to do with whether llvm-symbolizer works.

I agree, I think it's fine if we keep markup as a fallback in the case when llvm-symbolizer fails or is disabled, but there should also be a way to force the use markup without trying anything else.

The idea was to keep LLVM_DISABLE_SYMBOLIZATION and LLVM_ENABLE_SYMBOLIZER_MARKUP orthogonal; it seems prima facie that setting LLVM_ENABLE_SYMBOLIZER_MARKUP shouldn't also have the effect of setting LLVM_DISABLE_SYMBOLIZATION.

But the available options form a priority ordering; at most one method should be used, and some are superior to others. We could reasonable consider the markup workflow to have superior characteristics the others in the scenarios where it applies, so I think it's reasonable for enabling it to implicitly disable the others. Accordingly, I've moved the markup option to the top of the list.

Give markup priority over other symbolization options.

Add Windows version of printMarkupContext.

Apply git clang-format.

phosek added inline comments.Dec 18 2022, 9:30 AM
llvm/lib/Support/Unix/Signals.inc
539

Can you use reinterpret_cast instead of C-style cast?

562

Can you use C++-style instead of C-style cast here and below?

572

What does Amt refer to? Might be better to use something more self-descriptive.

mysterymath marked an inline comment as done.

Address review comments.

llvm/lib/Support/Unix/Signals.inc
539

This is just an array to pointer decay, so static_cast works. format just doesn't like being handed an array type.

562

Done, throughout.

572

Renamed these as the bytes remaining until the next item.

No idea where would the most appropriate place be for this, but should this be added to the docs somewhere?

Add blurb about LLVM_ENABLE_SYMBOLIZER_MARKUP to markup doc.

I'd like to ping this; having symbolizer markup would be boon for debugging crashes in CI environments where the toolchain is stripped. Fuchsia does this to minimize the install image for the toolchain, but it currently means that we must reproduce all crashes locally with a debug build to get a usable stack trace.

mcgrathr accepted this revision.Aug 10 2023, 1:32 PM

Some minor improvements could be made, but this generally lgtm.

llvm/lib/Support/Signals.cpp
271

Best format is to append :ra or :pc depending on whether it's a return-address or exact PC frame.
When it's from a crash, the first PC is an exact PC and usually others are return-address frames (though when using full CFI unwinding, individual frames might be exact PC frames).

llvm/lib/Support/Unix/Signals.inc
527

It's less useful but you can still print the context if there's no build ID. The name might be helpful, though automated lookup likely won't work.

539

IMHO &ModeStr[0] is more natural.

604

This can use char Str[4] in the decl to document the requirement on the size (and some compilers will even do warnings based on that). But IMHO it would be better for this function to simply return std::array<char, 4> itself.

617

You may want to use the new reset:begin / reset:end pair around the context + backtrace, in which case IMHO neither really belongs in the "context" part of the printing per se.

This revision is now accepted and ready to land.Aug 10 2023, 1:32 PM
mysterymath marked an inline comment as done.

Address review comments.

llvm/lib/Support/Signals.cpp
271

Unfortunately, the first PC will be well within the signal handler, so the source of the signal will be in some earlier frame. It seems difficult to find it, and separately difficult to tell whether it is a raise, direct function call, or a e.g. SIGSEGV. Accordingly, this seems like it would be a useful enhancement, but I'd expect it to be of a similar size to this patch.

llvm/lib/Support/Unix/Signals.inc
527

At least at present, llvm-symbolizer doesn't accept module lines without a supported module type. The definition for "elf" also requires a build ID. It's possible to change this of course, but we'd probably want to do some surgery to the language of the spec too to make it more explicit that the typed portion of module and mmap are optional.

617

Ack; this seems a better fit for a change set where we actually do something with them (plus define them in the reference).

Looks like clang-format is complaining. Best make sure you've reformatted everything you've added using it.

clang-format

This revision was landed with ongoing or failed builds.Aug 17 2023, 10:55 AM
This revision was automatically updated to reflect the committed changes.
thakis added a subscriber: thakis.Aug 17 2023, 11:08 AM

This may not build on Mac: http://45.33.8.238/macm1/67288/step_4.txt

It possibly breaks tests on Linux: http://45.33.8.238/linux/115648/step_12.txt

It's possible that's due to bot weirdness, but maybe it's something real.

This breaks the build on Darwin: https://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/37054/:

Undefined symbols for architecture x86_64:
  "printMarkupContext(llvm::raw_ostream&, char const*)", referenced from:
      printMarkupStackTrace(llvm::StringRef, void**, int, llvm::raw_ostream&) in libLLVMSupport.a(Signals.cpp.o)

@mysterymath could you take a look please?

This breaks the build on Darwin: https://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/37054/:

Undefined symbols for architecture x86_64:
  "printMarkupContext(llvm::raw_ostream&, char const*)", referenced from:
      printMarkupStackTrace(llvm::StringRef, void**, int, llvm::raw_ostream&) in libLLVMSupport.a(Signals.cpp.o)

@mysterymath could you take a look please?

Ah, thanks, both of these real oversights. Missing stub for the Darwin path, and overly-restrictive regex for the Linux case. I'll have a fix out in an hour or so; if that's too late, feel free to revert.

hliao added a subscriber: hliao.Aug 17 2023, 1:02 PM

I committed two quick fixes for Linux builds, where that unit test failed unreliably. a2dbb195d8ae and 7026a0cacaa0. Please correct that if the "0x" prefix is mandatory. If 0x is mandatory, please don't rely on the '#' flag, as it only outputs 0x for non-zero values.

8d3ff601a3 should fix the Darwin build issue and make the Linux unit tests sufficiently lenient. The %#x behavior was intentional; the spec is written with it in mind.

mcgrathr added inline comments.Aug 17 2023, 3:00 PM
llvm/lib/Support/Signals.cpp
271

The distinction is a feature of the unwinder semantics, not something you need to invent.

mcgrathr added inline comments.Aug 17 2023, 3:32 PM
llvm/lib/Support/Signals.cpp
271

It's not very obvious, but the Boolean (as int) value returned in the out parameter of _Unwind_GetIPInfo is the "is signal frame" flag.