Page MenuHomePhabricator

[lldb/Target] Add Assert StackFrame Recognizer
ClosedPublic

Authored by mib on Jan 23 2020, 4:02 PM.

Details

Summary

When a thread stops, this checks depending on the platform if the top frame is
an abort stack frame. If so, it looks for an assert stack frame in the upper
frames and set it as the most relavant frame when found.

To do so, the StackFrameRecognizer class holds a "Most Relevant Frame" and a
"cooked" stop reason description. When the thread is about to stop, it checks
if the current frame is recognized, and if so, it fetches the recognized frame's
attributes and applies them.

rdar://58528686

Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
JDevlieghere added inline comments.Jan 23 2020, 8:42 PM
lldb/source/Target/AbortRecognizer.cpp
51 ↗(On Diff #240038)

Same comments as the previous method.

81 ↗(On Diff #240038)

Magic value? Why 10?

friss added inline comments.Jan 23 2020, 9:29 PM
lldb/source/Target/AbortRecognizer.cpp
126–127 ↗(On Diff #240038)

Why do you even check the abort location? The recognizer has been registered for this specific module and function, so you're never going to get here with a different top frame.

lldb/source/Target/Process.cpp
543

We need to find a better place to register this.

945

This variable name just looks weird. I know what it is in the context of this patch, but I would never get it re-reading this code a year from now. Just hoist the start_frame variable from bellow and use it.

946

Huh, reading further down the patch, it feels pretty wrong that you have to compute the "most relevant frame" here and in WillStop(). Should this just be something like start_frame = cur_thread->GetSelectedFrame()->GetIndex()?

947

No point in using the variable here. You want the top frame, let the code reflect this and just use 0.

lldb/source/Target/Thread.cpp
576

Why does this function name use a plural form? Wouldn't SelectMostRelevantFrame be a more explicit name?

581

Unused variable?

teemperor added inline comments.
lldb/include/lldb/Target/AbortRecognizer.h
29 ↗(On Diff #240038)

This function (especially the return types) deserve some documentation.

lldb/source/Target/AbortRecognizer.cpp
1 ↗(On Diff #240038)

Please remove the -*- C++ -*- as that's only for header files.

85 ↗(On Diff #240038)

Could we make this frame_index instead of just I. I was confused that I isn't just an index into a vector when you later interpret it as a stack frame index.

99 ↗(On Diff #240038)

You can just do sym_ctx.module_sp->GetFileSpec().FileEquals(module_spec) which also handles case insensitive file systems IIRC.

101 ↗(On Diff #240038)

Maybe add a comment why you need to skip one frame.

104 ↗(On Diff #240038)

Maybe something like this? Just a suggestion though.

uint32_t last_frame_index = frames_to_fetch - 1;
m_most_relevant_frame = thread_sp->GetStackFrameAtIndex(std::min(i + 1, last_frame_index));
mib marked 14 inline comments as done.Jan 24 2020, 8:14 AM
mib added inline comments.
lldb/source/Target/AbortRecognizer.cpp
1 ↗(On Diff #240038)

I saw this on a bunch of source files (StopInfo.cpp / Breakpoint.cpp / SymbolContext.cpp ...). I'm not sure what you mean by that.

25 ↗(On Diff #240038)

I use ConstString because StackFrameRecognizerManager::AddRecognizer takes a ConstString.

And also SymbolContext::GetFunctionName returns a ConstString.

81 ↗(On Diff #240038)

Pretty much. In the beginning I was unwinding the entire stack. But if there is too much frames (i.e. recursion) this would take too much time. We agreed with @friss and @jingham to unwind up to 10 frames at most.

Can you think of a better way to do it ?

lldb/source/Target/Process.cpp
543

May be this could go in Target/LanguageRuntime.cpp or Target/SystemRuntime.cpp ?

mib updated this revision to Diff 240213.Jan 24 2020, 8:15 AM
mib marked 2 inline comments as done.
friss added inline comments.Jan 24 2020, 8:30 AM
lldb/source/Target/AbortRecognizer.cpp
81 ↗(On Diff #240038)

I don't think that's exactly what I said. My opinion was that asserts will have a pretty deterministic layout, and that we should know exactly what to unwind to determine whether we have an assert or not.

In your test, you actually check that the frame with the assert is frame 4. If it is anything other then 4 in any configuration, then the test will fail. You might as well just put 4 in the code here rather than a magical 10.

mib marked an inline comment as done.Jan 24 2020, 9:20 AM
mib added inline comments.
lldb/source/Target/AbortRecognizer.cpp
81 ↗(On Diff #240038)

The number of frames to unwind is different depending on the platform. I chose 10 to give us some extra margin but I'll change it to 4 or 5,

jingham added inline comments.Jan 24 2020, 9:41 AM
lldb/source/Target/AbortRecognizer.cpp
81 ↗(On Diff #240038)

We want this as small as will do the job - since we don't want to unwind more than necessary. But we aren't going to do this unwinding on every stop - only when we actually see a thread with frame 0 in __pthread_kill (or whatever the system kill is). So we don't need to over optimize. It would be silly to go have to fix this because some platform added another layer between assert and kill.

I'd go see what it is on all the Unixes we know about and add a couple to future proof it.

The test should definitely not depend on how many frames there are between the assert and the _pthread_kill. It should just check that the currently selected frame has the right file & line number. If you want to keep this a shell test, you could backtrace the thread and make sure the * is on the line with "assert.c", or maybe try:

(lldb) source info
Lines found in module `killme
[0x0000000100000f2e-0x0000000100000f6f): /tmp/assert.c:7:3

If the frame wasn't selected properly this will show:

(lldb) source info
error: No debug info for the selected frame.
mib updated this revision to Diff 240238.Jan 24 2020, 9:47 AM
mib marked 2 inline comments as done.
friss added inline comments.Jan 24 2020, 11:11 AM
lldb/source/Target/AbortRecognizer.cpp
81 ↗(On Diff #240038)

You say it is 4 or 5, so your test will fail in one of those configurations as you have hardcoded the frame number there.

jingham added inline comments.Jan 24 2020, 5:22 PM
lldb/source/Target/AbortRecognizer.cpp
81 ↗(On Diff #240038)

Actually, the test you have is fine if you just leave out the "4". You really only care about the "assert.c".

mib updated this revision to Diff 240543.Jan 27 2020, 6:16 AM
mib marked 12 inline comments as done.

Changed the RecognizedStackFrame and AbortRecognizedStackFrame to add a StopReasonDescription method instead of a StopInfo instance.

This way, lldb can show a special stop reason description, for recognized frames that override the StopReasonDescription method, while keeping track of the underlying StopInfo. To display the underlying StopInfo, the thread-format token stop-reason-raw can be used.

The stop description can also be retrieved through the SBAPI using SBThread::GetStopDescription. It's either returning the "cooked" or the "raw" stop reason depending if the thread a has recognized frame.

This patch also changes the ObjC exception recognizer to have a special StopReasonDescription and updates its tests.

JDevlieghere added inline comments.Jan 27 2020, 12:25 PM
lldb/docs/use/formatting.rst
137–139

This change seems somewhat orthogonal to this patch. Is the abort recognizer the only one where these two will be different? If not I would suggest splitting this off into a separate patch.

lldb/include/lldb/Target/AbortRecognizer.h
33 ↗(On Diff #240543)

Why is this in the header? If this is only used by the implementation, you should make these static function in the cpp file.

48 ↗(On Diff #240543)

Doxygen comment?

lldb/source/Core/FormatEntity.cpp
1277–1278

I know you didn't touch this line but since you copied it below...

if(Thread *thread = exe_ctx->GetThreadPtr())
lldb/source/Target/AbortRecognizer.cpp
42 ↗(On Diff #240543)

(nit) We usually just use // for block comments.

67 ↗(On Diff #240543)

I like std::tie but I wonder if a struct with named fields wouldn't make this more readable with less code. Saves you six lines in this function and the next one in exchange for a 4-line struct definition. Anyway I'll leave this up to you to decide.

return lldb::RecognizedStackFrameSP(
      new AbortRecognizedStackFrame(thread_sp, assert_location->module_spec, assert_location->function_name));
107 ↗(On Diff #240543)

Why is this a std::string and not a StringRef? I'm pretty sure FileSpec has ctor that takes a StringRef.

mib marked 9 inline comments as done.Jan 27 2020, 5:45 PM
mib added inline comments.
lldb/docs/use/formatting.rst
137–139

It wouldn't make sense to split this into different patches since I'm only changing the current thread.stop-reason in this one. And for users who still want the original behaviour, I added thread.stop-reason-raw.

lldb/source/Target/AbortRecognizer.cpp
67 ↗(On Diff #240543)

I changed the implementation so I'm only passing 1 parameter (the most relevant frame) to the constructor.

mib updated this revision to Diff 240744.EditedJan 27 2020, 6:01 PM
mib marked 2 inline comments as done.
mib retitled this revision from [lldb/Target] Add Abort StackFrame Recognizer to [lldb/Target] Add Assert StackFrame Recognizer.
mib edited the summary of this revision. (Show Details)

Renamed AbortRecognizer to AssertFrameRecognizer.

Moved the assert frame (aka most relevant frame) lookup from the AssertRecognizedStackFrame constructor to AssertFrameRecognizer::RecognizeFrame(), so if the assert frame is not found, only a base RecognizedStackFrame is returned.

Added doxygen comments.

All tests pass now.

JDevlieghere accepted this revision.Jan 28 2020, 8:14 AM

LGTM with one inline comment, but let's give the other reviewers a chance to take another look before landing this.

lldb/source/Target/AssertFrameRecognizer.cpp
156

Why not return here?

StackFrameSP most_relevant_frame_sp = thread_sp->GetStackFrameAtIndex(std::min(frame_index + 1, last_frame_index));
return lldb::RecognizedStackFrameSP(new AssertRecognizedStackFrame(most_relevant_frame_sp));

That way you don't have to check again after you finish the loop and you can just return RecognizedStackFrameSP();.

This revision is now accepted and ready to land.Jan 28 2020, 8:14 AM
mib marked an inline comment as done.Jan 28 2020, 8:38 AM
mib added inline comments.
lldb/source/Target/AssertFrameRecognizer.cpp
156

Fair.

mib updated this revision to Diff 240899.Jan 28 2020, 8:58 AM

Added early return in AssertFrameRecognizer::RecognizeFrame

mib marked an inline comment as done.Jan 28 2020, 8:59 AM
This revision was automatically updated to reflect the committed changes.
mib reopened this revision.Jan 28 2020, 9:43 AM
This revision is now accepted and ready to land.Jan 28 2020, 9:43 AM

This generally looks good. I'm still not fond of registering this in the Target itself. But I don't have a immediately better idea as we don't have a C language runtime.

A couple more comments/questions that can be addressed as followups:

lldb/include/lldb/Target/StackFrameRecognizer.h
38

The fact that this returns a StringRef, means it must return a pointer to permanent string storage. In this case, you're returning a constant string, and it's fine, but it means that if you wanted to construct a return value (for example out of data you extract from the inferior), you need to store the string backing the StringRef somewhere. The concrete RecognizedStackFrame instance seems like a good fit, but I have no clue about their lifetime guarantees compared to the lifetime of the StringRef returned here. Maybe we should make this a std::string?

lldb/packages/Python/lldbsuite/test/functionalities/inferior-assert/TestInferiorAssert.py
118

Here, you're now just checking self consistency, I son't think that's valuable. This should check for the actual text of the assert recognizer (and potentially something else on the systems where the recognizer is not supported).

197

as we just did self.runCmd above anyway, I'd replace all of this by self.runCmd('frame select 0'). Or at least just write it as a single line, no need to check for every intermediate result. Such verbosity really distracts from what the test is trying to verify.

lldb/source/Target/AssertFrameRecognizer.cpp
111–113

Can you move this with the other AssertRecognizedStackFrame methods?

lldb/source/Target/StackFrameRecognizer.cpp
95–96

What's the reason of this change?

labath added a subscriber: labath.Jan 29 2020, 1:07 AM

Since the failure reason was not very obvious from the bot message, I took a quick look, and to save you the trouble of reproducing this, here's what I found:

The reason the new test failed was because the symbols you are searching for (__GI_raise, __GI___assert_fail) are private libc symbols which are only available if you happen to have debug info for the system libc installed. If you don't, these will show up as raise and __assert_fail, respectively.

Also the TestExitDuringStep was failing because of some apparent memory corruption in the stop reason (I am getting output like * thread #2, name = 'a.out', stop reason = P�4��

mib updated this revision to Diff 242634.Feb 5 2020, 8:48 AM
mib marked 5 inline comments as done.

Fixed the corrupted stop reason description.
Fixed linux support for public symbols.

This revision was automatically updated to reflect the committed changes.
mib added inline comments.Feb 5 2020, 8:51 AM
lldb/source/Target/StackFrameRecognizer.cpp
95–96

In line 101, lldb tries to get the symbol from the symbol context. If the symbol was not populated in the SymbolContext before calling GetRecognizerForFrame then it's null and the function returns a StackFrameRecognizerSP.

This broke LLDB tests on Windows Build Bot http://lab.llvm.org:8011/builders/lldb-x64-windows-ninja/builds/13415

Can you please fix it or revert it? Thank you!

Failing Tests (4):
    lldb-api :: commands/watchpoints/step_over_watchpoint/TestStepOverWatchpoint.py
    lldb-api :: functionalities/postmortem/minidump-new/TestMiniDumpNew.py
    lldb-api :: python_api/thread/TestThreadAPI.py
    lldb-shell :: Driver/TestConvenienceVariables.test

  Expected Passes    : 1381
  Expected Failures  : 10
  Unsupported Tests  : 541
  Unexpected Failures: 4

It looks like there is still some memory corruption going on:

lldb) target create --core "../llvm-project/lldb/packages/Python/lldbsuite/test/functionalities/postmortem/minidump/fizzbuzz_no_heap.dmp"
(lldb) bt
* thread #1, stop reason = Exception 0xc0000005 encountered at address 0x164d14
  * frame #0: 0x00164d14 fizzbuzz.exe
    frame #1: 0x00167c79 fizzbuzz.exe
   ...
(lldb) script lldb.thread.GetStopDescription(256)
'\xf0\xfca\x02'
lldb/source/Target/AssertFrameRecognizer.cpp
64–66

This part looks pretty kludgy. Maybe the function should just return a list of symbols (and shared libraries) and have this thing be handled at a higher level.

This is going to be needed in order to support other C library implementations (e.g. musl) on linux, and it's not inconceivable that the internal glibc symbol name will change in the future too...

friss added inline comments.Feb 5 2020, 1:19 PM
lldb/source/API/SBThread.cpp
327–332

This is wrong. GetStopDescription() returns a temporary std::string. stop_desc will be a dangling pointer on the next line. Just make stop_desc a std::string and call .data() below.

mib reopened this revision.Feb 5 2020, 4:03 PM

There are still some bugs on Windows and Linux that need to investigated ...

This revision is now accepted and ready to land.Feb 5 2020, 4:03 PM
mib updated this revision to Diff 242926.Feb 6 2020, 9:25 AM

Fixing the memory management bug should hopefully solve the Linux and Windows Issues.

This revision is now accepted and ready to land.Feb 6 2020, 9:25 AM
This revision was automatically updated to reflect the committed changes.
max-kudr removed a subscriber: max-kudr.Feb 6 2020, 9:48 AM

Even with the latest variant (commit 7ebe9cc4fc2d97ec0ed2a6038be25b2a7ed1aac2) I am getting on Fedora 30 x86_64:

/home/jkratoch/redhat/llvm-monorepo/lldb/test/Shell/Recognizer/assert.test:5:10: error: CHECK: expected string not found in input
# CHECK: thread #{{.*}}stop reason = hit program assert
         ^
<stdin>:1:1: note: scanning from here
(lldb) command source -s 0 '/home/jkratoch/redhat/llvm-monorepo-clangassert/tools/lldb/test/Shell/lit-lldb-init'
^
<stdin>:15:34: note: possible intended match here
* thread #1, name = 'assert.test.tmp', stop reason = signal SIGABRT
                                 ^

The output is:

(lldb) command source -s 0 '/home/jkratoch/redhat/llvm-monorepo/lldb/test/Shell/Recognizer/assert.test'
Executing commands in '/home/jkratoch/redhat/llvm-monorepo/lldb/test/Shell/Recognizer/assert.test'.
(lldb) run
assert.test.tmp.out: /home/jkratoch/redhat/llvm-monorepo/lldb/test/Shell/Recognizer/Inputs/assert.c:7: int main(): Assertion `a == 42' failed.
Process 29077 stopped
* thread #1, name = 'assert.test.tmp', stop reason = signal SIGABRT
    frame #0: 0x00007ffff7ddee35 libc.so.6`raise + 325
libc.so.6`raise:
->  0x7ffff7ddee35 <+325>: movq   0x108(%rsp), %rax
    0x7ffff7ddee3d <+333>: xorq   %fs:0x28, %rax
    0x7ffff7ddee46 <+342>: jne    0x7ffff7ddee6c            ; <+380>
    0x7ffff7ddee48 <+344>: movl   %r8d, %eax

Process 29077 launched: '/home/jkratoch/redhat/llvm-monorepo-clangassert/tools/lldb/test/Recognizer/Output/assert.test.tmp.out' (x86_64)
(lldb) frame info
frame #0: 0x00007ffff7ddee35 libc.so.6`raise + 325
(lldb) frame recognizer info 0
frame 0 is recognized by Assert StackFrame Recognizer
(lldb) set set thread-format "{${thread.stop-reason-raw}}\n"
(lldb) thread info
signal SIGABRT

(lldb) q

Visible also on (silent) Fedora buildbot.

mib added a comment.Feb 7 2020, 9:51 AM

The test seems to fail on certain linux distros (fedora and archlinux) ... Since it's not a breaking change, I sent a patch to skip it on linux, until I can reproduce it on those platform and fix the issue.