This is an archive of the discontinued LLVM Phabricator instance.

[RFC] Add GDB remote packet reproducer.
ClosedPublic

Authored by JDevlieghere on Aug 3 2018, 7:10 AM.

Diff Detail

Event Timeline

JDevlieghere created this revision.Aug 3 2018, 7:10 AM
vsk added a subscriber: vsk.Aug 3 2018, 11:22 AM

Hi Jonas, great you are working on this! I look through and can't say much about the logic. It looks good to me, just added a few notes on minor things.

include/lldb/Utility/Reproducer.h
29

Not sure if this is relevant here, but in my experience nested classes always caused trouble as there is still no way in C++ to forward declare their names.

49

This returns a copy by value, while AddProviderToIndex() takes a const ref (that makes sense because ProviderInfo is no sink argument there).
It works because local const refs prolong the lifespan of the temporary, but is it intended?

100

This is the same as Provider &Register(std::unique_ptr<Provider> provider);

https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#Rf-consume

source/Utility/Reproducer.cpp
35

Missing brackets, should be assert((!value || !use_reproducer) && "..."); Same below.

JDevlieghere marked 4 inline comments as done.

Thanks for the review Stefan!

shafik added inline comments.Aug 21 2018, 1:04 PM
source/Utility/CMakeLists.txt
66

Do you also need to update the Xcode project?

JDevlieghere added inline comments.Sep 4 2018, 7:20 AM
source/Utility/CMakeLists.txt
66

Yup. I'm using the cmake build for development so I'll leave that until the end of the review.

JDevlieghere retitled this revision from [WIP] Add GDB remote packet reproducer. to [RFC] Add GDB remote packet reproducer..Sep 19 2018, 5:23 AM
JDevlieghere added a comment.EditedSep 19 2018, 6:01 AM

I'm having some trouble with the test case. Based on the initialization code I assume I'm not supposed to destroy the SBDebugger singleton shared by the LLDB test suite. If I do it anyway the test crashes with the an exception:

libc++abi.dylib: terminating with uncaught exception of type std::__1::system_error: recursive_mutex lock failed: Invalid argument

Instead I tried to create my own SBDebugger instance and do everything though my_debugger.GetCommandInterpreter().HandleCommand(). But then I don't get a backtrace:

Output Message:
* thread #1
  * frame #0: 0x0000000100004000 dyld`_dyld_start

While I expect:

* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
  * frame #0: 0x0000000100000f48 a.out`foo at main.c:13
    frame #1: 0x0000000100000f7b a.out`main(argc=1, argv=0x00007ffeefbff188) at main.c:17
    frame #2: 0x00007fff6b9380a1 libdyld.dylib`start + 1
    frame #3: 0x00007fff6b9380a1 libdyld.dylib`start + 1

Anybody know what I'm doing wrong here?

JDevlieghere edited the summary of this revision. (Show Details)Sep 19 2018, 7:03 AM

Sorry, I can't help with the test. Code looks great! Only a few minor things inline.

source/Core/Debugger.cpp
587

Is static_cast<bool>(p) not working?

source/Plugins/Process/gdb-remote/GDBRemoteCommunicationHistory.h
56

The default is support to record no history? I think the % operator in NormalizeIndex() will cause UB in that case.

98

This code existed already, but anyway: it's an overwriting ring buffer right? Maybe that's worth a comment.
Is there another usage apart from DumpProcessGDBRemotePacketHistory?

First batch of comments, may come back to it today.

source/Host/common/HostInfoBase.cpp
200

So C++11 made it thread safe when initializing a static variable see stmt.dcl/p4 and also std::call_once docs also mention this.

Is there a reason why in llvm we would prefer call_once?

source/Plugins/Process/gdb-remote/GDBRemoteCommunication.cpp
1256

5 looks like a magic number, we should give it a name.

source/Plugins/Process/gdb-remote/GDBRemoteCommunicationHistory.h
44

It says binary data but it contains a string ... it feels confusing. I can see the underlying methods you are using take char which is annoying since char could be either signed or unsigned so not obvious which u/int8_t make sense.

source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp
3479

It is not obvious to me what the underlying assumptions but GetTarget() is locking a weak_ptr don't we have to check and make sure we actually obtain a non empty result? I realize that is not directly here but then why use a weak_ptr if we never verify the return of lock() ?

aprantl added inline comments.Sep 20 2018, 3:46 PM
source/API/SBDebugger.cpp
1062

This pattern appears multiple times in this file. Factor it out into a make / get CStringOrNull() function?

source/Core/Debugger.cpp
286

Will it do this on the side or on demand? If it is the former, then "log" seems more appropriate than "generate". If it is the latter, the help text should indicate what triggers the generation.

What about "replay" instead of "reproduce"?

591

clang-format

593

I don't understand that comment. What's that?

source/Plugins/Process/gdb-remote/GDBRemoteCommunicationHistory.cpp
40
if (!size)
  return;
56

ditto

85

if (!log) return ...

source/Plugins/Process/gdb-remote/GDBRemoteCommunicationHistory.h
24

Doxygen comments?

source/Plugins/Process/gdb-remote/GDBRemoteCommunicationReplayServer.cpp
63

please invert the condition and early-exit in the error case.

150

invert

186

while (true)

195

return nullptr

198

return nullptr

Does this ever return anything else?

source/Plugins/Process/gdb-remote/GDBRemoteCommunicationReplayServer.h
21

These should be before the C++ includes.

37

Doxygen?

source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp
3480

Factor this out into a static function?

JDevlieghere marked 18 inline comments as done.

Thanks a lot for the review feedback everyone!

source/API/SBDebugger.cpp
1062

I don't think that would help a lot:

  • It only happens twice: for the prompt and for the reproducer.
  • We call different methods on m_opaque_sp so we'd have to pass a lambda or function pointer.
source/Core/Debugger.cpp
286

Sounds good!

source/Host/common/HostInfoBase.cpp
200

I just copied the code from somewhere else. However, your link to cppref specifies thread safety for function-local statics, which is not the case here. I don't think that guarantee applies to globals?

source/Plugins/Process/gdb-remote/GDBRemoteCommunication.cpp
1256

I moved this code, but you're totally right, I had to open the header to know what it was.

source/Plugins/Process/gdb-remote/GDBRemoteCommunicationHistory.h
98

Added it to the Doxygen comment.

source/Plugins/Process/gdb-remote/GDBRemoteCommunicationReplayServer.cpp
198

Nope :-)

friss added inline comments.Sep 21 2018, 9:29 AM
source/Host/common/HostInfoBase.cpp
200

call_once is better for a couple reasons:

  • it avoid a static constructor which would slow down the launch/loading of the debugger.
  • When you have global objects, you have to care not only about their construction, but also their destruction. This will cost time on exit and potentially introduce ordering issues if those objects use other global resources.
shafik added inline comments.Sep 21 2018, 10:17 AM
source/Host/common/HostInfoBase.cpp
200

Fred, thank you for the comments on general usage.

200

It is not obvious this is being invoked during global construction ... so I guess that would mean Generator is a global since that seems to be what is calling GetReproducerTempDir?

Maybe a comment explaining the rationale would help prevent someone in the future from coming around and saying hey we can refactor this.

friss added inline comments.Sep 21 2018, 10:23 AM
source/Host/common/HostInfoBase.cpp
200

Note that my comment was really general about the use of call_once. I'm a little ashamed to admit that I haven't actually looked at this particular case :-)

Ping. Anybody have any potential issues/suggestions/concerns regarding this patch?

  • Introduce command object for reproducers.
  • Fix issue where lifetime of the reproducer conflicted with the lifetime of the GDB remote process.
  • Ensure packets are properly flushed.
  • Reduce complexity in test.
  • Rebase
  • Replace reproducer capture -enable and reproducer capture -disable with reproducer capture enable and reproducer capture disable
jingham accepted this revision.Oct 31 2018, 3:47 PM

LGTM

This revision is now accepted and ready to land.Oct 31 2018, 3:47 PM
JDevlieghere closed this revision.Nov 13 2018, 11:42 AM

Forgot to edit the patch description before landing this.

This landed in rL346780

I'm having some trouble with the test case. Based on the initialization code I assume I'm not supposed to destroy the SBDebugger singleton shared by the LLDB test suite. If I do it anyway the test crashes with the an exception:

libc++abi.dylib: terminating with uncaught exception of type std::__1::system_error: recursive_mutex lock failed: Invalid argument

I'm looking at a similar issue that happens during extended lldb testing. I see the following error: "terminating with uncaught exception of type std::__1::system_error: mutex lock failed: Invalid argument "

It happens *very* infrequently, a fraction of a percentage of the time, and only seems to happen on release builds and always when lldb test appears to be exiting. Looks like this:

test_find_types_struct_type (TestFindStructTypes.TestFindTypesOnStructType)

Ran 1 test in 0.813s

RESULT: PASSED (1 passes, 0 failures, 0 errors, 0 skipped, 0 expected failures, 0 unexpected successes)
terminating with uncaught exception of type std::__1::system_error: mutex lock failed: Invalid argument


This causes the harness to report false failures and there is no clear cut pattern of failure so I can note the tests as flaky and rerun.

Any clue as to what I might look at? I think it might be something in the teardown sequence in Debugger.cpp.

Thanks