Page MenuHomePhabricator

Initial support for native debugging of x86/x64 Windows processes
ClosedPublic

Authored by asmith on Jun 11 2019, 1:45 PM.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
Hui added inline comments.Jun 12 2019, 3:01 PM
source/Plugins/Process/Windows/Common/DebuggerThread.cpp
350 ↗(On Diff #204154)

The definition is in ntstatus.h which has quite a few macros redefined in winnt.h that is implicitly included in both LLDB. and LLVM.
Could force to include ntstatus.h but need to make a handful of extra patches to replace EXCEPTION_BREAKPOINT macro etc with ntstatus's
in all concerned sources.

source/Plugins/Process/Windows/Common/NativeProcessWindows.cpp
529–530 ↗(On Diff #204154)

This factory attach will return a constructed native process which needs process architecture ahead to construct its native thread with a proper register context. So I think can't do it after the attach operation. At least need to do it before the factory routine returns. Do you mean to put these codes before line 540, i.e. return std::move(process_up)?

In D63165#1540606, @Hui wrote:

Sorry for the stupid question, but ...

What exactly is meant here by "Native"? How is a NativeProcessWindows different from ProcessWindows?

The Native*** classes are meant to be used from lldb-server. They look somewhat similar to their non-native counterpart because they still do debugging, but they're a lot dumber, because they only deal with basic process control, and none of the fancy symbolic stuff that you'd need debug info for.

They differ in APIs but most of them have common implementations. The APIs from native process classes are more easy to apply process/thread control.
Hope the native and non-native ones can be merged. The similar thing to the RegisterContext and NativeRegisterContext classes.

The other thing is that using "native" classes can avoid linking a lot of unnecessary lldb libs (LLDB plugins or whatever comes with the plugins) to lldb-server.
The nativeprocesswindows could just be a pass-through to processwindows plugin, but the usage is a sort of strange since the
lldb-server needs to initialize the plugin, create a target, and create a instance just like what lldb does. This means literally
there will be two lldb debuggers, one on host and the other one on remote. It is doable, but not that applicable.

So the idea is ProcessWindows will be deleted once lldb-server supports windows debugging. A bit of history here: when we first started we started making ProcessXXXX for each platform (mac, linux, windows). Then we thought about remote debugging and decided to have everyone just make a lldb-server for their platform and even when we are locally debugging, we launch a lldb-server. This is how linux, macOS, the BSDs and other targets do it. Why? Because if you do it the other way you have more code to support: one native plug-in and one remote plug-in. This means the remote debugging usually has more issues since it is the less tested debugging scenario. It also means that if you had a native process implementation only, like ProcessWindows is currently, you can't remotely debug to a windows machine.

So yes there is duplicated code for now, but ProcessWindows.cpp and ThreadWindows.cpp should go away in the near future once lldb-server takes over so the temporary code duplication is just to keep things working until lldb-server takes over.

Hui added inline comments.Jun 14 2019, 2:49 PM
source/Plugins/Process/Windows/Common/NativeRegisterContextWindows.h
31 ↗(On Diff #204154)

No, it doesn't override anything. It has different signature from the pure virtual method with the same name.

NativeRegisterContext::virtual Status ReadAllRegisterValues(lldb::DataBufferSP &data_sp) = 0;

It would be better to change the name to be ReadAllRegisterValuesWithSize or something else.

labath added inline comments.Jun 17 2019, 6:34 AM
source/Plugins/Process/Windows/Common/NativeProcessWindows.cpp
315 ↗(On Diff #204154)

This doesn't look right. You're completely ignoring the file_name argument. The idea of this function is to return the load (base) address of the module specified by that argument. Here I guess you're always returning the load address of the main module.

529–530 ↗(On Diff #204154)

I meant doing this *before* the factory returns, but *after* you perform the actual OS attach syscall is completed.

Combining this with the constructor suggestion below, I'd imagine doing something like

Error E = Error::success();
auto process_up = std::make_unique<NativeProcessWindows>(pid, -1, native_delegate, E);
if (E)
  return E;
return std::move(process_up);

Then in the process constructor, you'd do:

ErrorAsOutParameter EOut(E);
auto delegate_sp = std::make_shared<NativeDebugDelegate>(*this);
ProcessAttachInfo attach_info;
attach_info.SetPID(pid); // TODO: This is dumb, we are already passing the pid separately
E = AttachProcess(pid, attach_info, delegate).ToError();
if (E)
  return;
SetID(GetDebuggedProcessId());
ProcessInstanceInfo info;
if (!Host::GetProcessInfo(pid, info)) {
  E = createStringError(inconvertibleErrorCode(), "Cannot get process information")
  return;
}
m_arch = info.GetArchitecture();

Among other things, I'm trying to cut down on the number of layers one has to go through in order to launch/attach. This file already has a number of functions called "attach", D63166 has a bunch more, and then there even more in DebuggerThread. By the time one gets to the bottom of the stack and figures out what's actually happening, he's forgotten what was he actually trying to achieve.

source/Plugins/Process/Windows/Common/NativeProcessWindows.h
31 ↗(On Diff #204154)

I'm not sure this multiple inheritance is really helping anything here. While looking through the code i kept wondering "why doesn't this function just do the job itself instead of delegating to somebody else" only to have to remind myself that the function it is delegating to is actually a different object, which does not know anything about the bigger picture.

If using composition here wouldn't introduce additional complexities/code (which it doesn't look like it would), I think it might be better to do that instead.

133 ↗(On Diff #204154)

It doesn't look like process should ever be null. So, I'd suggest replacing the pointer by a reference, and deleting all the null checks.

source/Plugins/Process/Windows/Common/NativeRegisterContextWindows.h
31 ↗(On Diff #204154)

Hm.., interesting. I think I am starting to understand what's going on here. IIUC, the size of the register context is a constant for any given architecture, but it varies between architectures (i.e., NativeRegisterContextWindows_XXX instances). If that's the case, then it sounds like the data_size should be an argument to the NativeRegisterContextWindows constructor, instead it being passed down for every call. Would that work?

asmith updated this revision to Diff 205164.Jun 17 2019, 1:23 PM

Address reviewers comments

asmith updated this revision to Diff 205175.Jun 17 2019, 1:46 PM

Fix caps for WoW64

Hui added inline comments.Jun 21 2019, 8:35 AM
source/Plugins/Process/Windows/Common/NativeProcessWindows.h
31 ↗(On Diff #204154)

Do you mean to move what is in ProcessDebugger into nativeprocess instead?

labath added inline comments.Jun 21 2019, 8:42 AM
source/Plugins/Process/Windows/Common/NativeProcessWindows.h
31 ↗(On Diff #204154)

No, just instead of inheriting from a ProcessDebugger, making it a member variable instead. Unless there are reasons which make that hard/impossible...

Hui added inline comments.Jun 21 2019, 9:37 AM
source/Plugins/Process/Windows/Common/NativeProcessWindows.h
31 ↗(On Diff #204154)

Debug event callback in ProcessDebugger need to be overridden by both processwindows and nativeprocess. Not sure if the change is doable. Also same change is required to processwindows. That will make the patch a little bigger.

asmith updated this revision to Diff 206146.Jun 23 2019, 4:18 PM
asmith marked an inline comment as done.
asmith marked 10 inline comments as done.Jun 23 2019, 4:28 PM
labath added inline comments.Jun 24 2019, 1:49 AM
lldb/source/Plugins/Process/Windows/Common/NativeProcessWindows.h
46–50

I guess these shouldn't be public as these object should be constructed through the factory, right?

lldb/source/Plugins/Process/Windows/Common/NativeRegisterContextWindows.cpp
32–37

new doesn't fail. This is dead code.

53–58

This can't ever be true.

69–90

These look like they should be operating invariants (at most guarded by an assertion, but probably even that is not needed). Right now, they're just making it hard to figure out what this function actually does...

lldb/source/Plugins/Process/Windows/Common/NativeRegisterContextWindows_WoW64.cpp
83

How about a helper method like NativeThreadWindows &GetThread() to hide the static_cast everywhere. You could also change the constructor argument to NativeThreadWindows& to make it clear that the register context is only to be constructed with threads of this type.

a GetThreadHandle might be nice too since fetching the thread seems to be invariably followed by the GetHostThread().GetNativeThread().GetSystemHandle() blurb.

lldb/source/Plugins/Process/Windows/Common/NativeThreadWindows.cpp
81–85

What's the purpose of this function? It seems to be only called immediately before deleting the thread, so nobody gets to read it's stop reason anyway...

source/Plugins/Process/Windows/Common/NativeProcessWindows.h
31 ↗(On Diff #204154)

Ok, let's leave it like that for now. However, I find it very weird that the ProcessDebugger class takes a "delegate" object *and* it requires you to inherit from it to override some of it's methods. Seems like one of these extension mechanisms ought to be enough.

source/Plugins/Process/Windows/Common/NativeRegisterContextWindows.h
31 ↗(On Diff #204154)

I still think the register context size would be better handled as an constructor argument.

asmith updated this revision to Diff 207002.Jun 28 2019, 12:27 AM

removed the "clang-format off" in a few places

asmith added inline comments.Jun 30 2019, 1:37 PM
lldb/source/Plugins/Process/Windows/Common/NativeRegisterContextWindows.cpp
32–37

The code is copied from/following the other targets. Maybe all the targets could be cleaned up in another PR. For example Arm64,

NativeRegisterContextLinux_arm64::ReadAllRegisterValues()

data_sp.reset(new DataBufferHeap(REG_CONTEXT_SIZE, 0));
if (!data_sp)
53–58

Copied from Arm64

asmith updated this revision to Diff 207234.Jun 30 2019, 1:38 PM

Add a define for STATUS_WX86_BREAKPOINT.

labath added inline comments.Jul 1 2019, 6:21 AM
lldb/source/Plugins/Process/Windows/Common/NativeRegisterContextWindows.cpp
32–37

Done in r364744.

53–58

Done in r364754.

asmith updated this revision to Diff 207635.Jul 2 2019, 2:43 PM
asmith edited the summary of this revision. (Show Details)

removed the redundant null checks
added GetThreadHandle to simplify code

asmith marked 6 inline comments as done.Jul 2 2019, 2:45 PM
asmith updated this revision to Diff 210480.Jul 17 2019, 9:07 PM
asmith edited the summary of this revision. (Show Details)

I believe this address all the comments.

I am sorry, but the code still seems a lot more verbose to me than it should be needed to implement the given functionality. I'd like to understand why/if it's that necessary..

lldb/source/Plugins/Process/Utility/RegisterContextWindows_WoW64.cpp
72–105

This is very verbose. You could just have GetRegisterInfo return g_register_infos_WoW64 directly (and similarly for other functions. Then all you'd be left with is putting an assert(target_arch.GetMachine() == llvm::Triple::x86) into the RegisterContextWindows_WoW64 constructor. Same goes for other register context classes.

lldb/source/Plugins/Process/Utility/RegisterContextWindows_x86_64.cpp
141

Why is all of this complexity necessary? Couldn't you just switch on the target architecture in CreateRegisterInfoInterface in NativeRegisterContextWindows_x86_64.cpp and create either RegisterContextWindows_WoW64 or RegisterContextWindows_x86_64 ?

In fact, if RegisterContextWindows_WoW64 is identical to RegisterContextWindows_i386, then why do you even need the _WoW64 version of the class in the first place? FWIW, linux also does not have a RegisterContextLinux_32_bit_process_on_64_bit_kernel class...

lldb/source/Plugins/Process/Windows/Common/NativeProcessWindows.cpp
285–286

I think it would be better to return an error here. llvm::errc::not_supported ?

311–337

This doesn't seem right. You only retrieve the module list the first time this function was called, but the list of loaded modules can change over time. You probably need to invalidate this cache every time the process is resumed...

lldb/source/Plugins/Process/Windows/Common/NativeProcessWindows.h
46–50

This doesn't seem to be addressed.

lldb/source/Plugins/Process/Windows/Common/NativeRegisterContextWindows_x86_64.cpp
104–109

Are these suspend/resume calls necessary? You should be able to assume that the process is stopped (due to breakpoint, exception or whatever). Nobody will be calling functions that get/set registers on a running process. (linux does not support that either, but we don't bother explicitly stopping the process before we attempt to do this).

lldb/source/Plugins/Process/Windows/Common/NativeThreadWindows.cpp
139–199

IIUC, none of the current register contexts implement watchpoint support right now. Which means this code is dead/untested. Let's remove it for the time being...

lldb/unittests/tools/lldb-server/tests/TestClient.cpp
258–260

This doesn't sound right. If the inferior exits, lldb-server should exit too. It doesn't matter if that's implemented via SIGHUP, or whatever. If some of the tests don't pass because of this, maybe you can disable the tests for the time being..

amccarth requested changes to this revision.Jul 18 2019, 8:17 AM
In D63165#1540606, @Hui wrote:

Sorry for the stupid question, but ...

What exactly is meant here by "Native"? How is a NativeProcessWindows different from ProcessWindows?

The Native*** classes are meant to be used from lldb-server. They look somewhat similar to their non-native counterpart because they still do debugging, but they're a lot dumber, because they only deal with basic process control, and none of the fancy symbolic stuff that you'd need debug info for.

They differ in APIs but most of them have common implementations. The APIs from native process classes are more easy to apply process/thread control.
Hope the native and non-native ones can be merged. The similar thing to the RegisterContext and NativeRegisterContext classes.

The other thing is that using "native" classes can avoid linking a lot of unnecessary lldb libs (LLDB plugins or whatever comes with the plugins) to lldb-server.
The nativeprocesswindows could just be a pass-through to processwindows plugin, but the usage is a sort of strange since the
lldb-server needs to initialize the plugin, create a target, and create a instance just like what lldb does. This means literally
there will be two lldb debuggers, one on host and the other one on remote. It is doable, but not that applicable.

So the idea is ProcessWindows will be deleted once lldb-server supports windows debugging. A bit of history here: when we first started we started making ProcessXXXX for each platform (mac, linux, windows). Then we thought about remote debugging and decided to have everyone just make a lldb-server for their platform and even when we are locally debugging, we launch a lldb-server. This is how linux, macOS, the BSDs and other targets do it. Why? Because if you do it the other way you have more code to support: one native plug-in and one remote plug-in. This means the remote debugging usually has more issues since it is the less tested debugging scenario. It also means that if you had a native process implementation only, like ProcessWindows is currently, you can't remotely debug to a windows machine.

So yes there is duplicated code for now, but ProcessWindows.cpp and ThreadWindows.cpp should go away in the near future once lldb-server takes over so the temporary code duplication is just to keep things working until lldb-server takes over.

Thanks for all the clarification. "Native" is an overloaded term in LLDB.

I'm trying to think through the implications of this always-use-an-lldb-server approach on cross-platform postmortem debugging. I'll have to do some studying, but I guess this patch doesn't change anything in that regard.

lldb/source/Plugins/Process/Windows/Common/NativeProcessWindows.cpp
95

This code appears to be based on ProcessWindows::DoResume, which has a pending patch to fix a race condition by moving the checking of the exception record to later in the function. I assume this will need the same fix. See https://reviews.llvm.org/D62183 for details.

This revision now requires changes to proceed.Jul 18 2019, 8:17 AM

I'm trying to think through the implications of this always-use-an-lldb-server approach on cross-platform postmortem debugging. I'll have to do some studying, but I guess this patch doesn't change anything in that regard.

I think the implications of this are zero. In the post mortem case we would still create an (in-process) instance of ProcessMinidump, and it work the the same way as it does now. Just like we create a ProcessElfCore for opening linux core files even though we always use lldb-server (via ProcessGdbRemote) for live debugging.

Pretty much nothing of what lldb-server does is useful for debugging core files anyway. (Though I have had some ideas about creating a mock lldb-server operating on core files for the purposes of testing lldb.)

Hui added inline comments.Jul 18 2019, 9:09 PM
lldb/source/Plugins/Process/Windows/Common/NativeProcessWindows.h
46–50

These constructors contain a call to function template make_unique that is not allowed to access private members of NativeProcessWindows.
Simply do the following could address this concern.

-  auto process_up =
-      std::make_unique<NativeProcessWindows>(launch_info, native_delegate, E);

+  auto process_up = std::unique_ptr<NativeProcessWindows>(
+      new NativeProcessWindows(launch_info, native_delegate, E));
Hui added inline comments.Jul 18 2019, 10:46 PM
lldb/source/Plugins/Process/Windows/Common/NativeRegisterContextWindows_x86_64.cpp
104–109

Yes, agreed. But such codes are added to be consistent with what is now in upstream from https://reviews.llvm.org/rL364216. (CacheAllRegisterValues).

Hui added inline comments.Jul 18 2019, 10:54 PM
lldb/source/Plugins/Process/Utility/RegisterContextWindows_x86_64.cpp
141

I think WoW64 is i686 that shall deserve a separate arch specific implementation?

labath added inline comments.Jul 19 2019, 3:14 AM
lldb/source/Plugins/Process/Utility/RegisterContextWindows_x86_64.cpp
141

I am sorry, but I don't follow. Can you repeat the question?

(FWIW, I don't believe that the fact that two things are different from a certain point of view justifies copy-pasting (at least) 150 LOC. If you think it's confusing to use something called "i386" in things dealing with WoW64, you can always typedef the WOW64 context to it, or rename the i386 context to something more generic.)

lldb/source/Plugins/Process/Windows/Common/NativeProcessWindows.h
46–50

The make_unique thing is unfortunate, but that is the nature of c++. Since you're concerned about consistency, I'll mention that other classes (e.g. NativeProcessLinux) also prioritize clarifying what is the public interface of a class over the niceness of being able to call make_unique.

lldb/source/Plugins/Process/Windows/Common/NativeRegisterContextWindows_x86_64.cpp
104–109

Interesting. I don't think it's worth being consistent with that, as that code is hopefully going to go away soon, and lldb-server can offer much stronger guarantees about when these things can be called (due to everything being funneled through the gdb-remote protocol, which doesn't even support sending any packets while the process is running).

However, I am not familiar with windows APIs, so I'll leave this decision up to you...

asmith updated this revision to Diff 212259.Jul 29 2019, 6:02 PM
asmith marked an inline comment as done.
  • Undo the change to the test client
  • Remove changes for threads
  • Invalidate module cache on loading/unloading DLLs
asmith marked 11 inline comments as done.Jul 29 2019, 6:10 PM
asmith added inline comments.
lldb/source/Plugins/Process/Windows/Common/NativeProcessWindows.cpp
95

Thanks for the heads up

asmith marked an inline comment as done.Jul 29 2019, 6:10 PM
asmith updated this revision to Diff 212472.Jul 30 2019, 5:03 PM
asmith marked 2 inline comments as done.Jul 30 2019, 5:07 PM

I don't see anything else to address in this review. Comments?

lldb/source/Plugins/Process/Utility/RegisterContextWindows_x86_64.cpp
141

I don't want this to block the review and have removed the code.

labath added inline comments.Jul 31 2019, 12:48 AM
lldb/source/Plugins/Process/Utility/RegisterContextWindows_i386.cpp
71–104

I think all of this could be bolied down to:

RegisterContextWindows_i386::RegisterContextWindows_i386(
    const ArchSpec &target_arch)
    : lldb_private::RegisterInfoInterface(target_arch),
      m_register_info_p(g_register_infos_i386),
      m_register_info_count(llvm::array_lengthof(g_register_infos_i386)),
      m_user_register_count(llvm::array_lengthof(g_register_infos_i386)) {
  assert(target_arch.GetMachine() == llvm::Triple::x86);
}

In fact, I'd probably not even bother with the member variables, but have the public RegisterContextWindows_i386 functions return the values directly (just like GetGPRSize does).

lldb/source/Plugins/Process/Utility/RegisterContextWindows_x86_64.cpp
141

Thanks. FTR, I'm not opposed to splitting these classes again in the future, if we develop a need for them to diverge (though it would be nice to find a way to factor the common parts in that case).

However, there's one more thing that bothers me here. It's this business with constructing a RegisterContextWindows_i386 here, copying the entries out of it, and re-using them elsewhere while asserting that things were done in the right order. It seems very complex, and I'm not sure that complexity is needed. It seems to me like all of that could be removed if you just made the decision which set of registers to use one level up (i.e., in CreateRegisterInfoInterface in NativeRegisterContextWindows_x86_64.cpp. You could just have that function switch on the process byte size, and return RegisterContextWindows_i386 or RegisterContextWindows_x86_64. This is the same way things are done on x86 linux (see NativeRegisterContextLinux_x86_64.cpp).

labath added inline comments.Jul 31 2019, 2:14 AM
lldb/source/Plugins/Process/Utility/RegisterContextWindows_x86_64.cpp
141

Unfortunately NativeRegisterContextLinux_x86_64 is not good example here. It turns I misread, and that class actually switches on the *host* architecture instead of *target*, and its implementation of RegisterContextLinux_x86_64 does indeed the i386 copying tricks that are very similar to what you do here. However, that class has the "excuse" of needing to fix up the register offsets in the i386-on-x86_64 case (for those following along: see UPDATE_REGISTER_INFOS_I386_STRUCT_WITH_X86_64_OFFSETS). You don't need to do anything like that here, so I would hope that we can do something simpler here.

And I'll try to figure out a way to do the linux thing in a saner fashion...

asmith marked an inline comment as done.Aug 2 2019, 12:11 AM
asmith added inline comments.
lldb/source/Plugins/Process/Utility/RegisterContextWindows_x86_64.cpp
141

Are you accepting this review?

Are you accepting this review?

No I would still like to see the two comments I made in <D63165#1607877> to be addressed. The last comment was just acknowledging the fact that you (or is it @Hui? I'm not sure who actually writes these patches...) were following the existing pattern in the linux register contexts. However, the pattern over there is for a reason, and that reason does not apply here. That doesn't make the linux code less ugly (probably the opposite), but it does mean we shouldn't try to emulate it if there are better options available...

asmith added a comment.Aug 2 2019, 2:26 AM

I can make the one simplification but Im not sure what you are asking as far as the refactoring. Provide an example and I will try to accommodate your request. I'm not going to be able to spend much more time on lldb patches though.

labath added a comment.Aug 2 2019, 2:47 AM

I can make the one simplification but Im not sure what you are asking as far as the refactoring. Provide an example and I will try to accommodate your request. I'm not going to be able to spend much more time on lldb patches though.

I think you should change the CreateRegisterInfoInterface in NativeRegisterContextWindows_x86_64.cpp to something like:

static RegisterInfoInterface *
CreateRegisterInfoInterface(const ArchSpec &target_arch) {
  assert((HostInfo::GetArchitecture().GetAddressByteSize() == 8) &&
         "Register setting path assumes this is a 64-bit host");
  switch (target.arch.GetAddressByteSize()) {
  case 8:
    return new RegisterContextWindows_x86_64(target_arch);
  case 8:
    return new RegisterContextWindows_i386(target_arch);
  default:
    llvm_unreachable("Unsupported byte size!");
}

Then you should be able to remove all of the g_private_reg_interface business from RegisterContextWindows_x86_64.cpp, as the RegisterContextWindows_x86_64 will only ever be called with a 64-bit ArchSpec. In fact, since you already have NativeRegisterContextWindows_WoW64, I am not sure if even the 4 branch is needed, as in case of GetAddressByteSize() == 4, we should end up in CreateRegisterInfoInterface in NativeRegisterContextWindows_WoW64 (which already correctly constructs a RegisterContextWindows_i386).

asmith updated this revision to Diff 213781.Aug 6 2019, 6:59 PM
asmith marked 6 inline comments as done.

Any last comments before this is committed?

I am still waiting for the g_private_reg_interface thingy to be removed, or for an explanation of why it cannot be done...

asmith updated this revision to Diff 214781.Aug 13 2019, 1:32 AM

This is done.

labath accepted this revision.Aug 13 2019, 7:38 AM

This looks great to me. Thank you for your patience.

lldb/source/Plugins/Process/Utility/RegisterContextWindows_x86_64.cpp
11

I think this include should no longer be needed.

asmith updated this revision to Diff 214936.Aug 13 2019, 3:03 PM
asmith edited the summary of this revision. (Show Details)
asmith updated this revision to Diff 214942.Aug 13 2019, 3:14 PM
This revision was not accepted when it landed; it landed in state Needs Review.Aug 13 2019, 3:18 PM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptAug 13 2019, 3:18 PM