This is an archive of the discontinued LLVM Phabricator instance.

[LLDB] [Windows] Initial support for ARM64 register contexts
Closed, Public

Authored by mstorsjo on Sep 24 2019, 3:04 AM.

Event Timeline

mstorsjo created this revision. Sep 24 2019, 3:04 AM

This seems to be enough for getting a backtrace and variable values for MinGW-built binaries for ARM64.

To clarify: these are binaries that use DWARF debug info.

Have you considered going the "native" route directly? My understanding is that this route is already functional on x86 (modulo watchpoints, which I need to get around to reviewing). It would be great to be able to delete the in-process route soon, as it's not good to have both for a long time, and for that we need to ensure that the lldb-server route does not lag behind in features. I'm not really sure what's needed to enable the lldb-server mechanism, but @aleksandr.urakov should.

One of the advantages of lldb-server is that it is possible to test it by cross-debugging. The mechanism for doing that is a bit complicated, and I'm not entirely sure how that works after all the lit changes, but it goes approximately like this:

  1. build lldb for the host
  2. build lldb-server for the target (either on the target, or via cross-compile)
  3. run lldb-server platform --server *:1234 on the target
  4. run the dotest test suite with arguments suitable for cross-compilation. This means:
    • setting the test compiler to be a cross compiler (cmake -DLLDB_TEST_C(XX)_COMPILER)
    • choosing a suitable test architecture (-DLLDB_TEST_ARCH)
    • telling it how to connect to the target (-DLLDB_TEST_USER_ARGS=--platform-name remote-windows --platform-url connect://remote:1234 --platform-working-dir=c:\tmp)
    • crossing your fingers :)
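The options in step 4 can be sketched as a small helper that assembles the cmake arguments. This is only an illustration: the compiler names and the architecture value are hypothetical placeholders, LLDB_TEST_C(XX)_COMPILER is read here as the pair LLDB_TEST_C_COMPILER and LLDB_TEST_CXX_COMPILER, and the user arguments are passed through exactly as written in the list above (the quoting or separator that cmake expects may differ).

```python
def remote_test_cmake_args(c_compiler, cxx_compiler, arch, platform_url,
                           platform_name="remote-windows",
                           working_dir=r"c:\tmp"):
    """Build the cmake -D options for running the test suite against a remote target."""
    # Connection settings forwarded to the test driver, spelled as in the list above.
    user_args = (
        f"--platform-name {platform_name} "
        f"--platform-url {platform_url} "
        f"--platform-working-dir={working_dir}"
    )
    return [
        f"-DLLDB_TEST_C_COMPILER={c_compiler}",
        f"-DLLDB_TEST_CXX_COMPILER={cxx_compiler}",
        f"-DLLDB_TEST_ARCH={arch}",
        f"-DLLDB_TEST_USER_ARGS={user_args}",
    ]


# Hypothetical cross-compiler names and architecture value, for illustration only.
for option in remote_test_cmake_args(
    c_compiler="aarch64-w64-mingw32-clang",
    cxx_compiler="aarch64-w64-mingw32-clang++",
    arch="aarch64",
    platform_url="connect://remote:1234",
):
    print(option)
```

The URL must match the address and port that lldb-server platform listens on in step 3.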

But, of course the easiest way to test this would be to build and test natively. The usual obstacle for that is that the arm device is too small to comfortably work on it, but you say you don't have the full build environment *yet*, so it's not clear to me whether that is the problem you're running into...

Have you considered going the "native" route directly? My understanding is that this route is already functional on x86 (modulo watchpoints, which I need to get around to reviewing). It would be great to be able to delete the in-process route soon, as it's not good to have both for a long time, and for that we need to ensure that the lldb-server route does not lag behind in features. I'm not really sure what's needed to enable the lldb-server mechanism, but @aleksandr.urakov should.

I haven't dug in to see what's needed to enable it (maybe it is enabled separately per arch somewhere?). I just built lldb for arm64, tried it, noticed that it crashed because m_thread_reg_ctx was unset, made a first proof-of-concept implementation of the missing class, and found that it actually worked for getting usable debug info from my testcase.

One of the advantages of lldb-server is that it is possible to test it by cross-debugging. The mechanism for doing that is a bit complicated, and I'm not entirely sure how that works after all the lit changes, but it goes approximately like this:

  1. build lldb for the host
  2. build lldb-server for the target (either on the target, or via cross-compile)
  3. run lldb-server platform --server *:1234 on the target
  4. run the dotest test suite with arguments suitable for cross-compilation. This means:
    • setting the test compiler to be a cross compiler (cmake -DLLDB_TEST_C(XX)_COMPILER)
    • choosing a suitable test architecture (-DLLDB_TEST_ARCH)
    • telling it how to connect to the target (-DLLDB_TEST_USER_ARGS=--platform-name remote-windows --platform-url connect://remote:1234 --platform-working-dir=c:\tmp)
    • crossing your fingers :)

Interesting - sounds sensible but also sounds like a bit more effort to get going than just running the usual lit tests. Will look into that at some point.

But, of course the easiest way to test this would be to build and test natively. The usual obstacle for that is that the arm device is too small to comfortably work on it, but you say you don't have the full build environment *yet*, so it's not clear to me whether that is the problem you're running into...

Well, I cross build as the environment on the target device is a bit bare, and as it's a new architecture and all, there's not much available when it comes to prebuilt packages (for everything you need for building) - I do have a working compiler on that device, but nothing else. The devices do support running i386 binaries emulated, so I could install such a version of msys2 and do building there, but even more slowly... In general as well, I think building llvm/lldb on such a device requires quite a bit of patience, and rebuilding after fetching newer upstream commits would be pretty painful as well.

I think the most practical setup, at least for lit-style tests that don't require any compilation in themselves, is cross-building all the binaries needed, moving them over to the target device, and running them with llvm-lit there. But that still requires having python available for running llvm-lit.

The alternative is probably to leverage WSL for a mess-free environment with python, shells and everything available, but I'm not sure if that messes things up (for example, llvm-lit might think it runs on Linux even though the binaries it should execute are Windows ones).

compnerd accepted this revision. Sep 24 2019, 8:30 AM

Honestly, this is just setting up the register context for ARM64. I don't think that there is much of a test for this. I mean, I suppose you could test this by instantiating the context and trying to read it through the interface. But I question the value of such a test. Whether you go with the in-process or out-of-process approach, and whether you are doing DWARF or CodeView debugging, this is going to be needed. As to running the test suite - you can cross-compile and run the tests remotely.

lldb/source/Plugins/Process/Windows/Common/arm64/RegisterContextWindows_arm64.cpp
Line 57: It formats better if you add a trailing comma to the list (gpr_cpsr,)
Line 64: Similar
Line 70: Similar

This revision is now accepted and ready to land. Sep 24 2019, 8:30 AM

Honestly, this is just setting up the register context for ARM64. I don't think that there is much of a test for this. I mean, I suppose you could test this by instantiating the context and trying to read it through the interface. But I question the value of such a test. Whether you go with the in-process or out-of-process approach, and whether you are doing DWARF or CodeView debugging, this is going to be needed. As to running the test suite - you can cross-compile and run the tests remotely.

I disagree. As D67892 shows, it is quite possible to mess up even with a "trivial" class such as this one. Ideally I'd like to see tests here similar to what @mgorny added for x86 (see lldb/lit/Register). However, the problem is that we don't have a way to run those tests at the moment. Since this is a problem that's going to show up sooner or later, perhaps with more "nontrivial" patches, I think it's good to figure out what to do early.

labath requested changes to this revision. Sep 24 2019, 9:23 AM
This revision now requires changes to proceed. Sep 24 2019, 9:23 AM

Have you considered going the "native" route directly? My understanding is that this route is already functional on x86 (modulo watchpoints, which I need to get around to reviewing). It would be great to be able to delete the in-process route soon, as it's not good to have both for a long time, and for that we need to ensure that the lldb-server route does not lag behind in features. I'm not really sure what's needed to enable the lldb-server mechanism, but @aleksandr.urakov should.

Hello! I just comment out the contents of the ProcessWindows::Initialize() function to use the lldb-server route.

There are several problems with cross-debugging on Windows now, @leonid.mashinskiy got deep into this. Leonid, can you explain the difficulties, please?

mstorsjo updated this revision to Diff 224757. Oct 12 2019, 1:40 PM
mstorsjo retitled this revision from [LLDB] [Windows] Initial support for ARM64 debugging to [LLDB] [Windows] Initial support for ARM64 register contexts.
mstorsjo edited the summary of this revision.
mstorsjo added a reviewer: aleksandr.urakov.

Added two lit/shell-based tests that pass on both linux/arm64 and windows/arm64. I've managed to set up a somewhat hacked-up environment where I can run lit/shell-based tests (the main Python test driver runs in WSL but executes native Windows binaries for the tests).

I also added a NativeRegisterContext for arm64, for lldb-server. For the RegisterInfoInterface for NativeRegisterContext, I reused RegisterInfoPOSIX_arm64 instead of creating a new copy similar to it, since I didn't really see anything OS specific in there.

The tests pass both with and without use of lldb-server. However, when using lldb-server with NativeRegisterContext, while the register values are correct, I don't get a correct working backtrace with it. Without lldb-server, I get a perfect backtrace. (The tested binary uses SEH unwind tables, but DWARF debug info.) Any clues about what might be going wrong there?

labath accepted this revision. Oct 14 2019, 1:50 AM
labath added a reviewer: mgorny.

Thank you _very_ much for those tests. +@mgorny, in case he has any comments on those.

The tests pass both with and without use of lldb-server. However, when using lldb-server with NativeRegisterContext, while the register values are correct, I don't get a correct working backtrace with it. Without lldb-server, I get a perfect backtrace. (The tested binary uses SEH unwind tables, but DWARF debug info.) Any clues about what might be going wrong there?

Hard to say off-hand, but the first thing I'd check is whether the information about loaded modules and their addresses is making its way into lldb. You can use the "image list" command to inspect that. Then there's the "image show-unwind" command which can show you how lldb will try to unwind for a given function/address. Also, if you enable the "unwind" log channel (log enable lldb unwind), you'll get a trace of what lldb did while attempting to unwind.

This revision is now accepted and ready to land. Oct 14 2019, 1:50 AM

Thanks for the debugging tips. When I run image list I get error: the target has no associated executable images, so that pretty clearly shows that something's missing. On x86_64, image list shows only the debugged exe itself, when run with LLDB_USE_LLDB_SERVER=1, while it shows the exe and a few system dlls (ntdll, kernel32, kernelbase and ucrtbase) when run without lldb-server.
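The comparison described here can be sketched as a script that builds one lldb batch invocation and runs it twice, once with LLDB_USE_LLDB_SERVER=1 and once without. This is a rough sketch, not the procedure actually used in this review: the executable name test.exe, the breakpoint on main, and the helper names are assumptions for illustration, while the diagnostic commands are the ones suggested earlier in the thread.

```python
import os
import subprocess

# Stop at a (hypothetical) breakpoint so that modules are loaded, then inspect
# the module list, the unwind plan for the function, and the backtrace.
DIAGNOSTIC_COMMANDS = [
    "log enable lldb unwind",
    "breakpoint set -n main",
    "run",
    "image list",
    "image show-unwind -n main",
    "bt",
]


def lldb_batch_argv(exe, commands, lldb="lldb"):
    """Build an argv that runs each command in turn and then exits."""
    argv = [lldb, "--batch"]
    for command in commands:
        argv += ["-o", command]
    return argv + ["--", exe]


def env_for(use_lldb_server):
    """Copy the current environment, toggling the lldb-server route."""
    env = dict(os.environ)
    if use_lldb_server:
        env["LLDB_USE_LLDB_SERVER"] = "1"
    else:
        env.pop("LLDB_USE_LLDB_SERVER", None)
    return env


def run_diagnostics(exe, use_lldb_server):
    """Run lldb once and return its combined output as text."""
    result = subprocess.run(
        lldb_batch_argv(exe, DIAGNOSTIC_COMMANDS),
        env=env_for(use_lldb_server),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    return result.stdout


# Only print the command line here; call run_diagnostics("test.exe", True) and
# run_diagnostics("test.exe", False) on the target to compare the two routes.
print(lldb_batch_argv("test.exe", DIAGNOSTIC_COMMANDS))
```

Comparing the "image list" section of the two outputs shows directly whether module information is reaching lldb on the lldb-server route.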

Thanks for the debugging pointers. I managed to track this down, with a fix suggestion in D68939. I guess that indicates another issue elsewhere, but it's pretty much out of scope for me to dig further into that issue now. I guess it should be possible to reproduce the same issue on x86_64 by changing the triple similarly there as well.

This revision was automatically updated to reflect the committed changes.

Is there a specific reason you've only covered x0..x7 in the test?

No, only for keeping the test short and cohesive.