This is an archive of the discontinued LLVM Phabricator instance.

Refactor and generalize debugserver code for setting hardware watchpoints on AArch64
ClosedPublic

Authored by jasonmolenda on Apr 23 2023, 11:17 PM.

Details

Summary

I am adding MASK (power of 2, with that alignment) watchpoint support to debugserver AArch64, where it currently only does Byte Address Select (BAS, 1-8 bytes aligned to doubleword) watchpoints. This patch generalizes debugserver's watchpoint setting code so that it will be simple to drop in MASK watchpoints for larger memory regions in a later patch.

debugserver supports an unusual feature where a request to watch an unaligned range of bytes is handled. If you have a 16 byte buffer at 0x1000 and ask to watch 4 bytes at 0x1006, lldb needs to watch 0x1006-0x1009 inclusive. A single BAS watchpoint can only watch either 0x1000-0x1007 or 0x1008-0x100f. Debugserver splits this unaligned watch request into two BAS watchpoints and uses two hardware registers to do it. Most of this patch is generalizing that "align and find the correct place to split the watch request" code to work for any size watch request byte range, not just 8-byte watch requests. I spent a bit of time to check that it was behaving correctly for a lot of BAS+MASK scenarios, e.g. watch 32 bytes at 0x1038 means an 8-byte BAS watchpoint for 0x1038-0x103f and a 32-byte MASK watchpoint from 0x1040-0x105f, ideally ignoring writes to the last 8 bytes which the user didn't ask to watch.

This is also part of the processing I'll need to do for the WatchpointLocations I proposed a bit ago ( https://discourse.llvm.org/t/aarch64-watchpoints-reported-address-outside-watched-range-adopting-mask-style-watchpoints/67660/6 ). If a target/stub only supports 8-byte watchpoints, and a user asks to watch a 32-byte object, lldb should use 4 watchpoint registers (if it can) to watch that object. Or a single MASK watchpoint register. Or two hardware registers if we have MASK+BAS to watch an unaligned memory range.

This is only doing the simple "find correct alignment, split memory request to aligned ranges", it doesn't try to use n watchpoints to watch a large object, only 2 for unaligned requests. It's a starting point.

debugserver internally has a "HiLo" table to track these joined hardware registers corresponding to one user requested watchpoint. There's no feedback given to lldb that this is how the object was watched; the Z packet doesn't have any way of communicating it. I added a test for this specific debugserver feature where I watch 4 bytes unaligned, then detect that the watchpoints are triggered as loops in the binary run. It's not a fancy test, but we didn't have anything for this previously so it's a simple start.

I expect this patch to result in no observable change in behavior to debugserver. It is laying the groundwork for adding MASK watchpoints, which will be a little patch on top of it. lldb itself is in no position to deal with larger watchpoints (and the false positives we can get when we watch a region larger than the user's requested one). I also don't try to solve a user requesting multiple watchpoints that all overlap in a single hardware register. e.g. user asks to watch 0x1000-0x1001 and 0x1005-0x1007 inclusive. This can be represented by a single BAS watchpoint in AArch64 watching noncontiguous byte ranges within that doubleword, but debugserver isn't set up to combine a new watchpoint request with existing watchpoints that already cover part of that range. Another thing I might need to think about when we start using power-of-2 MASK watchpoints.

Functionally, it's taking EnableHardwareWatchpoint, separating out the "align and split" logic to its own method, then once we have valid watchpoint specification(s), we call out to the new EnableBASWatchpoint to set the correct bits in the control register. The next patch will be to add an EnableMASKWatchpoint for larger requests.

Diff Detail

Event Timeline

jasonmolenda created this revision.Apr 23 2023, 11:17 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 23 2023, 11:17 PM
jasonmolenda requested review of this revision.Apr 23 2023, 11:17 PM
JDevlieghere added inline comments.Apr 26 2023, 10:45 PM
lldb/tools/debugserver/source/MacOSX/arm64/DNBArchImplARM64.cpp
833–835

Do you ever expect this to return more than two watchpoints? Seems like this could be a struct that holds two optional WatchpointSpecs. I don't feel strongly about it, but it looks like you never iterate over the list and the way you have to check after the recursive call is a bit awkward.

edit: nvm, it looks like you do actually iterate over them in EnableHardwareWatchpoint

841

I would get rid of wps and return an empty list here (return {}). It's pretty obvious here, but on line 893 I was forced to go back and make sure nothing had been added to the list in the meantime.

844

Could be constexpr. Same for addr_byte_size and addr_bit_size

851

s/nearest/next/

856

Beautiful. Once we have C++20 we can use std::bit_ceil (https://en.cppreference.com/w/cpp/numeric/bit_ceil)

897

If you get rid of wps you can use return {{first_wp[0], second_wp[0]}} instead

923

Why is this > 1 and not >= 1 (i.e. no 0 which we checked on line 916). TBH I don't really understand what this function is doing.

lldb/tools/debugserver/source/MacOSX/arm64/DNBArchImplARM64.h
39

All of debugserver's structs and classes use CamelCase.

82–83

Nit: I know what user_ means in the context of this patch, but I'd go with either requested_addr and requested_size or even just addr and size. Another option would be user_specified_addr but that seems pretty verbose, although it would be consistent with the members of the struct.

tschuett added inline comments.
lldb/tools/debugserver/source/MacOSX/arm64/DNBArchImplARM64.cpp
856

Is the builtin available on all supported platforms and compilers? There are some alignment functions in MathExtras.h.

Thanks for all the helpful comments, I'll update the patch.

lldb/tools/debugserver/source/MacOSX/arm64/DNBArchImplARM64.cpp
833–835

I was thinking about how this current scheme only ever splits 1 watchpoint into 2, but a future design that could expand a user requested watch into a larger number of hardware watchpoints would expand it further. If a user asks to watch a 32 byte object, and the target only supports 8 byte watchpoints, and the object is doubleword aligned, we could watch it with 4 hardware watchpoints. The older code in debugserver was written in terms of "one or two" but I switched to a vector of hardware implementable watchpoints that may expand as we evaluate the hardware capabilities of the target/stub.

jasonmolenda marked 7 inline comments as done.Apr 28 2023, 4:38 PM
jasonmolenda added inline comments.
lldb/tools/debugserver/source/MacOSX/arm64/DNBArchImplARM64.cpp
833–835

There's some extent where this debugserver implementation is me sketching out the first part of the WatchpointLocation work I want to do in lldb. I will likely be copying this code up in to lldb, so it's written with an eye to where I'm heading there, agreed it's not necessary tho. tbh once that WatchpointLocation stuff exists in lldb, all of this can be pulled from debugserver.

856

Ah, that would be very nice and a lot clearer for readers. I might add that as a comment.

923

Clarified this. We have two cases: the user's watchpoint request needs either one hardware watchpoint register (one WatchpointSpec) or two hardware watchpoint registers (two WatchpointSpecs). There's the missing pass from debugserver which takes an aligned one or two WatchpointSpecs which could expand the WatchpointSpecs more if we only supported 8 byte watchpoints and wanted to use many hardware watchpoint registers to implement them. So I'm writing more general code than debugserver is actually going to do any time (I'll be adding MASK watchpoints next, for larger regions).

lldb/tools/debugserver/source/MacOSX/arm64/DNBArchImplARM64.h
82–83

Yeah, in the argument list it looks overly verbose, but then in the method where we're using a combination of the user's original intent and the actual aligned addresses & sizes, it gets a little confusing to call them addr & size. I'm fine with requested_.

jasonmolenda marked an inline comment as done.

Update patch to incorporate Jonas' feedback.

JDevlieghere accepted this revision.Apr 28 2023, 6:08 PM

Thanks. This LGTM.

lldb/tools/debugserver/source/MacOSX/arm64/DNBArchImplARM64.cpp
856

Yes: debugserver is only supported on macOS

This revision is now accepted and ready to land.Apr 28 2023, 6:08 PM