This is an archive of the discontinued LLVM Phabricator instance.


Differential D149040

Refactor and generalize debugserver code for setting hardware watchpoints on AArch64
ClosedPublic

Authored by jasonmolenda on Apr 23 2023, 11:17 PM.


Details

Reviewers

JDevlieghere

Commits

rG5679379cc7df: Refactor and generalize AArch64 watchpoint support in debugserver

Summary

I am adding MASK (power of 2, with that alignment) watchpoint support to debugserver AArch64, where it currently only does Byte Address Select (BAS, 1-8 bytes aligned to doubleword) watchpoints. This patch generalizes debugserver's watchpoint setting code so that it will be simple to drop in MASK watchpoints for larger memory regions in a later patch.

debugserver supports an unusual feature where a request to watch an unaligned range of bytes is handled. If you have a 16 byte buffer at 0x1000 and ask to watch 4 bytes at 0x1006, lldb needs to watch 0x1006-0x1009 inclusive. A single BAS watchpoint can only watch either 0x1000-0x1007 or 0x1008-0x100f. Debugserver splits this unaligned watch request into two BAS watchpoints and uses two hardware registers to do it. Most of this patch is generalizing that "align and find the correct place to split the watch request" code to work for any size watch request byte range, not just 8-byte watch requests. I spent a bit of time to check that it was behaving correctly for a lot of BAS+MASK scenarios, e.g. watch 32 bytes at 0x1038 means an 8-byte BAS watchpoint for 0x1038-0x103f and a 32-byte MASK watchpoint from 0x1040-0x105f, ideally ignoring writes to the last 8 bytes which the user didn't ask to watch.
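A minimal sketch of the align-and-split idea described above, written as standalone C++ rather than as debugserver code. The `Spec` struct, `NextPow2`, and `AlignAndSplit` are hypothetical names chosen for illustration, and a portable loop stands in for any compiler builtin. It shows how the two examples from the summary (4 bytes at 0x1006, 32 bytes at 0x1038) each resolve to two aligned regions.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the patch's WatchpointSpec.
struct Spec {
  uint64_t aligned_start;
  uint64_t requested_start;
  uint64_t aligned_size;
  uint64_t requested_size;
};

// Round n up to the next power of two (n >= 1).
inline uint64_t NextPow2(uint64_t n) {
  uint64_t p = 1;
  while (p < n)
    p <<= 1;
  return p;
}

// Returns one Spec if a single aligned power-of-2 region covers the request,
// two Specs if the request must be split, or an empty vector on failure.
inline std::vector<Spec> AlignAndSplit(uint64_t addr, uint64_t size) {
  if (size == 0)
    return {};
  const uint64_t aligned_size = NextPow2(std::max<uint64_t>(size, 8));
  const uint64_t aligned_start = addr & ~(aligned_size - 1);
  if (aligned_start + aligned_size >= addr + size)
    return {Spec{aligned_start, addr, aligned_size, size}};

  // Split at the end of the aligned region and re-align each half.
  const uint64_t split = aligned_start + aligned_size;
  std::vector<Spec> lo = AlignAndSplit(addr, split - addr);
  std::vector<Spec> hi = AlignAndSplit(split, size - (split - addr));
  if (lo.size() != 1 || hi.size() != 1)
    return {};
  return {lo[0], hi[0]};
}
```

Under these assumptions, `AlignAndSplit(0x1006, 4)` yields two 8-byte regions at 0x1000 and 0x1008, and `AlignAndSplit(0x1038, 32)` yields an 8-byte region at 0x1038 plus a 32-byte region at 0x1040 of which 24 bytes were requested.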

This is also part of the processing I'll need to do for the WatchpointLocations I proposed a bit ago ( https://discourse.llvm.org/t/aarch64-watchpoints-reported-address-outside-watched-range-adopting-mask-style-watchpoints/67660/6 ). If a target/stub only supports 8-byte watchpoints, and a user asks to watch a 32-byte object, lldb should use 4 watchpoint registers (if it can) to watch that object. Or a single MASK watchpoint register. Or two hardware registers if we have MASK+BAS to watch an unaligned memory range.
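One way the "use n registers for a large object" case could look, as a sketch only. `DwordWatch` and `CoverWithDwords` are hypothetical; the patch itself never uses more than two registers. The helper assumes a target limited to doubleword watchpoints and gives up if the request needs more registers than `max_regs`.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical description of one doubleword-sized hardware watchpoint.
struct DwordWatch {
  uint64_t dword_addr; // 8-byte aligned base address
  uint32_t offset;     // first watched byte within the doubleword
  uint32_t size;       // number of watched bytes, 1..8
};

// Cover [addr, addr + size) with 8-byte watchpoints. Returns an empty
// vector if size is zero or if more than max_regs registers are needed.
inline std::vector<DwordWatch> CoverWithDwords(uint64_t addr, uint64_t size,
                                               std::size_t max_regs) {
  std::vector<DwordWatch> out;
  const uint64_t end = addr + size;
  while (addr < end) {
    const uint64_t base = addr & ~uint64_t{7};
    const uint64_t chunk_end = std::min(base + 8, end);
    out.push_back({base, static_cast<uint32_t>(addr - base),
                   static_cast<uint32_t>(chunk_end - addr)});
    if (out.size() > max_regs)
      return {};
    addr = chunk_end;
  }
  return out;
}
```

With these assumptions, a doubleword-aligned 32-byte object needs four entries, and the same request fails when only three registers are available.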

This is only doing the simple "find correct alignment, split memory request to aligned ranges", it doesn't try to use n watchpoints to watch a large object, only 2 for unaligned requests. It's a starting point.

debugserver internally has a "LoHi" table to track these joined hardware registers corresponding to one user requested watchpoint. There's no feedback given to lldb that this is how the object was watched; the Z packet doesn't have any way of communicating it. I added a test for this specific debugserver feature where I watch 4 bytes unaligned, then detect that the watchpoints are triggered as loops in the binary run. It's not a fancy test, but we didn't have anything for this previously so it's a simple start.
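The pairing of hardware registers behind one user watchpoint can be modeled roughly as follows. This is an illustrative sketch, not debugserver's implementation: `WatchRegs`, `Claim`, `Release`, and `FreeCount` are invented names, and real code would also program the value and control registers. It shows an all-or-nothing claim and a lo-to-hi map that lets the single index reported to lldb stand for both registers.

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <vector>

// Hypothetical model of a bank of hardware watchpoint registers.
class WatchRegs {
public:
  explicit WatchRegs(uint32_t count) : m_in_use(count, false) {}

  // Claim every slot needed for one user watchpoint (1 or 2), or none.
  // Returns the index of the first ("lo") slot on success.
  std::optional<uint32_t> Claim(uint32_t slots_needed) {
    if (slots_needed == 0 || slots_needed > 2)
      return std::nullopt;
    std::vector<uint32_t> claimed;
    for (uint32_t i = 0;
         i < m_in_use.size() && claimed.size() < slots_needed; i++) {
      if (!m_in_use[i]) {
        m_in_use[i] = true;
        claimed.push_back(i);
      }
    }
    if (claimed.size() != slots_needed) {
      for (uint32_t i : claimed) // roll back a partial allocation
        m_in_use[i] = false;
      return std::nullopt;
    }
    if (claimed.size() == 2)
      m_lo_hi[claimed[0]] = claimed[1];
    return claimed[0];
  }

  // Release the slot the caller knows about and any slot paired with it.
  // Assumes lo is an index previously returned by Claim.
  void Release(uint32_t lo) {
    m_in_use[lo] = false;
    auto it = m_lo_hi.find(lo);
    if (it != m_lo_hi.end()) {
      m_in_use[it->second] = false;
      m_lo_hi.erase(it);
    }
  }

  uint32_t FreeCount() const {
    uint32_t n = 0;
    for (bool used : m_in_use)
      n += used ? 0 : 1;
    return n;
  }

private:
  std::vector<bool> m_in_use;
  std::map<uint32_t, uint32_t> m_lo_hi; // lo slot -> paired hi slot
};
```

The caller only ever sees the lo index, which matches the point above that the remote protocol has no way to report the second register.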

I expect this patch to result in no observable change in behavior to debugserver. It is laying the groundwork for adding MASK watchpoints, which will be a little patch on top of it. lldb itself is in no position to deal with larger watchpoints (and the false positives we can get when we watch a region larger than the user's requested one). I also don't try to solve a user requesting multiple watchpoints that all overlap in a single hardware register. e.g. user asks to watch 0x1000-0x1001 and 0x1005-0x1007 inclusive. This can be represented by a single BAS watchpoint in AArch64 watching noncontiguous byte ranges within that doubleword, but debugserver isn't set up to combine a new watchpoint request with existing watchpoints that already cover part of that range. Another thing I might need to think about when we start using power-of-2 MASK watchpoints.
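To illustrate why the two noncontiguous ranges in that example fit one BAS watchpoint, here is a small sketch. `BasBits` and `MergeBas` are hypothetical helpers; the bit arithmetic follows the `((1 << size) - 1) << offset` expression used in the patch, and merging is assumed to be a plain bitwise OR of the byte-select masks.

```cpp
#include <cstdint>

// BAS bits for `size` bytes starting `offset` bytes into a doubleword.
// Assumes size >= 1 and offset + size <= 8.
constexpr uint32_t BasBits(uint32_t offset, uint32_t size) {
  return ((1u << size) - 1u) << offset;
}

// Two requests that fall in the same doubleword combine with a bitwise OR.
constexpr uint32_t MergeBas(uint32_t a, uint32_t b) { return a | b; }

// 0x1000-0x1001 is bytes 0-1 of the doubleword at 0x1000;
// 0x1005-0x1007 is bytes 5-7 of the same doubleword.
static_assert(BasBits(0, 2) == 0b00000011, "bytes 0-1");
static_assert(BasBits(5, 3) == 0b11100000, "bytes 5-7");
static_assert(MergeBas(BasBits(0, 2), BasBits(5, 3)) == 0b11100011,
              "both ranges in one byte-select mask");
```

What the sketch leaves out is the bookkeeping the summary mentions: noticing that a new request overlaps a doubleword already claimed by an existing watchpoint.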

Functionally, it's taking EnableHardwareWatchpoint and separating out the "align and split" logic into its own method; once we have valid watchpoint specification(s), we call out to the new SetBASWatchpoint to set the correct bits in the control register. The next patch will add a SetMASKWatchpoint implementation for larger requests.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

jasonmolenda created this revision. Apr 23 2023, 11:17 PM

Herald added a project: Restricted Project. Apr 23 2023, 11:17 PM

Herald added a subscriber: kristof.beyls.

jasonmolenda requested review of this revision. Apr 23 2023, 11:17 PM

Herald added a subscriber: lldb-commits. Apr 23 2023, 11:17 PM

Harbormaster completed remote builds in B227632: Diff 516276. Apr 23 2023, 11:30 PM

JDevlieghere added inline comments. Apr 26 2023, 10:45 PM

lldb/tools/debugserver/source/MacOSX/arm64/DNBArchImplARM64.cpp
833–835	Do you ever expect this to return more than two watchpoints? Seems like this could be a `struct` that holds two optional `WatchpointSpec`s. I don't feel strongly about it, but it looks like you never iterate over the list and the way you have to check after the recursive call is a bit awkward. edit: nvm, it looks like you do actually iterate over them in `EnableHardwareWatchpoint`
841	I would get rid of `wps` and return an empty list here (`return {}`). It's pretty obvious here, but on line 893 I was forced to go back and make sure nothing had been added to the list in the meantime.
844	Could be `constexpr`. Same for `addr_byte_size` and `addr_bit_size`
851	`s/nearest/next/`
856	Beautiful. Once we have C++20 we can use `std::bit_ceil` (https://en.cppreference.com/w/cpp/numeric/bit_ceil)
897	If you get rid of `wps` you can use `return {{first_wp[0], second_wp[0]}}` instead
923	Why is this `> 1` and not `>= 1` (i.e. no 0 which we checked on line 916). TBH I don't really understand what this function is doing.
lldb/tools/debugserver/source/MacOSX/arm64/DNBArchImplARM64.h
39	All of debugserver's structs and classes use CamelCase.
82–83	Nit: I know what `user_` means in the context of this patch, but I'd go with either `requested_addr` and `requested_size` or even just `addr` and `size`. Another option would be `user_specified_addr` but that seems pretty verbose, although it would be consistent with the members of the struct.
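For reference, the round-up expression discussed at line 856 can be tried in isolation. This sketch assumes a Clang or GCC compiler (for `__builtin_clzll`) and uses a hypothetical wrapper name, `RoundUpPow2`; with C++20, `std::bit_ceil` from `<bit>` should give the same results for these inputs.

```cpp
#include <cstdint>

// Round n up to the next power of two with the Clang/GCC builtin.
// Valid for 2 <= n <= 2^63: n == 1 would evaluate __builtin_clzll(0),
// which is undefined, and larger values would shift by 64 bits.
inline uint64_t RoundUpPow2(uint64_t n) {
  return 1ULL << (64 - __builtin_clzll(n - 1));
}
```

In the patch the input is first clamped to at least 8, so the undefined `n == 1` case is not reachable there.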

tschuett added a subscriber: tschuett. Apr 28 2023, 7:22 AM

tschuett added inline comments.

lldb/tools/debugserver/source/MacOSX/arm64/DNBArchImplARM64.cpp
856	Is the builtin available on all supported platforms and compilers? There are some alignment functions in MathExtras.h.

Thanks for all the helpful comments, I'll update the patch.

lldb/tools/debugserver/source/MacOSX/arm64/DNBArchImplARM64.cpp
833–835	I was thinking about how this current scheme only ever splits 1 watchpoint into 2, but a future design that could expand a user requested watch into a larger number of hardware watchpoints would expand it further. If a user asks to watch a 32 byte object, and the target only supports 8 byte watchpoints, and the object is doubleword aligned, we could watch it with 4 hardware watchpoints. The older code in debugserver was written in terms of "one or two" but I switched to a vector of hardware implementable watchpoints that may expand as we evaluate the hardware capabilities of the target/stub.

jasonmolenda marked 7 inline comments as done. Apr 28 2023, 4:38 PM

jasonmolenda added inline comments.

lldb/tools/debugserver/source/MacOSX/arm64/DNBArchImplARM64.cpp
833–835	There's some extent where this debugserver implementation is me sketching out the first part of the WatchpointLocation work I want to do in lldb. I will likely be copying this code up in to lldb, so it's written with an eye to where I'm heading there, agreed it's not necessary tho. tbh once that WatchpointLocation stuff exists in lldb, all of this can be pulled from debugserver.
856	Ah, that would be very nice and a lot clearer for readers. I might add that as a comment.
923	Clarified this. We have two cases: the user's watchpoint request needs either one hardware watchpoint register (one WatchpointSpec) or two hardware watchpoint registers (two WatchpointSpecs). There's the missing pass from debugserver which takes an aligned one or two WatchpointSpecs which could expand the WatchpointSpecs more if we only supported 8 byte watchpoints and wanted to use many hardware watchpoint registers to implement them. So I'm writing more general code than debugserver is actually going to do any time (I'll be adding MASK watchpoints next, for larger regions).
lldb/tools/debugserver/source/MacOSX/arm64/DNBArchImplARM64.h
82–83	Yeah, in the argument list it looks overly verbose, but then in the method where we're using a combination of the user's original intent and the actual aligned addresses & sizes, it gets a little confusing to call them addr & size. I'm fine with `requested_`.

Update patch to incorporate Jonas' feedback.

Harbormaster completed remote builds in B228962: Diff 518101. Apr 28 2023, 5:39 PM

Thanks. This LGTM.

lldb/tools/debugserver/source/MacOSX/arm64/DNBArchImplARM64.cpp
856	Yes: `debugserver` is only supported on macOS

This revision is now accepted and ready to land. Apr 28 2023, 6:08 PM

Closed by commit rG5679379cc7df: Refactor and generalize AArch64 watchpoint support in debugserver (authored by jasonmolenda). Apr 28 2023, 6:24 PM

This revision was automatically updated to reflect the committed changes.

jasonmolenda added a commit: rG5679379cc7df: Refactor and generalize AArch64 watchpoint support in debugserver.

jasonmolenda mentioned this in D149792: Add AArch64 MASK watchpoint support to debugserver. May 3 2023, 3:11 PM

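Revision Contents

Path	Size
lldb/test/API/functionalities/watchpoint/unaligned-spanning-two-dwords/Makefile	3 lines
lldb/test/API/functionalities/watchpoint/unaligned-spanning-two-dwords/TestUnalignedSpanningDwords.py	61 lines
lldb/test/API/functionalities/watchpoint/unaligned-spanning-two-dwords/main.c	17 lines
lldb/tools/debugserver/source/MacOSX/arm64/DNBArchImplARM64.h	14 lines
lldb/tools/debugserver/source/MacOSX/arm64/DNBArchImplARM64.cpp	297 lines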

Diff 518112

lldb/test/API/functionalities/watchpoint/unaligned-spanning-two-dwords/Makefile

This file was added.

				C_SOURCES := main.c

				include Makefile.rules

lldb/test/API/functionalities/watchpoint/unaligned-spanning-two-dwords/TestUnalignedSpanningDwords.py

This file was added.

				"""
				Watch 4 bytes which span two doubleword aligned regions.
				On a target that supports 8 byte watchpoints, this will
				need to be implemented with a hardware watchpoint on both
				doublewords.
				"""

				import lldb
				from lldbsuite.test.decorators import *
				from lldbsuite.test.lldbtest import *
				from lldbsuite.test import lldbutil

				class UnalignedWatchpointTestCase(TestBase):

				    def hit_watchpoint_and_continue(self, process, iter_str):
				        process.Continue()
				        self.assertEqual(process.GetState(), lldb.eStateStopped,
				                         iter_str)
				        thread = process.GetSelectedThread()
				        self.assertEqual(thread.GetStopReason(), lldb.eStopReasonWatchpoint, iter_str)
				        self.assertEqual(thread.GetStopReasonDataCount(), 1, iter_str)
				        wp_num = thread.GetStopReasonDataAtIndex(0)
				        self.assertEqual(wp_num, 1, iter_str)

				    NO_DEBUG_INFO_TESTCASE = True
				    # debugserver on AArch64 has this feature.
				    @skipIf(archs=no_match(['x86_64', 'arm64', 'arm64e', 'aarch64']))
				    @skipUnlessDarwin
				    # debugserver only started returning an exception address within
				    # a range lldb expects in https://reviews.llvm.org/D147820 2023-04-12.
				    # older debugservers will return the base address of the doubleword
				    # which lldb doesn't understand, and will stop executing without a
				    # proper stop reason.
				    @skipIfOutOfTreeDebugserver

				    def test_unaligned_watchpoint(self):
				        """Test a watchpoint that is handled by two hardware watchpoint registers."""
				        self.build()
				        self.main_source_file = lldb.SBFileSpec("main.c")
				        (target, process, thread, bkpt) = lldbutil.run_to_source_breakpoint(self,
				                "break here", self.main_source_file)

				        frame = thread.GetFrameAtIndex(0)

				        a_bytebuf_6 = frame.GetValueForVariablePath("a.bytebuf[6]")
				        a_bytebuf_6_addr = a_bytebuf_6.GetAddress().GetLoadAddress(target)
				        err = lldb.SBError()
				        wp = target.WatchAddress(a_bytebuf_6_addr, 4, False, True, err)
				        self.assertTrue(err.Success())
				        self.assertTrue(wp.IsEnabled())
				        self.assertEqual(wp.GetWatchSize(), 4)
				        self.assertGreater(wp.GetWatchAddress() % 8, 4, "watched region spans two doublewords")

				        # We will hit our watchpoint 6 times during the execution
				        # of the inferior. If the remote stub does not actually split
				        # the watched region into two doubleword watchpoints, we will
				        # exit before we get to 6 watchpoint hits.
				        for i in range(1, 7):
				            self.hit_watchpoint_and_continue(process, "wp hit number %s" % i)

lldb/test/API/functionalities/watchpoint/unaligned-spanning-two-dwords/main.c

This file was added.

				#include <stdint.h>
				#include <stdio.h>
				int main() {
				  union {
				    uint8_t bytebuf[16];
				    uint16_t shortbuf[8];
				    uint64_t dwordbuf[2];
				  } a;
				  a.dwordbuf[0] = a.dwordbuf[1] = 0;
				  a.bytebuf[0] = 0; // break here
				  for (int i = 0; i < 8; i++) {
				    a.shortbuf[i] += i;
				  }
				  for (int i = 0; i < 8; i++) {
				    a.shortbuf[i] += i;
				  }
				}

lldb/tools/debugserver/source/MacOSX/arm64/DNBArchImplARM64.h

 //===-- DNBArchImplARM64.h --------------------------------------*- C++ -*-===//
 //
 // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
 // See https://llvm.org/LICENSE.txt for license information.
 // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
 //
 //===----------------------------------------------------------------------===//
 #ifndef LLDB_TOOLS_DEBUGSERVER_SOURCE_MACOSX_ARM64_DNBARCHIMPLARM64_H
 #define LLDB_TOOLS_DEBUGSERVER_SOURCE_MACOSX_ARM64_DNBARCHIMPLARM64_H
 #if defined(__arm__) || defined(__arm64__) || defined(__aarch64__)
 #include <mach/thread_status.h>
 #include <map>
+#include <vector>
 #if defined(ARM_THREAD_STATE64_COUNT)
 #include "DNBArch.h"
 class MachThread;
 class DNBArchMachARM64 : public DNBArchProtocol {
 public:
   enum { kMaxNumThumbITBreakpoints = 4 };
   DNBArchMachARM64(MachThread *thread)
       : m_thread(thread), m_state(), m_disabled_watchpoints(),
         m_disabled_breakpoints(), m_watchpoint_hw_index(-1),
         m_watchpoint_did_occur(false),
         m_watchpoint_resume_single_step_enabled(false),
         m_saved_register_states() {
     m_disabled_watchpoints.resize(16);
     m_disabled_breakpoints.resize(16);
     memset(&m_dbg_save, 0, sizeof(m_dbg_save));
   }
+  struct WatchpointSpec {
+    nub_addr_t aligned_start;
+    nub_addr_t requested_start;
+    nub_size_t aligned_size;
+    nub_size_t requested_size;
+  };
   virtual ~DNBArchMachARM64() {}
   static void Initialize();
   static const DNBRegisterSetInfo *GetRegisterSetInfo(nub_size_t *num_reg_sets);
   bool GetRegisterValue(uint32_t set, uint32_t reg,
                         DNBRegisterValue *value) override;
   bool SetRegisterValue(uint32_t set, uint32_t reg,

[20 lines not shown]

   uint32_t NumSupportedHardwareBreakpoints() override;
   uint32_t NumSupportedHardwareWatchpoints() override;
   uint32_t EnableHardwareBreakpoint(nub_addr_t addr, nub_size_t size,
                                     bool also_set_on_task) override;
   bool DisableHardwareBreakpoint(uint32_t hw_break_index,
                                  bool also_set_on_task) override;
+  std::vector<WatchpointSpec>
+  AlignRequestedWatchpoint(nub_addr_t requested_addr,
+                           nub_size_t requested_size);
   uint32_t EnableHardwareWatchpoint(nub_addr_t addr, nub_size_t size, bool read,
                                     bool write, bool also_set_on_task) override;
+  uint32_t SetBASWatchpoint(WatchpointSpec wp, bool read, bool write,
+                            bool also_set_on_task);
+  uint32_t SetMASKWatchpoint(WatchpointSpec wp);
   bool DisableHardwareWatchpoint(uint32_t hw_break_index,
                                  bool also_set_on_task) override;
   bool DisableHardwareWatchpoint_helper(uint32_t hw_break_index,
                                         bool also_set_on_task);
 protected:
   kern_return_t EnableHardwareSingleStep(bool enable);
   static bool FixGenericRegisterNumber(uint32_t &set, uint32_t &reg);

[173 lines not shown]

lldb/tools/debugserver/source/MacOSX/arm64/DNBArchImplARM64.cpp

Show First 20 Lines • Show All 824 Lines • ▼ Show 20 Lines	if (i < num_hw_breakpoints) {
"EnableHardwareBreakpoint(): All "		"EnableHardwareBreakpoint(): All "
"hardware resources (%u) are in use.",		"hardware resources (%u) are in use.",
num_hw_breakpoints);		num_hw_breakpoints);
}		}
}		}
return INVALID_NUB_HW_INDEX;		return INVALID_NUB_HW_INDEX;
}		}

		std::vector<DNBArchMachARM64::WatchpointSpec>
		DNBArchMachARM64::AlignRequestedWatchpoint(nub_addr_t requested_addr,
		nub_size_t requested_size) {
		JDevlieghereUnsubmitted Done Reply Inline Actions Do you ever expect this to return more than two watchpoints? Seems like this could be a `struct` that holds two optional `WatchpointSpec`s. I don't feel strongly about it, but it looks like you never iterate over the list and the way you have to check after the recursive call is a bit awkward. edit: nvm, it looks like you do actually iterate over them in `EnableHardwareWatchpoint` JDevlieghere: Do you ever expect this to return more than two watchpoints? Seems like this could be a…
		jasonmolendaAuthorUnsubmitted Done Reply Inline Actions I was thinking about how this current scheme only ever splits 1 watchpoint into 2, but a future design that could expand a user requested watch into a larger number of hardware watchpoints would expand it further. If a user asks to watch a 32 byte object, and the target only supports 8 byte watchpoints, and the object is doubleword aligned, we could watch it with 4 hardware watchpoints. The older code in debugserver was written in terms of "one or two" but I switched to a vector of hardware implementable watchpoints that may expand as we evaluate the hardware capabilities of the target/stub. jasonmolenda: I was thinking about how this current scheme only ever splits 1 watchpoint into 2, but a future…
		jasonmolendaAuthorUnsubmitted Done Reply Inline Actions There's some extent where this debugserver implementation is me sketching out the first part of the WatchpointLocation work I want to do in lldb. I will likely be copying this code up in to lldb, so it's written with an eye to where I'm heading there, agreed it's not necessary tho. tbh once that WatchpointLocation stuff exists in lldb, all of this can be pulled from debugserver. jasonmolenda: There's some extent where this debugserver implementation is me sketching out the first part of…

		// Can't watch zero bytes
		if (requested_size == 0)
		return {};

		// Smallest size we can watch on AArch64 is 8 bytes
		JDevlieghereUnsubmitted Done Reply Inline Actions I would get rid of `wps` and return an empty list here (`return {}`). It's pretty obvious here, but on line 893 I was forced to go back and make sure nothing had been added to the list in the meantime. JDevlieghere: I would get rid of `wps` and return an empty list here (`return {}`). It's pretty obvious here…
		constexpr nub_size_t min_watchpoint_alignment = 8;
		nub_size_t aligned_size = std::max(requested_size, min_watchpoint_alignment);

		JDevlieghereUnsubmitted Done Reply Inline Actions Could be `constexpr`. Same for `addr_byte_size` and `addr_bit_size` JDevlieghere: Could be `constexpr`. Same for `addr_byte_size` and `addr_bit_size`
		// AArch64 addresses are 8 bytes.
		constexpr int addr_byte_size = 8;
		constexpr int addr_bit_size = addr_byte_size * 8;

		/// Round up \a requested_size to the next power-of-2 size, at least 8
		/// bytes
		/// requested_size == 3 -> aligned_size == 8
		JDevlieghereUnsubmitted Done Reply Inline Actions `s/nearest/next/` JDevlieghere: `s/nearest/next/`
		/// requested_size == 13 -> aligned_size == 16
		/// requested_size == 16 -> aligned_size == 16
		/// Could be `std::bit_ceil(aligned_size)` when we build with C++20?
		aligned_size = 1ULL << (addr_bit_size - __builtin_clzll(aligned_size - 1));

		JDevlieghereUnsubmitted Done Reply Inline Actions Beautiful. Once we have C++20 we can use `std::bit_ceil` (https://en.cppreference.com/w/cpp/numeric/bit_ceil) JDevlieghere: Beautiful. Once we have C++20 we can use `std::bit_ceil` (https://en.cppreference.
		tschuettUnsubmitted Not Done Reply Inline Actions Is the builtin available on all supported platforms and compilers? There are some alignment functions in MathExtras.h. tschuett: Is the builtin available on all supported platforms and compilers? There are some alignment…
		JDevlieghereUnsubmitted Not Done Reply Inline Actions Yes: `debugserver` is only supported on macOS JDevlieghere: Yes: `debugserver` is only supported on macOS
		jasonmolendaAuthorUnsubmitted Done Reply Inline Actions Ah, that would be very nice and a lot clearer for readers. I might add that as a comment. jasonmolenda: Ah, that would be very nice and a lot clearer for readers. I might add that as a comment.
		nub_addr_t aligned_start = requested_addr & ~(aligned_size - 1);
		// Does this power-of-2 memory range, aligned to power-of-2, completely
		// encompass the requested watch region.
		if (aligned_start + aligned_size >= requested_addr + requested_size) {
		WatchpointSpec wp;
		wp.aligned_start = aligned_start;
		wp.requested_start = requested_addr;
		wp.aligned_size = aligned_size;
		wp.requested_size = requested_size;
		return {{wp}};
		}

		// We need to split this into two watchpoints, split on the aligned_size
		// boundary and re-evaluate the alignment of each half.
		//
		// requested_addr 48 requested_size 20 -> aligned_size 32
		// aligned_start 32
		// split_addr 64
		// first_requested_addr 48
		// first_requested_size 16
		// second_requested_addr 64
		// second_requested_size 4
		nub_addr_t split_addr = aligned_start + aligned_size;

		nub_addr_t first_requested_addr = requested_addr;
		nub_size_t first_requested_size = split_addr - requested_addr;
		nub_addr_t second_requested_addr = split_addr;
		nub_size_t second_requested_size = requested_size - first_requested_size;

		std::vector<WatchpointSpec> first_wp =
		AlignRequestedWatchpoint(first_requested_addr, first_requested_size);
		std::vector<WatchpointSpec> second_wp =
		AlignRequestedWatchpoint(second_requested_addr, second_requested_size);
		if (first_wp.size() != 1 \|\| second_wp.size() != 1)
		return {};

		return {{first_wp[0], second_wp[0]}};
		}

uint32_t DNBArchMachARM64::EnableHardwareWatchpoint(nub_addr_t addr,		uint32_t DNBArchMachARM64::EnableHardwareWatchpoint(nub_addr_t addr,
nub_size_t size, bool read,		nub_size_t size, bool read,
		JDevlieghereUnsubmitted Done Reply Inline Actions If you get rid of `wps` you can use `return {{first_wp[0], second_wp[0]}}` instead JDevlieghere: If you get rid of `wps` you can use `return {{first_wp[0], second_wp[0]}}` instead
bool write,		bool write,
bool also_set_on_task) {		bool also_set_on_task) {
DNBLogThreadedIf(LOG_WATCHPOINTS,		DNBLogThreadedIf(LOG_WATCHPOINTS,
"DNBArchMachARM64::EnableHardwareWatchpoint(addr = "		"DNBArchMachARM64::EnableHardwareWatchpoint(addr = "
"0x%8.8llx, size = %zu, read = %u, write = %u)",		"0x%8.8llx, size = %zu, read = %u, write = %u)",
(uint64_t)addr, size, read, write);		(uint64_t)addr, size, read, write);

const uint32_t num_hw_watchpoints = NumSupportedHardwareWatchpoints();		std::vector<DNBArchMachARM64::WatchpointSpec> wps =
		AlignRequestedWatchpoint(addr, size);
		DNBLogThreadedIf(LOG_WATCHPOINTS,
		"DNBArchMachARM64::EnableHardwareWatchpoint() using %zu "
		"hardware watchpoints",
		wps.size());

// Can't watch zero bytes		if (wps.size() == 0)
if (size == 0)
return INVALID_NUB_HW_INDEX;		return INVALID_NUB_HW_INDEX;

// We must watch for either read or write		// We must watch for either read or write
if (read == false && write == false)		if (read == false && write == false)
return INVALID_NUB_HW_INDEX;		return INVALID_NUB_HW_INDEX;

// Otherwise, can't watch more than 8 bytes per WVR/WCR pair		// Only one hardware watchpoint needed
if (size > 8)		// to implement the user's request.
		if (wps.size() == 1) {
		if (wps[0].aligned_size <= 8)
		return SetBASWatchpoint(wps[0], read, write, also_set_on_task);
		JDevlieghereUnsubmitted Not Done Reply Inline Actions Why is this `> 1` and not `>= 1` (i.e. no 0 which we checked on line 916). TBH I don't really understand what this function is doing. JDevlieghere: Why is this `> 1` and not `>= 1` (i.e. no 0 which we checked on line 916). TBH I don't really…
		jasonmolendaAuthorUnsubmitted Done Reply Inline Actions Clarified this. We have two cases: the user's watchpoint request needs either one hardware watchpoint register (one WatchpointSpec) or two hardware watchpoint registers (two WatchpointSpecs). There's the missing pass from debugserver which takes an aligned one or two WatchpointSpecs which could expand the WatchpointSpecs more if we only supported 8 byte watchpoints and wanted to use many hardware watchpoint registers to implement them. So I'm writing more general code than debugserver is actually going to do any time (I'll be adding MASK watchpoints next, for larger regions). jasonmolenda: Clarified this. We have two cases: the user's watchpoint request needs either one hardware…
		else
return INVALID_NUB_HW_INDEX;		return INVALID_NUB_HW_INDEX;
		}

// Aarch64 watchpoints are in one of two forms: (1) 1-8 bytes, aligned to		// We have multiple WatchpointSpecs
// an 8 byte address, or (2) a power-of-two size region of memory; minimum
// 8 bytes, maximum 2GB; the starting address must be aligned to that power
// of two.
//
// For (1), 1-8 byte watchpoints, using the Byte Address Selector field in
// DBGWCR<n>.BAS. Any of the bytes may be watched, but if multiple bytes
// are watched, the bytes selected must be contiguous. The start address
// watched must be doubleword (8-byte) aligned; if the start address is
// word (4-byte) aligned, only 4 bytes can be watched.
//
// For (2), the MASK field in DBGWCR<n>.MASK is used.
//
// See the ARM ARM, section "Watchpoint exceptions", and more specifically,
// "Watchpoint data address comparisons".
//
// debugserver today only supports (1) - the Byte Address Selector 1-8 byte
// watchpoints that are 8-byte aligned. To support larger watchpoints,
// debugserver would need to interpret the mach exception when the watched
// region was hit, see if the address accessed lies within the subset
// of the power-of-two region that lldb asked us to watch (v. ARM ARM,
// "Determining the memory location that caused a Watchpoint exception"),
// and silently resume the inferior (disable watchpoint, stepi, re-enable
// watchpoint) if the address lies outside the region that lldb asked us
// to watch.
//
// Alternatively, lldb would need to be prepared for a larger region
// being watched than it requested, and silently resume the inferior if
// the accessed address is outside the region lldb wants to watch.

nub_addr_t aligned_wp_address = addr & ~0x7;
uint32_t addr_dword_offset = addr & 0x7;

// Do we need to split up this logical watchpoint into two hardware watchpoint
// registers?
// e.g. a watchpoint of length 4 on address 6. We need do this with
// one watchpoint on address 0 with bytes 6 & 7 being monitored
// one watchpoint on address 8 with bytes 0, 1, 2, 3 being monitored

if (addr_dword_offset + size > 8) {
DNBLogThreadedIf(LOG_WATCHPOINTS, "DNBArchMachARM64::"
"EnableHardwareWatchpoint(addr = "
"0x%8.8llx, size = %zu) needs two "
"hardware watchpoints slots to monitor",
(uint64_t)addr, size);
int low_watchpoint_size = 8 - addr_dword_offset;
int high_watchpoint_size = addr_dword_offset + size - 8;

uint32_t lo = EnableHardwareWatchpoint(addr, low_watchpoint_size, read,		std::vector<uint32_t> wp_slots_used;
write, also_set_on_task);		for (size_t i = 0; i < wps.size(); i++) {
if (lo == INVALID_NUB_HW_INDEX)		uint32_t idx =
return INVALID_NUB_HW_INDEX;		EnableHardwareWatchpoint(wps[i].requested_start, wps[i].requested_size,
uint32_t hi =
EnableHardwareWatchpoint(aligned_wp_address + 8, high_watchpoint_size,
read, write, also_set_on_task);		read, write, also_set_on_task);
if (hi == INVALID_NUB_HW_INDEX) {		if (idx != INVALID_NUB_HW_INDEX)
DisableHardwareWatchpoint(lo, also_set_on_task);		wp_slots_used.push_back(idx);
		}

		// Did we fail to set all of the WatchpointSpecs needed
		// for this user's request?
		if (wps.size() != wp_slots_used.size()) {
		for (int wp_slot : wp_slots_used)
		DisableHardwareWatchpoint(wp_slot, also_set_on_task);
return INVALID_NUB_HW_INDEX;		return INVALID_NUB_HW_INDEX;
}		}
// Tag this lo->hi mapping in our database.
LoHi[lo] = hi;		LoHi[wp_slots_used[0]] = wp_slots_used[1];
return lo;		return wp_slots_used[0];
}		}
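
The split that the removed code performs inline can be sketched as a standalone function. This is a minimal illustration only: `BasPiece` and `SplitAtDoubleword` are hypothetical names that do not appear in the patch, and the sketch assumes a request of at most 8 bytes, as the old code did.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical type for illustration; not the patch's WatchpointSpec.
struct BasPiece {
  uint64_t aligned_start; // doubleword-aligned base address
  uint32_t offset;        // first watched byte within the doubleword
  uint32_t size;          // number of watched bytes in this doubleword
};

// Split a watch request of at most 8 bytes into one or two pieces,
// each confined to a single 8-byte-aligned doubleword.
std::vector<BasPiece> SplitAtDoubleword(uint64_t addr, uint32_t size) {
  const uint64_t aligned = addr & ~uint64_t{7};
  const uint32_t offset = static_cast<uint32_t>(addr & 7);

  std::vector<BasPiece> pieces;
  if (offset + size <= 8) {
    // The whole request fits inside one doubleword.
    pieces.push_back({aligned, offset, size});
    return pieces;
  }
  // The request crosses a doubleword boundary: the low piece covers the
  // tail of the first doubleword, the high piece the head of the next.
  const uint32_t low_size = 8 - offset;
  const uint32_t high_size = offset + size - 8;
  pieces.push_back({aligned, offset, low_size});
  pieces.push_back({aligned + 8, 0, high_size});
  return pieces;
}
```

For the example in the summary, watching 4 bytes at 0x1006 yields a 2-byte piece at offset 6 of 0x1000 and a 2-byte piece at offset 0 of 0x1008.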

// At this point		uint32_t DNBArchMachARM64::SetBASWatchpoint(DNBArchMachARM64::WatchpointSpec wp,
// 1 aligned_wp_address is the requested address rounded down to 8-byte		bool read, bool write,
// alignment		bool also_set_on_task) {
// 2 addr_dword_offset is the offset into that double word (8-byte) region		const uint32_t num_hw_watchpoints = NumSupportedHardwareWatchpoints();
// that we are watching
// 3 size is the number of bytes within that 8-byte region that we are		nub_addr_t aligned_dword_addr = wp.aligned_start;
// watching		nub_addr_t watching_offset = wp.requested_start - wp.aligned_start;
		nub_size_t watching_size = wp.requested_size;

		// If user asks to watch 3 bytes at 0x1005,
		// aligned_dword_addr 0x1000
		// watching_offset 5
		// watching_size 3

// Set the Byte Address Selects bits DBGWCRn_EL1 bits [12:5] based on the		// Set the Byte Address Selects bits DBGWCRn_EL1 bits [12:5] based on the
// above.		// above.
// The bit shift and subtraction operation will give us 0b11 for 2, 0b1111 for 4,		// The bit shift and subtraction operation will give us 0b11 for 2, 0b1111 for 4,
// etc, up to 0b11111111 for 8.		// etc, up to 0b11111111 for 8.
// then we shift those bits left by the offset into this dword that we are		// then we shift those bits left by the offset into this dword that we are
// interested in.		// interested in.
// e.g. if we are watching bytes 4,5,6,7 in a dword we want a BAS of		// e.g. if we are watching bytes 4,5,6,7 in a dword we want a BAS of
// 0b11110000.		// 0b11110000.
uint32_t byte_address_select = ((1 << size) - 1) << addr_dword_offset;		uint32_t byte_address_select = ((1 << watching_size) - 1) << watching_offset;

// Read the debug state		// Read the debug state
kern_return_t kret = GetDBGState(false);		kern_return_t kret = GetDBGState(false);
		if (kret != KERN_SUCCESS)
		return INVALID_NUB_HW_INDEX;

if (kret == KERN_SUCCESS) {
// Check to make sure we have the needed hardware support		// Check to make sure we have the needed hardware support
uint32_t i = 0;		uint32_t i = 0;

for (i = 0; i < num_hw_watchpoints; ++i) {		for (i = 0; i < num_hw_watchpoints; ++i) {
if ((m_state.dbg.__wcr[i] & WCR_ENABLE) == 0)		if ((m_state.dbg.__wcr[i] & WCR_ENABLE) == 0)
break; // We found an available hw watchpoint slot (in i)		break; // We found an available hw watchpoint slot
		}
		if (i == num_hw_watchpoints) {
		DNBLogThreadedIf(LOG_WATCHPOINTS,
		"DNBArchMachARM64::"
		"SetBASWatchpoint(): All "
		"hardware resources (%u) are in use.",
		num_hw_watchpoints);
		return INVALID_NUB_HW_INDEX;
}		}

// See if we found an available hw watchpoint slot above		DNBLogThreadedIf(LOG_WATCHPOINTS,
if (i < num_hw_watchpoints) {		"DNBArchMachARM64::"
// DumpDBGState(m_state.dbg);		"SetBASWatchpoint() "
		"set hardware register %d to BAS watchpoint "
		"aligned start address 0x%llx, watch region start "
		"offset %lld, number of bytes %zu",
		i, aligned_dword_addr, watching_offset, watching_size);

// Clear any previous LoHi joined-watchpoint that may have been in use		// Clear any previous LoHi joined-watchpoint that may have been in use
LoHi[i] = 0;		LoHi[i] = 0;

// shift our Byte Address Select bits up to the correct bit range for the		// shift our Byte Address Select bits up to the correct bit range for the
// DBGWCRn_EL1		// DBGWCRn_EL1
byte_address_select = byte_address_select << 5;		byte_address_select = byte_address_select << 5;

// Make sure bits 1:0 are clear in our address		// Make sure bits 1:0 are clear in our address
m_state.dbg.__wvr[i] = aligned_wp_address; // DVA (Data Virtual Address)		m_state.dbg.__wvr[i] = aligned_dword_addr; // DVA (Data Virtual Address)
m_state.dbg.__wcr[i] = byte_address_select \| // Which bytes that follow		m_state.dbg.__wcr[i] = byte_address_select \| // Which bytes that follow
// the DVA that we will watch		// the DVA that we will watch
S_USER \| // Stop only in user mode		S_USER \| // Stop only in user mode
(read ? WCR_LOAD : 0) \| // Stop on read access?		(read ? WCR_LOAD : 0) \| // Stop on read access?
(write ? WCR_STORE : 0) \| // Stop on write access?		(write ? WCR_STORE : 0) \| // Stop on write access?
WCR_ENABLE; // Enable this watchpoint;		WCR_ENABLE; // Enable this watchpoint;

DNBLogThreadedIf(		DNBLogThreadedIf(LOG_WATCHPOINTS,
LOG_WATCHPOINTS, "DNBArchMachARM64::EnableHardwareWatchpoint() "		"DNBArchMachARM64::SetBASWatchpoint() "
"adding watchpoint on address 0x%llx with control "		"adding watchpoint on address 0x%llx with control "
"register value 0x%x",		"register value 0x%x",
(uint64_t)m_state.dbg.__wvr[i], (uint32_t)m_state.dbg.__wcr[i]);		(uint64_t)m_state.dbg.__wvr[i],
		(uint32_t)m_state.dbg.__wcr[i]);

// The kernel will set the MDE_ENABLE bit in the MDSCR_EL1 for us		// The kernel will set the MDE_ENABLE bit in the MDSCR_EL1 for us
// automatically, don't need to do it here.		// automatically, don't need to do it here.

kret = SetDBGState(also_set_on_task);		kret = SetDBGState(also_set_on_task);
// DumpDBGState(m_state.dbg);		// DumpDBGState(m_state.dbg);

DNBLogThreadedIf(LOG_WATCHPOINTS, "DNBArchMachARM64::"		DNBLogThreadedIf(LOG_WATCHPOINTS,
"EnableHardwareWatchpoint() "		"DNBArchMachARM64::"
		"SetBASWatchpoint() "
"SetDBGState() => 0x%8.8x.",		"SetDBGState() => 0x%8.8x.",
kret);		kret);

if (kret == KERN_SUCCESS)		if (kret == KERN_SUCCESS)
return i;		return i;
} else {
DNBLogThreadedIf(LOG_WATCHPOINTS, "DNBArchMachARM64::"
"EnableHardwareWatchpoint(): All "
"hardware resources (%u) are in use.",
num_hw_watchpoints);
}
}
return INVALID_NUB_HW_INDEX;		return INVALID_NUB_HW_INDEX;
}		}
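
The Byte Address Select arithmetic in `SetBASWatchpoint` can be checked in isolation with a small sketch. The constant names below are hypothetical stand-ins for debugserver's `WCR_ENABLE`, `S_USER`, `WCR_LOAD` and `WCR_STORE`; their values are assumed from the Arm DBGWCRn_EL1 field layout (E in bit 0, PAC in bits [2:1], LSC in bits [4:3], BAS in bits [12:5]) rather than copied from the patch.

```cpp
#include <cstdint>

// Assumed bit positions, following the DBGWCRn_EL1 layout.
constexpr uint32_t kWcrEnable = 1u << 0;   // E: watchpoint enabled
constexpr uint32_t kWcrUserOnly = 2u << 1; // PAC = 0b10: match in user mode
constexpr uint32_t kWcrLoad = 1u << 3;     // LSC bit for loads
constexpr uint32_t kWcrStore = 1u << 4;    // LSC bit for stores

// One bit per watched byte within the doubleword, e.g. offset 4 and
// size 4 gives 0b11110000. Assumes offset + size <= 8.
uint32_t BasMask(uint32_t offset, uint32_t size) {
  return (((1u << size) - 1u) << offset) & 0xffu;
}

// Compose a control register value the way the diff does: BAS shifted
// into bits [12:5], OR-ed with the mode, access-type and enable bits.
uint32_t ComposeWcr(uint32_t offset, uint32_t size, bool read, bool write) {
  return (BasMask(offset, size) << 5) | kWcrUserOnly |
         (read ? kWcrLoad : 0u) | (write ? kWcrStore : 0u) | kWcrEnable;
}
```

With the comment's own example of 3 bytes at 0x1005, `BasMask(5, 3)` selects bytes 5, 6 and 7 of the doubleword at 0x1000.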

bool DNBArchMachARM64::ReenableHardwareWatchpoint(uint32_t hw_index) {		bool DNBArchMachARM64::ReenableHardwareWatchpoint(uint32_t hw_index) {
// If this logical watchpoint # is actually implemented using		// If this logical watchpoint # is actually implemented using
// two hardware watchpoint registers, re-enable both of them.		// two hardware watchpoint registers, re-enable both of them.

if (hw_index < NumSupportedHardwareWatchpoints() && LoHi[hw_index]) {		if (hw_index < NumSupportedHardwareWatchpoints() && LoHi[hw_index]) {
Show All 11 Lines	bool DNBArchMachARM64::ReenableHardwareWatchpoint_helper(uint32_t hw_index) {

const uint32_t num_hw_points = NumSupportedHardwareWatchpoints();		const uint32_t num_hw_points = NumSupportedHardwareWatchpoints();
if (hw_index >= num_hw_points)		if (hw_index >= num_hw_points)
return false;		return false;

m_state.dbg.__wvr[hw_index] = m_disabled_watchpoints[hw_index].addr;		m_state.dbg.__wvr[hw_index] = m_disabled_watchpoints[hw_index].addr;
m_state.dbg.__wcr[hw_index] = m_disabled_watchpoints[hw_index].control;		m_state.dbg.__wcr[hw_index] = m_disabled_watchpoints[hw_index].control;

DNBLogThreadedIf(LOG_WATCHPOINTS, "DNBArchMachARM64::"		DNBLogThreadedIf(LOG_WATCHPOINTS,
"EnableHardwareWatchpoint( %u ) - WVR%u = "		"DNBArchMachARM64::"
		"SetBASWatchpoint( %u ) - WVR%u = "
"0x%8.8llx WCR%u = 0x%8.8llx",		"0x%8.8llx WCR%u = 0x%8.8llx",
hw_index, hw_index, (uint64_t)m_state.dbg.__wvr[hw_index],		hw_index, hw_index, (uint64_t)m_state.dbg.__wvr[hw_index],
hw_index, (uint64_t)m_state.dbg.__wcr[hw_index]);		hw_index, (uint64_t)m_state.dbg.__wcr[hw_index]);

// The kernel will set the MDE_ENABLE bit in the MDSCR_EL1 for us		// The kernel will set the MDE_ENABLE bit in the MDSCR_EL1 for us
// automatically, don't need to do it here.		// automatically, don't need to do it here.

kret = SetDBGState(false);		kret = SetDBGState(false);


Diff 518112
