This is an archive of the discontinued LLVM Phabricator instance.

LLDB Arm Watchpoints: Use single hardware watchpoint slot to watch multiple bytes where possible
Abandoned · Public

Authored by omjavaid on Sep 15 2016, 6:41 AM.

Details

Reviewers
clayborg
labath
Summary

This patch enables LLDB to use a single hardware watchpoint slot to watch multiple bytes or half words.

For example:
instead of using 4 different hardware watchpoint slots to watch one byte each at 0x0040, 0x0041, 0x0042 and 0x0043, we are now able to use 1 hardware watchpoint slot to watch all 4 of these single-byte watchpoints. Similarly, two consecutive half-words can share a single slot.

Arm has 4 hardware watchpoint slots, and this patch optimizes the use of those slots wherever possible.
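
ARM implements this through the byte address select (BAS) field of the hardware watchpoint control register, which picks out individual bytes within a word-aligned watch address. Below is a minimal sketch of that idea, not the patch's actual code; the struct, the function name, and the assumed BAS position at bits [8:5] of the control value are illustrative:

#include <cstdint>

struct WatchSlot {
  uint32_t address = 0; // word-aligned watch address (DBGWVR)
  uint32_t control = 0; // control value (DBGWCR); bit 0 = enable
};

// Try to fold a 1- or 2-byte watch request into `slot`; returns false
// if the request cannot share it.
bool AddByteWatch(WatchSlot &slot, uint32_t addr, uint32_t size) {
  const uint32_t aligned = addr & ~3u;  // word containing the bytes
  if ((slot.control & 1) && slot.address != aligned)
    return false;                       // slot watches a different word
  if (addr + size > aligned + 4)
    return false;                       // request spans two words
  const uint32_t bas = ((1u << size) - 1) << (addr & 3);
  slot.address = aligned;
  slot.control |= (bas << 5) | 1;       // merge BAS bits, keep enabled
  return true;
}

Under this scheme, the four single-byte requests at 0x0040-0x0043 merge into one slot with all four BAS bits set.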

I have made the necessary changes to watchpointSizeTests to capture this change.

I am working on a similar patch for AArch64.

Diff Detail

Event Timeline

omjavaid updated this revision to Diff 71500. Sep 15 2016, 6:41 AM
omjavaid retitled this revision from to LLDB Arm Watchpoints: Use single hardware watchpoint slot to watch multiple bytes where possible.
omjavaid updated this object.
omjavaid added a reviewer: labath.
omjavaid added a subscriber: lldb-commits.
labath requested changes to this revision. Sep 15 2016, 7:47 AM
labath edited edge metadata.

I have some doubts about the validity of this patch. We should make sure those are cleared before putting this in.

packages/Python/lldbsuite/test/functionalities/watchpoint/watchpoint_size/TestWatchpointSizes.py
43 ↗(On Diff #71500)

It's not clear to me why you need to modify the existing test for this change. You are adding functionality, so all existing tests should pass as-is (which will also validate that your change did not introduce regressions).

154 ↗(On Diff #71500)

It looks like this test will test something completely different on arm than on other architectures. You would probably be better off writing a new test for this.

source/Plugins/Process/Linux/NativeRegisterContextLinux_arm.cpp
574

The logic here is extremely convoluted. Doesn't this code basically boil down to:

current_size = m_hwp_regs[wp_index].control & 1 ? GetWatchpointSize(wp_index) : 0;
new_size = llvm::NextPowerOf2(std::max(current_size, watch_mask));
// update the control value, write the debug registers...

Also watch_mask should probably be renamed to watch_size, as it doesn't appear to be a mask.

609

This looks a bit worrying.
What will happen after the following sequence of events:

  • client tells us to set a watchpoint at 0x1000
  • we set the watchpoint
  • client tells us to set a watchpoint at 0x1001
  • we extend the previous watchpoint to watch this address as well
  • client tells us to delete the watchpoint at 0x1000
  • ???

Will we remain watching the address 0x1001? I don't see how you will be able to do that without maintaining some info about the original watchpoints the client requested (and I have not seen that code). Please add a test for this.
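
For illustration, a hypothetical shape of that bookkeeping (not code from this patch): remember each client-requested watch separately, and recompute the slot's byte-select mask whenever one of them is removed.

#include <cstdint>
#include <vector>

struct ClientWatch { uint32_t addr; uint32_t size; };

struct SharedSlot {
  uint32_t aligned_addr;            // word-aligned hardware address
  std::vector<ClientWatch> watches; // original client requests in this word

  // Byte-select mask covering the watches that remain; zero means the
  // hardware slot can be freed.
  uint32_t RemainingByteSelect() const {
    uint32_t bas = 0;
    for (const ClientWatch &w : watches)
      bas |= ((1u << w.size) - 1) << (w.addr - aligned_addr);
    return bas;
  }
};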

This revision now requires changes to proceed. Sep 15 2016, 7:47 AM
clayborg requested changes to this revision. Sep 15 2016, 1:03 PM
clayborg edited edge metadata.

Great fix. Just fix the testing so that it isn't ARM specific. There shouldn't be any:

if self.getArchitecture() in ['arm']:
  do arm stuff
else:
  do non arm stuff

Also, we will need to be able to test: set a watch at 0x1000, then at 0x1001 (sharing a slot), clear 0x1000, make sure 0x1001 still works, etc.

comments inline.

packages/Python/lldbsuite/test/functionalities/watchpoint/watchpoint_size/TestWatchpointSizes.py
43 ↗(On Diff #71500)

As we keep adding these tests, our overall testing time increases. I just thought using the same test with some changes would cover the cases we want to test.

Anyway, we can write separate tests as well.

source/Plugins/Process/Linux/NativeRegisterContextLinux_arm.cpp
574

Seems legit. I'll update this in the next patch.

609

I just realized that I missed a crucial change in my last patch that we need to make all this work.

Let me get back with correction and desired test cases.

omjavaid updated this revision to Diff 71633. Sep 16 2016, 6:47 AM
omjavaid edited edge metadata.

I have added a new test case that tests the suggested scenario without changing any previous test cases.

Also, I have made sure we re-validate all watchpoints installed on thread resume, to make sure we have the latest values assigned to the hardware watchpoint registers.

This passes on ARM (Raspberry Pi 3, Samsung Chromebook). I have not yet tested on Android.

This will fail on targets which don't support multiple watchpoint slots.

Also, this should fail on AArch64, which I am currently working on.

zturner added inline comments.
packages/Python/lldbsuite/test/functionalities/watchpoint/multi_watchpoint_slots/main.c
24

What's up with all the double spaced source code? Is this intentional?

source/Plugins/Process/Linux/NativeRegisterContextLinux_arm.cpp
515–523

This block of code is a bit confusing to me. Is this equivalent to:

lldb::addr_t start = llvm::alignDown(addr, 4);
lldb::addr_t end = addr + size;
if (start == end || (end - start) > 4)
  return LLDB_INVALID_INDEX32;
labath added inline comments. Sep 18 2016, 1:14 PM
packages/Python/lldbsuite/test/functionalities/watchpoint/multi_watchpoint_slots/main.c
24

Indeed. The spaces seem superfluous.

source/Plugins/Process/Linux/NativeRegisterContextLinux_arm.cpp
515–523

I am not sure this is much clearer, especially as we will later need a separate variable for end-start anyway.

+1 for llvm::alignDown though.

571

Looks much better. Any reason for not using NextPowerOf2? Among other things, it is self-documenting, so you do not need the comment above it.

source/Plugins/Process/Linux/NativeThreadLinux.cpp
205

If you add this, then the comment below becomes obsolete.

Seems like a pretty elegant solution to the incremental watchpoint update problem. I am wondering whether we need to do it on every resume though. I think it should be enough to do it when a watchpoint gets deleted (NTL::RemoveWatchpoint). Also, we should throw out the implementation of NativeRegisterContextLinux_arm::ClearHardwareWatchpoint -- it's no longer necessary, and it's not even correct anymore.

Answers to comments. I will upload an updated patch after corrections and updates.

packages/Python/lldbsuite/test/functionalities/watchpoint/multi_watchpoint_slots/main.c
24

Agreed. Will be removed in next iteration.

source/Plugins/Process/Linux/NativeRegisterContextLinux_arm.cpp
515–523

There is a significant performance difference when we choose addr = addr & (~0x03); over llvm::alignDown.

It will eventually affect responsiveness if we keep increasing code size like this.

Even with -Os, alignDown is only squeezed down to 8-9 instructions. I haven't tried clang, though.

Instructions needed for addr = addr & (~0x03):

4008bd:	48 8b 45 e8          	mov    -0x18(%rbp),%rax
4008c1:	48 83 e0 fc          	and    $0xfffffffffffffffc,%rax
4008c5:	48 89 45 e8          	mov    %rax,-0x18(%rbp)

Call penalty for llvm::alignDown
400918:	ba 00 00 00 00       	mov    $0x0,%edx
40091d:	48 89 ce             	mov    %rcx,%rsi
400920:	48 89 c7             	mov    %rax,%rdi
400923:	e8 ae 00 00 00       	callq  4009d6 <_Z9alignDownmmm>
400928:	49 89 c4             	mov    %rax,%r12
40092b:	48 8b 5d d8          	mov    -0x28(%rbp),%rbx

Disassembly for llvm::alignDown
00000000004009d6 <_Z9alignDownmmm>:

4009d6:	55                   	push   %rbp
4009d7:	48 89 e5             	mov    %rsp,%rbp
4009da:	48 89 7d f8          	mov    %rdi,-0x8(%rbp)
4009de:	48 89 75 f0          	mov    %rsi,-0x10(%rbp)
4009e2:	48 89 55 e8          	mov    %rdx,-0x18(%rbp)
4009e6:	48 8b 45 e8          	mov    -0x18(%rbp),%rax
4009ea:	ba 00 00 00 00       	mov    $0x0,%edx
4009ef:	48 f7 75 f0          	divq   -0x10(%rbp)
4009f3:	48 89 55 e8          	mov    %rdx,-0x18(%rbp)
4009f7:	48 8b 45 f8          	mov    -0x8(%rbp),%rax
4009fb:	48 2b 45 e8          	sub    -0x18(%rbp),%rax
4009ff:	ba 00 00 00 00       	mov    $0x0,%edx
400a04:	48 f7 75 f0          	divq   -0x10(%rbp)
400a08:	48 0f af 45 f0       	imul   -0x10(%rbp),%rax
400a0d:	48 89 c2             	mov    %rax,%rdx
400a10:	48 8b 45 e8          	mov    -0x18(%rbp),%rax
400a14:	48 01 d0             	add    %rdx,%rax
400a17:	5d                   	pop    %rbp
400a18:	c3                   	retq   
400a19:	0f 1f 80 00 00 00 00 	nopl   0x0(%rax)

Number of instructions generated for alignDown with gcc -Os

400892:	48 8b 6c 24 08       	mov    0x8(%rsp),%rbp
400897:	48 8b 4c 24 10       	mov    0x10(%rsp),%rcx
40089c:	31 d2                	xor    %edx,%edx
40089e:	be c2 0a 40 00       	mov    $0x400ac2,%esi
4008a3:	bf a0 11 60 00       	mov    $0x6011a0,%edi
4008a8:	48 89 e8             	mov    %rbp,%rax
4008ab:	48 f7 f1             	div    %rcx
4008ae:	48 0f af c1          	imul   %rcx,%rax
4008b2:	48 89 c3             	mov    %rax,%rbx
571

So llvm::NextPowerOf2 doesn't serve the intended behaviour.

llvm::NextPowerOf2 returns 2 for 1, 4 for 2 or 3, and 8 for 4.

We just want to make sure that new_size is a power of 2 if it isn't already, which will only be the case when the size turns out to be 3.
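
One way to get that behaviour with the helpers in llvm/Support/MathExtras.h -- a sketch, not the patch's code -- is to guard NextPowerOf2 with a power-of-two check:

#include "llvm/Support/MathExtras.h"
#include <cstdint>

// 1 -> 1, 2 -> 2, 3 -> 4, 4 -> 4; NextPowerOf2 alone would map 4 -> 8.
static uint32_t RoundUpWatchSize(uint32_t size) {
  return llvm::isPowerOf2_32(size)
             ? size
             : static_cast<uint32_t>(llvm::NextPowerOf2(size));
}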

source/Plugins/Process/Linux/NativeThreadLinux.cpp
205

This can be improved for performance; I intentionally didn't do it, to minimize changes to the generic component.

labath accepted this revision. Sep 22 2016, 5:57 AM
labath edited edge metadata.
labath added inline comments.
source/Plugins/Process/Linux/NativeRegisterContextLinux_arm.cpp
515–523

I tried this with clang-3.6 -O3. The entire alignDown function call compiled down to andq $-4, %rdi. I'll leave this up to you, as I think it is readable enough right now, but I don't think we should be afraid of using utility functions like this. I think LLVM cares a lot more about performance than we do, so I believe we can rely on them knowing what they're doing. Also, we have much bigger higher-level issues affecting performance, not the least of which is the change below, where you re-calculate all watchpoints on every resume.

571

OK, never mind then.

source/Plugins/Process/Linux/NativeThreadLinux.cpp
205

I don't care about the performance too much - the code isn't that hot.

What bothers me is that this leaves the code in an inconsistent state - NativeRegisterContextLinux_arm::ClearHardwareWatchpoint thinks it is doing the watchpoint removal "the old way", whereas what it actually does no longer matters, as we will nuke the watchpoint registers anyway. Ideally, I'd like to see this done in the opposite order - first switching the code to use the "nuking" approach to removing watchpoints, and only then adding the slot-reuse code (at which point it will not need to touch any generic code at all). If you want to do it the other way then go ahead, but I do expect to see a follow-up change to clean this up.

omjavaid updated this revision to Diff 72589. Sep 26 2016, 5:39 PM
omjavaid edited edge metadata.

This is a new version which, it seems to me, fully implements the functionality we intend to have here.

On second thought, nuking the ClearHardwareWatchpoint function seems to be the wrong approach here. I spent some time trying different approaches, and it turns out that if we do not ClearHardwareWatchpoint when the back-end asks us to remove it, then we won't be able to step over watchpoints. On ARM targets we have to first clear and then reinstall watchpoints in order to step over the instruction that triggered the watchpoint.

On the other hand, if we call NativeRegisterContextLinux_arm::ClearHardwareWatchpoint, then that watchpoint stands removed even if the call was just to delete the watch on one of the bytes. And if we follow up by creating a new watchpoint on a different word, the slot being used may appear vacant, which is actually inconsistent behavior.

So I have a new approach that does clear the watchpoint registers when NativeRegisterContextLinux_arm::ClearHardwareWatchpoint is called, but still tracks reference counts, by re-introducing the refcount that I removed in my last patch. This means that a follow-up create may fail just because there are still references to a disabled watchpoint and the watchpoint slot is not yet vacant. I have made changes to the test to reflect this behaviour.

Please comment if you have any reservation about this approach.

labath requested changes to this revision. Sep 27 2016, 3:27 AM
labath edited edge metadata.

Hmm.... I am indeed starting to have serious reservations about this.

The more I think about this, the more it starts to look like a big hack. So, now ClearHardwareWatchpoint still maintains a refcount on the number of users of the watchpoint slot, but it disables the slot every time the usage count is decremented (as opposed to when the refcount reaches zero)? And this is supposed to be the reason that stepping over a watchpoint (a pretty critical piece of functionality) works? And the reason why we are still able to do the watchpoints is that before a continue (but not before a single-step, because that would break the previous item) we nuke the watchpoint registers (and their reference counts) and start over?

I am not convinced that having watchpoint slot sharing is important enough to balance the amount of technical debt this introduces.

This revision now requires changes to proceed. Sep 27 2016, 3:27 AM
omjavaid updated this revision to Diff 72723. Sep 27 2016, 3:39 PM
omjavaid edited edge metadata.

Giving this approach a rethink, I don't see a lot of problems with this final implementation, unless it fails on other architectures.
We are already hacking our way to get these byte-selection watchpoints working in the existing code. The new code seems to be improving the hack, in my opinion.

Let me explain what I am doing, and maybe you can provide your suggestions and feedback.

Watchpoint Install => register the watchpoint in the hardware watchpoint register cache.
Watchpoint Enable => enable it in the cache and write the registers using the ptrace interface.

Ideally, we should be able to install/uninstall watchpoints in the cache and then enable them all on a resume.
In the case of ARM, if a watchpoint is hit we should be able to disable that watchpoint, step over the watched instruction, and then re-enable the watchpoint.
Our existing implementation would require a lot of changes to do that, so here is what I am doing instead.

SetHardwareWatchpoint => performs Install and Enable.

  • If a new watchpoint slot is going to be used, we Install and Enable.
  • For a new watchpoint we should be able to complete both Install and Enable, or we report an error.
  • If a duplicate slot is going to be used, we Install, and Enable if required.
  • Install means updating the size if needed, plus updating the ref count.
  • Enable means updating the registers if the size was updated.

ClearHardwareWatchpoint

  • Disabling and uninstalling a watchpoint means decrementing the ref count and clearing the hardware watchpoint registers.
  • The advantage of keeping ref counts is:
  • If the refcount is greater than zero, then SetHardwareWatchpoint cannot use this slot for a new watchpoint (new address).
  • But SetHardwareWatchpoint can use this slot to install duplicate watchpoints (same address but a different byte or word).

ClearAllHardwareWatchpoint => just clear the whole watchpoint cache and call SetHardwareWatchpoint for all available watchpoints.

NativeThreadLinux:

On watchpoint remove -> invalidate the watchpoint cache.
On resume -> re-validate watchpoints by creating a new cache and re-enabling all watchpoints.

So this fixes our step-over issue, and also preserves a watchpoint slot if it is being used by multiple watchpoints.
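
Condensed into code, the scheme above looks roughly like this (a sketch of the description, not the actual patch):

#include <cstdint>

struct HwSlot {
  uint32_t address = 0;
  uint32_t control = 0;  // bit 0 = enable
  uint32_t refcount = 0; // client watches sharing this slot
};

bool ClearHardwareWatchpoint(HwSlot &slot) {
  if (slot.refcount == 0)
    return false;
  --slot.refcount;
  slot.control &= ~1u; // always disabled in hardware, so step-over works
  return true;         // but the slot stays reserved while refcount > 0
}

// A watchpoint at a *new* address may only claim a slot whose refcount is
// zero; duplicates of slot.address may still share it.
bool AvailableForNewAddress(const HwSlot &slot) {
  return slot.refcount == 0;
}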

Can you think of any scenarios which might fail for this approach?

> Giving this approach a rethink, I don't see a lot of problems with this final implementation, unless it fails on other architectures.
> We are already hacking our way to get these byte-selection watchpoints working in the existing code. The new code seems to be improving the hack, in my opinion.

  • I don't believe that the presence of one hack justifies the addition of another. If anything, it's the opposite.
  • The fact that we have to fiddle with the byte selects to get the kernel to accept our watchpoints could be thought of as a "hack", but I think there it's justified, as otherwise we would be unable to watch anything that isn't dword-aligned. And most importantly, the issue is contained within the SetHardwareWatchpoint function -- no one outside of it needs to know that we are doing something funny with byte-selects. This is not the case here, as I explain below.

> Can you think of any scenarios which might fail for this approach?

I can. Try the following sequence of commands:

  • position the PC on an instruction that will write to address A (aligned)
  • set a byte watchpoint on A+0
  • set a byte watchpoint on A+1
  • clear the second watchpoint
  • single-step

What will happen with your patch applied? (*) Nothing, as the watchpoint will not get hit because you disabled it. Now, this is a pretty dubious example, but I think it demonstrates the problems I have with this solution very nicely.

You are lying to the client about which watchpoints are enabled. The client asked you to remove just one watchpoint, so as far as he is concerned the other one is still active. But you have disabled the other one as well. And now you make this a feature, as without it single-stepping a multi-watchpoint will not work. And you are relying here on the fact that the client implements the step-over-watchpoint as a simple $z2, $s, $Z2 packet sequence. If the client decides to do anything more complicated, say evaluate an expression to get some state before the update, then your feature will break, as an intermediate $c will reinstate your hidden watchpoint (in fact, this is maybe possible already, if a conditional breakpoint and a watchpoint are hit simultaneously). And you don't even fix the step-over-instruction-triggering-multiple-watchpoints problem definitively - this can still happen, e.g., if you have a watch on 0x1000 and 0x1008, and a single instruction triggers both. (**)

This is why I think this change is bad: it draws an invisible line between the lowest levels of the server and the highest levels of the client, which now have to be kept in sync, otherwise things will unexpectedly break. I don't believe that having the ability to watch more than 4 locations at once is worth it, particularly when it can be easily worked around by the user (just set one big watchpoint instead of multiple small ones). Do you have a specific use case where this would be necessary/useful? I personally have never used more than one watchpoint per debug session in my life, and I expect this to be true for most people (I think the typical use case is "who is corrupting my variable?"), so I believe that it is better to have support for fewer watchpoints, but have it working well, than the opposite.

(*) Now, when I tried this out on lldb without this change, it still did not work as expected - for some reason, after hitting and stepping over the watchpoint, lldb decided to issue $c, and lost control of the inferior. I think that tracking this issue down and fixing it would have more impact than adding the multi-watchpoint support (and it would reduce the number of corner cases where we are wrong, rather than increasing it).

(**) Another issue whose solving would have more impact than this.

Is there anything we need to do on this review?

omjavaid abandoned this revision. Nov 25 2016, 2:01 AM

There is no exact solution that satisfies all corner cases. Abandoning for now, until I come up with a solution that covers us from all corners.