There were (at least) two problems causing TestStepOverWatchpoints.py to fail on arm64 Darwin systems. The first was that the "step over the watch instruction" plan, which we push so that we can report the watchpoint stop once we know both the old and new values, was a utility plan that doesn't cause a user stop when it completes. So instead of reporting the watchpoint, we just continued on with the step.
The second one may be specific to Darwin systems: when you single-step the processor over an instruction that triggers the watchpoint, the stop reason we get is Trace and not Watchpoint. That's not lldb's interpretation; that's what the exception we get from the kernel says.
I don't have a good way to work around the second one yet, but this patch fixes the first problem and reworks the test so that there is one case that doesn't end up single-stepping over the instruction that triggers the watchpoint, which now succeeds on Darwin, and one that does single-step over that instruction, which still fails.
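
For readers unfamiliar with the plan stack, the first problem can be pictured with a toy model. This is only an illustrative sketch, not lldb's actual ThreadPlan code: the `Plan` class, the `reports_stop` flag, and `run_until_user_stop` are all hypothetical names, and the model assumes every plan on the stack simply runs to completion in top-down order.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Plan:
    """Hypothetical stand-in for a thread plan (not lldb's real class)."""
    name: str
    # Assumed flag: does completing this plan produce a stop the user sees?
    reports_stop: bool


def run_until_user_stop(plan_stack: List[Plan]) -> Optional[str]:
    """Complete plans from the top of the stack down.

    Returns the name of the first completed plan that reports a stop,
    or None if every plan completes silently.
    """
    while plan_stack:
        done = plan_stack.pop()
        if done.reports_stop:
            return done.name
        # A silent (utility) plan finished: the plan beneath it resumes.
    return None


# Before the fix: the watchpoint plan is silent, so the user's step wins.
before = run_until_user_stop(
    [Plan("user-step", True), Plan("step-over-watch-insn", False)]
)

# After the fix: the watchpoint plan's completion is reported.
after = run_until_user_stop(
    [Plan("user-step", True), Plan("step-over-watch-insn", True)]
)
```

In this model, `before` is the user's step plan (the watchpoint hit is swallowed), while `after` is the plan that stepped over the watched instruction, which corresponds to the behavior this patch is aiming for.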