This is an archive of the discontinued LLVM Phabricator instance.

[LLDB] Fix code breakpoints on tagged addresses
AbandonedPublic

Authored by DavidSpickett on Oct 28 2022, 3:29 AM.

Details

Summary

This allows you to break on the value of a function pointer
with non-address bits in it. For example your ABI might mandate
function pointer signing.

Previously it would either try to write to an address that was
incorrect, or try to set a hardware register incorrectly.

I have placed the ABI check before GetBreakableLoadAddress
on the logic that that function probably works best with a
fixed address.

This is somewhat academic because the only user of
GetBreakableLoadAddress is MIPS where we don't support any
form of address tagging.

Diff Detail

Event Timeline

DavidSpickett created this revision.Oct 28 2022, 3:29 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 28 2022, 3:29 AM
DavidSpickett requested review of this revision.Oct 28 2022, 3:29 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 28 2022, 3:29 AM

Add a release note.

Herald added a project: Restricted Project. · View Herald TranscriptOct 28 2022, 3:37 AM

Tests will run only for armv8.3-a hardware so I've tested them with qemu-system. Existing tests pass on Armv8 Linux as before.

Some of these PAC tests could run on Apple Silicon now that I think about it. That's for another time, needs extra work.

If you want to understand the motivation beyond "this should work why doesn't it", here it is.

If your ABI mandated signing all function pointers you might store a callback in a struct somewhere. Not knowing what that callback points to you would like to be able to do breakpoint set -a mystruct.callback and not have to first print it, then break on the function name or manually fix the address.

Completely forgot about this, ping!

Hi David, thanks for the ping sorry for not replying earlier to this. I looked at this a little bit today but want to look at how I solved the same issue a while ago (but never upstreamed, which is my bad, I need to clean those up). (I did solved this in OptionArgParser::ToAddress and I haven't looked at this enough to argue for mine over your approach -- but mine gets you any address expression command like source lookup -a and image lookup -a etc. I'll look at this more closely this weekend and follow up though.

(I got distracted way too much when I tried to make a repo test case, and my function pointer variable was called f and it would never work. I looked at it with Jim and we were both confused mightily, until we noticed that OptionArgParser::ToAddress throws things through a "see if it can be parsed as base16 even without a prefix" so you can't use a variable name that consists of [a-f][0-9a-f]* lol. Jim or I are definitely going to be removing that little bit of too-cleverness.)

At first I was wondering if the ABI should save the ArchSpec it is created with so it can call the Architecture's GetBreakableLoadAddress instead of requiring all the callers to call both of these, but I see from the ppc64 description of what that method does, it would not be applicable to most code addresses, only function entry, and only when we were calling into this function I think?

I have a little example of a function pointer using ptrath in an arm64e (ARMv8.3) program, to show what I was talking about,

* thread #1, queue = 'com.apple.main-thread', stop reason = instruction step into
    frame #0: 0x0000000100003fa0 a.out`main + 40 at a.c:6
   1   	int foo () { return 5; }
   2   	
   3   	int main()
   4   	{
   5   	  int (*func)() = foo;
-> 6   	  return func();
   7   	}
Target 0: (a.out) stopped.

(lldb) p func
(int (*)(...)) $0 = 0xb868000100003f70 (actual=0x0000000100003f70 a.out`foo at a.c:1)

(lldb) reg read x8
      x8 = 0xb868000100003f70 (0x0000000100003f70) a.out`foo at a.c:1

(lldb) ima loo -a func
      Address: a.out[0x0000000100003f70] (a.out.__TEXT.__text + 0)
      Summary: a.out`foo at a.c:1

(lldb) br s -a func
Breakpoint 2: where = a.out`foo at a.c:1, address = 0x0000000100003f70

(lldb) mem read -c 1 -s 4 -f x func
0x100003f70: 0x528000a0

(lldb) x/1i func
    0x100003f70: 0x528000a0   mov    w0, #0x5
(lldb)

(nb out of the box installs of macOS will not run arm64e binaries like this, the ABI is not finalized -- it's meant to prevent people from sharing builds that may stop working some day when the ABI changes)

This process isn't using TBI, but the high bits end up being used for signing so it's the same effect. I located this in OptionArgParser::ToAddress() because we'll need this same behavior in

if (expr_result == eExpressionCompleted) {
  if (valobj_sp)
    valobj_sp = valobj_sp->GetQualifiedRepresentationIfAvailable(
        valobj_sp->GetDynamicValueType(), true);
  // Get the address to watch.
  if (valobj_sp)
    addr = valobj_sp->GetValueAsUnsigned(fail_value, &success);
  if (success) {
    if (error_ptr)
      error_ptr->Clear();
    if (abi_sp)
      addr = abi_sp->FixCodeAddress (addr);
    return addr;
  } else {

What do you think about locating this change in ToAddress instead of Target::GetBreakableLoadAddress? It looks like the one caller to GetBreakableLoadAddress is Target::CreateBreakpoint(addr_t addr) - which is probably called by an SBTarget method if we want to think of the most general purpose use case. I think stripping non-addressable bits in OptionArgParser::ToAddress is the right thing to do - but I don't have an opinion about whether it should be done in Target::GetBreakableLoadAddress or not.

One thing to note is that I also have an SBValue method that needs to be upstreamed, SBValue::GetValueAsAddress(). So if someone has an SBValue of a function pointer and they want to take the value of that func ptr and call SBTarget::CreateBreakpoint on it, I would say "well yes, you need to use GetValueAsAddress before you pass that to CreateBreakpoint".

What do you think about locating this change in ToAddress instead of Target::GetBreakableLoadAddress? It looks like the one caller to GetBreakableLoadAddress is Target::CreateBreakpoint(addr_t addr) - which is probably called by an SBTarget method if we want to think of the most general purpose use case. I think stripping non-addressable bits in OptionArgParser::ToAddress is the right thing to do - but I don't have an opinion about whether it should be done in Target::GetBreakableLoadAddress or not.

I had not checked the other lookup commands, so yeah this sounds good to me. Doing this fixing in fewer places is always better. Put up your change and I'll check the test from this patch against it.

One thing to note is that I also have an SBValue method that needs to be upstreamed, SBValue::GetValueAsAddress(). So if someone has an SBValue of a function pointer and they want to take the value of that func ptr and call SBTarget::CreateBreakpoint on it, I would say "well yes, you need to use GetValueAsAddress before you pass that to CreateBreakpoint".

If I understand correctly, this would effectively run FixCodeAddress on the value? One thing that I found doing the SB API for MTE (which was shelved, but besides the point) is that there was no way to fix addresses via the API (or get access to the ABI plugin in general). I ended up just adding FixDataAddress to SBProcess as a temporary thing. So something like ValueAsAddress could be very useful.

We have this distinction (that doesn't mean much) between code and data fixes. For now a single method will work fine, I'm hoping the code/data difference never gets to a point where we have to be really careful about it.
(as you say "This process isn't using TBI, but the high bits end up being used for signing so it's the same effect.", so FixCode and FixData are the same thing in that case)

What do you think about locating this change in ToAddress instead of Target::GetBreakableLoadAddress? It looks like the one caller to GetBreakableLoadAddress is Target::CreateBreakpoint(addr_t addr) - which is probably called by an SBTarget method if we want to think of the most general purpose use case. I think stripping non-addressable bits in OptionArgParser::ToAddress is the right thing to do - but I don't have an opinion about whether it should be done in Target::GetBreakableLoadAddress or not.

I had not checked the other lookup commands, so yeah this sounds good to me. Doing this fixing in fewer places is always better. Put up your change and I'll check the test from this patch against it.

OK sounds good, I'll put up a phab. I can't test it on macOS platforms because the bots won't be able to build & run arm64e (ARMv8.3 w/ ptrauth) binaries. :/

One thing to note is that I also have an SBValue method that needs to be upstreamed, SBValue::GetValueAsAddress(). So if someone has an SBValue of a function pointer and they want to take the value of that func ptr and call SBTarget::CreateBreakpoint on it, I would say "well yes, you need to use GetValueAsAddress before you pass that to CreateBreakpoint".

If I understand correctly, this would effectively run FixCodeAddress on the value? One thing that I found doing the SB API for MTE (which was shelved, but besides the point) is that there was no way to fix addresses via the API (or get access to the ABI plugin in general). I ended up just adding FixDataAddress to SBProcess as a temporary thing. So something like ValueAsAddress could be very useful.

Yeah my SBValue::GetValueAsAddress() simply passed it through FixCodeAddress which was correct enough for our current environment, but SBValue may have access to type information so maybe it could pick the correct Fix method to call given that information?

I've had some scripting people who also want methods to clear non-addressable bits in SBProcess, as they handle signed pointers in their own scripts, it's something I've been meaning to do for them -- or give them a way to get the address fixing bitmasks.

jasonmolenda accepted this revision.Dec 19 2022, 9:16 AM

I think we both agree that OptionArgParser::ToAddress needs to fix the address, but I don't have an opinion about whether we should also do it in GetBreakableLoadAddress. I'm fine if you think we should do it here too, for other codepaths that can try to set a breakpoint on a signed function pointer address.

This revision is now accepted and ready to land.Dec 19 2022, 9:16 AM
DavidSpickett planned changes to this revision.Dec 19 2022, 9:28 AM

I can't test it on macOS platforms because the bots won't be able to build & run arm64e (ARMv8.3 w/ ptrauth) binaries. :/

Well our bots can't either but I've got QEMU locally is what I mean. We can run top byte ignore tests but top byte ignore being ignored by hardware sometimes makes tests pass anyway :)

I don't have an opinion about whether we should also do it in GetBreakableLoadAddress

I'll make a list of the paths to setting a breakpoint, to see which layer it makes the most sense to fix in. (which I'll get time to do sometime in January)