This is an archive of the discontinued LLVM Phabricator instance.

Use kernel's global variable indicating how many bits are used in addressing when loading Darwin xnu kernel
ClosedPublic

Authored by jasonmolenda on Apr 3 2023, 11:28 AM.

Details

Summary

The Darwin xnu kernel has a global variable which has the number of bits used for addressing in ptrauth code. The DynamicLoaderDarwinKernel already knows to look at xnu global variables to load kexts (solibs for a kernel); it's a natural place to find & use this information to set the number of bits correctly for the Process.

Diff Detail

Event Timeline

jasonmolenda created this revision.Apr 3 2023, 11:28 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 3 2023, 11:28 AM
jasonmolenda requested review of this revision.Apr 3 2023, 11:28 AM
JDevlieghere added inline comments.Apr 3 2023, 12:46 PM
lldb/source/Plugins/DynamicLoader/Darwin-Kernel/DynamicLoaderDarwinKernel.cpp
1087

What does this check add? Should this be:

if (symbol && symbol->GetByteSize() == wordsize ==8)
1090

Maybe make this const to make it clear nobody should modify this.

1093

Instead of changing the addressable bits for the process, should we modify GetLoadAddress to explicitly skip the stripping instead? This is probably purely theoretical, but what if another (host) thread tried to read memory in the meantime? Changing global state like this can lead to subtle bugs.

1101

You can probably just reuse wordsize here?

1111

Why is this singed? This should be a uint32_t.

1114

Is this actually necessary? Does that mean that you can't do process->SetVirtualAddressableBits(process->GetVirtualAddressableBits()) in general?

jasonmolenda added inline comments.Apr 3 2023, 1:01 PM
lldb/source/Plugins/DynamicLoader/Darwin-Kernel/DynamicLoaderDarwinKernel.cpp
1093

Yes, I thought about touching the global state. At this point we've just attached to the remote device / corefile, and are loading the kernel binary. There aren't other threads operating on this at this time. We're also in a state where the number of addressable bits may default to a correct value, but is just as likely incorrect.

As for a flag to do this, it's a bit tricky! It's actually the Target::ReadUnsignedIntegerFromMemory() call, which takes an Address object, which ends up mutating the address while constructing the load address. We could pipe a flag for ReadUnsignedIntegerFromMemory down a few layers to where that's happening, or add a flag to the Address object which indicates that it should not be mutated down the line. Target::ReadMemory needs to take an Address object to fall back to using the backing file if possible (important if the corefile doesn't include the binary contents), another alternative is to switch to Process::ReadMemory whcih takes an addr_t but won't fall back to the on-disk file.

1114

I didn't debug it through the layers, but setting the value to 0 to when the getter returns 0, did break this.

Update patch to address some of Jonas' comments. I was adding a check that the ptrsize of the architecture was 64-bit, but we don't work on 32-bit xnu kernels for a few years now; we can assume 64-bit safely. I am still forcing the addressable bits value to 55 globally inside lldb while I read & set the correct value.

JDevlieghere accepted this revision.Apr 3 2023, 1:40 PM

LGTM. I would still prefer to not change the global state, but after speaking to Jason offline, that might be better tackled when we unify the current Process/Target bifurcation.

lldb/source/Plugins/DynamicLoader/Darwin-Kernel/DynamicLoaderDarwinKernel.cpp
1094–1097

Why not use symbol->GetByteSize() too? Maybe add an assert(symbol->GetByteSize() == 8)

This revision is now accepted and ready to land.Apr 3 2023, 1:40 PM

Yeah, I agree that messing with the global setting of addressable bits here is not ideal, but at this point in time the fact that we probably have an un-set/invalid value means it won't make things worse.

lldb/source/Plugins/DynamicLoader/Darwin-Kernel/DynamicLoaderDarwinKernel.cpp
1094–1097

We're trying to handle the case (it has happened in the past) where we have an inaccurate byte size because lldb is running on a stripped kernel binary, this is specifically working around the case where the symbol byte size is not 8.

(Symbols synthesize their size by looking at the next nearest symbol, so when you have a stripped binary you may have sizes that are larger than reality when some symbols have been stripped)