This is an archive of the discontinued LLVM Phabricator instance.

When setting load addresses on darwin kernel kexts, handle case where in-memory load commands are not updated
ClosedPublic

Authored by jasonmolenda on Mar 7 2023, 10:33 PM.

Details

Summary

The Darwin kernel has a number of kernel extensions, kexts, akin to solibs to lldb. We find the kext binary on disk, we create a MemoryModule from the in-memory image based on the Mach-O header and Mach-O load commands in memory. The LC_SEGMENT vmaddrs for each segment in memory have been updated with their actual load addresses. DynamicLoaderDarwinKernel uses the MemoryModule's Section "file" addresses to set the load addresses for the on-disk Module in the target.

There are some extensions that can be loaded into memory which won't have the Mach-O load commands with their load addresses updated. So the MemoryModule has the original file addresses, instead of the actual in-memory load addresses. The current DynamicLoader code will then set the target section load addresses to the file addresses, which doesn't work.

This patch detects this by looking at the MemoryModule's Mach-O header (__TEXT) segment, and comparing its actual memory address we used to create the MemoryModule with the vmaddr ("file address") of the MemoryModule Section. If they differ, this is one of those binaries that hasn't had its load commands updated when they were added to memory.

When this is detected, we assume all segments slide by a constant value, calculate that slide, and apply it to all the sections. Normal kexts have their load addresses calculated as they always had been.

Diff Detail

Event Timeline

jasonmolenda created this revision.Mar 7 2023, 10:33 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 7 2023, 10:33 PM
jasonmolenda requested review of this revision.Mar 7 2023, 10:33 PM

Show an alternate approach, where we check the file types of everything in the kernel+kext list, and if something is not a kernel/kext, don't try to load it at all. v. the first change in DynamicLoaderDarwinKernel::KextImageInfo::LoadImageUsingMemoryModule where I call !IsKernel() && !obj_macho->IsKext() to detect something in the list which isn't processed the standard way that kexts & the kernel are. Either one would work in this case.

Add an additional bit to the "don't load non-kexts, non-kernels" addition. I refined this by checking to see if the binary is already registered in the Target, and already has a load address set. If the binary is, then we don't re-register it.

If we encounter a program that is a non-kext, non-kernel binary and isn't yet loaded in the Target, we will load it. So the code that handles in-memory load commands with incorrect vmaddrs is still needed.

JDevlieghere accepted this revision.Mar 8 2023, 4:28 PM

Some code style nits but the change itself LGTM.

lldb/source/Plugins/DynamicLoader/Darwin-Kernel/DynamicLoaderDarwinKernel.cpp
819–821

You can use dyn_cast_or_null to eliminate the null check:

ObjectFileMachO *ondisk_objfile_macho = llvm::dyn_cast_or_null<ObjectFileMachO>(ondisk_object_file);

I hate temporaries that are used only once below, so I'd probably write:

ObjectFileMachO *ondisk_objfile_macho = llvm::dyn_cast_or_null<ObjectFileMachO>(m_module_sp ? m_module_sp->GetObjectFile() : nullptr);

but I'm probably the only one that likes that better.

835

What's the purpose of reseting the shared pointer? It'll just go out of scope right after.

855–857

Since you already did it for Section below...

if(ObjectFileMachO *memory_objfile_macho =
            llvm::dyn_cast<ObjectFileMachO>(memory_object_file)) {
lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp
1146 ↗(On Diff #503550)

Newline below

This revision is now accepted and ready to land.Mar 8 2023, 4:28 PM
jasonmolenda marked an inline comment as done.Mar 8 2023, 4:43 PM
jasonmolenda added inline comments.
lldb/source/Plugins/DynamicLoader/Darwin-Kernel/DynamicLoaderDarwinKernel.cpp
835

No, this method is trying a few different ways to find the on-disk binary for the kext, and it puts that Module in m_module_sp once it finds it. (there's a bunch of blocks that start like if (!m_module_sp) { ... try another way of finding the Module ...} and the goal when we have a non-kext/non-kernel image that is already in the Target is to avoid re-loading it into the Target. So by clearing it, we're behaving as if it couldn't be found.

altho, tbh, looking this method I could just return here and it would accomplish the same. And be clearer.

855–857

yeah, I thought about that but it made for a really long line that would be a multi-line expression in the if and i wasn't thrilled about it. but i don't care that much.

jasonmolenda marked an inline comment as done.

Update for Jonas' suggestions.

bulbazord added inline comments.
lldb/source/Plugins/DynamicLoader/Darwin-Kernel/DynamicLoaderDarwinKernel.cpp
816–819

I have a suggestion that may make this easier to read. You already know that m_module_sp is valid because you have that as a condition to even enter this block. So you can do something like

ObjectFileMachO *ondisk_objfile_macho = m_module_sp->GetObjectFile();

But right below you only act on this if the object is valid, so you could stick it in the if condition itself:

if (auto *ondisk_objfile_macho = m_module_sp->GetObjectFile()) {