This is an archive of the discontinued LLVM Phabricator instance.

Define qHostInfo and Mach-O LC_NOTE "addrable bits" methods to describe high and low memory addressing bits
Closed · Public

Authored by jasonmolenda on Aug 10 2023, 5:57 PM.

Details

Summary

To recap: on AArch64, where we need to strip pointer authentication bits off of addresses, high and low memory can have a different number of bits in use (each region can have its own page table setup, and therefore a different number of addressing bits for virtual addresses). I added this to Process and defined a new setting so the masks can be set manually in https://reviews.llvm.org/D151292 a few months ago.

This patch defines keys for qHostInfo and a new "addrable bits" LC_NOTE format for mach-o corefiles to define both of them.

qHostInfo gains low_mem_addressing_bits and high_mem_addressing_bits keys. The previous addressing_bits is still accepted for a value that is used for both regions of memory (this is by far the most common case).
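As a rough illustration of how a client might consume these keys, here is a minimal sketch that scans a `key:value;` style qHostInfo reply. The struct and function names are hypothetical and not taken from the patch, and treating the values as decimal is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical holder for the values parsed out of a qHostInfo reply.
struct HostAddressingBits {
  std::optional<uint32_t> low;  // from low_mem_addressing_bits
  std::optional<uint32_t> high; // from high_mem_addressing_bits
};

// Scan a "key:value;key:value;" reply for the three addressing-bits keys.
// Assumption: values are decimal integers.
inline HostAddressingBits
ParseHostInfoAddressingBits(const std::string &reply) {
  std::optional<uint32_t> both, low, high;
  std::istringstream stream(reply);
  std::string pair;
  while (std::getline(stream, pair, ';')) {
    const auto colon = pair.find(':');
    if (colon == std::string::npos)
      continue;
    const std::string key = pair.substr(0, colon);
    const bool is_both = key == "addressing_bits";
    const bool is_low = key == "low_mem_addressing_bits";
    const bool is_high = key == "high_mem_addressing_bits";
    if (!is_both && !is_low && !is_high)
      continue;
    uint32_t bits = 0;
    try {
      bits = static_cast<uint32_t>(std::stoul(pair.substr(colon + 1)));
    } catch (const std::exception &) {
      continue; // Ignore values that are not numbers.
    }
    if (is_both)
      both = bits;
    else if (is_low)
      low = bits;
    else
      high = bits;
  }
  // The single legacy key applies to both regions unless a more specific
  // key was also present.
  HostAddressingBits result;
  result.low = low ? low : both;
  result.high = high ? high : both;
  return result;
}
```

Letting the specific keys override the legacy key is a design choice of this sketch; the description above only says that the legacy key is still accepted.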

For Mach-O corefiles, I needed to update the "addrable bits" LC_NOTE. The previous definition was

struct addressing_bit_count_v3    // ** DEPRECATED **
{
    uint32_t version;             // currently 3
    uint32_t addressing_bits;     // # of bits used for addressing
    uint64_t unused;
};

The new definition is

struct addressing_bit_count
{
    // currently 4
    uint32_t version;

    // Number of bits used for addressing in low
    // memory (addresses starting with 0)
    uint32_t low_memory_addressing_bits;

    // Number of bits used for addressing in high
    // memory (addresses starting with f)
    uint32_t high_memory_addressing_bits;

    // set to zero
    uint32_t reserved;
};
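To make the two layouts concrete, the following sketch reads an "addrable bits" payload from a byte buffer and accepts either version. The function and struct names are hypothetical, and the payload is assumed to already be in host byte order; a real reader would have to respect the corefile's byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>

// Hypothetical result type: one bit count per memory region.
struct ParsedAddressableBits {
  uint32_t low_memory_bits;
  uint32_t high_memory_bits;
};

// Read a uint32_t at a byte offset. Assumes host byte order.
static uint32_t ReadU32(const uint8_t *data, size_t offset) {
  uint32_t value;
  std::memcpy(&value, data + offset, sizeof(value));
  return value;
}

// Decode a version 3 or version 4 payload; any other version is rejected.
inline std::optional<ParsedAddressableBits>
ParseAddrableBitsNote(const uint8_t *data, size_t size) {
  if (size < 8)
    return std::nullopt; // Need at least version + one bit count.
  const uint32_t version = ReadU32(data, 0);
  if (version == 3) {
    // Deprecated layout: a single value that covers both regions.
    const uint32_t bits = ReadU32(data, 4);
    return ParsedAddressableBits{bits, bits};
  }
  if (version == 4) {
    if (size < 16)
      return std::nullopt; // version, low, high, reserved
    return ParsedAddressableBits{ReadU32(data, 4), ReadU32(data, 8)};
  }
  return std::nullopt;
}
```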

The changes in this patchset are to GDBRemoteCommunicationClient (read the new qHostInfo keys), ProcessGDBRemote (set both in the Process when they are specified), ObjectFileMachO (read and write the new LC_NOTE in mach-o corefiles), and ProcessMachCore (set both in Process when they are specified).

I still accept the previous formats of a single value, and only initialize the base CodeAddressMask and DataAddressMask. Only when both high and low memory addressing bits are specified and are different, do we set the Process' HighmemCodeAddressMask, HighmemDataAddressMask.
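For readers unfamiliar with how a bit count becomes a mask, here is a small sketch of the arithmetic. It assumes a mask in which the set bits are the non-addressable ones, and it assumes that bit 55 of a pointer selects high versus low memory; both conventions and all names here are assumptions of the sketch, not something this patch description states.

```cpp
#include <cassert>
#include <cstdint>

// Build a mask whose set bits are the bits NOT used for addressing.
// A count of 0 (or an out-of-range count) yields 0, meaning "no mask".
inline uint64_t MaskForAddressingBits(uint32_t bits) {
  if (bits == 0 || bits >= 64)
    return 0;
  return ~((1ULL << bits) - 1);
}

// Strip the non-address bits from a pointer.
// Assumption: bit 55 tells us whether the address is in high memory.
inline uint64_t FixAddress(uint64_t addr, uint64_t lowmem_mask,
                           uint64_t highmem_mask) {
  const bool is_highmem = (addr & (1ULL << 55)) != 0;
  if (is_highmem)
    return addr | highmem_mask; // High addresses have the top bits set.
  return addr & ~lowmem_mask;   // Low addresses have the top bits clear.
}
```

With 39 low-memory bits and 52 high-memory bits, the two masks differ, which is the case where separate high-memory masks are needed.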

This is the final part of adding support for the different address masks that can be used on AArch64 targets.

Event Timeline

jasonmolenda created this revision. Aug 10 2023, 5:57 PM
jasonmolenda requested review of this revision. Aug 10 2023, 5:57 PM

update comments a tiny bit.

bulbazord added inline comments. Aug 11 2023, 12:20 PM
lldb/include/lldb/Symbol/ObjectFile.h
500–503

/param -> \param

lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp
917–918

These should be SetHighmemCodeAddressMask and SetHighmemDataAddressMask right?

Fix the problems Alex found on review, thanks Alex.

bulbazord accepted this revision. Aug 11 2023, 1:40 PM

LGTM. You may want to wait for Jonas to take a look before landing though.

This revision is now accepted and ready to land. Aug 11 2023, 1:40 PM
JDevlieghere added inline comments. Aug 14 2023, 1:51 PM
lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.h
125–126

I would prefer this to return a struct with the two masks, possibly living in Utility. The struct would centralize default values, which values are meaningful, etc and could be stored as a member in GDBRemoteCommunicationClient.

Incorporate Jonas' suggestion of an AddressableBits class that GDBRemoteCommunicationClient and ObjectFileMachO can use to pass the zero/one/two addressable bits values back to a Process, centralizing the logic for how those 0-2 values are used to set the Process address masks.
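A minimal sketch of what such a class might look like follows. Everything here is hypothetical: `MockProcess` stands in for lldb's Process, and the class and method names are illustrative rather than the ones in the final patch. The rule it encodes is the one from the summary: a single value sets only the base masks, and the high-memory masks are set only when both values are given and differ.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for lldb's Process, holding only the four address masks.
struct MockProcess {
  uint64_t code_mask = 0;
  uint64_t data_mask = 0;
  uint64_t highmem_code_mask = 0;
  uint64_t highmem_data_mask = 0;
};

// Hypothetical container for zero, one, or two addressable-bits values.
class AddressableBitsSketch {
public:
  // One value: it applies to both low and high memory.
  void SetAddressableBits(uint32_t bits) { m_low = m_high = bits; }

  // Two values: low and high memory may differ.
  void SetAddressableBits(uint32_t low, uint32_t high) {
    m_low = low;
    m_high = high;
  }

  // Apply whatever was recorded; with nothing recorded, leave masks alone.
  void ApplyTo(MockProcess &process) const {
    if (m_low != 0)
      process.code_mask = process.data_mask = MaskFor(m_low);
    if (m_low != 0 && m_high != 0 && m_low != m_high)
      process.highmem_code_mask = process.highmem_data_mask =
          MaskFor(m_high);
  }

private:
  // Set bits in the returned mask are the non-addressable bits.
  static uint64_t MaskFor(uint32_t bits) {
    return bits < 64 ? ~((1ULL << bits) - 1) : 0;
  }

  uint32_t m_low = 0;  // 0 means "not specified"
  uint32_t m_high = 0; // 0 means "not specified"
};
```

Keeping the "which values are meaningful" decision inside one class means the gdb-remote and Mach-O corefile paths only need to record what they read.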