
[lldb][AArch64] Add memory tag reading to lldb-server

Authored by DavidSpickett on Jan 28 2021, 1:44 AM.



This adds memory tag reading using the new "qMemTags"
packet and ptrace on AArch64 Linux.

This new packet follows the one used by GDB.

On AArch64 Linux we use ptrace's PEEKMTETAGS to read
tags and we assume that lldb has already checked that the
memory region actually has tagging enabled.

We do not assume that lldb has expanded the requested range
to granules, so we expand it again to be sure.
(Although lldb will be sending aligned ranges, because it happens
to need them client side anyway.)
Also we don't assume untagged addresses, so for AArch64 we'll
remove the top byte before using them. (The top byte includes
MTE and other non-address data.)
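The top-byte removal amounts to masking off bits 56-63 before handing the address to ptrace. A rough sketch (the function name here is made up; the patch discussion later refers to lldb's RemoveNonAddressBits for this job):

```cpp
#include <cassert>
#include <cstdint>

// On AArch64 Linux only the low 56 bits of a pointer take part in
// addressing; the top byte may hold an MTE logical tag or other
// TBI (Top Byte Ignore) data, so strip it before use.
uint64_t remove_top_byte(uint64_t addr) {
  return addr & 0x00FFFFFFFFFFFFFFULL;
}
```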

To do the ptrace read, NativeProcessLinux will ask the native
register context for a memory tag manager based on the
type in the packet. This also gives you the ptrace numbers you need.
(It's called a register context, but it also holds non-register data,
so this saves adding another per-platform subclass.)

The only supported platform for this is AArch64 Linux and the only
supported tag type is MTE allocation tags. Anything else will
be treated as an error.
Ptrace can return a partial result but for lldb-server we will
be treating that as an error. To succeed we need to get all the tags
we expect.

(Note that the protocol leaves room for logical tags to be
read via qMemTags but this is not going to be implemented for lldb
at this time.)

Diff Detail

Event Timeline

DavidSpickett retitled this revision from [lldb][AArch64] Add MTE memory tag reading to lldb-server to [lldb][AArch64] Add memory tag reading to lldb-server.Feb 23 2021, 6:31 AM
DavidSpickett edited the summary of this revision. (Show Details)
DavidSpickett added inline comments.Feb 23 2021, 6:43 AM

This looks good apart from minor nits inline


Is the ptrace request a success if the number of tags requested is not equal to the number of tags read? If not, then this and the following condition may be redundant.


This piece needs a run of clang-format.


Just curious: the response starts with 'm'. What's the design need for using 'm' in the qMemTags response?


If skipUnlessAArch64MTELinuxCompiler can check for AArch64 and Linux, then we won't need the above two decorators.

DavidSpickett marked an inline comment as done.Mar 3 2021, 3:52 AM
DavidSpickett added inline comments.

Well, PtraceWrapper doesn't check the iovec, but I'll check the kernel source to see if it's actually possible for it to fail that way.


This is for future multi-part replies, à la qfThreadInfo. (I'll add a comment noting this too.)
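For illustration, a reply built this way might look like the following sketch, assuming one hex-encoded byte per allocation tag; this is not lldb's exact code, and the function name is invented.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Build a qMemTags-style reply payload: an 'm' prefix (leaving room
// for future multi-part replies, like qfThreadInfo) followed by each
// tag as two lowercase hex digits.
std::string format_mem_tags_reply(const std::vector<uint8_t> &tags) {
  std::string reply = "m";
  char buf[3];
  for (uint8_t tag : tags) {
    std::snprintf(buf, sizeof(buf), "%02x", tag);
    reply += buf;
  }
  return reply;
}
```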

DavidSpickett marked an inline comment as done.Mar 3 2021, 4:05 AM
DavidSpickett added inline comments.

I'll merge them into one (at least the one you use in tests; separate functions). Also, I just realised this is not checking that the remote supports MTE; it only worked because I've been using the one QEMU instance.

DavidSpickett marked an inline comment as not done.Mar 3 2021, 6:12 AM
DavidSpickett marked an inline comment as done.Mar 4 2021, 6:26 AM
DavidSpickett added inline comments.

In linux/arch/arm64/kernel/mte.c, __access_remote_tags has this comment:

 * Access MTE tags in another process' address space as given in mm. Update
 * the number of tags copied. Return 0 if any tags copied, error otherwise.
 * Inspired by __access_remote_vm().
 */

*any tags* being the key words.

So the scenario is:

  • ask to read from addr X in page 0, with a length of pagesize+some so the range spills into page 1
  • kernel can access page 0, reads tags until the end of the page
  • tries to access page 1 to read the rest, fails, returns 0 (success) since *some* tags were read
  • we see that the ptrace call succeeded but with fewer tags than we expected

I don't think it's worth dealing with this corner case here, since lldb will look before it leaps. It would have errored much earlier because either page 1 isn't in the tracee's memory regions or it wasn't MTE enabled.
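The arithmetic behind that scenario can be sketched as follows, assuming 4 KiB pages and 16-byte granules; the helper name is made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t kPageSize = 0x1000; // 4 KiB pages assumed
constexpr uint64_t kGranule = 16;      // MTE granule size

// How many tags the kernel could return if only the first page of the
// requested range is accessible: it reads up to the page boundary,
// then reports success with however many tags it got.
uint64_t tags_available_in_first_page(uint64_t addr, uint64_t len) {
  uint64_t page_end = (addr & ~(kPageSize - 1)) + kPageSize;
  uint64_t readable = page_end - addr; // bytes before the page break
  if (readable > len)
    readable = len;
  return readable / kGranule;
}
```

So a request for 8 granules starting 4 granules before an inaccessible page would come back with only 4 tags and a "success" return code, which is exactly the partial-result case discussed above.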


On further consideration I don't think it's worth merging them. Sure, we save 2 lines in each test, but then anyone reading it is going to have to look up what the combo does. I'd rather keep them listed like this for clarity (and for later adding new platforms?).

Also I was wrong, the test does check for non-MTE systems. If the tracee prints (nil) for the buffer, that means it's not an MTE system.
(We can't use the isAArch64MTE call in this type of test.)

  • Use RemoveNonAddressBits instead of RemoveLogicalTag
  • Make the top byte of test buffer pointer non zero.

As noted, setting the top byte doesn't prove much,
but it means that if a future kernel gets more strict,
we're already coping with that.

DavidSpickett marked 2 inline comments as done.Mar 8 2021, 8:19 AM
DavidSpickett added inline comments.

Added a comment in the code too.

  • Comments for the MemoryTaggingDetails struct
DavidSpickett marked 2 inline comments as done.Mar 8 2021, 8:27 AM
DavidSpickett added inline comments.
omjavaid added inline comments.Mar 9 2021, 5:50 AM

An infinite loop in the test program may not be a good idea.

omjavaid added inline comments.Mar 9 2021, 6:13 AM

This means emitting fewer than the requested number of tags is legit. However, we have set the tags vector size equal to whatever we requested. We set an error string which is actually not being used anywhere; it can be removed in favor of a log message to help with debugging if needed.

Also, we need to resize the vector after the ptrace request so that we use this size in the gdb remote reply.

omjavaid added inline comments.Mar 9 2021, 6:22 AM

Is there a difference between Granule and GranuleSize?


Do we test the partial read case here?

DavidSpickett marked an inline comment as done.Mar 9 2021, 8:00 AM
DavidSpickett added inline comments.Mar 9 2021, 8:28 AM

Granule is used to mean the current granule you're in. So if you were at address 0x10, you'd be in the [0x10, 0x20) granule.
GranuleSize is the size of each granule.

If I saw manager->GetGranule, I'd expect it to be something like std::pair<addr_t, addr_t> GetGranule(addr_t addr);.
As in, tell me which granule I'm in.

Though I admit this is more an issue of "ExpandToGranule" not being clear enough, rather than "GetGranuleSize" being too redundant.
"AlignToGranule(s)" perhaps? But then you ask "align how", hence "ExpandTo". Anyway.
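A sketch of what that hypothetical GetGranule might look like (illustrative only; no such function exists in the patch):

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// "Tell me which granule I'm in": return the [begin, end) bounds of
// the 16-byte MTE granule containing addr.
std::pair<uint64_t, uint64_t> get_granule(uint64_t addr) {
  uint64_t begin = addr & ~UINT64_C(15);
  return {begin, begin + 16};
}
```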


I'll log that error in GDBRemoteCommunicationServerLLGS.cpp.

I'll do what you suggested to support partial reads on the server side. Then the lldb client can error if it doesn't get what it expected.
(Testing partial reads on the lldb side is going to be very difficult anyway, since we'd need a valid smaps entry to even issue one.)


Ack. No, but it should be a case of reading off the end of the allocated buffer by some amount.


I'll check what the timeouts are on the expect packet sequence. I think it would get killed eventually if we didn't see the output we're looking for.
(I could do something more CPU friendly, sleep/halt/wait on something)

omjavaid added inline comments.Mar 10 2021, 12:02 AM

Right, I guess we can stay with the current nomenclature. Thanks for the detailed explanation.


If we are following an approach similar to the m/M gdb remote packets for tags, then it's OK to read partial data in case part of the memory in the requested address range was inaccessible. Maybe make an appropriate adjustment to the command output; I don't recall what we currently emit for partial memory reads, but it should be something like <tags not available>.


In the past I have seen the LLDB buildbot sometimes piling up executables which Python couldn't clean up for whatever reason. It's better if the executable can safely exit within a reasonable period.

DavidSpickett added inline comments.Mar 15 2021, 4:03 AM

I did some digging, and lldb-server does not return partial data when a read fails:

for (bytes_read = 0; bytes_read < size; bytes_read += remainder) {
  Status error = NativeProcessLinux::PtraceWrapper(
      PTRACE_PEEKDATA, GetID(), (void *)addr, nullptr, 0, &data);
  if (error.Fail())
    return error;

  remainder = size - bytes_read;
  // ... copy up to one word of data into dst ...
  dst += k_ptrace_word_size;
}
return Status();

I was able to test partial writes too. There too, we don't attempt to restore if we only wrote a smaller amount; writing less than the total is a failure.

However, it is true that I'm not handling the syscall properly. I need to loop like ReadMemory does, so I'm going to do that:
loop until we've read all the tags, or return the error we get.

omjavaid added inline comments.Mar 15 2021, 4:54 AM

Considering the gdb remote protocol, this is not complying with the 'm' packet, which says:
"The reply may contain fewer addressable memory units than requested if the server was able to read only part of the region of memory."

Maybe we can fix this in a separate patch, where LLDB should emit a proper error code based on which we can either completely fail or send partial read data.
What do you think?

DavidSpickett added inline comments.Mar 15 2021, 5:42 AM

That would be the ideal thing to do; however, I was wrong about lldb not supporting it at all. In fact, memory read can do partial results:

<step to clear caches>
(lldb) memory read 0x0000fffff7ff7000-16
lldb             <  34> send packet: $qMemoryRegionInfo:fffff7ff9010#df
lldb             <  55> read packet: $start:fffff7ff9000;size:1000;permissions:rw;flags:;#fa
lldb             <  50> send packet: $Mfffff7ff9010,f:000000000000000000000000000000#84
lldb             <   6> read packet: $OK#9a
lldb             <  34> send packet: $qMemoryRegionInfo:fffff7ff9010#df
lldb             <  55> read packet: $start:fffff7ff9000;size:1000;permissions:rw;flags:;#fa
lldb             <  36> send packet: $Mfffff7ff9010,8:0000000000000000#b6
lldb             <   6> read packet: $OK#9a
lldb             <  34> send packet: $qMemoryRegionInfo:fffff7ff9010#df
lldb             <  55> read packet: $start:fffff7ff9000;size:1000;permissions:rw;flags:;#fa
lldb             <  36> send packet: $Mfffff7ff9010,8:f06ffff7ffff0000#a9
lldb             <   6> read packet: $OK#9a
lldb             <  21> send packet: $xfffff7ff9000,200#00
lldb             < 516> read packet: $<512 bytes>#53
lldb             <  21> send packet: $xfffff7ff6e00,200#32
lldb             < 516> read packet: $<512 bytes>#0a
lldb             <  21> send packet: $xfffff7ff7000,200#fe    <<<< Fails to read the last 16 bytes
lldb             <   7> read packet: $E08#ad
0xfffff7ff6ff0: 01 02 03 04 00 00 00 00 00 00 00 00 00 00 00 00  ................
0xfffff7ff7000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
warning: Not all bytes (16/32) were able to be read from 0xfffff7ff6ff0.

Except we're not doing it by getting a smaller reply (not in this example anyway); it's working because we split up the reads in such a way that they tend to line up with the failing addresses.

So yeah it's probably worth fixing from a correctness point of view but for lldb to lldb-server we've got an equivalent already.

As for MTE, would you be OK with not allowing partial reads, since the spec as proposed does not mention them?
(What I said before, but just to be sure.)

DavidSpickett updated this revision to Diff 330684.EditedMar 15 2021, 9:10 AM

Correctly handle the ptrace call by looping until we
get all the tags.

I've gone ahead and treated anything less than all tags
as an error as far as lldb-server is concerned.

Tell me what you think of that.

DavidSpickett added inline comments.Mar 15 2021, 9:15 AM

This has been added at the end of the tests here.

DavidSpickett marked 3 inline comments as done.
  • Remove extra newline after test decorator

@omjavaid Current status is that a partial read of memory tags is converted into an error by lldb-server. I think this is justifiable given that the GDB protocol as it stands doesn't describe how to communicate a partial read (though we could make a reasonable guess, I'm sure). Plus, lldb itself will be looking ahead via memory regions, so it won't issue requests that would fail. (Fail by reason of the address range, at least.)

Sound good to you? I'll be applying the same logic to writing tags.

  • Rebase onto main
  • skipUnlessAArch64MTELinuxCompiler was committed already, so it's not added here, just used by the new tests

Rebase onto main. (switch to m_current_process)

It looks good to me, but I have some final minor nits inline.


Maybe rename this to tags_iovec? Reading the code below, I had the tags vector in mind and confused it with the vector that was passed as a parameter to this function.


may be use ?


This loop condition is a little fishy. num_tags is unsigned, which means if by chance it doesn't end up going to zero, we'll keep looping forever.


Just a side question about TBI: for memory reads/writes or tag queries, is it necessary to send non-address bits to the remote (tags and pauth masks)? Can we instead clear these bits before sending the address over, using the code/data masks we have calculated in our host process class?

DavidSpickett added inline comments.May 6 2021, 1:34 AM

You shouldn't need to include them as the stripped address is equivalent as far as ptrace is concerned.

If you were going to JIT some code and use it to load from that address, then yes, we'd need the bits.

That said, I'm not sure lldb-server can make the assumption that they are stripped. At least for reading memory tags, I'll have to think about it since the protocol doesn't require the client to remove tag bits.

Address comments in general.

Added an assert in the loop where we call ptrace
so that we catch any unexpected condition that would
make it an infinite loop.

DavidSpickett marked 2 inline comments as done.May 13 2021, 5:56 AM
DavidSpickett added inline comments.

I couldn't see another loop condition that made sense to use, so I've added an assert below:

assert(tags_read && (tags_read <= num_tags));

If num_tags was 0 we'd never enter the loop in the first place. Then we assert that if there was no error, the kernel returned at least 1 tag and no more tags than we asked for.

This should prevent num_tags wrapping around 0 and causing an infinite loop.
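The overall loop-until-done shape could be sketched like this. It is a simplified model: TagReader stands in for one PTRACE_PEEKMTETAGS call and is hypothetical; the real code lives in NativeProcessLinux.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// One "ptrace" call: appends up to max_tags tags to out and returns
// how many were read, or 0 on error. Stand-in for PTRACE_PEEKMTETAGS.
using TagReader = std::function<size_t(
    uint64_t addr, size_t max_tags, std::vector<uint8_t> &out)>;

// Loop until every requested tag arrives; per the review's decision,
// anything short of num_tags overall is treated as failure.
bool read_all_tags(TagReader reader, uint64_t addr, size_t num_tags,
                   std::vector<uint8_t> &tags) {
  tags.clear();
  while (tags.size() < num_tags) {
    size_t remaining = num_tags - tags.size();
    size_t got = reader(addr + tags.size() * 16, remaining, tags);
    if (!got)
      return false; // error or no progress
    // Catch any condition that would make this an infinite loop,
    // mirroring the assert added in the patch.
    assert(got <= remaining);
  }
  return true;
}
```

If num_tags is 0 the loop body never runs, and the assert guarantees each iteration makes forward progress without overshooting, so the unsigned arithmetic cannot wrap.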

pcc added a subscriber: pcc.May 25 2021, 5:04 PM
pcc added inline comments.

Rebase, which brings in the header pcc mentioned.

DavidSpickett marked an inline comment as done.May 26 2021, 3:41 AM

I was looking at the wrong file, this adds the header.

omjavaid accepted this revision.May 26 2021, 6:00 AM
This revision is now accepted and ready to land.May 26 2021, 6:00 AM

Rebase, this is now ready to land.

DavidSpickett edited the summary of this revision. (Show Details)Jun 17 2021, 8:14 AM
This revision was landed with ongoing or failed builds.Jun 24 2021, 9:03 AM
This revision was automatically updated to reflect the committed changes.