This is an archive of the discontinued LLVM Phabricator instance.

[lldb] Remove non address bits from memory read arguments
ClosedPublic

Authored by DavidSpickett on Jun 3 2021, 9:09 AM.

Details

Summary

Addresses on AArch64 can have top byte tags, memory tags and pointer
authentication signatures in the upper bits.

While testing memory tagging I found that memory read couldn't
read a range if the two addresses had different tags. The same
could apply to signed pointers given the right circumstance.

(lldb) memory read mte_buf_alt_tag mte_buf+16
error: end address (0x900fffff7ff8010) must be greater than the start
address (0xa00fffff7ff8000).

Or it would try to read a lot more memory than expected.

(lldb) memory read mte_buf mte_buf_alt_tag+16
error: Normally, 'memory read' will not read over 1024 bytes of data.
error: Please use --force to override this restriction just once.
error: or set target.max-memory-read-size if you will often need a
larger limit.

Fix this by removing non address bits before we calculate the read
range. A test is added for AArch64 Linux that confirms this by using
the top byte ignore feature.

This means that if you do read with a tagged pointer the output
does not include those tags. This is potentially confusing but I think
overall it's better that we don't pretend that we're reading memory
from a range that the process is unable to map.

(lldb) p ptr1
(char *) $4 = 0x3400fffffffff140 "\x80\xf1\xff\xff\xff\xff"
(lldb) p ptr2
(char *) $5 = 0x5600fffffffff140 "\x80\xf1\xff\xff\xff\xff"
(lldb) memory read ptr1 ptr2+16
0xfffffffff140: 80 f1 ff ff ff ff 00 00 38 70 bc f7 ff ff 00 00 ........8p......

Diff Detail

Event Timeline

DavidSpickett created this revision.Jun 3 2021, 9:09 AM
DavidSpickett requested review of this revision.Jun 3 2021, 9:09 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 3 2021, 9:09 AM

The side effect here is that you do "memory read <tagged ptr>" and you see untagged addresses for the lines. It's not really that confusing but maybe something we should make a general decision about.

Same thing applies to the previous "memory region" patch.

(lldb) p ptr1
(char *) $1 = 0x344dfffffffffcb0 ""
(lldb) memory read ptr1
0xfffffffffcb0: f0 05 40 00 00 00 00 00 00 00 00 00 00 00 00 00  ..@.............
0xfffffffffcc0: e0 fc ff ff ff ff 00 00 54 f0 e7 f7 ff ff 00 00  ........T.......
(lldb) memory region ptr1
[0x0000fffffffdf000-0x0001000000000000) rw- [stack]

Personally it doesn't bother me, but then again I know what all the upper bits are doing anyway.

omjavaid requested changes to this revision.Jun 16 2021, 4:51 PM

This doesnt build ... you need to include #include "lldb/Target/ABI.h" in lldb/source/Commands/CommandObjectMemory.cpp

lldb/test/API/linux/aarch64/tagged_memory_read/TestAArch64LinuxTaggedMemoryRead.py
23

May be consider renaming this test case.

24

This condition will always be true because we havent yet connected to remote platform.

This revision now requires changes to proceed.Jun 16 2021, 4:51 PM
pcc added a subscriber: pcc.Jun 16 2021, 4:57 PM

The side effect here is that you do "memory read <tagged ptr>" and you see untagged addresses for the lines. It's not really that confusing but maybe something we should make a general decision about.

Same thing applies to the previous "memory region" patch.

(lldb) p ptr1
(char *) $1 = 0x344dfffffffffcb0 ""
(lldb) memory read ptr1
0xfffffffffcb0: f0 05 40 00 00 00 00 00 00 00 00 00 00 00 00 00  ..@.............
0xfffffffffcc0: e0 fc ff ff ff ff 00 00 54 f0 e7 f7 ff ff 00 00  ........T.......
(lldb) memory region ptr1
[0x0000fffffffdf000-0x0001000000000000) rw- [stack]

Personally it doesn't bother me, but then again I know what all the upper bits are doing anyway.

In Android tombstones we display the memory tag inline with the address, e.g.

memory near x8 ([anon:scudo:primary]):
    00000079fd11a1d0 0000000000000000 0000000000000000  ................
    00000079fd11a1e0 84e4000000020402 000037ecb699a1b3  .............7..
    0f000079fd11a1f0 0000000000000000 0000000000000000  ................
    0f000079fd11a200 0000000000000000 0000000000000000  ................
    00000079fd11a210 0000000000000000 0000000000000000  ................
    00000079fd11a220 0000000000000000 0000000000000000  ................
    00000079fd11a230 0000000000000000 0000000000000000  ................
    00000079fd11a240 0000000000000000 0000000000000000  ................
    00000079fd11a250 0000000000000000 0000000000000000  ................
    00000079fd11a260 0000000000000000 0000000000000000  ................
    00000079fd11a270 0000000000000000 0000000000000000  ................
    00000079fd11a280 0000000000000000 0000000000000000  ................
    00000079fd11a290 0000000000000000 0000000000000000  ................
    00000079fd11a2a0 0000000000000000 0000000000000000  ................
    00000079fd11a2b0 0000000000000000 0000000000000000  ................
    00000079fd11a2c0 0000000000000000 0000000000000000  ................

Maybe this is something worth considering for LLDB memory dumps?

Maybe this is something worth considering for LLDB memory dumps?

I'm working on that at the moment, it's on my github branch. This is what it looks like with the right options: (others look weird at the moment due to ordering issues)

(lldb) memory read mte_buf mte_buf+32 -f "x" -l 1 -s 16
0xfffff7ff6000: 0x00000000000000000000000000000000 (tag: 0x0)
0xfffff7ff6010: 0x00000000000000000000000000000000 (tag: 0x1)

The format is mostly a guess on my part, given that I'm mostly testing the debugger itself not MTE enabled software.

  • Add missing include
  • Rename test
DavidSpickett marked an inline comment as done.Jun 18 2021, 3:36 AM
DavidSpickett added inline comments.
lldb/test/API/linux/aarch64/tagged_memory_read/TestAArch64LinuxTaggedMemoryRead.py
24

Runs fine for me using dotest:

$ ./bin/lldb-dotest --platform-name remote-linux <...> -p TestAArch64LinuxTaggedMemoryRead.py --arch aarch64

And this is how it's used elsewhere. Perhaps you're running the tests in a different way?

omjavaid accepted this revision.Jun 28 2021, 4:04 PM
omjavaid added inline comments.
lldb/test/API/linux/aarch64/tagged_memory_read/TestAArch64LinuxTaggedMemoryRead.py
24

This will be fixed by D105060

This revision is now accepted and ready to land.Jun 28 2021, 4:04 PM

Rebase onto main.

Matt added a subscriber: Matt.Aug 10 2021, 1:24 PM

Just to update where I am with this, I think I've got a good argument to not show the non-address bits in the output.

That being that the memory itself that we read doesn't include these bits, so showing it is just going to be more confusing that the initial confusion of the address changing. In addition for things like memory tagging once you move beyond the granule your original pointer points to, the actual tag stored in hardware could be different. So we're misleading in that way.

The same applies to signed pointers where incrementing the signing bits isn't a thing in any case, so the signature for foo and foo+4 is probably different anyway.

I think that makes sense but I don't want to just go ahead until I get other's opinions.

Rebase onto main

DavidSpickett edited the summary of this revision. (Show Details)Dec 10 2021, 1:59 AM

Since we're so close to the end of the year I'm going to give this more time and ping early next year in case of any more feedback. Though I'm feeling confident in the direction so far.

While working on this I realised that the API calls should also remove the non-address bits. We will still need to have this change, so that checking for inverted ranges still works. Changes to the underlying MemoryRead and API will be a separate review.

Add release note for this change.

Closer to release I'll see if it makes more sense to make a list of commands
that have had the same changes, but this stops me forgetting entirely.

Herald added a project: Restricted Project. · View Herald TranscriptDec 16 2021, 3:03 AM
danielkiss accepted this revision.Dec 16 2021, 4:40 AM

Maybe this is something worth considering for LLDB memory dumps?

I'm working on that at the moment, it's on my github branch. This is what it looks like with the right options: (others look weird at the moment due to ordering issues)

(lldb) memory read mte_buf mte_buf+32 -f "x" -l 1 -s 16
0xfffff7ff6000: 0x00000000000000000000000000000000 (tag: 0x0)
0xfffff7ff6010: 0x00000000000000000000000000000000 (tag: 0x1)

The format is mostly a guess on my part, given that I'm mostly testing the debugger itself not MTE enabled software.

That looks really good, could help a lot to understand tagging related issues. Looking forward for that patch :)

This change LGTM.

That looks really good, could help a lot to understand tagging related issues. Looking forward for that patch :)

https://reviews.llvm.org/D107140 is the end of the series for that if you want to see it. The test cases show various corner cases too.

Rebase, update some comment wording.

DavidSpickett retitled this revision from [lldb][AArch64] Remove non address bits from memory read arguments to [lldb] Remove non address bits from memory read arguments.Jan 11 2022, 5:22 AM
This revision was landed with ongoing or failed builds.Jan 11 2022, 5:24 AM
This revision was automatically updated to reflect the committed changes.

Since there was no great pushback on this I've landed it. With the final justification being that the non-address bits are properties of the pointer not of the memory pointed to.

I will do a follow up where I apply the same logic to the ReadMemory calls via the API.