This is an archive of the discontinued LLVM Phabricator instance.

[lldb] Add "memory tag write" command
ClosedPublic

Authored by DavidSpickett on Jun 30 2021, 4:06 AM.

Details

Summary

This adds a new command for writing memory tags.
It is based on the existing "memory write" command.

Syntax: memory tag write <address-expression> <value> [<value> [...]]
(where "value" is a tag value)

(lldb) memory tag write mte_buf 1 2
(lldb) memory tag read mte_buf mte_buf+32
Logical tag: 0x0
Allocation tags:
[0xfffff7ff9000, 0xfffff7ff9010): 0x1
[0xfffff7ff9010, 0xfffff7ff9020): 0x2

The range you are writing to will be calculated by
aligning the address down to a granule boundary then
adding as many granules as there are tags.

(a repeating mode with an end address will be in a follow
up patch)

This is why "memory tag write" uses MakeTaggedRange but has
some extra steps to get this specific behaviour.
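
As a rough illustration of the range calculation described above, here is a minimal Python sketch. It is not the lldb implementation; the function name `tag_write_range` is hypothetical, and the 16-byte granule is the MTE granule size seen in the example output.

```python
# Hypothetical sketch of the range calculation, not lldb's actual code.
MTE_GRANULE_SIZE = 16  # bytes covered by one MTE allocation tag


def tag_write_range(addr, num_tags, granule=MTE_GRANULE_SIZE):
    """Return the half-open range [start, end) covered by num_tags tags."""
    start = addr - (addr % granule)    # align the address down to a granule boundary
    end = start + num_tags * granule   # add one granule per tag supplied
    return start, end


# Two tags written at an unaligned address inside the first granule.
start, end = tag_write_range(0xfffff7ff9004, 2)
print(hex(start), hex(end))  # 0xfffff7ff9000 0xfffff7ff9020
```

Note that the end of the range depends only on the number of tags, which is why the command cannot simply reuse the "align both ends" behaviour used when reading tags.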

The command does all the usual argument validation:

  • Address must evaluate
  • You must supply at least one tag value (though lldb-server would just treat that as a nop anyway)
  • Those tag values must be valid for your tagging scheme (e.g. for MTE the value must be >= 0 and <= 0xf)
  • The calculated range must be memory tagged

That last error will show you the final range, not just
the start address you gave the command.

(lldb) memory tag write mte_buf_2+page_size-16 6
(lldb) memory tag write mte_buf_2+page_size-16 6 7
error: Address range 0xfffff7ffaff0:0xfffff7ffb010 is not in a memory tagged region

(note that we do not check if the region is writeable
since lldb can write to it anyway)
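
The validation steps listed above can be sketched roughly as follows. This is a hedged illustration only: `check_tag_write`, the region list, and the first two error strings are hypothetical, and the page at 0xfffff7ffa000 assumes a 4 KiB page size. Only the final error format is taken from the example output. MTE allocation tags are 4 bits wide, so the sketch accepts 0 through 0xf.

```python
# Hypothetical sketch of the argument validation, not lldb's actual code.
MTE_GRANULE_SIZE = 16
MTE_MAX_TAG = 0xF  # MTE allocation tags are 4 bits wide


def check_tag_write(addr, tags, tagged_regions):
    """Return None if the write looks valid, otherwise an error string.

    tagged_regions is a list of half-open (start, end) ranges that are
    memory tagged. For simplicity the whole write must fit in one region.
    """
    if not tags:
        return "error: at least one tag value is required"  # hypothetical wording
    for tag in tags:
        if not 0 <= tag <= MTE_MAX_TAG:
            return f"error: tag value {tag:#x} is not valid"  # hypothetical wording
    # Same calculation as the command: align down, then one granule per tag.
    start = addr - (addr % MTE_GRANULE_SIZE)
    end = start + len(tags) * MTE_GRANULE_SIZE
    if not any(lo <= start and end <= hi for lo, hi in tagged_regions):
        return (f"error: Address range {start:#x}:{end:#x} "
                "is not in a memory tagged region")
    return None


# Assumed layout: a single tagged page, with an untagged page after it.
regions = [(0xfffff7ffa000, 0xfffff7ffb000)]
print(check_tag_write(0xfffff7ffaff0, [6], regions))
print(check_tag_write(0xfffff7ffaff0, [6, 7], regions))
```

The second call fails because the second granule falls past the end of the tagged page, and the message reports the full calculated range rather than just the start address.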

The read and write tag tests have been merged into
a single set of "tag access" tests as their test programs would
have been almost identical.
(also I have renamed some of the buffers to better
show what each one is used for)

Event Timeline

DavidSpickett requested review of this revision. Jun 30 2021, 4:06 AM
DavidSpickett created this revision.
Herald added a project: Restricted Project. Jun 30 2021, 4:06 AM

Now that ranges are handled in the tag manager, the command uses
MakeTaggedRange, with some extra steps to get slightly different
alignment.

DavidSpickett edited the summary of this revision. Jul 14 2021, 1:55 AM
omjavaid accepted this revision. Jul 23 2021, 2:48 AM

LGTM after addressing comment above.

lldb/test/API/linux/aarch64/mte_tag_access/main.c (line 10)

I think we should add a link to the relevant information used to write this file. It makes use of arm_acle and MTE-specific definitions, which should be explained, or at least a link to the relevant information should be given at the top for someone looking for an explanation.

This revision is now accepted and ready to land. Jul 23 2021, 2:48 AM

Add a link to intrinsics documentation in the test file.

DavidSpickett marked an inline comment as done. Jul 27 2021, 9:00 AM
This revision was landed with ongoing or failed builds. Jul 28 2021, 2:12 AM
This revision was automatically updated to reflect the committed changes.
jrtc27 added a subscriber: jrtc27. Aug 4 2021, 6:04 PM

I'm concerned by the generality of the command "memory tag". Many different types of memory tagging exist, MTE is but one. CHERI uses memory tagging for something completely different (tracking valid capability, ie pointer provenance), and its tags make sense to read (though are a property of the stored data, not the memory allocation), but not explicitly write (only implicitly by writing a capability to the associated location, which will include the non-addressable tag bit), as that is architecturally forbidden in order to ensure that the hardware-enforced tags can never be corrupted by software. Moreover, in future, CHERI and MTE will be composed (there is ongoing experimental work to investigate doing so on CHERI-RISC-V, and if Arm's experimental Morello prototype is adopted in a future version of the Arm architecture it will also have to compose with MTE). See also the core dump format for MTE tags, which specifies the _type_ of tag, specifically to accommodate other uses of tagged memory.

Many different types of memory tagging exist, MTE is but one.

Sure. I'm not opposed to making changes to adapt to the properties of different tags. However, the most concrete one I have access to is MTE, so that has shaped the initial support and assumptions.

CHERI uses memory tagging for something completely different (tracking valid capability, ie pointer provenance), and its tags make sense to read (though are a property of the stored data, not the memory allocation), but not explicitly write (only implicitly by writing a capability to the associated location, which will include the non-addressable tag bit), as that is architecturally forbidden in order to ensure that the hardware-enforced tags can never be corrupted by software.

Definitely we could add properties to the tagging scheme information to disable the read and/or write commands. Something like:

(lldb) memory tag write 0x12341234 23
Error: Cannot write memory tags for address 0x12341234. <foo> tags are not writeable by user software

If you mean hiding the commands completely from the interactive sessions, there's no existing mechanism for it but you could make the argument for it. The current pattern is to show the command but error saying that it is unsupported. (this also happens when a remote doesn't support some feature)
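
One way to picture such per-scheme properties is the Python sketch below. Everything in it is hypothetical (the `TagScheme` type, its fields, and `write_tags` are not lldb APIs); it only illustrates the "show the command, then error if unsupported" pattern, using the error wording suggested above.

```python
# Hypothetical sketch of per-scheme read/write properties, not lldb's actual code.
from dataclasses import dataclass


@dataclass(frozen=True)
class TagScheme:
    name: str
    readable: bool
    writable: bool


def write_tags(scheme, addr, tags):
    """Return None on success, or an error string if the scheme forbids writes."""
    if not scheme.writable:
        return (f"Error: Cannot write memory tags for address {addr:#x}. "
                f"{scheme.name} tags are not writeable by user software")
    # A real implementation would send the tags to the target here.
    return None


# MTE tags can be written; CHERI tags are readable but not explicitly writable.
mte = TagScheme("MTE", readable=True, writable=True)
cheri = TagScheme("CHERI", readable=True, writable=False)

print(write_tags(mte, 0x12341234, [2]))
print(write_tags(cheri, 0x12341234, [1]))
```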

Moreover, in future, CHERI and MTE will be composed (there is ongoing experimental work to investigate doing so on CHERI-RISC-V, and if Arm's experimental Morello prototype is adopted in a future version of the Arm architecture it will also have to compose with MTE). See also the core dump format for MTE tags, which specifies the _type_ of tag, specifically to accommodate other uses of tagged memory.

Certainly. There's nothing stopping us supporting such a configuration apart from (at least personally) zero experience or way to test such a thing. Something like:

(lldb) memory read foo
Error: Current process has multiple memory tag types. "foo", "bar". Please select one with the --type argument.

It's not something I could really implement with just MTE as the vast majority of the code would be untested but with a use case we could definitely make the changes.
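
For illustration only, the selection logic behind such an error might look like the Python sketch below. The function `select_tag_type` and its arguments are hypothetical, and the `--type` option does not exist in lldb at the time of this review; the sketch just shows that a single tag type needs no extra argument while several require an explicit choice.

```python
# Hypothetical sketch of choosing a tag type, not lldb's actual code.
def select_tag_type(available, requested=None):
    """Pick the tag type to operate on, or raise ValueError with a message."""
    if not available:
        raise ValueError("Error: Current process has no memory tag types.")
    if requested is not None:
        if requested not in available:
            raise ValueError(f'Error: Unknown memory tag type "{requested}".')
        return requested
    if len(available) == 1:
        return available[0]  # unambiguous, so no --type is needed
    names = ", ".join(f'"{name}"' for name in available)
    raise ValueError(
        f"Error: Current process has multiple memory tag types. {names}. "
        "Please select one with the --type argument."
    )


print(select_tag_type(["mte"]))
print(select_tag_type(["foo", "bar"], requested="bar"))
try:
    select_tag_type(["foo", "bar"])
except ValueError as err:
    print(err)
```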

Where this does become more thorny is the lldb API where we have more stringent commitments to not changing it. Happily, I haven't started on this yet and your concerns very much need to be included.

Perhaps you could write up something and post it to the lldb-dev list for more visibility? I'd be interested to know how things work in the existing lldb cheri port (I assume there is one, there is for Morello at least).

Many different types of memory tagging exist, MTE is but one.

Sure. I'm not opposed to making changes to adapt to the properties of different tags. However, the most concrete one I have access to is MTE, so that has shaped the initial support and assumptions.

CHERI uses memory tagging for something completely different (tracking valid capability, ie pointer provenance), and its tags make sense to read (though are a property of the stored data, not the memory allocation), but not explicitly write (only implicitly by writing a capability to the associated location, which will include the non-addressable tag bit), as that is architecturally forbidden in order to ensure that the hardware-enforced tags can never be corrupted by software.

Definitely we could add properties to the tagging scheme information to disable the read and/or write commands. Something like:

(lldb) memory tag write 0x12341234 23
Error: Cannot write memory tags for address 0x12341234. <foo> tags are not writeable by user software

If you mean hiding the commands completely from the interactive sessions, there's no existing mechanism for it but you could make the argument for it. The current pattern is to show the command but error saying that it is unsupported. (this also happens when a remote doesn't support some feature)

Yeah that kind of UI thing isn't really my concern.

Moreover, in future, CHERI and MTE will be composed (there is ongoing experimental work to investigate doing so on CHERI-RISC-V, and if Arm's experimental Morello prototype is adopted in a future version of the Arm architecture it will also have to compose with MTE). See also the core dump format for MTE tags, which specifies the _type_ of tag, specifically to accommodate other uses of tagged memory.

Certainly. There's nothing stopping us supporting such a configuration apart from (at least personally) zero experience or way to test such a thing. Something like:

(lldb) memory read foo
Error: Current process has multiple memory tag types. "foo", "bar". Please select one with the --type argument.

It's not something I could really implement with just MTE as the vast majority of the code would be untested but with a use case we could definitely make the changes.

My concern with that would just be that, if MTE is all that most users have access to for a while, various things will be written assuming memory tag means MTE, and if they run them on an MTE-less CHERI system they will do surprising things and users will get confused. But maybe that doesn't matter and the benefit of having a shorter command when you only have one tag type outweighs that (though LLDB is no stranger to verbose commands!).

Where this does become more thorny is the lldb API where we have more stringent commitments to not changing it. Happily, I haven't started on this yet and your concerns very much need to be included.

Yes, I do think it's important that the API always be explicit in what type of tag is used, presumably with an initially single-member enum.

Perhaps you could write up something and post it to the lldb-dev list for more visibility? I'd be interested to know how things work in the existing lldb cheri port (I assume there is one, there is for Morello at least).

I'll see if I can manage that over the weekend. FWIW, there is no CHERI LLDB, only a Morello LLDB, as our resident debugger expert is familiar with, and contributes to, GDB (and that's what we're also more familiar with using), though we have talked about wanting one. We don't currently have support for reading tagged memory in CHERI GDB, and I don't think Arm's Morello LLDB does currently either. They're both fairly limited in their functionality, as we have other more pressing things to be working on.

lldb/test/API/linux/aarch64/mte_tag_access/main.c