This is an archive of the discontinued LLVM Phabricator instance.

[lldb][AArch64] Add support for SME's SVE streaming mode registers
ClosedPublic

Authored by DavidSpickett on Jul 11 2023, 1:08 AM.

Details

Summary

The Scalable Matrix Extension (SME) adds a new Scalable Vector mode
called "streaming SVE mode".

In this mode a lot of things change, but my understanding overall
is that this mode assumes you are not going to move data out of
the vector unit very often or read flags.

Based on "E1.3" of "Arm® Architecture Reference Manual Supplement,
The Scalable Matrix Extension (SME), for Armv9-A".

https://developer.arm.com/documentation/ddi0616/latest/

The important details for debug are that this adds another set
of SVE registers. This set is only active when we are in streaming
mode and is read from a new ptrace regset NT_ARM_SSVE.
We are able to read the header of either mode at all times but
only one will be active and contain register data.

For this reason, I have reused the existing SVE state. Streaming
mode is just another mode value attached to that state.
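
As a rough illustration of the point above, here is a minimal Python sketch of choosing the active mode from the two regset reads. It is a simplified model, not lldb's code: `RegsetRead`, `SVEMode` and `active_mode` are hypothetical names, and a real implementation would read NT_ARM_SVE and NT_ARM_SSVE via ptrace. The sketch assumes that an inactive mode returns only a header and that "no data in either regset" can be treated as disabled.

```python
from dataclasses import dataclass
from enum import Enum


class SVEMode(Enum):
    # Hypothetical names for this sketch; lldb's own state enum may differ.
    DISABLED = "disabled"
    NON_STREAMING = "non-streaming (NT_ARM_SVE)"
    STREAMING = "streaming (NT_ARM_SSVE)"


@dataclass
class RegsetRead:
    """Simplified result of reading one regset: header info plus optional data."""
    vector_length: int          # in bytes, always available from the header
    register_data: bytes = b""  # empty when this mode is inactive


def active_mode(sve, ssve):
    """Pick the mode whose regset carried register data.

    Either argument may be None if that regset could not be read at all.
    """
    if ssve is not None and ssve.register_data:
        return SVEMode.STREAMING
    if sve is not None and sve.register_data:
        return SVEMode.NON_STREAMING
    # Assumption of this sketch: no data anywhere is treated as disabled.
    return SVEMode.DISABLED


# Both headers are readable, but only the streaming set carries data here.
sve = RegsetRead(vector_length=32)
ssve = RegsetRead(vector_length=64, register_data=bytes(64))
print(active_mode(sve, ssve))
```

Because only one mode is ever active, a single set of state plus a mode value like this is enough to describe what the debugger sees.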

The streaming mode registers do not have different names in the
architecture, so I do not plan to allow users to read or write the
inactive mode's registers. "z0" will always mean "z0" of the active
mode.

Ptrace does allow reading inactive modes, but the data is of little
use. Writing to inactive modes will switch to that mode which would
not be what a debugger user would expect. So lldb will do neither.

Existing SVE tests have been updated to check streaming mode and
mode switches. However, we are limited in what we can check given
that state for the other mode is invalidated on mode switch.

The only way to know what mode you are in for testing purposes would
be to execute a streaming-only or non-streaming-only instruction in
the opposite mode. However, the CPU feature smefa64 actually allows
all non-streaming mode instructions in streaming mode.

This is enabled by default in QEMU emulation and, rather than mess
about trying to disable it, I'm just going to use the pseudo streaming
control register added in a later patch to make these tests more
robust.

A new test has been added to check SIMD read/write from all the modes
as there is a subtlety there that needs noting, though lldb
doesn't have to make extra effort to do so.

If you are in streaming mode and write to v0, when you later exit
streaming mode that value may not be in the non-streaming state.
This can depend on how the core works but is a valid behaviour.

For example, say I am stopped here:
mov x0, v0.d[0]

And I want to update v0 in lldb. "register write v0 ..." should update
the v0 that this instruction is about to see. Not the potential other
copy of v0 in the non-streaming state (which is what I attempted in
earlier versions of this patch).

Not to mention, switching out of streaming mode here would be unexpected
and difficult to signal to the user.

Event Timeline

DavidSpickett created this revision. Jul 11 2023, 1:08 AM
Herald added a project: Restricted Project. Jul 11 2023, 1:08 AM
DavidSpickett requested review of this revision. Jul 11 2023, 1:08 AM

Feel free to bombard me with questions about SME, if it's quicker than reading the entire spec yourself. It has some quirks to it for sure.

SIMD registers must be read and written via the SVE regset when in SSVE mode. Writing to them exits streaming mode.

In my testing it did actually work if you always used the current mode. I think that's an artifact of QEMU or the kernel's implementation choices though, and perhaps the data is actually stale in certain circumstances.

Matt added a subscriber: Matt. Jul 11 2023, 10:40 AM

I have no experience with the Linux support, so I'm not an ideal person to review, but when I was reading about watchpoints I saw the caveats about SSVE mode and false watchpoint hits, so I read through the patch out of curiosity about SME/SSVE. You mention a couple of times that when in SSVE mode, we can only write to the SVE registers and doing so forces the processor out of SSVE Mode. We can read the SSVE registers while in SSVE mode though right?

lldb/source/Plugins/Process/Linux/NativeRegisterContextLinux_arm64.cpp
500–503

It's super minor, but I think you mean to describe two states here, and it sounds like four. Maybe "which can be non-streaming (SVE), or streaming (SSVE)" is the best I can come up with.

502

skipped

You mention a couple of times that when in SSVE mode, we can only write to the SVE registers and doing so forces the processor out of SSVE Mode. We can read the SSVE registers while in SSVE mode though right?

We are always able to read the header for either mode, which tells us the vector length among other useful metadata. Register data though...

Maybe I should just enumerate the states.

Write to SVE while in SSVE - switch to SVE mode
Write to SSVE while in SVE - switch to SSVE mode
Write to SVE while in SVE - no change
Write to SSVE while in SSVE - no change

Read SVE while in SSVE - no register data returned
Read SSVE while in SVE - no register data returned
Read SSVE while in SSVE - SSVE registers returned
Read SVE while in SVE - SVE registers returned
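
The rules above can be sketched as a toy state machine. This is only a model of the behaviour listed, not real ptrace code: the `Tracee` class and its methods are hypothetical, and resetting the registers on a mode switch is this sketch's stand-in for the other mode's state being invalidated.

```python
class Tracee:
    """Toy model of the read/write rules listed above (not real ptrace)."""

    def __init__(self, mode="SVE"):
        self.mode = mode  # "SVE" or "SSVE": the currently active mode
        self.registers = {"z0": 0}

    def read(self, regset):
        # Reading the inactive regset returns no register data (modelled as
        # None). Reading never changes the mode.
        if regset != self.mode:
            return None
        return dict(self.registers)

    def write(self, regset, values):
        # Writing the inactive regset switches the tracee into that mode.
        # The previous mode's state is lost (modelled here by a reset).
        if regset != self.mode:
            self.mode = regset
            self.registers = {"z0": 0}
        self.registers.update(values)


tracee = Tracee(mode="SSVE")
print(tracee.read("SVE"))    # inactive mode: no register data
tracee.write("SVE", {"z0": 1})
print(tracee.mode)           # the write switched modes
```

In this model a debugger that only ever reads and writes the active regset never causes a mode switch, which is the behaviour lldb aims for.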

Then we have SIMD (v0-31) which is a bit of a wrench in this. We must read SIMD via the SVE regset even while SSVE is active, but writing to that same set brings us out of SSVE mode (into SIMD mode, I think, but it could just be SVE mode).

So perhaps you see from that why I don't plan to use the mode switching this way in lldb.

In follow-on patches I will report the current mode via a pseudo control register, and likewise report the streaming vector length via a pseudo register, to resolve the ambiguity of the registers in both modes having the same names.

Then we have SIMD (v0-31) which is a bit of a wrench in this. We must read SIMD via the SVE regset even while SSVE is active, but writing to that same set brings us out of SSVE mode (into SIMD mode, I think, but it could just be SVE mode).

In my testing it did work to just use the active mode instead but this is one of those things that is likely unintentional or a result of QEMU's implementation choices.

Ah, I understand, thanks for the explanation. What an unusual feature.

omjavaid added inline comments. Jul 16 2023, 11:42 PM
lldb/source/Plugins/Process/Linux/NativeRegisterContextLinux_arm64.cpp
224

By "that set", do you mean NT_ARM_SVE or NT_ARM_SSVE?

403

Shouldn't we also invalidate all other dynamic regsets here?

lldb/source/Plugins/Process/Linux/NativeRegisterContextLinux_arm64.h
119

When we are in streaming mode, will the normal state data be invalid? If so, can we convert this into a pointer that points to the valid state data based on the current state?

Turns out I was misinterpreting this sentence from the kernel docs:

Note that when SME is present and streaming SVE mode is in use the FPSIMD subset of registers will be read via NT_ARM_SVE and NT_ARM_SVE writes will exit streaming mode in the target.

https://kernel.org/doc/html/v6.2/arm64/sve.html

I read this as "should be read", not "will be read". The intent of the statement
is to make you aware that the register sets are connected, in that one can affect
the other.

However, our strategy of using the bottom part of the Z registers to read the V
registers is still valid as long as we do not want to switch modes, which we never
do.
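
To illustrate the strategy, here is a minimal Python sketch of treating a V register as the bottom 128 bits of the matching Z register. The helper names `read_v` and `write_v` are hypothetical and this is not lldb's implementation. It assumes the register bytes are stored little-endian, so the low 128 bits are the first 16 bytes, and it leaves the upper bytes of Z unchanged on a write, which is a choice made for this sketch only.

```python
V_REGISTER_BYTES = 16  # Advanced SIMD (V) registers are 128 bits wide


def read_v(z_value: bytes) -> bytes:
    """Return the V register view: the low 128 bits of a Z register."""
    if len(z_value) < V_REGISTER_BYTES:
        raise ValueError("a Z register is at least 16 bytes")
    return z_value[:V_REGISTER_BYTES]


def write_v(z_value: bytes, v_value: bytes) -> bytes:
    """Return a new Z value with its low 128 bits replaced by v_value."""
    if len(v_value) != V_REGISTER_BYTES:
        raise ValueError("a V register is exactly 16 bytes")
    if len(z_value) < V_REGISTER_BYTES:
        raise ValueError("a Z register is at least 16 bytes")
    # Sketch-only choice: bytes above the low 128 bits are kept as they were.
    return v_value + z_value[V_REGISTER_BYTES:]


# Example with a 32 byte (256 bit) vector length.
z0 = bytes(range(32))
z0 = write_v(z0, b"\xff" * V_REGISTER_BYTES)
print(read_v(z0).hex())
```

Since the data comes from whichever regset is already active, neither the read nor the write needs to touch the other mode's regset.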

So I've done what Omair suggested and reverted to a single set of state for
the non-streaming and streaming modes, with m_sve_state to tell the difference.

DavidSpickett edited the summary of this revision. Jul 18 2023, 7:26 AM

Address some comments from the previous version.

DavidSpickett marked 3 inline comments as done. Jul 18 2023, 7:33 AM
DavidSpickett added inline comments.
lldb/source/Plugins/Process/Linux/NativeRegisterContextLinux_arm64.cpp
403

This is not needed anymore as we're no longer going to trigger a mode switch here. When we write to FPSIMD, we'll use the bottom 128 bits of the streaming mode Z register instead of going back to NT_ARM_SVE, which would cause a mode switch.

lldb/source/Plugins/Process/Linux/NativeRegisterContextLinux_arm64.h
119

I've reverted to the previous scheme.

DavidSpickett marked 4 inline comments as done. Jul 18 2023, 7:36 AM
DavidSpickett added inline comments.
lldb/test/API/commands/register/register/aarch64_sve_simd_registers/TestSVESIMDRegisters.py
2

I wrote this test when I thought we had to read/write SIMD via NT_ARM_SVE during streaming mode, so its main reason for existing has gone now.

However, I think it's worth keeping because no other test checks SIMD read/write as literally as this does. And if it turns out I did part of this incorrectly, we have tests to refer to.

On the off chance anyone was going to try and run this, you'll need a kernel that includes https://lore.kernel.org/lkml/20230713-arm64-fix-sve-sme-vl-change-v1-3-129dd8611413@kernel.org/T/. This fixes a bug found while writing these tests.

omjavaid accepted this revision. Jul 21 2023, 7:18 AM

This looks very good, much simpler than the first version. Just a minor nit.

lldb/source/Plugins/Process/Linux/NativeRegisterContextLinux_arm64.cpp
81–92

Should we test for SSVE first? That way we won't have to check for SVE once SSVE is found.

This revision is now accepted and ready to land. Jul 21 2023, 7:18 AM
DavidSpickett marked an inline comment as done. Jul 21 2023, 8:16 AM
DavidSpickett added inline comments.
lldb/source/Plugins/Process/Linux/NativeRegisterContextLinux_arm64.cpp
81–92

Good idea, but according to the architecture supplement:

If SME is implemented, this does not imply that FEAT_SVE and FEAT_SVE2 are implemented by the PE when it is not in Streaming SVE mode.

I don't know how realistic such a core is but at a glance Linux doesn't say it wouldn't support it. I'll leave this as it is on that basis.

DavidSpickett marked an inline comment as done.

Rebase.

I will land this and the next patch once the 17 branch has been taken. I don't want SME changes on 17 with A: not enough testing and B: missing features (ZA).