Page MenuHomePhabricator

AArch64 SVE register infos and core file support
ClosedPublic

Authored by omjavaid on Mar 30 2020, 3:24 AM.

Details

Summary

This patch adds support for AArch64 SVE register infos description and register access via ptrace.

AArch64 SVE is a an optional extension of Arm v8.3-a architecture which introduces 32 vector registers Z, 16 predicate P registers and FFR predicate register. These registers have fixed names but can dynamically be configured to different size based on underlying OS configuration.

This patch adds register info struct that describes SVE register infos. Also provides native register context routines to access SVE registers. It introduces a mechanism to configure SVE register size and offsets at startup before exchanging register information across gdb-remote process.

It makes use of linux kernel definitions copied into lldb/source/Plugins/Process/Linux/LinuxPTraceDefines_arm64sve.h for backward compatibility with sysroots which might yet not support SVE definitions.

There are two test cases added to the testsuite one of them checks register size configuration based on vg (vector granule register). Also a test is added which verifies registers can be read and written.

There is no physical hardware currently available to test SVE and we make use of QEMU for the purpose of testing this feature.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
omjavaid marked 6 inline comments as done.Jul 1 2020, 4:10 PM

There's always a chance for ODR violations, particularly with header files defining static objects. This looks better though what I really wanted was to keep the non-sve register info array (g_register_infos_arm64_le) completely out of the path of the sve header. Like, by using three headers, one defining g_register_infos_arm64_le (and stuff specific to that like the exception registers), one defining the sve array, and the third one which would contain the things common to both. Nonetheless, I think we can now move on to aspects of this patch. Please see my inline comments.

I am working to fix this with D80105 by defining dynamic register infos array based on register sets which will be requirement for Pointer Authentication and MTE. But I wanted to first get done with SVE and come back to register infos refactoring later.

lldb/source/Plugins/Process/Utility/NativeRegisterContextRegisterInfo.h
38–40

Ack. This function is used by chiild revisions which adds ptrace support. I ll move it there.

lldb/source/Plugins/Process/Utility/RegisterContextPOSIX_arm64.cpp
261 ↗(On Diff #274390)

Ack.

lldb/source/Plugins/Process/Utility/RegisterInfoInterface.h
64–68

Agreed. I ll make an arrangement for this in updated rev.

lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp
72–89 ↗(On Diff #274390)

Ack.

lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.h
53–55 ↗(On Diff #274390)

These should be private. I ll update this in updated rev.

76 ↗(On Diff #274390)

It was pulled in from NativeRegisterContextLinux_arm64 and got set mutable by mistake I ll correct in updated rev.

omjavaid updated this revision to Diff 275037.Jul 2 2020, 3:12 AM

@labath I have updated diff after resolving issues highlighted. Also I have removed dependence on generic interfaces.

labath marked an inline comment as done.Jul 3 2020, 1:45 AM

Thanks. This looks is starting to look good now. I didn't get a chance to review all of the changes, but here's a few more things that I noticed.

lldb/source/Plugins/Process/Utility/RegisterInfoPOSIX_arm64.h
72–76

Now that this is arm-specific, I am wondering if there isn't a better name for this than "register info 'mode'". Maybe it could be just "enable/disable SVE" ? Specifically, if the validity of the "mode" is checked by the caller, then we can avoid this RegisterInfoModeIsValid business...

lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp
72–89 ↗(On Diff #274390)

Well.. that's one way of handing that.. :) It's not necessarily bad, but I was imagining you would just delete all the "else"s...

21 ↗(On Diff #275037)

It would be also nice if this constructor accepted a RegisterInfoPOSIX_arm64 instead of plain RegisterInfoInterface. That would make it much harder to misuse this class. For that we'd need to slightly refactor the construction code in ThreadElfCore. For example, we could swap the order of the "os" and "machine" switches and then join the two "machine" switches that construct RegisterInfoInterfaces and RegisterContexts.

Maybe you could do that as a separate patch?

lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.h
17–22 ↗(On Diff #275037)

This is including platform-specific headers, but elf cores should be readable from any host. I guess you should just re-define the relevant constants somewhere. (I'm not sure what they are.)

lldb/source/Utility/ARM64_DWARF_Registers.h
54–147

All of these numbers are from the official spec, right? Maybe you could just commit this straight away so it does not clutter this review.

omjavaid updated this revision to Diff 276963.Jul 10 2020, 2:35 AM

This update incorporates suggestion from last iteration and also rebases after merger to registersets and register infos in RegisterInfosPOSIX_arm64.
Also I have posted a separate patch D83541 to make lldb/source/Plugins/Process/Linux/LinuxPTraceDefines_arm64sve.h platform independent in order to use them with elf-core.

@labath what do you think ?

I didn't notice this before, but I see now that there's more register number overlap in lldb-arm64-register-enums.h. Having one overlapping enum is bad enough, but two seems really too much? Can we avoid the second enum somehow, at least? Perhaps by switching RegisterContextCorePOSIX_arm64 to use the other enum definitions for its work?

lldb/source/Plugins/Process/Utility/RegisterInfoPOSIX_arm64.cpp
125–126

This comment looks outdated.

149

IIUC, m_sve_enabled is only true if m_vector_reg_vq is not zero. Could we delete this variable and just make a function to make that comparison?

157–160

It doesn't look like m_dynamic_register_infos is needed -- you could just make m_register_info_p point directly into the map.

lldb/source/Plugins/Process/Utility/RegisterInfoPOSIX_arm64.h
72

The offset argument is not set anywhere.

lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.h
17 ↗(On Diff #276963)

I guess this does not make sense now that the header is standalone

omjavaid updated this revision to Diff 277701.Jul 14 2020, 1:34 AM

This update addresses issues raised in last iteration. Also have removed dependence on lldb-arm64-register-enums.h.

Some more comments (and questions) from me. Sorry about the back-and-forth. It's taking me a while to wrap my head around this, and as I start to understand things, new questions arise...

lldb/source/Plugins/Process/Utility/RegisterInfoPOSIX_arm64.cpp
151–159

I'd probably try relying on the fact that operator[] constructs an empty vector, and that the empty vector is not a valid register info value. Something like:

std::vector<RegisterInfo> &new_reg_info = m_per_vq_reg_infos[sve_vq];
if (new_reg_info.empty()) {
  new_reg_info = llvm::makeArrayRef(g_register_infos_arm64_sve_le, m_register_info_count);
  ...
}
m_register_info_p = new_reg_info.data();

That would make this less of a mouthful and also avoid looking up the same key in the map three or four times.

lldb/source/Plugins/Process/Utility/RegisterInfoPOSIX_arm64.h
17–22

If this is going to be a class enum (which I think is a good idea), then there's no need to repeat the type name in the actual enumerators. It also seems like this could/should follow the lldb naming conventions (SveState::Disabled, etc).

lldb/source/Plugins/Process/Utility/lldb-arm64-register-enums.h
316

I guess these changes should be reverted now?

lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp
82–106 ↗(On Diff #277701)

I'm confused by this function. If I understant the SVE_PT macros and the logic in RegisterInfoPOSIX_arm64::ConfigureVectorRegisterInfos correctly, then they both seem to encode the same information. And it seems to me that this function should just be reg_infos[reg_num].offset - some_constant, which is the same thing that we're doing for the arm FP registers when SVE is disabled, and also for most other architectures too.

Why is that not the case? Am I missing something? If they are not encoding the same thing, could they be made to encode the same thing?

lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.h
17 ↗(On Diff #277701)

Please group the header next to other Plugins/Process/Utility header

lldb/test/API/functionalities/postmortem/elf-core/TestLinuxCore.py
326–327 ↗(On Diff #277701)

What's up with these?

omjavaid marked 6 inline comments as done.Jul 14 2020, 5:54 AM

There's always a chance for ODR violations, particularly with header files defining static objects. This looks better though what I really wanted was to keep the non-sve register info array (g_register_infos_arm64_le) completely out of the path of the sve header. Like, by using three headers, one defining g_register_infos_arm64_le (and stuff specific to that like the exception registers), one defining the sve array, and the third one which would contain the things common to both. Nonetheless, I think we can now move on to aspects of this patch. Please see my inline comments.

I am working to fix this with D80105 by defining dynamic register infos array based on register sets which will be requirement for Pointer Authentication and MTE. But I wanted to first get done with SVE and come back to register infos refactoring later.

lldb/source/Plugins/Process/Utility/RegisterInfoPOSIX_arm64.cpp
151–159

ACK

lldb/source/Plugins/Process/Utility/RegisterInfoPOSIX_arm64.h
17–22

ACK

lldb/source/Plugins/Process/Utility/lldb-arm64-register-enums.h
316

ACK

lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp
82–106 ↗(On Diff #277701)

This function calculates offset of a particular register in core note data. SVE data in core dump is similar to what PTrace emits and offsets into this data is not linear. SVE macros are used to access those offsets based on register numbers and currently selected vector length.
Also for the purpose of ease we have linear offsets in SVE register infos and it helps us simplify register data once it makes way to GDBRemoteRegisterContext on the client side.

lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.h
17 ↗(On Diff #277701)

ACK

lldb/test/API/functionalities/postmortem/elf-core/TestLinuxCore.py
326–327 ↗(On Diff #277701)

Sorry Typo

labath added inline comments.Jul 14 2020, 6:07 AM
lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp
82–106 ↗(On Diff #277701)

Could you give an example of the non-linearity of the core dump data? (Some registers, and their respective core file and gdb-remote offsets)

omjavaid marked an inline comment as done.Jul 14 2020, 6:35 AM
omjavaid added inline comments.
lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp
82–106 ↗(On Diff #277701)

In case of core file we create a buffer m_sveregset which stores SVE core note information
m_sveregset =

getRegset(notes, m_register_info_up->GetTargetArchitecture().GetTriple(),
          AARCH64_SVE_Desc);

At this point we do not know what was the vector length and at what offsets in the data our registers are located. We read top bytes of size use_sve_header and read vector length. Based on this information we configure vector length in Register infos. While the SVE payload starts with user_sve_header then there are some allignment bytes followed by vector length based Z registers followed by P and FFR, then there are some more allginment bytes followd by FPCR and FPSR. Macros provided by Linux help us jump to the desired offset by providing register number and vq into the core note or Ptrace payload.

In case of client side storage we store GPRs at linear offset followed by Vector Granule register. Then there are SVE registers Z, P, FFR, FPSR and FPCR. Offsets of V, D and S registers in FPR regset overlap with corresponding first bytes of Z registers and will be same as corresponding Z register. While both FP/SVE FPSR share same register offset, size and register number.

Here is an excerpt from https://github.com/torvalds/linux/blob/master/Documentation/arm64/sve.rst

SVE_PT_REGS_FPSIMD
SVE registers are not live (GETREGSET) or are to be made non-live (SETREGSET).
The payload is of type struct user_fpsimd_state, with the same meaning as for NT_PRFPREG, starting at offset SVE_PT_FPSIMD_OFFSET from the start of user_sve_header.
Extra data might be appended in the future: the size of the payload should be obtained using SVE_PT_FPSIMD_SIZE(vq, flags).
vq should be obtained using sve_vq_from_vl(vl).

or

SVE_PT_REGS_SVE
SVE registers are live (GETREGSET) or are to be made live (SETREGSET).
The payload contains the SVE register data, starting at offset SVE_PT_SVE_OFFSET from the start of user_sve_header, and with size SVE_PT_SVE_SIZE(vq, flags);

labath added inline comments.Jul 14 2020, 6:52 AM
lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp
82–106 ↗(On Diff #277701)

Given this

SVE payload starts with ... followed by vector length based Z registers followed by P and FFR,

and this

In case of client side storage we store GPRs ... Then there are SVE registers Z, P, FFR, FPSR and FPCR

I would expect that for each of the Z, P and FFR registers, the expression offset_in_core(reg) - offset_in_gdb_remote(reg) is always the same constant (and is basically equal to SVE_PT_SVE_ZREG_OFFSET(vq, 0) - reg_info[Z0].byte_offset). So we could just add/subtract that constant to the gdb-remote byte_offset field instead of painstakingly decomposing the register number only for the linux macros to reconstruct it back again. Is that not so?

omjavaid marked an inline comment as done.Jul 14 2020, 7:45 AM
omjavaid added inline comments.
lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp
82–106 ↗(On Diff #277701)

The standard never talks about Z, P and FFR being contagious that is what I learnt by reading macros. There standard states this:

If the registers are present, the remainder of the record has a vl-dependent size and layout. Macros SVE_SIG_* are defined [1] to facilitate access to the members.
Each scalable register (Zn, Pn, FFR) is stored in an endianness-invariant layout, with bits [(8 * i + 7) : (8 * i)] stored at byte offset i from the start of the register's representation in memory.
If the SVE context is too big to fit in sigcontext.reserved[], then extra space is allocated on the stack, an extra_context record is written in reserved[] referencing this space. sve_context is then written in the extra space. Refer to [1] for further details about this mechanism.

I understand what you are talking about but given the macros were specifically provided and above line about register record was vague and I thought best is to follow the macros for offset calculation although other way around is simpler but may be slightly unreliable.

I suggest to keep this as it is unless there is strong reason apart from slight performance penalty. This resembles with GDB implementation which was done by ARM and I am only following that as reference. May be we can revise this in future when the feature becomes more mainstream.

labath added inline comments.Jul 14 2020, 8:31 AM
lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp
82–106 ↗(On Diff #277701)

I'm not worried about the performance penalty (most of these abstractions can be optimized away anyway). I'm more worried about the maintainability penalty incurred by a non-standard and fairly complicated solution.

The way I see it, it doesn't really matter how exactly the specification describes these things. Having the macros is nice, but it's not their names that become a part of the ABI -- their implementation does. So they (linux, arm, whoever) cannot change the layout of the these registers without breaking all existing applications (and core files) any more than they can change the layout of the general purpose registers (which we also access by offset, without any fancy macros). In fact, even if they did change the layout, given that we have a copy of these headers, we wouldn't automatically inherit those changes anyway, but would have to take some manual action. And since we'd probably want to maintain some sort of backwards compatibility, we couldn't just replace the new macro definitions, but would have to do some sort of versioning (just today I got reminded that lldb contains versioned layouts of various internal structs used by the ObjC runtime).

So, for that reason I am more concerned about consistency with other lldb register contexts than I am with consistency with gdb.

Also, I bet gdb doesn't have an equivalent of our RegisterInfo struct. Without that, using these macros would be obviously better, but given that we have that, and we already use it to access other registers, ditching that for macros does not seem like a win to me.

Another way to look at this would be to compare this to e.g. ELF reading code in llvm. We cannot just include <linux/elf.h> as we want this to be cross-platform. However, we also don't just copy the header blindly into our code base. We don't depart from it needlessly, but we do make sure that it plays well with the rest of llvm -- #defines are changed to enums, we add templates to select between 64/32 bit versions, change some platform-specific names so that different platforms may co-exist, etc.

omjavaid marked an inline comment as done.Jul 14 2020, 9:53 PM
omjavaid added inline comments.
lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp
82–106 ↗(On Diff #277701)

I understand your point here and by referring to Linux, Arm or GDB implementation I only wanted to make sure that we are implementing the architecture and platform specific parts correctly. The choice of using Linux/SVE specific macros in elf-core context was not a blind decision. I am not advocating for following a particular implementations and I would not have pulled in these Linux specific macros into elf-core implementation if the SVE core note layout was independent of Linux platform specifics like sig_context and user_sve_header. Offset to the start of register data in NT_ARM_SVE is also calculated after doing alignment correction based on VQ and size of sig_context. Given that SVE is currently supported by Linux only, NT_ARM_SVE is Linux only core note and core dump generated makes use of these macro definitions I had no choice but to include them for extraction of data from NT_ARM_SVE section. If I had not followed these macros I still would have written something similar my self for register data extraction from NT_ARM_SVE elf core note.

I am going to post another update where I ll try to minimize use of these macros in offset calculation.

omjavaid updated this revision to Diff 278113.Jul 15 2020, 2:16 AM

This update address issues highlighted in last iteration and also minimizes the use of SVE_PT macros by using RegisterInfo.byte_size and byte_offset where possible.

@labath What do you think ?

omjavaid updated this revision to Diff 278124.Jul 15 2020, 3:16 AM

Minor fix removing GetRegNumP0 which is no longer needed.

At this point, I think I (finally) have a good understanding of both how this patch works and interacts with the rest of the world. I have one more batch of comments, but hopefully none are too controversial, and I really do hope this is the last iteration.

lldb/source/Plugins/Process/Utility/RegisterInfoPOSIX_arm64.cpp
158

I think all uses of new_reg_info_p could just be replaced by reg_info_ref

lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp
82–106 ↗(On Diff #277701)

Thanks for your patience. I think this looks better. The more I understand how these work, the more I realize that will need fairly special handling..

63–66 ↗(On Diff #278124)

What's up with this copying? We already have the data in the m_sveregset DataExtractor. What's the reason for copying it into the m_sve_note_payload vector? Also, m_sve_header seems like it could just be a reinterpret_casted pointer into that buffer instead of a copy. (Maybe not even a pointer, just a utility function which performs the cast when called).

Actually, when I think about casts and data extractors, I am reminded of endianness. This will access those fields using host endianness, which is most likely not what we want to do. So, probably the best/simplest solution would be to indeed declare a user_sve_header struct, but don't populate it via memcpy, but rather via the appropriate DataExtractor extraction methods. Since the only field of the user_sve_header used outside this function is the vl field, perhaps the struct could be a local variable and only the vector length would be persisted. That would be consistent with how the flags field is decoded and stored into the m_sve_state field.

(If the struct fields are always little-endian (if that's true, I thought arm has BE variants) then you could also stick to the reinterpret_cast idea, but change the types of the struct fields to llvm::support::ulittleXX_t to read them as LE independently of the host.)

84–86 ↗(On Diff #278124)

This is already checked by ReadRegister and the false -> uint32_t conversion is dodgy. An assertion would completely suffice here.

140 ↗(On Diff #278124)

This IsFPR check is now redundant.

152–153 ↗(On Diff #278124)

These two registers are special-cased both here and in the CalculateSVEOffset. Given that both functions also do a switch over the sve "states", it makes following the code quite challenging. What if we moved the handling of these registers completely up-front, and removed their handling from CalculateSVEOffset completely?
I'm thinking of something like:

if (reg == GetRegNumFPCR() && m_sve_state != SVEState::Disabled) {
  src = GetSVEBuffer() + GetFPCROffset(); // Or maybe just inline GetFPCROffset here
} else if (reg == GetRegNumFPSR() && m_sve_state != SVEState::Disabled) {
  src = ...;
} if (IsFPR(reg)) {
  // as usual, only you can assume that FP[CS]R are already handled if SVE is enabled
}
161–162 ↗(On Diff #278124)

This also looks endian-incorrect. Maybe value = GetSVERegVG(); return true; ?

164 ↗(On Diff #278124)

A switch over possible m_sve_state values would likely be cleaner.

174 ↗(On Diff #278124)

I guess it would be better to do the cast inside GetSVEBuffer. (and please make that a c++ reinterpret_cast).

180 ↗(On Diff #278124)

it seems that src should be a const pointer.

omjavaid marked 9 inline comments as done.Jul 16 2020, 4:42 AM
omjavaid added inline comments.
lldb/source/Plugins/Process/Utility/RegisterInfoPOSIX_arm64.cpp
158

Ack.

lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp
63–66 ↗(On Diff #278124)

Agreed m_sve_note_payload is redundant. Most of the implementation was mirrored from ptrace implementation and I was lazy enough and did not remove the redundancies which are not needed in case of core dump.

About endianess of SVE data I dont think it will create any problem as the data is transferred in endian invariant format. According to the standard here:

  • Whenever SVE scalable register values (Zn, Pn, FFR) are exchanged in memory between userspace and the kernel, the register value is encoded in memory in an endianness-invariant layout, with bits [(8 * i + 7) : (8 * i)] encoded at byte offset i from the start of the memory representation. This affects for example the signal frame (struct sve_context) and ptrace interface (struct user_sve_header) and associated data.
84–86 ↗(On Diff #278124)

Ack.

140 ↗(On Diff #278124)

Ack,

152–153 ↗(On Diff #278124)

Ok will try to update in next revision.

161–162 ↗(On Diff #278124)

Endian invariant as explained above.

164 ↗(On Diff #278124)

Ack.

174 ↗(On Diff #278124)

Ack.

180 ↗(On Diff #278124)

Ack.

omjavaid updated this revision to Diff 278435.Jul 16 2020, 5:06 AM

In this update I have addressed issues highlighted in last review iteration.

@labath Does this now look LGTM?

labath added inline comments.Jul 16 2020, 7:08 AM
lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp
63–66 ↗(On Diff #278124)

Ok, so if I parse that correctly, this is basically saying that the register values are always little-endian. Fair enough. However, lldb will be reading these things using the host endianness, so the values will be incorrect on a big-endian host.

Also, the are you sure that this includes the metadata in the user_sve_header field? The documentation does mention the struct explicitly, but then it also very explicitly speaks about register values. The data in the struct is not a register value, and I can find no evidence of byte-swapping in the linux kernel (https://github.com/torvalds/linux/blob/master/arch/arm64/kernel/ptrace.c#L769).

omjavaid updated this revision to Diff 278617.Jul 16 2020, 3:59 PM

This update plugs the holes susceptible to bugs by endianity variations. I have used DataExtractor GetU16 to read flags and vl from user_sve_header. Also removed the memset and replaced it with SetFromMemoryData function.

Moreover a new test is added which tests SVE FPSIMD mode core file.

omjavaid marked an inline comment as done.Jul 16 2020, 4:01 PM
omjavaid added inline comments.
lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp
63–66 ↗(On Diff #278124)

@labath Thanks for being great help in this review. I think you were right about the user_sve_header being susceptible to endianity problems. I have plugged the issues now.

labath accepted this revision.Jul 17 2020, 6:19 AM

I think this is as good as we can get right now. Thanks for sticking through. Two quick comments inline.

lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp
152–153 ↗(On Diff #278617)

Are there any FPR registers which we should not be able to retrieve from the SVE state? I'm wondering if this should be turned into an assert.

167–169 ↗(On Diff #278617)

It looks like this is only used in the IsSVEZ(reg) block below. You could move the declarations there and avoid initializations with bogus values.

This revision is now accepted and ready to land.Jul 17 2020, 6:19 AM
omjavaid updated this revision to Diff 279151.Jul 20 2020, 2:14 AM

This update makes minor adjustments before merge in light of final comments.

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJul 20 2020, 5:21 AM