This is an archive of the discontinued LLVM Phabricator instance.

AArch64 SVE register infos and core file support
ClosedPublic

Authored by omjavaid on Mar 30 2020, 3:24 AM.

Details

Summary

This patch adds support for AArch64 SVE register infos description and register access via ptrace.

AArch64 SVE is a an optional extension of Arm v8.3-a architecture which introduces 32 vector registers Z, 16 predicate P registers and FFR predicate register. These registers have fixed names but can dynamically be configured to different size based on underlying OS configuration.

This patch adds register info struct that describes SVE register infos. Also provides native register context routines to access SVE registers. It introduces a mechanism to configure SVE register size and offsets at startup before exchanging register information across gdb-remote process.

It makes use of linux kernel definitions copied into lldb/source/Plugins/Process/Linux/LinuxPTraceDefines_arm64sve.h for backward compatibility with sysroots which might yet not support SVE definitions.

There are two test cases added to the testsuite one of them checks register size configuration based on vg (vector granule register). Also a test is added which verifies registers can be read and written.

There is no physical hardware currently available to test SVE and we make use of QEMU for the purpose of testing this feature.

Diff Detail

Event Timeline

omjavaid created this revision.Mar 30 2020, 3:24 AM
labath added a comment.Apr 1 2020, 5:16 AM

There is no physical hardware currently available to test SVE and we make use of QEMU for the purpose of testing this feature.

Are these registers presented in core files in any way? Would it be possible to at least test the RegisterInfoPOSIX_arm64.h changes that way?

omjavaid updated this revision to Diff 254454.Apr 2 2020, 1:59 AM

Posting full diff.

There is no physical hardware currently available to test SVE and we make use of QEMU for the purpose of testing this feature.

Are these registers presented in core files in any way? Would it be possible to at least test the RegisterInfoPOSIX_arm64.h changes that way?

Yes. Core file section is added for SVE registers but i have not tried to get that working with LLDB. I will get back to you on this.

There is no physical hardware currently available to test SVE and we make use of QEMU for the purpose of testing this feature.

Are these registers presented in core files in any way? Would it be possible to at least test the RegisterInfoPOSIX_arm64.h changes that way?

Yes. Core file section is added for SVE registers but i have not tried to get that working with LLDB. I will get back to you on this.

@labath Core file FP register access is not supported on AArch64. I am working on a follow up patch to fix it. This patch only has linux changes so can we get this through?

labath added a comment.Apr 9 2020, 2:10 AM

@labath Core file FP register access is not supported on AArch64. I am working on a follow up patch to fix it. This patch only has linux changes so can we get this through?

This patch is a bit big and basically untested (an unconditionally skipped test hardly counts), so I am looking for ways to split it up and add some test coverage. I'm also worried about copying what appear to be chunks of linux header files...

lldb/source/Plugins/Process/Linux/NativeThreadLinux.h
42–43 ↗(On Diff #254454)

Why can't this work be done in the register context constructor? I believe we are always creating a Thread (and its reg context) while it is stopped...

lldb/source/Plugins/Process/Utility/NativeRegisterContextRegisterInfo.h
38–40 ↗(On Diff #254454)

I can't find any callers of this function. Who is calling it?

@labath Core file FP register access is not supported on AArch64. I am working on a follow up patch to fix it. This patch only has linux changes so can we get this through?

This patch is a bit big and basically untested (an unconditionally skipped test hardly counts), so I am looking for ways to split it up and add some test coverage. I'm also worried about copying what appear to be chunks of linux header files...

I also considered licensing issues around copying this chunk but found GDB also copied this chunk of linux defines into its sources. I think its part of linux headers with license to freely use these headers in client programs. Do you know any licensing expert who can give us feedback on this patch? The only reason we have to include these headers here is that on older sysroots AArch64 compiler we wont be able to compile this code as they dont have these headers included so far but newer version do include these headers by default.

As for the test coverage I am working on a follow up to this patch that will include corefile support I hope we ll be able to test major portion of this patch with that support.

@labath Core file FP register access is not supported on AArch64. I am working on a follow up patch to fix it. This patch only has linux changes so can we get this through?

This patch is a bit big and basically untested (an unconditionally skipped test hardly counts), so I am looking for ways to split it up and add some test coverage. I'm also worried about copying what appear to be chunks of linux header files...

I also considered licensing issues around copying this chunk but found GDB also copied this chunk of linux defines into its sources. I think its part of linux headers with license to freely use these headers in client programs. Do you know any licensing expert who can give us feedback on this patch? The only reason we have to include these headers here is that on older sysroots AArch64 compiler we wont be able to compile this code as they dont have these headers included so far but newer version do include these headers by default.

The situation might be different for gdb since they're both GPL-ish. For licensing questions you could try mailing llvm foundation.

omjavaid marked 4 inline comments as done.Apr 21 2020, 3:48 AM
omjavaid added inline comments.
lldb/source/Plugins/Process/Linux/NativeRegisterContextLinux_arm64.cpp
1191 ↗(On Diff #254454)

SetRegisterInfoMode(vq); is getting called over here.

lldb/source/Plugins/Process/Linux/NativeThreadLinux.h
42–43 ↗(On Diff #254454)

ConfigureRegisterContext makes sure that sve register offsets and sizes are correctly updated and synced from ptrace on every stop. If SVE registers reconfigure themselves we ll have to update that information into our dynamic register infos array. Although for the context of this patch it assumes it will never change its register size during execution. But I will follow up patch that implements that behaviour.

omjavaid updated this revision to Diff 259190.Apr 22 2020, 12:57 AM
omjavaid marked an inline comment as done.

This update includes core file register access for SVE registers. Register descriptions can now be tested with core dump.

@labath Should I post QEMU SVE setup instructions somewhere if you want to have a go at running included test cases.?

Also about licensing issues of included macros I am in contact with original authors from ARM and I hope they will be able to help sign off this patch for LLVM submission.

This update includes core file register access for SVE registers. Register descriptions can now be tested with core dump.

Cool. Could you please split off the stuff necessary for making core files work into a separate patch, and have the "native" patch be on top of that. I want to get an idea of how much of new code is included in the second patch.

@labath Should I post QEMU SVE setup instructions somewhere if you want to have a go at running included test cases.?

I think that would be very valuable. Not really much for this patch (I probably wont get around to running it), but for any future contributors (inlcuding myself) whose patches break some arm stuff (be it SVE-related or not).

Also about licensing issues of included macros I am in contact with original authors from ARM and I hope they will be able to help sign off this patch for LLVM submission.

Awesome.

lldb/source/Plugins/Process/Linux/NativeRegisterContextLinux_arm64.cpp
1191 ↗(On Diff #254454)

Ok, I see. That makes sense. The part that seems pretty unfortunate here is that you've needed to add a fairly strange function to the generic (RegisterInfoInterface) interface even though the caller and callee are in arm64-specific code. Let's see if we can avoid that.

What if, instead of remangling the register infos in RegisterInfoPOSIX_arm64 every time that SetRegisterInfoMode is called, the "mode" was a argument to the class constructor, and changing of the mode was implemented by creating a new "info" object? I.e., this code would instead of calling SetRegisterInfoMode(vq);, do something like m_register_info_interface_up = std::make_unique<RegisterInfoPOSIX_arm64>(vq);

Would that work?

lldb/source/Plugins/Process/Linux/NativeThreadLinux.h
42–43 ↗(On Diff #254454)

Ok, that makes sense now. The thing on my mind in that case is that this functionality overlaps a lot with InvalidateAllRegisters we've added a couple of patches back. I think it would be nice to have just one function which "reinitializes" a register context. InvalidateAllRegisters wouldn't work right now, because it's called before a resume, but if we changed it to be called after a stop, it would be able to do both things.

Would it be possible to change it so that this InvalidateAllRegisters is called after a stop, e.g. in NativeThreadLinux::SetStopped. I am sorry for the churn, I know it was my idea to call it before a resume -- it seemed like a reasonable thing to do at the time, but not any more.. :(

omjavaid updated this revision to Diff 263106.May 10 2020, 11:37 PM

This patch now contains AArch64 SVE register infos description and support for core dump SVE register access. Linux ptrace support will be submitted in a follow up patch while linux ptrace headers are being submitted by Arm under license agreement separately.

labath retitled this revision from AArch64 SVE register infos and ptrace support to AArch64 SVE register infos and core file support.

(sorry about all the editing -- I'm trying to figure out the order of these patches)

omjavaid updated this revision to Diff 263934.May 14 2020, 1:27 AM

New update contains minor update needed to accommodate SVE PTrace macros header from ARM.

omjavaid updated this revision to Diff 271532.Jun 17 2020, 5:08 PM

This update remove skipping over register infos requirement by overlapping sve register numbers with debug register numbers which are not required by Linux register context.

I'm worried about the confusing relationship between RegisterInfos_arm64_sve.h and RegisterInfos_arm64.h and the overlap in the register numbers it enforces. when we discussed this last time, my understanding was that the sve registers could be inserted _before_ the debug registers (similar to what happened with D24255). Now I see that may not be possible because some of these registers are used in the darwin arm64 register context. However, that makes the whole idea much less appealing.

Would it be possible to organize these headers like this:

  • create a new header which holds common register definitions: this would be register numbers for the gpr and fpr registers (and their respective _contains arrays).
  • have RegisterInfos_arm64_sve.h include this header and provide additional register definitions, and define its RegisterInfo array to match
  • have RegisterInfos_arm64.h to the same

What I am hoping to achieve this way is to remove the DECLARE_REGISTER_INFOS_ARM64_SVE_STRUCT condition macro (dragons (ODR violations) lie there), and the duplicate enum values associated with it.

omjavaid updated this revision to Diff 274390.Jun 30 2020, 3:26 AM

In this updated I have removed overlapping parts of RegisterInfos_arm64.h and RegisterInfos_arm64_sve.h which in turn removes any possibility of duplicate definitions.

Both register infos define separate static register info arrays namely g_register_infos_arm64_le and g_register_infos_arm64_sve_le. Definition of FPR registers is also different and their invalidate/contains reg lists are also different. For the context of all native register context which do not yet support SVE g_register_infos_arm64_le will be used as they only include RegisterInfos_arm64.h.

@labath do you think there is still chance of any ODR violations?

labath added a comment.Jul 1 2020, 5:40 AM

There's always a chance for ODR violations, particularly with header files defining static objects. This looks better though what I really wanted was to keep the non-sve register info array (g_register_infos_arm64_le) completely out of the path of the sve header. Like, by using three headers, one defining g_register_infos_arm64_le (and stuff specific to that like the exception registers), one defining the sve array, and the third one which would contain the things common to both. Nonetheless, I think we can now move on to aspects of this patch. Please see my inline comments.

lldb/source/Plugins/Process/Utility/NativeRegisterContextRegisterInfo.h
38–40 ↗(On Diff #254454)

It still seems to me that this function is not used (in this patch at least).

lldb/source/Plugins/Process/Utility/RegisterContextPOSIX_arm64.cpp
96
lldb/source/Plugins/Process/Utility/RegisterInfoInterface.h
64–68 ↗(On Diff #274390)

I don't believe these functions should be present on the generic interface. The only caller and the only implementation of them is already arm64-specific, so it should be possible to arrange it such that they don't need to go through the base class interface. For instance the RegisterContextCorePOSIX_arm64 constructor could accept an RegisterInfoPOSIX_arm64 instance to indicate what it needs to work with, and then downcast m_register_info_up (via a helper function?) when it needs to call these. Solutions without downcasting are possible too, though may require a bit more refactoring.

lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp
73–90

no else after return

lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.h
41–43

Do these need to be public? If so, why?

56

Why is this mutable?

omjavaid marked 6 inline comments as done.Jul 1 2020, 4:10 PM

There's always a chance for ODR violations, particularly with header files defining static objects. This looks better though what I really wanted was to keep the non-sve register info array (g_register_infos_arm64_le) completely out of the path of the sve header. Like, by using three headers, one defining g_register_infos_arm64_le (and stuff specific to that like the exception registers), one defining the sve array, and the third one which would contain the things common to both. Nonetheless, I think we can now move on to aspects of this patch. Please see my inline comments.

I am working to fix this with D80105 by defining dynamic register infos array based on register sets which will be requirement for Pointer Authentication and MTE. But I wanted to first get done with SVE and come back to register infos refactoring later.

lldb/source/Plugins/Process/Utility/NativeRegisterContextRegisterInfo.h
38–40 ↗(On Diff #254454)

Ack. This function is used by chiild revisions which adds ptrace support. I ll move it there.

lldb/source/Plugins/Process/Utility/RegisterContextPOSIX_arm64.cpp
96

Ack.

lldb/source/Plugins/Process/Utility/RegisterInfoInterface.h
64–68 ↗(On Diff #274390)

Agreed. I ll make an arrangement for this in updated rev.

lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp
73–90

Ack.

lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.h
41–43

These should be private. I ll update this in updated rev.

56

It was pulled in from NativeRegisterContextLinux_arm64 and got set mutable by mistake I ll correct in updated rev.

omjavaid updated this revision to Diff 275037.Jul 2 2020, 3:12 AM

@labath I have updated diff after resolving issues highlighted. Also I have removed dependence on generic interfaces.

labath marked an inline comment as done.Jul 3 2020, 1:45 AM

Thanks. This looks is starting to look good now. I didn't get a chance to review all of the changes, but here's a few more things that I noticed.

lldb/source/Plugins/Process/Utility/RegisterInfoPOSIX_arm64.h
79–83

Now that this is arm-specific, I am wondering if there isn't a better name for this than "register info 'mode'". Maybe it could be just "enable/disable SVE" ? Specifically, if the validity of the "mode" is checked by the caller, then we can avoid this RegisterInfoModeIsValid business...

lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp
21

It would be also nice if this constructor accepted a RegisterInfoPOSIX_arm64 instead of plain RegisterInfoInterface. That would make it much harder to misuse this class. For that we'd need to slightly refactor the construction code in ThreadElfCore. For example, we could swap the order of the "os" and "machine" switches and then join the two "machine" switches that construct RegisterInfoInterfaces and RegisterContexts.

Maybe you could do that as a separate patch?

73–90

Well.. that's one way of handing that.. :) It's not necessarily bad, but I was imagining you would just delete all the "else"s...

lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.h
19–24

This is including platform-specific headers, but elf cores should be readable from any host. I guess you should just re-define the relevant constants somewhere. (I'm not sure what they are.)

lldb/source/Utility/ARM64_DWARF_Registers.h
54–146 ↗(On Diff #275037)

All of these numbers are from the official spec, right? Maybe you could just commit this straight away so it does not clutter this review.

omjavaid updated this revision to Diff 276963.Jul 10 2020, 2:35 AM

This update incorporates suggestion from last iteration and also rebases after merger to registersets and register infos in RegisterInfosPOSIX_arm64.
Also I have posted a separate patch D83541 to make lldb/source/Plugins/Process/Linux/LinuxPTraceDefines_arm64sve.h platform independent in order to use them with elf-core.

@labath what do you think ?

I didn't notice this before, but I see now that there's more register number overlap in lldb-arm64-register-enums.h. Having one overlapping enum is bad enough, but two seems really too much? Can we avoid the second enum somehow, at least? Perhaps by switching RegisterContextCorePOSIX_arm64 to use the other enum definitions for its work?

lldb/source/Plugins/Process/Utility/RegisterInfoPOSIX_arm64.cpp
256–257

This comment looks outdated.

280

IIUC, m_sve_enabled is only true if m_vector_reg_vq is not zero. Could we delete this variable and just make a function to make that comparison?

288–291

It doesn't look like m_dynamic_register_infos is needed -- you could just make m_register_info_p point directly into the map.

lldb/source/Plugins/Process/Utility/RegisterInfoPOSIX_arm64.h
86

The offset argument is not set anywhere.

lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.h
19

I guess this does not make sense now that the header is standalone

omjavaid updated this revision to Diff 277701.Jul 14 2020, 1:34 AM

This update addresses issues raised in last iteration. Also have removed dependence on lldb-arm64-register-enums.h.

Some more comments (and questions) from me. Sorry about the back-and-forth. It's taking me a while to wrap my head around this, and as I start to understand things, new questions arise...

lldb/source/Plugins/Process/Utility/RegisterInfoPOSIX_arm64.cpp
282–290

I'd probably try relying on the fact that operator[] constructs an empty vector, and that the empty vector is not a valid register info value. Something like:

std::vector<RegisterInfo> &new_reg_info = m_per_vq_reg_infos[sve_vq];
if (new_reg_info.empty()) {
  new_reg_info = llvm::makeArrayRef(g_register_infos_arm64_sve_le, m_register_info_count);
  ...
}
m_register_info_p = new_reg_info.data();

That would make this less of a mouthful and also avoid looking up the same key in the map three or four times.

lldb/source/Plugins/Process/Utility/RegisterInfoPOSIX_arm64.h
17–22

If this is going to be a class enum (which I think is a good idea), then there's no need to repeat the type name in the actual enumerators. It also seems like this could/should follow the lldb naming conventions (SveState::Disabled, etc).

lldb/source/Plugins/Process/Utility/lldb-arm64-register-enums.h
261 ↗(On Diff #277701)

I guess these changes should be reverted now?

lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp
82–106

I'm confused by this function. If I understant the SVE_PT macros and the logic in RegisterInfoPOSIX_arm64::ConfigureVectorRegisterInfos correctly, then they both seem to encode the same information. And it seems to me that this function should just be reg_infos[reg_num].offset - some_constant, which is the same thing that we're doing for the arm FP registers when SVE is disabled, and also for most other architectures too.

Why is that not the case? Am I missing something? If they are not encoding the same thing, could they be made to encode the same thing?

lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.h
19

Please group the header next to other Plugins/Process/Utility header

lldb/test/API/functionalities/postmortem/elf-core/TestLinuxCore.py
463–464

What's up with these?

omjavaid marked 6 inline comments as done.Jul 14 2020, 5:54 AM

There's always a chance for ODR violations, particularly with header files defining static objects. This looks better though what I really wanted was to keep the non-sve register info array (g_register_infos_arm64_le) completely out of the path of the sve header. Like, by using three headers, one defining g_register_infos_arm64_le (and stuff specific to that like the exception registers), one defining the sve array, and the third one which would contain the things common to both. Nonetheless, I think we can now move on to aspects of this patch. Please see my inline comments.

I am working to fix this with D80105 by defining dynamic register infos array based on register sets which will be requirement for Pointer Authentication and MTE. But I wanted to first get done with SVE and come back to register infos refactoring later.

lldb/source/Plugins/Process/Utility/RegisterInfoPOSIX_arm64.cpp
282–290

ACK

lldb/source/Plugins/Process/Utility/RegisterInfoPOSIX_arm64.h
17–22

ACK

lldb/source/Plugins/Process/Utility/lldb-arm64-register-enums.h
261 ↗(On Diff #277701)

ACK

lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp
82–106

This function calculates offset of a particular register in core note data. SVE data in core dump is similar to what PTrace emits and offsets into this data is not linear. SVE macros are used to access those offsets based on register numbers and currently selected vector length.
Also for the purpose of ease we have linear offsets in SVE register infos and it helps us simplify register data once it makes way to GDBRemoteRegisterContext on the client side.

lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.h
19

ACK

lldb/test/API/functionalities/postmortem/elf-core/TestLinuxCore.py
463–464

Sorry Typo

labath added inline comments.Jul 14 2020, 6:07 AM
lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp
82–106

Could you give an example of the non-linearity of the core dump data? (Some registers, and their respective core file and gdb-remote offsets)

omjavaid marked an inline comment as done.Jul 14 2020, 6:35 AM
omjavaid added inline comments.
lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp
82–106

In case of core file we create a buffer m_sveregset which stores SVE core note information
m_sveregset =

getRegset(notes, m_register_info_up->GetTargetArchitecture().GetTriple(),
          AARCH64_SVE_Desc);

At this point we do not know what was the vector length and at what offsets in the data our registers are located. We read top bytes of size use_sve_header and read vector length. Based on this information we configure vector length in Register infos. While the SVE payload starts with user_sve_header then there are some allignment bytes followed by vector length based Z registers followed by P and FFR, then there are some more allginment bytes followd by FPCR and FPSR. Macros provided by Linux help us jump to the desired offset by providing register number and vq into the core note or Ptrace payload.

In case of client side storage we store GPRs at linear offset followed by Vector Granule register. Then there are SVE registers Z, P, FFR, FPSR and FPCR. Offsets of V, D and S registers in FPR regset overlap with corresponding first bytes of Z registers and will be same as corresponding Z register. While both FP/SVE FPSR share same register offset, size and register number.

Here is an excerpt from https://github.com/torvalds/linux/blob/master/Documentation/arm64/sve.rst

SVE_PT_REGS_FPSIMD
SVE registers are not live (GETREGSET) or are to be made non-live (SETREGSET).
The payload is of type struct user_fpsimd_state, with the same meaning as for NT_PRFPREG, starting at offset SVE_PT_FPSIMD_OFFSET from the start of user_sve_header.
Extra data might be appended in the future: the size of the payload should be obtained using SVE_PT_FPSIMD_SIZE(vq, flags).
vq should be obtained using sve_vq_from_vl(vl).

or

SVE_PT_REGS_SVE
SVE registers are live (GETREGSET) or are to be made live (SETREGSET).
The payload contains the SVE register data, starting at offset SVE_PT_SVE_OFFSET from the start of user_sve_header, and with size SVE_PT_SVE_SIZE(vq, flags);

labath added inline comments.Jul 14 2020, 6:52 AM
lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp
82–106

Given this

SVE payload starts with ... followed by vector length based Z registers followed by P and FFR,

and this

In case of client side storage we store GPRs ... Then there are SVE registers Z, P, FFR, FPSR and FPCR

I would expect that for each of the Z, P and FFR registers, the expression offset_in_core(reg) - offset_in_gdb_remote(reg) is always the same constant (and is basically equal to SVE_PT_SVE_ZREG_OFFSET(vq, 0) - reg_info[Z0].byte_offset). So we could just add/subtract that constant to the gdb-remote byte_offset field instead of painstakingly decomposing the register number only for the linux macros to reconstruct it back again. Is that not so?

omjavaid marked an inline comment as done.Jul 14 2020, 7:45 AM
omjavaid added inline comments.
lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp
82–106

The standard never talks about Z, P and FFR being contagious that is what I learnt by reading macros. There standard states this:

If the registers are present, the remainder of the record has a vl-dependent size and layout. Macros SVE_SIG_* are defined [1] to facilitate access to the members.
Each scalable register (Zn, Pn, FFR) is stored in an endianness-invariant layout, with bits [(8 * i + 7) : (8 * i)] stored at byte offset i from the start of the register's representation in memory.
If the SVE context is too big to fit in sigcontext.reserved[], then extra space is allocated on the stack, an extra_context record is written in reserved[] referencing this space. sve_context is then written in the extra space. Refer to [1] for further details about this mechanism.

I understand what you are talking about but given the macros were specifically provided and above line about register record was vague and I thought best is to follow the macros for offset calculation although other way around is simpler but may be slightly unreliable.

I suggest to keep this as it is unless there is strong reason apart from slight performance penalty. This resembles with GDB implementation which was done by ARM and I am only following that as reference. May be we can revise this in future when the feature becomes more mainstream.

labath added inline comments.Jul 14 2020, 8:31 AM
lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp
82–106

I'm not worried about the performance penalty (most of these abstractions can be optimized away anyway). I'm more worried about the maintainability penalty incurred by a non-standard and fairly complicated solution.

The way I see it, it doesn't really matter how exactly the specification describes these things. Having the macros is nice, but it's not their names that become a part of the ABI -- their implementation does. So they (linux, arm, whoever) cannot change the layout of the these registers without breaking all existing applications (and core files) any more than they can change the layout of the general purpose registers (which we also access by offset, without any fancy macros). In fact, even if they did change the layout, given that we have a copy of these headers, we wouldn't automatically inherit those changes anyway, but would have to take some manual action. And since we'd probably want to maintain some sort of backwards compatibility, we couldn't just replace the new macro definitions, but would have to do some sort of versioning (just today I got reminded that lldb contains versioned layouts of various internal structs used by the ObjC runtime).

So, for that reason I am more concerned about consistency with other lldb register contexts than I am with consistency with gdb.

Also, I bet gdb doesn't have an equivalent of our RegisterInfo struct. Without that, using these macros would be obviously better, but given that we have that, and we already use it to access other registers, ditching that for macros does not seem like a win to me.

Another way to look at this would be to compare this to e.g. ELF reading code in llvm. We cannot just include <linux/elf.h> as we want this to be cross-platform. However, we also don't just copy the header blindly into our code base. We don't depart from it needlessly, but we do make sure that it plays well with the rest of llvm -- #defines are changed to enums, we add templates to select between 64/32 bit versions, change some platform-specific names so that different platforms may co-exist, etc.

omjavaid marked an inline comment as done.Jul 14 2020, 9:53 PM
omjavaid added inline comments.
lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp
82–106

I understand your point here and by referring to Linux, Arm or GDB implementation I only wanted to make sure that we are implementing the architecture and platform specific parts correctly. The choice of using Linux/SVE specific macros in elf-core context was not a blind decision. I am not advocating for following a particular implementations and I would not have pulled in these Linux specific macros into elf-core implementation if the SVE core note layout was independent of Linux platform specifics like sig_context and user_sve_header. Offset to the start of register data in NT_ARM_SVE is also calculated after doing alignment correction based on VQ and size of sig_context. Given that SVE is currently supported by Linux only, NT_ARM_SVE is Linux only core note and core dump generated makes use of these macro definitions I had no choice but to include them for extraction of data from NT_ARM_SVE section. If I had not followed these macros I still would have written something similar my self for register data extraction from NT_ARM_SVE elf core note.

I am going to post another update where I ll try to minimize use of these macros in offset calculation.

omjavaid updated this revision to Diff 278113.Jul 15 2020, 2:16 AM

This update address issues highlighted in last iteration and also minimizes the use of SVE_PT macros by using RegisterInfo.byte_size and byte_offset where possible.

@labath What do you think ?

omjavaid updated this revision to Diff 278124.Jul 15 2020, 3:16 AM

Minor fix removing GetRegNumP0 which is no longer needed.

At this point, I think I (finally) have a good understanding of both how this patch works and interacts with the rest of the world. I have one more batch of comments, but hopefully none are too controversial, and I really do hope this is the last iteration.

lldb/source/Plugins/Process/Utility/RegisterInfoPOSIX_arm64.cpp
289

I think all uses of new_reg_info_p could just be replaced by reg_info_ref

lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp
63–66

What's up with this copying? We already have the data in the m_sveregset DataExtractor. What's the reason for copying it into the m_sve_note_payload vector? Also, m_sve_header seems like it could just be a reinterpret_casted pointer into that buffer instead of a copy. (Maybe not even a pointer, just a utility function which performs the cast when called).

Actually, when I think about casts and data extractors, I am reminded of endianness. This will access those fields using host endianness, which is most likely not what we want to do. So, probably the best/simplest solution would be to indeed declare a user_sve_header struct, but don't populate it via memcpy, but rather via the appropriate DataExtractor extraction methods. Since the only field of the user_sve_header used outside this function is the vl field, perhaps the struct could be a local variable and only the vector length would be persisted. That would be consistent with how the flags field is decoded and stored into the m_sve_state field.

(If the struct fields are always little-endian (if that's true, I thought arm has BE variants) then you could also stick to the reinterpret_cast idea, but change the types of the struct fields to llvm::support::ulittleXX_t to read them as LE independently of the host.)

82–106

Thanks for your patience. I think this looks better. The more I understand how these work, the more I realize that will need fairly special handling..

84–86

This is already checked by ReadRegister and the false -> uint32_t conversion is dodgy. An assertion would completely suffice here.

120

This IsFPR check is now redundant.

132–133

These two registers are special-cased both here and in the CalculateSVEOffset. Given that both functions also do a switch over the sve "states", it makes following the code quite challenging. What if we moved the handling of these registers completely up-front, and removed their handling from CalculateSVEOffset completely?
I'm thinking of something like:

if (reg == GetRegNumFPCR() && m_sve_state != SVEState::Disabled) {
  src = GetSVEBuffer() + GetFPCROffset(); // Or maybe just inline GetFPCROffset here
} else if (reg == GetRegNumFPSR() && m_sve_state != SVEState::Disabled) {
  src = ...;
} if (IsFPR(reg)) {
  // as usual, only you can assume that FP[CS]R are already handled if SVE is enabled
}
141–142

This also looks endian-incorrect. Maybe value = GetSVERegVG(); return true; ?

144

A switch over possible m_sve_state values would likely be cleaner.

154

I guess it would be better to do the cast inside GetSVEBuffer. (and please make that a c++ reinterpret_cast).

160

it seems that src should be a const pointer.

omjavaid marked 9 inline comments as done.Jul 16 2020, 4:42 AM
omjavaid added inline comments.
lldb/source/Plugins/Process/Utility/RegisterInfoPOSIX_arm64.cpp
289

Ack.

lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp
63–66

Agreed m_sve_note_payload is redundant. Most of the implementation was mirrored from ptrace implementation and I was lazy enough and did not remove the redundancies which are not needed in case of core dump.

About endianess of SVE data I dont think it will create any problem as the data is transferred in endian invariant format. According to the standard here:

  • Whenever SVE scalable register values (Zn, Pn, FFR) are exchanged in memory between userspace and the kernel, the register value is encoded in memory in an endianness-invariant layout, with bits [(8 * i + 7) : (8 * i)] encoded at byte offset i from the start of the memory representation. This affects for example the signal frame (struct sve_context) and ptrace interface (struct user_sve_header) and associated data.
84–86

Ack.

120

Ack,

132–133

Ok will try to update in next revision.

141–142

Endian invariant as explained above.

144

Ack.

154

Ack.

160

Ack.

omjavaid updated this revision to Diff 278435.Jul 16 2020, 5:06 AM

In this update I have addressed issues highlighted in last review iteration.

@labath Does this now look LGTM?

labath added inline comments.Jul 16 2020, 7:08 AM
lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp
63–66

Ok, so if I parse that correctly, this is basically saying that the register values are always little-endian. Fair enough. However, lldb will be reading these things using the host endianness, so the values will be incorrect on a big-endian host.

Also, the are you sure that this includes the metadata in the user_sve_header field? The documentation does mention the struct explicitly, but then it also very explicitly speaks about register values. The data in the struct is not a register value, and I can find no evidence of byte-swapping in the linux kernel (https://github.com/torvalds/linux/blob/master/arch/arm64/kernel/ptrace.c#L769).

omjavaid updated this revision to Diff 278617.Jul 16 2020, 3:59 PM

This update plugs the holes susceptible to bugs by endianity variations. I have used DataExtractor GetU16 to read flags and vl from user_sve_header. Also removed the memset and replaced it with SetFromMemoryData function.

Moreover a new test is added which tests SVE FPSIMD mode core file.

omjavaid marked an inline comment as done.Jul 16 2020, 4:01 PM
omjavaid added inline comments.
lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp
63–66

@labath Thanks for being great help in this review. I think you were right about the user_sve_header being susceptible to endianity problems. I have plugged the issues now.

labath accepted this revision.Jul 17 2020, 6:19 AM

I think this is as good as we can get right now. Thanks for sticking through. Two quick comments inline.

lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp
152–153

Are there any FPR registers which we should not be able to retrieve from the SVE state? I'm wondering if this should be turned into an assert.

164–196

It looks like this is only used in the IsSVEZ(reg) block below. You could move the declarations there and avoid initializations with bogus values.

This revision is now accepted and ready to land.Jul 17 2020, 6:19 AM
omjavaid updated this revision to Diff 279151.Jul 20 2020, 2:14 AM

This update makes minor adjustments before merge in light of final comments.

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJul 20 2020, 5:21 AM