This is an archive of the discontinued LLVM Phabricator instance.

[LLDB] Add support to resize SVE registers at run-time
ClosedPublic

Authored by omjavaid on Jun 30 2020, 4:14 AM.

Details

Summary

This patch builds on previously submitted SVE patches regarding expedited register set and per thread register infos. (D82853 D82855 and D82857)

We need to resize SVE register based on value received in expedited list. Also we need to resize SVE registers when we write vg register using register write vg command. The resize will result in a updated offset for all of fpr and sve register set. This offset will be configured in native register context by RegisterInfoInterface and will be updated by GDBRemoteProcess and GDBRemoteRegisterContext.

A follow up patch will provide a API test to verify this change.

Diff Detail

Event Timeline

omjavaid created this revision.Jun 30 2020, 4:14 AM

I'd like to also pull in Jason for this, since this is really the trickiest part of the whole patchset.

What this patch essentially does is bake in the knowledge of the arm sve registers and their various configurations into the lldb client (and the DynamicRegisterInfo class in particular). Before we go into the implementation details, we should answer the question whether we are ok with that.

I am personally fine with that (if it gets encapsulated a bit better), because we can't avoid knowing about the registers of some architecture for various other reasons (core file debugging, eh_frame parsing, stubs which don't support qRegisterInfo, etc.). In fact, I'd be very happy if this got us closer towards a state where the stub does not need to send eh_frame and other register numbers over. Also, hardcoding may be the only way to get reasonable performance out of this, since a fully generic solution of asking about the register information after every stop would likely be prohibitively slow (well.. I suppose we could come up with some way of letting the stub to notify us when the register configuration has changed).

I believe Jason is also ok some hardcoding, in principle. However,
a) I don't want to speak for him,
b) there's still a lot of details to be worked out.

For example, one of the ideas floated around previously was based on the client and server communicating 4 things about each register:

  • name
  • the register number (for p packets, etc.)
  • size
  • offset (for e.g. g packet)

The register name and number are fine, but the other two things are sort of variable with SVE. The register size was mainly there as a cross-check, and so we could conceivably just drop it from the list (or use the max value or some other placeholder for sve registers in particular). But that still leaves the question of the g offsets...

It might be interesting data point to look at how the sve registers are described in gdb's target.xml descriptions.

@labath
I am just back from holiday and was hoping for Jason to give his advice in this. I am wondering how we can move forward on this.
I have added a some information from GDB inline.

I'd like to also pull in Jason for this, since this is really the trickiest part of the whole patchset.

What this patch essentially does is bake in the knowledge of the arm sve registers and their various configurations into the lldb client (and the DynamicRegisterInfo class in particular). Before we go into the implementation details, we should answer the question whether we are ok with that.

I am personally fine with that (if it gets encapsulated a bit better), because we can't avoid knowing about the registers of some architecture for various other reasons (core file debugging, eh_frame parsing, stubs which don't support qRegisterInfo, etc.). In fact, I'd be very happy if this got us closer towards a state where the stub does not need to send eh_frame and other register numbers over. Also, hardcoding may be the only way to get reasonable performance out of this, since a fully generic solution of asking about the register information after every stop would likely be prohibitively slow (well.. I suppose we could come up with some way of letting the stub to notify us when the register configuration has changed).

I believe Jason is also ok some hardcoding, in principle. However,
a) I don't want to speak for him,
b) there's still a lot of details to be worked out.

For example, one of the ideas floated around previously was based on the client and server communicating 4 things about each register:

  • name
  • the register number (for p packets, etc.)
  • size
  • offset (for e.g. g packet)

The register name and number are fine, but the other two things are sort of variable with SVE. The register size was mainly there as a cross-check, and so we could conceivably just drop it from the list (or use the max value or some other placeholder for sve registers in particular). But that still leaves the question of the g offsets...

There are two different things we are looking at in this scenario:

  1. Exchange of register information via RegisterInfo or XML packet at inferior creation
  2. Register size/offset update mechanism during execution on every stop.

Reaching an optimal solution dropping extra information in both above cases is quite a challenge. I have been thinking about ways to drop offset and only way right now is to make LLDB register numbers act as sequence numbers and offset can be calculated based on the size of next in line register and previous value of offset. However challenge in this case (we discussed before during review of an earilier patch) was dynamic selection of register sets. Some register sets may be active some may not be active and it will make things quite tricky.
If we look at the eh_frame, dwarf reg numbers implementation in GDB its a similar mess there as well although its not kept inside register information but its being dealt with hardcoded in various areas of code. GDB also supports complex register type description as part of target XML which luckily we are not doing at the moment.

It might be interesting data point to look at how the sve registers are described in gdb's target.xml descriptions.

GDB uses dynamically generated target XMLs now which were previously static. Target XML is generated based on the value of scale, which is no of chunks of 128 Bits in a Z register.
https://sourceware.org/git?p=binutils-gdb.git;a=blob;f=gdb/features/aarch64-sve.c;h=e4361b07a4d47282c5515846896638cc4caf97f5;hb=HEAD#l25

Furthermore, GDB keeps a list of target descriptions for every combination of (vq, pauth) where pauth is flag telling if pointer authentication feature is supported while vq is vector length quad (multiples of 128bits).
https://sourceware.org/git?p=binutils-gdb.git;a=blob;f=gdb/aarch64-tdep.c;h=cbc7038dbba9472a12a3ae927bbb0937b10b2bdd;hb=HEAD#l3284

On gdbserver side similar to what is proposed for LLDB, GDB also exchanges vector length information using expedited registers list and updates target descriptions based on vg (vector granule register) value.
https://sourceware.org/git?p=binutils-gdb.git;a=blob;f=gdbserver/linux-aarch64-tdesc.cc;h=897fbb43bd28ddf44c69d4162dda43c2589b060f;hb=HEAD#l35

Hey all, got distracted while reading the patch and jotted a few notes, but I know that's not as much what Pavel wanted to discuss. We already have some target specific goop in these ProcessGDBRemote type classes for registers and such, this doesn't cause my great pain to see added.

The question of what information the stub provides to lldb at startup (through qRegisterInfo, or XML - to be honest, we should be pushing the XML approach everywhere now, we added qRegisterInfo back before it was as commonly used) - I agree with Pavel, the remote stub's register number, register name, register size, register offset is a good set, we should have everything else available from Target-specific knowledge in lldb. gdb started out with the requirement that the remote stub and gdb were in complete agreement about what registers were available, what offsets they were in the register context, all of it. We started lldb with the idea that the remote stub was the source of truth for everything about registers - their eh_frame and DWARF numberings, their formatting, their type, their register sets. But nearly all of this is Target/ABI specific details that lldb can *safely* know already, and that reduces the burden on remote stubs to provide a ton of different information. And the g/G packets just scare the willies out of me, they cause so many problems if there is a mis-coordination between the remote stub and lldb, I never want to stop the remote stub from providing those offsets.

Overall, this looks fine to me, fwiw. The SVE registers were always going to be a tricky thing to support, and I'm not surprised we're looking at changes like this.

lldb/source/Plugins/Process/gdb-remote/GDBRemoteCommunicationServerLLGS.cpp
512 ↗(On Diff #274411)

I'm not thrilled with GetRegisterSet(0) meaning "get the GPRs" but I already see that being done in GDBRemoteCommunicationServerLLGS.cpp and CommandObjectRegister.cpp does the same, so no reason to do anything there, we assume the first set of registers are the GPRs.

545 ↗(On Diff #274411)

Can we define this register in ARM64_DWARF_Registers.h and use the enum name instead of the # here?

lldb/source/Plugins/Process/gdb-remote/GDBRemoteRegisterContext.cpp
351

You also matched llvm::Triple::aarch64_be earlier, do you mean to do that here too?

352

Same with the enum suggestion earlier, I'd like to avoid encoding the AArch64 DWARF ABI reg numbers here.

740

Same dwarf const.

755

I'm probably not following this correctly, but isn't this going to shorten the register buffer m_reg_data to the end of the SVE registers in the buffer, right? If the goal here is to create a heap object, why not just copy the entire m_data_reg? If someone ever adds a register past the SVE's, this would truncate it away.

761

DWARF constant.

778

Another constant I'd like in the header file.

lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp
1770

DWARF constant, and do we need to handle aarch64_be?

omjavaid updated this revision to Diff 296811.Oct 7 2020, 3:16 PM

This is an update after addressing comments from @jasonmolenda

omjavaid added inline comments.Oct 7 2020, 3:22 PM
lldb/source/Plugins/Process/gdb-remote/GDBRemoteRegisterContext.cpp
755

Register data can expand on shrink based on vector length update of SVE register set. So heap size will change accordingly.
We need to preserve VG register and GPRs from current register data so we copy the whole data in case vector length update caused register data to expand. However we truncate data to new vector length in case vector length has shrunk. All register past VG registers are invalidated anyways.

HI @labath Do you have any comments or suggestions for this patch?

labath added a comment.Oct 8 2020, 9:07 AM

I haven't looked at the new version of the patch yet, partly because I'm busy (sorry), and partly because I'm not sure we have reached a consensus yet (or at least, I don't know what that consensus is).

Here's some questions/comments to what was earlier said, as I try to figure out what's going on.

Furthermore, GDB keeps a list of target descriptions for every combination of (vq, pauth) where pauth is flag telling if pointer authentication feature is supported while vq is vector length quad (multiples of 128bits).
https://sourceware.org/git?p=binutils-gdb.git;a=blob;f=gdb/aarch64-tdep.c;h=cbc7038dbba9472a12a3ae927bbb0937b10b2bdd;hb=HEAD#l3284

On gdbserver side similar to what is proposed for LLDB, GDB also exchanges vector length information using expedited registers list and updates target descriptions based on vg (vector granule register) value.
https://sourceware.org/git?p=binutils-gdb.git;a=blob;f=gdbserver/linux-aarch64-tdesc.cc;h=897fbb43bd28ddf44c69d4162dda43c2589b060f;hb=HEAD#l35

So you're saying that gdb will obtain the vg value from gdbserver and then lookup the proper target.xml description in the set of hardcoded xmls, is that right?

Does that mean that gdb will never send the qXfer:target.xml packet on aarch64? Or, it will send it, but then ignore/override the received data with the hardcoded xml ?

(about g packets...) they cause so many problems if there is a mis-coordination between the remote stub and lldb, I never want to stop the remote stub from providing those offsets.

I am not sure how to interpret this in the context of SVE registers. The stub cannot send the offsets of those, as their size/offset might change depending on their configuration.

Are we saying that we want the stub to send the offsets all registers _except_ SVE (and we hand-compute their offsets on the client)? That might also be tricky because we'd also need to change the offsets of registers that come _after_ the SVE regs.

Unless of course, we make sure SVE regs come last, but that imposes some constraints on future registers sets. I suppose we might be able to say that all variable-length or otherwise-funny registers must come last.

Another option would be to reserve enough space in the g packet to never have to move other registers due to the another register changing size. That would be nice, though it might pose a problem for stubs that want to be compatible with both gdb and lldb. Unless gdb is doing the same thing that is (is gdb doing the same thing?).

(about g packets...) they cause so many problems if there is a mis-coordination between the remote stub and lldb, I never want to stop the remote stub from providing those offsets.

I am not sure how to interpret this in the context of SVE registers. The stub cannot send the offsets of those, as their size/offset might change depending on their configuration.

Yeah, of course you're right, in the case of a dynamic register context like this, the g/G packets shouldn't be used (we should have GDBRemoteCommunicationClient::AvoidGPackets return true).

The jThreadsInfo may give us the full register context (in base10 because json lol) although debugserver only sends the GPRs in the jThreadsInfo response. If lldb needs to read the full register context, it would need to read them individually (and write them individually if saving the full register context). We could add qReadAllRegisters and QWriteAllRegisters which have a series of regnum:base16-regval-target-endian, or JSON versions of the same packets if the perf was important.

The original g/G packets were designed for little embedded systems where the stub had to be very small and dumb, and could just memcpy the payload in the packet into the register context & back out again. Any sensible design today would, at least, have some form of an array of regnum:regval to eliminate many of the problems.

Unless of course, we make sure SVE regs come last, but that imposes some constraints on future registers sets. I suppose we might be able to say that all variable-length or otherwise-funny registers must come last.

Yeah, Omair's patch currently assumes that the SVE regs come last already when they copy & truncate the registers context to heap. I fear that we'll get to armv12 and we'll be adding registers after the SVE and wonder why they're being truncated somewhere in lldb. :)

@omjavaid , what do you think about disabling g/G when we're working with SVE registers (GDBRemoteCommunicationClient::AvoidGPackets)? There are some gdb-remote stubs that can *only* read/write registers with g/G, but I think it's reasonable to say "you must implement p/P for a target with SVE", that's a generic packet shared by both lldb and gdb. We could add a more-modern g/G replacement packet, but no one would have that implemented, and if they were going to add anything, I'd have them implement p/P unless it's perf problems and they need a read-all-registers / write-all-registers packet that works with SVE.

(about g packets...) they cause so many problems if there is a mis-coordination between the remote stub and lldb, I never want to stop the remote stub from providing those offsets.

I am not sure how to interpret this in the context of SVE registers. The stub cannot send the offsets of those, as their size/offset might change depending on their configuration.

Yeah, of course you're right, in the case of a dynamic register context like this, the g/G packets shouldn't be used (we should have GDBRemoteCommunicationClient::AvoidGPackets return true).

The jThreadsInfo may give us the full register context (in base10 because json lol) although debugserver only sends the GPRs in the jThreadsInfo response. If lldb needs to read the full register context, it would need to read them individually (and write them individually if saving the full register context). We could add qReadAllRegisters and QWriteAllRegisters which have a series of regnum:base16-regval-target-endian, or JSON versions of the same packets if the perf was important.

The original g/G packets were designed for little embedded systems where the stub had to be very small and dumb, and could just memcpy the payload in the packet into the register context & back out again. Any sensible design today would, at least, have some form of an array of regnum:regval to eliminate many of the problems.

Unless of course, we make sure SVE regs come last, but that imposes some constraints on future registers sets. I suppose we might be able to say that all variable-length or otherwise-funny registers must come last.

Yeah, Omair's patch currently assumes that the SVE regs come last already when they copy & truncate the registers context to heap. I fear that we'll get to armv12 and we'll be adding registers after the SVE and wonder why they're being truncated somewhere in lldb. :)

@omjavaid , what do you think about disabling g/G when we're working with SVE registers (GDBRemoteCommunicationClient::AvoidGPackets)? There are some gdb-remote stubs that can *only* read/write registers with g/G, but I think it's reasonable to say "you must implement p/P for a target with SVE", that's a generic packet shared by both lldb and gdb. We could add a more-modern g/G replacement packet, but no one would have that implemented, and if they were going to add anything, I'd have them implement p/P unless it's perf problems and they need a read-all-registers / write-all-registers packet that works with SVE.

I did consider and second the idea of pulling out SVE registers from g/G packet specially for vector length that is more than 8 x (32 bytes) as it creates a huge sized g/G packet containing thousands of bytes. Historically, sending multiple p/P for all registers was an overhead due to slow communication between embedded or serially connected remote targets and host gdb/GDB. This is mostly not the case these days and we can come up with an alternate approach here as being discussed above.

However, There is some catch where we need to preserve all register data across expressions evaluation invocation and the need for offset correctness will still be needed as S,D,V and Z regs share same offset in case of SVE.

Also for futuristic point of view I am baking a patch that adds support for MTE and Pointer Authentication registers. As MT, Pauth and SVE are optional features, therefore registers for these features may or may not be available for a given target and will have to be configured dynamically when inferior is attached. So I was assuming similar adjustment in offsets of MTE/Pauth registers like the ones I have done for SVE registers. However MTE/Pauth registers do not change size dynamically and their offset change will only occur when they are present alongside SVE registers in a particular target.

My idea for all this implementation is to configure an initial offset of all dynamic register register sets at the time of inferior connection in RegisterInfoPOSIX_arm64::ConfigureVectorRegisterInfos where we put SVE register currently right after GPRs (GPR is the bare minimum that all Arm64 targets support) and any dynamically select-able register sets (like MTE, Pauth etc) can come before or after SVE registers.

labath added a comment.Oct 9 2020, 6:11 AM

The original g/G packets were designed for little embedded systems where the stub had to be very small and dumb, and could just memcpy the payload in the packet into the register context & back out again. Any sensible design today would, at least, have some form of an array of regnum:regval to eliminate many of the problems.

Unless of course, we make sure SVE regs come last, but that imposes some constraints on future registers sets. I suppose we might be able to say that all variable-length or otherwise-funny registers must come last.

Yeah, Omair's patch currently assumes that the SVE regs come last already when they copy & truncate the registers context to heap. I fear that we'll get to armv12 and we'll be adding registers after the SVE and wonder why they're being truncated somewhere in lldb. :)

Well.. we could design it such that the SVE registers (and any other dynamically-sized regs) always come last. Then they wouldn't impact the offsets of static registers.

I don't think we have a requirement that newer register sets must be appended. Or at least, we shouldn't have, as now both intel and arm have cpu features (and registers) that may be arbitrarily combined in future processors.

@omjavaid , what do you think about disabling g/G when we're working with SVE registers (GDBRemoteCommunicationClient::AvoidGPackets)? There are some gdb-remote stubs that can *only* read/write registers with g/G, but I think it's reasonable to say "you must implement p/P for a target with SVE", that's a generic packet shared by both lldb and gdb. We could add a more-modern g/G replacement packet, but no one would have that implemented, and if they were going to add anything, I'd have them implement p/P unless it's perf problems and they need a read-all-registers / write-all-registers packet that works with SVE.

I did consider and second the idea of pulling out SVE registers from g/G packet specially for vector length that is more than 8 x (32 bytes) as it creates a huge sized g/G packet containing thousands of bytes. Historically, sending multiple p/P for all registers was an overhead due to slow communication between embedded or serially connected remote targets and host gdb/GDB. This is mostly not the case these days

While it's certainly less important than it used to be, there are still people who care about these things. Just last year, we had a contribution which started using them more aggressively. The contributor has a use case for fetching all registers on each stop, which meant that our usual expedition rules were not sufficient and p packets were slow.

and we can come up with an alternate approach here as being discussed above.

That's true, but any custom solution is unlikely to be supported by stubs (qemu) which are primarily meant for communicating with gdb. And since we have people who want to use those stubs with lldb (and I am supporting some of them), if we do something custom, it's possible we may end up supporting both.

That's why I'd like to understand more about how gdb does things. The g packet definitely has its issues but so does introducing custom packets, and I think we should weigh those carefully.

However, There is some catch where we need to preserve all register data across expressions evaluation invocation and the need for offset correctness will still be needed as S,D,V and Z regs share same offset in case of SVE.

Is this about QSave/RestoreAllRegisters? If it is, then I don't think it's that very relevant here, because these saved registers don't make their way across the wire, and the fact that we save the registers in a buffer similar to the g packet is purely an implementation detail. We can change that any time we like and it does not require the client&server to be in sync. (Of course, if we do serialize everything into the g packet, then reusing that format for QSaveAllRegisters is handy, so it is still /slightly/ relavant).

The original g/G packets were designed for little embedded systems where the stub had to be very small and dumb, and could just memcpy the payload in the packet into the register context & back out again. Any sensible design today would, at least, have some form of an array of regnum:regval to eliminate many of the problems.

Unless of course, we make sure SVE regs come last, but that imposes some constraints on future registers sets. I suppose we might be able to say that all variable-length or otherwise-funny registers must come last.

Yeah, Omair's patch currently assumes that the SVE regs come last already when they copy & truncate the registers context to heap. I fear that we'll get to armv12 and we'll be adding registers after the SVE and wonder why they're being truncated somewhere in lldb. :)

Well.. we could design it such that the SVE registers (and any other dynamically-sized regs) always come last. Then they wouldn't impact the offsets of static registers.

I don't think we have a requirement that newer register sets must be appended. Or at least, we shouldn't have, as now both intel and arm have cpu features (and registers) that may be arbitrarily combined in future processors.

For now I am considering variable sized registers i-e SVE Z and P registers to always come last which is currently the case and will be in future patches which add support for Pointer Authentication and MTE registers.

@omjavaid , what do you think about disabling g/G when we're working with SVE registers (GDBRemoteCommunicationClient::AvoidGPackets)? There are some gdb-remote stubs that can *only* read/write registers with g/G, but I think it's reasonable to say "you must implement p/P for a target with SVE", that's a generic packet shared by both lldb and gdb. We could add a more-modern g/G replacement packet, but no one would have that implemented, and if they were going to add anything, I'd have them implement p/P unless it's perf problems and they need a read-all-registers / write-all-registers packet that works with SVE.

Just scrolling through GDB log one of the answers about target XML is that target XML is not exchanged for native connection where gdbserver is not being used. For gdbserver connection, target xml is exchanged onces per connection and register sizes are updated based on vg after that internally. However testing gdb with same executable as the one I wrote for LLDB was having gdb remote connection crash which could be something to do with SVE or some other bug.

I did consider and second the idea of pulling out SVE registers from g/G packet specially for vector length that is more than 8 x (32 bytes) as it creates a huge sized g/G packet containing thousands of bytes. Historically, sending multiple p/P for all registers was an overhead due to slow communication between embedded or serially connected remote targets and host gdb/GDB. This is mostly not the case these days

While it's certainly less important than it used to be, there are still people who care about these things. Just last year, we had a contribution which started using them more aggressively. The contributor has a use case for fetching all registers on each stop, which meant that our usual expedition rules were not sufficient and p packets were slow.

and we can come up with an alternate approach here as being discussed above.

That's true, but any custom solution is unlikely to be supported by stubs (qemu) which are primarily meant for communicating with gdb. And since we have people who want to use those stubs with lldb (and I am supporting some of them), if we do something custom, it's possible we may end up supporting both.

As far as I see when a stub like QEMU or OpenOCD goes on to add SVE registers support with dynamic resizing they will have to provide a basic target XML at startup plus the ability to change the size and offsets of SVE registers on their end plus communicate any vg update to LLDB so that LLDB client side can update its version of target register infos with new SVE register sizes and offset. Offsets should be relative to vg offset, that is Z register should start right after vg register and will sizes (8 bytes times vg value). In current implementation This will look like we have payload after VG register which has a size relative to current VG size.

That's why I'd like to understand more about how gdb does things. The g packet definitely has its issues but so does introducing custom packets, and I think we should weigh those carefully.

GDB is currently using g/G packet for SVE registers but I am not sure if the GDB implementation has been fully tested as described above.

However, There is some catch where we need to preserve all register data across expressions evaluation invocation and the need for offset correctness will still be needed as S,D,V and Z regs share same offset in case of SVE.

Is this about QSave/RestoreAllRegisters? If it is, then I don't think it's that very relevant here, because these saved registers don't make their way across the wire, and the fact that we save the registers in a buffer similar to the g packet is purely an implementation detail. We can change that any time we like and it does not require the client&server to be in sync. (Of course, if we do serialize everything into the g packet, then reusing that format for QSaveAllRegisters is handy, so it is still /slightly/ relavant).

The original g/G packets were designed for little embedded systems where the stub had to be very small and dumb, and could just memcpy the payload in the packet into the register context & back out again. Any sensible design today would, at least, have some form of an array of regnum:regval to eliminate many of the problems.

Unless of course, we make sure SVE regs come last, but that imposes some constraints on future registers sets. I suppose we might be able to say that all variable-length or otherwise-funny registers must come last.

Yeah, Omair's patch currently assumes that the SVE regs come last already when they copy & truncate the registers context to heap. I fear that we'll get to armv12 and we'll be adding registers after the SVE and wonder why they're being truncated somewhere in lldb. :)

Well.. we could design it such that the SVE registers (and any other dynamically-sized regs) always come last. Then they wouldn't impact the offsets of static registers.

I don't think we have a requirement that newer register sets must be appended. Or at least, we shouldn't have, as now both intel and arm have cpu features (and registers) that may be arbitrarily combined in future processors.

For now I am considering variable sized registers i-e SVE Z and P registers to always come last which is currently the case and will be in future patches which add support for Pointer Authentication and MTE registers.

@omjavaid , what do you think about disabling g/G when we're working with SVE registers (GDBRemoteCommunicationClient::AvoidGPackets)? There are some gdb-remote stubs that can *only* read/write registers with g/G, but I think it's reasonable to say "you must implement p/P for a target with SVE", that's a generic packet shared by both lldb and gdb. We could add a more-modern g/G replacement packet, but no one would have that implemented, and if they were going to add anything, I'd have them implement p/P unless it's perf problems and they need a read-all-registers / write-all-registers packet that works with SVE.

Just scrolling through GDB log one of the answers about target XML is that target XML is not exchanged for native connection where gdbserver is not being used. For gdbserver connection, target xml is exchanged onces per connection and register sizes are updated based on vg after that internally. However testing gdb with same executable as the one I wrote for LLDB was having gdb remote connection crash which could be something to do with SVE or some other bug.

Interesting.

I started drafting an email to the gdb mailing list to try to clarify this thing. For that, I was re-reading the gdb docs for the qXfer:target.xml packet, I realized that the gdb interpretation is mostly unambiguous about this thing:

  • regnum: The register’s number. If omitted, <default value algorithm...>. This register number is used to read or write the register; e.g. it is used in the remote p and P packets, and registers appear in the g and G packets in order of increasing register number.

I'm sure that whoever wrote this did not have variable-length registers in mind. However, it's interpretation for variable-length registers is pretty clear: take each register, and dump its bytes in sequence (no matter many of them are in the register at this particular time). The reason this is unambiguous is because gdb does not have the "offset" attribute of the "reg" element, and so the only way to find its 'g' offset is via this algorithm. And given your explanation about "switching" target.xml descriptions at runtime, I'm would expect this is what is happening within gdb.

The problem is that he "offset" attribute (added by lldb) makes things ambiguous, as it provides another way to find a register in the 'g' blob. So how about we "fix" that ambiguity by having lldb-server omit the "offset" attribute for variable-length registers?

Sending it over is actively harmful, as the value is very likely to be wrong, and I believe that in your current implementation these numbers are pretty much ignored by the client anyway. Lldb should already have code to automatically compute the regiser offsets when the "offset" attribute is missing (though it's possible it may not handle partially missing values correctly). This code would need to be extended to recompute those offsets whenever register sizes change (which is roughly what this patch does). We could also extend the parser with a check to ensure that all registers with explicit offsets come before offset-less registers. That would ensure that the register offsets dynamically computed by the client never overlap with any registers whose offsets are explicitly given by the server.

How does that sound?

The original g/G packets were designed for little embedded systems where the stub had to be very small and dumb, and could just memcpy the payload in the packet into the register context & back out again. Any sensible design today would, at least, have some form of an array of regnum:regval to eliminate many of the problems.

Unless of course, we make sure SVE regs come last, but that imposes some constraints on future registers sets. I suppose we might be able to say that all variable-length or otherwise-funny registers must come last.

Yeah, Omair's patch currently assumes that the SVE regs come last already when they copy & truncate the registers context to heap. I fear that we'll get to armv12 and we'll be adding registers after the SVE and wonder why they're being truncated somewhere in lldb. :)

Well.. we could design it such that the SVE registers (and any other dynamically-sized regs) always come last. Then they wouldn't impact the offsets of static registers.

I don't think we have a requirement that newer register sets must be appended. Or at least, we shouldn't have, as now both intel and arm have cpu features (and registers) that may be arbitrarily combined in future processors.

For now I am considering variable sized registers i-e SVE Z and P registers to always come last which is currently the case and will be in future patches which add support for Pointer Authentication and MTE registers.

@omjavaid , what do you think about disabling g/G when we're working with SVE registers (GDBRemoteCommunicationClient::AvoidGPackets)? There are some gdb-remote stubs that can *only* read/write registers with g/G, but I think it's reasonable to say "you must implement p/P for a target with SVE", that's a generic packet shared by both lldb and gdb. We could add a more-modern g/G replacement packet, but no one would have that implemented, and if they were going to add anything, I'd have them implement p/P unless it's perf problems and they need a read-all-registers / write-all-registers packet that works with SVE.

Just scrolling through GDB log one of the answers about target XML is that target XML is not exchanged for native connection where gdbserver is not being used. For gdbserver connection, target xml is exchanged onces per connection and register sizes are updated based on vg after that internally. However testing gdb with same executable as the one I wrote for LLDB was having gdb remote connection crash which could be something to do with SVE or some other bug.

Interesting.

I started drafting an email to the gdb mailing list to try to clarify this thing. For that, I was re-reading the gdb docs for the qXfer:target.xml packet, I realized that the gdb interpretation is mostly unambiguous about this thing:

  • regnum: The register’s number. If omitted, <default value algorithm...>. This register number is used to read or write the register; e.g. it is used in the remote p and P packets, and registers appear in the g and G packets in order of increasing register number.

I'm sure that whoever wrote this did not have variable-length registers in mind. However, it's interpretation for variable-length registers is pretty clear: take each register, and dump its bytes in sequence (no matter many of them are in the register at this particular time). The reason this is unambiguous is because gdb does not have the "offset" attribute of the "reg" element, and so the only way to find its 'g' offset is via this algorithm. And given your explanation about "switching" target.xml descriptions at runtime, I'm would expect this is what is happening within gdb.

The problem is that he "offset" attribute (added by lldb) makes things ambiguous, as it provides another way to find a register in the 'g' blob. So how about we "fix" that ambiguity by having lldb-server omit the "offset" attribute for variable-length registers?

Sending it over is actively harmful, as the value is very likely to be wrong, and I believe that in your current implementation these numbers are pretty much ignored by the client anyway. Lldb should already have code to automatically compute the regiser offsets when the "offset" attribute is missing (though it's possible it may not handle partially missing values correctly). This code would need to be extended to recompute those offsets whenever register sizes change (which is roughly what this patch does). We could also extend the parser with a check to ensure that all registers with explicit offsets come before offset-less registers. That would ensure that the register offsets dynamically computed by the client never overlap with any registers whose offsets are explicitly given by the server.

How does that sound?

I think the algorithm you are suggestion is quite workable. Summarizing our discussion above:

  1. registers should be sent over by lldb-server (via qRegisterInfo packet or target XML packet) in order of increasing register numbers
  2. If its a non-pseudo register that is not contained by any other register then its place is decided on the bases of its sequence via register number

This makes a register's name, size , type encoding and whether it is contained by any other register, the most important information while the remaining information including (dwarf and eh_frame reg numbers can be avoided somehow not sure exactly how??)
Also there are two fields that are sent over invalidate-regs and container-regs I think they are one and same thing. Moreover the encoding and format are also two difference field where I think we need only one of these.

So now comes the point where we want to decided how much information we safely can skip and do we want to skip that information as part of this patch blocking it or let it go ahead and make changes in a follow up.

For Arm/AArch64 I see no trouble sending no offset and calculating an offset locally when register infos are parsed. I dont think other architectures should have any trouble as far as process gdb-remote is concerned as most of these architecture must be following the same guidelines.

I started drafting an email to the gdb mailing list to try to clarify this thing. For that, I was re-reading the gdb docs for the qXfer:target.xml packet, I realized that the gdb interpretation is mostly unambiguous about this thing:

  • regnum: The register’s number. If omitted, <default value algorithm...>. This register number is used to read or write the register; e.g. it is used in the remote p and P packets, and registers appear in the g and G packets in order of increasing register number.

I'm sure that whoever wrote this did not have variable-length registers in mind. However, it's interpretation for variable-length registers is pretty clear: take each register, and dump its bytes in sequence (no matter many of them are in the register at this particular time). The reason this is unambiguous is because gdb does not have the "offset" attribute of the "reg" element, and so the only way to find its 'g' offset is via this algorithm. And given your explanation about "switching" target.xml descriptions at runtime, I'm would expect this is what is happening within gdb.

The problem is that he "offset" attribute (added by lldb) makes things ambiguous, as it provides another way to find a register in the 'g' blob. So how about we "fix" that ambiguity by having lldb-server omit the "offset" attribute for variable-length registers?

Sending it over is actively harmful, as the value is very likely to be wrong, and I believe that in your current implementation these numbers are pretty much ignored by the client anyway. Lldb should already have code to automatically compute the regiser offsets when the "offset" attribute is missing (though it's possible it may not handle partially missing values correctly). This code would need to be extended to recompute those offsets whenever register sizes change (which is roughly what this patch does). We could also extend the parser with a check to ensure that all registers with explicit offsets come before offset-less registers. That would ensure that the register offsets dynamically computed by the client never overlap with any registers whose offsets are explicitly given by the server.

How does that sound?

I think the algorithm you are suggestion is quite workable. Summarizing our discussion above:

  1. registers should be sent over by lldb-server (via qRegisterInfo packet or target XML packet) in order of increasing register numbers

Yep.

  1. If its a non-pseudo register that is not contained by any other register then its place is decided on the bases of its sequence via register number

I'm not sure if that's consistent with what gdb does, but if that is what we are doing right now, then I don't think we need to change that on account of this.

This makes a register's name, size , type encoding and whether it is contained by any other register, the most important information while the remaining information including (dwarf and eh_frame reg numbers can be avoided somehow not sure exactly how??)
Also there are two fields that are sent over invalidate-regs and container-regs I think they are one and same thing. Moreover the encoding and format are also two difference field where I think we need only one of these.

Yeah, that sounds like the direction we want to move in, but I don't think we have to do anything about this straight away as (IIUC) this is not blocking you -- we are able to put something meaningful into those fields of SVE registers, aren't we?

I don't think you should refactor the entire lldb register passing conventions -- I just want to make sure anything new we come up with is consistent with the general direction and does not needlessly diverge from gdb. The main new thing here is variable-sized registers -- solving that should be quite sufficient.

I started drafting an email to the gdb mailing list to try to clarify this thing. For that, I was re-reading the gdb docs for the qXfer:target.xml packet, I realized that the gdb interpretation is mostly unambiguous about this thing:

  • regnum: The register’s number. If omitted, <default value algorithm...>. This register number is used to read or write the register; e.g. it is used in the remote p and P packets, and registers appear in the g and G packets in order of increasing register number.

I'm sure that whoever wrote this did not have variable-length registers in mind. However, it's interpretation for variable-length registers is pretty clear: take each register, and dump its bytes in sequence (no matter many of them are in the register at this particular time). The reason this is unambiguous is because gdb does not have the "offset" attribute of the "reg" element, and so the only way to find its 'g' offset is via this algorithm. And given your explanation about "switching" target.xml descriptions at runtime, I'm would expect this is what is happening within gdb.

The problem is that he "offset" attribute (added by lldb) makes things ambiguous, as it provides another way to find a register in the 'g' blob. So how about we "fix" that ambiguity by having lldb-server omit the "offset" attribute for variable-length registers?

Sending it over is actively harmful, as the value is very likely to be wrong, and I believe that in your current implementation these numbers are pretty much ignored by the client anyway. Lldb should already have code to automatically compute the regiser offsets when the "offset" attribute is missing (though it's possible it may not handle partially missing values correctly). This code would need to be extended to recompute those offsets whenever register sizes change (which is roughly what this patch does). We could also extend the parser with a check to ensure that all registers with explicit offsets come before offset-less registers. That would ensure that the register offsets dynamically computed by the client never overlap with any registers whose offsets are explicitly given by the server.

How does that sound?

I think the algorithm you are suggestion is quite workable. Summarizing our discussion above:

  1. registers should be sent over by lldb-server (via qRegisterInfo packet or target XML packet) in order of increasing register numbers

Yep.

  1. If its a non-pseudo register that is not contained by any other register then its place is decided on the bases of its sequence via register number

I'm not sure if that's consistent with what gdb does, but if that is what we are doing right now, then I don't think we need to change that on account of this.

This makes a register's name, size , type encoding and whether it is contained by any other register, the most important information while the remaining information including (dwarf and eh_frame reg numbers can be avoided somehow not sure exactly how??)
Also there are two fields that are sent over invalidate-regs and container-regs I think they are one and same thing. Moreover the encoding and format are also two difference field where I think we need only one of these.

Yeah, that sounds like the direction we want to move in, but I don't think we have to do anything about this straight away as (IIUC) this is not blocking you -- we are able to put something meaningful into those fields of SVE registers, aren't we?

I don't think you should refactor the entire lldb register passing conventions -- I just want to make sure anything new we come up with is consistent with the general direction and does not needlessly diverge from gdb. The main new thing here is variable-sized registers -- solving that should be quite sufficient.

As far as no lldb-server stubs are concerned we are mostly dealingin with gdb-server, QEMU stub and OpenOCD stub. We have already discussed gdb-server and I have given a look to QEMU's implementation of SVE registers handling in its target xml creation. Looking at arm_gen_dynamic_svereg_xml in https://github.com/qemu/qemu/blob/master/target/arm/gdbstub.c

I think LLDB has already diverged from gdb in the sense that when we create register infos we depend on hard-coded offsets and register numbers. This is not the case with either gdb-server or QEMU stub.

On client side LLDB does recieve registers in sequence and also calculates a offset locally but then ignores it in the favor of the one which is being received from lldb-server. There needs to be an incrementat effort where we first get rid of the dependence on fixed offsets and register numbers in NativeProcess and LLGS code. Then fix the LLDB client to act accordingly in sync with all types of stub implementation taking guidance from gdb protocol.

As far as SVE is concerned I dont think current implementation will cause any problem as long as we can keep SVE registers to come last if not that then also adjust offests of any register that are past SVE registers. For now there is no registers past SVE and they are ones which are coming last in our sequence so we seem to be good for now.

@labath What do you think?

I'm afraid I don't understand what you mean. Could you try to be more specific?

It's true that currently lldb client uses the register offsets provided by the server (if they are present), and this is a divergence from pure gdb-remote. Were it not for variable-sized registers, I might even say that it is a good one. What the current patch does though is add another exception to the rule. This exception means that the offset provided by the server may not actually be the definitive value -- the client may choose to override it based on some built-in knowledge (and then assume the server has the same knowledge).

My proposal is to avoid this exception, and make it such that the server-provided offset is always definitive. And in cases where a definitive answer cannot be provided, the server will just not send the offset over.

I'm afraid I don't understand what you mean. Could you try to be more specific?

It's true that currently lldb client uses the register offsets provided by the server (if they are present), and this is a divergence from pure gdb-remote. Were it not for variable-sized registers, I might even say that it is a good one. What the current patch does though is add another exception to the rule. This exception means that the offset provided by the server may not actually be the definitive value -- the client may choose to override it based on some built-in knowledge (and then assume the server has the same knowledge).

My proposal is to avoid this exception, and make it such that the server-provided offset is always definitive. And in cases where a definitive answer cannot be provided, the server will just not send the offset over.

I agree with your thoughts about not sending offset, infact as far as AArch64 is concerned if we skip sending offset for SVE registers we can construct offset filed anyway by running UpdateARM64SVERegistersInfos in DynamicRegisterInfo::Finalize function.

In principle offsets are always definitive and sending explicit Offset is actually redundant information we actually dont need it as it can be calculated based on register size in the increasing order of register numbers. If we know variable sized SVE registers are always going to come after VG register we can always know the correct location of these registers in g packet. We also know FFR is the last SVE register and any data after FFR will correspond to any new registers that may be added after SVE. What we essentially need here is to sync register numbers and register sizes between client and server. The offset field in register info structure must be populated corrently on both client and server side for read/write of register data from its correct location in register data heap. In addition to that for variable sized registers we need a mechanism to update the size and offset fields of variable register and also the register coming after those registers.

But this is easier said than done though, first, for all architectures we need to see how offset field in their RegisterInfo structs is being formulated. If it is being populated based on the same formula of increasing order of register numbers then we can write a simple patch to skip offset field but if its not that way and random offsets are being used then we ll have to go on and make the changes in Register Infos structs of all those individual architectures making this change.

Secondly, this offset field also determines the location of value_regs without the offset field being explicitly sent across we cannot exactly be sure about the location of pseudo registers in g/G packet data. GDB's XML register description format does not have a provision for pseudo registers rather it has user defined register types what it calls composite types where a register can be vector, union or struct of various custom fields.

Finally, how about I post an update of this patch where lldb client side will ignore offset it receives for SVE register and incrementally places variable sized SVE registers after VG register. For any future registers we can choose an to put them before SVE registers or add a mechanism to update their offset if they are placed after SVE.

omjavaid updated this revision to Diff 302794.Nov 4 2020, 2:39 AM

This update tries to mitigate effects of fixed offset fields in LLDB register description. We have now replaced Arm64 register infos on lldb-server side to generate offsets which increase with respect to size in increasing order of register numbers.

This patch updates functions where we update offsets to do the same thing on client side whenever offsets are updated we update the whole register infos list adjusting any changes in sizes of Z and P registers in between.

On GDB side register sequence starts with GPR and then moves to SVE registers. GDB has composite types and does not use pseudo registers like lldb does. Although we do not support composite register types in LLDB for now but as far as g packet sequencing and resize is concerned our handling now does not depend on any LLDB specific register resize management. I have also updated use of dwarf register numbers and replaced them with register name because GDB/QEMU will not send dwarf register numbers as part of target XML description.

labath added a comment.Nov 4 2020, 2:54 AM

I'm sorry about the delay, I was having trouble finding the time to think this through...

I'm afraid I don't understand what you mean. Could you try to be more specific?

It's true that currently lldb client uses the register offsets provided by the server (if they are present), and this is a divergence from pure gdb-remote. Were it not for variable-sized registers, I might even say that it is a good one. What the current patch does though is add another exception to the rule. This exception means that the offset provided by the server may not actually be the definitive value -- the client may choose to override it based on some built-in knowledge (and then assume the server has the same knowledge).

My proposal is to avoid this exception, and make it such that the server-provided offset is always definitive. And in cases where a definitive answer cannot be provided, the server will just not send the offset over.

I agree with your thoughts about not sending offset, infact as far as AArch64 is concerned if we skip sending offset for SVE registers we can construct offset filed anyway by running UpdateARM64SVERegistersInfos in DynamicRegisterInfo::Finalize function.

In principle offsets are always definitive and sending explicit Offset is actually redundant information we actually dont need it as it can be calculated based on register size in the increasing order of register numbers. If we know variable sized SVE registers are always going to come after VG register we can always know the correct location of these registers in g packet. We also know FFR is the last SVE register and any data after FFR will correspond to any new registers that may be added after SVE. What we essentially need here is to sync register numbers and register sizes between client and server. The offset field in register info structure must be populated corrently on both client and server side for read/write of register data from its correct location in register data heap. In addition to that for variable sized registers we need a mechanism to update the size and offset fields of variable register and also the register coming after those registers.

Could we make it so that it's not necessary to update the register info offset information on the server size. And instead, compute them on the fly, when needed (and it's not explicitly specified)?
Something like:

Handle_g() {
  for (info: regs) {
    if (info.byte_offset == -1)
      offset = next_offset;
    else
      offset = info.byte_offset;
    memcpy(buffer+offset, reg_data, info.byte_size);
    next_offset = offset + info.byte_size
  }
  send(buffer);
}

Then qRegisterInfo (and target.xml) could just check for the -1 value and use that as a cue to skip sending the offset field? All current architectures always hard-code an offset, so this would be a no-op for them...

Secondly, this offset field also determines the location of value_regs without the offset field being explicitly sent across we cannot exactly be sure about the location of pseudo registers in g/G packet data. GDB's XML register description format does not have a provision for pseudo registers rather it has user defined register types what it calls composite types where a register can be vector, union or struct of various custom fields.

I'm not sure I understand that. Can you elaborate on the relevance of value_regs here?

Finally, how about I post an update of this patch where lldb client side will ignore offset it receives for SVE register and incrementally places variable sized SVE registers after VG register. For any future registers we can choose an to put them before SVE registers or add a mechanism to update their offset if they are placed after SVE.

I'd like to avoid sending bogus offsets, if possible. I'd rather that we teach the server to not send them, and then have the client use this as a cue that it needs to recompute something.

I'm sorry about the delay, I was having trouble finding the time to think this through...

I'm afraid I don't understand what you mean. Could you try to be more specific?

It's true that currently lldb client uses the register offsets provided by the server (if they are present), and this is a divergence from pure gdb-remote. Were it not for variable-sized registers, I might even say that it is a good one. What the current patch does though is add another exception to the rule. This exception means that the offset provided by the server may not actually be the definitive value -- the client may choose to override it based on some built-in knowledge (and then assume the server has the same knowledge).

My proposal is to avoid this exception, and make it such that the server-provided offset is always definitive. And in cases where a definitive answer cannot be provided, the server will just not send the offset over.

I agree with your thoughts about not sending offset, infact as far as AArch64 is concerned if we skip sending offset for SVE registers we can construct offset filed anyway by running UpdateARM64SVERegistersInfos in DynamicRegisterInfo::Finalize function.

In principle offsets are always definitive and sending explicit Offset is actually redundant information we actually dont need it as it can be calculated based on register size in the increasing order of register numbers. If we know variable sized SVE registers are always going to come after VG register we can always know the correct location of these registers in g packet. We also know FFR is the last SVE register and any data after FFR will correspond to any new registers that may be added after SVE. What we essentially need here is to sync register numbers and register sizes between client and server. The offset field in register info structure must be populated corrently on both client and server side for read/write of register data from its correct location in register data heap. In addition to that for variable sized registers we need a mechanism to update the size and offset fields of variable register and also the register coming after those registers.

Could we make it so that it's not necessary to update the register info offset information on the server size. And instead, compute them on the fly, when needed (and it's not explicitly specified)?
Something like:

Handle_g() {
  for (info: regs) {
    if (info.byte_offset == -1)
      offset = next_offset;
    else
      offset = info.byte_offset;
    memcpy(buffer+offset, reg_data, info.byte_size);
    next_offset = offset + info.byte_size
  }
  send(buffer);
}

Then qRegisterInfo (and target.xml) could just check for the -1 value and use that as a cue to skip sending the offset field? All current architectures always hard-code an offset, so this would be a no-op for them...

byte_offset field in RegisterInfo struct is also used elsewhere in code specially to copy data to-from cached register values in data heap buffer. Also on the server side this byte_offset field is used to calculate position of register data in ptrace buffer etc.
So we can not skip populating this field. But what we can do is that populate it with valid values replicating g/G packet positions which can be reconstructed using register sizes in increasing order of register numbers.

Secondly, this offset field also determines the location of value_regs without the offset field being explicitly sent across we cannot exactly be sure about the location of pseudo registers in g/G packet data. GDB's XML register description format does not have a provision for pseudo registers rather it has user defined register types what it calls composite types where a register can be vector, union or struct of various custom fields.

I'm not sure I understand that. Can you elaborate on the relevance of value_regs here?

So value regs are pseudo register for example V register in Arm64 SVE is a subset of first 16-bytes of a Z register. We do not include value registers in g packet but then need a way to tell where in g/G packet data a specific Vx pseudo register will be located. In my updated patch I update offsets of all primary registers GPR, Z, P, FFR etc and also I copy that offset into their corresponding value registers for example for Z0, same offset is copied to V0, D0, S0. This helps with quick fetching of data on both client and server side.

Finally, how about I post an update of this patch where lldb client side will ignore offset it receives for SVE register and incrementally places variable sized SVE registers after VG register. For any future registers we can choose an to put them before SVE registers or add a mechanism to update their offset if they are placed after SVE.

I'd like to avoid sending bogus offsets, if possible. I'd rather that we teach the server to not send them, and then have the client use this as a cue that it needs to recompute something.

I dont want to write -1 into offset field on server side because it is good helper for accessing data in register buffers but we can introduce a mechanism for all architectures supporting g/G packet we should not send the offset field in qRegisterInfo or target xml. The offset will be constructed by client itself and populated on client side register infos list for helping out with register data management.

labath added a comment.Nov 4 2020, 5:38 AM

[ I've replying to your message in a slightly different order that what it was written. ]

I'd like to avoid sending bogus offsets, if possible. I'd rather that we teach the server to not send them, and then have the client use this as a cue that it needs to recompute something.

I dont want to write -1 into offset field on server side because it is good helper for accessing data in register buffers but we can introduce a mechanism for all architectures supporting g/G packet we should not send the offset field in qRegisterInfo or target xml. The offset will be constructed by client itself and populated on client side register infos list for helping out with register data management.

Ok, I see where you're going now. If you can pull that off, I think that'd be great. Is that what the current patch does (I haven't looked at the today's update yet)?

Just in case that doesn't work for any reason, I'll elaborate on the idea I had in mind.

Then qRegisterInfo (and target.xml) could just check for the -1 value and use that as a cue to skip sending the offset field? All current architectures always hard-code an offset, so this would be a no-op for them...

byte_offset field in RegisterInfo struct is also used elsewhere in code specially to copy data to-from cached register values in data heap buffer. Also on the server side this byte_offset field is used to calculate position of register data in ptrace buffer etc.

The register context could keep relying on the byte_offset for all non-SVE registers. The SVE registers are pretty special anyway, so I was thinking part of their specialness could be that they get their offsets from some other place. Note that we already have the escape hatch called GetPtraceOffset, which was added because we could not reconcile the ptrace and g uses of the byte_offset fields on x86. So, in general, I think that separating these two things is not a bad direction to go in. For example, the bsd implementations already do a switch over the register number instead of reusing the byte_offset for ptrace purposes. (I think that code could be implemented in a more elegant way, but it is still consistent with that direction.) Your approach also kind of makes that possible (not completely, but it does move us slightly closer towards that), which is why I like it.

I'm not sure I understand that. Can you elaborate on the relevance of value_regs here?

So value regs are pseudo register for example V register in Arm64 SVE is a subset of first 16-bytes of a Z register. We do not include value registers in g packet but then need a way to tell where in g/G packet data a specific Vx pseudo register will be located. In my updated patch I update offsets of all primary registers GPR, Z, P, FFR etc and also I copy that offset into their corresponding value registers for example for Z0, same offset is copied to V0, D0, S0. This helps with quick fetching of data on both client and server side.

Ok, I see what you mean. So how will this be handled in your proposal? Will the client "know" that the V/D/S registers are located at particular positions within the Z registers ? Because I think the same solution could be applied to my idea (your proposal is basically a stricter (less flexible) version of what I had in mind -- it does not allow combining registers with offsets and offset-less regs (and that may be more flexibility than we need)).

I think that embedding this kind of knowledge into the client is fine. AFAICT, this is what gdb is doing anyway (I don't see e.g. eax/ax/ah/al being described in the x86_64 target.xml).

[ I've replying to your message in a slightly different order that what it was written. ]

I'd like to avoid sending bogus offsets, if possible. I'd rather that we teach the server to not send them, and then have the client use this as a cue that it needs to recompute something.

I dont want to write -1 into offset field on server side because it is good helper for accessing data in register buffers but we can introduce a mechanism for all architectures supporting g/G packet we should not send the offset field in qRegisterInfo or target xml. The offset will be constructed by client itself and populated on client side register infos list for helping out with register data management.

Ok, I see where you're going now. If you can pull that off, I think that'd be great. Is that what the current patch does (I haven't looked at the today's update yet)?

Just in case that doesn't work for any reason, I'll elaborate on the idea I had in mind.

Then qRegisterInfo (and target.xml) could just check for the -1 value and use that as a cue to skip sending the offset field? All current architectures always hard-code an offset, so this would be a no-op for them...

byte_offset field in RegisterInfo struct is also used elsewhere in code specially to copy data to-from cached register values in data heap buffer. Also on the server side this byte_offset field is used to calculate position of register data in ptrace buffer etc.

The register context could keep relying on the byte_offset for all non-SVE registers. The SVE registers are pretty special anyway, so I was thinking part of their specialness could be that they get their offsets from some other place. Note that we already have the escape hatch called GetPtraceOffset, which was added because we could not reconcile the ptrace and g uses of the byte_offset fields on x86. So, in general, I think that separating these two things is not a bad direction to go in. For example, the bsd implementations already do a switch over the register number instead of reusing the byte_offset for ptrace purposes. (I think that code could be implemented in a more elegant way, but it is still consistent with that direction.) Your approach also kind of makes that possible (not completely, but it does move us slightly closer towards that), which is why I like it.

I'm not sure I understand that. Can you elaborate on the relevance of value_regs here?

So value regs are pseudo register for example V register in Arm64 SVE is a subset of first 16-bytes of a Z register. We do not include value registers in g packet but then need a way to tell where in g/G packet data a specific Vx pseudo register will be located. In my updated patch I update offsets of all primary registers GPR, Z, P, FFR etc and also I copy that offset into their corresponding value registers for example for Z0, same offset is copied to V0, D0, S0. This helps with quick fetching of data on both client and server side.

Ok, I see what you mean. So how will this be handled in your proposal? Will the client "know" that the V/D/S registers are located at particular positions within the Z registers ? Because I think the same solution could be applied to my idea (your proposal is basically a stricter (less flexible) version of what I had in mind -- it does not allow combining registers with offsets and offset-less regs (and that may be more flexibility than we need)).

I think that embedding this kind of knowledge into the client is fine. AFAICT, this is what gdb is doing anyway (I don't see e.g. eax/ax/ah/al being described in the x86_64 target.xml).

GDB does not have pseudo registers as part of xml register description. GDB is managing pseudo registers on client side and independent of rest of the register set. In case of SVE Z registers, GDB describes a composite type kind of a union of (Z, V, D and S) which helps in giving registers a view in terms of V, S and D which are not actually register.

As invalidate_regs and value_regs are totally LLDB specific so we can put information about LLDB speicific pseudo registers in those lists which we actually do by the way. Taking advantage of that I have S, V and D registers in invalidate_reg list of a Z register which share the same starting offset as the corresponding Z register but are size restricted to 4, 8 or 16 bytes.

For the sake of clarity here is the scheme:

We have two types of registers:

  1. Primary registers

Includes 64 bit GPRs Xn regs, PC etc
Also includes V registers for all non-sve targets
Includes Z, P, FFR and VG for SVE targets.

All primary registers have value_regs list set to nullptr
All primary registers have invalidate_regs list which is a list of registers which share same starting offset as the corresponding primary registers.

  1. Pseudo registers

Includes 32 bit GPRs Wn regs
Includes D and S registers for all non SVE targets
Also includes V, D and S registers for all SVE targets

All pseudo register have a value register which can be found in value_regs list of that register.
All pseudo registers have invalidate_regs list which is a list of registers which share same starting offset as the corresponding primary registers.

On start of debug session we exchange qRegisterInfo or target XML packet registers on client side are numbered in sequence they are received in qRegisterInfo or their placement location in XML description. We receive register offset field populated in case we are talking to lldb-server. This offset will be ignored in case of AArch64 target with SVE and it will be recalculated in DynamicRegisterInfo::Finalize function.

Moreover, whenever we stop we get register VG in expedited registers list. we call GDBRemoteRegisterContext::AArch64SVEReconfigure function which will evaluate if we need to update offsets. In case VG has been updated whole register list will be traversed in sequence and new offsets will be assigned according to updated register sizes. All pseudo registers will share the offset of their primary register.

omjavaid updated this revision to Diff 303052.Nov 5 2020, 12:46 AM

This updates after discussion yesterday.

omjavaid updated this revision to Diff 303786.Nov 9 2020, 2:17 AM

This updates after remove invalidate_regs from primary registers D91057. Currently only SVE Z registers had invalidate_reg list populated which has been removed to send less information across while exchanging those registers.

GDB does not have pseudo registers as part of xml register description. GDB is managing pseudo registers on client side and independent of rest of the register set. In case of SVE Z registers, GDB describes a composite type kind of a union of (Z, V, D and S) which helps in giving registers a view in terms of V, S and D which are not actually register.

As invalidate_regs and value_regs are totally LLDB specific so we can put information about LLDB speicific pseudo registers in those lists which we actually do by the way. Taking advantage of that I have S, V and D registers in invalidate_reg list of a Z register which share the same starting offset as the corresponding Z register but are size restricted to 4, 8 or 16 bytes.

For the sake of clarity here is the scheme:

We have two types of registers:

  1. Primary registers

Includes 64 bit GPRs Xn regs, PC etc
Also includes V registers for all non-sve targets
Includes Z, P, FFR and VG for SVE targets.

All primary registers have value_regs list set to nullptr
All primary registers have invalidate_regs list which is a list of registers which share same starting offset as the corresponding primary registers.

  1. Pseudo registers

Includes 32 bit GPRs Wn regs
Includes D and S registers for all non SVE targets
Also includes V, D and S registers for all SVE targets

All pseudo register have a value register which can be found in value_regs list of that register.
All pseudo registers have invalidate_regs list which is a list of registers which share same starting offset as the corresponding primary registers.

On start of debug session we exchange qRegisterInfo or target XML packet registers on client side are numbered in sequence they are received in qRegisterInfo or their placement location in XML description. We receive register offset field populated in case we are talking to lldb-server.

Up to here, everything makes sense to me. Thanks for summarizing that.

This offset will be ignored in case of AArch64 target with SVE and it will be recalculated in DynamicRegisterInfo::Finalize function.

Didn't we say we were going to make lldb-server stop sending offsets in the SVE (or even aarch64 in general) case?

Moreover, whenever we stop we get register VG in expedited registers list. we call GDBRemoteRegisterContext::AArch64SVEReconfigure function which will evaluate if we need to update offsets. In case VG has been updated whole register list will be traversed in sequence and new offsets will be assigned according to updated register sizes.

Ideally, I'd want to structure this in a way so that it does not depend on whether lldb-server expedites any particular register. The idea is, that after every stop, the client would check the value of the VG register (and update stuff if it changed). If the register was expedited, then this access would be satisfied from the expedited cache. But if the server did not send it for any reason, (and this is structured correctly) the client should just transparently request the updated register value from the server.

lldb/source/Plugins/Process/Utility/DynamicRegisterInfo.cpp
543

We should already have code somewhere which sets the register offsets in case the server did not provide them. We should merge the two. Ideally, this wouldn't even require any special logic, as the server will just not send any offsets for SVE.

717

This fields is out of place here, as this class doesn't know anything about threads. I guess what it really means is "can the size of these registers be changed at runtime".

A natural consequence of this would then be that its users would need to keep a copy of each class for each thread...

lldb/source/Plugins/Process/Utility/RegisterInfoPOSIX_arm64.cpp.rej
1

?

lldb/source/Plugins/Process/gdb-remote/GDBRemoteRegisterContext.cpp
734

(void) is a c-ism.

lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp
1769–1770

arch.GetTriple().isAArch64() covers both.

omjavaid updated this revision to Diff 304726.Nov 11 2020, 11:24 PM

This update addresses comments about last revision. Also sending offset field in qRegisterInfo and XML register description has been disabled D91241

omjavaid updated this revision to Diff 308341.Nov 30 2020, 5:46 AM

Updated after re-base over D91241

labath added inline comments.Dec 10 2020, 1:31 AM
lldb/source/Plugins/Process/gdb-remote/GDBRemoteRegisterContext.cpp
755

How often do you expect the user to be changing the vector length? Could we just create a completely empty (invalid) buffer of the right size and force a re-fetch the next time the registers are accessed? With g packets, this should be a single round-trip anyway...

794–795

We already have a mostly generic algorithm for computing the initial offsets of registers. This code reintroduces the assumptions about the ordering of registers and similar.

It would be nice if this could reuse that offset computation code (after updating the register sizes). One possibility might be to stash the original register infos (with empty byte offsets) somewhere, and then reinitialize this register context from *that*.

omjavaid updated this revision to Diff 314327.Jan 4 2021, 12:53 AM

@labath I have incorporated your suggestions in this update. Invalidate all registers as SVE size update is not a high frequency occurrence. Also using same logic for offset calculation as we did DynamicRegisterInfo::Finalize.

Any further changes needed or this is good for commit?

Thanks for your patience. I think this is in a pretty good shape now. Just a couple of quick questions inline...

lldb/source/Plugins/Process/gdb-remote/GDBRemoteRegisterContext.cpp
411

Could we move the GetPrimordialRegister part into AArch64SVEReconfigure? The function looks up the "vg" number again anyway?

Actually, why is this even needed? Wouldn't the ReadRegisterAsUnsigned(vg_reg_num) call inside AArch64SVEReconfigure already handle the reading aspect?

753–754
auto reg_data_sp = std::make_shared<DataBufferHeap>(m_reg_info_sp->GetRegisterDataByteSize(), 0);

Or maybe even drop the local var and move directly into the SetData call.

756

Is Clear before SetData really needed?

omjavaid updated this revision to Diff 317036.Jan 15 2021, 11:43 AM

@labath incorporated your suggestion. Good for commit?

labath accepted this revision.Jan 18 2021, 11:02 PM

Looks good, thanks.

This revision is now accepted and ready to land.Jan 18 2021, 11:02 PM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJan 19 2021, 2:02 AM