This is an archive of the discontinued LLVM Phabricator instance.

Use remote regnums in expedited list, value regs and invalidate regs
ClosedPublic

Authored by omjavaid on Mar 30 2020, 2:26 AM.

Details

Summary

Native register descriptions in LLDB specify lldb register numbers in value_regs and invalidate_regs lists. These register numbers may not match with Process gdb-remote register numbers which are generated through native process by counting all registers in its register sets.

It was coincidentally not causing any problems as we never came across a native target with dynamically changing register sets and register numbers generated by counter matched with LLDB native register numbers. This came up while testing target AArch64 SVE which can choose register sets based on underlying hardware.

This patch fixes this behavior and tries to send lldb register number as extra information in registerinfo and targetxml packets. This patch also updates Read/Write RegisterBytes function of process gdb-remote to look for LLDB register numbers if they are available.

I have tested this on arm, aarch64, x86_64.

Diff Detail

Event Timeline

omjavaid created this revision.Mar 30 2020, 2:26 AM

I am still thinking this over, but for now I have two comments. First, could you re-upload the diff with full context (e.g. git show -U9999). That would make it a lot easier to review this.
Second, would it be possible to change the meaning of the invalidate_regs and value_regs lists so that they do the right thing even in your case (instead of introducing a new number)? We already have too many numbering schemes to begin with, and introducing a new one is definitely something I'd like to avoid (in particular when the number is stored as eRegisterKindLLDB on the server side, but then becomes eRegisterKindProcessPlugin on the other end).

omjavaid updated this revision to Diff 254444.Apr 2 2020, 1:24 AM

Posting full diff.

I am still thinking this over, but for now I have two comments. First, could you re-upload the diff with full context (e.g. git show -U9999). That would make it a lot easier to review this.
Second, would it be possible to change the meaning of the invalidate_regs and value_regs lists so that they do the right thing even in your case (instead of introducing a new number)? We already have too many numbering schemes to begin with, and introducing a new one is definitely something I'd like to avoid (in particular when the number is stored as eRegisterKindLLDB on the server side, but then becomes eRegisterKindProcessPlugin on the other end).

There has to be a unique register no for each register in a particular architecture and that can be single register no (eRegisterKindLLDB for both server and host) if we somehow stop the use of index in register info packet. Right now register info packet iterates over an index to get next register info and uses that as LLDB register no which is in theory wrong. We need to assign every register in a particular architecture a unique LLDB specific number and exchange that in register info or target xml packets. Right now if we change the order of iteration over register set or skip one of the register sets the register nos exchanged over target xml will also change. Thats why i felt the need for this patch and thought it would be the most backward compatible solution available without disturbing any existing functionality specially when various other architectures also depend on this.

There

labath added a comment.Apr 9 2020, 2:30 AM

Register infos in lldb are a mess. However lldb seems to be able to communicate (more or less successfully) with stub which know nothing about lldb or numbers it assigns to various registers. So it does not seem like there needs to be (or even _can_ be) one universal constant for a register in all situations.

IIUC, the problem is that on the server side the "invalidate" numbers are given in the form of the "lldb" register numbers, but on we don't send that number over -- instead lldb client makes it up (from the index).

Your patch fixes that by sending that number over explicitly. The thing I don't like about that is that it: a) increases the chance of things going wrong with non-lldb stubs; b) forks the code paths depending on whether one is talking to the old or new stub.

It seems to me like this could be solved in another way -- changing the meaning of the "invalidate" numbers to mean register indexes -- basically blessing the thing that the client is doing now. Since for the current targets the two numbers are the same, that should not make any functional difference, but it would make things work for SVE without forking anything or introducing new concepts. The translation could be done at the protocol level, just before sending the data over.

What do you think of that. Are there any downsides there?

omjavaid marked 2 inline comments as done.Apr 14 2020, 3:16 AM

Register infos in lldb are a mess. However lldb seems to be able to communicate (more or less successfully) with stub which know nothing about lldb or numbers it assigns to various registers. So it does not seem like there needs to be (or even _can_ be) one universal constant for a register in all situations.

IIUC, the problem is that on the server side the "invalidate" numbers are given in the form of the "lldb" register numbers, but on we don't send that number over -- instead lldb client makes it up (from the index).

Your patch fixes that by sending that number over explicitly. The thing I don't like about that is that it: a) increases the chance of things going wrong with non-lldb stubs; b) forks the code paths depending on whether one is talking to the old or new stub.

It seems to me like this could be solved in another way -- changing the meaning of the "invalidate" numbers to mean register indexes -- basically blessing the thing that the client is doing now. Since for the current targets the two numbers are the same, that should not make any functional difference, but it would make things work for SVE without forking anything or introducing new concepts. The translation could be done at the protocol level, just before sending the data over.

What do you think of that. Are there any downsides there?

For current implementation I dont think it will break any stubs because newly introduced regnum is totally optional. If regnum is not provided then mocked up register index is used and things will work as they were. It will will only trigger for lldbserver native register context which will have an implementation for NativeRegisterContextRegisterInfo::GetNativeRegisterIndex

Default is just return the same index:

uint32_t NativeRegisterContextRegisterInfo::GetNativeRegisterIndex(

  uint32_t reg_index) const {
return reg_index;

}

Downside to invalidate registers is
It is not intended to store an id for the register it belongs to rather its store a register list for register numbers affected by the containing register.
I guess that is only used by x86 for now and I used it in the other patch AArch64 reginfo patch when I found some strange behavior or register not being updated after write with QEMU.

On a different note LLDB now supports sending over target xml packet and there it has to send a register number along with register name and everything else. Most of stubs like QEMU, OpenOCD, etc use target xml for register description exchange and we may consider giving up qRegisterInfo in favor of target xml in future.

lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp
487

Here regnum is init to mocked up register index.

546

Here we update register number if it is provided and that too in most cases will be equal to the mocked up register number as long as remote does not implement NativeRegisterContextRegisterInfo::GetNativeRegisterIndex

For current implementation I dont think it will break any stubs because newly introduced regnum is totally optional. If regnum is not provided then mocked up register index is used and things will work as they were.

I believe it is optional right now, but that can easily change in the future. That's why I am looking into whether it's possible to make this work without introducing the new field. If all code takes the same path, its much less likely something will break.

On a different note LLDB now supports sending over target xml packet and there it has to send a register number along with register name and everything else. Most of stubs like QEMU, OpenOCD, etc use target xml for register description exchange and we may consider giving up qRegisterInfo in favor of target xml in future.

Lldb is able to use target.xml too. However, relying on as the sole source of register numbers would effectively make libxml a mandatory dependency. Maybe that would be possible, but that's a topic for a wider discussion, as some platforms (windows) don't have libxml easily available. Anyway, falling back to a different format is not what worries me that much -- I'm more concerned if we diverge on the content we send using those formats from what the other stubs send.

@labath

I have considered alternatives which can be used to avoid this patch altogether.

We are creating dynamic register infos vector for SVE in D77047. This is needed because we have update register size and offset of SVE registers.

The reason this problem of register indexes not matching with native register number happens because we place SVE register set after debug registers in register infos list. Linux does not use debug register and thus we have to jump over those register indexes which creates a mismatch in value reg list of SVE Z registers. see here https://github.com/llvm-mirror/lldb/blob/master/source/Plugins/Process/Utility/RegisterInfos_arm64.h#L719

To solve this problem:

  1. First option we have is that also update/correct the value_regs and LLDB register no while sharing these register infos to correct register indexes. This is more like a hack but should work as long as we correctly keep track of LLDB register nos that we assign to SVE register. Register nos are mostly considered constant and expected to remain fixed so it will be ugly to change them specially when they might also be used by other parts of code.
  1. Second option is to avoid the use of debug registers altogether in SVE register infos list which will mean that sve register nos may overlap with debug registers and in future if some other target wants to implement sve register access then will have to correct this and push debug register after SVE registers in the list.
  1. and final option is what I have already implemented in this patch.

What do you think?

Where does the option which I proposed fit in? It sounds like it would be something like 1a), where we don't actually modify the "invalidate" numbers stored within lldb-server, but we just do the conversion at the protocol level. That way the "invalidate" numbers keep their meaning as "lldb" register numbers in lldb-server, just like they used to. They also keep referring to "lldb" register numbers on the client side, because the client puts the register "index" into the "lldb" field. The only inconsistency is that the "lldb" register numbers assigned to a specific register will be different on the client and server side. However:

  • this is not different from what happens in your patch, because you put the "server lldb" number into the "process plugin" field on the client
  • it's not something we should actually rely on if we want to be able to communicate with non-matching server stubs

It seems to me like this solution has the same downsides as the current patch, while having the upside of not requiring protocol changes. That means it can be revisited if we find it to be a problem, while a protocol change is more permanent. That's why I'd like to get a good understanding of why it won't work (or be ugly) if we're not going to do it, and so far I haven't gotten that.

(BTW, if other aarch64 targets (which ones?) are reporting debug registers in their register lists, I don't think it would be unreasonable to do the same on linux)

Where does the option which I proposed fit in? It sounds like it would be something like 1a), where we don't actually modify the "invalidate" numbers stored within lldb-server, but we just do the conversion at the protocol level. That way the "invalidate" numbers keep their meaning as "lldb" register numbers in lldb-server, just like they used to. They also keep referring to "lldb" register numbers on the client side, because the client puts the register "index" into the "lldb" field. The only inconsistency is that the "lldb" register numbers assigned to a specific register will be different on the client and server side. However:

  • this is not different from what happens in your patch, because you put the "server lldb" number into the "process plugin" field on the client
  • it's not something we should actually rely on if we want to be able to communicate with non-matching server stubs

As far as non-matching stubs are concerned lldb qRegisterInfo handler on host only looks for eRegisterKindEHFrame, eRegisterKindGeneric, and eRegisterKindDWARF. The remaining two fields of register kinds list i-e eRegisterKindProcessPlugin and eRegisterKindLLDB are assigned same value of index by host. It is interesting to note here that when LLDB goes on to access any register on remote side by sending a 'g/G' or 'p/P' packets it makes use of eRegisterKindProcessPlugin although both eRegisterKindLLDB and eRegisterKindProcessPlugin are assigned the same value. So this patch actually gives a remote stub or process plugin an option to send its own regnum if it has to otherwise index will be used as the default behavior. I think eRegisterKindProcessPlugin is there for this purpose in case a stub wants to provide its own register number it can do so but we are actually not allowing any provision of doing so in qRegisterInfo packet while this provision is available in target xml packet and we use it while parsing target xml.

It seems to me like this solution has the same downsides as the current patch, while having the upside of not requiring protocol changes. That means it can be revisited if we find it to be a problem, while a protocol change is more permanent. That's why I'd like to get a good understanding of why it won't work (or be ugly) if we're not going to do it, and so far I haven't gotten that.

options 1) is similar to what you have proposed whereby we update value_regs and invalidate_regs nos to mean indices before first qRegisterinfo. What this actually means is that register number of all concerned registers in register info list will be updated (All SVE register will have an updated register no). These registers numbers are also used by instruction emulation by accessing the register infos array which for now does pose any problem for now.

(BTW, if other aarch64 targets (which ones?) are reporting debug registers in their register lists, I don't think it would be unreasonable to do the same on linux)

Linux does not expose AArch64 debug register via ptrace interface and they are mostly needed for manipulation of debug features like hw watchpoints/breakpoints. On Linux PTrace provides interface for installing hw watch/breakpoints but no access for debug regs. The reason i proposed moving debug register down is that they have the least chance of ever being used elsewhere in LLDB code.

Also apart from SVE registers in another patch we ll also be introducing AArch64 Pointer Authentication registers access support. Those register will also have to be moved up in register infos array because and they may or may not be available for given target and we ll have to choose that by querying target after inferior is spawned. Future features and the features being optional with their registers being available or not is the reason I choose to write this patch rather than taking the options under discussion.

Thanks for elaborating. Responses inline.

Where does the option which I proposed fit in? It sounds like it would be something like 1a), where we don't actually modify the "invalidate" numbers stored within lldb-server, but we just do the conversion at the protocol level. That way the "invalidate" numbers keep their meaning as "lldb" register numbers in lldb-server, just like they used to. They also keep referring to "lldb" register numbers on the client side, because the client puts the register "index" into the "lldb" field. The only inconsistency is that the "lldb" register numbers assigned to a specific register will be different on the client and server side. However:

  • this is not different from what happens in your patch, because you put the "server lldb" number into the "process plugin" field on the client
  • it's not something we should actually rely on if we want to be able to communicate with non-matching server stubs

As far as non-matching stubs are concerned lldb qRegisterInfo handler on host only looks for eRegisterKindEHFrame, eRegisterKindGeneric, and eRegisterKindDWARF. The remaining two fields of register kinds list i-e eRegisterKindProcessPlugin and eRegisterKindLLDB are assigned same value of index by host. It is interesting to note here that when LLDB goes on to access any register on remote side by sending a 'g/G' or 'p/P' packets it makes use of eRegisterKindProcessPlugin although both eRegisterKindLLDB and eRegisterKindProcessPlugin are assigned the same value. So this patch actually gives a remote stub or process plugin an option to send its own regnum if it has to otherwise index will be used as the default behavior. I think eRegisterKindProcessPlugin is there for this purpose in case a stub wants to provide its own register number it can do so but we are actually not allowing any provision of doing so in qRegisterInfo packet while this provision is available in target xml packet and we use it while parsing target xml.

I'm not convinced that this extra flexibility is needed for qRegisterInfo. I mean, there's already a register number present in the qRegisterInfo query. If the stub can look up the appropriate register info when responding to qRegisterInfoXY, it should also be able to find the register when it receives a p packet. If the stub wants/needs to internally use different register numbers for any reason, it can remap the numbers internally any way it wants.

The situations gets trickier when it comes to target.xml. While we could assign register numbers sequentially (and this is what happens if regnum is not present), that could quickly get confusing, particularly if the xml description is spread over multiple files. That's why I think it makes sense to have such field in the target.xml. Another reason is that this field (unlike the invalidate_regnums or ehframe_regnum fields) is actually documented in the gdb spec for this packet.

It seems to me like this solution has the same downsides as the current patch, while having the upside of not requiring protocol changes. That means it can be revisited if we find it to be a problem, while a protocol change is more permanent. That's why I'd like to get a good understanding of why it won't work (or be ugly) if we're not going to do it, and so far I haven't gotten that.

options 1) is similar to what you have proposed whereby we update value_regs and invalidate_regs nos to mean indices before first qRegisterinfo. What this actually means is that register number of all concerned registers in register info list will be updated (All SVE register will have an updated register no). These registers numbers are also used by instruction emulation by accessing the register infos array which for now does pose any problem for now.

The question of instruction emulation is more interesting (and I'd really like to hear @jasonmolenda's take on all of this). I presume you mean the use of intstruction emulation on client side, right? (we shouldn't need instruction emulation in lldb-server on aarch64, and in any case, there it can use any register number it wishes).

I believe the only use of instruction emulation in the client right now is to compute the unwind plans. For that, you're right that the numbers of SVE registers don't matter much, as for this use case we are mostly interested in PC, SP, FP, etc., and those won't change. However, the dependence of that component on the "lldb" register numbers does make this approach smell. The question is what to do about that. One possibility is to make lldb-server (and all stubs which want to interact with lldb) send over the "lldb" register numbers. Another would be to make instruction emulation (and lldb in general) less dependant on the information received from the stub. I believe the current opinion (see e.g. http://lists.llvm.org/pipermail/lldb-commits/Week-of-Mon-20190218/048008.html) is that we should do the latter. There hasn't been much work on that front since then as this is never a priority, but I have started some refactors which I hope will be a step towards that (and you might be able to help by reviewing D75607).

Since there are no immediate downsides to instruction emulation from not sending the lldb register numbers I'd say we should go with the flow of requiring less information from the stub rather than against it.

(BTW, if other aarch64 targets (which ones?) are reporting debug registers in their register lists, I don't think it would be unreasonable to do the same on linux)

Linux does not expose AArch64 debug register via ptrace interface and they are mostly needed for manipulation of debug features like hw watchpoints/breakpoints. On Linux PTrace provides interface for installing hw watch/breakpoints but no access for debug regs. The reason i proposed moving debug register down is that they have the least chance of ever being used elsewhere in LLDB code.

Right but you could in theory get a fairly good approximation debug register value (the only thing missing would be some super-user bits masked by the kernel) through the NT_ARM_HW_WATCH/BREAK interface. I think I might find that useful if I was starting to doubt what my debugger is doing and wanted to verify that. However, I don't think we should do that just to "fix" this problem...

And FWIW, I don't have a problem with moving the debug registers down in the list....

Also apart from SVE registers in another patch we ll also be introducing AArch64 Pointer Authentication registers access support. Those register will also have to be moved up in register infos array because and they may or may not be available for given target and we ll have to choose that by querying target after inferior is spawned. Future features and the features being optional with their registers being available or not is the reason I choose to write this patch rather than taking the options under discussion.

Well, I hope that if we come up with a reasonable way to remap register numbers, then there should be no problem regardless of how many optional register sets we have.

omjavaid updated this revision to Diff 263107.May 10 2020, 11:44 PM

@labath as per your suggestion I have implemented a solution where we fixup register index before sending them to the host in xml or registerinfos packet. Also two new helper functions are added which can be overriden by register context if there is a difference between user register index (Register index calculated by iteration) or reg infos register index (actual index into register infos array).
Still stub selected regnum can be supplied using target XML packet and to accommodate that case there is also a fix that remains from the old patch which forces LLDB to use eRegisterKindProcessPlugin while reading/writing value_regs in order to make sure LLDB always uses correct register number for the case of target XML register infos.

The invalidate_regs part looks as I would expect. I think it ought to be implemented a bit differently, but that can wait until the bigger issue is resolved.

The bigger issue being the fact that this patch now renumbers all numbers going in/out of the stub (qRegisterInfo, p,P, etc.). It seems I got more than what I bargained for there, though in retrospect, that does not seem totally unexpected, as you were mentioning the problem of the extra dbg registers in the middle of the register list. The part which I didn't get is that this does not only apply to the "lldb" register list, but that these registers also make their way to NativeRegisterContextLinux_arm64::GetRegisterInfoAtIndex. Currently that class was culling them (in the same way as other register contexts do for "optional" registers) by returning a smaller value via GetUserRegisterCount. That is, of course, not possible if you have other registers in the list that you want to make available.

So, IIUC, what the current patch does is expose the registers through GetRegisterInfoAtIndex, but then ensures that the UserRegIndexToRegInfosIndex conversion skips over them. That wouldn't be too bad were it not for the fact that accessing the register "the right way" is very hard now.

So here's my current doubt. Basically, the main thing this patch does is "hide" the uninteresting (debug) registers from the client. However, those registers still remain exposed through the NativeRegisterContext<->GDBRemoteCommunicationServerLLGS boundary via the GetRegisterInfoAtIndex function. What if we took this one step further, and made is so that the debug registers are not exposed from NativeRegisterContextLinux_arm64 at all?

One way to achieve that would be to handle the renumbering inside GetRegisterInfoAtIndex. Another would be to ensure that these registers don't even appear in the RegisterInfo array that backs this function (then the current implementation of GetRegisterInfoAtIndex would "just work"). All other things being equal, I would prefer the second approach. It seems like that could be easily achieved as you're defining a custom RegisterInfo array for SVE anyway (g_register_infos_arm64_sve_le in D77047). We could just exclude dbg registers from that list.

That would still leave us with the question of what does invalidate_regs refer to. If it still makes sense for it to refer to "lldb" register numbers, then we can remap it as we discussed previously. But it may turn out that after no remapping is needed -- in which case, even better.

omjavaid marked 2 inline comments as done.May 13 2020, 12:40 AM

The invalidate_regs part looks as I would expect. I think it ought to be implemented a bit differently, but that can wait until the bigger issue is resolved.

The bigger issue being the fact that this patch now renumbers all numbers going in/out of the stub (qRegisterInfo, p,P, etc.). It seems I got more than what I bargained for there, though in retrospect, that does not seem totally unexpected, as you were mentioning the problem of the extra dbg registers in the middle of the register list. The part which I didn't get is that this does not only apply to the "lldb" register list, but that these registers also make their way to NativeRegisterContextLinux_arm64::GetRegisterInfoAtIndex. Currently that class was culling them (in the same way as other register contexts do for "optional" registers) by returning a smaller value via GetUserRegisterCount. That is, of course, not possible if you have other registers in the list that you want to make available.

I have tried other solution like overriding GetRegisterInfoAtIndex for NativeRegisterContextLinux_arm64 and doing all this register number magic their but then this culminated into lots of changes related to register sets and the currently presented solution looked cleaner than others.
In case of all registers other than SVE registers UserRegIndexToRegInfosIndex is going to do nothing but return the input as ouput. So using this approach doesnt really impact anything else other than AArch64+SVE.

So, IIUC, what the current patch does is expose the registers through GetRegisterInfoAtIndex, but then ensures that the UserRegIndexToRegInfosIndex conversion skips over them. That wouldn't be too bad were it not for the fact that accessing the register "the right way" is very hard now.

The problem we have is that qRegisterInfo packets uses its own register numbering scheme and we dont have mechanism to push those register numbers through to registerinfos list. lldb register numbers are actually indices and not register number so to speak.

So here's my current doubt. Basically, the main thing this patch does is "hide" the uninteresting (debug) registers from the client. However, those registers still remain exposed through the NativeRegisterContext<->GDBRemoteCommunicationServerLLGS boundary via the GetRegisterInfoAtIndex function. What if we took this one step further, and made is so that the debug registers are not exposed from NativeRegisterContextLinux_arm64 at all?

The only problem/hurdle we have here is the way we construct our register sets from the indices of registerinfos array which is static. This will require some reshuffling of register sets a revamp of how sets are created in the register context. Any solution that requires changes to register sets will be more extensive than this.

One way to achieve that would be to handle the renumbering inside GetRegisterInfoAtIndex. Another would be to ensure that these registers don't even appear in the RegisterInfo array that backs this function (then the current implementation of GetRegisterInfoAtIndex would "just work"). All other things being equal, I would prefer the second approach. It seems like that could be easily achieved as you're defining a custom RegisterInfo array for SVE anyway (g_register_infos_arm64_sve_le in D77047). We could just exclude dbg registers from that list.

This solution requires quite a lot of register numbering magic in RegisterInfos_arm64 and I was reluctant to do that as they are used all over the code with a consideration of a static array with lldb_reg_num as indices.

That would still leave us with the question of what does invalidate_regs refer to. If it still makes sense for it to refer to "lldb" register numbers, then we can remap it as we discussed previously. But it may turn out that after no remapping is needed -- in which case, even better.

lldb/source/Plugins/Process/gdb-remote/GDBRemoteCommunicationServerLLGS.cpp
467 ↗(On Diff #263107)

Just to explain this call to RegInfosIndexToUserRegIndex: It is more relevant for host's needs to have correct register number living in value_regs list in order to access the correct parent for all the pseudo registers.

As far as invalidate_regs are concerned they actually do not matter much as far as process gdb remote + aarch64 is concerned. Mainly because if we write a pseudo reg we ll actually be writing its parent from value_regs list. This parent will automatically be invalidated by the call to writeregister. When any pseudo register having the same value_regs register is read it will force an invalidation.

So as far as i can see aarch64/linux dont really need invalidate regs list and at one point i thought about removing them but then kept them as it is in case some other transport other than gdb-remote might wanna use it.

1809 ↗(On Diff #263107)

GetRegisterSetNameForRegisterAtIndex also needs modifications if we go for the solution you are talking about

omjavaid updated this revision to Diff 263937.May 14 2020, 1:34 AM

This updated diff reduces impact of register numbering correction on LLGS code by removing UserRegIndexToRegInfosIndex out of the code and instead overriding GetRegisterInfoAtIndex in NativeRegisterContextLinux_arm64.

Sorry, about the delay. I had to find some time to dig into the code again, to understand all of the interactions.

It seems that we (lldb) have painted ourselves in the corner somewhat. The thing I did not realize until now is that the "lldb" scheme is documented to be the same as the index into the register info array. (I believed it was used that way, but I didn't expect this level of intent in it.) The relationship between qRegisterInfo and p register numbers is not mentioned explicitly anywhere, but reading the documentation of qRegisterInfo, I do get the impression that it was intended for them to match.

If we take these two things as a given, we are left with a bunch of sub-optimal alternatives, most of which we have tried already. Basically, once we start having targets with multiple optional register sets, the only way to reconcile these two requirements is to introduce an extra renumbering _somewhere_. The only reason we got away with this so far is because register sets tend to be additive -- the presence of one register set usually implies the presence of all "older" sets as well. And the register context interfaces offers us a very crude tool (GetUserRegisterCount) to weed out any weird registers, by placing them last. This isn't going to be a maintainable long term state, as today's architectures are starting to have a lot of different "optional" registers. (Intel is not promising that all cpus will always have the new MPX registers, for example).

I do believe something will need to be done about that. And I hope that something does not involve adding a new numbering scheme, because I believe we have too many of those already. I think the best solution would be to drop the "lldb regnum == reginfo index" requirement. It's redundant because the index can be computed by subtracting the RegisterInfo pointer from the beginning of the reginfo vector, and it enforces a linear order on the register descriptions (or forces the usage of complicated renumbering schemes). Without that restriction, it would be a simple thing for NativeRegisterContexts to only expose the registers they really want to expose, and there wouldn't be a need for renumberings at the gdb-remote level. I've done some experimenting with that, and while it required changes in a lot if places, the changes were not too complicated, and I did not run into any fundamental problems. Most of the time, the changes actually ended up simplifying the code.

Now, as for your patch set, I don't think that it should be on you to implement all of this, though I would strongly encourage it, and promise to help along. I think this last version is something that we could go with, even though it's definitely still less than ideal. But before we go through with that, let me circle back to one of the previous ideas. A bunch of revisions ago, you mentioned the possibility of inserting the sve registers before the watch/breakpoint registers in the register info vector (and the lldb enumeration).

Do we know of any reason why that would not work? After playing around with this code (for far too long), I've come to believe that this should work right now. It won't solve any of the underlying problems, but it should unblock this patch set, and it would match what was e.g. done when we added the intel mpx registers (D24255). While it won't work for targets that wish to expose debug registers, I'm pretty sure we don't have any targets like that right now. Based on everything I've learned so far, I believe that change should only impact servers who actually depend on the the registers embedded in the source code (and not on lldb client communicating with them). Right now, only lldb-server depends on these, and it does not expose the debug registers...

Sorry, about the delay. I had to find some time to dig into the code again, to understand all of the interactions.

It seems that we (lldb) have painted ourselves in the corner somewhat. The thing I did not realize until now is that the "lldb" scheme is documented to be the same as the index into the register info array. (I believed it was used that way, but I didn't expect this level of intent in it.) The relationship between qRegisterInfo and p register numbers is not mentioned explicitly anywhere, but reading the documentation of qRegisterInfo, I do get the impression that it was intended for them to match.

If we take these two things as a given, we are left with a bunch of sub-optimal alternatives, most of which we have tried already. Basically, once we start having targets with multiple optional register sets, the only way to reconcile these two requirements is to introduce an extra renumbering _somewhere_. The only reason we got away with this so far is because register sets tend to be additive -- the presence of one register set usually implies the presence of all "older" sets as well. And the register context interfaces offers us a very crude tool (GetUserRegisterCount) to weed out any weird registers, by placing them last. This isn't going to be a maintainable long term state, as today's architectures are starting to have a lot of different "optional" registers. (Intel is not promising that all cpus will always have the new MPX registers, for example).

I do believe something will need to be done about that. And I hope that something does not involve adding a new numbering scheme, because I believe we have too many of those already. I think the best solution would be to drop the "lldb regnum == reginfo index" requirement. It's redundant because the index can be computed by subtracting the RegisterInfo pointer from the beginning of the reginfo vector, and it enforces a linear order on the register descriptions (or forces the usage of complicated renumbering schemes). Without that restriction, it would be a simple thing for NativeRegisterContexts to only expose the registers they really want to expose, and there wouldn't be a need for renumberings at the gdb-remote level. I've done some experimenting with that, and while it required changes in a lot if places, the changes were not too complicated, and I did not run into any fundamental problems. Most of the time, the changes actually ended up simplifying the code.

So I agree with your thoughts on dropping the "lldb regnum == reginfo index" and have already started moving in that direction wit D80105. I plan to come up with a feature based register infos and access to register sets will be available through RegisterInfoInterface.
A register context can configure register infos to add features and those register sets will be exposed to the register context. The infos pointer will be pointer to a dynamically created list of register infos whcih are part on currently configured register context. This is partially being done for sve but a more complete solution is needed for future features.

Now, as for your patch set, I don't think that it should be on you to implement all of this, though I would strongly encourage it, and promise to help along. I think this last version is something that we could go with, even though it's definitely still less than ideal. But before we go through with that, let me circle back to one of the previous ideas. A bunch of revisions ago, you mentioned the possibility of inserting the sve registers before the watch/breakpoint registers in the register info vector (and the lldb enumeration).

There is this issue of feature selection like SVE available for a target but pointer authentication registers are available or other way around then this problem of jumping over one of them will again present itself. So it wont be a good idea to go for that option.

Do we know of any reason why that would not work? After playing around with this code (for far too long), I've come to believe that this should work right now. It won't solve any of the underlying problems, but it should unblock this patch set, and it would match what was e.g. done when we added the intel mpx registers (D24255). While it won't work for targets that wish to expose debug registers, I'm pretty sure we don't have any targets like that right now. Based on everything I've learned so far, I believe that change should only impact servers who actually depend on the the registers embedded in the source code (and not on lldb client communicating with them). Right now, only lldb-server depends on these, and it does not expose the debug registers...

After looking at RegisterContextWindows_arm64 and Darwin_arm64 it seems both use RegisterInfos_arm64.h directly and do not make use of RegisterInfosPOSIX_arm64 class. I am planning to extend register infos interface to have register set manipulation methods and then expose a dynamic register infos list with no dependence on LLDB register numbers. And meanwhile we can go with current solution of SVE registers support in its current form.

omjavaid updated this revision to Diff 271534.Jun 17 2020, 5:13 PM

This patch is now independent of SVE register support and is only a requirement for the case of where remote stub utilizes xml register description and sends register nos which are not consecutively placed.

This patch ensures eRegisterKindProcessPlugin is used while referring to value_regs/invalidate_regs. This is needed because remote stubs may send target xml packets with stub specific register numbering scheme. Thus value_regs and invalidate_regs may have been populated based on a foreign register numbering scheme. We fix this by converting value_reg/invalidate_reg number to lldb register number before querying appropriate register info.

labath added a comment.Jul 8 2020, 7:29 AM

The idea seems ok (though could use a test), but I am wondering if it wouldn't be better to do this conversion during (or immediately after) parsing. If I remember the previous discussions correctly, we've established that the value/invalidate_regs fields in the RegisterInfo structs are expected to contain "lldb" register numbers, but that the natural thing to do in the target.xml is to transfer "process plugin" (i.e., the regnum field) in these fields.

If that's true, then it would be better to do the conversion during parsing, so that we don't violate any invariants that some other parts of code may be assuming.

omjavaid updated this revision to Diff 316354.Jan 13 2021, 3:12 AM

This update adds a new test case that checks for register number mismatch between lldb and gdb stub. LLDB client assigns register numbers to target xml registers in increasing order starting with regnum = 0, while gdb-remote may specify different regnum which is stored as eRegisterKindProcessPlugin. Remote side will use its register number in expedited register list, value_regs and invalidate_regnums.

For p/P packet LLDB was already using remote register numbers while this patch fixes the behavior for all above mentioned cases.

omjavaid retitled this revision from Fix process gdb-remote usage of value_regs/invalidate_regs to Use remote regnums in expedited list, value regs and invalidate regs.Jan 13 2021, 3:14 AM

Using remote numbers in the lists seems reasonable as well, but I am troubled by the "opportunistic" aspect of the implementation. This pattern

uint32_t lldb_regnum = ConvertRegisterKindToRegisterNumber(
    eRegisterKindProcessPlugin, reg);
if (lldb_regnum != LLDB_INVALID_REGNUM)
  reg = lldb_regnum;
const RegisterInfo *value_reg_info = GetRegisterInfoAtIndex(reg);

is cumbersome to write and easy to get wrong.

Why are we sometimes not using the plugin/remote number? Could we make it so that this number is used unconditionally?
Then all of this could become: const RegisterInfo *value_reg_info = GetRegisterInfo(eRegisterKindProcessPlugin, reg)

omjavaid updated this revision to Diff 317049.Jan 15 2021, 12:45 PM

@labath addresses your review comments in current diff. Any further changes or this is ok ?

There's still one "opportunistic" snippet in GDBRemoteRegisterContext.cpp, around line 404. Did you forget about that, or is there something special about that case? (if so, then what?)

There's still one "opportunistic" snippet in GDBRemoteRegisterContext.cpp, around line 404. Did you forget about that, or is there something special about that case? (if so, then what?)

Answer inline.

lldb/source/Plugins/Process/gdb-remote/GDBRemoteRegisterContext.cpp
404

Using ConvertRegisterKindToRegisterNumber because SetRegisterIsValid requires register number as input unlike other instances where we needed to get register info.

lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp
1768

Here also we used ConvertRegisterKindToRegisterNumber because PrivateSetRegisterValue requires register number as input unlike other instances where we needed to get register info.

labath added inline comments.Jan 31 2021, 3:03 AM
lldb/source/Plugins/Process/gdb-remote/GDBRemoteRegisterContext.cpp
404

That's fine -- what bothers me is the conditional setting of the reg variable below.

What I am pushing for is for consistency in the meanings of invalidate_regs numbers. If the invalidate_regs always stores the "process plugin" register kind, then calling we should never be calling`SetRegisterIsValid(reg)` if we fail to convert the plugin number to a "register number".
I would expect something like SetRegisterIsValid(ConvertRegisterKindToRegisterNumber(eRegisterKindProcessPlugin, reg), false) (with some error checking if it can really happen than ConvertRegisterKindToRegisterNumber can fail here, but I hope that's not needed);

omjavaid updated this revision to Diff 320373.Jan 31 2021, 12:45 PM
omjavaid marked an inline comment as done.

@labath I have made suggested adjustment.

labath accepted this revision.Feb 4 2021, 5:18 AM
This revision is now accepted and ready to land.Feb 4 2021, 5:18 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 8 2021, 1:12 AM