This is an archive of the discontinued LLVM Phabricator instance.

[lldb] [ABI/X86] Add pseudo-registers if missing
ClosedPublic

Authored by mgorny on Aug 27 2021, 10:08 AM.

Diff Detail

Event Timeline

mgorny created this revision.Aug 27 2021, 10:08 AM
mgorny requested review of this revision.Aug 27 2021, 10:08 AM
mgorny updated this revision to Diff 369130.Aug 27 2021, 10:10 AM

Remove unnecessary include.

mgorny updated this revision to Diff 369421.Aug 30 2021, 5:20 AM

Add writing tests.

mgorny updated this revision to Diff 369422.Aug 30 2021, 5:28 AM

Add tests for reading back (should verify invalidation).

mgorny updated this revision to Diff 370754.Sep 4 2021, 12:21 PM

Fix filename in heading.

mgorny updated this revision to Diff 370786.Sep 5 2021, 1:45 AM

Restore the sp test.

mgorny updated this revision to Diff 372261.Sep 13 2021, 8:26 AM

Rename the file to _x86.cpp to future-proof it. Replace`assert()s` with explicit skipping, we don't really want to crash on unexpected data in target.xml.

This is an interesting problem. I don't yet know how, but as I alluded to in the other patch, my instinct would be to find a way to put this into the ABI class, in order to keep the "fill in bits not provided by debug server" code in a single place.

Another question: should we aim to remove the 'redundant afterwards' bits from lldb-server, and fill them in from client side entirely? I don't think our process plugins really need knowledge of EAX etc., and it would reduce the duplication a fair bit.

Also I'm not sure if I like my code here, two questions inline.

lldb/source/Plugins/Process/gdb-remote/GDBRemoteRegisterContext_x86.cpp
37–54 ↗(On Diff #372261)

Also here I'm wondering if it wouldn't be better to just have an array of {"ax", "bx", ...} and construct all other names from that base, plus r8..r15 via numeric for loop.

66–75 ↗(On Diff #372261)

I've been somewhat basing this thing on how our process plugins do registers but I'm wondering if it wouldn't be better to just do a for (int x = 0; i < 8; i++) loop and dynamically generate st${x} and mm${x} names.

Another question: should we aim to remove the 'redundant afterwards' bits from lldb-server, and fill them in from client side entirely? I don't think our process plugins really need knowledge of EAX etc., and it would reduce the duplication a fair bit.

I think that would be great. Besides the deduplication aspect, this will also decrease the chances of some gdb compatibility bug creeping in. I might consider waiting for one release, just in case someone wants to connect a newer lldb-server to an old lldb.

/And/, I would actually like to go further than that, and remove these register definitions from core file (&minidump, etc) plugins too. Since eax/ax/ah/al can be extracted from rax in a generic way, I don't see a reason for each plugin to have to redefine those (or forget doing like, like is the case with the windows plugin). This is the point where this logic stops being gdb-remote-specific. There are no compatibility issues here, though the process itself might be trickier, depending on how much the plugins are assuming a particular register order, etc.

@labath, could you look at my inline comments before I start moving stuff? ;-)

labath added inline comments.Sep 16 2021, 12:35 AM
lldb/source/Plugins/Process/gdb-remote/GDBRemoteRegisterContext_x86.cpp
37–54 ↗(On Diff #372261)

I don't have a strong opinion on this. The names are sufficiently irregular that the other approach will (probably) be messy as well.

I'm more bothered by the fact that you're writing into these global variables. IIRC, the DynamicRegisterInfo class has some sort of a buffer to store the array data. It would be good to reuse that (or do something similar).

66–75 ↗(On Diff #372261)

The loop idea seems reasonable here.

mgorny updated this revision to Diff 378533.Oct 10 2021, 1:53 PM
mgorny retitled this revision from [lldb] [gdb-remote] Add x86_64 pseudo-registers when using gdbserver to [lldb] [ABI/X86] Add pseudo-registers if missing.
mgorny edited the summary of this revision. (Show Details)

Rewritten for the new augmentation API based on ABI plugins.

We don't really want to support adding an arbitrary subset of sub-registers, do we? I am thinking if this could be made simpler if it was all-or-nothing. Like, during the initial pass you could check whether the list contains _any_ subregister, and abort if it does. Then the subsequent pass could assume that all subregisters need to be added...

lldb/source/Plugins/ABI/X86/ABIX86.h
23

I think you should be able to get this from the ArchSpec of the process.

We don't really want to support adding an arbitrary subset of sub-registers, do we? I am thinking if this could be made simpler if it was all-or-nothing. Like, during the initial pass you could check whether the list contains _any_ subregister, and abort if it does. Then the subsequent pass could assume that all subregisters need to be added...

Sure, that makes sense. Should I update AArch64 too?

I don't think this is that critical over there, but I think you could do that.

mgorny updated this revision to Diff 379652.Oct 14 2021, 3:56 AM

Use ArchSpec instead of passing hardcoded 64bit. Abort the whole process if at least one subreg is present, eliminating all the have* bools.

mgorny updated this revision to Diff 379653.Oct 14 2021, 3:59 AM

Move LLDB_INVALID_REGNUM check inside addPartialRegister().

mgorny updated this revision to Diff 379697.Oct 14 2021, 6:29 AM

Halfway through to a new approach for getting regnames.

labath added inline comments.Oct 14 2021, 6:29 AM
lldb/source/Plugins/ABI/X86/ABIX86.cpp
89

I'd just do is64bit = arch.GetAddressByteSize() == 8

122–188

What if we introduced helper functions like:

// returns 0 for [re]ax, 1 for [re]bx, ...
Optional<unsigned> GetGPRNumber(llvm::StringRef name) { 
  // either a lookup table or the fancy name matching
}
GPRReg MakeGPRReg(size_t base_index, unsigned gpr_number) {
  // I'd probably implement this via lookup table(s)
}

.. or something similar?

The general idea is to split this function into smaller chunks and also try to abstract away the funky register names as much as possible.

209–211

llvm::is_contained(haystack, needle)

mgorny updated this revision to Diff 379732.Oct 14 2021, 8:27 AM
mgorny marked 3 inline comments as done.

Next iteration. I will try to simplify even more.

mgorny updated this revision to Diff 379738.Oct 14 2021, 8:57 AM

Final round of refactoring. The code should be simpler and more readable now.

I don't have a strong opinion wrt generating names in a loop vs hardcoding the list (possibly using a macro). Please decide how I should proceed ;-).

mgorny updated this revision to Diff 379771.Oct 14 2021, 10:50 AM
mgorny marked an inline comment as done.

Use inline lists instead of for loops for registers. Use llvm::SmallDense* types with StringRefs.

mgorny updated this revision to Diff 379921.Oct 14 2021, 11:10 PM

Use ConstString::GetStringRef() directly instead of converting through const char *. Not that it has any practical difference right now…

labath accepted this revision.Oct 15 2021, 12:33 AM

I think we've managed to come to something that looks mostly reasonable.

lldb/source/Plugins/ABI/X86/ABIX86.cpp
51

const RegisterMap& -- it's expensive to copy

59

drop const & -- it's cheap to copy :)

92–94

how about you reverse this, so you set gpr_base_size first, and then use that to initialize is64bit?

lldb/test/API/functionalities/gdb_remote_client/TestGDBServerTargetXML.py
327

Maybe also do a register read --all and check that all (sub)registers are present and have the expected values (you can try using self.filecheck for that).

This revision is now accepted and ready to land.Oct 15 2021, 12:33 AM
mgorny marked 4 inline comments as done.Oct 15 2021, 2:49 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptOct 15 2021, 3:55 AM
Herald added a subscriber: thopre. · View Herald Transcript