This is an archive of the discontinued LLVM Phabricator instance.

[lldb] Read register fields from target XML

Authored by DavidSpickett on Mar 8 2023, 3:44 AM.



This teaches ProcessGDBRemote to look for "flags" nodes
in the target XML that tell you what fields a register has.

It will check for various invalid inputs like:

  • Flags nodes with 0 fields in them.
  • Start or end being > the register size.
  • Fields that overlap.
  • Required properties not being present (e.g. no name).
  • Flag sets being redefined.

If anything untoward is found, we'll just drop the field or the
flag set altogether. Register fields are a "nice to have" so LLDB
shouldn't be crashing because of them, instead just log anything
we throw away. So the user can fix their XML/file a bug with their

Once that is done it will sort the fields and pass them to
the RegisterFields class I added previously.

There is no way to see these fields yet, so tests for this code
will come later when the formatting code is added.

The fields are stored in a map of unique pointers on the
ProcessGDBRemote class. It will give out raw pointers on the
assumption that the GDB process lives longer than the users
of those pointers do. Which means RegisterInfo is still a trivial struct
but we are properly destroying the fields when the GDB process ends.

We can't store the fields directly in the map because adding new
items may cause its storage to be reallocated, which would invalidate
pointers we've already given out.

Diff Detail

Event Timeline

DavidSpickett created this revision.Mar 8 2023, 3:44 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 8 2023, 3:44 AM
Herald added a subscriber: mgrang. · View Herald Transcript
DavidSpickett requested review of this revision.Mar 8 2023, 3:44 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 8 2023, 3:44 AM

Tests are added in, because they use the formatting code added there.

Use StringRef instead of std::string.

Though an empty StringRef is morally equivalent
to optional, I've stuck with optional to keep the code
looking consistent.

jasonmolenda accepted this revision.Mar 9 2023, 3:48 PM

This looks good to me.


All of the bit positions are unsigned, %u's to be proper (and further printf specifiers below. not that it will matter, of course.)

This revision is now accepted and ready to land.Mar 9 2023, 3:48 PM

Remove use of type attribute on flags. We will recognise it as valid,
but do nothing with it as it's not needed.

Correct printf for unsigned.

DavidSpickett marked an inline comment as done.Mar 10 2023, 5:29 AM

The big assumption here is that the GDB process lasts longer than the higher level debug session does. I am keeping a map of register name to unique pointer in the GDB process, then giving out raw pointers to the higher level commands. The higher level is destroyed first, then the GDB process which frees all the unique pointers. This means that the register info struct remains trivial, it wouldn't be if we stored the registerfields type directly in it.

The obvious way to try to break that is to cut the connection to the remote but even so, I think the GDB process on lldb's side still exists (albeit erroring for everything).

JDevlieghere added inline comments.Mar 20 2023, 2:07 PM

Do you think that using a StringSwitch could improve readability?


Alternatively you can use LLDB_LOG which uses LLVM's formatv under the hood and not have to bother about format specifiers.

DavidSpickett added inline comments.Mar 21 2023, 3:21 AM

Unless I have the wrong end of the stick, StringSwitch is usually about returning a single value. Here we'd return a lambda for processing the attribute value.

That would look like:

std::function<void(llvm::StringRef)> processor =
              [&name, log](llvm::StringRef value) {
                          "ProcessGDBRemote::ParseFlags Found field node "
                          "name \"%s\"",
                name = value;
              [&start, log, max_start_bit](llvm::StringRef value) {
                unsigned parsed_start = 0;
                if (llvm::to_integer(value, parsed_start)) {
                  if (parsed_start > max_start_bit) {
                              "ProcessGDBRemote::ParseFlags Invalid "
                              "start %u in field node, "
                              "cannot be > %u",
                              parsed_start, max_start_bit);
                  } else
                    start = parsed_start;
                } else {
                            "ProcessGDBRemote::ParseFlags Invalid start "
                            "\"%s\" in field node",


Which seems about the same to me assuming one is comfortable reading lambdas, but tell me what you think. The extra indents seem to help more than the StringSwitch itself.

Use LOG instead of LOGF.

DavidSpickett marked an inline comment as done.Mar 21 2023, 4:02 AM
JDevlieghere accepted this revision.Mar 21 2023, 9:32 AM
JDevlieghere added inline comments.

Fair enough, and it's probably not worth creating an enum for this (which is what I originally had in mind). I think the code is fine as it is.

This revision was automatically updated to reflect the committed changes.
MaskRay added inline comments.

Is there a library layering violation?

I assume that lldb/Target/*.h files depend on lldb/include/lldb/lldb-private-types.h, so lldb/include/lldb/lldb-private-types.h cannot include lldb/Target/*.h files...

DavidSpickett marked an inline comment as done.Apr 14 2023, 9:17 AM
DavidSpickett added inline comments.

It is and thanks for the fix!