This is an archive of the discontinued LLVM Phabricator instance.

[lldb] Add basic support for Rust enums in TypeSystemClang
ClosedPublic

Authored by VladimirMakaev on Apr 25 2023, 4:05 PM.

Details

Summary

LLDB doesn't yet have a TypeSystemRust implemented however it is used to debug Rust applications. Most of the types map well enough to Clang types and there are python formatters implemented to display those types reasonably well in a debugger. We discussed this with Greg Clayton internally and this approach looks like a reasonable thing to implement and fix one of the major pain points when debugging Rust with LLDB.

However, Rust enums are completely ignored by LLDB as Clang never emits DW_TAG_variant_part inside DW_TAG_structure_type

This diff adds a parser for DW_TAG_variant_part (Rust-only) that creates a matching valid Clang declaration that represents a Rust enum. As long as there is enough information and all fields have correct offsets synthetic/summary providers can be implemented to display it correctly when debugging Rust code

For example given a Rust enum:

enum EnumOpt2 {
    A(u8),
    B(bool),
}

The following is a matching Clang declaration:

struct EnumOpt2 {
    union debugger_test::EnumOpt2$Inner {
        struct A$Variant {
            unsigned char $discr$;
            debugger_test::EnumOpt2::A value : 8;
        };
        debugger_test::EnumOpt2::debugger_test::EnumOpt2$Inner::A$Variant $variant$0;
        struct B$Variant {
            unsigned char $discr$;
            debugger_test::EnumOpt2::B value : 8;
        };
        debugger_test::EnumOpt2::debugger_test::EnumOpt2$Inner::B$Variant $variant$1;
    };
    struct A {
        unsigned char __0;
    };
    struct B {
        bool __0;
    };
    debugger_test::EnumOpt2::debugger_test::EnumOpt2$Inner inner;
}

Now this declaration contains all the necessary information to interpret the original Rust enum in a Python synthetic formatter and display back on UI as Rust enum.

Diff Detail

Event Timeline

VladimirMakaev created this revision.Apr 25 2023, 4:05 PM
Herald added a project: Restricted Project. · View Herald Transcript
VladimirMakaev requested review of this revision.Apr 25 2023, 4:05 PM

Overall this looks fine as a way to help rust users see rust enums without having to implement a new TypeSystem for rust. I would just ask that we check the language of the CU before calling the ParseVariantPart, so that this would only get enabled for Rust.

lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
3184

I would add a check here for the Rust language. We might also want to rename "ParseVariantPart" to "ParseRustVariant" to indicate this is specific to the Rust language.

clayborg added inline comments.Jul 11 2023, 3:33 PM
lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
3184

Something like:

if (die.GetCU()->GetDWARFLanguageType() == eLanguageTypeRust) 
  ParseRustVariant(die, parent_die, class_clang_type, default_accessibility, layout_info);
break;
lldb/test/API/lang/rust/enum-structs/TestEnumStructsGenerated.py
14 ↗(On Diff #516962)

the main.yaml below is empty? Will this test work?

lldb/test/API/lang/rust/enum-structs/main.rs
2–5

Is this file used at all? I don't see any references to it. I am guessing that the main.yaml is expected to reference this file?

I'm curious to know why you don't try and implement some kind of TypeSystemRust? I assume it's going to be a lengthy process, but eventually you (or somebody else) is going to want to implement a TypeSystemRust in some form and they'll have to revisit all of rust-specific things we've tacked onto various parts of LLDB (e.g. this functionality with DWARFASTParserClang). I suppose the question I'm trying to answer is "How much do we add to LLDB before actually adding proper rust support?"

I was also curious about the test. I'm not sure how the test is supposed to work here, is the rust source file actually built or is that just to show how the yaml file was generated? There's no rust support in our testing infrastructure AFAICT so I assume this just makes sure that something already generated doesn't get broken in the future.

lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
2526–2527

Store variant_name as a ConstString instead of creating one each time. Creating a ConstString (even if the string is in the StringPool) has a non-trivial creation cost.

2530

return !discr_value;

has_value does the same thing as operator bool for std::optional.

I'm curious to know why you don't try and implement some kind of TypeSystemRust? I assume it's going to be a lengthy process, but eventually you (or somebody else) is going to want to implement a TypeSystemRust in some form and they'll have to revisit all of rust-specific things we've tacked onto various parts of LLDB (e.g. this functionality with DWARFASTParserClang). I suppose the question I'm trying to answer is "How much do we add to LLDB before actually adding proper rust support?"

Yes this is a work around until a proper TypeSystemRust is created. It works for most things now and this will help enums work using TypeSystemClang.

I was also curious about the test. I'm not sure how the test is supposed to work here, is the rust source file actually built or is that just to show how the yaml file was generated? There's no rust support in our testing infrastructure AFAICT so I assume this just makes sure that something already generated doesn't get broken in the future.

The yaml generates the binary so no rust compiler or anything is needed for the test. I suggested to use a yaml file that has global variables with all of the enum types so we can test the SBValue system out to ensure it is working as expected.

I'm curious to know why you don't try and implement some kind of TypeSystemRust? I assume it's going to be a lengthy process, but eventually you (or somebody else) is going to want to implement a TypeSystemRust in some form and they'll have to revisit all of rust-specific things we've tacked onto various parts of LLDB (e.g. this functionality with DWARFASTParserClang). I suppose the question I'm trying to answer is "How much do we add to LLDB before actually adding proper rust support?"

I think there was an attempt in the past to build TypeSystemRust. Rust Project had a fork of LLDB with this implemented by Tom Tromey and CodeLLDB maintainer(vadimcn) has one. Both of them were not able to merge that into LLDB codebase. A relevant link to a discussion https://rust-lang.zulipchat.com/#narrow/stream/317568-t-compiler.2Fwg-debugging/topic/upstreaming.20rust.20support.20into.20lldb

Apart from just implementing type system itself (which is much bigger scope than this change) there are other non-trivial issues:

  1. There is no "compiler-as-a-service" in Rust so getting expressions to work is non-trivial. An interpreter of some sort needs to be built with subset of Rust support
  2. Infra changes (DEVOPS) changes are required to start building Rust inferiors to run tests on CI
  3. LLVM / rustc cross repository logistics. Rustc probably needs to live in LLVM to make this work well.

Given the fact that the majority of the things "just work" when debugging Rust with LLDB (using Clang declarations effectively as of today) I kinda believe this is not a bad solution to try to do a bit of extra care for Rust types in TypeSystemClang and the debugging experience will be great for Rust users. As long as LLDB emits enough information a conversion can be made in synthetic provider to display it nicely in debugger. Tests can be written to make sure it doesn't rot. This is definitely something I want to improve in this PR too.

I think if Rust Project Debugging WG decides to invest into native TypeSystemRust I think having these fixes around isn't going to slow down the progress but quite the contrary (in my opinion). And cleaning them up will also be trivial since they're specifically guarded by CU check.

I was also curious about the test. I'm not sure how the test is supposed to work here, is the rust source file actually built or is that just to show how the yaml file was generated? There's no rust support in our testing infrastructure AFAICT so I assume this just makes sure that something already generated doesn't get broken in the future.

I'm gonna work on the test more and try to make assertions against global SBValue. Greg suggested this might work. So since we don't have access to rustc in the testing setup I've generated object file from main.rs manually and converted it to yaml. This makes the test portable. (main.rs is kept for the reference)

lldb/test/API/lang/rust/enum-structs/TestEnumStructsGenerated.py
14 ↗(On Diff #516962)

It't not empty it's just big and not opened automatically

lldb/test/API/lang/rust/enum-structs/main.rs
2–5

It's not used I've just put it as a reference of the program that the object file was taken from. I used rustc to produce object file from this program

I'm curious to know why you don't try and implement some kind of TypeSystemRust? I assume it's going to be a lengthy process, but eventually you (or somebody else) is going to want to implement a TypeSystemRust in some form and they'll have to revisit all of rust-specific things we've tacked onto various parts of LLDB (e.g. this functionality with DWARFASTParserClang). I suppose the question I'm trying to answer is "How much do we add to LLDB before actually adding proper rust support?"

I think there was an attempt in the past to build TypeSystemRust. Rust Project had a fork of LLDB with this implemented by Tom Tromey and CodeLLDB maintainer(vadimcn) has one. Both of them were not able to merge that into LLDB codebase. A relevant link to a discussion https://rust-lang.zulipchat.com/#narrow/stream/317568-t-compiler.2Fwg-debugging/topic/upstreaming.20rust.20support.20into.20lldb

Apart from just implementing type system itself (which is much bigger scope than this change) there are other non-trivial issues:

  1. There is no "compiler-as-a-service" in Rust so getting expressions to work is non-trivial. An interpreter of some sort needs to be built with subset of Rust support
  2. Infra changes (DEVOPS) changes are required to start building Rust inferiors to run tests on CI
  3. LLVM / rustc cross repository logistics. Rustc probably needs to live in LLVM to make this work well.

Given the fact that the majority of the things "just work" when debugging Rust with LLDB (using Clang declarations effectively as of today) I kinda believe this is not a bad solution to try to do a bit of extra care for Rust types in TypeSystemClang and the debugging experience will be great for Rust users. As long as LLDB emits enough information a conversion can be made in synthetic provider to display it nicely in debugger. Tests can be written to make sure it doesn't rot. This is definitely something I want to improve in this PR too.

I think if Rust Project Debugging WG decides to invest into native TypeSystemRust I think having these fixes around isn't going to slow down the progress but quite the contrary (in my opinion). And cleaning them up will also be trivial since they're specifically guarded by CU check.

Thanks for giving that background, I was aware there were conversations about this somewhere but I didn't have the full context.

I was also curious about the test. I'm not sure how the test is supposed to work here, is the rust source file actually built or is that just to show how the yaml file was generated? There's no rust support in our testing infrastructure AFAICT so I assume this just makes sure that something already generated doesn't get broken in the future.

I'm gonna work on the test more and try to make assertions against global SBValue. Greg suggested this might work. So since we don't have access to rustc in the testing setup I've generated object file from main.rs manually and converted it to yaml. This makes the test portable. (main.rs is kept for the reference)

Sounds good. I added a comment inline about that.

lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
3184

+1, I'm partial to the ParseRustVariantPart name myself.

IMO, I think it would be helpful knowing that this functionality was written with rust in mind (especially since it's a clang-based codepath). That way if we need to support this in the future for something in clang (or any other language), it makes it easy to see at a glance that ParseVariantPart is intended for rust.

lldb/test/API/lang/rust/enum-structs/main.rs
2–5

In that case, could you include the exact invocation of rustc (and any other follow-up commands) in a comment at the beginning of this file? That way it can be rebuilt if the test needs adjustment for some reason.

Rebased on master (this diff was quite a bit old)
Addressed review comments

  • renamed method to ParseRustVariantPart
  • added CU check
  • added a good bunch of test covering cases I could think of using global variabes.
  • 2 tests for Option<NonNull<T>> couldn't be done with globals due to Rust compiler strictness on UBs. Was able to create SBValue based on raw data
  • addressed VariantMember::GetName()
VladimirMakaev marked 7 inline comments as done.Jul 13 2023, 7:22 PM

I've reached out to Rust wg-debugging for additional feedback (https://rust-lang.zulipchat.com/#narrow/stream/317568-t-compiler.2Fwg-debugging/topic/Feedback.20on.20DW_TAG_variant_part.20support.20in.20LLDB)

There a couple of edge cases (when using u128 enums) that not working properly so I'm going fix them and add more tests.

Additionally an issue was raised:

Another comment: Checking the language of the CU might not be reliable because of cross-language inlining. That is, if some Rust code gets inlined into a C++ CU, then the CU will probably still say it is C++ even though it might contain DW_TAG_variant_part DIEs. Just something to be aware of an handle gracefully.

What do you guys think on not gating it behind CU check, maybe there is another way? My opinion probably is that there is more risk to CLang users / potential conflict with CLang evolution than benefit of having LLDB working for Rust code in this niche scenario?

clayborg accepted this revision.Jul 14 2023, 11:30 AM

LGTM

lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
3965

Add a newline

VladimirMakaev removed reviewers: ayermolo, Michael137.

Added more tests.
Last revision had main.yaml missing.

Reviewed test cases from https://github.com/rust-lang/rust/blob/master/tests/debuginfo/msvc-pretty-enums.rs

It seems #[repr(i128)] isn't working correctly with no_std. (It looks like a compiler issue and the feature is still unstable) Updated tests and main.yaml to include stdlib

This revision is now accepted and ready to land.Jul 18 2023, 7:11 PM

nit: added missing end of line

VladimirMakaev retitled this revision from [lldb] Add basic support to Rust enums in TypeSystemClang to [lldb] Add basic support for Rust enums in TypeSystemClang.Jul 18 2023, 8:05 PM

Apart from just implementing type system itself (which is much bigger scope than this change) there are other non-trivial issues:

  1. There is no "compiler-as-a-service" in Rust so getting expressions to work is non-trivial. An interpreter of some sort needs to be built with subset of Rust support

My work also included a parser for some Rust expressions.

VladimirMakaev marked an inline comment as done.Jul 21 2023, 9:26 AM

Apart from just implementing type system itself (which is much bigger scope than this change) there are other non-trivial issues:

  1. There is no "compiler-as-a-service" in Rust so getting expressions to work is non-trivial. An interpreter of some sort needs to be built with subset of Rust support

My work also included a parser for some Rust expressions.

Yeah I reviewed your fork and I saw there was a parser. So it could potentially be used with some success. With this approach we can use Clang for expression evaluations which might be inconvenient for Rust developers but it's decent enough. Other concerns with having this tested with CI still stands. Overall I agree that having TypeSystemRust will offer a better experience and more tuning but this fix is quite cheap and should be a big improvement over current state of things. Do you know by chance if there are other areas apart from enums which are currently broken in LLDB and cannot be fixed via synthetic provider?

jryans added a subscriber: jryans.Jul 21 2023, 9:52 AM

I think there was an attempt in the past to build TypeSystemRust. Rust Project had a fork of LLDB with this implemented by Tom Tromey and CodeLLDB maintainer(vadimcn) has one. Both of them were not able to merge that into LLDB codebase. A relevant link to a discussion https://rust-lang.zulipchat.com/#narrow/stream/317568-t-compiler.2Fwg-debugging/topic/upstreaming.20rust.20support.20into.20lldb

Apart from just implementing type system itself (which is much bigger scope than this change) there are other non-trivial issues:

  1. There is no "compiler-as-a-service" in Rust so getting expressions to work is non-trivial. An interpreter of some sort needs to be built with subset of Rust support
  2. Infra changes (DEVOPS) changes are required to start building Rust inferiors to run tests on CI
  3. LLVM / rustc cross repository logistics. Rustc probably needs to live in LLVM to make this work well.

I know this is not exactly the focus of this patch, but I just wanted to reply to this portion (since you've mentioned it). While some people did try to suggest things like your points 2 and 3 in previous Rust support attempts, I would say it seems impractical to require pulling a whole language implementation into LLVM just to add debugging support. If LLDB really wants to be a many-language debugger, I would strongly urge looking for testing solutions that do not require somehow adding an full implementation of each one into the tree. After all, the whole point of debug info formats like DWARF is that the producer and consumer do not need to be tightly coupled (since there's a standard format in between).

I would argue a more practical approach would involve test cases based on whatever IR or AST might be appropriate for the feature under test, e.g. DWARF output from Rust, Rust IR or MIR, etc. These things are just data, and LLDB (and / or small helper tools) could process this data without the need for an entire language implementation.

A discussion from 2019 (https://discourse.llvm.org/t/rust-support-in-lldb-again/53210) seemed to potentially be open to such an idea, but perhaps it's worth starting a new discussion thread focused on the testing requirements for language support in LLDB which attempts to work out guidelines which allow for languages whose implementations live outside of the LLVM tree.

I don't have commit access to the repo. Since this was accepted can somebody commit this to the repo? CC @clayborg

I fixed a typo in this to get the Windows build going again: https://github.com/llvm/llvm-project/commit/f8d1209f966ccd1dd0a19f3acef0871bf8fc3c94

I assume it's a typo, not an attempt to use what seems to have been a NetBSD type at one point.

Thanks @DavidSpickett for fixing this. I did not intend to use that type. Any number of that size would work.

lldb/test/API/lang/rust/enum-structs/main.rs