This is an archive of the discontinued LLVM Phabricator instance.

[lldb] Add support for using integral const static data members in the expression evaluator
ClosedPublic

Authored by werat on Jun 9 2020, 8:10 AM.

Details

Summary

This adds support for using const static integral data members as described by C++11 [class.static.data]p3
to LLDB's expression evaluator.

So far LLDB treated these data members are normal static variables. They already work as intended when they are declared in the class definition and then defined in a namespace scope. However, if they are declared and initialised in the class definition but never defined in a namespace scope, all LLDB expressions that use them will fail to link when LLDB can't find the respective symbol for the variable.

The reason for this is that the data members which are only declared in the class are not emitted into any object file so LLDB can never resolve them. Expressions that use these variables are expected to directly use their constant value if possible. Clang can do this for us during codegen, but it requires that we add the constant value to the VarDecl we generate for these data members.

This patch implements this by:

  • parsing the constant values from the debug info and adding it to variable declarations we encounter.
  • ensuring that LLDB doesn't implicitly try to take the address of expressions that might be an lvalue that points to such a special data member.

The second change is caused by LLDB's way of storing lvalues in the expression parser. When LLDB parses an expression, it tries to keep the result around via two mechanisms:

  1. For lvalues, LLDB generates a static pointer variable and stores the address of the last expression in it: T *$__lldb_expr_result_ptr = &LastExpression
  2. For everything else, LLDB generates a static variable of the same type as the last expression and then direct initialises that variable: T $__lldb_expr_result(LastExpression)

If we try to print a special const static data member via something like expr Class::Member, then LLDB will try to take the address of this expression as it's an lvalue. This means LLDB will try to take the address of the variable which causes that Clang can't replace the use with the constant value. There isn't any good way to detect this case (as there a lot of different expressions that could yield an lvalue that points to such a data member), so this patch also changes that we only use the first way of capturing the result if the last expression does not have a type that could potentially indicate it's coming from such a special data member.

This change shouldn't break most workflows for users. The only observable side effect I could find is that the implicit persistent result variables for const int's now have their own memory address:

Before this change:

(lldb) p i
(const int) $0 = 123
(lldb) p &$0
(const int *) $1 = 0x00007ffeefbff8e8
(lldb) p &i
(const int *) $2 = 0x00007ffeefbff8e8

After this change we capture i by value so it has its own value.

(lldb) p i
(const int) $0 = 123
(lldb) p &$0
(const int *) $1 = 0x0000000100155320
(lldb) p &i
(const int *) $2 = 0x00007ffeefbff8e8

Diff Detail

Event Timeline

teemperor created this revision.Jun 9 2020, 8:10 AM
teemperor retitled this revision from [lldb] Add support for using const static data members in the expression evaluator [WIP to [lldb] Add support for using const static data members in the expression evaluator [WIP].Jun 9 2020, 8:10 AM

I will take a second look later on.

lldb/source/Plugins/ExpressionParser/Clang/ASTResultSynthesizer.cpp
203

Should we assert that this is an lvalue?

lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp
7286 ↗(On Diff #269547)

So the name says AddInitToVarDecl but we are asserting only integral or enum type ... seems like the name should be more specific.

shafik added inline comments.Jun 10 2020, 9:56 AM
lldb/source/Utility/DataExtractor.cpp
882 ↗(On Diff #269547)

Nice refactoring!

I will take a second look later on.

Just saying that this is a WIP revision (see the title) just for the sake of linking my status report here, so this isn't actually up for review yet (and that's why that DataExtractor refactoring is in here, it's just to make my CI happy when testing this patch). You're added as reviewer because of your automatic Herald rule.

teemperor retitled this revision from [lldb] Add support for using const static data members in the expression evaluator [WIP] to [lldb] Add support for using integral const static data members in the expression evaluator.
teemperor edited the summary of this revision. (Show Details)
teemperor added reviewers: JDevlieghere, labath.
teemperor set the repository for this revision to rLLDB LLDB.
teemperor marked 2 inline comments as done.Jun 15 2020, 10:43 PM
abidh removed a subscriber: abidh.Aug 11 2020, 5:05 AM
jarin accepted this revision.Aug 11 2020, 7:01 AM

This looks great! Thank you for the extensive test suite.

lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
2344

Out of curiosity, why can't you just use the APInt as 'result' here?

This revision is now accepted and ready to land.Aug 11 2020, 7:01 AM
teemperor added inline comments.Aug 11 2020, 8:33 AM
lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
2344

You mean why use APSInt instead of APInt here? That's just because of the required_bits check below (getMinSignedBits is only in APSInt). It's not a complex function, but I rather make it an APSInt than document the MinSignedBits algorithm here.

And the required_bits check is only there because the DWARFFormValue's uint64_t value wasn't checked so far with the specific type in mind (like, having 0xabc for an unsigned char would otherwise just be silently truncated).

lldb/unittests/SymbolFile/DWARF/DWARFASTParserClangTests.cpp
20

Note to myself: This change and the one below were actually supposed to be its own NFC commit...

jarin added inline comments.Aug 11 2020, 9:50 AM
lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
2344
teemperor planned changes to this revision.Aug 11 2020, 10:39 AM
teemperor added inline comments.
lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
2344

Ah that's a good catch, thanks! I assumed that was defined in APSInt. Let me make that just APInt then.

Very nice!

lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
2331

In this particular case = 64 might be more readable.

  • Revert unintended NFC cleanup in DWARFASTParserTest
  • APSInt -> APInt (Thanks Jaroslav!)
This revision is now accepted and ready to land.Aug 12 2020, 12:44 AM
teemperor updated this revision to Diff 284998.EditedAug 12 2020, 12:58 AM
  • Simplify max size to just '64' instead of sizeof(uint64_t) * 8 (thanks Adrian!)
  • Update comment to mention APInt instead of APSInt

I am curious about constexpr static as well here.

lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
2331

constexpr

2348

there is no APSInt anymore? right?

lldb/unittests/SymbolFile/DWARF/DWARFASTParserClangTests.cpp
172

Why not use int32_t although this should never affect us int can be 64 bit, so int32_t is a better reflection of your intent.

172

You can also use constexpr here since numeric_limits methods are all constexpr as well.

219

std::numeric_limits<uint64_t>::max(); or am I missing the intent here?

jarin added a comment.Aug 13 2020, 4:00 AM

I am curious about constexpr static as well here.

+1

It would be great to write tests for the static constexpr fields, too.

jankratochvil requested changes to this revision.Feb 6 2021, 12:19 AM
jankratochvil added a subscriber: jankratochvil.

It does crash with testcase from D92643:

echo 'struct Vars { inline static double inline_static = 1.5; } v; int main() {}'|./bin/clang -Wall -g -x c++ -;./bin/lldb -b ./a.out -o 'p Vars::inline_static'
lldb: lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp:2346: llvm::Expected<llvm::APInt> DWARFASTParserClang::ExtractIntFromFormValue(const lldb_private::CompilerType &, const DWARFFormValue &) const: Assertion `qt->isIntegralOrEnumerationType()' failed.
 #9 0x00007fd1b28454d1 DWARFASTParserClang::ExtractIntFromFormValue(lldb_private::CompilerType const&, DWARFFormValue const&) const lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp:2347:54

The patch also needs a trivial rebase.

This revision now requires changes to proceed.Feb 6 2021, 12:19 AM
werat added a subscriber: werat.Feb 13 2022, 9:11 AM

@labath @jgorbe - looks like this might be useful to pick up/push forward.

Herald added a project: Restricted Project. · View Herald TranscriptJun 15 2022, 11:02 AM
werat commandeered this revision.Jul 4 2022, 11:09 AM
werat added a reviewer: teemperor.

Taking over the patch (as discussed with @teemperor)

werat updated this revision to Diff 442126.Jul 4 2022, 11:18 AM
  • Rebase on latest HEAD
  • Address review comments
  • Add a check for field being integer/enum
Herald added a project: Restricted Project. · View Herald TranscriptJul 4 2022, 11:18 AM
werat marked 5 inline comments as done.Jul 4 2022, 11:19 AM
werat marked an inline comment as done.Jul 4 2022, 11:23 AM

Reviving the patch and taking over.

@jankratochvil @shafik @labath can you please take another look? Thanks!

lldb/unittests/SymbolFile/DWARF/DWARFASTParserClangTests.cpp
172

The tests verify whether the value fits into the type, so the value itself needs to be large to fit things that can't be represented in int32

For example, this will pass if int_min is int32_t:

EXPECT_THAT_EXPECTED(ExtractS(ast.IntTy, int_min - 2), llvm::Failed());
labath added a comment.Jul 6 2022, 4:23 AM

My only complaint is about the restriction in CanTakeAddressOfLValue. It seems arbitrary. There's anything in C++ or DWARF that would prevent us from having constant-valued variables of class types. It's just that clang/llvm does not know how to emit the DW_AT_const_value for such variables (yet). Gcc does not seem to have that problem (it happily emits a block form for those). So, it feels to me this is just kicking the can down the road, and we will sooner or later need to come up with a better story for this.

I guess the condition we really want to express here is "does this expression refer to a constexpr variable (ideally one without a location)"? And the problem is that clang does not give us the means to detect that?

Is that really the case? Would it maybe be possible to use some c++ magic to get clang to do that for us. Write something like if constexpr (__builtin_constant_p(user_expression)) do_something_rvalue_like(); else assume_regular_lvalue(); ?

Michael137 added inline comments.Jul 6 2022, 4:32 AM
lldb/source/Plugins/ExpressionParser/Clang/ASTResultSynthesizer.cpp
248

Minor: Should we update this documentation?

lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
2575

Should this be:

if (!attrs.const_value_form || !ct.IsIntegerOrEnumerationType(unused))

?

2582

Can v be nullptr here?

werat marked 6 inline comments as done.Jul 7 2022, 1:39 PM

I guess the condition we really want to express here is "does this expression refer to a constexpr variable (ideally one without a location)"? And the problem is that clang does not give us the means to detect that?

Is that really the case? Would it maybe be possible to use some c++ magic to get clang to do that for us. Write something like if constexpr (__builtin_constant_p(user_expression)) do_something_rvalue_like(); else assume_regular_lvalue(); ?

I think you're right here, but I don't know a better way to express this. My clangfu is not good enough for this yet :)

As I understand it, we can't express it purely in the expression (via __builtin_constant_p or something), because we need to know the answer before executing the expression (it changes the way we do materialization/dematerialization).

Could this be something to improve in the future?

lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
2575

Whoops, yes, thanks!

2582

Honestly, not sure. Theoretically AddVariableToRecordType can return nullptr, but I've never seen such case. Maybe it only happens when the debug info is malformed? Would that be caught earlier at some stage?
I can add a safe guard here with a log message.

werat updated this revision to Diff 443051.Jul 7 2022, 1:41 PM
werat marked 2 inline comments as done.

Address review comments

werat updated this revision to Diff 443052.Jul 7 2022, 1:42 PM
This comment was removed by werat.
werat updated this revision to Diff 443053.Jul 7 2022, 1:45 PM

Fix condition

with latest changes LGTM

I guess the condition we really want to express here is "does this expression refer to a constexpr variable (ideally one without a location)"? And the problem is that clang does not give us the means to detect that?

Is that really the case? Would it maybe be possible to use some c++ magic to get clang to do that for us. Write something like if constexpr (__builtin_constant_p(user_expression)) do_something_rvalue_like(); else assume_regular_lvalue(); ?

I think you're right here, but I don't know a better way to express this. My clangfu is not good enough for this yet :)

As I understand it, we can't express it purely in the expression (via __builtin_constant_p or something), because we need to know the answer before executing the expression (it changes the way we do materialization/dematerialization).

Well, yes, either that, or change how the dematerialization works -- prepare for both options and then choose the one that was actually used.

Could this be something to improve in the future?

Well.. maybe... I don't know. I think this is definitely useful, and I don't see much of a downside. The main difference I see is that you won't be able to "modify a constant" via expressions like p const_int, p (int&)$0 = 47, but: a) you will still be able to do that within a single expression; and b) such an operation is unlikely to produce predictable results in anything but pure C (which doesn't really have constants).

So, if noone objects to this, then I think it's fine.

werat edited reviewers, added: Michael137; removed: jarin.Jul 14 2022, 5:55 AM

So, if noone objects to this, then I think it's fine.

I guess nobody objects :)

@labath, @Michael137, can you accept/LGTM the revision?

werat added a reviewer: jarin.Jul 14 2022, 5:56 AM
Michael137 accepted this revision.Jul 14 2022, 6:21 AM
This revision is now accepted and ready to land.Jul 14 2022, 8:12 AM
Michael137 added a comment.EditedJul 20 2022, 9:46 AM

This seems to cause issues when var->getType() == const llvm::APFloatBase::roundingMode.

The following assertion triggered:

Assertion failed: (type->isIntegerType() && "Illegal type in IntegerLiteral"), function IntegerLiteral, file Expr.cpp, line 892

Reproduces with:

  1. lldb -- ./bin/lldb a.out
  2. b LookupLocalVariable
  3. step a couple of times until decl_context is declared
  4. p decl_context

It looks ike the dyn_cast to EnumType fails and thus qt.getUnqualifiedType() which we pass into IntegerLiteral::Create remains an EnumType, which breaks the invariant

Investigating further...

Michael137 added a comment.EditedJul 20 2022, 5:29 PM

This seems to cause issues when var->getType() == const llvm::APFloatBase::roundingMode.

The following assertion triggered:

Assertion failed: (type->isIntegerType() && "Illegal type in IntegerLiteral"), function IntegerLiteral, file Expr.cpp, line 892

Reproduces with:

  1. lldb -- ./bin/lldb a.out
  2. b LookupLocalVariable
  3. step a couple of times until decl_context is declared
  4. p decl_context

It looks ike the dyn_cast to EnumType fails and thus qt.getUnqualifiedType() which we pass into IntegerLiteral::Create remains an EnumType, which breaks the invariant

Investigating further...

Tried to address this in https://reviews.llvm.org/D130213