This is an archive of the discontinued LLVM Phabricator instance.

[LLDB][ClangExpression] Allow expression evaluation from within C++ Lambdas
ClosedPublic

Authored by Michael137 on Jul 4 2022, 6:54 AM.

Details

Summary

This patch adds support for evaluating expressions which reference
a captured this from within the context of a C++ lambda expression.
Currently LLDB doesn't provide Clang with enough information to
determine that we're inside a lambda expression and are allowed to
access variables on a captured this; instead Clang simply fails
to parse the expression.

There are two problems to solve here:

  1. Make sure clang::Sema doesn't reject the expression due to illegal member access.
  2. Materialize all the necessary captured variables/member variables required to evaluate the expression.

To address (1), we currently import the outer structure's AST context
onto $__lldb_class, making the contextClass and the NamingClass
match, a requirement by clang::Sema::BuildPossibleImplicitMemberExpr.

To address (2), we inject all captured variables as locals into the
expression source code.

Testing

  • Added API test

Diff Detail

Event Timeline

Michael137 created this revision.Jul 4 2022, 6:54 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 4 2022, 6:54 AM
Michael137 requested review of this revision.Jul 4 2022, 6:54 AM
Herald added a project: Restricted Project. · View Herald Transcript

The is a draft for now, but feel free to comment on the approach. I'm still evaluating some other alternatives to this.

Michael137 planned changes to this revision.Jul 4 2022, 7:27 AM
Michael137 updated this revision to Diff 442102.Jul 4 2022, 8:07 AM
This comment was removed by Michael137.
Michael137 updated this revision to Diff 442104.Jul 4 2022, 8:25 AM

Correct email config

Michael137 planned changes to this revision.Jul 4 2022, 8:26 AM

Remove from review queue for now

Michael137 updated this revision to Diff 442525.Jul 6 2022, 5:09 AM
  • [LLDB][NFC] Create variable for hardcoded alignment/size constants in materializer
  • [LLDB][Expression] Allow instantiation of IR Entity from ValueObject
  • Removed redundant m_object_pointer_type
Michael137 updated this revision to Diff 442561.Jul 6 2022, 7:15 AM
  • Add AddOneVariable overload that takes ValueObjectSP
Michael137 updated this revision to Diff 442563.Jul 6 2022, 7:20 AM
  • Fixed doc
Michael137 retitled this revision from WIP: [LLDB][ClangExpression] Allow expression evaluation from within C++ Lambdas to [LLDB][ClangExpression] Allow expression evaluation from within C++ Lambdas.Jul 6 2022, 9:07 AM

I'm not sure whether I'm bothered that this patch handles the other captures for lambda's with captured "this" pretty differently from ones that don't capture "this". But the method for the ones that don't capture "this" is more straightforward, so maybe that's desirable.

You introduced a handful of API's that take StackFrame *'s but then their callers end up having StackFrameSP's that they have to call "get" on to pass to your new API's. That doesn't seem desirable.

Other inline comments...

lldb/source/Expression/Materializer.cpp
31

This seems weird to me. You didn't do it, you're just replacing hard-coded 8's later in the code. But this seems like something we should be getting from the target? You don't need to solve this for this patch since it isn't intrinsic to it, but at least leave a FIXME...

808

Naively it seems like it would be a bad idea to call GetValueObject twice. We don't want independent ValueObjectVariable shared pointers floating around for the same Entity. Should this maybe do an if (!m_variable_sp) first?

850

Is this right? For instance, in the case where the fake "this" Variable in the lambda expression has an invalid expression, all the ValueObjects you try to make (the "real captured this" as well as any other captures that were hanging off the fake "this" would have invalid location expressions.

lldb/source/Expression/UserExpression.cpp
101–102

Can we think of a better name for this? Having two instances of Object with different meanings in the same name is confusing? Even GetObjectPointerValueObject would be clearer.

lldb/source/Plugins/ExpressionParser/Clang/ClangExpressionDeclMap.cpp
70

This comment is confusing. Partly because you refer to an "unnamed structure" which you then look up by name "this". I think it's clearer to say that clang makes an artificial class whose members are the captures, and makes the lambda look like a method of that class. Then the captured "this" is a special case of all the captures.

This method also ONLY returns the fake this if the real "this" was captured, which you should say.

1690

Is there enough advantage to move-ing the incoming valobj as opposed to just copying the shared pointer? It's a little weird to have API's that consume one of their incoming arguments. If that really is worth doing, you should note that you've done that.

lldb/source/Plugins/ExpressionParser/Clang/ClangExpressionSourceCode.cpp
229

It's worth noting that you handle lambda's that capture "this" and lambda's that don't capture "this" differently. In the former case, we promote all the captures to local variables and ignore the fake this. In the latter case (because GetLambdaValueObject only returns a valid ValueObjectSP if it has a child called "this"), we keep finding the values by implicit lookup in the fake this instead.

I don't think that a problem, no need to use the more complex method just for consistency, but you should note that somewhere.

lldb/source/Plugins/ExpressionParser/Clang/ClangUserExpression.h
202

Why did you make this take a StackFrame *? It seems to force some other functions to change from StackFrameSP to StackFrame * but the only place it gets called it has a StackFrameSP, and explicitly calls get to make it a pointer. That seems awkward.

Michael137 added inline comments.Jul 7 2022, 7:45 AM
lldb/source/Expression/Materializer.cpp
808

m_variable_sp is used here to create a new ValueObjectVariable. It's a precondition that m_variable_sp != nullptr (will add that to the function documentation). I could add a m_value_object_var member that we set if it hasn't been set before.

850

If the location expression is invalid for the fake "this", would we ever be able to get ValueObject's out of it? By the time these Entity objects get instantiated we either managed to get ValueObject's from "this" or if we didn't, then we wouldn't have added them to the $__lldb_local_vars namespace

lldb/source/Plugins/ExpressionParser/Clang/ClangExpressionDeclMap.cpp
1690

Not particularly, don't imagine there's a measurable difference. A copy isn't necessary so I was trying to avoid it. I guess a clearer way would be to just not pass the shared_ptr around, and instead a ValueObject const&

lldb/source/Plugins/ExpressionParser/Clang/ClangExpressionSourceCode.cpp
229

Good point, will add a comment

lldb/source/Plugins/ExpressionParser/Clang/ClangUserExpression.h
202

Was simply trying to avoid a shared_ptr copy where it wasn't necessary. Arguably the GetObjectPointer API should just take a StackFrame const&

Michael137 marked 5 inline comments as done.Jul 7 2022, 8:48 AM
Michael137 updated this revision to Diff 442954.Jul 7 2022, 9:21 AM
  • Add more documentation
  • Moved GetLambdaValueObject into common utility header
  • Added defensive check to EntityVariable::GetValueObject
  • Move helper into new namespace
aprantl accepted this revision.Jul 11 2022, 1:13 AM

Nice work! I only had superficial comments, maybe let's also wait for @jingham to accept the patch.

lldb/include/lldb/Expression/UserExpression.h
283

Nit: The LLVM style wants doxygen comments in the .h file, would be nice to add one here.

lldb/source/Expression/Materializer.cpp
31

Yeah this is odd, maybe we can clean this up in a follow-up commit.

436

FWIW, the refactoring of EntityVariable could have been a separate preparatory NFC patch, then the patch that adds the lambda functionality would be shorter. It's fine either way, but it's usually easier to review two simple comments instead of one complex one :-)

798

Maybe add a doxygen comment explaining when this subclass is used as opposed to EntityValueObject?

834

Micro-nit: . at the end of sentence.

lldb/source/Plugins/ExpressionParser/Clang/ClangExpressionDeclMap.cpp
835

Nice!

1105

Maybe comment that this is the lambda/this-capture path?

1669

Nit:

LLVM style would be

ClangExpressionVariable::ParserVars *parser_vars =
          AddExpressionVariable(context, pt, ut, valobj);
if (!parser_vars)
  return;
lldb/source/Plugins/ExpressionParser/Clang/ClangExpressionDeclMap.h
593

Doxygen comment (unless this is is an override function (but it's not marked as such))?

lldb/source/Plugins/ExpressionParser/Clang/ClangExpressionSourceCode.cpp
222

at some point we should add a children() method that returns a range...

306

not your code, but this one has an iterator :-)

lldb/source/Plugins/ExpressionParser/Clang/ClangExpressionUtil.cpp
2

the -*- C++ -*- marker only makes sense for .h files, where it's ambiguous whether it's a C or a C++ header.

lldb/source/Plugins/ExpressionParser/Clang/ClangExpressionVariable.h
132

Here a /// comment on top of the declaration might be more readable. (Maybe fix this for the entire class in a follow-up commit)

This revision is now accepted and ready to land.Jul 11 2022, 1:13 AM
Michael137 marked 9 inline comments as done.
  • Address stylistic/documentation comments
lldb/source/Expression/Materializer.cpp
436

True
I added it as a separate commit to this revision. Wasn't sure whether a separate patch was preferred over a separate commit in the same Phabricator revision.

  • LocationExpressionIsValid returns false for invalid ValueObjects
  • Add asserts to EntityValueObject overrides
  • Add docs for AddValueObject
Michael137 marked an inline comment as done.Jul 12 2022, 1:46 AM
Michael137 added inline comments.
lldb/source/Expression/Materializer.cpp
850

Added a check for ValueObject's error status here.

Manual testing shows we'd catch lack of variable location earlier in materialisation

jingham accepted this revision.Jul 13 2022, 10:00 AM

One trivial naming tweak, otherwise this looks good.

lldb/source/Plugins/ExpressionParser/Clang/ClangExpressionUtil.cpp
21

lldb generally uses "this_val_sp" form for local variables. Camel case is for method names, initial lower camel case for llvm method names.

Michael137 marked an inline comment as done.
  • Address stylistic comments
Michael137 marked 2 inline comments as done.Jul 13 2022, 4:52 PM

After playing around with this some more I found an edge case with conditional breakpoints (in fact any place where we reuse an LLVMUserExpression). Modifying lldb/test/API/functionalities/breakpoint/two_hits_one_actual such that the helper method is inside a lambda like so:

struct Foo {
  void usleep_helper(int usec) {
    [this, &usec] {
        // Break here in the helper
        std::this_thread::sleep_for(std::chrono::duration<unsigned int, std::milli>(usec));
    }();
  }
};

void *background_thread(void *arg) {
    (void) arg;
    Foo f;
    for (;;) {
        f.usleep_helper(2);
    }
}

int main(void) {
  std::thread main_thread(background_thread, nullptr);
  Foo f;
  for (;;) {
    f.usleep_helper(1);
  }
}

Then setting a breakpoint twice, one for usec == 1 and usec == 100, we would end up hitting the breakpoint even if usec == 2 because conditional breakpoints re-use Materializer (and thus Entity objects). Since EntityValueObject::GetValueObject simply returns the ValueObject it was instantiated with, the usec == 1 condition always evaluates to true. Have a fix for this but verifying whether that really is the best approach.

  • Add test for conditional breakpoints on lambda captures
  • Add ValueObjectProvider so materializer doesn't use incorrect ValueObject when re-using UserExpressions
  • Remove now redundant m_lldb_value_object
  • Fix assertion
  • Remove redundant moves
aprantl accepted this revision.Jul 21 2022, 6:08 AM
jingham accepted this revision.Jul 21 2022, 4:09 PM

This looks good to me. I had a trivial comment about comments. You added a test for the condition in lambda as well, that's great!

lldb/include/lldb/Expression/Materializer.h
88

We seem to put empty /// lines between each of the parameters in most other places. I think that makes them marginally easier to read, and it's what we do in most other places...

Michael137 marked an inline comment as done.Jul 21 2022, 11:10 PM
  • Doxgyen format improvements
Michael137 closed this revision.Jul 27 2022, 8:19 AM

Missed the differential link in the commits.

This patch has been merged to main in following commits:

8184b252cdab2fbe44f766d6de28b29ebe4c8753
fcf4e252f4d992cade4bdfe5aed10ff96fa40202
317c8bf84d185c6b4a51a415c0deb7f8af661cb6