This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Support macro evaluation on hover
ClosedPublic

Authored by zyounan on Apr 16 2023, 2:01 AM.

Details

Summary

Creating a SelectionTree at the location where macro expands allows
us to obtain the associated expression, which might then be used to
evaluate compile-time values if possible.

Closes clangd/clangd#1595.

Diff Detail

Event Timeline

zyounan created this revision.Apr 16 2023, 2:01 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 16 2023, 2:01 AM
zyounan updated this revision to Diff 513979.EditedApr 16 2023, 2:23 AM

Fix tests on Windows

zyounan updated this revision to Diff 514019.Apr 16 2023, 8:33 AM

unsigned long isn't 8 bytes on windows

zyounan updated this revision to Diff 514028.Apr 16 2023, 9:00 AM

Do not validate type of builtin operators

zyounan published this revision for review.Apr 16 2023, 6:46 PM
zyounan added reviewers: nridge, sammccall, kadircet.

Thanks for the patch and the thorough testcase!

I wonder if we should make this a bit more strict: if the macro expansion itself is an expression, show the value of that expression, but not e.g. an enclosing expression. Otherwise, the Definition field of the hover (which shows the tokens of the macro expansion) and the Value and Type fields (which show the value and type of a larger expression) could be out of sync and confusing.

A (somewhat contrived) example would be:

#define PLUS_TWO + 2
int x = 40 PLUS_TW^O;  // huh, the value of "+ 2" is "42"?

Of your existing testcases, I think the only one this stricter rule would break is #6 (define_lambda_begin), and I think that's fine.

clang-tools-extra/clangd/Hover.cpp
778

Why createEach() when the non-macro case uses createRight()?

Sorry for responding late and thank you for the good catch!
I've updated the code and please feel free to leave comments. :)

clang-tools-extra/clangd/Hover.cpp
778

TBH, I didn't notice that we're using createRight() before and I'm not quite sure if there exists such case, that yields nothing with the right selection tree (which is exactly the 1st tree produced by createEach()) but could be evaluated with falling back to the left selection tree.
I'll change it to createEach anyway.

Oops, an inline comment disappears accidentally.

#define PLUS_TWO + 2
int x = 40 PLUS_TW^O;  // huh, the value of "+ 2" is "42"?

The reason is we're getting a SelectionNode with a wider range. Here is the layout of the tree:

txt
Built selection tree
 TranslationUnitDecl 
   FunctionDecl void check()
     CompoundStmt { …
      .BinaryOperator 40 + 2
        *IntegerLiteral 2

The dot prior to AST node kind indicates partial selection and the asterisk indicates complete selection according to SelectionTree::print.

By invoking Tree.commonAncestor() it would stop at BinaryOperator despite the fact that we're expecting IntegerLiteral, that represents the number being accordance with the macro expansion.

In order to address such case, I think we could go down the SelectionTree a little further and start evaluating at the node with complete selection.

(This looks really cool, nice work! only a driveby comment from me I'm afraid)

The dot prior to AST node kind indicates partial selection and the asterisk indicates complete selection according to SelectionTree::print.

By invoking Tree.commonAncestor() it would stop at BinaryOperator despite the fact that we're expecting IntegerLiteral, that represents the number being accordance with the macro expansion.

In order to address such case, I think we could go down the SelectionTree a little further and start evaluating at the node with complete selection.

I think rather if the common ancestor isn't completely selected, evaluation shouldn't happen. (I think this is what Nathan was suggesting by "strict"?)

(Here, "+ 2" doesn't have a value, the + is binary rather than unary)

The partial/fully selected concept in SelectionTree was originally intended for exactly this kind of thing, so I'm happy to see it get some use :-)

zyounan added a comment.EditedApr 28 2023, 8:56 PM

Haha. My bad for misunderstanding.

I think rather if the common ancestor isn't completely selected, evaluation shouldn't happen.

I don't think it is a feasible approach to simply rule out partial selection. Please consider this snippet:

#define SizeOf sizeof
void check() {
  Siz^eOf(int);
}

And associative selection tree:

Built selection tree
 TranslationUnitDecl 
   FunctionDecl void check()
     CompoundStmt { …
      .UnaryExprOrTypeTraitExpr sizeof(int)  # Partial selected. Exclude it?

Obviously we're expecting evaluation on such macro (which is just what the original issue addresses). And it's worth noting that if we define the macro a whole expression e.g., sizeof(int), rather than a token sizeof, the common ancestor will be seen as completely selected.

OTOH, it doesn't seem so trivial to me to tell if an expression or token is appropriate (or to say, full enough?) to evaluate. Take Nathan's case for example. The macros #define PLUS_TWO +2 and #define TWO 2 are both transformed into the same AST. The mapped selection tree is,

.BinaryOperator 40 + 2
   *IntegerLiteral 2  # For either '40 PLU^S_TWO' or '40 + T^WO'

...or even more extreme,

#define PLUS_TWO +2
40 + PL^US_TWO;  // it should be seen as '+2', right?

As of now I didn't come up with a better idea to split the cases. Any suggestion? :)

zyounan updated this revision to Diff 518144.Apr 29 2023, 12:58 AM

Refactor tests. Obtain type from VarDecl.

Obviously we're expecting evaluation on such macro (which is just what the original issue addresses).

That wasn't (and isn't) obvious to me - the issue didn't mention the macro in question, i assumed it was #define alignof(x) __alignof(x) or similar. Looks like both forms are in use but clang provides #define alignof __alignof as a built-in.

A couple of ideas:

  • special-case alignof
  • (generalization) allow partial selection via macros that expand to a single token
  • (generalization) allow partial selection as long as it's of a single node - the common ancestor is partially selected and no children are
zyounan updated this revision to Diff 518277.Apr 30 2023, 12:16 AM

Do not evaluate on partial selection with children.

Thank you very much for the ideas.

(generalization) allow partial selection as long as it's of a single node - the common ancestor is partially selected and no children are

This strategy looks reasonable to me and it passes my test cases. I've updated my patch again. :)

zyounan added inline comments.Apr 30 2023, 12:44 AM
clang-tools-extra/clangd/unittests/HoverTests.cpp
3924

Oops, this should be FALSE I think.

zyounan updated this revision to Diff 518280.Apr 30 2023, 1:42 AM

Exclude the macro expands to single punctulator token.

Gently ping

Thanks for the update, and thank you Sam for the suggestions!

A couple of ideas:

  • special-case alignof
  • (generalization) allow partial selection via macros that expand to a single token
  • (generalization) allow partial selection as long as it's of a single node - the common ancestor is partially selected and no children are

I was going to suggest a variant of the second option, where if a macro expands to a single token, the behaviour is the same as if the expanded token had been written in the source and that's what had been hovered over.

However, I don't have any strong concerns about the additional scenarios supported by the third option, so since this is what was implemented, let's go with it. We can always tweak this in the future if necessary.


Circling back to my original concern:

the Definition field of the hover (which shows the tokens of the macro expansion) and the Value and Type fields (which show the value and type of a larger expression) could be out of sync and confusing

the original selection being "partial" vs. "complete" is just one part of this concern. Another part is the loop in visitExprFromSelectionTree that can choose a larger enclosing expression than the original selection (even if it's "complete"), so that we can get a Definition for a smaller expression but a Value for a larger one.

However, I discovered that this sort of discrepancy can already arise without macros, where the same loop means Type is given for a smaller expression and Value for a larger one. I filed https://github.com/clangd/clangd/issues/1622 for this with an example.

Since this is a pre-existing issue, I think it's fine not to worry about it in this patch. Perhaps in a follow-up patch we can find a solution that addresses both the macro and non-macro cases.

clang-tools-extra/clangd/Hover.cpp
471

Instead of generalizing printExprValue to take an arbitrary callback, can we modify the original function to return both the Expr* and its printed value (grouped in a pair or small struct)? And then our new call site can call getType() on the Expr*.

(I realize this would be a slight change in semantics as the callback for the macro codepath currently exits the loop as soon as we have a type... but I think there's value in keeping the behaviour consistent between the macro and non-macro cases.)

707

This check for punctuators actually makes the behaviour more strict for macros than for non-macros in some cases.

For example:

#define PLUS +

constexpr int A = 2 + 3;  // hover over `+` shows `Value = 5`
constexpr int B = 2 PLUS 3;  // hover over `PLUS` does not show `Value`

I don't think this is a particularly important use case (it's a bit of a hack to get the expression's value this way, much better would be to select the entire expression you want to evaluate in the editor, but unfortunately LSP only sends a single position in textDocument/hover, not a full range...), but perhaps we could consider relaxing this in the future. (I'm thinking, if we switched to "allow partial selection via macros that expand to a single token", we could probably drop this condition.)

clang-tools-extra/clangd/unittests/HoverTests.cpp
3855

I guess with this style of test you can simplify the $3^ to ^

zyounan updated this revision to Diff 520555.May 8 2023, 7:30 PM
zyounan marked 2 inline comments as done.

Address comments

Thank you for the opinions. I've updated and please take a look.

clang-tools-extra/clangd/Hover.cpp
707

Thanks for clarifying this intention (for any bug hunter in the future).

if we switched to "allow partial selection via macros that expand to a single token", we could probably drop this condition.

I'm afraid we can't. For the array case in the test,

vector left_b^racket 3 right_b^racket;
// left_bracket -> [
// right_bracket -> ]

the associated selection tree is,

TranslationUnitDecl 
  FunctionDecl void function()
    CompoundStmt { …
     .ArraySubscriptExpr vector[3]

(function is the wrapper that facilitates writing single expression, doesn't matter here.)
It expands to one single token and is partially selected, which exactly matches the strategy. If we do allow evaluation, we'd get a value/type on [ or ]. Yes, it is a contrived and weird use case, but for now I think it's fine to keep it unevaluated to avoid issue aforementioned.

(But I'm willing to remove this restriction if you prefer a relaxed strategy.)

clang-tools-extra/clangd/unittests/HoverTests.cpp
3855

Aha, that's a typo.

nridge accepted this revision.May 9 2023, 12:35 AM

Thanks! LGTM with one final nit (the other comment is just food for future thought)

clang-tools-extra/clangd/Hover.cpp
503

nit: just inline this into its only call site as doPrintExprValue(N, Ctx).PrintedValue (and then we can call the function printExprValue instead of doPrintExprValue)

707

If we do allow evaluation, we'd get a value/type on [ or ].

Out of curiosity, I tried commenting out this condition locally, and the left_bracket test case still passed for me.

The reason is that in that test case, the ArraySubscriptExpr cannot be evaluated (we don't know what the array values are).

In a modified version of the testcase where the array values are known, but macros are not used:

int main() {
  constexpr int vector[3] = {1, 2, 3};
  vector [ 2 ];
}

hovering over the [ or ] does work (in clangd today, no patch needed).

(But maybe a hover in this situation is more confusing than useful, and it would be better if it didn't work?)

Anyways, we can discuss this further in follow-up issues like https://github.com/clangd/clangd/issues/1622, I think this is good enough for now.

This revision is now accepted and ready to land.May 9 2023, 12:35 AM
zyounan updated this revision to Diff 520638.May 9 2023, 2:49 AM

Final update

zyounan marked 2 inline comments as done.May 9 2023, 3:49 AM
This revision was automatically updated to reflect the committed changes.