This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Support extraction of binary "subexpressions" like a + [[b + c]].
ClosedPublic

Authored by sammccall on Jul 23 2019, 4:32 AM.

Details

Summary

These aren't formally subexpressions in C++, in this case + is left-associative.
However informally +, *, etc are usually (mathematically) associative and users
consider these subexpressions.

We detect these and in simple cases support extracting the partial expression.
As well as builtin associative operators, we assume that overloads of them
are associative and support those too.

Diff Detail

Repository
rL LLVM

Event Timeline

sammccall created this revision.Jul 23 2019, 4:32 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 23 2019, 4:32 AM
SureYeaah added inline comments.Jul 24 2019, 5:30 AM
clang-tools-extra/clangd/refactor/tweaks/ExtractVariable.cpp
210 ↗(On Diff #211276)

A related FIXME would be helpful since it's pretty easy to fix(I think?) and seems like something important.

213 ↗(On Diff #211276)

Why do we need an anon namespace inside another anon namespace?

217 ↗(On Diff #211276)

Wouldn't it make more sense this to have this in some other place (maybe Selection) because other refactorings might want to make use of this feature?

235 ↗(On Diff #211276)

Can we negate this if to reduce nesting?

283 ↗(On Diff #211276)

We're considering the case where all the operators are the same. Isn't it true that in that case, the End will be the the first 'selected RHS' that we find during the traversal? If so, can't we just get rid of the IsStart?

393 ↗(On Diff #211276)

Why not just compute this inside ExtractionContext?

sammccall marked 6 inline comments as done.Jul 24 2019, 8:53 AM
sammccall added inline comments.
clang-tools-extra/clangd/refactor/tweaks/ExtractVariable.cpp
213 ↗(On Diff #211276)

Oops, forgot we already were in one.

217 ↗(On Diff #211276)

It's possible, but I think we should cross that bridge when we come to it, because it's not obvious what those other refactorings are.

There might be some subtle ways in which I've specialized the code for this use case, and it's easier to see that and polish it when we have more use cases in hand.

283 ↗(On Diff #211276)

Oh, right - I'm assuming a much more general tree than can actually exist.
This code is trying to handle an arbitrary binary tree of binops, but of course each actual operator is either left-associative or right-associative, and so it's a linked-list-shaped binary tree.
And the actual code is pretty simple, but the explanation sure is complicated.

And in fact, all the operators are left-associative! I didn't really want to hard-code that, but it'll simplify the code a lot I think.

So this should probably just be a simple loop:

findEndpoints(L, R):
  while (LL, RL) = parseBinop(L):
    L = (LL is selected) ? LL : RL
  return (L, R)

I'll rewrite this tomorrow. Thanks for the hint!

sammccall updated this revision to Diff 211521.Jul 24 2019, 8:54 AM
sammccall marked an inline comment as done.

Address minor comments, but not tree-traversal simplification yet.

Rather than handling arbitrary trees, exploit the fact that the operator is left-associative.

sammccall marked 2 inline comments as done.Jul 25 2019, 1:08 AM

This is ready for another round, I think.

clang-tools-extra/clangd/refactor/tweaks/ExtractVariable.cpp
393 ↗(On Diff #211276)

Using ExtractionContext to cache values that are used in several places but not everywhere seems confusing and to obfuscate the data flow.

In particular, when is it going to be initialized? Not in the constructor, because we don't want to do this during compare. On first access? That seems much less clear than explicitly computing it and passing it around.

SureYeaah added inline comments.Jul 25 2019, 7:09 AM
clang-tools-extra/clangd/refactor/tweaks/ExtractVariable.cpp
50 ↗(On Diff #211679)

Nit: same parameter order for replaceWithVar and insertDeclaration.

245 ↗(On Diff #211679)

Why do we need to check this for CXXOperatorCallExpr and not BinaryOperator?

305 ↗(On Diff #211679)

For 1 + [[2 + 3 * 4]] + 5, we probably don't get an invalid range from this function. If so, we might want to add one more check where we parse End and compare it's operator.

314 ↗(On Diff #211679)

Nit: For [[a + b + c]],

     +
   /  \
  +    c
 / \
a   b

For the second + as Op, a would be completely selected. So Op.LHS can be partially or completely selected.

clang-tools-extra/clangd/unittests/TweakTests.cpp
448 ↗(On Diff #211679)

Can we have some tests with macros as well?

sammccall marked 8 inline comments as done.Jul 25 2019, 9:08 AM
sammccall added inline comments.
clang-tools-extra/clangd/refactor/tweaks/ExtractVariable.cpp
50 ↗(On Diff #211679)

I think this *is* the same parameter order semantically: this is roughly (thing, definition) in both cases.
The fact that the *types* match but in the opposite order is just a coincidence I think?

245 ↗(On Diff #211679)

Because there's no child AST node corresponding to a builtin +, but there is one corresponding to an overloaded +: a DeclRefExpr to an operator+ FunctionDecl.

305 ↗(On Diff #211679)

I'm not quite sure what you mean here.

Annotating the operators, for 1 +a [[2 +b 3 *c 4]] +d 5:

  • N is *c
  • RHS is 4
  • LHS is 1 +a 2 +b 3 *c 4.

The point is that RHS can never be an operator of the same type, because if we replaced RHS with x +e y then N would be that +e instead (since + is left-associative).

314 ↗(On Diff #211679)

Yeah, I meant "at least partially". Elaborated comment.

clang-tools-extra/clangd/unittests/TweakTests.cpp
448 ↗(On Diff #211679)

Added a simple one that just verifies we support operands being wrapped in macros, but not operators.

sammccall updated this revision to Diff 211773.Jul 25 2019, 9:08 AM
sammccall marked 3 inline comments as done.

Elaborate tests and comments.

SureYeaah added inline comments.Jul 26 2019, 3:06 AM
clang-tools-extra/clangd/refactor/tweaks/ExtractVariable.cpp
50 ↗(On Diff #211679)

Sorry yes, makes sense.

305 ↗(On Diff #211679)

But isn't the tree something like

             +
          /    \ 
         +     5
      /     \
  +           *
/  \       /   \
1    2    3    4
SureYeaah added inline comments.Jul 26 2019, 3:29 AM
clang-tools-extra/clangd/unittests/TweakTests.cpp
448 ↗(On Diff #211679)

Isn't it the same behavior in both cases?

SureYeaah accepted this revision.Jul 26 2019, 3:33 AM
This revision is now accepted and ready to land.Jul 26 2019, 3:33 AM
sammccall marked an inline comment as done.Jul 26 2019, 3:33 AM
sammccall added inline comments.
clang-tools-extra/clangd/refactor/tweaks/ExtractVariable.cpp
305 ↗(On Diff #211679)

Whoops, somehow I transcribed that without noticing c was a * instead of a +.

From offline discussion: the code is apparently correct here: RHS can't be a matching operator, because it violates associativity: the parse tree would be rotated in that case. We don't care to distinguish between mismatching operator vs some other node entirely.

  • Expanded a comment to cover this more explicitly
  • adjusted a testcase to ensure that we expand a +-selection across a * on the RHS as well as the LHS.

There's also the idea that it would be nice to verify that we don't bail out of this function just because the selection covers some mismatched operator. The idea of a test that we keep digging right if left mismatches and vice versa is appealing, but as described we never need to to dig right, so such a case doesn't exist. I've settled for ensuring that 0 + 1 * [2 + 3] * 4 + 5 expands to cover 1 and 4, but not 0 or 5.

sammccall updated this revision to Diff 211905.Jul 26 2019, 3:34 AM

Improve testcase, expand comment to clarify RHS non-traversal.

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJul 26 2019, 8:31 AM