This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Type hints for variables with 'auto' type
ClosedPublic

Authored by nridge on May 9 2021, 9:06 PM.

Diff Detail

Event Timeline

nridge created this revision.May 9 2021, 9:06 PM
nridge requested review of this revision.May 9 2021, 9:06 PM
nridge updated this revision to Diff 343963.May 9 2021, 9:07 PM

test formatting fix

nridge added a comment.EditedMay 9 2021, 9:15 PM

Here's a first draft of an implementation of type hints. It provides hints for variable declarations that use auto (possibly with const, &, etc.) as the type.

There are additional places in which type hints could be useful, such as:

  • Functions with deduced return type (C++14)
  • Structured bindings (either for the aggregate, or for the individual bindings)
  • decltype(expr) in any context
  • probably others I'm not thinking of

but I think it makes sense to pursue those in subsequent patches.

The patch also doesn't currently make an attempt to omit the type hint in cases where the type is obvious from the initializer. There's probably some low-hanging fruit to be had there:

  • = Foo(...) or = (Foo) (...)
  • = static_cast<Foo>(...)
  • It might be nice to handle things like make_unique<Foo>(...) or dyn_cast<Foo>(...), though I'm not sure if there's a general heuristic that would apply to things like this as a category.

Client-side patch for experimentation: https://github.com/clangd/vscode-clangd/pull/188

sammccall accepted this revision.May 26 2021, 4:46 AM

Sorry *again* about the delayed response, will explain off-list.

This looks great. There's a substantive question about auto &x⟦: int &⟧ = 42 vs auto⟦int⟧ &x = 42. But I don't think this should block landing this unless you want to.

clang-tools-extra/clangd/InlayHints.cpp
33

For lambda values, the overwhelmingly common case is auto X = <lambda-expr>.

A few reasons we may want to not show these at all (and so omit type hints for lambdas as a first approximation):

  • the fact that it's a lambda is pretty obvious
  • lambdas very often have complex introducers (captures/params) that a hint will displace/add noise to
  • we may want to add type hints to captures-with-initializers, which are very much like auto-typed variables, and the fewer hints we have to cram on a line the better

(fine to consider this stuff later, just wanted to mention it)

76

FWIW I think hinting the individual bindings will be more useful and also more consistent with the other hints you have here.

81

why this check vs checking whether AT is deduced?
(thus fixing the dependent T testcase)
At first I assumed this was the usual "AutoType in AST doesn't store the deduced type" problem, but shouldn't it work properly if we're starting at the variable type?

82

This places the hint on the variable rather than the auto. And fittingly (but subtly), it prints type rather than just the placeholder type (e.g. const auto& x = 42 => const int&, not int)

This decision is worth a comment, and probably a little discussion :-)

The alternative of annotating auto has advantages:

  • the deduced auto type is the most obvious question when reading the code
  • that auto deduces to a particular type is the "correct" model, and may lead to better language understanding. Extreme unrealistic example: in auto x = foo(), *y = bar(); I'd strongly prefer one hint.
  • the hint will often be shorter, which is an important consideration since we're reflowing column-limited text
  • the annotated code more closely resembles code that does not use auto

And annotating the variable has advantages too:

  • the type of the variable is the most directly important fact
  • generalizes better to cases where auto is not present (lambda captures, individual structured bindings)
  • i suspect simpler implementation

My intuition is to expect annotating auto to be a little nicer, mostly because it will come closer to making code read the same way whether it uses auto or not. I'm not certain though. Interested if you have strong feelings about this!

To save a round-trip: a likely conclusion is "maybe annotating auto is better, but it's not clear and/or it's a bunch of work". That's 100% fine, but we should just leave a comment whether this is something decided against vs desirable but deferred vs unsure.

clang-tools-extra/clangd/unittests/InlayHintTests.cpp
445

consider = [b(1+1)](int a) { ... } and we can consider adding type hints to b later

487

It'd be useful to have cases showing how we handle "bulky" typenames, e.g. ns::OuterClass::Template<int> which I think we render as Template<int>.

489

not sure if you meant dynamic_cast or dyn_cast here, both are important but distinct!

489

maybe mention the other low-hanging fruit cases from the patch description too

490

The use of a __reserved name might be a good hint this is a structural type whose name is not a meaningful hint. (I guess maybe that's already what you meant!)

This revision is now accepted and ready to land.May 26 2021, 4:46 AM
nridge updated this revision to Diff 348727.May 30 2021, 10:41 PM
nridge marked 8 inline comments as done.

Address review comments

clang-tools-extra/clangd/InlayHints.cpp
33

There is some value in showing (lambda) as a hint: it tells you the initializer is not an IIFE (immediately-invoked function expression, i.e. auto var = [](...){ ... }(); -- note the parentheses at the end). For multi-line lambdas, this information is not otherwise present on the first line of the declaration.

Granted, how useful this is depends on the code style, i.e. whether IIFEs are common or even allowed.

I'm going to keep this as-is for now, but I'm definitely open to revise this based on usage experience.

76

Makes sense, I've updated the comment here and in the testcase to reflect this direction.

81

Not sure if I'm understanding the suggestion correctly, but if I make the condition AT->isDeduced(), I get an unwanted auto hint for var1 in TypeHints.DependentType as well (and moreover, the hint for var2 is also auto instead of the desired T).

82

Very good points here.

The argument I find most convincing for annotating the auto is the space savings, which are significant especially in the presence of const.

On the other hand, the arguments I find most convincing for the current approach are:

  • the generalization to lambda captures and structured bindings, as you mention
  • the familiarity to people coming from other languages like Typescript or Rust which have let var = expr; and who get a hint after the var

(Not sure if you view that second one as a valid consideration, but thought I'd mention it :))

For now, I've kept the current approach and documented it, but I'm definitely happy to reconsider based on usage experience / user feedback.

clang-tools-extra/clangd/unittests/InlayHintTests.cpp
445

Good idea, and... turns out the existing code was already adding the hint for the init-capture!

487

Added a case like this to NoNamespaces (and renamed that test case NoQualifiers).

489

I meant llvm::dyn_cast. It's not obvious to me what sort of heuristic would allow us to recognize things like that -- if you have ideas please feel free to mention.

Syntactic casts and built-in cast operators are indeed low-hanging fruit, I added mention of this in a FIXME here.

490

I didn't really have a heuristic mind, though omitting the hint for __reserved names could be reasonable.

If we consider a case like:

std::vector<Foo> vec;
auto it2 = vec.erase(it1);

the hint is __normal_iterator<Foo *, vector<Foo>>. Ideally, I'd want just Foo *, but I guess we don't get to have that, so omitting it might be the next best thing.

This revision was automatically updated to reflect the committed changes.
nridge added inline comments.Jun 6 2021, 7:05 PM
clang-tools-extra/clangd/InlayHints.cpp
81

and moreover, the hint for var2 is also auto instead of the desired T

I've looked into this a bit further, and it looks to me like the T isn't even stored in the AST. The type is just recorded as auto (and that's what e.g. hover shows too).

I suspect the auto is only filled in after instantiation (i.e. once the initializer expression has a concrete type).