This is an archive of the discontinued LLVM Phabricator instance.

[clangd] parse all make_unique-like functions in preamble
ClosedPublic

Authored by upsj on Apr 29 2022, 11:00 AM.

Details

Summary

I am working on support for forwarding parameter names in make_unique-like functions, first for inlay hints, later maybe for signature help.
For that to work generically, I'd like to parse all of these functions in the preamble. Not sure how this impacts performance on large codebases though.

Diff Detail

Event Timeline

upsj created this revision.Apr 29 2022, 11:00 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 29 2022, 11:00 AM
upsj requested review of this revision.Apr 29 2022, 11:00 AM

I'll have a proper look at this when I get a chance but meanwhile adding Sam and Kadir who may have thoughts

It was intentional to only (initially) support std::make_unique as its implementation is trivial. Whereas I looked at std::make_shared and it appears to result in instantiating a lot of templates.
This was more "I'm sure this is useful and cheap" than "I'm sure covering more functions is too expensive" though.

I suspect that parsing all forwarding function bodies is quite expensive (well, parsing them => instantiating them => instantiating everything from their bodies).
We can measure this in a couple of ways:

  • some work will happen while parsing headers, this will show up as increased preamble size which is easy to measure reliably
  • some will only happen when parsing main files, this will cause increased RAM and latency in main-file parses, which is less reproducible

I'll try patching this and at least see what happens to preamble sizes on some big files

@adamcz FYI (Adam was looking at forwarding functions, for completion/signature help, but isn't anymore I believe)

clang-tools-extra/clangd/Preamble.cpp
177

this is a linear search, and so is the next line, let's just do it once :-)

177

I feel that checking template params -> injected template args -> params is a bit roundabout. (And it involves some searching and construction of the injected template args).

Possibly more direct, and I think equivalent:

  • get the last function param, check that it's a pack and unwrap
  • check that its (non-ref) type is a TemplateTypeParmType
  • verify that the param is from the function template rather than elsewhere, by comparing type->getDepth() == funcTemplate->getTemplateParameters()->getDepth()

I think all these are constant-time, too

Trying this on a few files, this seems like it increases preamble sizes up to 1% or so:

                     | Before    | After
---------------------+-----------+----------
AST.cpp              | 42249648  | 42419732
XRefsTests.cpp       | 56525116  | 56763768
SelectionDAGISel.cpp | 37546764  | 37668564
internal_s.cc        | 59582216  | 60024876
internal_m.cc        | 192084984 | 193850560
internal_l.cc        | 365811816 | 368841388

I can't see any reason to think that RAM/CPU usage would be out of proportion to this.
A 1% regression here isn't trivial but seems worthwhile for significant functional improvements.
I suppose we can add:

  • diagnostics (already with this patch)
  • inlay hints
  • signature help

WDYT about keeping this behind a flag until these are working?
For diagnostics alone, I'm not sure this is a good tradeoff (otherwise why would we restrict it to variadics? FWIW allowing all function templates to be parsed is +4% to preamble size for internal_l.cc.

I'm less concerned about the effects on the main file, partly because preambles are a bigger performance cliff, and partly because we're probably only regressing in cases where the user really is seeing benefits too.

upsj added a comment.Apr 30 2022, 3:56 AM

Thanks for checking! Putting it behind a flag was my intention from the start anyways, since ideally I would like to traverse the AST through something like emplace_back ->construct/realloc_insert -> allocator::construct until I reach a non-forwarding function.
Do you have some pointers on what needs to be done to add a new flag? I am still kind of new to LLVM development :)

Great! If this were to be a public, user-controlled feature we'd make it part of the config file which is a bit more involved, but I think this is rather just a developer toggle until it's time to turn it on by default, so a command-line flag seems fine.

This needs to be passed from the main binary => ClangdServer => Preamble::build, as something like bool AlwaysParseForwardingFunctions.

I think the most appropriate place to pass this to Preamble::build is in ParseOptions defined in Compiler.h (currently empty) which is embedded in ParseInputs. Something like bool AlwaysParseForwardingFunctions.
And the usual way to pass command line flags into ClangdServer is via ClangdServer::Options.

See https://github.com/llvm/llvm-project/commit/e6be5c7cd6d227144f874623e2764890f80cad32 where we removed a couple of ParseOptions, I think you'd basically want the opposite of that.
(Sorry about the plumbing, configuration is always a pain).

For testing, you want the ability to turn this option on in TestTU. I'd just add ParseOptions TestTU::ParseOpts as a public member that your test can set, and use it from TestTU::inputs() instead of the current Inputs.Opts = ParseOptions();.

Trying this on a few files, this seems like it increases preamble sizes up to 1% or so

Are preamble sizes a good proxy for preamble build times as well? I suspect that may be the metric that's more noticeable for users.

Trying this on a few files, this seems like it increases preamble sizes up to 1% or so

Are preamble sizes a good proxy for preamble build times as well? I suspect that may be the metric that's more noticeable for users.

Agree, yes I think they're a pretty good proxy for this sort of change which parses more/less code. (Preamble size more or less counting AST nodes).

upsj updated this revision to Diff 426387.May 2 2022, 5:57 AM
upsj marked 2 inline comments as done.

Rewrote detection of forwarding functions in preamble, added --preamble-parse-forwarding flag

I ran clangd over bits/stdc++.h, and it seems like the condition works very well:

It picked up

  • container::emplace and friends
  • allocator::construct and friends
  • make_unique, make_shared and friends
  • some parts of functional
  • some parts of tuple
  • some parts of thread

most of which seem like good candidates for forwarding parameters.

In terms of runtime, I can't see any noticeable difference between parsing with or without --preamble-parse-forwarding, but of course, all of libstdc++ is not the most realistic scenario

upsj added inline comments.May 2 2022, 6:04 AM
clang-tools-extra/clangd/Preamble.cpp
197

Can PackTypePtr ever be nullptr?

upsj updated this revision to Diff 427852.May 7 2022, 6:12 AM

Work around the currently broken isExpandedParameter, also fix the broken diff

upsj updated this revision to Diff 427854.May 7 2022, 6:15 AM

accidentally pushed to the wrong revision, this is the previous version

Thanks, this looks good!

  • the plumbing can be simplified by moving the option into ParseOptions, sorry about the churn!
  • we need to add some tests, probably easiest by checking for the diagnostics that are now produced. In unittests/DiagnosticsTests.cpp there's TEST(DiagnosticTest, MakeUnique) which checks the existing make_unique support in this way. We should have a similar function with a different name and verify that it does/doesn't produce diags depending on the flag. We should probably also have a check that two levels of forwarding works.
clang-tools-extra/clangd/Preamble.cpp
180

nit: reverse the order of the && here, the isStr should be cheaper to check

185

rather than inlining all this here, can you pull out a function isLikelyForwardingFunction(FunctionTemplateDecl*)?
That documents the high-level goal and moves that logic away from the flow control of the different types of decls/options that shouldSkipFunctionBody deals with. It also avoids the confusing "return false" when you found what you were looking for!

185

somewhere we need to explain *why* we're parsing these functions. Something like:

Usually we don't need to look inside the bodies of header functions to understand the program. However when forwarding function like emplace() forward their arguments to some other function, the interesting overload resolution happens inside the forwarding function's body. To provide more meaningful meaningful diagnostics, code completion, and parameter hints we should parse (and later instantiate) the bodies.

This could either be here, or possibly on the option inside ParseOptions.

186

nit: getTemplatedDecl().

(getAsFunction() is a loose do-what-I-mean helper, and you know the precise case you're dealing with here)

There's no need for a null check here, you can't have a FunctionTemplateDecl with no function.

190

nit: if (const auto* PET = dyn_cast<PackExpansionType>(LastParam->getType()))
and then use it below

193

again, this is a loose function where you can use a precise one.
PET->getPattern()

197

sure: template <class...T> int foo(T*...).
or even some other more exotic derived type.

I don't think we want to include that case, so a null check is fine, with another comment: // ... of the form T&&...

clang-tools-extra/clangd/Preamble.h
103 ↗(On Diff #427854)

instead of passing this as a separate parameter, if you make it part of the ParseOptions then it's available in ParseInputs.
This reduces the amount of plumbing/noise needed.

clang-tools-extra/clangd/TUScheduler.h
208 ↗(On Diff #427854)

can you change "make_unique" to "emplace"?
It's confusing if our example is the one function that gets parsed either way.

209 ↗(On Diff #427854)

TUScheduler should not need to know about this option at all, it can be passed from ClangdServer through to buildPreamble by placing it in the ParseInputs, specifically in the ParseOptions struct.

clang-tools-extra/clangd/tool/ClangdMain.cpp
501

Preamble is a bit jargony for a command-line flag. And "forwarding" is dangling.
Maybe "parse-forwarding-functions"?
(Internally the name is fine though)

502

again, make_unique -> emplace

503

nit: use a trailing comma to get consistent formatting with the others (UseDirtyHeaders is an anomaly)

upsj updated this revision to Diff 427910.May 8 2022, 1:07 AM
upsj marked 13 inline comments as done.

review updates:

  • Pass ParseForwardingFunctions via ParseOptions
  • Simplify check for forwarding function
  • Add test checking diagnostics for make_shared
upsj updated this revision to Diff 427911.May 8 2022, 1:09 AM

update test docs

upsj added a comment.May 15 2022, 10:16 AM

@sammccall @nridge can you give this another look? :)

I looked through this patch, and it looks good to me! I will leave the final approval to Sam though.

clang-tools-extra/clangd/unittests/DiagnosticsTests.cpp
416

I'm confused about this line: you say "show its body is not needed" but forward has a body here

upsj updated this revision to Diff 429630.May 16 2022, 12:44 AM
upsj marked an inline comment as done.

remove std::forward bodies from diagnostic tests

clang-tools-extra/clangd/unittests/DiagnosticsTests.cpp
416

Good catch, that was copy-pasted from std::make_unique above, but it doesn't seem to be necessary there either. I'll remove the bodies both.

sammccall accepted this revision.May 16 2022, 1:41 AM

Thanks, this looks good now! A last few nits here.

Would you like me to commit for you?
From your patches so far, I think it's appropriate to request commit access if you want it.

clang-tools-extra/clangd/ClangdServer.cpp
221–222

nit: rather assign
Opts.PreambleParseForwardingFunctions = PreambleParseForwardingFunctions

booleans in these opts structs are particularly prone to accidentally reordering and mis-assigning

One day we'll have designated initializers for this...

clang-tools-extra/clangd/Preamble.cpp
140

nit: this can be static, and I think it'd be slightly clearer (e.g. that this doesn't interact with the policy flag)

166

nit: only one "meaningful" :-)

clang-tools-extra/clangd/unittests/TestTU.h
87

Instead of adding a parameter here, we should add it to the TestTU struct and use it in inputs().

*many* things affect the behavior of build(), and tests would quickly become unreadable if we passed many of them as parameters. So we hold the line at zero :-)

This revision is now accepted and ready to land.May 16 2022, 1:41 AM
upsj updated this revision to Diff 429653.May 16 2022, 2:08 AM
upsj marked 4 inline comments as done.

review updates

upsj added a comment.May 16 2022, 2:10 AM

@sammccall Feel free to merge with Author: Tobias Ribizel <ribizel@kit.edu>
Thanks for the vote of confidence, I'll apply once the forwarding stuff is done :)

This revision was landed with ongoing or failed builds.May 16 2022, 2:23 AM
This revision was automatically updated to reflect the committed changes.