This is an archive of the discontinued LLVM Phabricator instance.

[libclang] Allow skipping function bodies in preamble only
ClosedPublic

Authored by nik on Apr 19 2018, 6:51 AM.

Details

Summary

As an addition to CXTranslationUnit_SkipFunctionBodies, provide the
new option CXTranslationUnit_LimitSkipFunctionBodiesToPreamble,
which constraints the skipping of functions bodies to the preamble
only. Function bodies in the main file are not affected if this
option is set.

Skipping function bodies only in the preamble is what clangd already
does and the introduced flag implements it for libclang clients.

Diff Detail

Event Timeline

nik created this revision.Apr 19 2018, 6:51 AM

@ilya: Using SkipFunctionBodies_AllExceptTemplates for the preamble might be also useful for clangd.

In D45815#1072094, @nik wrote:

@ilya: Using SkipFunctionBodies_AllExceptTemplates for the preamble might be also useful for clangd.

Unfortunately, that's also the biggest performance win you get when skipping the function bodies.
I.e. most codebases do not have function bodies in headers unless the function is template.

Have you measured the performance gain of skipping only non-template functions, but not skipping the template ones?
Would that really make a significant difference for any real codebase?

include/clang/Frontend/SkipFunctionBodies.h
18

Maybe make it enum class and remove the name prefix from enumerators?
I.e. enum class SkipFunctionBodiesKind { None, ....

nik added a comment.Apr 20 2018, 7:38 AM
In D45815#1072094, @nik wrote:

@ilya: Using SkipFunctionBodies_AllExceptTemplates for the preamble might be also useful for clangd.

Unfortunately, that's also the biggest performance win you get when skipping the function bodies.
I.e. most codebases do not have function bodies in headers unless the function is template.

Have you measured the performance gain of skipping only non-template functions, but not skipping the template ones?
Would that really make a significant difference for any real codebase?

Here are some rough numbers, based on LIBCLANG_TIMING=1 output: https://dpaste.de/os39/raw

Reparses of Qt Creator's texteditor.cpp
Without any skipping of function bodies in the preamble: 1.9s
Skipping all function bodies in the preamble: 1.0s
Skipping all function bodies in the preamble except function template bodies: 1.2s

Reparses of Clang's SemaDecl.cpp
Without any skipping of function bodies in the preamble: 1.7s
Skipping all function bodies in the preamble: 0.59s
Skipping all function bodies in the preamble except function template bodies: 0.64s

Hmm, that indicates that template function bodies are actually not that
dominant, which I also haven't thought. Now I'm doubting the correctness of
my patch/tests.

I'll recheck this on Tuesday. Until then, you/someone might want to give this also a shot.

include/clang/Frontend/SkipFunctionBodies.h
18

Will do.

nik added a reviewer: erikjv.Apr 24 2018, 1:46 AM
In D45815#1073581, @nik wrote:

Hmm, that indicates that template function bodies are actually not that
dominant, which I also haven't thought. Now I'm doubting the correctness of
my patch/tests.

The numbers are definitely interesting, we need to double-check if that it's the case. But they definitely make a good point for including this option.
It seems Qt and LLVM are exactly the types of codebases where skipping the templates doesn't bring much value, as most of the code is not templated. On the other hand, skipping inline function might potentially be a win for those.

I'm still tempted to say that we should either skip all bodies or none of them to keep the code simpler, but I see why having errors from template instantiations is preferable.

ilya-biryukov added inline comments.Apr 24 2018, 3:03 AM
include/clang/Frontend/ASTUnit.h
679

Default ctor will be generated automatically. Maybe remove explicit definition?

include/clang/Frontend/FrontendOptions.h
302

Maybe add a comment to match the code style of other options?

lib/Frontend/ASTUnit.cpp
1662

Maybe use LLVM's make_scope_exit from ADT/ScopeExit.h instead of creating a separate RAII class?
Or even directy set/restore the value of the flag right in the function.
Given that LLVM does not use exceptions, RAII class does not seem to buy much in terms of correctness and doesn't really make the code easier to read, IMO.

But up to you.

lib/Parse/Parser.cpp
1107

Requested in this name is a bit confusing, e.g. it's not clear who requested it.
Maybe rename to TrySkipFunctionBody or something similar?

yvvan added inline comments.Apr 24 2018, 5:55 AM
include/clang/Frontend/ASTUnit.h
679

or replace by constructor with parameters

nik marked 3 inline comments as done.Apr 24 2018, 7:02 AM
In D45815#1073581, @nik wrote:

Hmm, that indicates that template function bodies are actually not that
dominant, which I also haven't thought. Now I'm doubting the correctness of
my patch/tests.

The numbers are definitely interesting, we need to double-check if that it's the case. But they definitely make a good point for including this option.
It seems Qt and LLVM are exactly the types of codebases where skipping the templates doesn't bring much value, as most of the code is not templated. On the other hand, skipping inline function might potentially be a win for those.

I'm still tempted to say that we should either skip all bodies or none of them to keep the code simpler, but I see why having errors from template instantiations is preferable.

OK, I've rechecked this change. I don't see any obvious mistake :)

In my previous tests/timings I had some modifications to the clang command line applied (extra diagnostics enabled with -Weverything and maybe others, using -isystem instead of -I). I've reverted these and made some runs/timings with only "-Wall -Wextra" (https://dpaste.de/cZgZ/raw) - The difference between skip-all-in-preamble and skip-only-non-template-in-preamble becomes even smaller now for Qt Creator's texteditor.cpp.

include/clang/Frontend/ASTUnit.h
679

The explicit definition is needed to workaround https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58328 ("SkipFunctionBodiesOptions()" is used as default value in some function signatures)

679

That would not really help in this case.

include/clang/Frontend/FrontendOptions.h
302

No comment here is in line with "CodeCompleteOptions CodeCompleteOpts" below. That one has also its comments at definition. Also, I can't come up with a comment that adds more value than the name itself and that does not duplicate the comment from SkipFunctionBodies.h.

lib/Frontend/ASTUnit.cpp
1662

I've added the RAII class solely to minimize duplication. Now I'm just passing the flags to getMainBufferWithPrecompiledPreamble() and set/reset there directly.

lib/Parse/Parser.cpp
1107

TrySkipFunctionBody somewhat collides with the function trySkippingFunctionBody(). I'm using "ShouldSkipFunctionBody" now.

nik updated this revision to Diff 143738.Apr 24 2018, 7:03 AM
nik marked 2 inline comments as done.

Addressed inline comments.

About the timing testing: I'm not very familiar with boost, but it probably has a library that's using templates more than either LLVM or Qt. Maybe you can test one of those too?

LGTM otherwise.

ilya-biryukov requested changes to this revision.Apr 26 2018, 3:54 AM

OK, I've rechecked this change. I don't see any obvious mistake :)

I think I got to the bottom of it. We didn't expect a big win, because we expect people to not put their non-template code into the header files. Which is probably true.
The problem with our reasoning, is that this patch also skip bodies of non-template functions inside template classes, e.g.

template <class T>
struct vector {
   template <class It>
   void append(T* pos, It begin, It end) {
        / * This function won't be skipped, it is a template. */ 
   }

   void push_back(T const&) {
        /* However, this function will be skipped! It is not a template! */
   }
};

So it's not surprising that we get a big win. Template function are (probably) much more rare than non-template functions inside templates.

However, note that we will still miss a lot of diagnostics if we skip bodies of non-template functions inside templates.
So I don't see much value in an option like this: it still compromises correctness of diagnostics even inside the main file, while still missing out some performance opportunities that come from skipping the template functions. And it makes the code more complex.

We should either have an option that does not skip functions inside template classes or keep the current boolean value.
We should definitely measure the performance impact of having such an option before adding more complexity to the code.
WDYT?

BTW, maybe split this change out into two patches: one that allows to skip function bodies only in the preamble, the other that turns SkipFunctionBodies boolean into a enum and changes parser, etc.
The ASTUnit changes LG and we could probably LGTM them sooner than the parser changes.

lib/Parse/ParseCXXInlineMethods.cpp
104 ↗(On Diff #143738)

Can't this be a template too? Do we need to check for it here?

This revision now requires changes to proceed.Apr 26 2018, 3:54 AM
nik added a comment.Apr 27 2018, 2:41 AM

OK, I've rechecked this change. I don't see any obvious mistake :)

I think I got to the bottom of it. We didn't expect a big win, because we expect people to not put their non-template code into the header files. Which is probably true.
The problem with our reasoning, is that this patch also skip bodies of non-template functions inside template classes, e.g.

template <class T>
struct vector {
   template <class It>
   void append(T* pos, It begin, It end) {
        / * This function won't be skipped, it is a template. */ 
   }

   void push_back(T const&) {
        /* However, this function will be skipped! It is not a template! */
   }
};

So it's not surprising that we get a big win. Template function are (probably) much more rare than non-template functions inside templates.

Oh, right. That makes sense. I didn't have a test for this case and thus didn't notice.

However, note that we will still miss a lot of diagnostics if we skip bodies of non-template functions inside templates.
So I don't see much value in an option like this: it still compromises correctness of diagnostics even inside the main file, while still missing out some performance opportunities that come from skipping the template functions. And it makes the code more complex.

We should either have an option that does not skip functions inside template classes or keep the current boolean value.
We should definitely measure the performance impact of having such an option before adding more complexity to the code.
WDYT?

BTW, maybe split this change out into two patches: one that allows to skip function bodies only in the preamble, the other that turns SkipFunctionBodies boolean into a enum and changes parser, etc.
The ASTUnit changes LG and we could probably LGTM them sooner than the parser changes.

Agree. I'll first reduce this change to the skip-in-preamble-only functionality and then will check some numbers with the new case.

nik updated this revision to Diff 144303.Apr 27 2018, 2:41 AM

Reduction to skip-in-preamble-only functionality.

nik retitled this revision from [libclang] Add options to limit skipping of function bodies to [libclang] Allow skipping function bodies in preamble only.Apr 27 2018, 2:43 AM
nik edited the summary of this revision. (Show Details)
nik added inline comments.Apr 27 2018, 7:18 AM
lib/Parse/ParseCXXInlineMethods.cpp
104 ↗(On Diff #143738)

Correct. Here is also the point were we would check for the parent class being a template.

nik added a comment.Apr 27 2018, 7:28 AM

OK, to skip all function bodies in the preamble except for template functions and functions within a template class, I've amended the previous diff with:

  • a/lib/Parse/ParseCXXInlineMethods.cpp +++ b/lib/Parse/ParseCXXInlineMethods.cpp @@ -102,9 +102,14 @@ NamedDecl *Parser::ParseCXXInlineMethodDef(AccessSpecifier AS, }

    if (SkipFunctionBodies != SkipFunctionBodiesKind::None && + TemplateInfo.Kind == ParsedTemplateInfo::NonTemplate && + !isa<ClassTemplateDecl>(getCurrentClass().TagOrTemplate) && + !isa<ClassTemplateSpecializationDecl>(getCurrentClass().TagOrTemplate) && + !isa<ClassTemplatePartialSpecializationDecl>( + getCurrentClass().TagOrTemplate) && (!FnD || Actions.canSkipFunctionBody(FnD)) && trySkippingFunctionBody()) { Actions.ActOnSkippedFunctionBody(FnD);

I'm not sure whether this covers all cases, but here are the timing in relation to others:

Reparse for skipping all function bodies: 0.2s
Reparse for skipping all function bodies except template functions (previous patch set, not handling function bodies within template classes): 0.27s
Reparse for skipping all function bodies with the diff above (fixing another case + ignoring function bodies in templates): 0.44s
Reparse without skipping any function bodies in the preamble: 0.51s

As I said, I'm not sure whether this diff covered all cases, but it can get only more worse than 0.44s :) That's enough for me to stop here.

nik added a comment.Apr 27 2018, 7:29 AM

Trying to format the diff in the previous comment:

--- a/lib/Parse/ParseCXXInlineMethods.cpp
+++ b/lib/Parse/ParseCXXInlineMethods.cpp
@@ -102,9 +102,14 @@ NamedDecl *Parser::ParseCXXInlineMethodDef(AccessSpecifier AS,
   }
 
   if (SkipFunctionBodies != SkipFunctionBodiesKind::None &&
+      TemplateInfo.Kind == ParsedTemplateInfo::NonTemplate &&
+      !isa<ClassTemplateDecl>(getCurrentClass().TagOrTemplate) &&
+      !isa<ClassTemplateSpecializationDecl>(getCurrentClass().TagOrTemplate) &&
+      !isa<ClassTemplatePartialSpecializationDecl>(
+          getCurrentClass().TagOrTemplate) &&
       (!FnD || Actions.canSkipFunctionBody(FnD)) && trySkippingFunctionBody()) {
     Actions.ActOnSkippedFunctionBody(FnD);
nik added a comment.May 3 2018, 11:56 PM

Do I miss something? I've uploaded a new diff/version and state is still "(X) Requested Changes to Prior Diff".

Waiting for review.

ilya-biryukov added inline comments.May 4 2018, 6:22 AM
include/clang/Frontend/ASTUnit.h
104

Maybe move this out of ASTUnit? Would allow removing the first qualifier in usages outside the class itself (i.e. ASTUnit::SkipFunctionBodiesScope::Preamble will be shorter!)

370

NIT: Maybe keep the name SkipFunctionBodies? It seems that putting Scp at the end does not add any clarity.

832

This parameter seems slightly out of place.
The scope of skipped function bodies should be a property of the whole ASTUnit, changing it on every reparse would mean more complexity and wouldn't be used anyway.

I suggest storing a field of type SkipFunctionBodiesScope instead of adding parameters to Reparse and LoadFromCompilerInvocation.

Sorry for the delay

nik updated this revision to Diff 145648.May 8 2018, 1:56 AM
nik marked 3 inline comments as done.
nik edited the summary of this revision. (Show Details)

Addressed comments.

nik added inline comments.May 8 2018, 1:56 AM
include/clang/Frontend/ASTUnit.h
370

IIRC I've done that to better distinguish it from e.g. "getFrontendOpts().SkipFunctionBodies", but yes, they are not so often used together, removed "Scp" suffix.

832

Agree, this makes the change significantly shorter :)

nik edited the summary of this revision. (Show Details)May 8 2018, 2:09 AM
This revision is now accepted and ready to land.May 14 2018, 6:51 AM
This revision was automatically updated to reflect the committed changes.