Page MenuHomePhabricator

[clang] Refactor how are doc comments attached to Decls
ClosedPublic

Authored by jkorous on Jul 25 2019, 3:37 PM.

Details

Summary
  • Add ASTContext::searchForDocComments so we don't have to load external comments in Sema::ActOnDocumentableDecls when trying to attach existing doc comments to newly parsed Decls.
  • Keep comments ordered and cache their decomposed locations.

This noticeably improves performance of code completion when using large modules with doc comments.

Diff Detail

Repository
rL LLVM

Event Timeline

jkorous created this revision.Jul 25 2019, 3:37 PM

Not a complete review, but sending out the comments I have so far.

clang/include/clang/AST/ASTContext.h
845 ↗(On Diff #211839)

"attachCommentsToJustParsedDecls"?

clang/include/clang/AST/RawCommentList.h
202 ↗(On Diff #211839)

"... in case there are no comments in ..."

219 ↗(On Diff #211839)

s/FileId hash/FileId/

clang/lib/AST/ASTContext.cpp
268 ↗(On Diff #211839)

Please add a blank line between functions.

270 ↗(On Diff #211839)

Please un-abbrevieate "search".

Also maybe call it "DeclLoc"?

444 ↗(On Diff #211839)

We don't care about VS2015 anymore. https://reviews.llvm.org/D64326

clang/lib/AST/RawCommentList.cpp
283 ↗(On Diff #211839)

Please move this comment down, right before the if statement that implement this logic.

319 ↗(On Diff #211839)

No need to parenthesize.

329 ↗(On Diff #211839)

I don't understand what you mean by "slow path".

334 ↗(On Diff #211839)

No need to parenthesize.

clang/lib/Sema/SemaDecl.cpp
12730 ↗(On Diff #211839)

s/D/d/

12732 ↗(On Diff #211839)

Also macros. (However I do agree with you that we can ignore this case.)

12738 ↗(On Diff #211839)

Would it be better to move this logic into searchForDocComments? It seems like it belongs there. That function already computes CommentsInThisFile etc.

clang/lib/Serialization/ASTReader.cpp
9733 ↗(On Diff #211839)

Wouldn't it be faster to directly modify Context.Comments.OrderedComments?

jkorous marked 15 inline comments as done.Aug 5 2019, 3:08 PM
jkorous added inline comments.
clang/include/clang/AST/ASTContext.h
845 ↗(On Diff #211839)

Sounds good - let's just finish the discussion about moving the logic from Sema::ActOnDocumentableDecls to this method first.

clang/lib/AST/RawCommentList.cpp
329 ↗(On Diff #211839)

The comment is no longer relevant. Thanks!

clang/lib/Sema/SemaDecl.cpp
12732 ↗(On Diff #211839)

You're absolutely right! I added a FIXME.

12738 ↗(On Diff #211839)

I was asking myself the same thing.

We can move the logic but then we'd have to change semantics of ASTContext::searchForDocComments - essentially documenting that it makes sense to be called only from Sema::ActOnDocumentableDecls context.

I don't have a strong preference but I'm worried that it would be kind of weird interface.

WDYT?

jkorous updated this revision to Diff 213474.Aug 5 2019, 3:10 PM
jkorous marked 2 inline comments as done.

Addressed feedback (comments, naming).

Finished the review. I like this approach a lot.

clang/lib/AST/ASTContext.cpp
437 ↗(On Diff #213474)

cacheRawCommentForDecl

We don't know that it is a doc comment (it could be an ordinary comment when ParseAllComments is true), and "data" in the name does not explain anything.

452 ↗(On Diff #213474)

Should it be setOriginalDecl(OriginalDeclForComment), like in ASTContext::getRawCommentForAnyRedecl?

Could you also call cacheDocCommentDataForDecl from ASTContext::getRawCommentForAnyRedecl?

457 ↗(On Diff #213474)

cacheRawCommentForRedecls

475 ↗(On Diff #213474)

Could you call cacheDocCommentDataForRedecls from ASTContext::getRawCommentForAnyRedecl?

clang/lib/Sema/SemaDecl.cpp
12738 ↗(On Diff #211839)

I think changing the name and semantics like that would be an improvement in clarity. I don't think it makes sense to call ASTContext::searchForDocComments in any other case anyway -- users who want to retrieve the comment should be calling ASTContext::getRawCommentForAnyRedecl.

jkorous updated this revision to Diff 214026.Aug 7 2019, 4:11 PM
jkorous marked 3 inline comments as done.

Rewrite caching of Decl -> RawComment mapping.

jkorous marked an inline comment as done.Aug 7 2019, 4:16 PM

I still have to do the searchForDocComments -> attachCommentsToJustParsedDecls change.

clang/lib/AST/ASTContext.cpp
452 ↗(On Diff #213474)

Well, turned out this was a spot on comment!

I started by explaining the use NoCommentInDecl and why this change is fine and realized we can save some memory and make the code (arguably) simpler. I ended up rewriting the caching completely.

jkorous updated this revision to Diff 214034.Aug 7 2019, 4:42 PM

searchForDocComments() -> attachCommentsToJustParsedDecls()

jkorous marked 4 inline comments as done.Aug 7 2019, 4:44 PM
gribozavr added inline comments.Aug 8 2019, 9:47 AM
clang/lib/AST/ASTContext.cpp
370 ↗(On Diff #214034)

Is it possible for D to be a null pointer? adjustDeclToTemplate does a dyn_cast on D, which would crash if D was null.

396 ↗(On Diff #214034)

return DeclRawComments.find(RedeclComment->second)->second; might be slightly better, because it will detect coding errors -- it will catch the case when DeclRawComments does not have an entry for that decl. The square bracket syntax will insert a null entry silently.

430 ↗(On Diff #214034)

I like the new approach a lot. The last loop is tricky, but I do think it is an improvement.

439 ↗(On Diff #214034)

Why not retrieve the canonical decl inside the cacheRawCommentForDecl from OriginalD?

466 ↗(On Diff #214034)

Indentation is off.

505 ↗(On Diff #214034)

Are there cases when canonical decl is null?..

clang/lib/Sema/SemaDecl.cpp
12729 ↗(On Diff #214034)

I think this FIXME should be sunk down into attachCommentsToJustParsedDecls, because it relates to the implementation of that function.

clang/lib/Serialization/ASTReader.cpp
9733 ↗(On Diff #211839)

So what about this point?

jkorous marked 15 inline comments as done.Aug 8 2019, 1:53 PM
jkorous added inline comments.
clang/lib/AST/ASTContext.cpp
370 ↗(On Diff #214034)

You're right. I moved the check above the adjustment and changed interface of adjustDeclToTemplate to make it obvious that != NULL is expected of input and guaranteed for output.

396 ↗(On Diff #214034)

Good point!

430 ↗(On Diff #214034)

I agree. I don't think we're doing any rocket science here so the code should be easy to understand.

439 ↗(On Diff #214034)

You are right, this is premature optimization.

//  - The first element is also often called the canonical element. Every
//    element has a pointer to it so that "getCanonical" can be fast.
...
class Redeclarable
505 ↗(On Diff #214034)

IIUC correctly there are no such cases at the moment just that the interface doesn't guarantee that. But it seems like we don't expect that possibility anywhere in clang.

clang/lib/Sema/SemaDecl.cpp
12729 ↗(On Diff #214034)

I've put it as a requirement to attachCommentsToJustParsedDecls doc comment. Since we're violating it here I'd leave the FIXME also here.

Or did you mean that I shouldn't be expecting that in the first place?

clang/lib/Serialization/ASTReader.cpp
9733 ↗(On Diff #211839)

Oh, absolutely. Thanks for reminding me!

jkorous updated this revision to Diff 214226.Aug 8 2019, 1:54 PM
jkorous marked 6 inline comments as done.

Addressed comments & clang-formatted the patch.

gribozavr added inline comments.Aug 8 2019, 11:24 PM
clang/lib/AST/ASTContext.cpp
340 ↗(On Diff #214226)

Why add a null check?

351 ↗(On Diff #214226)

Why null check?

400 ↗(On Diff #214226)

No need to. Dereferencing the end iterator is undefined behavior, which is sufficient to find the problem in sanitizer builds.

402 ↗(On Diff #214226)

Please don't add code after llvm_unreachable. We don't do defensive programming in LLVM/Clang.

370 ↗(On Diff #214034)

But why even add a null check at all? The contract of this function is that D is never null.

clang/lib/Sema/SemaDecl.cpp
12729 ↗(On Diff #214034)

What I meant is that the relevant logic (that is violating the expectation, and should be eventually fixed) is in attachCommentsToJustParsedDecls, therefore, the FIXME belongs there, not in the caller. There isn't much a caller can do about the problem. Also, when the problem will get fixed, I doubt someone will look at the callsite for FIXMEs.

clang/test/Index/comment-redeclarations.cpp
15 ↗(On Diff #214226)

Please add a newline.

jkorous marked 8 inline comments as done.Aug 9 2019, 1:33 PM
jkorous added inline comments.
clang/lib/AST/ASTContext.cpp
340 ↗(On Diff #214226)

You're right, nullptr can't be returned here.

351 ↗(On Diff #214226)

You're right, nullptr can't be returned here.

400 ↗(On Diff #214226)

Arguably debug builds are more common than sanitized builds so I'd rather not rely on sanitizer being used.

370 ↗(On Diff #214034)

I am not sure what you mean. "D is never null" isn't explicitly stated in the ASTContext.h header - do you mean that it's implied by the interface & doc comment of ASTContext::getRawCommentForAnyRedecl?

It seems to me that we should either

  • Explicitly mention this in the header file and add an assert here.

or

  • Handle nullptr value here.
clang/lib/Sema/SemaDecl.cpp
12729 ↗(On Diff #214034)

Sorry, I don't understand what you mean by:

relevant logic (that is violating the expectation, and should be eventually fixed)

The way I understand it is that attachCommentsToJustParsedDecls states the expectation in its doc comment

/// Requirement: All \p Decls are in the same file.

and it's up to a caller to satisfy it, which isn't something Sema::ActOnDocumentableDecls does.

Are you saying that attachCommentsToJustParsedDecls should pretend it's more generic than that and admit it's cheating by FIXME in its implementation?

jkorous updated this revision to Diff 214443.Aug 9 2019, 1:34 PM
jkorous marked 2 inline comments as done.

Addressed comments

gribozavr accepted this revision.Aug 12 2019, 1:34 AM

Very nice improvement! Thank you!

This revision is now accepted and ready to land.Aug 12 2019, 1:34 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptAug 13 2019, 11:10 AM