This is an archive of the discontinued LLVM Phabricator instance.

[clang][ExtractAPI] Add support for single symbol SGF
ClosedPublic

Authored by dang on Dec 1 2022, 7:10 AM.

Details

Summary

This is mainly adding an entry point to SymbolGraphSerializer at
serializeSingleSymbolSGF and exposing the necessary data to make this
possible. Additionaly there are some changes to how symbol kinds and
path components are serialized to make the usage more ergonomic in
serializeSingleSymbolSGF.

[clang][ExtractAPI] Add libclang support

Introduces APIs to:

  • create an APISet from a TU
  • dispose of an APISet
  • query an APISet for a single symbol SGF for a given USR.
  • generate a single symbol SGF for a given CXCursor, this only traverses

the necessary AST nodes to construct the result as oppposed as going
through the entire AST.

Diff Detail

Event Timeline

dang created this revision.Dec 1 2022, 7:10 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 1 2022, 7:10 AM
Herald added a subscriber: arphaman. · View Herald Transcript
dang requested review of this revision.Dec 1 2022, 7:10 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 1 2022, 7:10 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
dang updated this revision to Diff 480172.Dec 5 2022, 11:16 AM

Removing some accidental unneeded includes that were tripping up CI.

zixuw added inline comments.Dec 6 2022, 4:31 PM
clang/include/clang/ExtractAPI/API.h
132–134

Misplaced comment?

234

What's the reason for refactoring out instance vs. class property? Looks like this should be a separated patch

316

No need for classof in an abstract node

399

No need for classof in an abstract node

407

Same question as instance/class property

466

No need for classof in an abstract node

clang/include/clang/ExtractAPI/DeclarationFragments.h
91

Is the decl guaranteed to be available when Fragment is used?
In our existing use case this might be fine as the serializer will be used while the AST is still alive. But this is actually the first thing that adds the dependency to the AST IIRC. So this makes the whole APISet dependent on the AST

clang/lib/ExtractAPI/API.cpp
145

nit: tab

169

nit: tab

193

nit: tab

clang/tools/c-index-test/c-index-test.c
4931–4932

nit: tab

dang updated this revision to Diff 480870.Dec 7 2022, 5:35 AM

Fix code review feedback and ensure that all data is initialized before first
use.

dang marked 6 inline comments as done.Dec 7 2022, 5:37 AM
dang added inline comments.
clang/include/clang/ExtractAPI/API.h
234

It was needed to simplify the implementation of SymbolGraphSerializer::serializeSingleRecord

dang updated this revision to Diff 480890.Dec 7 2022, 6:26 AM
dang marked an inline comment as done.

Fix whitespace issues

dang updated this revision to Diff 480895.Dec 7 2022, 6:36 AM

Fix whitespace again

dang marked 3 inline comments as done.Dec 7 2022, 6:50 AM
dang updated this revision to Diff 481193.Dec 8 2022, 1:29 AM

Add comment explaining that the associated declaration in Declaration Fragments
is not intended to be used outside of libclang

benlangmuir added inline comments.
clang/include/clang-c/Documentation.h
549

@dang please document what this type represents.

@akyrtzi what's the preferred implementation type name for opaque typedef types? I see we have a mix right now:

  • void *
  • CXOpaqueFoo *
  • CXFooImpl *
551

Please add documentation comments to all the new functions. For this one, we should explicitly include

  • That the user needs to call clang_disposeAPISet for the output
  • Whether out_api is modified on failure -- e.g. is it null?
556

Does SGF stand for "Symbol Graph Fragment"? How well-known is this abbreviation? And how commonly-used do you expect these functions to be? My gut reaction would be to use the full name here.

We should document the null return value for unknown symbols. Are there other ways this could fail? Is it ever interesting to know why this failed, or is success/fail always good enough? Same applies to the function below.

clang/tools/libclang/CXExtractAPI.cpp
41

Nit: we use static for global functions and only use anon namespaces for types. Same for the below.

74
if (isNotUsableTU(tu) || !out_api)
  return CXError_InvalidArguments;
93

We don't want CINDEX_LINKAGE in the implementation, only the header needs it. Same for the other functions

98

Nit: maybe SmallString and raw_svector_ostream?

109

This is binding a temporary; you probably want CXCursorKind by value.

130

Probably should check the return value here

akyrtzi added inline comments.Dec 8 2022, 1:43 PM
clang/include/clang-c/Documentation.h
549

There's no consistency in libclang APIs 😅 This typedef LGTM.

dang marked an inline comment as done.Dec 9 2022, 2:44 PM
dang added inline comments.
clang/include/clang/ExtractAPI/DeclarationFragments.h
91

Currently yes, the Decl is available as the APISet is not kept around beyond the lifetime of the AST. Nonetheless I agree that it is used in an unusual manner, to help traverse the AST in libclang as opposed to truly encoding information about the available APIs. I think if we introduce new clients of APISet we would need to add an extension mechanism to the visitation mechanism to enable use cases like this.

dang updated this revision to Diff 481782.Dec 9 2022, 4:30 PM
dang marked 11 inline comments as done.

Address code review feedback regarding libclang specific work.

Thanks, the libclang change LGTM other than my comment about the function names.

clang/include/clang-c/Documentation.h
589

clang_getSingleSymbolSymbolGraphForUSR is a bit hard to read with the double "symbol". What do you think of

  • clang_getSymbolGraphForUSR
  • clang_getSymbolGraphForCursor

Or perhaps clang_getSymbolGraphFragmentFor...? My thinking is that both USR and Cursor are already implying it is a single symbol so we can avoid the "single-symbol symbol-graph" term.

I'm not deeply familiar with this API, so apologies if this suggestion makes no sense 😅

dang updated this revision to Diff 482096.Dec 12 2022, 6:23 AM

Rebase and fix clang format issue

dang updated this revision to Diff 482101.Dec 12 2022, 6:32 AM

Rename new libclang APIs to make the names more readable.

libclang changes LGTM, thanks! I'll let the other reviewers who have domain expertise speak for the rest of the patch.

zixuw accepted this revision.Dec 12 2022, 10:41 AM
This revision is now accepted and ready to land.Dec 12 2022, 10:41 AM
This revision was landed with ongoing or failed builds.Dec 13 2022, 3:19 AM
This revision was automatically updated to reflect the committed changes.