This is an archive of the discontinued LLVM Phabricator instance.

[clang][extract-api] Fix small issues with SymbolGraphSerializer
ClosedPublic

Authored by dang on Apr 4 2022, 9:21 AM.

Details

Summary

This includes:

  • replacing "relationhips" with "relationships"
  • emitting the "pathComponents" property on symbols
  • emitting the "accessLevel" property on symbols

Diff Detail

Event Timeline

dang created this revision.Apr 4 2022, 9:21 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 4 2022, 9:21 AM
dang requested review of this revision.Apr 4 2022, 9:21 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 4 2022, 9:21 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
zixuw added inline comments.Apr 4 2022, 2:02 PM
clang/lib/ExtractAPI/Serialization/SymbolGraphSerializer.cpp
17

Not needed

509

What's the cost/would it worth it to wrap the emplace_backs and pop_backs of PathComponentContext in meaningful APIs like enterPathComponentContext and exitPathComponentContext?
That way the code is more self-explanatory and easier to read. It took me some time and mental resources to figure out why the pop_back is placed here.

519–530

Same comment as above, would be nice to have clear APIs for entering and exiting path component contexts.

655

oops 😬

clang/test/ExtractAPI/macro_undefined.c
53

s/m/,/

After a quick scan comparing the current output of these symbol graphs with the primary library used for reading them, the last thing i can spot that's "off" is that the "function signature" is currently being serialized under a parameters field instead of the required functionSignature.

clang/lib/ExtractAPI/Serialization/SymbolGraphSerializer.cpp
509

What's the use of having the emplace_back call inside serializeAPIRecord but to pop it outside? It seems like it's easier to mess up for new kinds of records.

zixuw added a comment.Apr 4 2022, 3:35 PM

After a quick scan comparing the current output of these symbol graphs with the primary library used for reading them, the last thing i can spot that's "off" is that the "function signature" is currently being serialized under a parameters field instead of the required functionSignature.

Hmm, I was looking at the format specification at https://github.com/apple/swift-docc-symbolkit/blob/0a45209833f4a151212c1aa38e13cfc03b9462e4/openapi.yaml#L307, and I searched the term functionSignature in the spec but found no property with that name (except for the FunctionSignature schema that the parameters property is referring to). But anyway this should be a easy fix.

zixuw added inline comments.Apr 4 2022, 3:42 PM
clang/lib/ExtractAPI/Serialization/SymbolGraphSerializer.cpp
509

The reason to emplace_back the path component inside serializeAPIRecord is that the pathComponent field of the Record is serialized in there so you want to include the name of the symbol itself in the path component list.
The pop_back is delayed til all other additional serialization for a specific kind of record, for example serializeEnumRecord handles all the enum cases after processing the enum record itself using serializeAPIRecord. So in order to correctly include the enum name in the path components of all the enum cases, the enum name has to stay in PathComponentContext until all members are serialized.

This is exactly the reason why I wanted a clearer API to make it easier to see.

zixuw added inline comments.Apr 4 2022, 3:54 PM
clang/lib/ExtractAPI/Serialization/SymbolGraphSerializer.cpp
509

Hmm now that I thought about this, it seems that it would be easier to understand and avoid bugs if we lift PathComponentContext.emplace_back/enterPathComponentContext out of serializeAPIRecord, because we have access to the record name before that anyway.

So we establish a pre-condition of serializeAPIRecord that the correct path components would be set up in PathComponentContext before the call so we could still serialize the field inside that method. And in specific serialize*Record methods we push the record name, and pop out at the end.

This way the push and pop operations would appear in pairs in a single block, saving the confusion and mental work of jumping across functions to see how PathComponentContext is evolving.

After a quick scan comparing the current output of these symbol graphs with the primary library used for reading them, the last thing i can spot that's "off" is that the "function signature" is currently being serialized under a parameters field instead of the required functionSignature.

Hmm, I was looking at the format specification at https://github.com/apple/swift-docc-symbolkit/blob/0a45209833f4a151212c1aa38e13cfc03b9462e4/openapi.yaml#L307, and I searched the term functionSignature in the spec but found no property with that name (except for the FunctionSignature schema that the parameters property is referring to). But anyway this should be a easy fix.

It seems like the specification and implementation have diverged. The parser in swift-docc-symbolkit is looking for a functionSignature field by virtue of how it names its "coding key". By comparison, here is the functionSignature emitter in the Swift symbol-graph generator.

dang marked 2 inline comments as done.Apr 5 2022, 3:18 AM

After a quick scan comparing the current output of these symbol graphs with the primary library used for reading them, the last thing i can spot that's "off" is that the "function signature" is currently being serialized under a parameters field instead of the required functionSignature.

Hmm, I was looking at the format specification at https://github.com/apple/swift-docc-symbolkit/blob/0a45209833f4a151212c1aa38e13cfc03b9462e4/openapi.yaml#L307, and I searched the term functionSignature in the spec but found no property with that name (except for the FunctionSignature schema that the parameters property is referring to). But anyway this should be a easy fix.

It seems like the specification and implementation have diverged. The parser in swift-docc-symbolkit is looking for a functionSignature field by virtue of how it names its "coding key". By comparison, here is the functionSignature emitter in the Swift symbol-graph generator.

Yeah we should fix that I will do it as part of this patch since it seems like a small change.

clang/lib/ExtractAPI/Serialization/SymbolGraphSerializer.cpp
509

If you think we should have a specialized API I am happy to do this. I figured it was self-explanatory by the name of PathComponentContext but it clearly isn't so needs addressing. I put the emplace_back call in serializeAPIRecord since all the specific serialization methods call it. I thought it would make it impossible to forget to add them. @zixuw is correct in the reason why the pop is outside we don't want to pop before we have serialized the sub records by agree it is ugly and potentially error prone. I can see three ways forward for improving the ergonomics of this since this seems to be problematic:

  • Provide serializeAPIRecord a continuation to serialize the sub records or additional fields, that way we can move the pop inside serializeAPIRecord I am not a fan personally because we get into JS style closure hell if we start doing this...
  • Use a visitor pattern where the visitor would be responsible for managing PathComponentContext and do the appropriate push/pops in the visit methods. We would need to add a APIRecordVisitor type, and appropriate visit methods for each APIRecord. This would make sense because the records should really know how to visit themselves.
  • Just add a specialized API although it seems it would really easy to forget to remove path components.

Let me know what you think is the way forward.

655

Yeah that's on us for not catching this earlier.

dang marked an inline comment as done.Apr 5 2022, 3:19 AM
dang added inline comments.Apr 5 2022, 3:53 AM
clang/lib/ExtractAPI/Serialization/SymbolGraphSerializer.cpp
509

Unless we go with the last option, I think this should be a follow up patch since it would a structural change.

dang updated this revision to Diff 420451.Apr 5 2022, 4:42 AM

Address code review feedback:

  • Fix the function signature serialization
  • Typo fix of "m to ",
QuietMisdreavus added inline comments.Apr 5 2022, 7:37 AM
clang/lib/ExtractAPI/Serialization/SymbolGraphSerializer.cpp
509

I'm most a fan of the APIRecordVisitor option, but you're right in that that would be a rather involved change. Now that you've said that the arrangement is for encoding sub-records, it makes sense to me. As an alternative, i think moving the emplace_back outside of serializeAPIRecord is actually my other preferred arrangement; without some other way of calculating the path components on-the-fly (e.g. walking up DeclContexts) simply having serializeAPIRecord look at what's there and have the wrapper calls deal with setting up its state sounds the most reasonable to me. That way, as @zixuw said, the emplace_back and pop_back calls are at least lexically close to each other.

dang updated this revision to Diff 420598.Apr 5 2022, 12:10 PM

Add descriptive method names for manipulating the path component stack.

dang updated this revision to Diff 420770.Apr 6 2022, 3:38 AM

Manage PatchComponents stack manipulation using RAII to avoid forgetting to pop
the stack when returning early from serializeAPIRecord.

dang marked an inline comment as done.Apr 6 2022, 3:40 AM
dang added inline comments.
clang/lib/ExtractAPI/Serialization/SymbolGraphSerializer.cpp
509

Let's move to a visitor based implementation later as a follow up patch since it is quite a big structural change. For now, I have implemented managing path components using RAII to avoid issues around forgetting to pop when returning early for skipped records.

QuietMisdreavus accepted this revision.Apr 6 2022, 8:12 AM

I like the idea of using the RAII context/guard to manage the path components stack. I have one non-blocking comment about the rest of the patch now.

clang/include/clang/ExtractAPI/Serialization/SymbolGraphSerializer.h
125

Now that the path components are being manipulated outside of serializeAPIRecord, can this function become const again?

This revision is now accepted and ready to land.Apr 6 2022, 8:12 AM
dang updated this revision to Diff 420883.Apr 6 2022, 8:26 AM
dang marked an inline comment as done.

Make SymbolGraphSerializer::serializeAPIRecord const again.

dang marked an inline comment as done.Apr 6 2022, 8:27 AM
zixuw accepted this revision.Apr 6 2022, 10:08 AM

Unused include of declaration fragments in SymbolGraphSerializer. Otherwise LGTM!

clang/lib/ExtractAPI/Serialization/SymbolGraphSerializer.cpp
17

ping.

dang updated this revision to Diff 420938.Apr 6 2022, 10:23 AM

Remove unnecessary include.

dang marked 2 inline comments as done.Apr 6 2022, 10:24 AM
This revision was landed with ongoing or failed builds.Apr 6 2022, 10:46 AM
This revision was automatically updated to reflect the committed changes.