This is an archive of the discontinued LLVM Phabricator instance.

[clang-doc] Read docstrings for record members
ClosedPublic

Authored by brettw on Aug 5 2022, 2:42 PM.

Details

Summary

Struct/class data members did not have the comments associated with them. This adds that information to the MemberTypeInfo class and emits it in the YAML. This does not update the frontends yet.

Diff Detail

Event Timeline

brettw created this revision.Aug 5 2022, 2:42 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 5 2022, 2:42 PM
brettw requested review of this revision.Aug 5 2022, 2:42 PM
paulkirth requested changes to this revision.Aug 8 2022, 2:51 PM

I agree w/ @phosek on unit testing.

Additionally, while we don't have many of them in clang-doc, an end-to-end test would also be welcome. In this case, it will also probably help w/ review, since we can see how clang-doc output is changing as part of the review process. To be clear: I'm not going to block acceptance on the end-to-end tests, since that may be asking too much. It would just be helpful.

I've left more direct feedback inline.

clang-tools-extra/clang-doc/BitcodeReader.cpp
354

So, I see that this uses the same pattern as in other getCommentInfo(...) API's, but I'm a bit confused by this.

Do we always grow the vector whenever getCommentInfo(...) is called? I would expect a get... function to be const and just return data. This grows the Description vector on every call, which seems like an odd choice. The pattern is also pervasive in BitcodeReader.cpp.

@phosek is this intentional? If Description exceeds capacity and reallocs on emplace_back, then any reference it had returned would be invalidated, right? Or am I missing something here re: the various *TypeInfos and how clang-doc uses them?

clang-tools-extra/clang-doc/Serialize.cpp
439

I think we need to at least assert that D != nullptr. While there is only one caller now, its likely there may be more in the future. Plus nullptr is deref is never fun to debug.

If getting a nullptr is expected to be a common input, then an early return is also fine.

If you prefer, a few places in the codebase do both to maintain correct behavior when asserts are disabled. Often they have a form along these lines:

if(!D){
  assert(false && "Some error message");
  return;
}

Any of these would be fine, and I don't have a preference between them.

440

If the goal is to get the FullComment, then doesn't getCommentForDecl(...) handle that? This also seems like an area where we'd want to use caching if we could.

clang-tools-extra/clang-doc/YAMLGenerator.cpp
195

Can we get a comment describing this?

This revision now requires changes to proceed.Aug 8 2022, 2:51 PM
brettw updated this revision to Diff 450967.Aug 8 2022, 3:09 PM

Added some tests (with one open question).

brettw updated this revision to Diff 450976.Aug 8 2022, 3:28 PM
brettw added a comment.Aug 8 2022, 3:36 PM
clang-tools-extra/clang-doc/BitcodeReader.cpp
354

You are correct that this always grows. These functions are only called once and that's when readSubBlock gets a block ID. The comment is filled into the returned array.

My question is why the comments on the info structs are arrays. I have only ever seen one top-level comment of type "FullComment" which can have many children.

clang-tools-extra/clang-doc/Serialize.cpp
439

I'm fine adding an assert to unblock the review.

But clang-doc is full of pointers and none of them are asserted on. Is this one special in some way? To me, it seems weird to assert on this one random pointer. Is there some policy about this?

I follow the opposite theory that asserts on null pointers all over the place clutter the code and random null parameters are one of the easiest types of crashes to debug.

440

I have no idea how this works, I copied this snipped from MapASTVisitor::getComment in Mapper.cpp

clang-tools-extra/clang-doc/YAMLGenerator.cpp
195

I don't think this needs a comment, a substantial part of this file is just adding these mappings between YAML keys and field names. The one above it has a comment only because of the non-obvious default value. If you feel strongly, please let me know what the comment should say.

clang-tools-extra/unittests/clang-doc/BitcodeTest.cpp
291

Whoops, I'll delete this.

clang-tools-extra/unittests/clang-doc/SerializeTest.cpp
169

I spent quite some time trying to get this to work and I'm stuck. If I copy the contents of the raw string above into a file and run clang-doc on it, I get a comment in the YAML output. But in this test I never get a comment and I don't know how this parsing environment is different than the "real" one.

I'm very confused by the result of ExtractInfosFromCode function in the first place. What are the 1st and 6th item which are never checked? Why does it emit two duplicate records for the E struct, one (Infos[2]) with a data member and no function and one (Infos[3]) with no data member and a function in it?

Any suggestions here would be appreciated.

paulkirth added inline comments.Aug 8 2022, 4:19 PM
clang-tools-extra/clang-doc/Serialize.cpp
439

There's nothing special regarding this particular pointer, but such checks are common in other parts of LLVM.

We use asserts liberally throughout the LLVM codebase (https://llvm.org/docs/CodingStandards.html#assert-liberally) and asserting that a pointer is valid is common.

The other thing to keep in mind is that debug builds of LLVM are > 20GB, and enabling asserts is a more desirable choice for most of our developers. That probably isn't as bad for clang-doc, but I'd rather err on the side of caution.

While I agree that right now it may be a bit strange to have a single assert, hopefully overtime the surrounding code will start to use such checks in a way that is more consistent with the rest of the codebase.

But like I said, an early return is also fine IMO.

clang-tools-extra/clang-doc/YAMLGenerator.cpp
195

Now that I see this in context, I agree that it doesn't need a comment.

clang-tools-extra/unittests/clang-doc/SerializeTest.cpp
201

?

brettw updated this revision to Diff 451001.Aug 8 2022, 4:51 PM
brettw marked an inline comment as done.
brettw marked 2 inline comments as done.
brettw added inline comments.
clang-tools-extra/clang-doc/Serialize.cpp
439

Done

brettw updated this revision to Diff 451299.Aug 9 2022, 4:16 PM
brettw marked an inline comment as done.

I updated the comment test that's not currently working with a TODO.

There are some open questions in the code review about some bigger questions about this tool. Since this has been virtually abandoned for several years, I don't think anybody knows these answers and we're not going to figure out everything for this patch.

I think this patch makes things incrementally better and I'd like to land something soon so I can work on the next missing feature. I think over time we can build up knowledge about this tool and make it better on a wider scale. Can you please talk another look?

LGTM, modulo the small fixes I've left inline

clang-tools-extra/clang-doc/Serialize.cpp
439

Can you add a String message to the assert?

This is a fairly standard example of the usage w/in LLVM:

assert(LCS && RCS && "Expect non-null FunctionSamples");

Taken from https://github.com/llvm/llvm-project/blob/30bbb73bb448910f791088bfc3154e752d42241a/llvm/lib/Transforms/IPO/SampleProfile.cpp#L400

440

That's fine, but lets add a TODO here to consider the other API, so we track this.

clang-tools-extra/unittests/clang-doc/SerializeTest.cpp
169

So I also wanted to understand why. MakeOneLinecCmmentIInfo() is not used elsewhere in clang-doc, and I don't think it was ever tested.

Looking at the body it doesn't use it's input parameter at all.

There may also be an issue w/ how it constructs the CommentInfo.

Leaving this as a TODO is fine. I don't think this is something we're going to solve in this patch, but if you can also add a TODO to MakeOneLine CommentInfo, it would be appreciated.

brettw updated this revision to Diff 451632.Aug 10 2022, 2:31 PM

New patch up.

Thanks. Can we get the small changes to the assert + the TODO in Serialize.cpp? I'll be happy to land this for you after that.

brettw added inline comments.Aug 11 2022, 9:23 AM
clang-tools-extra/unittests/clang-doc/SerializeTest.cpp
169

The problem isn't with MakeOneLineCommentInfo (I fixed the parameter usage) but rather that the records generated in the test has no comments at all. The current test code (with the line commented out) expects that the comment is completely missing, which is the only way the test will pass now.

Since nobody else uses it, I commented it out and referenced the calling code.

brettw updated this revision to Diff 451882.Aug 11 2022, 9:28 AM
brettw marked 2 inline comments as done.

New patch up with requested changes.

paulkirth accepted this revision.Aug 11 2022, 10:07 AM
This revision is now accepted and ready to land.Aug 11 2022, 10:07 AM
This revision was landed with ongoing or failed builds.Aug 11 2022, 10:17 AM
This revision was automatically updated to reflect the committed changes.