Struct/class data members did not have the comments associated with them. This adds that information to the MemberTypeInfo class and emits it in the YAML. This does not update the frontends yet.
Details
Diff Detail
Event Timeline
Would it be possible to extend the existing unit tests to cover this:
https://github.com/llvm/llvm-project/blob/main/clang-tools-extra/unittests/clang-doc/BitcodeTest.cpp
https://github.com/llvm/llvm-project/blob/main/clang-tools-extra/unittests/clang-doc/SerializeTest.cpp
https://github.com/llvm/llvm-project/blob/main/clang-tools-extra/unittests/clang-doc/YAMLGeneratorTest.cpp
When uploading patches through web interface, it's useful to include full context, see https://llvm.org/docs/Phabricator.html#requesting-a-review-via-the-web-interface
I agree w/ @phosek on unit testing.
Additionally, while we don't have many of them in clang-doc, an end-to-end test would also be welcome. In this case, it will also probably help w/ review, since we can see how clang-doc output is changing as part of the review process. To be clear: I'm not going to block acceptance on the end-to-end tests, since that may be asking too much. It would just be helpful.
I've left more direct feedback inline.
clang-tools-extra/clang-doc/BitcodeReader.cpp | ||
---|---|---|
354 | So, I see that this uses the same pattern as in other getCommentInfo(...) API's, but I'm a bit confused by this. Do we always grow the vector whenever getCommentInfo(...) is called? I would expect a get... function to be const and just return data. This grows the Description vector on every call, which seems like an odd choice. The pattern is also pervasive in BitcodeReader.cpp. @phosek is this intentional? If Description exceeds capacity and reallocs on emplace_back, then any reference it had returned would be invalidated, right? Or am I missing something here re: the various *TypeInfos and how clang-doc uses them? | |
clang-tools-extra/clang-doc/Serialize.cpp | ||
439 | I think we need to at least assert that D != nullptr. While there is only one caller now, its likely there may be more in the future. Plus nullptr is deref is never fun to debug. If getting a nullptr is expected to be a common input, then an early return is also fine. If you prefer, a few places in the codebase do both to maintain correct behavior when asserts are disabled. Often they have a form along these lines: if(!D){ assert(false && "Some error message"); return; } Any of these would be fine, and I don't have a preference between them. | |
440 | If the goal is to get the FullComment, then doesn't getCommentForDecl(...) handle that? This also seems like an area where we'd want to use caching if we could. | |
clang-tools-extra/clang-doc/YAMLGenerator.cpp | ||
195 | Can we get a comment describing this? |
clang-tools-extra/clang-doc/BitcodeReader.cpp | ||
---|---|---|
354 | You are correct that this always grows. These functions are only called once and that's when readSubBlock gets a block ID. The comment is filled into the returned array. My question is why the comments on the info structs are arrays. I have only ever seen one top-level comment of type "FullComment" which can have many children. | |
clang-tools-extra/clang-doc/Serialize.cpp | ||
439 | I'm fine adding an assert to unblock the review. But clang-doc is full of pointers and none of them are asserted on. Is this one special in some way? To me, it seems weird to assert on this one random pointer. Is there some policy about this? I follow the opposite theory that asserts on null pointers all over the place clutter the code and random null parameters are one of the easiest types of crashes to debug. | |
440 | I have no idea how this works, I copied this snipped from MapASTVisitor::getComment in Mapper.cpp | |
clang-tools-extra/clang-doc/YAMLGenerator.cpp | ||
195 | I don't think this needs a comment, a substantial part of this file is just adding these mappings between YAML keys and field names. The one above it has a comment only because of the non-obvious default value. If you feel strongly, please let me know what the comment should say. | |
clang-tools-extra/unittests/clang-doc/BitcodeTest.cpp | ||
291 | Whoops, I'll delete this. | |
clang-tools-extra/unittests/clang-doc/SerializeTest.cpp | ||
169 | I spent quite some time trying to get this to work and I'm stuck. If I copy the contents of the raw string above into a file and run clang-doc on it, I get a comment in the YAML output. But in this test I never get a comment and I don't know how this parsing environment is different than the "real" one. I'm very confused by the result of ExtractInfosFromCode function in the first place. What are the 1st and 6th item which are never checked? Why does it emit two duplicate records for the E struct, one (Infos[2]) with a data member and no function and one (Infos[3]) with no data member and a function in it? Any suggestions here would be appreciated. |
clang-tools-extra/clang-doc/Serialize.cpp | ||
---|---|---|
439 | There's nothing special regarding this particular pointer, but such checks are common in other parts of LLVM. We use asserts liberally throughout the LLVM codebase (https://llvm.org/docs/CodingStandards.html#assert-liberally) and asserting that a pointer is valid is common. The other thing to keep in mind is that debug builds of LLVM are > 20GB, and enabling asserts is a more desirable choice for most of our developers. That probably isn't as bad for clang-doc, but I'd rather err on the side of caution. While I agree that right now it may be a bit strange to have a single assert, hopefully overtime the surrounding code will start to use such checks in a way that is more consistent with the rest of the codebase. But like I said, an early return is also fine IMO. | |
clang-tools-extra/clang-doc/YAMLGenerator.cpp | ||
195 | Now that I see this in context, I agree that it doesn't need a comment. | |
clang-tools-extra/unittests/clang-doc/SerializeTest.cpp | ||
201 | ? |
clang-tools-extra/clang-doc/Serialize.cpp | ||
---|---|---|
439 | Done |
I updated the comment test that's not currently working with a TODO.
There are some open questions in the code review about some bigger questions about this tool. Since this has been virtually abandoned for several years, I don't think anybody knows these answers and we're not going to figure out everything for this patch.
I think this patch makes things incrementally better and I'd like to land something soon so I can work on the next missing feature. I think over time we can build up knowledge about this tool and make it better on a wider scale. Can you please talk another look?
LGTM, modulo the small fixes I've left inline
clang-tools-extra/clang-doc/Serialize.cpp | ||
---|---|---|
439 | Can you add a String message to the assert? This is a fairly standard example of the usage w/in LLVM: assert(LCS && RCS && "Expect non-null FunctionSamples"); | |
440 | That's fine, but lets add a TODO here to consider the other API, so we track this. | |
clang-tools-extra/unittests/clang-doc/SerializeTest.cpp | ||
169 | So I also wanted to understand why. MakeOneLinecCmmentIInfo() is not used elsewhere in clang-doc, and I don't think it was ever tested. Looking at the body it doesn't use it's input parameter at all. There may also be an issue w/ how it constructs the CommentInfo. Leaving this as a TODO is fine. I don't think this is something we're going to solve in this patch, but if you can also add a TODO to MakeOneLine CommentInfo, it would be appreciated. |
Thanks. Can we get the small changes to the assert + the TODO in Serialize.cpp? I'll be happy to land this for you after that.
clang-tools-extra/unittests/clang-doc/SerializeTest.cpp | ||
---|---|---|
169 | The problem isn't with MakeOneLineCommentInfo (I fixed the parameter usage) but rather that the records generated in the test has no comments at all. The current test code (with the line commented out) expects that the comment is completely missing, which is the only way the test will pass now. Since nobody else uses it, I commented it out and referenced the calling code. |
So, I see that this uses the same pattern as in other getCommentInfo(...) API's, but I'm a bit confused by this.
Do we always grow the vector whenever getCommentInfo(...) is called? I would expect a get... function to be const and just return data. This grows the Description vector on every call, which seems like an odd choice. The pattern is also pervasive in BitcodeReader.cpp.
@phosek is this intentional? If Description exceeds capacity and reallocs on emplace_back, then any reference it had returned would be invalidated, right? Or am I missing something here re: the various *TypeInfos and how clang-doc uses them?