PartialDiagnostic misses some functions compared to DiagnosticBuilder.
This patch refactors DiagnosticBuilder and PartialDiagnostic, extracts
the common functionality so that the streaming << operators are
shared.
Differential D84362
[NFC] Refactor DiagnosticBuilder and PartialDiagnostic yaxunl on Jul 22 2020, 2:07 PM. Authored by
Details PartialDiagnostic misses some functions compared to DiagnosticBuilder. This patch refactors DiagnosticBuilder and PartialDiagnostic, extracts
Diff Detail Event TimelineComment Actions FYI, the patch does not compile: In file included from llvm/repo/clang/lib/Parse/ParseStmtAsm.cpp:13: In file included from /llvm/repo/clang/include/clang/Parse/Parser.h:24: llvm/repo/clang/include/clang/Sema/Sema.h:1833:10: error: use of overloaded operator '<<' is ambiguous (with operand types 'const clang::Sema::SemaDiagnosticBuilder' and 'clang::QualType') DB << T; ~~ ^ ~ llvm/repo/clang/include/clang/AST/Type.h:7144:28: note: candidate function [with DiagBuilderT = clang::Sema::SemaDiagnosticBuilder, $1 = nullptr] inline const DiagBuilderT &operator<<(const DiagBuilderT &DB, QualType T) { ^ llvm/repo/clang/include/clang/Sema/Sema.h:1508:41: note: candidate function [with T = clang::QualType] friend const SemaDiagnosticBuilder &operator<<( ^ llvm/repo/clang/include/clang/AST/TemplateBase.h:684:26: note: candidate function const DiagnosticBuilder &operator<<(const DiagnosticBuilder &DB, Comment Actions So, the idea here is to do some sort of duck-typing and allow DiagBuilder to work with both DiagnosticBuilder and PartialDiagnostic. What bothers me is that unlike Diagnostic PartialDiagnostic seems to be commingling functionality of the builder with that of the diagnostic itself. I'm not sure if that's by design or just happened to be that way. I think a better approach would be to refactor PartialDiagnostic and separate the builder functionality. That should make it possible to create a common diagnostic builder base class with Partial/Full diagnostic deriving their own builder, if needed. That said, I'm not that familiar with the diags. Perhaps @rtrieu @aaron.ballman would have better ideas. Comment Actions I'm similarly a bit uncomfortable with adding the SFINAE magic to make this work instead of making a base class that will work for either kind of diagnostic builder. I'm adding @rsmith to see if he has opinions as well. Comment Actions There are use patterns expecting PartialDiagnosticInst << X << Y to continue to be a PartialDiagnostic&, e.g. PartialDiagnosticAt PDAt(SourceLoc, PartialDiagnosticInst << X << Y); However if we derive PartialDiagnostic and DiagnosticBuilder from a base class DiagnosticBuilderBase which implements the << operators, PartialDiagnosticInst << X << Y will become a DiagnosticBuilderBase&, then we can no longer write the above code. That's one reason I use templates to implement << operators. Do we want to sacrifice this convenience? Comment Actions I don't think we have to. Comment Actions I saw lots of usage of PartialDiagnosticAt, which is defined at It requres a PartialDiagnostic as the second argument. A typical use is like If we use a base class DiagnosticBuilderBase to implement << operators, S.PDiag(diag::warn_unlock_but_no_lock) << Kind << LockName becomes DiagnosticBuilderBase& and we can no longer write the above code. There are ~100 uses like that. Comment Actions Perhaps we could inherit PartialDiagnostic from DiagnosticBuilder base class. This would probably be the least invasive approach as it would preserve existing uses while allowing reuse of common DiagnosticBuilder functionality. That said, it would muddle the meaning of PartialDiagnostic even more -- is it the diagnostic itself, or a thing that produces the diagnostic? To think of it, we actually don't need the DiagnosticBuilderBase per se, but rather a limited subset of operations for implementing freestanding operator<<. So the base class for DiagnosticBuilder and PartialDiagnostic could be called StreamableDiagnosticBase and only provide those bits. This way the arrangement looks a bit less awkward. Both DiagnosticBuilder and PartialDiagnostic would inherit from it and would work with operator<<(StreamableDiagnosticBase). We may need to convert StreamableDiagnosticBase results back to the original type. Comment Actions Revised by Artem's comments. Extracted the common code of DiagnosticBuilder and PartialDiagnostic Comment Actions Apparently this patch triggers compiler crashes on some of our code. I'll try to create a reproducer, but it would be great to revert the patch for now. *** SIGSEGV (@0x7f68892d8fe8), received by PID 8210 (TID 8225) on cpu 65; stack trace: *** PC: @ 0x7f686d8a169d (unknown) clang::Decl::getAvailability() @ 0x7f6712fad13b 1376 FailureSignalHandler() @ 0x7f696f1b69a0 34448 (unknown) @ 0x7f688c278c9a 352 ShouldDiagnoseAvailabilityOfDecl() @ 0x7f688c27897d 352 clang::Sema::DiagnoseAvailabilityOfDecl() @ 0x7f688c8bd3c7 1584 clang::Sema::DiagnoseUseOfDecl() @ 0x7f688c8cc1e8 672 clang::Sema::BuildDeclarationNameExpr() @ 0x7f688c8c9e30 192 clang::Sema::BuildDeclarationNameExpr() @ 0x7f688c8c2698 1280 clang::Sema::ActOnIdExpression() @ 0x7f688e9b95db 9136 clang::Parser::ParseCastExpression() @ 0x7f688e9b358b 128 clang::Parser::ParseCastExpression() @ 0x7f688e9b9ebb 9136 clang::Parser::ParseCastExpression() @ 0x7f688e9b358b 128 clang::Parser::ParseCastExpression() @ 0x7f688e9b1998 112 clang::Parser::ParseAssignmentExpression() @ 0x7f688e9b1832 64 clang::Parser::ParseExpression() @ 0x7f688e9bd6f0 4064 clang::Parser::ParseParenExpression() @ 0x7f688e9b6bb4 9136 clang::Parser::ParseCastExpression() @ 0x7f688e9b358b 128 clang::Parser::ParseCastExpression() @ 0x7f688e9b1998 112 clang::Parser::ParseAssignmentExpression() @ 0x7f688e9c1285 288 clang::Parser::ParseExpressionList() @ 0x7f688e9b4731 3088 clang::Parser::ParsePostfixExpressionSuffix() @ 0x7f688e9bb92a 9136 clang::Parser::ParseCastExpression() @ 0x7f688e9b358b 128 clang::Parser::ParseCastExpression() @ 0x7f688e9b1998 112 clang::Parser::ParseAssignmentExpression() @ 0x7f688e9b1832 64 clang::Parser::ParseExpression() @ 0x7f688ea43040 240 clang::Parser::ParseReturnStatement() @ 0x7f688ea3cef0 1264 clang::Parser::ParseStatementOrDeclarationAfterAttributes() @ 0x7f688ea3c5cf 160 clang::Parser::ParseStatementOrDeclaration() @ 0x7f688ea4490a 1264 clang::Parser::ParseCompoundStatementBody() @ 0x7f688ea45df9 272 clang::Parser::ParseFunctionStatementBody() @ 0x7f688ea6a7b4 960 clang::Parser::ParseFunctionDefinition() @ 0x7f688e9600bd 2880 clang::Parser::ParseDeclGroup() @ 0x7f688ea69634 368 clang::Parser::ParseDeclOrFunctionDefInternal() @ 0x7f688ea68c20 672 clang::Parser::ParseDeclarationOrFunctionDefinition() @ 0x7f688ea684ee 768 clang::Parser::ParseExternalDeclaration() @ 0x7f688ea667a9 368 clang::Parser::ParseTopLevelDecl() @ 0x7f688e93d2a2 240 clang::ParseAST() @ 0x7f688fe079bd 80 clang::ASTFrontendAction::ExecuteAction() @ 0x7f688fe073b8 144 clang::FrontendAction::Execute() @ 0x7f688fd1bdd6 480 clang::ASTUnit::Parse() Comment Actions It's likely the same issue as the one reported in D84364, but we probably triggered it independently. Comment Actions I can try it on the internal test that crashed with the patch. I've reopened this review and will pick up the diff once you update it. Comment Actions Pretty sure. We've bisected the failure specifically to commit ee5519d323571c4a9a7d92cb817023c9b95334cd, before D84364 landed. Comment Actions It's possible. Unfortunately it's only triggered by our internal tool and it's hard to create a public reproducer for it. I'll debug and try to fix it on Monday. Comment Actions It appears the patch slightly increased stack use and caused a failure in code that was running with a limited stack space available.
Comment Actions revised by John's comments. Extracted common part of DiagnosticEngine and PartialDiagnostics as DiagnosticStorage. Make member functions of the base class of DiagnosticBuilder and ParticalDiagnostics non-virtual.
Comment Actions Thanks, this looks a lot better.
Comment Actions Hi All, I've just noticed this patch and realised that it takes DiagnosticBuilder in the direction tangential to what we proposed in [1]. I've just restarted our efforts with regard to refactoring these APIs and feel that it would be good to coordinate any future efforts in this area. I also want to make sure that we have a solution that works for everyone. Basically, this patch makes DiagnosticBuilder more integrated with libclangBasic by assuming that every diagnostic contains SourceLocations and FixItHints (see the definition of DiagnosticStorage). However, that's not the case for driver diagnostics (i.e. diagnostics issued in libclangDriver). In [1] we proposed to create thin abstraction layers above all diagnostic classes that do not depend on SourceLocation (or anything else from libclangBasic). This patch makes DiagnosticBuilder _more_ dependant on SourceLocation and in this respect refactors DiagnosticBuilder in a direction opposite to what we proposed in [1]. Since this patch changes the DiagnosticBuilder API quite significantly, I am not entirely sure how to proceed with refactoring it (we need to make it independent of SourceLocation and any related classes). Originally, DiagnosticBuilder was a standalone class, but now we will have to refactor the following classes on top of DiagnosticBuilder:
One idea that immediately comes to mind is to split DiagnosticStorage as follows: struct DiagnosticStorage { // Storage that applies to all diagnostics // Original content, but without DiagRanges and FixItHints }; struct LangDiagnosticStorage { // Storage that applies only to diagnostics that contain ranges and FixitHints - in practice these are language diagnostics SmallVector<CharSourceRange, 8> DiagRanges; SmallVector<FixItHint, 6> FixItHints; LangDiagnosticStorage() = default; } How does this look? Sadly it will clutter this API, but I can't see any way around it. One other extreme alternative is to:
I appreciate that it is important to keep this API as clean and as lean as possible. However,
Further refactoring will be required to achieve this. We sent 2 RFCs [1] [2] earlier this year to cfe-dev to discuss this, so I hope that this won't come as a surprise. Do you have any suggestions? CC @tra @rjmccall @rsmith @yaxunl Thank you, [1] http://lists.llvm.org/pipermail/cfe-dev/2020-July/066393.html Comment Actions It sounds like what you want is a diagnostic library that's almost completely abstracted over the kinds of entities that can be stored in a diagnostic, including the definition of a source location. I don't think that's incompatible with this patch; there's no need to suggest reversion. Source locations, supplemental source ranges, and fix-its are all still meaningful concepts in the driver, it's just that a source location is, say, an offset into a particular argument rather than a location in the file. So the challenge for you is to come up with a way to abstract this that doesn't significantly make things worse for Clang. Comment Actions Hi @rjmccall , thank you for your quick reply!
No :) We suggested (and that's based on the feedback on cfe-dev) a more basic approach. We proposed an abstraction layer that does not require any location information, because that's irrelevant in the driver (more on this below). We did, intially, suggest re-using and re-factoring SourceLocation (and, more specifically, SourceManager). That direction was explicitely discouraged on cfe-dev [3].
When we studied the source code of libclangDriver that was not the case. The diagnostics in the driver [1] are only used to a very limited extent (compared to the frontend) and I've only seen them issued via [2]: inline DiagnosticBuilder DiagnosticsEngine::Report(unsigned DiagID) { return Report(SourceLocation(), DiagID); } i.e. SourceLocation is not really used at all. Are we missing something here? Is there any place in the libclangDriver (or any tool that's implemented in terms of libclangdDriver) that actually requires SourceLocation? We would like to keep the scope of refactoring libclangDriver (and its dependencies) to the required minimum. The solution that you suggested would mean refactoring an API that's not required by libclangDriver (please correct me if I'm missing something here). I'm concerned that any attempts to modify SourceLocation will be challenged on cfe-dev. It was already suggested that we shouldn't need it for what we want to achieve. And indeed, in our simplified prototype downstream we were able to avoid using it without affecting the driver diagnostics. Another difficulty with your suggestion is testing. If we were to create some abstraction layer above SourceLocation - how would we define or test it? Again, AFAIK diagnostics in libclangDriver never contain any information with respect to source location. So how do we decide what's actually required? This would be easy if there was a reference example :) All in all, I think that this patch will require us to broaden the refactoring in a way that otherwise wouldn't be required. And according to the feedback so far this will be detrimental to Clang. If we are still in disagreement then perhaps we should move this discussion to cfe-dev? I want to make sure that we are all aligned on the overall direction and that we don't diverge too much from what was already discussed on cfe-dev. Thanks in advance for your feedback, [1] [3] http://lists.llvm.org/pipermail/cfe-dev/2020-June/065766.html Comment Actions I may not have been clear. I'm not saying SourceLocation is a meaningful concept in the driver. I'm saying that if you generalize the concept of "source location" to "location in the input", there is a clear analogue in the driver (namely, a position in the argument list), and the only reason this isn't passed down to the driver and used in diagnostics is that we don't have the ability to express that today to the diagnostic engine. So instead of trying to extract out a part of the diagnostics engine that will work without any concept of source locations, you should be trying to parameterize the diagnostics engine so that it can work with an arbitrary external concept of source locations, and then the driver can use a different kind of source location than the main compiler and everything is fine. Comment Actions Likewise, I should be more careful with respect to how I refer to SourceLocation (a class in Clang) and "source location" (location within a source file). I see your point. Basically, you are suggesting that we improve the diagnostics for the driver first. And for that we'd need something like CommandLineLocation (a new class: location within the command invocation) that would be related to SourceLocation via a common base class, e.g. AbstractLocation (also a new class). That would indeed be a very nice thing to have. However, this is not what we've been proposing. Our ultimate goal is a new Flang driver:
We shouldn't need to implement this extra functionality to achieve our goal. Personally I like the idea, but we need to stay focused on activities that are strictly needed for a new Flang driver. Not making things worse for Clang is an equally important goal. Improving things in Clang is something we are keen on, but are unlikely to have the bandwidth for. It'll hard to create these thin layers above the diagnostic classes that do not depend on SourceLocation and SourceManager (and other classes that depend/require these). This is regardless of whether driver diagnostics are capable of tracking location (via e.g. CommandLineLocation) or not. The integration of diagnostics classes and "source location" related classes (specifically SourceLocation and SourceManager) is quite deep and this patch deepens that. IMO this patch makes things even harder for us. I'm starting to think that all this could be vastly simplified if libclangDriver had it's own diagnostics classes- a very simplified version of what's currently available in libclangBasic (again, because the driver doesn't need much). The improvements that you propose could be added at later time. Would this be acceptable? I think that the thin abstraction layer that we're discussing here would naturally emerge with time. |
I think I would prefer StreamingDiagnostic as the class name here.