This is an archive of the discontinued LLVM Phabricator instance.

[NFC] Refactor DiagnosticBuilder and PartialDiagnostic
ClosedPublic

Authored by yaxunl on Jul 22 2020, 2:07 PM.

Details

Summary

PartialDiagnostic misses some functions compared to DiagnosticBuilder.

This patch refactors DiagnosticBuilder and PartialDiagnostic, extracts
the common functionality so that the streaming << operators are
shared.

Diff Detail

Event Timeline

yaxunl created this revision.Jul 22 2020, 2:07 PM
tra added a comment.Jul 22 2020, 2:37 PM

FYI, the patch does not compile:

In file included from llvm/repo/clang/lib/Parse/ParseStmtAsm.cpp:13:
In file included from /llvm/repo/clang/include/clang/Parse/Parser.h:24:
llvm/repo/clang/include/clang/Sema/Sema.h:1833:10: error: use of overloaded operator '<<' is ambiguous (with operand types 'const clang::Sema::SemaDiagnosticBuilder' and 'clang::QualType')
      DB << T;
      ~~ ^  ~
llvm/repo/clang/include/clang/AST/Type.h:7144:28: note: candidate function [with DiagBuilderT = clang::Sema::SemaDiagnosticBuilder, $1 = nullptr]
inline const DiagBuilderT &operator<<(const DiagBuilderT &DB, QualType T) {
                           ^
llvm/repo/clang/include/clang/Sema/Sema.h:1508:41: note: candidate function [with T = clang::QualType]
    friend const SemaDiagnosticBuilder &operator<<(
                                        ^
llvm/repo/clang/include/clang/AST/TemplateBase.h:684:26: note: candidate function
const DiagnosticBuilder &operator<<(const DiagnosticBuilder &DB,
yaxunl updated this revision to Diff 279951.Jul 22 2020, 2:58 PM

fix build failure

So, the idea here is to do some sort of duck-typing and allow DiagBuilder to work with both DiagnosticBuilder and PartialDiagnostic.

What bothers me is that unlike Diagnostic PartialDiagnostic seems to be commingling functionality of the builder with that of the diagnostic itself. I'm not sure if that's by design or just happened to be that way.

I think a better approach would be to refactor PartialDiagnostic and separate the builder functionality. That should make it possible to create a common diagnostic builder base class with Partial/Full diagnostic deriving their own builder, if needed.

That said, I'm not that familiar with the diags. Perhaps @rtrieu @aaron.ballman would have better ideas.

aaron.ballman added a subscriber: rsmith.
In D84362#2271585, @tra wrote:

So, the idea here is to do some sort of duck-typing and allow DiagBuilder to work with both DiagnosticBuilder and PartialDiagnostic.

What bothers me is that unlike Diagnostic PartialDiagnostic seems to be commingling functionality of the builder with that of the diagnostic itself. I'm not sure if that's by design or just happened to be that way.

I think a better approach would be to refactor PartialDiagnostic and separate the builder functionality. That should make it possible to create a common diagnostic builder base class with Partial/Full diagnostic deriving their own builder, if needed.

That said, I'm not that familiar with the diags. Perhaps @rtrieu @aaron.ballman would have better ideas.

I'm similarly a bit uncomfortable with adding the SFINAE magic to make this work instead of making a base class that will work for either kind of diagnostic builder. I'm adding @rsmith to see if he has opinions as well.

In D84362#2271585, @tra wrote:

So, the idea here is to do some sort of duck-typing and allow DiagBuilder to work with both DiagnosticBuilder and PartialDiagnostic.

What bothers me is that unlike Diagnostic PartialDiagnostic seems to be commingling functionality of the builder with that of the diagnostic itself. I'm not sure if that's by design or just happened to be that way.

I think a better approach would be to refactor PartialDiagnostic and separate the builder functionality. That should make it possible to create a common diagnostic builder base class with Partial/Full diagnostic deriving their own builder, if needed.

That said, I'm not that familiar with the diags. Perhaps @rtrieu @aaron.ballman would have better ideas.

I'm similarly a bit uncomfortable with adding the SFINAE magic to make this work instead of making a base class that will work for either kind of diagnostic builder. I'm adding @rsmith to see if he has opinions as well.

There are use patterns expecting PartialDiagnosticInst << X << Y to continue to be a PartialDiagnostic&, e.g.

PartialDiagnosticAt PDAt(SourceLoc, PartialDiagnosticInst << X << Y);

However if we derive PartialDiagnostic and DiagnosticBuilder from a base class DiagnosticBuilderBase which implements the << operators, PartialDiagnosticInst << X << Y will become a DiagnosticBuilderBase&, then we can no longer write the above code.

That's one reason I use templates to implement << operators.

Do we want to sacrifice this convenience?

tra added a comment.Sep 15 2020, 11:32 AM

There are use patterns expecting PartialDiagnosticInst << X << Y to continue to be a PartialDiagnostic&, e.g.

PartialDiagnosticAt PDAt(SourceLoc, PartialDiagnosticInst << X << Y);

However if we derive PartialDiagnostic and DiagnosticBuilder from a base class DiagnosticBuilderBase which implements the << operators, PartialDiagnosticInst << X << Y will become a DiagnosticBuilderBase&, then we can no longer write the above code.

That's one reason I use templates to implement << operators.

Do we want to sacrifice this convenience?

I don't think we have to.
AFAICT, virtually all such patterns (and there are only 20-ish of them in total) are used with EmitFormatDiagnostic(S.PDiag()) which could be adapted to accept DiagnosticBuilderBase and Sema::PDiag() changed to return PartialDiagnosticBuilder with no loss of convenience.

In D84362#2274884, @tra wrote:

There are use patterns expecting PartialDiagnosticInst << X << Y to continue to be a PartialDiagnostic&, e.g.

PartialDiagnosticAt PDAt(SourceLoc, PartialDiagnosticInst << X << Y);

However if we derive PartialDiagnostic and DiagnosticBuilder from a base class DiagnosticBuilderBase which implements the << operators, PartialDiagnosticInst << X << Y will become a DiagnosticBuilderBase&, then we can no longer write the above code.

That's one reason I use templates to implement << operators.

Do we want to sacrifice this convenience?

I don't think we have to.
AFAICT, virtually all such patterns (and there are only 20-ish of them in total) are used with EmitFormatDiagnostic(S.PDiag()) which could be adapted to accept DiagnosticBuilderBase and Sema::PDiag() changed to return PartialDiagnosticBuilder with no loss of convenience.

I saw lots of usage of PartialDiagnosticAt, which is defined at

https://github.com/llvm/llvm-project/blob/8c3d0d6a5f5a2a521c4cbae7acbad82a49e2a92f/clang/include/clang/Basic/PartialDiagnostic.h#L417

It requres a PartialDiagnostic as the second argument.

A typical use is like

https://github.com/llvm/llvm-project/blob/69f98311ca42127df92527b6fc3be99841a15f12/clang/lib/Sema/AnalysisBasedWarnings.cpp#L1741

If we use a base class DiagnosticBuilderBase to implement << operators, S.PDiag(diag::warn_unlock_but_no_lock) << Kind << LockName becomes DiagnosticBuilderBase& and we can no longer write the above code.

There are ~100 uses like that.

tra added a comment.Sep 15 2020, 1:46 PM

Perhaps we could inherit PartialDiagnostic from DiagnosticBuilder base class. This would probably be the least invasive approach as it would preserve existing uses while allowing reuse of common DiagnosticBuilder functionality. That said, it would muddle the meaning of PartialDiagnostic even more -- is it the diagnostic itself, or a thing that produces the diagnostic?

To think of it, we actually don't need the DiagnosticBuilderBase per se, but rather a limited subset of operations for implementing freestanding operator<<. So the base class for DiagnosticBuilder and PartialDiagnostic could be called StreamableDiagnosticBase and only provide those bits. This way the arrangement looks a bit less awkward. Both DiagnosticBuilder and PartialDiagnostic would inherit from it and would work with operator<<(StreamableDiagnosticBase). We may need to convert StreamableDiagnosticBase results back to the original type.

yaxunl updated this revision to Diff 292319.Sep 16 2020, 1:10 PM
yaxunl retitled this revision from [NFC] Add missing functions to PartialDiagnostic to [NFC] Refactor DiagnosticBuilder and PartialDiagnostic.
yaxunl edited the summary of this revision. (Show Details)

Revised by Artem's comments. Extracted the common code of DiagnosticBuilder and PartialDiagnostic
and removed redundant code.

tra accepted this revision.Sep 16 2020, 1:25 PM

This is a very nice cleanup. Thank you, Sam.

This revision is now accepted and ready to land.Sep 16 2020, 1:25 PM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptSep 16 2020, 2:36 PM
tra added a comment.Sep 17 2020, 10:13 AM

Apparently this patch triggers compiler crashes on some of our code. I'll try to create a reproducer, but it would be great to revert the patch for now.

*** SIGSEGV (@0x7f68892d8fe8),  received by PID 8210 (TID 8225) on cpu 65; stack trace: ***
PC: @     0x7f686d8a169d  (unknown)  clang::Decl::getAvailability()
    @     0x7f6712fad13b       1376  FailureSignalHandler()
    @     0x7f696f1b69a0      34448  (unknown)
    @     0x7f688c278c9a        352  ShouldDiagnoseAvailabilityOfDecl()
    @     0x7f688c27897d        352  clang::Sema::DiagnoseAvailabilityOfDecl()
    @     0x7f688c8bd3c7       1584  clang::Sema::DiagnoseUseOfDecl()
    @     0x7f688c8cc1e8        672  clang::Sema::BuildDeclarationNameExpr()
    @     0x7f688c8c9e30        192  clang::Sema::BuildDeclarationNameExpr()
    @     0x7f688c8c2698       1280  clang::Sema::ActOnIdExpression()
    @     0x7f688e9b95db       9136  clang::Parser::ParseCastExpression()
    @     0x7f688e9b358b        128  clang::Parser::ParseCastExpression()
    @     0x7f688e9b9ebb       9136  clang::Parser::ParseCastExpression()
    @     0x7f688e9b358b        128  clang::Parser::ParseCastExpression()
    @     0x7f688e9b1998        112  clang::Parser::ParseAssignmentExpression()
    @     0x7f688e9b1832         64  clang::Parser::ParseExpression()
    @     0x7f688e9bd6f0       4064  clang::Parser::ParseParenExpression()
    @     0x7f688e9b6bb4       9136  clang::Parser::ParseCastExpression()
    @     0x7f688e9b358b        128  clang::Parser::ParseCastExpression()
    @     0x7f688e9b1998        112  clang::Parser::ParseAssignmentExpression()
    @     0x7f688e9c1285        288  clang::Parser::ParseExpressionList()
    @     0x7f688e9b4731       3088  clang::Parser::ParsePostfixExpressionSuffix()
    @     0x7f688e9bb92a       9136  clang::Parser::ParseCastExpression()
    @     0x7f688e9b358b        128  clang::Parser::ParseCastExpression()
    @     0x7f688e9b1998        112  clang::Parser::ParseAssignmentExpression()
    @     0x7f688e9b1832         64  clang::Parser::ParseExpression()
    @     0x7f688ea43040        240  clang::Parser::ParseReturnStatement()
    @     0x7f688ea3cef0       1264  clang::Parser::ParseStatementOrDeclarationAfterAttributes()
    @     0x7f688ea3c5cf        160  clang::Parser::ParseStatementOrDeclaration()
    @     0x7f688ea4490a       1264  clang::Parser::ParseCompoundStatementBody()
    @     0x7f688ea45df9        272  clang::Parser::ParseFunctionStatementBody()
    @     0x7f688ea6a7b4        960  clang::Parser::ParseFunctionDefinition()
    @     0x7f688e9600bd       2880  clang::Parser::ParseDeclGroup()
    @     0x7f688ea69634        368  clang::Parser::ParseDeclOrFunctionDefInternal()
    @     0x7f688ea68c20        672  clang::Parser::ParseDeclarationOrFunctionDefinition()
    @     0x7f688ea684ee        768  clang::Parser::ParseExternalDeclaration()
    @     0x7f688ea667a9        368  clang::Parser::ParseTopLevelDecl()
    @     0x7f688e93d2a2        240  clang::ParseAST()
    @     0x7f688fe079bd         80  clang::ASTFrontendAction::ExecuteAction()
    @     0x7f688fe073b8        144  clang::FrontendAction::Execute()
    @     0x7f688fd1bdd6        480  clang::ASTUnit::Parse()
tra added a comment.Sep 17 2020, 11:05 AM
In D84362#2279688, @tra wrote:

Apparently this patch triggers compiler crashes on some of our code. I'll try to create a reproducer, but it would be great to revert the patch for now.

It's likely the same issue as the one reported in D84364, but we probably triggered it independently.

In D84362#2279845, @tra wrote:
In D84362#2279688, @tra wrote:

Apparently this patch triggers compiler crashes on some of our code. I'll try to create a reproducer, but it would be great to revert the patch for now.

It's likely the same issue as the one reported in D84364, but we probably triggered it independently.

I have a fix for the issue reported in D84364. Would you like to try? Thanks.

tra reopened this revision.Sep 18 2020, 2:27 PM

I have a fix for the issue reported in D84364. Would you like to try? Thanks.

I can try it on the internal test that crashed with the patch. I've reopened this review and will pick up the diff once you update it.

This revision is now accepted and ready to land.Sep 18 2020, 2:27 PM
In D84362#2282965, @tra wrote:

I have a fix for the issue reported in D84364. Would you like to try? Thanks.

I can try it on the internal test that crashed with the patch. I've reopened this review and will pick up the diff once you update it.

The fix is for the change in D84364. It has no effect on the change in this review. Are you sure the issue you saw is due to change in this review instead of change in D84364?

tra added a comment.Sep 18 2020, 2:52 PM

The fix is for the change in D84364. It has no effect on the change in this review. Are you sure the issue you saw is due to change in this review instead of change in D84364?

Pretty sure. We've bisected the failure specifically to commit ee5519d323571c4a9a7d92cb817023c9b95334cd, before D84364 landed.

In D84362#2283045, @tra wrote:

The fix is for the change in D84364. It has no effect on the change in this review. Are you sure the issue you saw is due to change in this review instead of change in D84364?

Pretty sure. We've bisected the failure specifically to commit ee5519d323571c4a9a7d92cb817023c9b95334cd, before D84364 landed.

I think this is probably a different issue. The issue reported in D84364 was introduced by change in D84364.

tra added a comment.Sep 18 2020, 2:59 PM

I think this is probably a different issue. The issue reported in D84364 was introduced by change in D84364.

It's possible. Unfortunately it's only triggered by our internal tool and it's hard to create a public reproducer for it. I'll debug and try to fix it on Monday.

tra accepted this revision.Sep 21 2020, 12:23 PM
In D84362#2283078, @tra wrote:

It's possible. Unfortunately it's only triggered by our internal tool and it's hard to create a public reproducer for it. I'll debug and try to fix it on Monday.

It appears the patch slightly increased stack use and caused a failure in code that was running with a limited stack space available.
AFAICT there are no issues with this patch per se and it's good to re-land, again.

rnk added inline comments.
clang/include/clang/Basic/Diagnostic.h
1200–1204

This comment documents that this object is intended to be small. Adding a vtable grows it by one pointer, and the virtual methods practically ensure that it will always be address-taken and will not be promoted from memory to registers. I think you need to reconsider the design of this patch. I will revert it for now. Please get approval from @rsmith or @rjmccall before adding a vtable to this class gain.

tra requested changes to this revision.Sep 24 2020, 11:32 AM
tra added a reviewer: rjmccall.
This revision now requires changes to proceed.Sep 24 2020, 11:32 AM
rjmccall added inline comments.Oct 1 2020, 11:44 PM
clang/include/clang/Basic/Diagnostic.h
1200–1204

I don't know that copying a DiagnosticBuilder is actually an important thing to support, but yes, this doesn't really seem like the right approach. The bigger problem is that it still leaves an enormous amount of code duplication between the active diagnostic being built in the DiagnosticsEngine and the diagnostic being built in the PartialDiagnostic — it's basically the same storage concept, duplicated in two different places! We should unify those two things, so that there's only one class implementing the storage of an active diagnostic. There can then be a common object of that class in DiagnosticsEngine, which DiagnosticBuilder then stores a reference to, while PartialDiagnostic has its own copy. And then it should be fairly straightforward to make the operators work with either.

Among other things, that might make it significantly more efficient to emit a diagnostic from a PartialDiagnostic — we might be able to just emit it from the internal storage instead of having to copy it into the engine, so that the engine's use of a common diagnostic object is just a storage optimization.

yaxunl updated this revision to Diff 298125.Oct 14 2020, 5:17 AM

revised by John's comments.

Extracted common part of DiagnosticEngine and PartialDiagnostics as DiagnosticStorage.

Make member functions of the base class of DiagnosticBuilder and ParticalDiagnostics non-virtual.

tra added inline comments.Oct 14 2020, 11:08 AM
clang/include/clang/Basic/PartialDiagnostic.h
51

Is there a particular reason to move field initialization into the body here and in other constructors?

yaxunl marked an inline comment as done.Oct 14 2020, 11:25 AM
yaxunl added inline comments.
clang/include/clang/Basic/PartialDiagnostic.h
51

Allocator is now a member of the base class, therefore needs to be initialized directly. I could add ctor to the base class but I did not see too much benefit of doing that.

Thanks, this looks a lot better.

clang/include/clang/Basic/Diagnostic.h
1065

I think I would prefer StreamingDiagnostic as the class name here.

clang/include/clang/Basic/PartialDiagnostic.h
66

Why are these template operators necessary? The LHS of the << should convert to a base reference type just fine.

yaxunl updated this revision to Diff 298368.Oct 15 2020, 7:08 AM
yaxunl marked an inline comment as done.

Rename StreamableDiagnosticBase to StreamingDiagnostic.

yaxunl marked 4 inline comments as done.Oct 15 2020, 7:13 AM
yaxunl added inline comments.
clang/include/clang/Basic/Diagnostic.h
1065

renamed

clang/include/clang/Basic/PartialDiagnostic.h
66

There are lots of usage patterns expecting the derived class after streaming, e.g.

void foo(PartialDiagnostic& PD);
foo(PD << X << Y);
rjmccall added inline comments.Oct 15 2020, 10:47 AM
clang/include/clang/Basic/PartialDiagnostic.h
66

I see. You could also just do this in the base operators by just making them templates, e.g.

template <class Diagnostic>
std::enable_if_t<std::is_base_of_v<StreamingDiagnostic, Diagnostic>, const Diagnostic &>
operator<<(const Diagnostic &D, ...) {
}

That way you'd never have these forwarding problems — but it would be more boilerplate for each operator, so maybe it's a wash.

yaxunl updated this revision to Diff 299044.Oct 19 2020, 7:26 AM
yaxunl marked 2 inline comments as done.

Add constructors to StreamingDiagnostic.

rjmccall accepted this revision.Oct 19 2020, 10:59 AM

Okay. This LGTM if you feel that the operator forwarders is the better approach.

This revision was not accepted when it landed; it landed in state Needs Review.Oct 19 2020, 2:48 PM
This revision was automatically updated to reflect the committed changes.

Hi All,

I've just noticed this patch and realised that it takes DiagnosticBuilder in the direction tangential to what we proposed in [1]. I've just restarted our efforts with regard to refactoring these APIs and feel that it would be good to coordinate any future efforts in this area. I also want to make sure that we have a solution that works for everyone.

Basically, this patch makes DiagnosticBuilder more integrated with libclangBasic by assuming that every diagnostic contains SourceLocations and FixItHints (see the definition of DiagnosticStorage). However, that's not the case for driver diagnostics (i.e. diagnostics issued in libclangDriver). In [1] we proposed to create thin abstraction layers above all diagnostic classes that do not depend on SourceLocation (or anything else from libclangBasic). This patch makes DiagnosticBuilder _more_ dependant on SourceLocation and in this respect refactors DiagnosticBuilder in a direction opposite to what we proposed in [1].

Since this patch changes the DiagnosticBuilder API quite significantly, I am not entirely sure how to proceed with refactoring it (we need to make it independent of SourceLocation and any related classes). Originally, DiagnosticBuilder was a standalone class, but now we will have to refactor the following classes on top of DiagnosticBuilder:

  • DiagnosticStorage
  • StreamingDiagnostic
  • PartialDiagnostic

One idea that immediately comes to mind is to split DiagnosticStorage as follows:

struct DiagnosticStorage {
    // Storage that applies to all diagnostics
    // Original content, but without DiagRanges and FixItHints
};

struct LangDiagnosticStorage {
    // Storage that applies only to diagnostics that contain ranges and FixitHints - in practice these are language diagnostics
    SmallVector<CharSourceRange, 8> DiagRanges;
    SmallVector<FixItHint, 6> FixItHints;

    LangDiagnosticStorage() = default;
}

How does this look? Sadly it will clutter this API, but I can't see any way around it. One other extreme alternative is to:

  • revert this patch
  • create a thin abstraction layer for DiagnosticBuilder independent of SourceLocation (as per our RFC [1])
  • re-apply this patch (with some necessary modifications)

I appreciate that it is important to keep this API as clean and as lean as possible. However,

  • it's required by libclangDriver
  • we want to make libclangDriver independent of Clang (so that it can be used by Flang without the dependency on Clang).

Further refactoring will be required to achieve this. We sent 2 RFCs [1] [2] earlier this year to cfe-dev to discuss this, so I hope that this won't come as a surprise.

Do you have any suggestions?

CC @tra @rjmccall @rsmith @yaxunl

Thank you,
Andrzej

[1] http://lists.llvm.org/pipermail/cfe-dev/2020-July/066393.html
[2] http://lists.llvm.org/pipermail/llvm-dev/2020-June/141994.html

It sounds like what you want is a diagnostic library that's almost completely abstracted over the kinds of entities that can be stored in a diagnostic, including the definition of a source location. I don't think that's incompatible with this patch; there's no need to suggest reversion.

Source locations, supplemental source ranges, and fix-its are all still meaningful concepts in the driver, it's just that a source location is, say, an offset into a particular argument rather than a location in the file. So the challenge for you is to come up with a way to abstract this that doesn't significantly make things worse for Clang.

Hi @rjmccall , thank you for your quick reply!

It sounds like what you want is a diagnostic library that's almost completely abstracted over the kinds of entities that can be stored in a diagnostic, including the definition of a source location.

No :) We suggested (and that's based on the feedback on cfe-dev) a more basic approach. We proposed an abstraction layer that does not require any location information, because that's irrelevant in the driver (more on this below). We did, intially, suggest re-using and re-factoring SourceLocation (and, more specifically, SourceManager). That direction was explicitely discouraged on cfe-dev [3].

Source locations, supplemental source ranges, and fix-its are all still meaningful concepts in the driver

When we studied the source code of libclangDriver that was not the case. The diagnostics in the driver [1] are only used to a very limited extent (compared to the frontend) and I've only seen them issued via [2]:

inline DiagnosticBuilder DiagnosticsEngine::Report(unsigned DiagID) {
  return Report(SourceLocation(), DiagID);
}

i.e. SourceLocation is not really used at all. Are we missing something here? Is there any place in the libclangDriver (or any tool that's implemented in terms of libclangdDriver) that actually requires SourceLocation?

We would like to keep the scope of refactoring libclangDriver (and its dependencies) to the required minimum. The solution that you suggested would mean refactoring an API that's not required by libclangDriver (please correct me if I'm missing something here). I'm concerned that any attempts to modify SourceLocation will be challenged on cfe-dev. It was already suggested that we shouldn't need it for what we want to achieve. And indeed, in our simplified prototype downstream we were able to avoid using it without affecting the driver diagnostics.

Another difficulty with your suggestion is testing. If we were to create some abstraction layer above SourceLocation - how would we define or test it? Again, AFAIK diagnostics in libclangDriver never contain any information with respect to source location. So how do we decide what's actually required? This would be easy if there was a reference example :)

All in all, I think that this patch will require us to broaden the refactoring in a way that otherwise wouldn't be required. And according to the feedback so far this will be detrimental to Clang. If we are still in disagreement then perhaps we should move this discussion to cfe-dev? I want to make sure that we are all aligned on the overall direction and that we don't diverge too much from what was already discussed on cfe-dev.

Thanks in advance for your feedback,
-Andrzej

[1]
https://github.com/llvm/llvm-project/blob/3c3071d5e7dd259022d4f5e879fdf9824b9e490c/clang/lib/Driver/OptionUtils.cpp#L26
https://github.com/llvm/llvm-project/blob/3c3071d5e7dd259022d4f5e879fdf9824b9e490c/clang/lib/Driver/Driver.cpp#L270
https://github.com/llvm/llvm-project/blob/3c3071d5e7dd259022d4f5e879fdf9824b9e490c/clang/lib/Driver/Driver.cpp#L275
https://github.com/llvm/llvm-project/blob/3c3071d5e7dd259022d4f5e879fdf9824b9e490c/clang/lib/Driver/Driver.cpp#L1145

[2] https://github.com/llvm/llvm-project/blob/3c3071d5e7dd259022d4f5e879fdf9824b9e490c/clang/include/clang/Basic/Diagnostic.h#L1482-L1484

[3] http://lists.llvm.org/pipermail/cfe-dev/2020-June/065766.html

I may not have been clear. I'm not saying SourceLocation is a meaningful concept in the driver. I'm saying that if you generalize the concept of "source location" to "location in the input", there is a clear analogue in the driver (namely, a position in the argument list), and the only reason this isn't passed down to the driver and used in diagnostics is that we don't have the ability to express that today to the diagnostic engine. So instead of trying to extract out a part of the diagnostics engine that will work without any concept of source locations, you should be trying to parameterize the diagnostics engine so that it can work with an arbitrary external concept of source locations, and then the driver can use a different kind of source location than the main compiler and everything is fine.

I may not have been clear. I'm not saying SourceLocation is a meaningful concept in the driver.

Likewise, I should be more careful with respect to how I refer to SourceLocation (a class in Clang) and "source location" (location within a source file).

I see your point. Basically, you are suggesting that we improve the diagnostics for the driver first. And for that we'd need something like CommandLineLocation (a new class: location within the command invocation) that would be related to SourceLocation via a common base class, e.g. AbstractLocation (also a new class). That would indeed be a very nice thing to have.

However, this is not what we've been proposing. Our ultimate goal is a new Flang driver:

  • implemented in terms of libclangDriver
  • independent of Clang.

We shouldn't need to implement this extra functionality to achieve our goal. Personally I like the idea, but we need to stay focused on activities that are strictly needed for a new Flang driver. Not making things worse for Clang is an equally important goal. Improving things in Clang is something we are keen on, but are unlikely to have the bandwidth for.

It'll hard to create these thin layers above the diagnostic classes that do not depend on SourceLocation and SourceManager (and other classes that depend/require these). This is regardless of whether driver diagnostics are capable of tracking location (via e.g. CommandLineLocation) or not. The integration of diagnostics classes and "source location" related classes (specifically SourceLocation and SourceManager) is quite deep and this patch deepens that. IMO this patch makes things even harder for us. I'm starting to think that all this could be vastly simplified if libclangDriver had it's own diagnostics classes- a very simplified version of what's currently available in libclangBasic (again, because the driver doesn't need much). The improvements that you propose could be added at later time. Would this be acceptable? I think that the thin abstraction layer that we're discussing here would naturally emerge with time.