clang-format: better handle namespace macros
Needs ReviewPublic

Authored by Typz on Sep 13 2017, 9:27 AM.

Details

Summary

Other macros are used to declare namespaces, and should thus be handled
similarly. This is the case for crpcut's TESTSUITE macro, or for
unittest-cpp's SUITE macro:

TESTSUITE(Foo) {
TEST(MyFirstTest) {
  assert(0);
}
} // TESTSUITE(Foo)

This patch deals with this cases by introducing a new option to specify
lists of namespace macros. Internally, it re-uses the system already in
place for foreach and statement macros, to ensure there is no impact on
performance.

Diff Detail

Typz created this revision.Sep 13 2017, 9:27 AM
Typz added a reviewer: klimek.Sep 13 2017, 9:30 AM

I think instead of introducing more and more special cases of macros we might want to handle, we should instead allow specifying macro productions globally.

Typz added a comment.Oct 24 2017, 8:57 AM

I think instead of introducing more and more special cases of macros we might want to handle, we should instead allow specifying macro productions globally.

what do you mean?
Do you mean to group all the macro configuration options into "Macros" field, still containing one field for each kind of macro? Or do you have something else in mind?

In D37813#930065, @Typz wrote:

ping?

Argh, very sorry for the delay in response.

In D37813#905257, @Typz wrote:

I think instead of introducing more and more special cases of macros we might want to handle, we should instead allow specifying macro productions globally.

what do you mean?
Do you mean to group all the macro configuration options into "Macros" field, still containing one field for each kind of macro? Or do you have something else in mind?

I mean that we can configure macros in the format style, like "define A(X) class X {". I'm not 100% sure whether we would just try to use the Preprocessor for this, or whether we'd want to only allow a small subset of actual macros, but the general idea would be the same: The UnwrappedLineParser would parse the macro at the expansion location A(X) into an unwrapped line, and then parse the expansion into a child line, with the tokens tha tare not in the argument of the call being marked as fixed (parent child might also be better inverted).

That will allow folks to actually specify the semantics they care about instead of us growing ever increasing special-case logic for different types of macros.

Typz added a comment.Dec 1 2017, 6:00 AM

Definitely that would be much more expressive. But this approach is also incomplete: in case of namespace (and possibly others?), we also need to perform the reverse operation, e.g. to "generate" a macro call for rewriting the closing comment.

On top of this, I think that is really not trivial, I would prefer to move forward with these "simpler" patch, and change the whole macro configurations part in later commits...

klimek added a comment.Dec 1 2017, 6:35 AM
In D37813#941987, @Typz wrote:

Definitely that would be much more expressive. But this approach is also incomplete: in case of namespace (and possibly others?), we also need to perform the reverse operation, e.g. to "generate" a macro call for rewriting the closing comment.

On top of this, I think that is really not trivial, I would prefer to move forward with these "simpler" patch, and change the whole macro configurations part in later commits...

Looking at the tests, I see what you mean, but I'd actually vote to not add non-namespace closing comments, or at least make closing comments configurable in a different way - it seems very much unrelated?

Typz added a comment.Dec 1 2017, 7:20 AM

As far as "parsing" and formatting inside the block is concerned, this is indeed unrelated (and would totally work if all macros where specified with some preprocessor definitions).

But identifying the 'opening' token and generating the matching 'closing' comment are totally related; it would seem very strange to have an option to specify that TESTSUITE() macros are parsed as namespace, then another option to indicate that namespace declared by this macros are actually closed with another macro call...

klimek added a comment.Dec 1 2017, 7:55 AM
In D37813#942137, @Typz wrote:

As far as "parsing" and formatting inside the block is concerned, this is indeed unrelated (and would totally work if all macros where specified with some preprocessor definitions).

But identifying the 'opening' token and generating the matching 'closing' comment are totally related; it would seem very strange to have an option to specify that TESTSUITE() macros are parsed as namespace, then another option to indicate that namespace declared by this macros are actually closed with another macro call...

Putting the namespace into the macro gives the macro an abstraction - if somebody were to change, for example, the namespace portion with a class definition, would you still want the closing comments or not?

Typz added a comment.Dec 5 2017, 8:20 AM

I don't think this is really relevant for this tool: if someone changes the implementation of the macro, then *they* must indeed if it should not be formatted like a namespace (and keep the clang-format configuration unchanged), or if it should now be formatted like a class (and thus make changes to clang-format configuration). Here we are not defining what the macro does, but how clang-format should indent it : in most case I don't think the indentation format should actually depend on the way it is implemented...

klimek added a comment.Dec 6 2017, 2:31 AM
In D37813#945125, @Typz wrote:

I don't think this is really relevant for this tool: if someone changes the implementation of the macro, then *they* must indeed if it should not be formatted like a namespace (and keep the clang-format configuration unchanged), or if it should now be formatted like a class (and thus make changes to clang-format configuration). Here we are not defining what the macro does, but how clang-format should indent it : in most case I don't think the indentation format should actually depend on the way it is implemented...

Ok, that's probably where our different opinions come from - I would want a macro to be formatted to reflect how it's implemented, because otherwise I'm going to be surprised when I look at the implementation, and I consider surprises to be something to avoid in programming in general, where possible.

That said, I'm also then a bit confused by your tests - they seem to currently format a mixture of namespace and class formatting, and combine the indentation of a class-scope with namespace end comments.

Typz added a comment.Dec 6 2017, 7:36 AM

Ok, that's probably where our different opinions come from - I would want a macro to be formatted to reflect how it's implemented, because otherwise I'm going to be surprised when I look at the implementation, and I consider surprises to be something to avoid in programming in general, where possible.

As "author of the coding rules" (and thus writer of clang-format config), I would agree with you.
But I think the tool itself should not decide to enforce this : maybe there are cases where the macro is "conceptually" a namespace, but should be indented as a class ; in this case I think the tool should quite dumb, and simply do what was asked: if asked to format the macro as a namespace, it should do it, whatever the implementation is.

That said, on top of this "user choice" the tool could be smart, and automatically handle all macros based on their actual implementation: if the macro is actually implemented with a namespace (and there is no "explicit" definition of how it should be handled), then indeed the tool could format the whole block as a namespace, and so on... But unfortunately that would imply parsing numerous #include to get the definitions, and is definitely not the goal of this patch: this patch is really about letting the "user" decide how to handle the macro.

That said, I'm also then a bit confused by your tests - they seem to currently format a mixture of namespace and class formatting, and combine the indentation of a class-scope with namespace end comments.

This should not be the case: the goal here is that the macro is handled exactly like a namespace.
The NamespaceEndCommentFixer tests indeed contain some indentation, but it is not significant and is there only because I copied existing test cases and followed the same "style": but these tests do not actually perform any indentation, they simply handle the "end comment".
Do you have any specific test which confuses you?

I've talked with Daniel and we both believe this patch is not the right way to go forward for clang-format. It is adding configuration mechanisms that we need to maintain going forward, without addressing a general problem; specifically for how to handle macros, I believe there is a clear better way to do it that is also fundamentally more useful than adding extra knobs for every problem we encounter. Finally, these changes do not address a fundamental bug, and workarounds exist, so there is no large benefit to do this now in a non-principled way rather than wait for a more principled fix. Overall, I do understand that this is a trade-off, but I hope you understand that we sometimes have to make decisions.

Typz added a comment.Dec 18 2017, 3:04 AM

OK.

So you mean a solution like the one discussed earlier would be the way to go?

I mean that we can configure macros in the format style, like "define A(X) class X {". I'm not 100% sure whether we would just try to use the Preprocessor for this, or whether we'd want to only allow a small subset of actual macros, but the general idea would be the same: The UnwrappedLineParser would parse the macro at the expansion location A(X) into an unwrapped line, and then parse the expansion into a child line, with the tokens tha tare not in the argument of the call being marked as fixed (parent child might also be better inverted).

(As a side-note, I want to stress out that we would actually need a 'reversible' description to support the namespace case, to allow generating the end-of-namespace comment)

Is there any work on that side, any timeline when this may be supported ?

In D37813#958116, @Typz wrote:

OK.

So you mean a solution like the one discussed earlier would be the way to go?

I mean that we can configure macros in the format style, like "define A(X) class X {". I'm not 100% sure whether we would just try to use the Preprocessor for this, or whether we'd want to only allow a small subset of actual macros, but the general idea would be the same: The UnwrappedLineParser would parse the macro at the expansion location A(X) into an unwrapped line, and then parse the expansion into a child line, with the tokens tha tare not in the argument of the call being marked as fixed (parent child might also be better inverted).

(As a side-note, I want to stress out that we would actually need a 'reversible' description to support the namespace case, to allow generating the end-of-namespace comment)

I believe this will be needed for various reasons. The plan is to make this similar to how we format #ifdef trees, for which clang-format already has full support.

Is there any work on that side, any timeline when this may be supported ?

Some initial design work has been done, and Krasimir said that he's interested. No timeline though :(

Typz updated this revision to Diff 132838.Feb 5 2018, 8:48 AM

rebase

Typz added a comment.Feb 28 2018, 7:06 AM

Some initial design work has been done, and Krasimir said that he's interested. No timeline though :(

any update or progress maybe?

Typz added a comment.May 16 2018, 2:53 AM

Some initial design work has been done, and Krasimir said that he's interested. No timeline though :(

@klimek , @krasimir : any progress on this new preprocessor-like way to configure macros ?