This is an archive of the discontinued LLVM Phabricator instance.

[clang-format] Distinguish K&R C function definition and attribute
ClosedPublic

Authored by owenpan on Aug 12 2021, 6:30 AM.

Diff Detail

Event Timeline

owenpan requested review of this revision.Aug 12 2021, 6:30 AM
owenpan created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptAug 12 2021, 6:31 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
HazardyKnusperkeks added inline comments.
clang/lib/Format/UnwrappedLineParser.cpp
17

Why is this not needed anymore?

owenpan added inline comments.Aug 12 2021, 1:42 PM
clang/lib/Format/UnwrappedLineParser.cpp
17

It was added in D107950 but not needed even there.

owenpan updated this revision to Diff 366176.Aug 12 2021, 8:54 PM

Added keywords register, struct, and union back.

Did the original change make it into the 13 branch? I'm seeing some unexpected behavior.

bool foo(int a, Bar) override;
bool foo(int a, Bar) override; // comment

becomes

bool foo(int a, Bar)
override;
bool foo(int a, Bar)
override; // comment
MyDeveloperDay added a comment.EditedAug 13 2021, 12:43 AM

Logged as https://bugs.llvm.org/show_bug.cgi?id=51470 @krasimir fix resolves this.

I've marked this bug as a 13.0 release blocker ((@tstellar), we can take @krasimir fix or we revert all the recent K&R support out of the 13 branch, @owenpan which would you be happier doing?

FWIW with this additional fix is also still ok and doesn't revert the behaviour, but we'd have to land this in order for @tstellar to move it to the branch (and I think he said that needed to be done by Friday 13th)

bool foo(int a, Bar) override;
bool foo(int a, Bar) override; // comment
bool foo(int a, int) override;

bool foo(int a, Bar) final;
bool foo(int a, Bar) final; // comment
MyDeveloperDay requested changes to this revision.Aug 13 2021, 1:16 AM

The following doesn't feel consistent

bool foo(a, Bar) final;     // comment
bool foo(a, Bar) final;
Bar foo(a, Bar)
final; // comment
Bar foo(a, Bar)
final;
This revision now requires changes to proceed.Aug 13 2021, 1:16 AM
owenpan added a comment.EditedAug 13 2021, 1:17 AM

Did the original change make it into the 13 branch?

What's the 13 branch?

I'm seeing some unexpected behavior.

bool foo(int a, Bar) override;
bool foo(int a, Bar) override; // comment

becomes

bool foo(int a, Bar)
override;
bool foo(int a, Bar)
override; // comment

How can I reproduce it? The above C++ declarations should be inside a class definition, no?

Anyway, here is what I got:

$ cat .clang-format
BasedOnStyle: LLVM
AlwaysBreakAfterReturnType: TopLevelDefinitions
$ cat foo.cpp
typedef float Bar;
class A {
  virtual bool foo(int, Bar);
  virtual bool bar(int, Bar);
};
class B : A {
  bool foo(int a, Bar) override;
  bool bar(int a, Bar) override; // comment
};
$ clang -Wall -std=c++11 -c foo.cpp
$ clang-format foo.cpp > bar.cpp
$ diff foo.cpp bar.cpp
$ 
owenpan added a comment.EditedAug 13 2021, 1:39 AM

Logged as https://bugs.llvm.org/show_bug.cgi?id=51470 @krasimir fix resolves this.

I've marked this bug as a 13.0 release blocker ((@tstellar), we can take @krasimir fix or we revert all the recent K&R support out of the 13 branch, @owenpan which would you be happier doing?

@krasimir's fix did not make into the 13 branch. We should take both that fix and this one if possible.

Did the original change make it into the 13 branch?

What's the 13 branch?

I'm seeing some unexpected behavior.

bool foo(int a, Bar) override;
bool foo(int a, Bar) override; // comment

becomes

bool foo(int a, Bar)
override;
bool foo(int a, Bar)
override; // comment

How can I reproduce it? The above C++ declarations should be inside a class definition, no?

Anyway, here is what I got:

$ cat .clang-format
BasedOnStyle: LLVM
AlwaysBreakAfterReturnType: TopLevelDefinitions
$ cat foo.cpp
typedef float Bar;
class A {
  virtual bool foo(int, Bar);
  virtual bool bar(int, Bar);
};
class B : A {
  bool foo(int a, Bar) override;
  bool bar(int a, Bar) override; // comment
};
$ clang -Wall -std=c++11 -c foo.cpp
$ clang-format foo.cpp > bar.cpp
$ diff foo.cpp bar.cpp
$ 

The above C++ declarations should be inside a class definition, no?

Not really, we don't have to have any context of where we are (in class or struct). if you had for example

class A
{
#include "myfunctions.h"
}

you still would want the code in myfunctions.h to be formatted just the same (less indentation)

So the following code:

bool foo(int a, Bar)
override;
bool foo(int a, Bar)
override; // comment

Failed on the 13 branch, but is fixed with @krasimir fix, but that isn't backported (we need to do that as a minimum I think)

With this revision that is also fixed, the following occurs if I change the return type

Bar foo(int a, Bar)
override;
Bar foo(int a, Bar)
override; // comment

And that doesn't feel correct to me. Well not as by default at least. From my perspective there seems to be a propensity to think code is K&R, but that should be the exception not the rule.

If that is to be the case then we need some sort of setting that means they need to expect it, (like the Language: XXXX) that was proposed before

I just ran this on my 4.5 million line and ended up with way more areas that needed clang-formatting than I think should need to (in c++ code not c code)

The above C++ declarations should be inside a class definition, no?

Not really, we don't have to have any context of where we are (in class or struct). if you had for example

class A
{
#include "myfunctions.h"
}

you still would want the code in myfunctions.h to be formatted just the same (less indentation)

Got it!

So the following code:

bool foo(int a, Bar)
override;
bool foo(int a, Bar)
override; // comment

Failed on the 13 branch, but is fixed with @krasimir fix, but that isn't backported (we need to do that as a minimum I think)

With this revision that is also fixed, the following occurs if I change the return type

Bar foo(int a, Bar)
override;
Bar foo(int a, Bar)
override; // comment

And that doesn't feel correct to me. Well not as by default at least. From my perspective there seems to be a propensity to think code is K&R, but that should be the exception not the rule.

I agree.

owenpan updated this revision to Diff 366234.Aug 13 2021, 4:04 AM

Add checking for final and override.

MyDeveloperDay added a comment.EditedAug 13 2021, 4:08 AM

This is better as it will catch more cases but there can still be cases we could fall foul.

Bar foo(a, Bar)
LLVM_OVERRIDE;
Bar foo(a, Bar)
LLVM_FINAL;
Bar foo(a, Bar)
LLVM_NOEXCEPT;

Is there any way this K&R stuff can be put behind an option, I just feel like we'll be chasing shadows.

I'm not making this cases up, these are examples of real cases that we've used in our cross platform environment because no all C++ compilers supported override/final.

This is better as it will catch more cases but there can still be cases we could fall foul.

Bar foo(a, Bar)
LLVM_OVERRIDE;
Bar foo(a, Bar)
LLVM_FINAL;
Bar foo(a, Bar)
LLVM_NOEXCEPT;

Is there any way this K&R stuff can be put behind an option, I just feel like we'll be chasing shadows.

We should do it only as the last resort. Let me first give it another shot.

I'm not making this cases up, these are examples of real cases that we've used in our cross platform environment because no all C++ compilers supported override/final.

Thanks for coming up with all these cases!

owenpan updated this revision to Diff 366237.Aug 13 2021, 4:42 AM

Check for tok::semi (in addition to tok::l_paren) after the first token of a potential K&R C parameter declaration.

MyDeveloperDay accepted this revision.Aug 13 2021, 4:52 AM
bool foo(int a, Bar) override;
bool foo(int a, Bar) override; // comment
bool foo(int a, Bar) final;
bool foo(int a, Bar) final; // comment
bool foo(a, Bar) final;     // comment
bool foo(a, Bar) final;
Bar foo(a, Bar) final; // comment
Bar foo(a, Bar) final;
Bar foo(int a, Bar) final; // comment
Bar foo(int a, Bar) final;
Bar foo(a, Bar) noexcept;
Bar foo(a, Bar) noexcept(false);
Bar foo(int a, Bar) LLVM_OVERRIDE;
Bar foo(a, Bar) LLVM_NOEXCEPT;
Bar foo(a, Bar) LLVM_NOEXCEPT;
Bar foo(a, Bar) const;
bool foo(a, Bar) override;
bool foo(a, Bar) const override;
bool foo(a, Bar) noexcept override;
bool foo(a, Bar) /*ABC*/ override;
auto foo(a, Bar) OVERRIDE;
[[nodiscard]] bool foo(a, Bar) final;
[[nodiscard]] Bar foo(a, Bar) final;

This is looking way more robust to me, LGTM..

This revision is now accepted and ready to land.Aug 13 2021, 4:52 AM
This revision was landed with ongoing or failed builds.Aug 13 2021, 5:28 AM
This revision was automatically updated to reflect the committed changes.
thakis added a subscriber: thakis.Aug 13 2021, 5:54 AM

Looks like this breaks tests: http://45.33.8.238/linux/53600/step_7.txt

Please take a look and revert for now if it takes a while to fix.

MyDeveloperDay added inline comments.Aug 13 2021, 9:47 AM
clang/lib/Format/UnwrappedLineParser.cpp
1391–1394

Maybe?

MyDeveloperDay added inline comments.Aug 13 2021, 9:49 AM
clang/unittests/Format/FormatTest.cpp
8258

can you check with out the comment and without a having a type (I know it shocking code) but just want to be sure.

Bar g(int a, Bar) final;
Bar g(a, Bar) final;

@owenpan If its been reverted is it OK to just reopen the review?

owenpan reopened this revision.Aug 13 2021, 12:59 PM
This revision is now accepted and ready to land.Aug 13 2021, 12:59 PM
owenpan planned changes to this revision.Aug 13 2021, 1:00 PM
owenpan added inline comments.Aug 13 2021, 1:02 PM
clang/lib/Format/UnwrappedLineParser.cpp
1391–1394

Yep!

clang/unittests/Format/FormatTest.cpp
8258

Sure. Will also add a test case for the assertion failure.

clang/lib/Format/UnwrappedLineParser.cpp
17

I would put that in a different commit, it has nothing to do with what is changed here. And the revert is the example why I would put that in different commits. :)

But I will not stand in the way to land this.

owenpan updated this revision to Diff 366342.Aug 13 2021, 1:20 PM

Fix the assertion failure and add more test cases.

This revision is now accepted and ready to land.Aug 13 2021, 1:20 PM
owenpan marked 3 inline comments as done.Aug 13 2021, 1:26 PM
owenpan added inline comments.
clang/lib/Format/UnwrappedLineParser.cpp
17

I included the test case here though. It's related to the bug I had in previous diffs.

owenpan requested review of this revision.Aug 13 2021, 2:05 PM
This revision is now accepted and ready to land.Aug 13 2021, 3:25 PM
This revision was landed with ongoing or failed builds.Aug 14 2021, 5:12 AM
This revision was automatically updated to reflect the committed changes.
MyDeveloperDay added a comment.EditedAug 16 2021, 12:45 AM

Nit: There is something niggling in the back of my mind that this is too much logic here to be in parseStructuralElement that sort of suggests to me that this isn't the correct place to handle this.

I don't really see any other structural element being handled like this and I'm a little concerned that we could end up in this code for more than just function declarations as parseStructuralElement is called from all over the place.

There seems to be no reference to TT_FunctionOrDecalartionName or TT_StartOfName which I would normally consider to be indicators of a function. I think ultimately you are trying to identify a function which doesn't have type information as actually being a function

so I sort of feel it should be in isFunctionDeclarationName, did you consider that at any point? or is the problem about trying to add the newline after.

What made you decide it should always have a new line and what about indentation? I see alot of code on github like this?

int main(argc, argv)
	int argc;
	char **argv;
int main (argc, argv)
int argc;
char *argv[];
int main(argc, argv) int argc;
char** argv;
int main(argc, argv)int argc; char* argv [];

Aren't we now going to unify them around a single style?

https://github.com/search?l=C&p=99&q=%22int+main%28argc%2C+argv%29%22&type=Code

Nit: There is something niggling in the back of my mind that this is too much logic here to be in parseStructuralElement that sort of suggests to me that this isn't the correct place to handle this.

I don't really see any other structural element being handled like this and I'm a little concerned that we could end up in this code for more than just function declarations as parseStructuralElement is called from all over the place.

The IsTopLevel and Style.isCpp() checks should cut it down a lot, and we are already doing this with tryToParseJSFunction() anyway.

There seems to be no reference to TT_FunctionOrDecalartionName or TT_StartOfName which I would normally consider to be indicators of a function. I think ultimately you are trying to identify a function which doesn't have type information as actually being a function

so I sort of feel it should be in isFunctionDeclarationName, did you consider that at any point? or is the problem about trying to add the newline after.

The latter although I did consider handling it in isFunctionDeclarationName() with the help of TT_StartOfName and TT_FunctionOrDeclarationName. In the end, I came to the conclusion that we must break the line in the unwrapped-line parser.

What made you decide it should always have a new line and what about indentation? I see alot of code on github like this?

int main(argc, argv)
	int argc;
	char **argv;
int main (argc, argv)
int argc;
char *argv[];
int main(argc, argv) int argc;
char** argv;
int main(argc, argv)int argc; char* argv [];

Aren't we now going to unify them around a single style?

https://github.com/search?l=C&p=99&q=%22int+main%28argc%2C+argv%29%22&type=Code

I only break the line. The indentation would be lost even if we didn't break it. For example:

int main(argc, argv)
    int argc;
    char *argv[];

would be formatted to:

int main(argc, argv) int argc;
char *argv[];

which is worse than:

int main(argc, argv)
int argc;
char *argv[];