This is a follow-up on https://reviews.llvm.org/D107950 which missed user-defined types in K&R C.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
clang/lib/Format/UnwrappedLineParser.cpp | ||
---|---|---|
17 | Why is this not needed anymore? |
Did the original change make it into the 13 branch? I'm seeing some unexpected behavior.
bool foo(int a, Bar) override; bool foo(int a, Bar) override; // comment
becomes
bool foo(int a, Bar) override; bool foo(int a, Bar) override; // comment
Logged as https://bugs.llvm.org/show_bug.cgi?id=51470 @krasimir fix resolves this.
I've marked this bug as a 13.0 release blocker ((@tstellar), we can take @krasimir fix or we revert all the recent K&R support out of the 13 branch, @owenpan which would you be happier doing?
FWIW with this additional fix is also still ok and doesn't revert the behaviour, but we'd have to land this in order for @tstellar to move it to the branch (and I think he said that needed to be done by Friday 13th)
bool foo(int a, Bar) override; bool foo(int a, Bar) override; // comment bool foo(int a, int) override; bool foo(int a, Bar) final; bool foo(int a, Bar) final; // comment
The following doesn't feel consistent
bool foo(a, Bar) final; // comment bool foo(a, Bar) final; Bar foo(a, Bar) final; // comment Bar foo(a, Bar) final;
What's the 13 branch?
I'm seeing some unexpected behavior.
bool foo(int a, Bar) override; bool foo(int a, Bar) override; // commentbecomes
bool foo(int a, Bar) override; bool foo(int a, Bar) override; // comment
How can I reproduce it? The above C++ declarations should be inside a class definition, no?
Anyway, here is what I got:
$ cat .clang-format BasedOnStyle: LLVM AlwaysBreakAfterReturnType: TopLevelDefinitions $ cat foo.cpp typedef float Bar; class A { virtual bool foo(int, Bar); virtual bool bar(int, Bar); }; class B : A { bool foo(int a, Bar) override; bool bar(int a, Bar) override; // comment }; $ clang -Wall -std=c++11 -c foo.cpp $ clang-format foo.cpp > bar.cpp $ diff foo.cpp bar.cpp $
The above C++ declarations should be inside a class definition, no?
Not really, we don't have to have any context of where we are (in class or struct). if you had for example
class A { #include "myfunctions.h" }
you still would want the code in myfunctions.h to be formatted just the same (less indentation)
So the following code:
bool foo(int a, Bar) override; bool foo(int a, Bar) override; // comment
Failed on the 13 branch, but is fixed with @krasimir fix, but that isn't backported (we need to do that as a minimum I think)
With this revision that is also fixed, the following occurs if I change the return type
Bar foo(int a, Bar) override; Bar foo(int a, Bar) override; // comment
And that doesn't feel correct to me. Well not as by default at least. From my perspective there seems to be a propensity to think code is K&R, but that should be the exception not the rule.
If that is to be the case then we need some sort of setting that means they need to expect it, (like the Language: XXXX) that was proposed before
I just ran this on my 4.5 million line and ended up with way more areas that needed clang-formatting than I think should need to (in c++ code not c code)
Got it!
So the following code:
bool foo(int a, Bar) override; bool foo(int a, Bar) override; // commentFailed on the 13 branch, but is fixed with @krasimir fix, but that isn't backported (we need to do that as a minimum I think)
With this revision that is also fixed, the following occurs if I change the return type
Bar foo(int a, Bar) override; Bar foo(int a, Bar) override; // commentAnd that doesn't feel correct to me. Well not as by default at least. From my perspective there seems to be a propensity to think code is K&R, but that should be the exception not the rule.
I agree.
This is better as it will catch more cases but there can still be cases we could fall foul.
Bar foo(a, Bar) LLVM_OVERRIDE; Bar foo(a, Bar) LLVM_FINAL; Bar foo(a, Bar) LLVM_NOEXCEPT;
Is there any way this K&R stuff can be put behind an option, I just feel like we'll be chasing shadows.
I'm not making this cases up, these are examples of real cases that we've used in our cross platform environment because no all C++ compilers supported override/final.
We should do it only as the last resort. Let me first give it another shot.
I'm not making this cases up, these are examples of real cases that we've used in our cross platform environment because no all C++ compilers supported override/final.
Thanks for coming up with all these cases!
Check for tok::semi (in addition to tok::l_paren) after the first token of a potential K&R C parameter declaration.
bool foo(int a, Bar) override; bool foo(int a, Bar) override; // comment bool foo(int a, Bar) final; bool foo(int a, Bar) final; // comment bool foo(a, Bar) final; // comment bool foo(a, Bar) final; Bar foo(a, Bar) final; // comment Bar foo(a, Bar) final; Bar foo(int a, Bar) final; // comment Bar foo(int a, Bar) final; Bar foo(a, Bar) noexcept; Bar foo(a, Bar) noexcept(false); Bar foo(int a, Bar) LLVM_OVERRIDE; Bar foo(a, Bar) LLVM_NOEXCEPT; Bar foo(a, Bar) LLVM_NOEXCEPT; Bar foo(a, Bar) const; bool foo(a, Bar) override; bool foo(a, Bar) const override; bool foo(a, Bar) noexcept override; bool foo(a, Bar) /*ABC*/ override; auto foo(a, Bar) OVERRIDE; [[nodiscard]] bool foo(a, Bar) final; [[nodiscard]] Bar foo(a, Bar) final;
This is looking way more robust to me, LGTM..
Looks like this breaks tests: http://45.33.8.238/linux/53600/step_7.txt
Please take a look and revert for now if it takes a while to fix.
clang/lib/Format/UnwrappedLineParser.cpp | ||
---|---|---|
1388–1395 | Maybe? |
clang/unittests/Format/FormatTest.cpp | ||
---|---|---|
8258 | can you check with out the comment and without a having a type (I know it shocking code) but just want to be sure. Bar g(int a, Bar) final; Bar g(a, Bar) final; |
clang/lib/Format/UnwrappedLineParser.cpp | ||
---|---|---|
17 | I would put that in a different commit, it has nothing to do with what is changed here. And the revert is the example why I would put that in different commits. :) But I will not stand in the way to land this. |
clang/lib/Format/UnwrappedLineParser.cpp | ||
---|---|---|
17 | I included the test case here though. It's related to the bug I had in previous diffs. |
Nit: There is something niggling in the back of my mind that this is too much logic here to be in parseStructuralElement that sort of suggests to me that this isn't the correct place to handle this.
I don't really see any other structural element being handled like this and I'm a little concerned that we could end up in this code for more than just function declarations as parseStructuralElement is called from all over the place.
There seems to be no reference to TT_FunctionOrDecalartionName or TT_StartOfName which I would normally consider to be indicators of a function. I think ultimately you are trying to identify a function which doesn't have type information as actually being a function
so I sort of feel it should be in isFunctionDeclarationName, did you consider that at any point? or is the problem about trying to add the newline after.
What made you decide it should always have a new line and what about indentation? I see alot of code on github like this?
int main(argc, argv) int argc; char **argv;
int main (argc, argv) int argc; char *argv[];
int main(argc, argv) int argc; char** argv;
int main(argc, argv)int argc; char* argv [];
Aren't we now going to unify them around a single style?
https://github.com/search?l=C&p=99&q=%22int+main%28argc%2C+argv%29%22&type=Code
The IsTopLevel and Style.isCpp() checks should cut it down a lot, and we are already doing this with tryToParseJSFunction() anyway.
There seems to be no reference to TT_FunctionOrDecalartionName or TT_StartOfName which I would normally consider to be indicators of a function. I think ultimately you are trying to identify a function which doesn't have type information as actually being a function
so I sort of feel it should be in isFunctionDeclarationName, did you consider that at any point? or is the problem about trying to add the newline after.
The latter although I did consider handling it in isFunctionDeclarationName() with the help of TT_StartOfName and TT_FunctionOrDeclarationName. In the end, I came to the conclusion that we must break the line in the unwrapped-line parser.
What made you decide it should always have a new line and what about indentation? I see alot of code on github like this?
int main(argc, argv) int argc; char **argv;int main (argc, argv) int argc; char *argv[];int main(argc, argv) int argc; char** argv;int main(argc, argv)int argc; char* argv [];Aren't we now going to unify them around a single style?
https://github.com/search?l=C&p=99&q=%22int+main%28argc%2C+argv%29%22&type=Code
I only break the line. The indentation would be lost even if we didn't break it. For example:
int main(argc, argv) int argc; char *argv[];
would be formatted to:
int main(argc, argv) int argc; char *argv[];
which is worse than:
int main(argc, argv) int argc; char *argv[];
Why is this not needed anymore?