This is an archive of the discontinued LLVM Phabricator instance.

[clang-format] BreakAfterReturnType ignored on functions with numeric template parameters
ClosedPublic

Authored by MyDeveloperDay on Mar 13 2019, 10:26 AM.

Details

Summary

Addresses PR40696 - https://bugs.llvm.org/show_bug.cgi?id=40696

The BreakAfterReturnType didn't work if it had a single arguments which was a template with an integer template parameter

int  foo(A<8> a) { return a; }

When run with the Mozilla style. would not break after the int

int TestFn(A<8> a)
{
  return a;
}

This revision resolves this issue by allowing numeric constants to be considered function parameters if if seen inside <>

Diff Detail

Event Timeline

Why did the numeric constant break this? Which path would the function run through?

MyDeveloperDay added a comment.EditedMar 14 2019, 2:56 AM

Why did the numeric constant break this? Which path would the function run through?

as its looking though the parameter list is doesn't recognize a numeric constant as being a valid thing it would see in an argument list and so falls through the bottom to return false, it will only be where there is a numeric_constant as the first argument, if it has a second argument and that argument is just a simple type it would still fail, but switch them around and it won't

As I look more at this, even this would (and still would) fail to break

int TestFn(A=1)
{
  return a;
}

originally I didn't have the counting of the <> but one of the unit tests failed where we had

verifyFormat("LoooooooooooooooooooooooooooooooooooooooongType\n"
             "    LoooooooooooooooooooooooooooooooooooooooongVariable(1);");

I guess they wanted to do something like this when the type and name is very long

int
variable(1);

but I think this gets thought of as being a potential FunctionDecl

so I held off just adding tok::numeric_constant to the isOneOf(), and went with the use case

I think perhaps I could solve the default argument by adding a

startsSequence(tok::equal, tok::numeric_constant)

Why did the numeric constant break this? Which path would the function run through?

as its looking though the parameter list is doesn't recognize a numeric constant as being a valid thing it would see in an argument list and so falls through the bottom to return false, it will only be where there is a numeric_constant as the first argument, if it has a second argument and that argument is just a simple type it would still fail, but switch them around and it won't

As I look more at this, even this would (and still would) fail to break

int TestFn(A=1)
{
  return a;
}

originally I didn't have the counting of the <> but one of the unit tests failed where we had

verifyFormat("LoooooooooooooooooooooooooooooooooooooooongType\n"
             "    LoooooooooooooooooooooooooooooooooooooooongVariable(1);");

I guess they wanted to do something like this when the type and name is very long

int
variable(1);

but I think this gets thought of as being a potential FunctionDecl

so I held off just adding tok::numeric_constant to the isOneOf(), and went with the use case

I think perhaps I could solve the default argument by adding a

startsSequence(tok::equal, tok::numeric_constant)

How does this even work for

void Test(A a)
{
}

This code seems to deeply assume that the { is the last thing in the line for a definition. One question is whether we can just mark this as a function definition when we see the { in the UnwrappedLineParser...

MyDeveloperDay added a comment.EditedMar 14 2019, 4:26 AM

That works because the argument list is just tok::identifier and TT_StartOfName

so if we see TT_StartOfName we are going to return true

tok::identifier will simply be skipped round the loop

int
Test(A a)
{}
AnnotatedTokens(L=0):
 M=0 C=0 T=Unknown S=1 B=0 BK=0 P=0 Name=int L=3 PPK=2 FakeLParens= FakeRParens=0 II=0x11be8c48b20 Text=int
 M=1 C=1 T=FunctionDeclarationName S=1 B=0 BK=0 P=220 Name=identifier L=88 PPK=2 FakeLParens= FakeRParens=0 II=0x11be8c42d00 Text=Test
 M=0 C=0 T=Unknown S=0 B=0 BK=0 P=23 Name=l_paren L=89 PPK=2 FakeLParens= FakeRParens=0 II=0x0 Text=(
 M=0 C=1 T=Unknown S=0 B=0 BK=0 P=140 Name=identifier L=90 PPK=2 FakeLParens= FakeRParens=0 II=0x11be8c42c70 Text=A
 M=0 C=1 T=StartOfName S=1 B=0 BK=0 P=240 Name=identifier L=92 PPK=2 FakeLParens= FakeRParens=0 II=0x11be8c42ca0 Text=a
 M=0 C=0 T=Unknown S=0 B=0 BK=0 P=43 Name=r_paren L=93 PPK=2 FakeLParens= FakeRParens=0 II=0x0 Text=)

Where you are correct in its handling is this,

int Test(A) {}

because it doesn't see the TT_StartOfName it doesn't return true and so falls out the bottom of the function with false
and hence doesn't break the return type

AnnotatedTokens(L=0):
 M=0 C=0 T=Unknown S=1 B=0 BK=0 P=0 Name=int L=3 PPK=2 FakeLParens= FakeRParens=0 II=0x245bb8e7140 Text=int
 M=0 C=1 T=StartOfName S=1 B=0 BK=0 P=220 Name=identifier L=8 PPK=2 FakeLParens= FakeRParens=0 II=0x245bb8ebdc0 Text=Test
 M=0 C=0 T=Unknown S=0 B=0 BK=0 P=23 Name=l_paren L=9 PPK=2 FakeLParens= FakeRParens=0 II=0x0 Text=(
 M=0 C=1 T=Unknown S=0 B=0 BK=0 P=140 Name=identifier L=10 PPK=2 FakeLParens= FakeRParens=0 II=0x245bb8ebd30 Text=A
 M=0 C=0 T=Unknown S=0 B=0 BK=0 P=43 Name=r_paren L=11 PPK=2 FakeLParens= FakeRParens=0 II=0x0 Text=)

That works because the argument list is just tok::identifier and TT_StartOfName

Should we fix isStartOfName then?

In A<bool> or A<8> A should be TT_StartOfName, too, right?

of course, this isn't legal

int TestFn(A=1)
{
  return a;
}

and, would see the TT_StartOfName for (a) and would break

int
TestFn(A a = 1)
{
  return a;
}

That works because the argument list is just tok::identifier and TT_StartOfName

Should we fix isStartOfName then?

In A<bool> or A<8> A should be TT_StartOfName, too, right?

Maybe that is a better way, let me look..

I've realized that defining a function Test which returns a class, and has an unnamed class parameter in its definition

B Test(A);

is indistinguishable from a variable declaration Test of type B which passes a variable A into its constructor, without knowing that A is a class and not a variable, there isn't enough information to say this is a function or not

B Test(A);

Which is why we should always name our parameters in function declarations ;-)

MyDeveloperDay added a comment.EditedMar 14 2019, 5:11 AM

That works because the argument list is just tok::identifier and TT_StartOfName

Should we fix isStartOfName then?

In A<bool> or A<8> A should be TT_StartOfName, too, right?

I'm not sure that would work, the <8> is part of the type not the TT_StartOfName, or did I miss something

M=0 C=0 T=Unknown S=1 B=0 BK=0 P=0 Name=int L=3 PPK=2 FakeLParens= FakeRParens=0 II=0x263a5ddd090 Text='int'
M=1 C=1 T=FunctionDeclarationName S=1 B=0 BK=0 P=220 Name=identifier L=88 PPK=2 FakeLParens= FakeRParens=0 II=0x263a5dda5a0 Text='Test'
M=0 C=0 T=Unknown S=0 B=0 BK=0 P=23 Name=l_paren L=89 PPK=2 FakeLParens= FakeRParens=0 II=0x0 Text='('
M=0 C=1 T=Unknown S=0 B=0 BK=0 P=140 Name=identifier L=90 PPK=2 FakeLParens= FakeRParens=0 II=0x263a5dda510 Text='A'
M=0 C=1 T=StartOfName S=1 B=0 BK=0 P=240 Name=identifier L=92 PPK=2 FakeLParens= FakeRParens=0 II=0x263a5dda540 Text='a'
M=0 C=0 T=Unknown S=0 B=0 BK=0 P=43 Name=r_paren L=93 PPK=2 FakeLParens= FakeRParens=0 II=0x0 Text=')
M=0 C=0 T=Unknown S=1 B=0 BK=0 P=0 Name=int L=3 PPK=2 FakeLParens= FakeRParens=0 II=0x263a5ddd090 Text='int'
M=1 C=1 T=FunctionDeclarationName S=1 B=0 BK=0 P=220 Name=identifier L=88 PPK=2 FakeLParens= FakeRParens=0 II=0x263a5dda4e0 Text='TestFn'
M=0 C=0 T=Unknown S=0 B=0 BK=0 P=23 Name=l_paren L=89 PPK=2 FakeLParens= FakeRParens=0 II=0x0 Text='('
M=0 C=1 T=Unknown S=0 B=0 BK=0 P=140 Name=identifier L=90 PPK=2 FakeLParens= FakeRParens=0 II=0x263a5dda510 Text='A'
M=0 C=0 T=TemplateOpener S=0 B=0 BK=0 P=50 Name=less L=91 PPK=2 FakeLParens= FakeRParens=0 II=0x0 Text='<'
M=0 C=1 T=Unknown S=0 B=0 BK=0 P=380 Name=bool L=95 PPK=2 FakeLParens= FakeRParens=0 II=0x263a5ddd670 Text='bool'
M=0 C=0 T=TemplateCloser S=0 B=0 BK=0 P=290 Name=greater L=96 PPK=2 FakeLParens= FakeRParens=0 II=0x0 Text='>'
M=0 C=1 T=StartOfName S=1 B=0 BK=0 P=240 Name=identifier L=98 PPK=2 FakeLParens= FakeRParens=0 II=0x263a5dda540 Text='a'
M=0 C=0 T=Unknown S=0 B=0 BK=0 P=43 Name=r_paren L=99 PPK=2 FakeLParens= FakeRParens=0 II=0x0 Text=')'

Yea, the
a b(c)
problem is the olden C++ problem clang-format will never be able to solve, thus all these crazy heuristics.

So, inching closer to understanding this - I totally missed the isLiteral() that short-circuits out before. I'm more happy with your current approach, except I'd turn it around:
I'd only short-circuit out of the loop returning false in line 2064 if isLiteral() && (TemplateOpenerLevel > 0).
Alternatively, we could try to skip things within template <> - don't we have the closing > for an opening < in MatchingParen?

Addressing code review comment

@klimek I think is what you meant, a much smaller and neater version of what I had before.

Sorry for the spam:

  • write this a little better

Minus me understanding the TT_TemplateString change, the rest looks nice now, thanks!

clang/lib/Format/TokenAnnotator.cpp
3181–3182

I don't understand this yet, which test broke? The TT_TemplateString looks like it's a JS thing?

MyDeveloperDay marked 2 inline comments as done.Apr 3 2019, 12:36 PM
MyDeveloperDay added inline comments.
clang/lib/Format/TokenAnnotator.cpp
3181–3182

I'm not sure what you mean here?

MyDeveloperDay marked an inline comment as done.Apr 5 2019, 2:37 AM

friendly ping

klimek accepted this revision.Apr 5 2019, 3:29 AM

lg

clang/lib/Format/TokenAnnotator.cpp
3181–3182

I don't see the change any more - perhaps the diff was broken? Sorry :(

This revision is now accepted and ready to land.Apr 5 2019, 3:29 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptApr 6 2019, 3:11 AM