This is an archive of the discontinued LLVM Phabricator instance.

[clang-format] Fix SeparateDefinitionBlocks issues
ClosedPublic

Authored by ksyx on Jan 17 2022, 4:05 PM.

Details

Summary
  • Fixes https://github.com/llvm/llvm-project/issues/53227 that wrongly indents multiline comments
  • Fixes wrong detection of single-line opening bracket when used along with those only opening scopes, causing crashes due to duplicated replacements on the same token:
	void foo()
	{
	  {
		int x;
	  }
	}
  • Fixes wrong recognition of first line of definition when the line starts with block comment, causing crashes due to duplicated replacements on the same token for this leads toward skipping the line starting with inline block comment:
        /*
           Some descriptions about function
        */
	/*inline*/ void bar() {
	}
  • Fixes wrong recognition of enum when used as a type name rather than starting definition block, causing crashes due to duplicated replacements on the same token since both actions for enum and for definition blocks were taken place:
	void foobar(const enum EnumType e) {
	}
  • Change to use function keyword for JavaScript instead of comparing strings
  • Resolves formatting conflict with options EmptyLineAfterAccessModifier and EmptyLineBeforeAccessModifier (prompts with --dry-run (-n) or --output-replacement-xml but no observable change)
  • Recognize long (len>=5) uppercased name taking a single line as return type and fix the problem of adding newline below it, with adding new token type FunctionLikeOrFreestandingMacro and marking tokens in UnwrappedLineParser:
	void
	afunc(int x) {
	  return;
	}
	TYPENAME
	func(int x, int y) {
	  // ...
	}
  • Remove redundant and repeated initialization
  • Do no change to newlines before EOF

Diff Detail

Event Timeline

ksyx created this revision.Jan 17 2022, 4:05 PM
ksyx requested review of this revision.Jan 17 2022, 4:05 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 17 2022, 4:05 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
ksyx updated this revision to Diff 400667.Jan 17 2022, 4:22 PM

Apply clangfmt's suggestion.

clang/lib/Format/DefinitionBlockSeparator.cpp
45–48

I still wonder if this can't be CurrentToken->is(JSKeywords.kw_function)

Could you explain more in details the before/after of the fixed bugs, please?
The best would be to develop the patch description.

clang/lib/Format/DefinitionBlockSeparator.cpp
38

Nit and personal preference.

179

I find "bracket" ambiguous.

ksyx updated this revision to Diff 400821.Jan 18 2022, 6:11 AM
ksyx marked 3 inline comments as done.
ksyx edited the summary of this revision. (Show Details)
  • Use function keyword for JavaScript instead of comparing strings
  • Apply review suggestions
MyDeveloperDay accepted this revision.Jan 18 2022, 6:38 AM

LGTM, I'll pull it down and try it here..

This revision is now accepted and ready to land.Jan 18 2022, 6:38 AM

I'm getting some sort of seemingly empty replacements going on, after public: I can't quite put my finger on what's happening.

ksyx added a comment.Jan 18 2022, 8:42 AM

I'm getting some sort of seemingly empty replacements going on, after public: I can't quite put my finger on what's happening.

Not so sure what empty means here. Is it removing empty line following public: or adding, or something else?

For

class Test
{
public:
    static void foo()
    {
    }
};

$ clang-format -n test.cxx
test.cxx:3:8: warning: code should be clang-formatted [-Wclang-format-violations
]
public:
       ^
$ clang-format --output-replacements-xml test.cxx
<?xml version='1.0'?>
<replacements xml:space='preserve' incomplete_format='false'>
<replacement offset='20' length='5'>&#10;    </replacement>
</replacements>

but when I run it nothing changes

$ clang-format test.cxx
class Test
{
public:
    static void foo()
    {
    }
};

And no character change

ksyx added a comment.EditedJan 18 2022, 8:59 AM

For

class Test
{
public:
    static void foo()
    {
    }
};

$ clang-format -n test.cxx
test.cxx:3:8: warning: code should be clang-formatted [-Wclang-format-violations
]
public:
       ^
$ clang-format --output-replacements-xml test.cxx
<?xml version='1.0'?>
<replacements xml:space='preserve' incomplete_format='false'>
<replacement offset='20' length='5'>&#10;    </replacement>
</replacements>

but when I run it nothing changes

$ clang-format test.cxx
class Test
{
public:
    static void foo()
    {
    }
};

And no character change

Looks like it is just replacing indent whitespaces to tabs

Sorry it does not look like that

-bash-4.4$ xxd testt.cpp
00000000: 636c 6173 7320 5465 7374 207b 0a70 7562  class Test {.pub
00000010: 6c69 633a 0a20 2073 7461 7469 6320 766f  lic:.  static vo
00000020: 6964 2066 6f6f 2829 207b 0a20 2020 2069  id foo() {.    i
00000030: 6e74 2074 3b0a 2020 2020 7265 7475 726e  nt t;.    return
00000040: 2031 3b0a 2020 7d0a 7d3b                  1;.  }.};
-bash-4.4$ cat testt.cpp | sha256sum
c8628b0d002842cdcfa532cfc4be1225a926b70ef1fc47d6ac3f9d6c46a78ad6  -
-bash-4.4$ bin/clang-format testt.cpp -style='{SeparateDefinitionBlocks: Always}' | sha256sum
c8628b0d002842cdcfa532cfc4be1225a926b70ef1fc47d6ac3f9d6c46a78ad6  -
clang/lib/Format/DefinitionBlockSeparator.cpp
79–83

Is that needed?

87–88

What if NewlineToInsert is 0? Does that explain @MyDeveloperDay s observation?

ksyx added inline comments.Jan 18 2022, 12:37 PM
clang/lib/Format/DefinitionBlockSeparator.cpp
87–88

Then the line break would be removed (concatenate with last line), which is not supposed to happen with Always style but possible at first line of Never style.

I have located the problem and roughly the solution but it takes some time to find a neat fix :)

Thanks for the feedback

ksyx updated this revision to Diff 401062.Jan 18 2022, 5:39 PM
ksyx marked 2 inline comments as done.
ksyx edited the summary of this revision. (Show Details)
  • Resolves formatting conflict with options EmptyLineAfterAccessModifier and EmptyLineBeforeAccessModifier (prompts with --dry-run (-n) or --output-replacement-xml but no observable change)
  • Remove redundant and repeated initialization
  • Do no change to newlines before EOF

ok, this is better but not quite right still.. I though I mentioned this before (please add as a unit test)

HRESULT
Foo::bar() {}

HRESULT
Foo::bar() {}

formats as

HRESULT

Foo::bar() {}

HRESULT

Foo::bar() {}

If you are windows developer using clang-format (HRESULT,LRESULT,LPSTR,LPCSTR,DWORD_PTR etc.. ) its an issue.

Sorry I so want this to be right, its singularly one of the things I've wanted for some time (you just got to it before me ;-) ) so I'm super excited by it.

From the other aspects we are getting there.

ksyx updated this revision to Diff 401293.EditedJan 19 2022, 9:52 AM
ksyx edited the summary of this revision. (Show Details)

Recognize long (len>=5) uppercased name taking a single line as return type
and fix the problem of adding newline below it:

void
afunc(int x) {
  return;
}
TYPENAME
func(int x, int y) {
  // ...
}

Related code: https://github.com/llvm/llvm-project/blob/1e512f022ad5c23dc4ef4e663f51d5d0bcbc7c69/clang/lib/Format/UnwrappedLineParser.cpp#L1712

MyDeveloperDay added a comment.EditedJan 19 2022, 11:56 AM

Works nicely now... (running it over my own code base)

LGTM

clang/lib/Format/DefinitionBlockSeparator.cpp
120

Split into two asserts, then one would know which did not hold.

135

Shouldn't we set a type for such cases instead of repeating the detection code here?

136

As a Qt user who also wants to use your patch, please add a test for that case. ;)

clang/unittests/Format/DefinitionBlockSeparatorTest.cpp
146

Maybe really use HRESULT? People will know what that should be.

ksyx added inline comments.Jan 19 2022, 12:11 PM
clang/lib/Format/DefinitionBlockSeparator.cpp
135

Here I actually did a few more checks to limit the impact to the minimum but I am happy to do that if that's necessary.

136

For Q_OBJECT if used as pattern like this:

class X : ... {
  Q_OBJECT
public:
  // ...
}

I think this patch has no effect on this? The comment here is just a repetition of that in unwrapped parser

clang/unittests/Format/DefinitionBlockSeparatorTest.cpp
146

It actually does not matter what type name it is as long as it is an identifier with >=5 characters and all uppercased

ksyx updated this revision to Diff 401445.Jan 19 2022, 4:46 PM
ksyx marked an inline comment as done.
  • Apply suggestions from clangfmt
  • Split assertion conditions
ksyx updated this revision to Diff 401460.Jan 19 2022, 5:39 PM

Apply suggestion from clangfmt

For

class Test
{
public:
    static void foo()
    {
    }
};

$ clang-format -n test.cxx
test.cxx:3:8: warning: code should be clang-formatted [-Wclang-format-violations
]
public:
       ^
$ clang-format --output-replacements-xml test.cxx
<?xml version='1.0'?>
<replacements xml:space='preserve' incomplete_format='false'>
<replacement offset='20' length='5'>&#10;    </replacement>
</replacements>

but when I run it nothing changes

$ clang-format test.cxx
class Test
{
public:
    static void foo()
    {
    }
};

And no character change

@MyDeveloperDay, @ksyx
Is there a test for this?
If no, how can we test it? It happens only with --dry-run, right?

I think its been fixed, in the last but one diff.

Generally speaking we simply cannot have --output-replacements-xml , -n or --dry-run show replacements that amount to nothing because this is the mechanism by which 100s of repos fail commits thinking their is a clang-format issue.

Last revision I tried these issues seem to have disappeared

ksyx updated this revision to Diff 401659.Jan 20 2022, 9:01 AM

Add unit test for formatting conflict

clang/lib/Format/DefinitionBlockSeparator.cpp
136

Which does indicate even more, that we shouldn't copy the stuff, but let the UnwrappedLineParser mark it as that, whatever that is.
What's your opinion @MyDeveloperDay ?

I think its been fixed, in the last but one diff.

Generally speaking we simply cannot have --output-replacements-xml , -n or --dry-run show replacements that amount to nothing because this is the mechanism by which 100s of repos fail commits thinking their is a clang-format issue.

Last revision I tried these issues seem to have disappeared

I understood that it was fixed but I'd really like to have a test case that reproduces it.
We might have the same problem somewhere else, no?
Can we have a part of verifyFormat that runs with DryRun=true and checks the replacements?

ksyx added a comment.EditedJan 21 2022, 4:45 AM

I think its been fixed, in the last but one diff.

Generally speaking we simply cannot have --output-replacements-xml , -n or --dry-run show replacements that amount to nothing because this is the mechanism by which 100s of repos fail commits thinking their is a clang-format issue.

Last revision I tried these issues seem to have disappeared

I understood that it was fixed but I'd really like to have a test case that reproduces it.
We might have the same problem somewhere else, no?
Can we have a part of verifyFormat that runs with DryRun=true and checks the replacements?

Yes I see your point so I added one in the last diff :)

I'd like to see the refactor of the parts repeated from UnwrappedLineParser.
No comments otherwise.

clang/lib/Format/DefinitionBlockSeparator.cpp
135

Shouldn't we set a type for such cases instead of repeating the detection code here?

👍

ksyx updated this revision to Diff 402006.Jan 21 2022, 8:54 AM
ksyx edited the summary of this revision. (Show Details)

Add token type FunctionLikeOrFreestandingMacro and use it to replace duplicated check with UnwrappedLineParser

ksyx marked 3 inline comments as done.Jan 21 2022, 9:13 AM
clang/lib/Format/DefinitionBlockSeparator.cpp
135

Thanks! Shouldn't we drop are radically change this comment?

clang/lib/Format/UnwrappedLineParser.cpp
1686

This is more consistent with the rest of the code base.

clang/unittests/Format/DefinitionBlockSeparatorTest.cpp
47

This is from Debugging?

146

That I know, but HRESULT would look familiar to at least some people.

ksyx marked an inline comment as done.Jan 21 2022, 12:38 PM
ksyx added inline comments.
clang/unittests/Format/DefinitionBlockSeparatorTest.cpp
47

I suppose it is from rebasing.

ksyx updated this revision to Diff 402078.Jan 21 2022, 12:49 PM
ksyx marked an inline comment as done.

Apply review suggestions of renaming.

clang/lib/Format/DefinitionBlockSeparator.cpp
135

I've checked this comment but found nothing to change, since it has nothing to do with the duplicated check but just referring to where created this problem?

clang/unittests/Format/DefinitionBlockSeparatorTest.cpp
146

This makes sense! Thanks for the suggestion.

ksyx marked an inline comment as done.Jan 21 2022, 12:50 PM

maybe slightly related https://github.com/llvm/llvm-project/issues/53183 in that this is also impacted by the 5 character TT_FunctionLikeOrFreestandingMacro

ksyx marked 2 inline comments as done.Jan 22 2022, 11:16 AM

maybe slightly related https://github.com/llvm/llvm-project/issues/53183 in that this is also impacted by the 5 character TT_FunctionLikeOrFreestandingMacro

Yes this patch is just marking this type of recognition but surely it requires other changes to the formatter.

By the way are there any improvements needed for this patch? Thanks.

Just this small detail. :)

clang/lib/Format/DefinitionBlockSeparator.cpp
131
curdeius accepted this revision.Jan 24 2022, 12:11 AM

LGTM.

clang/lib/Format/DefinitionBlockSeparator.cpp
141–145

Please merge these two ifs into one.

This revision was automatically updated to reflect the committed changes.
ksyx marked 2 inline comments as done.