This is an archive of the discontinued LLVM Phabricator instance.

[clang-format] Handle macros in function params and return value
AbandonedPublic

Authored by tamas.petz on Feb 28 2020, 9:37 AM.

Details

Summary

This patch adds support to handle macros in function parameter list
and function return type.

Consider the following case:

const ElfW(Addr)* foo(ElfW(Addr)* addr);

This case was handled incorrectly because the star token was
handled as a binary operator. After formatting:

const ElfW(Addr) * foo(ElfW(Addr) * addr);

This patch adds support for this case.

All format tests pass.

Diff Detail

Event Timeline

tamas.petz created this revision.Feb 28 2020, 9:37 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 28 2020, 9:37 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript

Friendly ping.

MyDeveloperDay added inline comments.Apr 3 2020, 9:59 AM
clang/lib/Format/TokenAnnotator.cpp
164

your naming convention is incorrect it would be parseFunctionQualifiedName

what are your trying to do here? are you just trying to skip the to the (

clang/unittests/Format/FormatTest.cpp
7359

I would like to see the macro definitions in your tests. It's not clear from looking at just the test line that M is supposed to be a macro. It looks like a syntax error.

MyDeveloperDay added inline comments.Apr 4 2020, 9:50 AM
clang/lib/Format/TokenAnnotator.cpp
313

I assume it could be almost anything?

void f(volatile ElfW(Addr)& addr);
void f(volatile ElfW(Addr)&& addr);
void f(volatile ElfW(Addr) const & addr);

void f(volatile ElfW(Addr,foo)* addr);
void f(volatile ElfW(Addr,ElfW(Addr) *foo)* addr);

? you seem to handle only the * case

I think situations where we infer M(x) to expand to a "type" without const/volatile close-by are too ambiguous an we will have the opposite problem: formatting certain binary expressions as type pointer/references. Some sort of a list of macros that clang-format assumes expand to a "type" could eliminate ambiguities, but has its own problems.
I think the case where there's a const/volatile/typename/typedef etc. around M(x) is very interesting; not sure if there are many practical ambiguities there.
There was an idea of giving a bunch of #define-s to clang-format (maybe with a small subset of what's possible) and use these rules to decide how to format matching forms, but that's hard (requires some sort of "virtual expanded token sequences" that the formatter should undestrand how to handle) to do and I don't know what's the status of that.

clang/unittests/Format/FormatTest.cpp
7360

This is ambiguous: the * could be a binary operator: https://godbolt.org/z/n7Jr-h

A sample of snippets that this misformats:

  • inside other macro:
EXPECT_EQ(Calc()* bar, baz);
  • in member function call:
BENCHMARK(BM_Foo)->ThreadRange(1, NumCPUs()* 2);
  • this causes javascript diffs:
- const foo = (bar == 'rtl' ? -1 : 1) * (blah || blah || 0);
+ const foo =(bar == 'rtl' ? -1 : 1) * (blah || blah || 0);
  • binary operators with casts:
a = (int)(b()* c);
rectaaaaaaaaaaaaaa->SetSize((int)(GetWidth(foo)* bar),
                            (int)(GetHeight(foo)* bar));
int f() {
  ...
- return static_cast<int>(param) * (a * b + c);
+ return static_cast<int>(param)*(a * b + c);
}
  • some arithmetic expressions:
float aaaa = 43 * log(sin(a/4) *b);

Test:

% cat test.cc                           
EXPECT_EQ(Calc() * bar, baz);

BENCHMARK(BM_Foo)->ThreadRange(1, NumCPUs() * 2);

a = (int)(b() * c);

rectaaaaaaaaaaaaaa->SetSize((int)(GetWidth(foo) * bar),
                            (int)(GetHeight(foo) * bar));

int f() { return static_cast<int>(param) * (a * b + c); }

float aaaa = 43 * log(sin(a / 4) * b);
% bin/clang-format -style=google test.cc
EXPECT_EQ(Calc()* bar, baz);

BENCHMARK(BM_Foo)->ThreadRange(1, NumCPUs()* 2);

a = (int)(b()* c);

rectaaaaaaaaaaaaaa->SetSize((int)(GetWidth(foo)* bar),
                            (int)(GetHeight(foo)* bar));

int f() { return static_cast<int>(param)*(a * b + c); }

float aaaa = 43 * log(sin(a / 4)* b);
% cat test.js
const foo = (bar == 'rtl' ? -1 : 1) * (blah || blah || 0);
% bin/clang-format -style=google test.js
const foo =(bar == 'rtl' ? -1 : 1) * (blah || blah || 0);
tamas.petz marked 2 inline comments as done.Apr 7 2020, 12:55 AM

@all, thank you for the review so far.

The case I am trying to handle is way too ambiguous.
IMHO looking at tokens only is not going to lead to a "perfect" formatter.
The case I am trying to handle is quite common but it can lead to incorrect formatting in other cases.

I had a weak hope that passing all tests is "safe enough" but it is unfortunately not.

Should we just note that cases similar to the one in the description are just too ambiguous to handle correctly?
Does anyone see a way the highlighted formatting issue can be solved?

Many thanks,

  • Tamas
clang/lib/Format/TokenAnnotator.cpp
313

Yes, I am handling this case only at the moment.
I am not sure this patch is going to land at all so I spared some work until we figure it out.

clang/unittests/Format/FormatTest.cpp
7360

This case is ambiguous. Like the case at line 7324, which could also be "MACRO()* ptr;"

tamas.petz marked 2 inline comments as not done.Apr 7 2020, 12:55 AM

Did you try putting 'ElfW and 'M' in the list of TypenameMacros?

---
Language: Cpp
BasedOnStyle: LLVM
TypenameMacros: ['ElfW']
PointerAlignment: Left
const ElfW(Addr)* foo(ElfW(Addr)* addr);
const M(Addr) * foo(M(Addr) * addr);
$ clang-format macros.cpp
const ElfW(Addr)* foo(ElfW(Addr)* addr);
const M(Addr) * foo(M(Addr) * addr);

Simply extend the list to include the macros you need

TypenameMacros: ['ElfW','M']
$ clang-format macros.cpp
const ElfW(Addr)* foo(ElfW(Addr)* addr);
const M(Addr)* foo(M(Addr)* addr);

Wow, I have missed that configuration option.
I will try it, I assume it should work.

Looks like this change should be abandoned.

Wow, I have missed that configuration option.
I will try it, I assume it should work.

Looks like this change should be abandoned.

To be honest, I forget what we've got too! ;-) I was just writing a reply that said "how about adding an option with a list of type macros", I was looking for an example of other places we do that, and stumbled on it.

tamas.petz abandoned this revision.Apr 15 2020, 12:06 AM

Wow, I have missed that configuration option.
I will try it, I assume it should work.

Looks like this change should be abandoned.

To be honest, I forget what we've got too! ;-) I was just writing a reply that said "how about adding an option with a list of type macros", I was looking for an example of other places we do that, and stumbled on it.

: ) I am now abandoning this change.
Thank you all for you comments.