This is an archive of the discontinued LLVM Phabricator instance.

[clang-format] Properly handle the C11 _Generic keyword.
ClosedPublic

Authored by rymiel on Dec 2 2022, 10:25 AM.

Details

Summary

This patch properly recognizes the generic selection expression
introduced in C11, by adding an additional token type for the colons
present in such expressions.

Previously, they would be recognized as
"inline ASM colons" purely by the fact that those are the last thing
checked for.

I tried to avoid adding an addition token type, but since colons by
default like having spaces around them, I chose to add a new type so
that no space is added after the type selector.

Currently, no aspect of the formatting of these expressions in able to
be configured, as I'm not sure what could even be configured here.

One notable thing is that association list is always formatted as
either entirely on one line, if it can fit, or with line breaks
after every comma in the expression (also after the controlling expr.)

This visually makes them more similar to switch statements when long,
matching the behaviour of the selection expression, being that of a sort
of switch on types, but also allows for terseness when only selecting
for a few things.

Fixes https://github.com/llvm/llvm-project/issues/18080

Diff Detail

Event Timeline

rymiel created this revision.Dec 2 2022, 10:25 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 2 2022, 10:25 AM
rymiel requested review of this revision.Dec 2 2022, 10:25 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 2 2022, 10:25 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript

I put WIP in the title right now because firstly, I still need to write more tests, but also, since I haven't touched the breaking and indenting part of the code before, I want to make sure what I'm doing is sensible.

Basically, after making the colons their own type so that they no longer followed the rules of InlineASMColon, I noticed the output was perfect on its own if passed BinPackArgument: false as a style option. Following this train of thought, I simply let generic selection expressions act as if regular call-like expressions, but with binpack always forcefully disabled.

Not sure if that's something that should even be done (i.e. leave it just up to BinPackArgument), or configurable, or etc.

rymiel added a project: Restricted Project.Dec 2 2022, 11:08 AM

Looks good to me.
And adding a token type is absolutely nothing negative. The more tokens get a type, the better.

clang/lib/Format/TokenAnnotator.cpp
1635
MyDeveloperDay accepted this revision.Dec 13 2022, 12:05 AM
MyDeveloperDay added inline comments.
clang/lib/Format/ContinuationIndenter.cpp
1699

I wonder if this says isJavaScript() here because of the

foo(name: type,name2: type2) syntax

if that might be the case then given the format you present in your test then maybe it IS correct to have GenericSelection here.

This revision is now accepted and ready to land.Dec 13 2022, 12:05 AM
rymiel planned changes to this revision.Dec 22 2022, 11:06 PM

Going to try to make the indentation of the cases following the controlling expression not continuations, since that results in weird results such as:

#define LIMIT_MAX(T)                     \
  _Generic(                              \
      ((T)0),                            \
      unsigned int: UINT_MAX,            \
      unsigned long: ULONG_MAX,          \
      unsigned long long: ULLONG_MAX)

My desired output is instead:

#define LIMIT_MAX(T)                     \
  _Generic(((T)0),                       \
      unsigned int: UINT_MAX,            \
      unsigned long: ULONG_MAX,          \
      unsigned long long: ULLONG_MAX)

or something similar, but I'm struggling with unfamiliar parts of the codebase and taking my time getting familiar with them.

rymiel updated this revision to Diff 485062.Dec 23 2022, 1:00 AM
rymiel marked an inline comment as done.
rymiel retitled this revision from [WIP][clang-format] Properly handle the C11 _Generic keyword. to [clang-format] Properly handle the C11 _Generic keyword..

Change indenting behaviour

I'm still not sure what I'm doing regarding ContinuationIndenter, since I don't fully understand its mechanisms, so I just followed my usual method of "stick a very specific check in a random spot that makes the correct fields have the correct value", which feels like a hack here. There is probably a much better way to do this, perhaps something with fake parens or something, to actually treat the "controlling expression" and the actual selectors differently, but right now I've just put in a specific check to dedent the selectors to be relative to the _Generic keyword, and not aligned to the opening paren.

It is also no surprise that I don't know how to write tests; I included organic examples of cases that are semi-realistic use cases of _Generic, but there are definitely edge cases I don't have listed.

This revision is now accepted and ready to land.Dec 23 2022, 1:00 AM
rymiel requested review of this revision.Dec 23 2022, 1:01 AM
This revision is now accepted and ready to land.Dec 23 2022, 8:38 AM
MyDeveloperDay accepted this revision.Jan 3 2023, 2:51 AM
owenpan accepted this revision.Jan 10 2023, 1:51 AM
This revision was landed with ongoing or failed builds.Jan 10 2023, 8:02 PM
This revision was automatically updated to reflect the committed changes.