This is an archive of the discontinued LLVM Phabricator instance.

[clang-format] Preserve whitespace in selected macros
ClosedPublic

Authored by JakeMerdichAMD on Jun 25 2020, 7:16 PM.

Details

Summary

https://bugs.llvm.org/show_bug.cgi?id=46383

When the c preprocessor stringizes tokens, the generated string literals
are affected by the whitespace. This means clang-format can affect
codegen silently, adding spaces and newlines to strings. Practically
speaking, the vast majority of cases will be harmless, only affecting
single identifiers or debug macros.

In the interest of doing no harm in other cases though, this introduces
a blacklist option 'WhitespaceSensitiveMacros', which contains a list of
names of function-like macros whose contents should not be touched by
clang-format, period. Clang-format can't automatically detect these
without a real compile context, so users will have to specify it
explicitly (it still beats clang-format off'ing at every invocation).

I added one default, "STRINGIZE", but am not particularly attached to
it, nor have I given much thought to defaults.

Diff Detail

Event Timeline

JakeMerdichAMD created this revision.Jun 25 2020, 7:16 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 25 2020, 7:16 PM
JakeMerdichAMD added a project: Restricted Project.
curdeius added inline comments.Jun 26 2020, 12:57 AM
clang/docs/ClangFormatStyleOptions.rst
2706

Shouldn't there be a configuration example like what's in ForEachMacros doc?

In the .clang-format configuration file, this can be configured like:

.. code-block:: yaml

  WhitespaceSensitiveMacros: ['STRINGIZE', 'PP_STRINGIZE']

For example: BOOST_PP_STRINGIZE.
clang/lib/Format/FormatTokenLexer.cpp
46–49

Personally I would add braces around the loop body.

clang/unittests/Format/FormatTest.cpp
13961

Shouldn't that be:
CHECK_PARSE("WhitespaceSensitiveMacros: ['STRINGIZE']",
as in other options that take vector of strings?

13963

Ditto: apostrophes around strings.

16487

How about a test with escaped parentheses \( inside the macro argument?

JakeMerdichAMD marked 10 inline comments as done.Jun 26 2020, 7:05 AM
JakeMerdichAMD added inline comments.
clang/docs/ClangFormatStyleOptions.rst
2706

Done. I also added PP_STRINGIZE and BOOST_PP_STRINGIZE as defaults; seems reasonable.

clang/unittests/Format/FormatTest.cpp
13961

I'll add it since it ought to be tested, but they aren't required to my knowledge due to 'YAML reasons' and most of the other tests omit them.

16487

Done. Note that parens always need to be matched anyhow, and the escaping doesn't actually mean anything outside of a string literal (tested on MSVC, gcc, clang), so no functionality change is needed.

JakeMerdichAMD marked 3 inline comments as done.

Address feedback (nits, better docs, more defaults)

Thanks for the fast review, @curdeius, and thanks for mentioning PP_STRINGIZE and BOOST_PP_STRINGIZE too!

This revision is now accepted and ready to land.Jun 26 2020, 10:03 AM
curdeius accepted this revision.Jun 26 2020, 11:33 AM

LGTM. Thank you for taking my comments into account.

This revision was automatically updated to reflect the committed changes.
MyDeveloperDay added inline comments.Jul 3 2020, 8:40 AM
clang/docs/ClangFormatStyleOptions.rst
2706

The change I think you made here gets lost when the ClangFormatOptions.rst file is regenerated

Could you update the Format.h and regenerate rather than making the code changes in the rst itself?

I ran into this issue when I tried to regenerate.