This is an archive of the discontinued LLVM Phabricator instance.

Differential D33440

clang-format: better handle statement macros
Closed, Public

Authored by Typz on May 23 2017, 7:27 AM.

Details

Reviewers

krasimir
djasper
klimek

Commits

rG6f40e21a1601: clang-format: better handle statement macros
rC343602: clang-format: better handle statement macros
rL343602: clang-format: better handle statement macros

Summary

Some macros are used in the body of a function and already contain the trailing semicolon, so they should automatically be followed by a new line and not get merged with the next line. This is the case, for example, with Qt's Q_UNUSED macro:

  
int foo(int a, int b) {
  Q_UNUSED(a)
  return b;
}

This patch deals with these cases by introducing a new option to specify a list of statement macros. It reuses the system already in place for foreach macros, to ensure there is no impact on performance.
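
To make the summary concrete, here is a minimal sketch of the kind of code involved. `MY_UNUSED` is a hypothetical stand-in for Qt's Q_UNUSED, defined locally so the snippet compiles without Qt; it is not Qt's real definition. The `StatementMacros` entry in the comment borrows the list syntax shown for `ForEachMacros` in the diff and is an assumed configuration, not one quoted from the patch.

```cpp
#include <cassert>

// Hypothetical stand-in for Qt's Q_UNUSED (not the real Qt definition).
// The macro body already ends with ';', so call sites omit it.
#define MY_UNUSED(x) (void)(x);

// Assumed .clang-format entry so each invocation stays on its own line:
//   StatementMacros: ['MY_UNUSED']
int foo(int a, int b) {
  MY_UNUSED(a)
  return b;
}

// Two invocations in a row, as in the inlined member function case.
int bar(int a, int b) {
  MY_UNUSED(a)
  MY_UNUSED(b)
  return 0;
}

// Small usage check for the two functions above.
void checkStatementMacroExamples() {
  assert(foo(1, 2) == 2);
  assert(bar(3, 4) == 0);
}
```

Without such an option, the two invocations in `bar` are the case that can end up merged on one line once the body is wrapped.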

Diff Detail

Repository

rC Clang

Build Status

Buildable 18170
Build 18170: arc lint + arc unit

Event Timeline

Typz created this revision. May 23 2017, 7:27 AM

Herald added a subscriber: klimek. May 23 2017, 7:27 AM

clang-format already has logic to detect semicolon-less macro invocations, and in fact this already behaves as I would expect. What are you fixing?

Without this patch, macros with no trailing semicolon _in the body of a function_ are not handled properly, so I get:

int foo(int a, int b) {
  Q_UNUSED(a) return b;
}

class Foo {
  void bar(int a, int b) { Q_UNUSED(a) Q_UNUSED(b) }
}

I don't. Only if they start out to be on the same line. As long as I start with:

class C {
  int foo(int a, int b) {
    Q_UNUSED(a)
    Q_UNUSED(a)
    return b;
  }
};

clang-format leaves this alone. That's good enough I think and we don't want to add more special handling for macro-DSLs than strictly necessary.

Digging back, it seems the issue was that I had this code:

void foo(......) { Q_UNUSED(a) Q_UNUSED(b) }

which got wrapped because the line was too long:

void foo(......) {
  Q_UNUSED(a) Q_UNUSED(b)
}

I definitely understand your concern about introducing special handling for some specific/hard-coded macro-DSLs, but would it be acceptable to add a configuration option to specify a list of such macros? e.g. a StatementMacros option, similar to the ForEachMacros option?

That would allow:

Systematically reformatting the first example
Ensuring that, if such code is reformatted, the two macros do not end up on the same line
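
The behavior asked for here can be sketched on a toy token stream. This is only an illustration under simplifying assumptions (tokens are plain strings, and `splitStatementMacros` is a hypothetical helper); it is not clang-format's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Simplified model: a "line" is just a list of token strings.
using Line = std::vector<std::string>;

// Splits a token stream so that every statement macro invocation
// (name, optional parenthesized arguments, optional ';') gets its own line.
std::vector<Line> splitStatementMacros(
    const std::vector<std::string> &Tokens,
    const std::unordered_set<std::string> &StatementMacros) {
  std::vector<Line> Lines;
  Line Current;
  auto flush = [&] {
    if (!Current.empty()) {
      Lines.push_back(Current);
      Current.clear();
    }
  };
  for (std::size_t I = 0; I < Tokens.size();) {
    if (StatementMacros.count(Tokens[I]) == 0) {
      Current.push_back(Tokens[I++]); // Ordinary token: stay on this line.
      continue;
    }
    flush();                        // The macro starts a new line.
    Current.push_back(Tokens[I++]); // Macro name.
    if (I < Tokens.size() && Tokens[I] == "(") {
      int Depth = 0;
      do { // Copy the balanced argument list.
        if (Tokens[I] == "(")
          ++Depth;
        if (Tokens[I] == ")")
          --Depth;
        Current.push_back(Tokens[I++]);
      } while (I < Tokens.size() && Depth > 0);
    }
    if (I < Tokens.size() && Tokens[I] == ";")
      Current.push_back(Tokens[I++]); // Optional trailing semicolon.
    flush();                        // The macro also ends its line.
  }
  flush();
  return Lines;
}

// Usage check: two macros followed by a return statement become three lines.
void checkSplit() {
  std::vector<std::string> Tokens = {"Q_UNUSED", "(", "a", ")", "Q_UNUSED", "(",
                                     "b",        ")", "return", "b", ";"};
  std::unordered_set<std::string> Macros = {"Q_UNUSED"};
  std::vector<Line> Lines = splitStatementMacros(Tokens, Macros);
  assert(Lines.size() == 3);
}
```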

I generally would not be opposed to such a patch. However, note that this might be hard to get right. We had significant performance problems in the past with ForEachMacros as we used to match every single identifier against the regex stored in there. For for loops you can somewhat get out of that and you might be able to do the same thing here, but I am not entirely sure. In contrast, the added value is actually not very large. clang-format is merely not able to automatically fix something to your liking and it's very easy to make the code right and have clang-format keep it that way.

Complete refactor to make the processing much more generic

Typz retitled this revision from "clang-format: properly handle Q_UNUSED and QT_REQUIRE_VERSION" to "clang-format: better handle statement and namespace macros". Jun 26 2017, 6:18 AM

Typz edited the summary of this revision.

Fix typo

ping?

So, there are two things in this patch: Statement macros and namespace macros. Let's break this out and handle them individually. They really aren't related that much.

Statement macros:
I think clang-format's handling here is good enough. clang-format does not insert the line break, but it also doesn't remove it. I am not 100% sure here, so I can be convinced. But I want to understand the use cases better. Do you expect people to run into this frequently? I am essentially trying to understand whether the cost of an extra option is worth the benefit it is giving.

Namespace macros:
How important are the automatic closing comments to you? I'd say that we should punt on that and leave it to the user to fix comments of these. And then, we could try to make the things we already have in MacroBlockBegin detect whether it ends with an opening brace and not need an extra list here. What do you think?

unittests/Format/FormatTest.cpp
1640	What's the difference here?

In D33440#812645, @djasper wrote:

So, there are two things in this patch: Statement macros and namespace macros. Let's break this out and handle them individually. They really aren't related that much.

Indeed, the only "relation" is the implementation, which now uses the exact same list (internally) to match all macros. Phabricator makes it very difficult to work with related changes pushed as multiple reviews, so I ended up merging the two features.

Statement macros:
I think clang-format's handling here is good enough. clang-format does not insert the line break, but it also doesn't remove it. I am not 100% sure here, so I can be convinced. But I want to understand the use cases better. Do you expect people to run into this frequently? I am essentially trying to understand whether the cost of an extra option is worth the benefit it is giving.

This happens relatively often, for example when fixing an "unused parameter" warning on an inlined function: `int f(int a) { return 0; }` often gets fixed to `int f(int a) { Q_UNUSED(a) return 0; }` and clang-format does not fix the formatting...

Namespace macros:
How important are the automatic closing comments to you? I'd say that we should punt on that and leave it to the user to fix comments of these. And then, we could try to make the things we already have in MacroBlockBegin detect whether it ends with an opening brace and not need an extra list here. What do you think?

This is not just about automatic closing comments; there are many differences: indentation of namespaces, 'compacting' of namespaces when CompactNamespaces is used, detection of inline functions (for putting on a single line with SFS_Inline), handling of empty blocks (i.e. use BraceWrappingFlags.SplitEmptyNamespace)...

ping?

Rebase to master to fix merge issue

Harbormaster completed remote builds in B10123: Diff 114783. Sep 12 2017, 2:05 AM

djasper added a reviewer: klimek. Sep 12 2017, 2:14 AM

I'd still prefer individual patches for each of these changes. If the code review system or VCS make it hard for you to deal with two adjacent changes this way, do them in sequence.

Adding Manuel as a reviewer who has a longer term idea on how to handle macros.

Patch looks good, but I would also like to see it split. I would suggest first getting the statement macro part in, which requires less code. Then we can put the namespace macros on top of that. I really like the generality of this approach and would want to also add support for class macros eventually.

lib/Format/FormatTokenLexer.cpp
662	Please move this inside the following `if` statement, so that we only perform the search when we see a `tok::pp_define`.
lib/Format/NamespaceEndCommentsFixer.cpp
56 ↗	(On Diff #114783)	I don't understand why you have a `Tok ? ...` after you `assert(Tok && ...)`?
82 ↗	(On Diff #114783)	What does this comment refer to? If it's about the line above, consider moving it up.
100 ↗	(On Diff #114783)	nit: end comment with `.`
105 ↗	(On Diff #114783)	nit: end comment with `.`
155 ↗	(On Diff #114783)	What happened to the old `// Detect "(inline)? namespace" in the beginning of a line.`
unittests/Format/FormatTest.cpp
1640	Hm, what would happen if you have a namespace macro with two or more parameters?

Typz marked 8 inline comments as done. Sep 13 2017, 8:49 AM

Typz added inline comments.

lib/Format/NamespaceEndCommentsFixer.cpp
155 ↗	(On Diff #114783)	Moved to `FormatToken::getNamespaceToken()`
unittests/Format/FormatTest.cpp
1640	only the first argument is used, as seen in previous test case.

Fix review comments, before splitting the commit.

Split diff: handle only statements in here, namespace macros will be moved to another one.

Typz retitled this revision from "clang-format: better handle statement and namespace macros" to "clang-format: better handle statement macros". Sep 13 2017, 9:29 AM

Typz edited the summary of this revision.

krasimir added inline comments. Sep 14 2017, 1:00 AM

unittests/Format/FormatTest.cpp
2494	Please add tests where:
	- the macro occurs inside a function body, thus having nontrivial indentation
	- there are two macros one after another
	- there is some code before the macro (on the same line + on a previous line with the same level)

djasper added inline comments. Sep 14 2017, 2:29 AM

lib/Format/FormatTokenLexer.cpp
665–666	This does a binary search. Why aren't you implementing it with a hashtable?
lib/Format/FormatTokenLexer.h
104	What does this class do and why do we need it? Describe its purpose in a comment.
105	nullptr
112	Are all of these used?

Add tests.
Replace sorted list with hashtable.

lib/Format/FormatTokenLexer.cpp
665–666	It was already done this way, so I did not change it to avoid any impact on performance. But I can change it if you prefer.
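
The change discussed in this thread can be sketched as follows. The reviewer describes the old lookup as a search over a sorted list; the new one is a single table from macro name to token type. The sketch uses `std::unordered_map` keyed by `std::string` and a hypothetical reduced `TokenType` enum purely for illustration, whereas the patch itself uses `llvm::SmallMapVector` keyed by `IdentifierInfo *`.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical, reduced token classification for this sketch.
enum class TokenType { Unknown, ForEachMacro, StatementMacro };

// Old approach (sketch): a sorted list that can only answer
// "is this identifier a foreach macro?".
bool isForEachMacroSorted(const std::vector<std::string> &Sorted,
                          const std::string &Name) {
  return std::binary_search(Sorted.begin(), Sorted.end(), Name);
}

// New approach (sketch): one table maps every known macro name to its
// token type, so a single lookup classifies the identifier.
class MacroTable {
public:
  void add(const std::string &Name, TokenType Type) {
    Table.emplace(Name, Type); // Keeps the first entry if Name repeats.
  }
  TokenType classify(const std::string &Name) const {
    auto It = Table.find(Name);
    return It == Table.end() ? TokenType::Unknown : It->second;
  }

private:
  std::unordered_map<std::string, TokenType> Table;
};

// Usage check for both lookups.
void checkMacroTable() {
  std::vector<std::string> Sorted = {"BOOST_FOREACH", "Q_FOREACH", "foreach"};
  assert(isForEachMacroSorted(Sorted, "Q_FOREACH"));

  MacroTable Macros;
  Macros.add("foreach", TokenType::ForEachMacro);
  Macros.add("Q_UNUSED", TokenType::StatementMacro);
  assert(Macros.classify("Q_UNUSED") == TokenType::StatementMacro);
  assert(Macros.classify("printf") == TokenType::Unknown);
}
```

A single table also means that adding another macro category later only needs another token type, not another list.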

rebase

ping?

Out of curiosity, will this be able to fix the two situations that you get for Python extensions?
There, you usually have a PyObject_HEAD without a semicolon in a struct and then a PyObject_HEAD_INIT(..) in a braced init list. More info:
http://starship.python.net/crew/mwh/toext/node20.html

lib/Format/FormatTokenLexer.cpp
661	Just do: auto it = Macros.find(FormatTok->Tok.getIdentifierInfo()); .. I know that this means that we might do the hash look up when we wouldn't need to (when we are actually in a #define), but I think clarity here is more important than the tiny performance benefit.
665–666	Thanks. This is much better than it was before.
lib/Format/FormatTokenLexer.h
25	Use ".
lib/Format/UnwrappedLineParser.cpp
1308	This contains the exact same code (I think). Can you pull it out into a function?

In D33440#920205, @djasper wrote:

Out of curiosity, will this be able to fix the two situations that you get for Python extensions?
There, you usually have a PyObject_HEAD without a semicolon in a struct and then a PyObject_HEAD_INIT(..) in a braced init list. More info:
http://starship.python.net/crew/mwh/toext/node20.html

PyObject_HEAD is defined (by default) as:

Py_ssize_t ob_refcnt;
PyTypeObject *ob_type;

so declaring it as a statement macro should allow clang-format to process it correctly. (though this is a variable-like macro, so I need to check if this case is supported)

PyObject_HEAD_INIT ends with a comma, which is not supported yet. We would need to add another kind of macro to support that case.
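
The variable-like case can be illustrated with a small sketch. `MY_OBJECT_HEAD` is a hypothetical stand-in for PyObject_HEAD with simplified member types, not CPython's real definition, and the `StatementMacros` entry in the comment is an assumed configuration. Whether a macro without parentheses is handled is, as noted above, something that still needs checking.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for PyObject_HEAD (not CPython's definition):
// a variable-like macro that expands to complete member declarations,
// each already terminated by ';'.
#define MY_OBJECT_HEAD                                                         \
  std::ptrdiff_t ob_refcnt;                                                    \
  const char *ob_type_name;

// Assumed .clang-format entry:
//   StatementMacros: ['MY_OBJECT_HEAD']
struct MyObject {
  MY_OBJECT_HEAD
  int value;
};

// Usage check: the macro members come first in the aggregate.
void checkObjectHead() {
  MyObject Obj{1, "my_type", 42};
  assert(Obj.ob_refcnt == 1);
  assert(Obj.value == 42);
}
```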

Address review comments

Typz marked an inline comment as done. Nov 9 2017, 5:41 AM

ping?

Use StatementMacro detection to improve brace type detection heuristics (in UnwrappedLineParser::calculateBraceTypes).

Harbormaster completed remote builds in B14613: Diff 132837. Feb 5 2018, 8:48 AM

alexfh added a subscriber: alexfh. Mar 23 2018, 5:14 AM

alexfh added inline comments.

lib/Format/Format.cpp
694–695	What's the reason to have these in the LLVM style? The macros aren't used in LLVM code.
unittests/Format/FormatTest.cpp
2476	nit: Add a trailing period.

Rebase on latest master

Herald added a subscriber: mgrang. May 16 2018, 2:38 AM

There are still outstanding comments.

Address review comments

Harbormaster completed remote builds in B18249: Diff 147293. May 17 2018, 4:51 AM

Typz marked an inline comment as done. May 17 2018, 4:51 AM

Typz added inline comments.

lib/Format/Format.cpp
694–695	This is similar to the default foreach macros (foreach, Q_FOREACH and BOOST_FOREACH): the macros are added here so that they are the default values for any style, since LLVM style is both the default style and the base style for any other style.
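
The reasoning here, that defaults placed in the base style reach every other style, can be sketched with a reduced model. `SketchStyle`, `getBaseStyle` and `getDerivedStyle` are hypothetical names for this illustration and do not reproduce clang-format's `FormatStyle` or its style getters.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, heavily reduced stand-in for a formatting style.
struct SketchStyle {
  unsigned ColumnLimit = 80;
  std::vector<std::string> StatementMacros;
};

// The base style carries the default statement macros.
SketchStyle getBaseStyle() {
  SketchStyle S;
  S.StatementMacros.push_back("Q_UNUSED");
  S.StatementMacros.push_back("QT_REQUIRE_VERSION");
  return S;
}

// A derived style starts from the base and only overrides what differs,
// so it inherits the macro list without naming it.
SketchStyle getDerivedStyle() {
  SketchStyle S = getBaseStyle();
  S.ColumnLimit = 100;
  return S;
}

// Usage check: the derived style still knows the default macros.
void checkDerivedStyle() {
  SketchStyle S = getDerivedStyle();
  assert(S.ColumnLimit == 100);
  assert(S.StatementMacros.size() == 2);
}
```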

Regenerate documentation

Herald added a subscriber: acoomans. Jul 31 2018, 9:50 AM

Harbormaster completed remote builds in B20911: Diff 158308. Jul 31 2018, 9:50 AM

krasimir accepted this revision. Sep 18 2018, 2:50 AM

This revision is now accepted and ready to land. Sep 18 2018, 2:50 AM

Closed by commit rL343602: clang-format: better handle statement macros (authored by Typz). Oct 2 2018, 9:41 AM

This revision was automatically updated to reflect the committed changes.

Herald added a subscriber: llvm-commits. Oct 2 2018, 9:41 AM

@Typz
I think this should be part of the release notes for v8.
This changes the output on some code bases, and it is a new feature.

By the way, an example in the doc would be great (including the configuration).

Thanks!

sangwoo.joh added a subscriber: sangwoo.joh. Apr 22 2019, 9:57 PM

Herald added a project: Restricted Project. Apr 22 2019, 9:57 PM

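Revision Contents

Path                                  Size
include/clang/Format/Format.h         12 lines
lib/Format/Format.cpp                 3 lines
lib/Format/FormatToken.h              1 line
lib/Format/FormatTokenLexer.h         4 lines
lib/Format/FormatTokenLexer.cpp       11 lines
lib/Format/UnwrappedLineParser.h      1 line
lib/Format/UnwrappedLineParser.cpp    23 lines
unittests/Format/FormatTest.cpp       45 lines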

Diff 147024

include/clang/Format/Format.h

Show First 20 Lines • Show All 1,024 Lines • ▼ Show 20 Lines	struct FormatStyle {
/// In the .clang-format configuration file, this can be configured like:		/// In the .clang-format configuration file, this can be configured like:
/// \code{.yaml}		/// \code{.yaml}
/// ForEachMacros: ['RANGES_FOR', 'FOREACH']		/// ForEachMacros: ['RANGES_FOR', 'FOREACH']
/// \endcode		/// \endcode
///		///
/// For example: BOOST_FOREACH.		/// For example: BOOST_FOREACH.
std::vector<std::string> ForEachMacros;		std::vector<std::string> ForEachMacros;

		/// A vector of macros that should be interpreted as complete
		/// statements.
		///
		/// Typical macros are expressions, and require a semi-colon to be
		/// added; sometimes this is not the case, and this allows to make
		/// clang-format aware of such cases.
		///
		/// For example: Q_UNUSED
		std::vector<std::string> StatementMacros;

tooling::IncludeStyle IncludeStyle;		tooling::IncludeStyle IncludeStyle;

/// Indent case labels one level from the switch statement.		/// Indent case labels one level from the switch statement.
///		///
/// When ``false``, use the same indentation level as for the switch statement.		/// When ``false``, use the same indentation level as for the switch statement.
/// Switch statement body is always indented one level more than case labels.		/// Switch statement body is always indented one level more than case labels.
/// \code		/// \code
/// false: true:		/// false: true:
▲ Show 20 Lines • Show All 687 Lines • ▼ Show 20 Lines	return AccessModifierOffset == R.AccessModifierOffset &&
SpaceInEmptyParentheses == R.SpaceInEmptyParentheses &&		SpaceInEmptyParentheses == R.SpaceInEmptyParentheses &&
SpacesBeforeTrailingComments == R.SpacesBeforeTrailingComments &&		SpacesBeforeTrailingComments == R.SpacesBeforeTrailingComments &&
SpacesInAngles == R.SpacesInAngles &&		SpacesInAngles == R.SpacesInAngles &&
SpacesInContainerLiterals == R.SpacesInContainerLiterals &&		SpacesInContainerLiterals == R.SpacesInContainerLiterals &&
SpacesInCStyleCastParentheses == R.SpacesInCStyleCastParentheses &&		SpacesInCStyleCastParentheses == R.SpacesInCStyleCastParentheses &&
SpacesInParentheses == R.SpacesInParentheses &&		SpacesInParentheses == R.SpacesInParentheses &&
SpacesInSquareBrackets == R.SpacesInSquareBrackets &&		SpacesInSquareBrackets == R.SpacesInSquareBrackets &&
Standard == R.Standard && TabWidth == R.TabWidth &&		Standard == R.Standard && TabWidth == R.TabWidth &&
UseTab == R.UseTab;		StatementMacros == R.StatementMacros && UseTab == R.UseTab;
}		}

llvm::Optional<FormatStyle> GetLanguageStyle(LanguageKind Language) const;		llvm::Optional<FormatStyle> GetLanguageStyle(LanguageKind Language) const;

// Stores per-language styles. A FormatStyle instance inside has an empty		// Stores per-language styles. A FormatStyle instance inside has an empty
// StyleSet. A FormatStyle instance returned by the Get method has its		// StyleSet. A FormatStyle instance returned by the Get method has its
// StyleSet set to a copy of the originating StyleSet, effectively keeping the		// StyleSet set to a copy of the originating StyleSet, effectively keeping the
// internal representation of that StyleSet alive.		// internal representation of that StyleSet alive.
▲ Show 20 Lines • Show All 250 Lines • Show Last 20 Lines

lib/Format/Format.cpp

Show First 20 Lines • Show All 440 Lines • ▼ Show 20 Lines	static void mapping(IO &IO, FormatStyle &Style) {
IO.mapOptional("SpacesInAngles", Style.SpacesInAngles);		IO.mapOptional("SpacesInAngles", Style.SpacesInAngles);
IO.mapOptional("SpacesInContainerLiterals",		IO.mapOptional("SpacesInContainerLiterals",
Style.SpacesInContainerLiterals);		Style.SpacesInContainerLiterals);
IO.mapOptional("SpacesInCStyleCastParentheses",		IO.mapOptional("SpacesInCStyleCastParentheses",
Style.SpacesInCStyleCastParentheses);		Style.SpacesInCStyleCastParentheses);
IO.mapOptional("SpacesInParentheses", Style.SpacesInParentheses);		IO.mapOptional("SpacesInParentheses", Style.SpacesInParentheses);
IO.mapOptional("SpacesInSquareBrackets", Style.SpacesInSquareBrackets);		IO.mapOptional("SpacesInSquareBrackets", Style.SpacesInSquareBrackets);
IO.mapOptional("Standard", Style.Standard);		IO.mapOptional("Standard", Style.Standard);
		IO.mapOptional("StatementMacros", Style.StatementMacros);
IO.mapOptional("TabWidth", Style.TabWidth);		IO.mapOptional("TabWidth", Style.TabWidth);
IO.mapOptional("UseTab", Style.UseTab);		IO.mapOptional("UseTab", Style.UseTab);
}		}
};		};

template <> struct MappingTraits<FormatStyle::BraceWrappingFlags> {		template <> struct MappingTraits<FormatStyle::BraceWrappingFlags> {
static void mapping(IO &IO, FormatStyle::BraceWrappingFlags &Wrapping) {		static void mapping(IO &IO, FormatStyle::BraceWrappingFlags &Wrapping) {
IO.mapOptional("AfterClass", Wrapping.AfterClass);		IO.mapOptional("AfterClass", Wrapping.AfterClass);
▲ Show 20 Lines • Show All 228 Lines • ▼ Show 20 Lines	FormatStyle getLLVMStyle() {
LLVMStyle.PenaltyExcessCharacter = 1000000;		LLVMStyle.PenaltyExcessCharacter = 1000000;
LLVMStyle.PenaltyReturnTypeOnItsOwnLine = 60;		LLVMStyle.PenaltyReturnTypeOnItsOwnLine = 60;
LLVMStyle.PenaltyBreakBeforeFirstCallParameter = 19;		LLVMStyle.PenaltyBreakBeforeFirstCallParameter = 19;
LLVMStyle.PenaltyBreakTemplateDeclaration = prec::Relational;		LLVMStyle.PenaltyBreakTemplateDeclaration = prec::Relational;

LLVMStyle.DisableFormat = false;		LLVMStyle.DisableFormat = false;
LLVMStyle.SortIncludes = true;		LLVMStyle.SortIncludes = true;
LLVMStyle.SortUsingDeclarations = true;		LLVMStyle.SortUsingDeclarations = true;
		LLVMStyle.StatementMacros.push_back("Q_UNUSED");
		LLVMStyle.StatementMacros.push_back("QT_REQUIRE_VERSION");
Inline comment (alexfh): What's the reason to have these in the LLVM style? The macros aren't used in LLVM code.
Inline comment (Typz): This is similar to the default foreach macros (foreach, Q_FOREACH and BOOST_FOREACH): the macros are added here so that they are the default values for any style, since LLVM style is both the default style and the base style for any other style.

return LLVMStyle;		return LLVMStyle;
}		}

FormatStyle getGoogleStyle(FormatStyle::LanguageKind Language) {		FormatStyle getGoogleStyle(FormatStyle::LanguageKind Language) {
if (Language == FormatStyle::LK_TextProto) {		if (Language == FormatStyle::LK_TextProto) {
FormatStyle GoogleStyle = getGoogleStyle(FormatStyle::LK_Proto);		FormatStyle GoogleStyle = getGoogleStyle(FormatStyle::LK_Proto);
GoogleStyle.Language = FormatStyle::LK_TextProto;		GoogleStyle.Language = FormatStyle::LK_TextProto;
▲ Show 20 Lines • Show All 1,509 Lines • Show Last 20 Lines

lib/Format/FormatToken.h

Show First 20 Lines • Show All 80 Lines • ▼ Show 20 Lines	#define LIST_TOKEN_TYPES \
TYPE(OverloadedOperator) \		TYPE(OverloadedOperator) \
TYPE(OverloadedOperatorLParen) \		TYPE(OverloadedOperatorLParen) \
TYPE(PointerOrReference) \		TYPE(PointerOrReference) \
TYPE(PureVirtualSpecifier) \		TYPE(PureVirtualSpecifier) \
TYPE(RangeBasedForLoopColon) \		TYPE(RangeBasedForLoopColon) \
TYPE(RegexLiteral) \		TYPE(RegexLiteral) \
TYPE(SelectorName) \		TYPE(SelectorName) \
TYPE(StartOfName) \		TYPE(StartOfName) \
		TYPE(StatementMacro) \
TYPE(StructuredBindingLSquare) \		TYPE(StructuredBindingLSquare) \
TYPE(TemplateCloser) \		TYPE(TemplateCloser) \
TYPE(TemplateOpener) \		TYPE(TemplateOpener) \
TYPE(TemplateString) \		TYPE(TemplateString) \
TYPE(ProtoExtensionLSquare) \		TYPE(ProtoExtensionLSquare) \
TYPE(TrailingAnnotation) \		TYPE(TrailingAnnotation) \
TYPE(TrailingReturnArrow) \		TYPE(TrailingReturnArrow) \
TYPE(TrailingUnaryOperator) \		TYPE(TrailingUnaryOperator) \
▲ Show 20 Lines • Show All 699 Lines • Show Last 20 Lines

lib/Format/FormatTokenLexer.h

Show All 16 Lines
#define LLVM_CLANG_LIB_FORMAT_FORMATTOKENLEXER_H		#define LLVM_CLANG_LIB_FORMAT_FORMATTOKENLEXER_H

#include "Encoding.h"		#include "Encoding.h"
#include "FormatToken.h"		#include "FormatToken.h"
#include "clang/Basic/SourceLocation.h"		#include "clang/Basic/SourceLocation.h"
#include "clang/Basic/SourceManager.h"		#include "clang/Basic/SourceManager.h"
#include "clang/Format/Format.h"		#include "clang/Format/Format.h"
#include "llvm/Support/Regex.h"		#include "llvm/Support/Regex.h"
		#include "llvm/ADT/MapVector.h"
Inline comment (djasper): Use ".

#include <stack>		#include <stack>

namespace clang {		namespace clang {
namespace format {		namespace format {

enum LexerState {		enum LexerState {
NORMAL,		NORMAL,
▲ Show 20 Lines • Show All 61 Lines • ▼ Show 20 Lines	private:
const FormatStyle &Style;		const FormatStyle &Style;
IdentifierTable IdentTable;		IdentifierTable IdentTable;
AdditionalKeywords Keywords;		AdditionalKeywords Keywords;
encoding::Encoding Encoding;		encoding::Encoding Encoding;
llvm::SpecificBumpPtrAllocator<FormatToken> Allocator;		llvm::SpecificBumpPtrAllocator<FormatToken> Allocator;
// Index (in 'Tokens') of the last token that starts a new line.		// Index (in 'Tokens') of the last token that starts a new line.
unsigned FirstInLineIndex;		unsigned FirstInLineIndex;
SmallVector<FormatToken *, 16> Tokens;		SmallVector<FormatToken *, 16> Tokens;
SmallVector<IdentifierInfo *, 8> ForEachMacros;
		llvm::SmallMapVector<IdentifierInfo *, TokenType, 8> Macros;
Inline comment (djasper): What does this class do and why do we need it? Describe its purpose in a comment.
Inline comment (djasper): nullptr
bool FormattingDisabled;		bool FormattingDisabled;

llvm::Regex MacroBlockBeginRegex;		llvm::Regex MacroBlockBeginRegex;
llvm::Regex MacroBlockEndRegex;		llvm::Regex MacroBlockEndRegex;

void readRawToken(FormatToken &Tok);		void readRawToken(FormatToken &Tok);

Inline comment (djasper): Are all of these used?
void resetLexer(unsigned Offset);		void resetLexer(unsigned Offset);
};		};

} // namespace format		} // namespace format
} // namespace clang		} // namespace clang

#endif		#endif

lib/Format/FormatTokenLexer.cpp

Show All 31 Lines	: FormatTok(nullptr), IsFirstToken(true), StateStack({LexerState::NORMAL}),
Keywords(IdentTable), Encoding(Encoding), FirstInLineIndex(0),		Keywords(IdentTable), Encoding(Encoding), FirstInLineIndex(0),
FormattingDisabled(false), MacroBlockBeginRegex(Style.MacroBlockBegin),		FormattingDisabled(false), MacroBlockBeginRegex(Style.MacroBlockBegin),
MacroBlockEndRegex(Style.MacroBlockEnd) {		MacroBlockEndRegex(Style.MacroBlockEnd) {
Lex.reset(new Lexer(ID, SourceMgr.getBuffer(ID), SourceMgr,		Lex.reset(new Lexer(ID, SourceMgr.getBuffer(ID), SourceMgr,
getFormattingLangOpts(Style)));		getFormattingLangOpts(Style)));
Lex->SetKeepWhitespaceMode(true);		Lex->SetKeepWhitespaceMode(true);

for (const std::string &ForEachMacro : Style.ForEachMacros)		for (const std::string &ForEachMacro : Style.ForEachMacros)
ForEachMacros.push_back(&IdentTable.get(ForEachMacro));		Macros.insert({&IdentTable.get(ForEachMacro), TT_ForEachMacro});
llvm::sort(ForEachMacros.begin(), ForEachMacros.end());		for (const std::string &StatementMacro : Style.StatementMacros)
		Macros.insert({&IdentTable.get(StatementMacro), TT_StatementMacro});
}		}

ArrayRef<FormatToken *> FormatTokenLexer::lex() {		ArrayRef<FormatToken *> FormatTokenLexer::lex() {
assert(Tokens.empty());		assert(Tokens.empty());
assert(FirstInLineIndex == 0);		assert(FirstInLineIndex == 0);
do {		do {
Tokens.push_back(getNextToken());		Tokens.push_back(getNextToken());
if (Style.Language == FormatStyle::LK_JavaScript) {		if (Style.Language == FormatStyle::LK_JavaScript) {
▲ Show 20 Lines • Show All 602 Lines • ▼ Show 20 Lines	if (FirstNewlinePos == StringRef::npos) {
// The last line of the token always starts in column 0.		// The last line of the token always starts in column 0.
// Thus, the length can be precomputed even in the presence of tabs.		// Thus, the length can be precomputed even in the presence of tabs.
FormatTok->LastLineColumnWidth = encoding::columnWidthWithTabs(		FormatTok->LastLineColumnWidth = encoding::columnWidthWithTabs(
Text.substr(Text.find_last_of('\n') + 1), 0, Style.TabWidth, Encoding);		Text.substr(Text.find_last_of('\n') + 1), 0, Style.TabWidth, Encoding);
Column = FormatTok->LastLineColumnWidth;		Column = FormatTok->LastLineColumnWidth;
}		}

if (Style.isCpp()) {		if (Style.isCpp()) {
		auto it = Macros.find(FormatTok->Tok.getIdentifierInfo());
Inline comment (djasper): Just do: auto it = Macros.find(FormatTok->Tok.getIdentifierInfo()); .. I know that this means that we might do the hash look up when we wouldn't need to (when we are actually in a #define), but I think clarity here is more important than the tiny performance benefit.
if (!(Tokens.size() > 0 && Tokens.back()->Tok.getIdentifierInfo() &&		if (!(Tokens.size() > 0 && Tokens.back()->Tok.getIdentifierInfo() &&
Inline comment (krasimir): Please move this inside the following `if` statement, so that we only perform the search when we see a `tok::pp_define`.
Tokens.back()->Tok.getIdentifierInfo()->getPPKeywordID() ==		Tokens.back()->Tok.getIdentifierInfo()->getPPKeywordID() ==
tok::pp_define) &&		tok::pp_define) &&
std::find(ForEachMacros.begin(), ForEachMacros.end(),		it != Macros.end()) {
FormatTok->Tok.getIdentifierInfo()) != ForEachMacros.end()) {		FormatTok->Type = it->second;
Inline comment (djasper): This does a binary search. Why aren't you implementing it with a hashtable?
Inline comment (Typz): It was already done this way, so I did not change it to avoid any impact on performance. But I can change it if you prefer.
Inline comment (djasper): Thanks. This is much better than it was before.
FormatTok->Type = TT_ForEachMacro;
} else if (FormatTok->is(tok::identifier)) {		} else if (FormatTok->is(tok::identifier)) {
if (MacroBlockBeginRegex.match(Text)) {		if (MacroBlockBeginRegex.match(Text)) {
FormatTok->Type = TT_MacroBlockBegin;		FormatTok->Type = TT_MacroBlockBegin;
} else if (MacroBlockEndRegex.match(Text)) {		} else if (MacroBlockEndRegex.match(Text)) {
FormatTok->Type = TT_MacroBlockEnd;		FormatTok->Type = TT_MacroBlockEnd;
}		}
}		}
}		}
▲ Show 20 Lines • Show All 51 Lines • Show Last 20 Lines

lib/Format/UnwrappedLineParser.h

Show First 20 Lines • Show All 119 Lines • ▼ Show 20 Lines	private:
// parses the record as a child block, i.e. if the class declaration is an		// parses the record as a child block, i.e. if the class declaration is an
// expression.		// expression.
void parseRecord(bool ParseAsExpr = false);		void parseRecord(bool ParseAsExpr = false);
void parseObjCProtocolList();		void parseObjCProtocolList();
void parseObjCUntilAtEnd();		void parseObjCUntilAtEnd();
void parseObjCInterfaceOrImplementation();		void parseObjCInterfaceOrImplementation();
bool parseObjCProtocol();		bool parseObjCProtocol();
void parseJavaScriptEs6ImportExport();		void parseJavaScriptEs6ImportExport();
		void parseStatementMacro();
bool tryToParseLambda();		bool tryToParseLambda();
bool tryToParseLambdaIntroducer();		bool tryToParseLambdaIntroducer();
void tryToParseJSFunction();		void tryToParseJSFunction();
void addUnwrappedLine();		void addUnwrappedLine();
bool eof() const;		bool eof() const;
// LevelDifference is the difference of levels after and before the current		// LevelDifference is the difference of levels after and before the current
// token. For example:		// token. For example:
// - if the token is '{' and opens a block, LevelDifference is 1.		// - if the token is '{' and opens a block, LevelDifference is 1.
▲ Show 20 Lines • Show All 161 Lines • Show Last 20 Lines

lib/Format/UnwrappedLineParser.cpp

Show First 20 Lines • Show All 467 Lines • ▼ Show 20 Lines	case tok::r_brace:
LBraceStack.back()->BlockKind = BK_BracedInit;		LBraceStack.back()->BlockKind = BK_BracedInit;
} else {		} else {
Tok->BlockKind = BK_Block;		Tok->BlockKind = BK_Block;
LBraceStack.back()->BlockKind = BK_Block;		LBraceStack.back()->BlockKind = BK_Block;
}		}
}		}
LBraceStack.pop_back();		LBraceStack.pop_back();
break;		break;
		case tok::identifier:
		if (!Tok->is(TT_StatementMacro))
		break;
		LLVM_FALLTHROUGH;
case tok::at:		case tok::at:
case tok::semi:		case tok::semi:
case tok::kw_if:		case tok::kw_if:
case tok::kw_while:		case tok::kw_while:
case tok::kw_for:		case tok::kw_for:
case tok::kw_switch:		case tok::kw_switch:
case tok::kw_try:		case tok::kw_try:
case tok::kw___try:		case tok::kw___try:
▲ Show 20 Lines • Show All 609 Lines • ▼ Show 20 Lines	if (Style.isCpp() &&
Keywords.kw_slots, Keywords.kw_qslots)) {		Keywords.kw_slots, Keywords.kw_qslots)) {
nextToken();		nextToken();
if (FormatTok->is(tok::colon)) {		if (FormatTok->is(tok::colon)) {
nextToken();		nextToken();
addUnwrappedLine();		addUnwrappedLine();
return;		return;
}		}
}		}
		if (Style.isCpp() && FormatTok->is(TT_StatementMacro)) {
		parseStatementMacro();
		return;
		}
// In all other cases, parse the declaration.		// In all other cases, parse the declaration.
break;		break;
default:		default:
break;		break;
}		}
do {		do {
const FormatToken *Previous = FormatTok->Previous;		const FormatToken *Previous = FormatTok->Previous;
switch (FormatTok->Tok.getKind()) {		switch (FormatTok->Tok.getKind()) {
▲ Show 20 Lines • Show All 183 Lines • ▼ Show 20 Lines	case tok::identifier: {
break;		break;
}		}
}		}
parseRecord();		parseRecord();
addUnwrappedLine();		addUnwrappedLine();
return;		return;
}		}

		if (Style.isCpp() && FormatTok->is(TT_StatementMacro)) {
Inline comment (djasper): This contains the exact same code (I think). Can you pull it out into a function?
		parseStatementMacro();
		return;
		}

// See if the following token should start a new unwrapped line.		// See if the following token should start a new unwrapped line.
StringRef Text = FormatTok->TokenText;		StringRef Text = FormatTok->TokenText;
nextToken();		nextToken();
if (Line->Tokens.size() == 1 &&		if (Line->Tokens.size() == 1 &&
// JS doesn't have macros, and within classes colons indicate fields,		// JS doesn't have macros, and within classes colons indicate fields,
// not labels.		// not labels.
Style.Language != FormatStyle::LK_JavaScript) {		Style.Language != FormatStyle::LK_JavaScript) {
if (FormatTok->Tok.is(tok::colon) && !Line->MustBeDeclaration) {		if (FormatTok->Tok.is(tok::colon) && !Line->MustBeDeclaration) {
[...]	if (FormatTok->is(tok::l_brace)) {
nextToken();		nextToken();
parseBracedList();		parseBracedList();
} else {		} else {
nextToken();		nextToken();
}		}
}		}
}		}

		void UnwrappedLineParser::parseStatementMacro()
		{
		nextToken();
		if (FormatTok->is(tok::l_paren))
		parseParens();
		if (FormatTok->is(tok::semi))
		nextToken();
		addUnwrappedLine();
		}

LLVM_ATTRIBUTE_UNUSED static void printDebugInfo(const UnwrappedLine &Line,		LLVM_ATTRIBUTE_UNUSED static void printDebugInfo(const UnwrappedLine &Line,
StringRef Prefix = "") {		StringRef Prefix = "") {
llvm::dbgs() << Prefix << "Line(" << Line.Level		llvm::dbgs() << Prefix << "Line(" << Line.Level
<< ", FSC=" << Line.FirstStartColumn << ")"		<< ", FSC=" << Line.FirstStartColumn << ")"
<< (Line.InPPDirective ? " MACRO" : "") << ": ";		<< (Line.InPPDirective ? " MACRO" : "") << ": ";
for (std::list<UnwrappedLineNode>::const_iterator I = Line.Tokens.begin(),		for (std::list<UnwrappedLineNode>::const_iterator I = Line.Tokens.begin(),
E = Line.Tokens.end();		E = Line.Tokens.end();
I != E; ++I) {		I != E; ++I) {
[...]
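
The effect of `parseStatementMacro()` above can be illustrated outside of clang-format with a small standalone model. This is only a sketch under stated assumptions: `Token`, `Line`, `consumeParens` and `splitLines` are hypothetical names invented for illustration, tokens are plain strings, and none of this is clang-format's real data structures or API. It mirrors the steps in the patch: take the macro name, take an optional parenthesised argument list, take an optional trailing semicolon, then end the line.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical simplified model: a token is just its text, and a "line" is
// the list of tokens that would be formatted together.
using Token = std::string;
using Line = std::vector<Token>;

// Appends a balanced "( ... )" group starting at Tokens[I] to Current and
// advances I past the closing parenthesis.
static void consumeParens(const std::vector<Token> &Tokens, std::size_t &I,
                          Line &Current) {
  assert(I < Tokens.size() && Tokens[I] == "(");
  int Depth = 0;
  while (I < Tokens.size()) {
    const Token &T = Tokens[I];
    Current.push_back(T);
    ++I;
    if (T == "(")
      ++Depth;
    else if (T == ")" && --Depth == 0)
      break;
  }
}

// Splits a token stream into lines. A line normally ends at ';'. A token
// listed in StatementMacros ends the line after its optional argument list
// and optional ';', which is the behaviour this sketch tries to imitate.
std::vector<Line> splitLines(const std::vector<Token> &Tokens,
                             const std::set<Token> &StatementMacros) {
  std::vector<Line> Lines;
  Line Current;
  auto Flush = [&] {
    if (!Current.empty()) {
      Lines.push_back(Current);
      Current.clear();
    }
  };

  std::size_t I = 0;
  while (I < Tokens.size()) {
    if (StatementMacros.count(Tokens[I])) {
      Current.push_back(Tokens[I++]);              // macro name
      if (I < Tokens.size() && Tokens[I] == "(")
        consumeParens(Tokens, I, Current);         // optional arguments
      if (I < Tokens.size() && Tokens[I] == ";")
        Current.push_back(Tokens[I++]);            // optional semicolon
      Flush();                                     // the macro ends the line
      continue;
    }
    Current.push_back(Tokens[I]);
    if (Tokens[I] == ";")
      Flush();
    ++I;
  }
  Flush();
  return Lines;
}
```

With `Q_UNUSED` registered as a statement macro, the token stream for `Q_UNUSED(a) return b;` is split into two lines in this model; with an empty macro set the same tokens stay on one line, which corresponds to the merged output described earlier in the review.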

unittests/Format/FormatTest.cpp

[...]	TEST_F(FormatTest, FormatsCompactNamespaces) {
EXPECT_EQ("namespace out { namespace in {\n"		EXPECT_EQ("namespace out { namespace in {\n"
"}} // namespace out::in",		"}} // namespace out::in",
format("namespace out {\n"		format("namespace out {\n"
"namespace in {\n"		"namespace in {\n"
"} // namespace in\n"		"} // namespace in\n"
"} // namespace out",		"} // namespace out",
Style));		Style));

// Only namespaces which have both consecutive opening and end get compacted		// Only namespaces which have both consecutive opening and end get compacted
djasper: What's the difference here?
krasimir: Hm, what would happen if you have a namespace macro with two or more parameters?
Typz (author): Only the first argument is used, as seen in the previous test case.
EXPECT_EQ("namespace out {\n"		EXPECT_EQ("namespace out {\n"
"namespace in1 {\n"		"namespace in1 {\n"
"} // namespace in1\n"		"} // namespace in1\n"
"namespace in2 {\n"		"namespace in2 {\n"
"} // namespace in2\n"		"} // namespace in2\n"
"} // namespace out",		"} // namespace out",
format("namespace out {\n"		format("namespace out {\n"
"namespace in1 {\n"		"namespace in1 {\n"
[...]	EXPECT_EQ("class SomeClass {\n"
format("class SomeClass {\n"		format("class SomeClass {\n"
"public:\n"		"public:\n"
" SomeClass()\n"		" SomeClass()\n"
" EXCLUSIVE_LOCK_FUNCTION(mu_);\n"		" EXCLUSIVE_LOCK_FUNCTION(mu_);\n"
"};",		"};",
getLLVMStyleWithColumns(40)));		getLLVMStyleWithColumns(40)));

verifyFormat("MACRO(>)");		verifyFormat("MACRO(>)");

		// Some macros contain an implicit semicolon
alexfh: nit: Add a trailing period.
		FormatStyle Style = getLLVMStyle();
		Style.StatementMacros.push_back("FOO");
		verifyFormat("FOO(a) int b = 0;");
		verifyFormat("FOO(a)\n"
		"int b = 0;",
		Style);
		verifyFormat("FOO(a);\n"
		"int b = 0;",
		Style);
		verifyFormat("FOO(argc, argv, \"4.0.2\")\n"
		"int b = 0;",
		Style);
		verifyFormat("FOO()\n"
		"int b = 0;",
		Style);
		verifyFormat("FOO\n"
		"int b = 0;",
		Style);
krasimir: Please add tests where:
- the macro occurs inside a function body, thus having nontrivial indentation
- there are two macros one after another
- there is some code before the macro (on the same line + on a previous line with the same level)
		verifyFormat("void f() {\n"
		" FOO(a)\n"
		" return a;\n"
		"}",
		Style);
		verifyFormat("FOO(a)\n"
		"FOO(b)",
		Style);
		verifyFormat("int a = 0;\n"
		"FOO(b)\n"
		"int c = 0;",
		Style);
		verifyFormat("int a = 0;\n"
		"int x = FOO(a)\n"
		"int b = 0;",
		Style);
		verifyFormat("void foo(int a) { FOO(a) }\n"
		"uint32_t bar() {}",
		Style);
}		}

TEST_F(FormatTest, LayoutMacroDefinitionsStatementsSpanningBlocks) {		TEST_F(FormatTest, LayoutMacroDefinitionsStatementsSpanningBlocks) {
verifyFormat("#define A \\\n"		verifyFormat("#define A \\\n"
" f({ \\\n"		" f({ \\\n"
" g(); \\\n"		" g(); \\\n"
" });",		" });",
getLLVMStyleWithColumns(11));		getLLVMStyleWithColumns(11));
[...]	TEST_F(FormatTest, ParsesConfiguration) {
BoostForeach.push_back("BOOST_FOREACH");		BoostForeach.push_back("BOOST_FOREACH");
CHECK_PARSE("ForEachMacros: [BOOST_FOREACH]", ForEachMacros, BoostForeach);		CHECK_PARSE("ForEachMacros: [BOOST_FOREACH]", ForEachMacros, BoostForeach);
std::vector<std::string> BoostAndQForeach;		std::vector<std::string> BoostAndQForeach;
BoostAndQForeach.push_back("BOOST_FOREACH");		BoostAndQForeach.push_back("BOOST_FOREACH");
BoostAndQForeach.push_back("Q_FOREACH");		BoostAndQForeach.push_back("Q_FOREACH");
CHECK_PARSE("ForEachMacros: [BOOST_FOREACH, Q_FOREACH]", ForEachMacros,		CHECK_PARSE("ForEachMacros: [BOOST_FOREACH, Q_FOREACH]", ForEachMacros,
BoostAndQForeach);		BoostAndQForeach);

		Style.StatementMacros.clear();
		CHECK_PARSE("StatementMacros: [QUNUSED]", StatementMacros,
		std::vector<std::string>{"QUNUSED"});
		CHECK_PARSE("StatementMacros: [QUNUSED, QT_REQUIRE_VERSION]", StatementMacros,
		std::vector<std::string>({"QUNUSED", "QT_REQUIRE_VERSION"}));

Style.IncludeStyle.IncludeCategories.clear();		Style.IncludeStyle.IncludeCategories.clear();
std::vector<tooling::IncludeStyle::IncludeCategory> ExpectedCategories = {		std::vector<tooling::IncludeStyle::IncludeCategory> ExpectedCategories = {
{"abc/.", 2}, {".", 1}};		{"abc/.", 2}, {".", 1}};
CHECK_PARSE("IncludeCategories:\n"		CHECK_PARSE("IncludeCategories:\n"
" - Regex: abc/.*\n"		" - Regex: abc/.*\n"
" Priority: 2\n"		" Priority: 2\n"
" - Regex: .*\n"		" - Regex: .*\n"
" Priority: 1",		" Priority: 1",
[...]

Revision Contents

Diff 147024

include/clang/Format/Format.h

lib/Format/Format.cpp

lib/Format/FormatToken.h

lib/Format/FormatTokenLexer.h

lib/Format/FormatTokenLexer.cpp

lib/Format/UnwrappedLineParser.h

lib/Format/UnwrappedLineParser.cpp

unittests/Format/FormatTest.cpp
