Diff 206246

clang/include/clang/Tooling/Syntax/Tokens.h

[154 lines hidden]
/// tokens for each of the files can be obtained via spelledTokens(FileID).		/// tokens for each of the files can be obtained via spelledTokens(FileID).
///		///
/// To map between the expanded and spelled tokens use findSpelledByExpanded().		/// To map between the expanded and spelled tokens use findSpelledByExpanded().
///		///
/// To build a token buffer use the TokenCollector class. You can also compute		/// To build a token buffer use the TokenCollector class. You can also compute
/// the spelled tokens of a file using the tokenize() helper.		/// the spelled tokens of a file using the tokenize() helper.
///		///
/// FIXME: allow to map from spelled to expanded tokens when use-case shows up.		/// FIXME: allow to map from spelled to expanded tokens when use-case shows up.
		/// FIXME: allow mappings into macro arguments.
class TokenBuffer {		class TokenBuffer {
public:		public:
TokenBuffer(const SourceManager &SourceMgr) : SourceMgr(&SourceMgr) {}		TokenBuffer(const SourceManager &SourceMgr) : SourceMgr(&SourceMgr) {}
/// All tokens produced by the preprocessor after all macro replacements,		/// All tokens produced by the preprocessor after all macro replacements,
/// directives, etc. Source locations found in the clang AST will always		/// directives, etc. Source locations found in the clang AST will always
/// point to one of these tokens.		/// point to one of these tokens.
/// FIXME: figure out how to handle token splitting, e.g. '>>' can be split		/// FIXME: figure out how to handle token splitting, e.g. '>>' can be split
/// into two '>' tokens by the parser. However, TokenBuffer currently		/// into two '>' tokens by the parser. However, TokenBuffer currently
[51 lines hidden]	public:
/// Lexed tokens of a file before preprocessing. E.g. for the following input		/// Lexed tokens of a file before preprocessing. E.g. for the following input
/// #define DECL(name) int name = 10		/// #define DECL(name) int name = 10
/// DECL(a);		/// DECL(a);
/// spelledTokens() returns {"#", "define", "DECL", "(", "name", ")", "eof"}.		/// spelledTokens() returns {"#", "define", "DECL", "(", "name", ")", "eof"}.
/// FIXME: we do not yet store tokens of directives, like #include, #define,		/// FIXME: we do not yet store tokens of directives, like #include, #define,
/// #pragma, etc.		/// #pragma, etc.
llvm::ArrayRef<syntax::Token> spelledTokens(FileID FID) const;		llvm::ArrayRef<syntax::Token> spelledTokens(FileID FID) const;

		const SourceManager &sourceManager() const { return *SourceMgr; }

std::string dumpForTests() const;		std::string dumpForTests() const;

private:		private:
/// Describes a mapping between a continuous subrange of spelled tokens and		/// Describes a mapping between a continuous subrange of spelled tokens and
/// expanded tokens. Represents macro expansions, preprocessor directives,		/// expanded tokens. Represents macro expansions, preprocessor directives,
/// conditionally disabled pp regions, etc.		/// conditionally disabled pp regions, etc.
/// #define FOO 1+2		/// #define FOO 1+2
/// #define BAR(a) a + 1		/// #define BAR(a) a + 1
[67 lines hidden]	public:
/// CreateASTConsumer().		/// CreateASTConsumer().
TokenCollector(Preprocessor &P);		TokenCollector(Preprocessor &P);

/// Finalizes token collection. Should be called after preprocessing is		/// Finalizes token collection. Should be called after preprocessing is
/// finished, i.e. after running Execute().		/// finished, i.e. after running Execute().
LLVM_NODISCARD TokenBuffer consume() &&;		LLVM_NODISCARD TokenBuffer consume() &&;

private:		private:
		/// Maps from a start to an end spelling location of transformations
		sammccall [Done]: spelled locations!
		/// performed by the preprocessor. These include:
		/// 1. range from '#' to the last token in the line for PP directives,
		/// 2. macro name and arguments for macro expansions.
		sammccall [Done]: There are now 4 things called mappings, and I can't understand how they relate to each other. I think this needs new names and/or concepts.
		ilya-biryukov (author) [Done]: Renamed to `PPExpansions`.
		/// Note that we record only top-level macro expansions, intermediate
		sammccall [Done]: Do I understand right that this is logically a stack, but it's hard to know when to pop, or just less hassle to do it this way? If so, maybe worth mentioning.
		ilya-biryukov (author) [Done]: That's exactly the case, but the preprocessor only exposes the point at which we push macros to the stack (`PPCallbacks::MacroExpands`, etc.) and not the points when we pop from the stack. This map is an attempt to recover the pop positions (e.g. to detect intermediate expansions in the macro arguments). Added a comment.
		/// expansions (e.g. inside macro arguments) are ignored.
		///
		/// Used to find correct boundaries of macro calls and directives when
		/// building mappings from spelled to expanded tokens.
		///
		/// Logically, at each point of the preprocessor execution there is a stack of
		/// macro expansions being processed and we could use it to recover the
		/// location information we need. However, the public preprocessor API only
		/// exposes the points when macro expansions start (when we push a macro onto
		/// the stack) and not when they end (when we pop a macro from the stack).
		/// To work around this limitation, we rely on source location information
		/// stored in this map.
		using PPExpansions = llvm::DenseMap</*SourceLocation*/ int, SourceLocation>;
class Builder;		class Builder;
		class CollectPPExpansions;

std::vector<syntax::Token> Expanded;		std::vector<syntax::Token> Expanded;
		// FIXME: we only store macro expansions, also add directives(#pragma, etc.)
		PPExpansions Expansions;
Preprocessor &PP;		Preprocessor &PP;
		CollectPPExpansions *Collector;
		sammccall [Done]: Give the class and member more descriptive names?
		ilya-biryukov (author) [Done]: Renamed to `Expansions`.
};		};

} // namespace syntax		} // namespace syntax
} // namespace clang		} // namespace clang

#endif		#endif
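
Note on the `PPExpansions` comment above: the rule it describes (keep only top-level expansions, skip those nested in macro bodies or arguments) can be illustrated without clang. The following is a minimal sketch, not the patch's code: `Offset`, `TopLevelExpansionRecorder`, and the `NameSpelledInFile` flag are hypothetical stand-ins for `SourceLocation`, `CollectPPExpansions`, and `MacroNameTok.getLocation().isFileID()`, and plain offset comparison stands in for `isBeforeInTranslationUnit`.

```cpp
#include <cassert>
#include <map>

// Hypothetical stand-in for clang::SourceLocation: a plain offset into one file.
using Offset = unsigned;

// Illustrative recorder; the names are assumptions and not part of the patch.
struct TopLevelExpansionRecorder {
  // Start offset of a macro call -> offset of its last token.
  std::map<Offset, Offset> Expansions;
  bool SeenExpansion = false;
  Offset LastExpansionEnd = 0;

  // Models the "push" event reported by PPCallbacks::MacroExpands.
  // NameSpelledInFile is false when the macro name comes from a macro body.
  void macroExpands(Offset Begin, Offset End, bool NameSpelledInFile) {
    assert(Begin <= End && "expansion range must not be reversed");
    if (!NameSpelledInFile)
      return; // Use inside a macro body: not top-level.
    if (SeenExpansion && Begin < LastExpansionEnd)
      return; // Starts before the previous top-level call ended: an argument.
    Expansions[Begin] = End;
    LastExpansionEnd = End;
    SeenExpansion = true;
  }
};
```

Because no "pop" event is available, the end of the last recorded top-level call is the only state needed to reject expansions that start inside its arguments.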

clang/lib/Tooling/Syntax/Tokens.cpp

//===- Tokens.cpp - collect tokens from preprocessing ---------------------===//		//===- Tokens.cpp - collect tokens from preprocessing ---------------------===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
#include "clang/Tooling/Syntax/Tokens.h"		#include "clang/Tooling/Syntax/Tokens.h"

#include "clang/Basic/Diagnostic.h"		#include "clang/Basic/Diagnostic.h"
#include "clang/Basic/IdentifierTable.h"		#include "clang/Basic/IdentifierTable.h"
#include "clang/Basic/LLVM.h"		#include "clang/Basic/LLVM.h"
#include "clang/Basic/LangOptions.h"		#include "clang/Basic/LangOptions.h"
#include "clang/Basic/SourceLocation.h"		#include "clang/Basic/SourceLocation.h"
#include "clang/Basic/SourceManager.h"		#include "clang/Basic/SourceManager.h"
#include "clang/Basic/TokenKinds.h"		#include "clang/Basic/TokenKinds.h"
		#include "clang/Lex/PPCallbacks.h"
#include "clang/Lex/Preprocessor.h"		#include "clang/Lex/Preprocessor.h"
#include "clang/Lex/Token.h"		#include "clang/Lex/Token.h"
#include "llvm/ADT/ArrayRef.h"		#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/None.h"		#include "llvm/ADT/None.h"
#include "llvm/ADT/Optional.h"		#include "llvm/ADT/Optional.h"
#include "llvm/ADT/STLExtras.h"		#include "llvm/ADT/STLExtras.h"
#include "llvm/Support/Debug.h"		#include "llvm/Support/Debug.h"
#include "llvm/Support/ErrorHandling.h"		#include "llvm/Support/ErrorHandling.h"
[40 lines hidden]
}		}

llvm::raw_ostream &syntax::operator<<(llvm::raw_ostream &OS, const Token &T) {		llvm::raw_ostream &syntax::operator<<(llvm::raw_ostream &OS, const Token &T) {
return OS << T.str();		return OS << T.str();
}		}

FileRange::FileRange(FileID File, unsigned BeginOffset, unsigned EndOffset)		FileRange::FileRange(FileID File, unsigned BeginOffset, unsigned EndOffset)
: File(File), Begin(BeginOffset), End(EndOffset) {		: File(File), Begin(BeginOffset), End(EndOffset) {
assert(File.isValid());		assert(File.isValid());
assert(BeginOffset <= EndOffset);		assert(BeginOffset <= EndOffset);
}		}

FileRange::FileRange(const SourceManager &SM, SourceLocation BeginLoc,		FileRange::FileRange(const SourceManager &SM, SourceLocation BeginLoc,
unsigned Length) {		unsigned Length) {
assert(BeginLoc.isValid());		assert(BeginLoc.isValid());
assert(BeginLoc.isFileID());		assert(BeginLoc.isFileID());

std::tie(File, Begin) = SM.getDecomposedLoc(BeginLoc);		std::tie(File, Begin) = SM.getDecomposedLoc(BeginLoc);
[164 lines hidden]	while (!L.LexFromRawLexer(T))
AddToken(T);		AddToken(T);
// 'eof' is only the last token if the input is null-terminated. Never store		// 'eof' is only the last token if the input is null-terminated. Never store
// it, for consistency.		// it, for consistency.
if (T.getKind() != tok::eof)		if (T.getKind() != tok::eof)
AddToken(T);		AddToken(T);
return Tokens;		return Tokens;
}		}

		/// Records information required to construct mappings for the token buffer that
		sammccall [Done]: What is this class for, what does it do?
		ilya-biryukov (author) [Done]: Added a comment.
		/// we are collecting.
		class TokenCollector::CollectPPExpansions : public PPCallbacks {
		public:
		CollectPPExpansions(TokenCollector &C) : Collector(&C) {}

		/// Disabled instance will stop reporting anything to TokenCollector.
		sammccall [Done]: Add a comment for why this is needed?
		/// This ensures that uses of the preprocessor after TokenCollector::consume()
		/// is called do not access the (possibly invalid) collector instance.
		void disable() { Collector = nullptr; }

		void MacroExpands(const clang::Token &MacroNameTok, const MacroDefinition &MD,
		SourceRange Range, const MacroArgs *Args) override {
		if (!Collector)
		sammccall [Done]: This doesn't seem like a particularly standard use of the word "recursive", and the code isn't totally obvious either. Could this be "only record top-level expansions, not those where: the macro use is inside a macro body; the macro appears in an argument to another macro"? Because the top-level macros are treated as opaque atoms. (We probably need a FIXME for tracking arg locations somewhere.)
		ilya-biryukov (author) [Done]: Added a comment and a FIXME at the declaration site of `TokenBuffer`.
		return;
		// Only record top-level expansions, not those where:
		// - the macro use is inside a macro body,
		// - the macro appears in an argument to another macro.
		if (!MacroNameTok.getLocation().isFileID() ||
		(LastExpansionEnd.isValid() &&
		Collector->PP.getSourceManager().isBeforeInTranslationUnit(
		Range.getBegin(), LastExpansionEnd)))
		return;
		sammccall [Done]: Members should have real names.
		Collector->Expansions[Range.getBegin().getRawEncoding()] = Range.getEnd();
		LastExpansionEnd = Range.getEnd();
		}
		// FIXME: handle directives like #pragma, #include, etc.
		private:
		TokenCollector *Collector;
		/// Used to detect recursive macro expansions.
		SourceLocation LastExpansionEnd;
		};

/// Fills in the TokenBuffer by tracing the run of a preprocessor. The		/// Fills in the TokenBuffer by tracing the run of a preprocessor. The
/// implementation tracks the tokens, macro expansions and directives coming		/// implementation tracks the tokens, macro expansions and directives coming
/// from the preprocessor and:		/// from the preprocessor and:
/// - for each token, figures out if it is a part of an expanded token stream,		/// - for each token, figures out if it is a part of an expanded token stream,
/// spelled token stream or both. Stores the tokens appropriately.		/// spelled token stream or both. Stores the tokens appropriately.
/// - records mappings from the spelled to expanded token ranges, e.g. for macro		/// - records mappings from the spelled to expanded token ranges, e.g. for macro
/// expansions.		/// expansions.
/// FIXME: also properly record:		/// FIXME: also properly record:
[11 lines hidden]	DEBUG_WITH_TYPE("collect-tokens", llvm::dbgs()
<< "Token: "		<< "Token: "
<< syntax::Token(T).dumpForTests(		<< syntax::Token(T).dumpForTests(
this->PP.getSourceManager())		this->PP.getSourceManager())
<< "\n"		<< "\n"

);		);
Expanded.push_back(syntax::Token(T));		Expanded.push_back(syntax::Token(T));
});		});
		// And locations of macro calls, to properly recover boundaries of those in
		// case of empty expansions.
		auto CB = llvm::make_unique<CollectPPExpansions>(*this);
		this->Collector = CB.get();
		PP.addPPCallbacks(std::move(CB));
}		}

/// Builds mappings and spelled tokens in the TokenBuffer based on the expanded		/// Builds mappings and spelled tokens in the TokenBuffer based on the expanded
/// token stream.		/// token stream.
class TokenCollector::Builder {		class TokenCollector::Builder {
public:		public:
Builder(std::vector<syntax::Token> Expanded, const SourceManager &SM,		Builder(std::vector<syntax::Token> Expanded, PPExpansions CollectedExpansions,
const LangOptions &LangOpts)		const SourceManager &SM, const LangOptions &LangOpts)
: Result(SM), SM(SM), LangOpts(LangOpts) {		: Result(SM), CollectedExpansions(std::move(CollectedExpansions)), SM(SM),
		LangOpts(LangOpts) {
Result.ExpandedTokens = std::move(Expanded);		Result.ExpandedTokens = std::move(Expanded);
}		}

TokenBuffer build() && {		TokenBuffer build() && {
buildSpelledTokens();		buildSpelledTokens();

// Walk over expanded tokens and spelled tokens in parallel, building the		// Walk over expanded tokens and spelled tokens in parallel, building the
// mappings between those using source locations.		// mappings between those using source locations.
		// To correctly recover empty macro expansions, we also take locations
		// reported to PPCallbacks::MacroExpands into account as we do not have any
		// expanded tokens with source locations to guide us.

// The 'eof' token is special, it is not part of spelled token stream. We		// The 'eof' token is special, it is not part of spelled token stream. We
// handle it separately at the end.		// handle it separately at the end.
assert(!Result.ExpandedTokens.empty());		assert(!Result.ExpandedTokens.empty());
assert(Result.ExpandedTokens.back().kind() == tok::eof);		assert(Result.ExpandedTokens.back().kind() == tok::eof);
for (unsigned I = 0; I < Result.ExpandedTokens.size() - 1; ++I) {		for (unsigned I = 0; I < Result.ExpandedTokens.size() - 1; ++I) {
// (!) I might be updated by the following call.		// (!) I might be updated by the following call.
processExpandedToken(I);		processExpandedToken(I);
[44 lines hidden]	private:
/// function returns.		/// function returns.
void processMacroExpansion(CharSourceRange SpelledRange, unsigned &I) {		void processMacroExpansion(CharSourceRange SpelledRange, unsigned &I) {
auto FID = SM.getFileID(SpelledRange.getBegin());		auto FID = SM.getFileID(SpelledRange.getBegin());
assert(FID == SM.getFileID(SpelledRange.getEnd()));		assert(FID == SM.getFileID(SpelledRange.getEnd()));
TokenBuffer::MarkedFile &File = Result.Files[FID];		TokenBuffer::MarkedFile &File = Result.Files[FID];

fillGapUntil(File, SpelledRange.getBegin(), I);		fillGapUntil(File, SpelledRange.getBegin(), I);

TokenBuffer::Mapping M;
// Skip the spelled macro tokens.
std::tie(M.BeginSpelled, M.EndSpelled) =
consumeSpelledUntil(File, SpelledRange.getEnd().getLocWithOffset(1));
// Skip all expanded tokens from the same macro expansion.		// Skip all expanded tokens from the same macro expansion.
M.BeginExpanded = I;		unsigned BeginExpanded = I;
for (; I + 1 < Result.ExpandedTokens.size(); ++I) {		for (; I + 1 < Result.ExpandedTokens.size(); ++I) {
auto NextL = Result.ExpandedTokens[I + 1].location();		auto NextL = Result.ExpandedTokens[I + 1].location();
if (!NextL.isMacroID() ||		if (!NextL.isMacroID() ||
SM.getExpansionLoc(NextL) != SpelledRange.getBegin())		SM.getExpansionLoc(NextL) != SpelledRange.getBegin())
break;		break;
}		}
M.EndExpanded = I + 1;		unsigned EndExpanded = I + 1;
		consumeMapping(File, SM.getFileOffset(SpelledRange.getEnd()), BeginExpanded,
// Add a resulting mapping.		EndExpanded, NextSpelled[FID]);
File.Mappings.push_back(M);
}		}

/// Initializes TokenBuffer::Files and fills spelled tokens and expanded		/// Initializes TokenBuffer::Files and fills spelled tokens and expanded
/// ranges for each of the files.		/// ranges for each of the files.
void buildSpelledTokens() {		void buildSpelledTokens() {
for (unsigned I = 0; I < Result.ExpandedTokens.size(); ++I) {		for (unsigned I = 0; I < Result.ExpandedTokens.size(); ++I) {
auto FID =		auto FID =
SM.getFileID(SM.getExpansionLoc(Result.ExpandedTokens[I].location()));		SM.getFileID(SM.getExpansionLoc(Result.ExpandedTokens[I].location()));
auto It = Result.Files.try_emplace(FID);		auto It = Result.Files.try_emplace(FID);
TokenBuffer::MarkedFile &File = It.first->second;		TokenBuffer::MarkedFile &File = It.first->second;

File.EndExpanded = I + 1;		File.EndExpanded = I + 1;
if (!It.second)		if (!It.second)
continue; // we have seen this file before.		continue; // we have seen this file before.

// This is the first time we see this file.		// This is the first time we see this file.
File.BeginExpanded = I;		File.BeginExpanded = I;
File.SpelledTokens = tokenize(FID, SM, LangOpts);		File.SpelledTokens = tokenize(FID, SM, LangOpts);
}		}
}		}

/// Consumed spelled tokens until location L is reached (token starting at L		void consumeEmptyMapping(TokenBuffer::MarkedFile &File, unsigned EndOffset,
/// is not included). Returns the indicies of the consumed range.		unsigned ExpandedIndex, unsigned &SpelledIndex) {
std::pair</*Begin*/ unsigned, /*End*/ unsigned>		consumeMapping(File, EndOffset, ExpandedIndex, ExpandedIndex, SpelledIndex);
consumeSpelledUntil(TokenBuffer::MarkedFile &File, SourceLocation L) {		}
assert(L.isFileID());
FileID FID;		/// Consumes spelled tokens that form a macro expansion and adds an entry to
unsigned Offset;		/// the resulting token buffer.
std::tie(FID, Offset) = SM.getDecomposedLoc(L);		/// (!) SpelledIndex is updated in-place.
		void consumeMapping(TokenBuffer::MarkedFile &File, unsigned EndOffset,
		unsigned BeginExpanded, unsigned EndExpanded,
		unsigned &SpelledIndex) {
		// We need to record this mapping before continuing.
		unsigned MappingBegin = SpelledIndex;
		++SpelledIndex;

		bool HitMapping =
		tryConsumeSpelledUntil(File, EndOffset + 1, SpelledIndex).hasValue();
		(void)HitMapping;
		chrish_ericsson_atx [Not Done]: What is intended by this line?
		ilya-biryukov (author) [Done]: Suppressing compiler warning for unused local variable.
		assert(!HitMapping && "recursive macro expansion?");

		TokenBuffer::Mapping M;
		M.BeginExpanded = BeginExpanded;
		M.EndExpanded = EndExpanded;
		M.BeginSpelled = MappingBegin;
		M.EndSpelled = SpelledIndex;

// (!) we update the index in-place.		File.Mappings.push_back(M);
unsigned &SpelledI = NextSpelled[FID];
unsigned Before = SpelledI;
for (; SpelledI < File.SpelledTokens.size() &&
SM.getFileOffset(File.SpelledTokens[SpelledI].location()) < Offset;
++SpelledI) {
}		}
return std::make_pair(Before, /*After*/ SpelledI);
};

/// Consumes spelled tokens until location \p L is reached and adds a mapping		/// Consumes spelled tokens until location \p L is reached and adds a mapping
/// covering the consumed tokens. The mapping will point to an empty expanded		/// covering the consumed tokens. The mapping will point to an empty expanded
/// range at position \p ExpandedIndex.		/// range at position \p ExpandedIndex.
void fillGapUntil(TokenBuffer::MarkedFile &File, SourceLocation L,		void fillGapUntil(TokenBuffer::MarkedFile &File, SourceLocation L,
unsigned ExpandedIndex) {		unsigned ExpandedIndex) {
unsigned BeginSpelledGap, EndSpelledGap;		assert(L.isFileID());
std::tie(BeginSpelledGap, EndSpelledGap) = consumeSpelledUntil(File, L);		FileID FID;
if (BeginSpelledGap == EndSpelledGap)		unsigned Offset;
return; // No gap.		std::tie(FID, Offset) = SM.getDecomposedLoc(L);

		unsigned &SpelledIndex = NextSpelled[FID];
		unsigned MappingBegin = SpelledIndex;
		while (true) {
		auto EndLoc = tryConsumeSpelledUntil(File, Offset, SpelledIndex);
		if (SpelledIndex != MappingBegin) {
TokenBuffer::Mapping M;		TokenBuffer::Mapping M;
M.BeginSpelled = BeginSpelledGap;		M.BeginSpelled = MappingBegin;
M.EndSpelled = EndSpelledGap;		M.EndSpelled = SpelledIndex;
M.BeginExpanded = M.EndExpanded = ExpandedIndex;		M.BeginExpanded = M.EndExpanded = ExpandedIndex;
File.Mappings.push_back(M);		File.Mappings.push_back(M);
		}
		if (!EndLoc)
		break;
		consumeEmptyMapping(File, SM.getFileOffset(*EndLoc), ExpandedIndex,
		SpelledIndex);

		MappingBegin = SpelledIndex;
		}
};		};

		/// Consumes spelled tokens until it reaches Offset or a mapping boundary,
		/// i.e. a name of a macro expansion or the start '#' token of a PP directive.
		/// (!) NextSpelled is updated in place.
		///
		/// Returns None if \p Offset was reached, otherwise returns the end location
		/// of a mapping that starts at \p NextSpelled.
		llvm::Optional<SourceLocation>
		tryConsumeSpelledUntil(TokenBuffer::MarkedFile &File, unsigned Offset,
		unsigned &NextSpelled) {
		for (; NextSpelled < File.SpelledTokens.size(); ++NextSpelled) {
		auto L = File.SpelledTokens[NextSpelled].location();
		if (Offset <= SM.getFileOffset(L))
		return llvm::None; // reached the offset we are looking for.
		auto Mapping = CollectedExpansions.find(L.getRawEncoding());
		if (Mapping != CollectedExpansions.end())
		return Mapping->second; // found a mapping before the offset.
		}
		return llvm::None; // no more tokens, we "reached" the offset.
		}

/// Adds empty mappings for unconsumed spelled tokens at the end of each file.		/// Adds empty mappings for unconsumed spelled tokens at the end of each file.
void fillGapsAtEndOfFiles() {		void fillGapsAtEndOfFiles() {
for (auto &F : Result.Files) {		for (auto &F : Result.Files) {
unsigned Next = NextSpelled[F.first];		if (F.second.SpelledTokens.empty())
if (F.second.SpelledTokens.size() == Next)		continue;
continue; // All spelled tokens are accounted for.		fillGapUntil(F.second, F.second.SpelledTokens.back().endLocation(),
		F.second.EndExpanded);
// Record a mapping for the gap at the end of the spelled tokens.
TokenBuffer::Mapping M;
M.BeginSpelled = Next;
M.EndSpelled = F.second.SpelledTokens.size();
M.BeginExpanded = F.second.EndExpanded;
M.EndExpanded = F.second.EndExpanded;

F.second.Mappings.push_back(M);
}		}
}		}

TokenBuffer Result;		TokenBuffer Result;
/// For each file, a position of the next spelled token we will consume.		/// For each file, a position of the next spelled token we will consume.
llvm::DenseMap<FileID, unsigned> NextSpelled;		llvm::DenseMap<FileID, unsigned> NextSpelled;
		PPExpansions CollectedExpansions;
		sammccall [Done]: Maybe RecordedExpansions? To make the link with the recorder.
		ilya-biryukov (author) [Done]: SG, I've renamed to `CollectedExpansions`. (Assuming 'recorder' stands for 'collector', happy to update if I misinterpreted your comment.)
const SourceManager &SM;		const SourceManager &SM;
const LangOptions &LangOpts;		const LangOptions &LangOpts;
};		};

TokenBuffer TokenCollector::consume() && {		TokenBuffer TokenCollector::consume() && {
PP.setTokenWatcher(nullptr);		PP.setTokenWatcher(nullptr);
return Builder(std::move(Expanded), PP.getSourceManager(), PP.getLangOpts())		Collector->disable();
		return Builder(std::move(Expanded), std::move(Expansions),
		PP.getSourceManager(), PP.getLangOpts())
.build();		.build();
}		}

std::string syntax::Token::str() const {		std::string syntax::Token::str() const {
return llvm::formatv("Token({0}, length = {1})", tok::getTokenName(kind()),		return llvm::formatv("Token({0}, length = {1})", tok::getTokenName(kind()),
length());		length());
}		}

[69 lines hidden]
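
Note on `fillGapUntil` and `tryConsumeSpelledUntil` above: the way a run of unexpanded spelled tokens is split at each recorded expansion can be sketched without clang. This is a simplified illustration under stated assumptions, not the patch's implementation: tokens and locations are plain file offsets, and `SpelledRange`, `tryConsumeUntil`, and `splitGap` are hypothetical names. A `bool` plus an out-parameter replaces `llvm::Optional<SourceLocation>`.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical simplification: a half-open range of spelled token indices.
struct SpelledRange {
  unsigned Begin; // index of the first spelled token
  unsigned End;   // one past the last spelled token
};

// Advances Next until it reaches a token at or past Offset, or a token that
// starts a recorded expansion. Returns true and sets ExpansionEnd in the
// latter case; returns false if Offset (or the end of the tokens) was reached.
bool tryConsumeUntil(const std::vector<unsigned> &TokenOffsets,
                     const std::map<unsigned, unsigned> &Expansions,
                     unsigned Offset, unsigned &Next, unsigned &ExpansionEnd) {
  for (; Next < TokenOffsets.size(); ++Next) {
    unsigned L = TokenOffsets[Next];
    if (Offset <= L)
      return false; // Reached the offset we are looking for.
    std::map<unsigned, unsigned>::const_iterator It = Expansions.find(L);
    if (It != Expansions.end()) {
      ExpansionEnd = It->second;
      return true; // Found an expansion before the offset.
    }
  }
  return false; // No more tokens.
}

// Splits the spelled tokens before Offset into ranges: one per plain gap and
// one per recorded expansion, so adjacent expansions are not glued together.
std::vector<SpelledRange>
splitGap(const std::vector<unsigned> &TokenOffsets,
         const std::map<unsigned, unsigned> &Expansions, unsigned Offset,
         unsigned &Next) {
  std::vector<SpelledRange> Result;
  unsigned Begin = Next;
  while (true) {
    unsigned ExpansionEnd = 0;
    bool Hit =
        tryConsumeUntil(TokenOffsets, Expansions, Offset, Next, ExpansionEnd);
    if (Next != Begin)
      Result.push_back(SpelledRange{Begin, Next}); // Plain gap.
    if (!Hit)
      break;
    unsigned ExpansionBegin = Next;
    ++Next; // Skip the macro name, which is itself a key in Expansions.
    unsigned Unused = 0;
    bool Nested = tryConsumeUntil(TokenOffsets, Expansions, ExpansionEnd + 1,
                                  Next, Unused);
    (void)Nested; // Avoids an unused-variable warning when asserts are off.
    assert(!Nested && "only top-level expansions are expected");
    Result.push_back(SpelledRange{ExpansionBegin, Next});
    Begin = Next;
  }
  return Result;
}
```

With 18 spelled tokens and expansions recorded at token 9 (ending on itself) and token 10 (ending on token 17), the sketch yields the ranges [0, 9), [9, 10) and [10, 18). These have the same shape as the three mappings expected for the `EMPTY` / `EMPTY_FUNC` case in TokensTest.cpp.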

clang/unittests/Tooling/Syntax/TokensTest.cpp

[418 lines hidden]	)cpp",
"file './input.cpp'\n"		"file './input.cpp'\n"
" spelled tokens:\n"		" spelled tokens:\n"
" # define ADD ( X , Y ) X + Y # define MULT ( X , Y ) X * Y int "		" # define ADD ( X , Y ) X + Y # define MULT ( X , Y ) X * Y int "
"a = ADD ( MULT ( 1 , 2 ) , MULT ( 3 , ADD ( 4 , 5 ) ) ) ;\n"		"a = ADD ( MULT ( 1 , 2 ) , MULT ( 3 , ADD ( 4 , 5 ) ) ) ;\n"
" mappings:\n"		" mappings:\n"
" ['#'_0, 'int'_22) => ['int'_0, 'int'_0)\n"		" ['#'_0, 'int'_22) => ['int'_0, 'int'_0)\n"
" ['ADD'_25, ';'_46) => ['1'_3, ';'_12)\n"},		" ['ADD'_25, ';'_46) => ['1'_3, ';'_12)\n"},
// Empty macro replacement.		// Empty macro replacement.
		// FIXME: the #define directives should not be glued together.
{R"cpp(		{R"cpp(
#define EMPTY		#define EMPTY
#define EMPTY_FUNC(X)		#define EMPTY_FUNC(X)
EMPTY		EMPTY
EMPTY_FUNC(1+2+3)		EMPTY_FUNC(1+2+3)
)cpp",		)cpp",
R"(expanded tokens:		R"(expanded tokens:
<empty>		<empty>
file './input.cpp'		file './input.cpp'
spelled tokens:		spelled tokens:
# define EMPTY # define EMPTY_FUNC ( X ) EMPTY EMPTY_FUNC ( 1 + 2 + 3 )		# define EMPTY # define EMPTY_FUNC ( X ) EMPTY EMPTY_FUNC ( 1 + 2 + 3 )
mappings:		mappings:
['#'_0, '<eof>'_18) => ['<eof>'_0, '<eof>'_0)		['#'_0, 'EMPTY'_9) => ['<eof>'_0, '<eof>'_0)
		['EMPTY'_9, 'EMPTY_FUNC'_10) => ['<eof>'_0, '<eof>'_0)
		['EMPTY_FUNC'_10, '<eof>'_18) => ['<eof>'_0, '<eof>'_0)
		sammccall [Done]: The two #define statements are still merged into a single mapping. Do we have a FIXME somewhere to cover this?
)"},		)"},
// File ends with a macro replacement.		// File ends with a macro replacement.
{R"cpp(		{R"cpp(
#define FOO 10+10;		#define FOO 10+10;
int a = FOO		int a = FOO
)cpp",		)cpp",
R"(expanded tokens:		R"(expanded tokens:
int a = 10 + 10 ;		int a = 10 + 10 ;
[293 lines hidden]

This is an archive of the discontinued LLVM Phabricator instance.

[Syntax] Do not glue multiple empty PP expansions to a single mapping
ClosedPublic

