This is an archive of the discontinued LLVM Phabricator instance.

Differential D20401

[Lexer] Don't merge macro args from different macro files
ClosedPublic

Authored by vsk on May 18 2016, 4:41 PM.

Details

Reviewers

akyrtzi
doug.gregor

Commits

rG95a2a7f2a124: [Lexer] Don't merge macro args from different macro files
rC270160: [Lexer] Don't merge macro args from different macro files
rL270160: [Lexer] Don't merge macro args from different macro files

Summary

The lexer sets the end location of macro arguments incorrectly *if*,
while merging consecutive args to fit into a single SLocEntry, it finds
args which come from different macro files.

Fix the issue by using separate SLocEntries in this situation.

This fixes a code coverage crasher (rdar://problem/26181005). Because
the lexer reported end locations for certain macro args incorrectly, we
would generate bogus coverage mappings with negative line offsets.

Diff Detail

Repository: rL LLVM

Event Timeline

vsk updated this revision to Diff 57710. May 18 2016, 4:41 PM

vsk retitled this revision from "" to "[Lexer] Don't merge macro args from different macro files".

vsk updated this object.

vsk added reviewers: akyrtzi, doug.gregor.

vsk added a subscriber: cfe-commits.

Add some comments to the unit test.

Fix explanation of the test case in test/CoverageMapping.

I discussed this bug with Argyrios off-list, who lgtm'd on the condition that it doesn't introduce a performance regression. He suggested preprocessing Cocoa.h to stress the patch. After running a stabilization script, I used this command to stress RelNoAsserts builds of clang both with and without this patch.

for I in $(seq 1 100); do time $CC -F /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.12.sdk/System/Library/Frameworks -E Cocoa.h -o /dev/null; done

The results are basically in the noise (link to raw data: https://ghostbin.com/paste/r6cyh):

Compiler	Unpatched TOT	Patched TOT
Avg. wall time (s)	0.21709	0.21608
Std. deviation	0.02101	0.02219

I also made sure that the preprocessed sources emitted by the two compilers are the same.

Closed by commit rL270160: [Lexer] Don't merge macro args from different macro files (authored by vedantk). May 19 2016, 4:50 PM

This revision was automatically updated to reflect the committed changes.

I know this was sped up slightly in 3339c568c43e4644f02289e5edfc78c860f19c9f, but this change makes updateConsecutiveMacroArgTokens the hottest function in clang in a bottom up profile of an entire build of the Linux kernel. It thrashes the one entry LastFileIDLookup cache, and we end up looking up the same FileID again and again and again for each token when we expand nested function like macros.

Is there anything we can do to speed this up? Is there any way to record which FileID corresponds to a given Token so that we don't have to keep rematerializing that? Is it possible to find whether two SourceLocations correspond to the same FileID without having to figure out which FileID in particular each belongs to?

I discussed this bug with Argyrios off-list, who lgtm'd on the condition that it doesn't introduce a performance regression.

Well, I'd say it's a performance regression, though perhaps reported 5 years too late.

In D20401#2770059, @nickdesaulniers wrote:

I know this was sped up slightly in 3339c568c43e4644f02289e5edfc78c860f19c9f, but this change makes updateConsecutiveMacroArgTokens the hottest function in clang in a bottom up profile of an entire build of the Linux kernel. It thrashes the one entry LastFileIDLookup cache, and we end up looking up the same FileID again and again and again for each token when we expand nested function like macros.

Is there anything we can do to speed this up? Is there any way to record which FileID corresponds to a given Token so that we don't have to keep rematerializing that? Is it possible to find whether two SourceLocations correspond to the same FileID without having to figure out which FileID in particular each belongs to?

Perhaps you could try:

using SourceManager::isInFileID(NextLoc, getFileID(CurLoc), ...) (to halve the number of necessary getFileID lookups), or
using a 2-element cache in getFileID?
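
The second suggestion can be illustrated with a toy model. The sketch below is not clang's SourceManager: ToyFileIDTable, its sorted starts_ vector, and lookup() are hypothetical names, and the table is assumed to be non-empty with its first entry starting at offset 0. It only shows how a two-entry most-recently-used cache keeps both entries live when lookups alternate between two of them.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Toy stand-in for an SLocEntry table: entry i covers offsets
// [starts_[i], starts_[i + 1]); the last entry has no upper bound.
// Assumption: starts_ is sorted, non-empty, and starts_[0] == 0.
class ToyFileIDTable {
public:
  explicit ToyFileIDTable(std::vector<std::uint32_t> starts)
      : starts_(std::move(starts)) {}

  // Returns the index of the entry containing `offset`, trying the
  // two cached entries before falling back to a binary search.
  int lookup(std::uint32_t offset) {
    for (int slot = 0; slot < 2; ++slot) {
      int id = cache_[slot];
      if (id >= 0 && contains(id, offset)) {
        if (slot == 1)
          std::swap(cache_[0], cache_[1]); // keep most recent first
        return id;
      }
    }
    ++misses_;
    // upper_bound finds the first start > offset; the entry before it
    // is the one that contains offset.
    auto it = std::upper_bound(starts_.begin(), starts_.end(), offset);
    int id = static_cast<int>(it - starts_.begin()) - 1;
    cache_[1] = cache_[0]; // evict the least recently used slot
    cache_[0] = id;
    return id;
  }

  int misses() const { return misses_; }

private:
  bool contains(int id, std::uint32_t offset) const {
    std::size_t i = static_cast<std::size_t>(id);
    if (offset < starts_[i])
      return false;
    return i + 1 == starts_.size() || offset < starts_[i + 1];
  }

  std::vector<std::uint32_t> starts_;
  int cache_[2] = {-1, -1}; // -1 means "empty slot"
  int misses_ = 0;
};
```

With entries starting at 0, 100 and 200, alternating lookups between the first two entries miss only on the first touch of each; a one-entry cache would miss on every alternation.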

I discussed this bug with Argyrios off-list, who lgtm'd on the condition that it doesn't introduce a performance regression.

Well, I'd say it's a performance regression, though perhaps reported 5 years too late.

If the performance issue manifests on Linux kernel sources from 5 years ago, then sure, I'd agree :).

In D20401#2770059, @nickdesaulniers wrote:

I know this was sped up slightly in 3339c568c43e4644f02289e5edfc78c860f19c9f, but this change makes updateConsecutiveMacroArgTokens the hottest function in clang in a bottom up profile of an entire build of the Linux kernel. It thrashes the one entry LastFileIDLookup cache, and we end up looking up the same FileID again and again and again for each token when we expand nested function like macros.

Is there anything we can do to speed this up?

@hokein and I spent some time looking at this (initially trying to understand behavior, now performance).

Short version is:

we can *simplify* the code a lot, we think it's now just partitioning based on FileID and this can be done more clearly. This may have some speedups at the margin.
in terms of performance: I suspect when clang is built by GCC it's doing roughly 2x as much work as when it's built by clang. @nickdesaulniers can you tell me which you're measuring/deploying? To give some idea if we're likely to actually help much...

Behavior: partitioning by file IDs

I think we're back to the original (pre-2011) behavior of just partitioning by file IDs:

the original patch clearly intended this to merge tokens across file IDs, and the comments still claim this
then a bugfix banned merging file+macro or macro+file
then this patch banned merging macro+macro
meanwhile, there's no code disallowing file+file, but I don't think it's actually possible to achieve: you can't have an #include or an eof inside a macro arg, and I don't know how else to switch between files.

Performance (good case)

The current obfuscated version *is* faster than the pre-2011 version because we avoid getFileID() when testing file+macro, macro+file, and *some* macro+macro cases (when the locations happen to be >50 apart).
When we see a run of N consecutive, nearby macro tokens, we do 2*(N-1) getFileID()s.

We can reduce the number of getFileID() calls by caching FileID bounds (the expensive part is looking up the SLocEntry; once we have it, we can hit-test a SourceLocation against it with simple arithmetic).

However, getFileID() has the one-element cache of SLocEntry, so this may only be a marginal improvement.

// Tokens A1 A2 A3 B1 B2

isWrittenInSameFile(A1, A2);
  getFileID(A1); // miss
  getFileID(A2);
isWrittenInSameFile(A2, A3);
  getFileID(A2);
  getFileID(A3);
isWrittenInSameFile(A3, B1);
  getFileID(A3);
  getFileID(B1); // miss
isWrittenInSameFile(B1, B2);
  getFileID(B1);
  getFileID(B2);

All the getFileID() calls we could avoid are the cached ones. It's probably still a win (the cache lookup logic is kinda messy), but probably not huge.
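
The bounds-caching idea can be sketched on a toy model. Everything named here (Bounds, findEntryBounds, partitionByEntry) is hypothetical and not clang's API; the entry table is assumed to be a sorted, non-empty list of start offsets beginning at 0. The sketch also ignores the other conditions in the real loop (the RelOffs window and the local/loaded check) and only shows one search per run followed by arithmetic hit-tests.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <utility>
#include <vector>

// Half-open offset range [begin, end) covered by one toy entry.
struct Bounds {
  std::uint32_t begin;
  std::uint32_t end;
};

// The "expensive" step: binary-search the entry containing `offset`.
// `lookups` counts how often this is called.
// Assumption: starts is sorted, non-empty, and starts[0] == 0.
Bounds findEntryBounds(const std::vector<std::uint32_t> &starts,
                       std::uint32_t offset, int &lookups) {
  ++lookups;
  auto it = std::upper_bound(starts.begin(), starts.end(), offset);
  std::uint32_t end = (it == starts.end())
                          ? std::numeric_limits<std::uint32_t>::max()
                          : *it;
  return {*(it - 1), end};
}

// Splits the token offsets into runs [first, last) that fall in the
// same entry. Each run costs one search; every further token in the
// run is tested against the cached bounds with two comparisons.
std::vector<std::pair<std::size_t, std::size_t>>
partitionByEntry(const std::vector<std::uint32_t> &starts,
                 const std::vector<std::uint32_t> &tokenOffsets,
                 int &lookups) {
  std::vector<std::pair<std::size_t, std::size_t>> runs;
  std::size_t i = 0;
  while (i < tokenOffsets.size()) {
    Bounds b = findEntryBounds(starts, tokenOffsets[i], lookups);
    std::size_t j = i + 1;
    while (j < tokenOffsets.size() && tokenOffsets[j] >= b.begin &&
           tokenOffsets[j] < b.end)
      ++j;
    runs.emplace_back(i, j);
    i = j;
  }
  return runs;
}
```

For a five-token sequence shaped like A1 A2 A3 B1 B2, this model performs two searches, one per run, which matches the two uncached lookups in the trace above.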

Performance (bad case)

However, the implementation of isWrittenInSameFile is getFileID(X) == getFileID(Y), and it's unspecified which gets evaluated first. GCC generally evaluates right-to-left (https://godbolt.org/z/M4bs74Tbj), producing substantially more misses:

isWrittenInSameFile(A1, A2);
  getFileID(A2); // miss
  getFileID(A1);
isWrittenInSameFile(A2, A3);
  getFileID(A3);
  getFileID(A2);
isWrittenInSameFile(A3, B1);
  getFileID(B1); // miss
  getFileID(A3); // miss
isWrittenInSameFile(B1, B2);
  getFileID(B2); // miss
  getFileID(B1);

So I'd expect we can improve GCC-built-clang's performance by maybe 2x by caching externally.
(Or by having isWrittenInSameFile try to cleverly evaluate whichever arg matches the cache first, but I have no idea whether that will work well across clang)
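
The miss counts in the two traces can be reproduced with a small simulation. This is a toy model, not clang code: countMisses is a hypothetical helper, tokens are represented only by a non-negative file id, and the one-entry cache simply remembers the last id looked up.

```cpp
#include <cstddef>
#include <vector>

// Simulates isWrittenInSameFile(t[i], t[i + 1]) for each adjacent pair
// of tokens, where each comparison performs two lookups through a
// one-entry cache. fileOf[i] is the toy file id of token i (assumed
// non-negative). rightToLeft selects which operand is evaluated first.
int countMisses(const std::vector<int> &fileOf, bool rightToLeft) {
  int cached = -1; // nothing cached yet
  int misses = 0;
  auto lookup = [&](int id) {
    if (id != cached) {
      ++misses;
      cached = id;
    }
  };
  for (std::size_t i = 0; i + 1 < fileOf.size(); ++i) {
    int first = rightToLeft ? fileOf[i + 1] : fileOf[i];
    int second = rightToLeft ? fileOf[i] : fileOf[i + 1];
    lookup(first);
    lookup(second);
  }
  return misses;
}
```

For the sequence A1 A2 A3 B1 B2 (ids 0, 0, 0, 1, 1), left-to-right evaluation gives 2 misses and right-to-left gives 4, matching the annotations in the two traces.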

@hokein or I will try to find time to take a stab at this.

Herald added a project: Restricted Project. Sep 29 2022, 4:21 AM

In D20401#3823476, @sammccall wrote:

In D20401#2770059, @nickdesaulniers wrote:

I know this was sped up slightly in 3339c568c43e4644f02289e5edfc78c860f19c9f, but this change makes updateConsecutiveMacroArgTokens the hottest function in clang in a bottom up profile of an entire build of the Linux kernel. It thrashes the one entry LastFileIDLookup cache, and we end up looking up the same FileID again and again and again for each token when we expand nested function like macros.

Is there anything we can do to speed this up?

@hokein and I spent some time looking at this (initially trying to understand behavior, now performance).

Wonderful! I'm glad to see you also identified this change in particular. Thanks as well for looking into this code.

Short version is:

we can *simplify* the code a lot, we think it's now just partitioning based on FileID and this can be done more clearly. This may have some speedups at the margin.

in terms of performance: I suspect when clang is built by GCC it's doing roughly 2x as much work as when it's built by clang. @nickdesaulniers can you tell me which you're measuring/deploying? To give some idea if we're likely to actually help much...

In all of my personal measurements, I've been bootstrapping clang with AOSP's clang, and updateConsecutiveMacroArgTokens is still the worst method in profiles.

But I wouldn't be surprised if other Linux distros like RHEL bootstrap their clang distribution via GCC. @tstellar or @serge-sans-paille or @nikic might know. We did get a curious comment from a kernel developer recently claiming that clang was "twice as slow as GCC", which didn't make much sense; I'm not sure if it was an exaggeration or a precise measurement, but I wouldn't be surprised if the evaluation order you identified plays into this, making the worst method even slower. I'll try to find a link to the thread...

My summer intern @justinstitt looked into this case again briefly. We found minor improvements to SourceManager::isWrittenInSameFile (avoiding a few more calls to getFileID) to be irrelevant in terms of performance. But we also didn't consider GCC-built-clang; we were using clang to bootstrap clang.

Meanwhile, besides evaluating the high-level logic in TokenLexer and how it might be improved, I think there's potentially an opportunity for an "AOS vs. SOA" speedup in SourceManager. SourceManager::LoadedSLocEntryTable is a llvm::SmallVector<SrcMgr::SLocEntry>. SourceManager::getFileIDLoaded only really cares about the SLocEntry's Offset. I suspect we could get major memory locality wins by packing those into a standalone vector so that we could search them faster.

@hokein or I will try to find time to take a stab at this.

Awesome, please keep me in the loop.

Thanks Nick for the info! No kernel experience here, so if you have any particular suggestions about how to measure the workload you care about, they'd be much appreciated (e.g. are there particular files that are slow enough to measure in isolation, or is it better to do a full build?).

In D20401#3824569, @nickdesaulniers wrote:

But I wouldn't be surprised if other Linux distros like RHEL bootstrap their clang distribution via GCC. @tstellar or @serge-sans-paille or @nikic might know. We did get a curious comment from a kernel developer recently claiming that clang was "twice as slow as GCC", which didn't make much sense; I'm not sure if it was an exaggeration or a precise measurement, but I wouldn't be surprised if the evaluation order you identified plays into this, making the worst method even slower. I'll try to find a link to the thread...

I'm sure this is common enough to be worth fixing if it's a real effect, which I still need to confirm (MSVC evaluates right-to-left too).

Meanwhile, besides evaluating the high-level logic in TokenLexer and how it might be improved, I think there's potentially an opportunity for an "AOS vs. SOA" speedup in SourceManager. SourceManager::LoadedSLocEntryTable is a llvm::SmallVector<SrcMgr::SLocEntry>. SourceManager::getFileIDLoaded only really cares about the SLocEntry's Offset. I suspect we could get major memory locality wins by packing those into a standalone vector so that we could search them faster.

Ah, great point. SLocEntry is 24 bytes while Offset is only 4. SLocEntry is an important public API, but Offset is ~only used in SourceManager, so that refactoring might be doable. I guess we can cheaply prototype by redundantly storing offset *both* in a separate array used for search and in the SLocEntry.
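
The "store the offset redundantly" prototype can be sketched as follows. This is a toy model only: ToyEntry and ToyTable are hypothetical, ToyEntry's layout is a placeholder rather than the real SrcMgr::SLocEntry, and entries are assumed to be added in increasing offset order starting at 0. The point is that the binary search walks a dense array of 4-byte offsets and touches the wider entry only after the index is known.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Placeholder for a wide table entry; not the real SLocEntry layout.
struct ToyEntry {
  std::uint32_t offset;
  std::uint64_t payload[2]; // stands in for file/expansion info
};

class ToyTable {
public:
  // Assumption: offsets are added in strictly increasing order and the
  // first one is 0.
  void add(std::uint32_t offset) {
    entries_.push_back({offset, {0, 0}});
    offsets_.push_back(offset); // redundant copy used only for search
  }

  // Index of the last entry whose offset is <= loc. Only the compact
  // offsets_ array is read during the search.
  int find(std::uint32_t loc) const {
    auto it = std::upper_bound(offsets_.begin(), offsets_.end(), loc);
    return static_cast<int>(it - offsets_.begin()) - 1;
  }

  const ToyEntry &entry(int id) const {
    return entries_[static_cast<std::size_t>(id)];
  }

private:
  std::vector<ToyEntry> entries_;      // array of structs (full data)
  std::vector<std::uint32_t> offsets_; // parallel array for lookups
};
```

The cost of this layout is the extra offsets_ array; whether the locality gain outweighs that memory would have to be measured on real workloads.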

@hokein or I will try to find time to take a stab at this.

Awesome, please keep me in the loop.

Will do!

In D20401#3824569, @nickdesaulniers wrote:

But I wouldn't be surprised if other Linux distros like RHEL bootstrap their clang distribution via GCC. @tstellar or @serge-sans-paille or @nikic might know. We did get a curious comment from a kernel developer recently claiming that clang was "twice as slow as GCC", which didn't make much sense; I'm not sure if it was an exaggeration or a precise measurement, but I wouldn't be surprised if the evaluation order you identified plays into this, making the worst method even slower. I'll try to find a link to the thread...

For Fedora/RHEL we used to build Clang with GCC -- for Clang 15, we switched to building with Clang. No idea what other distros do. Maybe worth mentioning that the builds on https://llvm-compile-time-tracker.com/ are done with GCC.

@hokein or I will try to find time to take a stab at this.

Awesome, please keep me in the loop.

Will do!

https://reviews.llvm.org/D134942 is my attempt.

In D20401#2770059, @nickdesaulniers wrote:

I discussed this bug with Argyrios off-list, who lgtm'd on the condition that it doesn't introduce a performance regression.

Well, I'd say it's a performance regression, though perhaps reported 5 years too late.

This patch does increase the number of SLocEntries. At least for SemaExpr.cpp, it was ~298K prior to this patch and is ~315K after it (a ~5% increase).

In D20401#3824713, @sammccall wrote:

Thanks Nick for the info! No kernel experience here, so if you have any particular suggestions about how to measure the workload you care about, they'd be much appreciated (e.g. are there particular files that are slow enough to measure in isolation, or is it better to do a full build?).

@sammccall I wrote up instructions for how to profile a Linux kernel build with LLVM.
https://github.com/ClangBuiltLinux/profiling/tree/main/perf
A build on a 72+ threaded workstation should only take ~1 minute. Can you please give it a shot and let me know off-thread if you encounter any issues?
One thing I found about that workflow: just this week I upgraded to a zen2-based threadripper workstation. It appears that zen2 has issues using per-thread LBR.
https://www.spinics.net/lists/linux-perf-users/msg23103.html
(There are follow-up responses that I don't see yet in the archive, but it looks like there are pending Linux kernel patches to get that working.)
https://lore.kernel.org/lkml/166155216401.401.5809694678609694438.tip-bot2@tip-bot2/
https://lore.kernel.org/lkml/20220829113347.295-1-ravi.bangoria@amd.com/
So you might want to ensure you're running those instructions on an Intel box, for now. I'm also happy to hop on a virtual call with you and @hokein anytime.

Meanwhile, besides evaluating the high-level logic in TokenLexer and how it might be improved, I think there's potentially an opportunity for an "AOS vs. SOA" speedup in SourceManager. SourceManager::LoadedSLocEntryTable is a llvm::SmallVector<SrcMgr::SLocEntry>. SourceManager::getFileIDLoaded only really cares about the SLocEntry's Offset. I suspect we could get major memory locality wins by packing those into a standalone vector so that we could search them faster.

Ah, great point. SLocEntry is 24 bytes while Offset is only 4.

And SLocEntry costs 4 bytes for padding only, which is bad :(

SLocEntry is an important public API, but Offset is ~only used in SourceManager, so that refactoring might be doable. I guess we can cheaply prototype by redundantly storing offset *both* in a separate array used for search and in the SLocEntry.

This is an interesting idea. I got a quick prototype of adding an in-parallel offset table in SourceManager:

clang::SourceManager::getFileIDLocal 2.45% -> 1.57% (reduced by 30%+)
SourceManager memory usage is increased by ~10%: SemaExpr.cpp 12.6MB -> 14.3MB

The improvement in getFileIDLocal seems promising, but the memory increase is a concern (10% is not small, though it may be OK compared to the actual AST size).

An alternative is to restructure the SLocEntry and the underlying storage in SourceManager. That would give us both performance and memory improvements, but it requires a significant change to SourceManager.

A build on a 72+ threaded workstation should only take ~1 minute. Can you please give it a shot and let me know off-thread if you encounter any issues?

Thanks for writing down the instructions, it is useful. It works for me (on an intel-based workstation).

In D20401#3833201, @hokein wrote:

Meanwhile, besides evaluating the high-level logic in TokenLexer and how it might be improved, I think there's potentially an opportunity for an "AOS vs. SOA" speedup in SourceManager. SourceManager::LoadedSLocEntryTable is a llvm::SmallVector<SrcMgr::SLocEntry>. SourceManager::getFileIDLoaded only really cares about the SLocEntry's Offset. I suspect we could get major memory locality wins by packing those into a standalone vector so that we could search them faster.

Ah, great point. SLocEntry is 24 bytes while Offset is only 4.

And SLocEntry costs 4 bytes for padding only, which is bad :(

SLocEntry is an important public API, but Offset is ~only used in SourceManager, so that refactoring might be doable. I guess we can cheaply prototype by redundantly storing offset *both* in a separate array used for search and in the SLocEntry.

Do we need to store both the whole SLocEntry and a copy of the Offset, or can we just store the Offset (or perhaps individual arrays of the pieces of an SLocEntry)? Perhaps we can lazily materialize an SLocEntry only when needed, if ever?

This is an interesting idea. I got a quick prototype of adding an in-parallel offset table in SourceManager:

clang::SourceManager::getFileIDLocal 2.45% -> 1.57% (reduced by 30%+)

SourceManager memory usage is increased by ~10%: SemaExpr.cpp 12.6MB -> 14.3MB

How did you measure the memory usage of an individual class? (I think we should move this discussion to LLVM Discourse for more visibility of our discussion).

The improvement in getFileIDLocal seems promising, but the memory increase is a concern (10% is not small, though it may be OK compared to the actual AST size).

At this point, I'll pay it. Unless it regresses peak RSS of the compiler, I don't care.

An alternative is to restructure the SLocEntry and the underlying storage in SourceManager. That would give us both performance and memory improvements, but it requires a significant change to SourceManager.

At this point, I think it's worth it.

Started a thread: https://discourse.llvm.org/t/macro-performance-lexer-and-sourcemanager/65713

Revision Contents

Path	Size
cfe/trunk/lib/Lex/TokenLexer.cpp	3 lines
cfe/trunk/test/CoverageMapping/Inputs/macros.h	13 lines
cfe/trunk/test/CoverageMapping/include-macros.c	18 lines
cfe/trunk/unittests/Lex/LexerTest.cpp	26 lines

Diff 57880

cfe/trunk/lib/Lex/TokenLexer.cpp

Context: static void updateConsecutiveMacroArgTokens(SourceManager &SM,
    // on the relative offset.

    Token *NextTok = begin_tokens + 1;
    for (; NextTok < end_tokens; ++NextTok) {
      SourceLocation NextLoc = NextTok->getLocation();
      if (CurLoc.isFileID() != NextLoc.isFileID())
        break; // Token from different kind of FileID.

+     if (CurLoc.isMacroID() && !SM.isWrittenInSameFile(CurLoc, NextLoc))
+       break; // Token from a different macro.
+
      int RelOffs;
      if (!SM.isInSameSLocAddrSpace(CurLoc, NextLoc, &RelOffs))
        break; // Token from different local/loaded location.
      // Check that token is not before the previous token or more than 50
      // "characters" away.
      if (RelOffs < 0 || RelOffs > 50)
        break;
      CurLoc = NextLoc;

cfe/trunk/test/CoverageMapping/Inputs/macros.h

// Assorted macros to help test #include behavior across file boundaries.

#define helper1 0

void helper2(const char *, ...);

#define M1(a, ...) helper2(a, ##__VA_ARGS__);

// Note: M2 stresses vararg macro functions with macro arguments. The spelling
// locations of the args used to be set to the expansion site, leading to
// crashes (region LineEnd < LineStart). The regression test requires M2's line
// number to be greater than the line number containing the expansion.
#define M2(a, ...) M1(a, helper1, ##__VA_ARGS__);

cfe/trunk/test/CoverageMapping/include-macros.c

// RUN: %clang_cc1 -fprofile-instrument=clang -fcoverage-mapping -dump-coverage-mapping -emit-llvm-only -main-file-name include-macros.c %s | FileCheck %s

#include "Inputs/macros.h"

void f1() {
  M2("a", "b");
}

// CHECK-LABEL: f1:
// CHECK-NEXT: File 0, 5:11 -> 7:2 = #0
// CHECK-NEXT: Expansion,File 0, 6:3 -> 6:5 = #0 (Expanded file = 1)
// CHECK-NEXT: File 1, 13:20 -> 13:50 = #0
// CHECK-NEXT: Expansion,File 1, 13:20 -> 13:22 = #0 (Expanded file = 2)
// CHECK-NEXT: File 2, 7:20 -> 7:46 = #0
// CHECK-NEXT: Expansion,File 2, 7:33 -> 7:44 = #0 (Expanded file = 3)
// CHECK-NEXT: File 3, 13:26 -> 13:34 = #0
// CHECK-NEXT: Expansion,File 3, 13:26 -> 13:33 = #0 (Expanded file = 4)
// CHECK-NEXT: File 4, 3:17 -> 3:18 = #0

cfe/trunk/unittests/Lex/LexerTest.cpp

Context: : FileMgr(FileMgrOpts),
        Diags(DiagID, new DiagnosticOptions, new IgnoringDiagConsumer()),
        SourceMgr(Diags, FileMgr),
        TargetOpts(new TargetOptions)
    {
      TargetOpts->Triple = "x86_64-apple-darwin11.1.0";
      Target = TargetInfo::CreateTargetInfo(Diags, TargetOpts);
    }

-   std::vector<Token> CheckLex(StringRef Source,
-                               ArrayRef<tok::TokenKind> ExpectedTokens) {
+   std::vector<Token> Lex(StringRef Source) {
      std::unique_ptr<llvm::MemoryBuffer> Buf =
          llvm::MemoryBuffer::getMemBuffer(Source);
      SourceMgr.setMainFileID(SourceMgr.createFileID(std::move(Buf)));

      VoidModuleLoader ModLoader;
      HeaderSearch HeaderInfo(new HeaderSearchOptions, SourceMgr, Diags, LangOpts,
                              Target.get());
      Preprocessor PP(new PreprocessorOptions(), Diags, LangOpts, SourceMgr,
                      HeaderInfo, ModLoader, /*IILookup =*/nullptr,
                      /*OwnsHeaderSearch =*/false);
      PP.Initialize(*Target);
      PP.EnterMainSourceFile();

      std::vector<Token> toks;
      while (1) {
        Token tok;
        PP.Lex(tok);
        if (tok.is(tok::eof))
          break;
        toks.push_back(tok);
      }

+     return toks;
+   }
+
+   std::vector<Token> CheckLex(StringRef Source,
+                               ArrayRef<tok::TokenKind> ExpectedTokens) {
+     auto toks = Lex(Source);
      EXPECT_EQ(ExpectedTokens.size(), toks.size());
      for (unsigned i = 0, e = ExpectedTokens.size(); i != e; ++i) {
        EXPECT_EQ(ExpectedTokens[i], toks[i].getKind());
      }

      return toks;
    }

Context: TEST_F(LexerTest, LexAPI) {
    SourceLocation idLoc3 = toks[8].getLocation();
    SourceLocation idLoc4 = toks[9].getLocation();
    EXPECT_EQ("INN", Lexer::getImmediateMacroName(idLoc1, SourceMgr, LangOpts));
    EXPECT_EQ("INN", Lexer::getImmediateMacroName(idLoc2, SourceMgr, LangOpts));
    EXPECT_EQ("NOF2", Lexer::getImmediateMacroName(idLoc3, SourceMgr, LangOpts));
    EXPECT_EQ("N", Lexer::getImmediateMacroName(idLoc4, SourceMgr, LangOpts));
  }

+ TEST_F(LexerTest, DontMergeMacroArgsFromDifferentMacroFiles) {
+   std::vector<Token> toks =
+       Lex("#define helper1 0\n"
+           "void helper2(const char *, ...);\n"
+           "#define M1(a, ...) helper2(a, ##__VA_ARGS__)\n"
+           "#define M2(a, ...) M1(a, helper1, ##__VA_ARGS__)\n"
+           "void f1() { M2(\"a\", \"b\"); }");
+
+   // Check the file corresponding to the "helper1" macro arg in M2.
+   //
+   // The lexer used to report its size as 31, meaning that the end of the
+   // expansion would be on the next line (just past `M2("a", "b")`). Make
+   // sure that we get the correct end location (the comma after "helper1").
+   SourceLocation helper1ArgLoc = toks[20].getLocation();
+   EXPECT_EQ(SourceMgr.getFileIDSize(SourceMgr.getFileID(helper1ArgLoc)), 8U);
+ }
+
  } // anonymous namespace