This is an archive of the discontinued LLVM Phabricator instance.

[Lex] Simplify and cleanup the updateConsecutiveMacroArgTokens implementation.
ClosedPublic

Authored by hokein on Sep 30 2022, 2:26 AM.

Details

Summary

The code falls back to the pre-2011 partition-file-id solution (see for
details).

This patch simplifies/rewrites the code based on the partition-based-on-file-id
idea. The new implementation is optimized by reducing the number of
calling getFileID (~40% drop).

Despite the huge drop of getFileID, this is a marignal improvment on
speed (becase the number of calling non-cached getFileID is roughly
the same). It removes the evaluation-order performance gap between gcc-built-clang
and clang-built-clang.

SemaExpr.cpp:

  • before: 315063 SLocEntries, FileID scans: 388230 linear, 1393437 binary. 458893 cache hits, 672299 getFileID calls
  • after: 313494 SLocEntries, FileID scans: 397525 linear, 1451890 binary, 176714 cache hits, 397144 getFileID calls

FindTarget.cpp:

  • before: 279984 SLocEntries, FileID scans: 361926 linear, 1275930 binary, 436072 cache hits, 632150 getFileID calls
  • after: 278426 SLocEntries, FileID scans: 371279 linear, 1333963 binary, 153705 cache hits, 356814 getFileID calls

Diff Detail

Event Timeline

hokein created this revision.Sep 30 2022, 2:26 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 30 2022, 2:26 AM
hokein requested review of this revision.Sep 30 2022, 2:26 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 30 2022, 2:26 AM

Some more perf data on building linux kernel (x86_64)

before: getFileID 2.4% (1.10% on getFileIDSlow)
after: getFileID 2.35% (1.05% on getFileIDSlow)

Some more perf data on building linux kernel (x86_64)

before: getFileID 2.4% (1.10% on getFileIDSlow)
after: getFileID 2.35% (1.05% on getFileIDSlow)

What compiler was used as the bootstrap?

For me bootstrapping w/ clang, I observed:

Before:
+    2.11%  clang-15                           [.] clang::SourceManager::getFileIDLocal
After:
+    2.01%  clang-15                           [.] clang::SourceManager::getFileIDLocal

I can test again bootstrapping with gcc.

This is such a small improvement that I'm tempted to say "look elsewhere." But it is an improvement in the hottest method, and gets rid of that awful magic constant 50. It might take a few refactorings to get this method out of the top spot of profiles.

clang/lib/Lex/TokenLexer.cpp
1006–1011

Could you just check that all of the tokens in the partition have the same fileID as the first token?

FileID FirstFID = SM.getFileID(Partition[0]->getLocation());
llvm::all_of(Partition, [&SM, &FirstID](const Token &T) { return SM.getFileID(T.getLocation() == FID; });

or move the assertion into the take_while above so we iterate less?

Some more perf data on building linux kernel (x86_64)

before: getFileID 2.4% (1.10% on getFileIDSlow)
after: getFileID 2.35% (1.05% on getFileIDSlow)

What compiler was used as the bootstrap?

For me bootstrapping w/ clang, I observed:

Before:
+    2.11%  clang-15                           [.] clang::SourceManager::getFileIDLocal
After:
+    2.01%  clang-15                           [.] clang::SourceManager::getFileIDLocal

That's interesting. I also used clang (v14) for bootstrapping.

What's your base revision? I'm using d32b8fdbdb4b99a5cc21604db6211fc506eb1f9b, looking at your profile (clang-15), I think your base revision is older than mine (my profile shows clang-16).

clang/lib/Lex/TokenLexer.cpp
1006–1011

The optimization for this case is that we don't call any getFileID, the getFileID is only needed in the assert sanity check, so moving the assertion to take_while doesn't really work.

I adjust the code to save some unnecessary getFileID call in assert.

Thanks, I think this is worthwhile for the simpler code, better (non-lying) comments, avoiding arg-evaluation-order stuff.

-0.05% time and -0.5% SLocEntries is interesting too! Clearly not earth-shattering but the the offset table idea gives us a lead to follow.
I do have one idea to skip a getFileID() that might not be cached...

clang/lib/Lex/TokenLexer.cpp
999

I think this comment can be shorter while still getting its point across.

// Consecutive tokens not written in macros must be from the same file.
// (Neither #include nor eof can occur inside a macro argument.)
1006

this assertion seems to belong outside the if() - it applies to both the file/macro case?
I'd suggest asserting nonempty first and then the rest as another assertion.
also missing an assertion that if there are any more tokens, the next token has a different FileID

that said with these assertions we should probably check we're not regressing debug performance too much!

1017

this getFileID() call is unneccesary when All.empty() || All.front().getLocation().isFileID().

Worth checking whether bailing out in that case is profitable?

You could do it right at the top:

if (All.size() == 1 || All[0].getLocation().isFileID() != All[1].getLocation().isFileID())
  return All.take_front(1);
1021

nit: s/begin/front/ if you're using back()

1037

hmm, actually maybe just before this line would be the best place to assert that T and BeginLoc are in the same FileID, as it justifies the subtraction

nickdesaulniers added inline comments.Oct 4 2022, 1:17 PM
clang/lib/Lex/TokenLexer.cpp
1006

Right, I do all development and profiles with Release builds with assertions enabled. So avoiding getFileID in release+no_asserts builds is a win, but am somewhat bummed to not get as much a win for my rel+assert builds.

1017

Good point; I'd say "avoid getFileID" at all costs, even to readability/style.

hokein updated this revision to Diff 465321.Oct 5 2022, 1:46 AM
hokein marked 2 inline comments as done.

address comments

hokein added inline comments.Oct 5 2022, 1:48 AM
clang/lib/Lex/TokenLexer.cpp
1006

this assertion seems to belong outside the if() - it applies to both the file/macro case?

yeah, it should apply the else case. Moved it outside -- it seems unnecessary to check the else case, since we actually do the partition based on the file id (that being said, it is somehow like int s = 1+1; assert(s == 2);), but it might be good for code readability.

I do all development and profiles with Release builds with assertions enabled

hmm, I think when doing profiles we probably should use a release build without assertions enabled to give a more correct result, since assertion will slow everything down.

1017

We have handled the 1-element special case in the caller, so All.size() > 1 in this function. Added an assertion.

All[0].getLocation().isFileID() != All[1].getLocation().isFileID()

I tried it when writing this patch, I don't find the result now, but IIRC performance difference is negligible. I'm happy to add this special case since it won't hurt the readability too much.

1037

I agree that this is the best place to put the check-file-id assertion.

Simply adding assert(getFileID(BeginLoc) == getFileID(T.getLocation())) can work, but it creates N-1 unnecessary getFileID calls on BeginLoc for assert build, it is better to avoid it. The only way I can think of is

#ifndef NDEBUG
FileID BeginFID = SM.getFileID(BeginLoc);
assert(BeginFID == getFileID(T.getLocation()));
#endif
sammccall accepted this revision.Oct 5 2022, 3:30 AM
sammccall added inline comments.
clang/lib/Lex/TokenLexer.cpp
1037

I assume you mean with the #if once above, but the assert outside the #if and inside the loop?
That LGTM

There's also #ifdef EXPENSIVE_CHECKS, maybe not needed here though.

This revision is now accepted and ready to land.Oct 5 2022, 3:30 AM
nickdesaulniers accepted this revision.Oct 6 2022, 11:28 AM
nickdesaulniers added inline comments.
clang/lib/Lex/TokenLexer.cpp
1019

Please consider moving this to

#ifdef EXPENSIVE_CHECKS rather than asserts.

This revision was landed with ongoing or failed builds.Oct 7 2022, 12:16 AM
This revision was automatically updated to reflect the committed changes.
hokein marked an inline comment as done.