This is an archive of the discontinued LLVM Phabricator instance.

[NFC][Lexer] Make Lexer::LangOpts const reference
ClosedPublic

Authored by yurai007 on Feb 22 2022, 8:38 AM.

Details

Summary

This change can be seen as code cleanup but motivation is more performance related.
While browsing perf reports captured during Linux build we can notice unusual portion of instructions executed in std::vector<std::string> copy constructor like:

0.59%     0.58%  clang-14    clang-14      [.] std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >,
                                                                std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::vector

or even:

1.42%     0.26%  clang    clang-14             [.] clang::LangOptions::LangOptions
       |
        --1.16%--clang::LangOptions::LangOptions
                  |
                   --0.74%--std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >,
                            std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::vector

After more digging we can see that relevant LangOptions std::vector members (*Files, ModuleFeatures and NoBuiltinFuncs)
are constructed when Lexer::LangOpts field is initialized on list:

Lexer::Lexer(..., const LangOptions &langOpts, ...)
            : ..., LangOpts(langOpts),

Since LangOptions copy constructor is called by Lexer(..., const LangOptions &LangOpts,...) and local Lexer objects are created thousands times
(in Lexer::getRawToken, Preprocessor::EnterSourceFile and more) during single module processing in frontend it makes std::vector copy constructors surprisingly hot.

Unfortunately even though in current Lexer implementation mentioned std::vector members are unused and most of time empty,
no compiler is smart enough to optimize their std::vector copy constructors out (take a look at test assembly): https://godbolt.org/z/hdoxPfMYY even with LTO enabled.
However there is simple way to fix this. Since Lexer doesn't access *Files, ModuleFeatures, NoBuiltinFuncs and any other LangOptions fields (but only LangOptionsBase)
we can simply get rid of redundant copy constructor assembly by changing LangOpts type to more appropriate const LangOptions reference: https://godbolt.org/z/fP7de9176

Additionally we need to store LineComment outside LangOpts because it's written in SkipLineComment function.
Also FormatTokenLexer need to be adjusted a bit to avoid lifetime issues related to passing local LangOpts reference to Lexer.

After this change I can see more than 1% speedup in some of my microbenchmarks when using Clang release binary built with LTO.
For Linux build gains are not so significant but still nice at the level of -0.4%/-0.5% instructions drop.

Diff Detail

Event Timeline

yurai007 requested review of this revision.Feb 22 2022, 8:38 AM
yurai007 created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptFeb 22 2022, 8:38 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
yurai007 edited the summary of this revision. (Show Details)Feb 22 2022, 8:38 AM
yurai007 edited the summary of this revision. (Show Details)Feb 22 2022, 8:41 AM

The performance implications are pretty interesting! Have you tried avoiding the copy altogether by just storing the reference?

yurai007 added a comment.EditedFeb 22 2022, 11:51 AM

The performance implications are pretty interesting! Have you tried avoiding the copy altogether by just storing the reference?

Yes, actually that was my first attempt but I failed miserably. LangOpts::LineComment is mutated in Lexer so LangOpts cannot be simply const&. Making it non-const reference is technically doable with some extra adjustment for Lexer constructors.
However I'm worried a bit how it would impact external LangOpts users after mutating LineComment member by Lexer. As long as LangOpts acts as simple cache it's easy to reason about but when we start to share it with Lexer callers I'm not longer sure. That's why eventually base class approach was chosen.

The performance implications are pretty interesting! Have you tried avoiding the copy altogether by just storing the reference?

Yes, actually that was my first attempt but I failed miserably. LangOpts::LineComment is mutated in Lexer so LangOpts cannot be simply const&. Making it non-const reference is technically doable with some extra adjustment for Lexer constructors.
However I'm worried a bit how it would impact external LangOpts users after mutating LineComment member by Lexer. As long as LangOpts acts as simple cache it's easy to reason about but when we start to share it with Lexer callers I'm not longer sure. That's why eventually base class approach was chosen.

Got it. It seems like getting rid of the LineComment mutation would be pretty straightforward though. I don't expect any big perf win by replacing copy of a handful of bitfields with indirection, but it would be interesting to see the experiment nonetheless.

The performance implications are pretty interesting! Have you tried avoiding the copy altogether by just storing the reference?

Yes, actually that was my first attempt but I failed miserably. LangOpts::LineComment is mutated in Lexer so LangOpts cannot be simply const&. Making it non-const reference is technically doable with some extra adjustment for Lexer constructors.
However I'm worried a bit how it would impact external LangOpts users after mutating LineComment member by Lexer. As long as LangOpts acts as simple cache it's easy to reason about but when we start to share it with Lexer callers I'm not longer sure. That's why eventually base class approach was chosen.

Got it. It seems like getting rid of the LineComment mutation would be pretty straightforward though. I don't expect any big perf win by replacing copy of a handful of bitfields with indirection, but it would be interesting to see the experiment nonetheless.

Ok. I can explore more this direction and come back with results.

A small local test hints that using a `SmallVector instead of a std::vector` also fixes the assembly bloat. I don't know if it's worth considering as an alternative or as an extra step.

yurai007 added a comment.EditedFeb 23 2022, 6:06 AM

A small local test hints that using a `SmallVector instead of a std::vector` also fixes the assembly bloat. I don't know if it's worth considering as an alternative or as an extra step.

Ok, I'm gonna try SmallVectors on top of this change. LangOptions are widely used so maybe it could bring some extra gains in places where vectors are indeed filled with more data.

yurai007 updated this revision to Diff 410801.Feb 23 2022, 6:23 AM
yurai007 retitled this revision from [NFC][Lexer] Use more appropriate LangOptionsBase type for Lexer::LangOpts to [NFC][Lexer] Make Lexer::LangOpts const reference.
yurai007 edited the summary of this revision. (Show Details)

Use simpler approach with const LangOptions reference + rebase.

I like this approach. it seems a lot easier to maintain as people don't have to remember to use LangOptionsBase to not degrade performance

clang/include/clang/Lex/Lexer.h
94

Should we add a comment to explain why this is a reference?

nikic accepted this revision.Feb 24 2022, 12:12 AM

Looks reasonable to me.

This revision is now accepted and ready to land.Feb 24 2022, 12:12 AM
jansvoboda11 accepted this revision.Feb 24 2022, 12:53 AM

LGTM too if the performance remains the same as with the original revision.

yurai007 updated this revision to Diff 411368.Feb 25 2022, 3:13 AM

Unfortunately after change there is lifetime issue in FormatTokenLexer because LangOpts received from getFormattingLangOpts has automatic storage duration and Lexer gets reference to it.
Fix issue and poke CI to make sure clang-tools-extra tests on CI are passing. For now it's only about making CI happy, I'm gonna check also if nice performance gains are preserved (I guess they should be).

I like this approach. it seems a lot easier to maintain as people don't have to remember to use LangOptionsBase to not degrade performance

I like it either but we need to be careful about LangOptions lifetime as my previous comment shows.

yurai007 added inline comments.Feb 25 2022, 3:16 AM
clang/include/clang/Lex/Lexer.h
94

Good idea. It will be added.

yurai007 updated this revision to Diff 411770.Feb 28 2022, 2:58 AM
yurai007 edited the summary of this revision. (Show Details)
yurai007 marked an inline comment as done.

All comments were addressed. Since broken libarcher/libomp/libomptarget UT are seen on others patches too I think it has nothing to do with this particular change. There is no speed degradation in comparison to LangOptionsBase approach so I believe now change is mature enough to submit.

This revision was landed with ongoing or failed builds.Feb 28 2022, 6:48 AM
This revision was automatically updated to reflect the committed changes.