This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Avoid scanning up to end of file on each comment!
ClosedPublic

Authored by sammccall on Oct 5 2022, 1:27 PM.

Details

Summary

Assigning char* (pointing at comment start) to StringRef was causing us
to scan the rest of the source file looking for the null terminator.

This seems to be eating about 8% of our *total* CPU!

While fixing this, factor out the common bits from the two places we're
parsing IWYU pragmas.

Diff Detail

Event Timeline

sammccall created this revision.Oct 5 2022, 1:27 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 5 2022, 1:27 PM
Herald added a subscriber: arphaman. · View Herald Transcript
sammccall requested review of this revision.Oct 5 2022, 1:27 PM
kadircet accepted this revision.Oct 6 2022, 2:01 AM

wow, that's an interesting finding. thanks!

clang-tools-extra/clangd/Headers.cpp
25–35

i think this also deserves a comment to make sure people won't refactor it in the future to take in a stringref.

25–35

i'd actually return an empty stringref, instead of None. it makes logic in callers less complicated and I don't think we convey any different signal between None vs empty right now (at least a signal that can be used by the callers)

27

nit: static constexpr StringLiteral instead (we can just use size() afterwards instead of calling strlen a bunch.

clang-tools-extra/clangd/unittests/HeadersTests.cpp
456

it might be useful to get rid of the space between : and keep as well.

This revision is now accepted and ready to land.Oct 6 2022, 2:01 AM
sammccall marked 3 inline comments as done.Oct 6 2022, 2:35 AM
sammccall added inline comments.
clang-tools-extra/clangd/Headers.cpp
25–35

The comment is in the header file, extended it to be more explicit about avoiding StringRef

25–35

it makes logic in callers less complicated

I think this encourages a risky pattern for very minimal gain.
We've seen comment handlers are performance-sensitive (each strlen of the source file isn't that expensive!).
If we bail out explicitly early in the overwhelmingly common no-pragma case, then the remaining code is not critical.
If we continue on and "naturally" fail to match any pragma case, I don't see performance problems today, however we need to be extra-careful whenever we change the code. I'd much prefer to have to read/write an obvious if (empty) return than to think about performance.

If we're going to prefer an early bailout, StringRef isn't really simpler than Optional<StringRef>, and it pushes callers to handle this the right way.

This revision was automatically updated to reflect the committed changes.
sammccall marked an inline comment as done.

BTW it looks like the 8% profile I saw was an outlier, 3-4% seems more typical.