This is an archive of the discontinued LLVM Phabricator instance.

[SourceManager] Improve getFileIDLocal.
ClosedPublic

Authored by hokein on Oct 4 2022, 12:55 AM.

Details

Summary

Prune the search space -- If we know offset(LastFileIDLookup) < SearchOffset, we
can prune the initial binary-search range from [0, end) to [LastFileIDlookup, end).

It reduces the binary search scan by ~30%.

SemaExpr.cpp: 1393437 -> 1035426
FindTarget.cpp: 1275930 -> 920087

Linux kernel:
getFileIDLocal: 2.45% -> 2.15%

Diff Detail

Event Timeline

hokein created this revision.Oct 4 2022, 12:55 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 4 2022, 12:55 AM
hokein requested review of this revision.Oct 4 2022, 12:55 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 4 2022, 12:55 AM

Nice!

clang/lib/Basic/SourceManager.cpp
799

hmm, while here, why not just get rid of I and use GreaterIndex throughout?

Of its uses: line 806 and 815 would be clearer, 814 would be less clear, 796 and 813 are the same, and 827 goes away entirely.

(This is a little thing to nitpick, but I always hate trying to read this function)

810–811

this comment duplicates your new one above

902–903

is the same optimization available here?
I think getFileIDLoaded() is important for module builds and when we use PCH.
Though admittedly I don't think this is relevant to the lexer path we've been looking at: those locations should always be local.

hokein updated this revision to Diff 465298.Oct 5 2022, 12:34 AM
hokein marked 2 inline comments as done.

Remove the I variable, use GreaterIndex instead.

clang/lib/Basic/SourceManager.cpp
902–903

Yeah. That's my next step. I plan to send a separate patch for it.

sammccall accepted this revision.Oct 5 2022, 12:54 AM
sammccall added inline comments.
clang/lib/Basic/SourceManager.cpp
810

Just checking: we decrement GreaterIndex without a bounds check in the loop and then dereference it in the loop.

I guess this is safe because:

  • in the case where it's equal to size(), it's nonzero due to a dummy entry at 0 (and unchanged in this patch)
  • in the case where the cache is used, it's also nonzero (guarded)

Maybe this is worth an assertion?

This revision is now accepted and ready to land.Oct 5 2022, 12:54 AM
hokein updated this revision to Diff 465325.Oct 5 2022, 2:43 AM
hokein marked an inline comment as done.

add an assertion.

nickdesaulniers accepted this revision.Oct 6 2022, 12:01 PM
nickdesaulniers added inline comments.
clang/lib/Basic/SourceManager.cpp
800

Consider renaming this LesserIndex which matches with GreaterIndex better.

800

So this comment was wrong?

hokein updated this revision to Diff 465988.Oct 7 2022, 12:36 AM
hokein marked an inline comment as done.

update

clang/lib/Basic/SourceManager.cpp
800

This comment was true before the patch. I have updated it to match the current code.

800

I'm not sure, it doesn't seem to be much better (I'm leaning towards keeping it as-is).

810

yes, exactly. added an assertion.

This revision was landed with ongoing or failed builds.Oct 7 2022, 12:37 AM
This revision was automatically updated to reflect the committed changes.