This is an archive of the discontinued LLVM Phabricator instance.

[clangd][NFC] Reduce memory usage while building dex
ClosedPublic

Authored by kadircet on Apr 22 2022, 1:51 AM.

Details

Summary

Posting lists were copied into the final inverted index rather than
being moved. Also we weren't shrinking the vectors. This patch should help with
both peak memory usage while building dex (preamble/background-index) and it's
idle memory usage (unclear if copy constructor of vector preserves the capacity
or not).

Diff Detail

Event Timeline

kadircet created this revision.Apr 22 2022, 1:51 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 22 2022, 1:51 AM
kadircet requested review of this revision.Apr 22 2022, 1:51 AM
kadircet updated this revision to Diff 424417.Apr 22 2022, 2:22 AM

Get rid of shrink_to_fit as PostingList representation doesn't really make a copy of the vector, but rather build a new representation.

sammccall added inline comments.Apr 22 2022, 2:50 AM
clang-tools-extra/clangd/index/dex/Dex.cpp
112

the move here does nothing, we're passing as ArrayRef

did you want to clear the map values (vectors) as you went, too?

(this gets a bit verbose, but you could pull out a function template since all 4 cases are so regular)

kadircet added inline comments.Apr 22 2022, 5:51 AM
clang-tools-extra/clangd/index/dex/Dex.cpp
112

did you want to clear the map values (vectors) as you went, too?

yes that's the idea. would you prefer an explicit destructor call?

(this gets a bit verbose, but you could pull out a function template since all 4 cases are so regular)

the problem is 1 of these is a densemap, other three is a stringmap, and onther is a vector. densemap and the stringmap doesn't really go together because there's no common method to get the key (densemap has first as a field, or getFirst() as a method. stringmap has first() or getKey() as methods :/).
happy to move out just the three of them into a helper like:

buildPostingList(StringMap<..>, TokenKind, Out);

if you think that's still helpful (that way we can also get rid of the calls to clear since we can just move the maps into the function arguments).

sammccall accepted this revision.Apr 22 2022, 6:17 AM
sammccall added inline comments.
clang-tools-extra/clangd/index/dex/Dex.cpp
112

yes that's the idea. would you prefer an explicit destructor call?

I mean e.second.clear() in the loop.
As it stands, you're only destroying them a whole map at a time, and trigrams is (i think) >50% of the total

the problem is 1 of these is a densemap, other three is a stringmap, and onther is a vector.

Sounds too complicated, nevermind!

This revision is now accepted and ready to land.Apr 22 2022, 6:17 AM
kadircet updated this revision to Diff 424489.Apr 22 2022, 8:40 AM
kadircet marked 2 inline comments as done.

Copy-assign to empty containers rather than clear.

This revision was landed with ongoing or failed builds.Apr 22 2022, 8:45 AM
This revision was automatically updated to reflect the committed changes.