This is an archive of the discontinued LLVM Phabricator instance.

[llvm-profgen] Use shared string table for function name to save memory usage
AcceptedPublic

Authored by wlei on Dec 17 2021, 9:49 AM.

Details

Reviewers
hoy
wenlei
Summary

Currently we have three maps all using function name std::string, a) FuncName in pseudo probe section(GUID2FuncDescMap) b) FuncName in debug info(BinaryFunctions) c) symbolizer (NameStrings). We can optimize to use one shared map and other two use StringRef to save memory usage.

Diff Detail

Unit TestsFailed

Event Timeline

wlei created this revision.Dec 17 2021, 9:49 AM
wlei requested review of this revision.Dec 17 2021, 9:49 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 17 2021, 9:49 AM
wlei edited the summary of this revision. (Show Details)Dec 17 2021, 9:52 AM
wlei added reviewers: hoy, wenlei.
wenlei added inline comments.Dec 17 2021, 11:10 AM
llvm/lib/MC/MCPseudoProbe.cpp
354

Just double checking, so the underlying data for probe section is freed after decoding? Where does that happen?

hoy added a comment.Dec 17 2021, 11:23 AM

Thanks for the change. How much space do you see this helps save?

llvm/lib/MC/MCPseudoProbe.cpp
354

The real name is on the disk. When read, a temp object ErrorOrName is created which is then freed.

wenlei added inline comments.Dec 17 2021, 11:29 AM
llvm/lib/MC/MCPseudoProbe.cpp
354

ErrorOrName doesn't own the underlying data. I was asking about the underlying data. The data is MCPseudoProbeDecoder::Data. Is OwningBinary taking care of the freeing?

hoy added inline comments.Dec 17 2021, 11:35 AM
llvm/lib/MC/MCPseudoProbe.cpp
354

Yes, the binary object owns that underlying data buffer, i.e, pointed by const uint8_t *Start.

wlei added a comment.Dec 17 2021, 2:52 PM

Thanks for the change. How much space do you see this helps save?

I just tested it on our search infra, surprisingly I didn't see any memory saved by this. I dived more and it showed that the FuncNameStrings is not a big consumer, only ~1GB compared to the total consumption (200GB+), that's petty small. Another guess is the library have some optimization to allocate the string memory, but I'm not quite familiar with underlying optimization of string allocation.

llvm/lib/MC/MCPseudoProbe.cpp
354

Yeah, the input binary info OwningBinary<Binary> OBinary is all freed after load() completed, otherwise we can directly use StringRef before.

hoy accepted this revision.Dec 17 2021, 3:05 PM

It's still good to have this optimization, thanks.

This revision is now accepted and ready to land.Dec 17 2021, 3:05 PM
wenlei accepted this revision.Dec 18 2021, 11:58 PM

lgtm, thanks.