This is an archive of the discontinued LLVM Phabricator instance.

[memprof] Filter out callstack frames which cannot be symbolized.
ClosedPublic

Authored by snehasish on Mar 2 2022, 5:51 PM.

Details

Summary

This patch filters out callstack frames which can't be symbolized or if
the frames belong to the runtime. Symbolization may not be possible if
debug information is unavailable or if the addresses are from a shared
library. For now we only support optimization of the main binary which
is statically linked to the compiler runtime.

Diff Detail

Event Timeline

snehasish created this revision.Mar 2 2022, 5:51 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 2 2022, 5:51 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
snehasish requested review of this revision.Mar 2 2022, 5:51 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 2 2022, 5:51 PM
snehasish updated this revision to Diff 412752.Mar 3 2022, 9:54 AM

Simplify discarded vaddrs check, rename method to indicate filtering and add comments.

tejohnson added inline comments.Mar 3 2022, 1:57 PM
llvm/lib/ProfileData/RawMemProfReader.cpp
308–312

Nit: parses slightly better if you say "or if we don't..."

323

I think operator new should go through memprof_new_delete.cpp? Does FileName contain any of the path, or do we have it available anywhere? I.e. could we match on "/memprof/memprof_"?

344

Rather than scan CallStack again, can you keep a flag initialized to false above the prior loop, set to true if any entries are symbolized, and check that here?

llvm/test/tools/llvm-profdata/memprof-basic.test
37

s/originae/originate/

45

Should this now be 2? Ditto for NumStackOffsets below.

snehasish updated this revision to Diff 412862.Mar 3 2022, 4:21 PM

Address comments.

snehasish updated this revision to Diff 412865.Mar 3 2022, 4:37 PM
snehasish marked 3 inline comments as done.

Add TODO to update summary print.

PTAL, thanks!

llvm/lib/ProfileData/RawMemProfReader.cpp
344

In practice all callstacks have at least one item to discard which originates from a shared library, i.e. the call to main from glibc at __libc_start_main in libc-start.c so adding a flag here doesn't help.

An alternative way to filter would be to use the procmaps to identify all addresses within the main binary and only retain those while reading the raw profile, however we don't record build ids on linux during profiling today (sanitizer_procmaps_linux does not support it). We could use a heuristic to identify the main binary but matching it to the profiled binary (passed in during llvm-profdata merge) build id would be best.

llvm/test/tools/llvm-profdata/memprof-basic.test
45

This is coming from the raw profile without symbolization and pruning. If you don't feel strongly, I'll update the summary code in a separate patch. I've added a TODO in RawMemProfReader.cpp for now.

tejohnson added inline comments.Mar 3 2022, 4:51 PM
llvm/lib/ProfileData/RawMemProfReader.cpp
344

Nevermind, I mistakenly thought this was removing all of them, but it may only be removing some frames.

llvm/test/tools/llvm-profdata/memprof-basic.test
45

That's fine, but I'm confused as to how the output of this file changed then, since it looks like printYAML prints the records immediately after the summary, but now there are only 2 MIB records printed? Is it the case that the records have been filtered by that point but the summary hasn't been updated accordingly?

snehasish added inline comments.Mar 3 2022, 5:02 PM
llvm/test/tools/llvm-profdata/memprof-basic.test
45

The issue here is that printSummaries generates the data to print on the fly by reading the DataBuffer (i.e. raw profile file) directly. This was introduced to have some basic testing and we need to migrate it to generating the summary from the internal data structures that we have now. I hope that makes it clear.

This revision is now accepted and ready to land.Mar 3 2022, 7:26 PM
This revision was landed with ongoing or failed builds.Mar 4 2022, 11:11 AM
This revision was automatically updated to reflect the committed changes.