This is an archive of the discontinued LLVM Phabricator instance.

[memprof] Support symbolization of PIE binaries.
ClosedPublic

Authored by snehasish on Mar 15 2023, 5:11 PM.

Details

Summary

Support symolization of PIE binaries in memprof. We assume that the
profiled binary has one executable text segment for simplicity. Update
the memprof-pic test to now expect the same output as the memprof-basic test.

Diff Detail

Event Timeline

snehasish created this revision.Mar 15 2023, 5:11 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 15 2023, 5:11 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
snehasish requested review of this revision.Mar 15 2023, 5:11 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 15 2023, 5:11 PM
tejohnson added inline comments.Mar 17 2023, 1:16 PM
llvm/lib/ProfileData/RawMemProfReader.cpp
283

Is this limitation going to affect us?

291

Is it expected that p_vaddr will not be page boundary aligned already?

297

I think it might be clearer to report this error in the above loop. Otherwise when reading that code it is a little confusing why PreferredTextSegmentAddress will be overwritten for each executable segment. Similar suggestion when setting ProfiledTextSegment{Start,End} in setupForSymbolization

581

I'm a little confused on the difference between PreferredTextSegmentAddress and ProfiledTextSegmentStart, and the arithmetic being done here. Can you clarify?

snehasish marked an inline comment as done.

Address comments.

PTAL, thanks!

llvm/lib/ProfileData/RawMemProfReader.cpp
283

No, the systems we care about default to 4K. Notable exceptions include newer versions of Darwin which use 16K. See a similar assumption made in llvm-profgen [1].

https://github.com/llvm/llvm-project/blob/main/llvm/tools/llvm-profgen/ProfiledBinary.cpp#L316

291

It most likely will be page (or 2M hugepage aligned) already. The default is to align by 4K [1] and this should almost always be a no-op since even 2M aligned is also 4K aligned.

[1] https://github.com/llvm/llvm-project/blob/main/lld/ELF/Target.h#L112

581

For non-PIE binaries:
PreferredTextSegmentAddress is non-zero, e.g 0x200000 or 0x400000. In this case the loader will put the segment at the virtual address requested by the binary and the ProfiledTextSegmentStart will be the same as PreferredTextSegmentAddress. Thus this arithmetic is a no-op.

For PIE binaries:
PreferredTextSegmentAddress is zero. The loader is free to place the segment at any address (respecting alignment specified by the header). Then the instruction offset in the binary is computed as the virtual address minus the start address of the ProfiledTextSegment. This assumes p_offset = 0 for the executable segment (added an assertion).

Added comments to the code to clarify.

tejohnson accepted this revision.Mar 21 2023, 12:14 PM

lgtm. Couple suggestions for asserts if you think they make sense.

llvm/lib/ProfileData/RawMemProfReader.cpp
291

Consider making this an assert that it is already page aligned?

581

Thanks, the new comment helps. Perhaps add an assert somewhere like:
assert(PreferredTextSegmentAddress == 0 || (PreferredTextSegmentAddress == ProfiledTextSegmentStart));

This revision is now accepted and ready to land.Mar 21 2023, 12:14 PM

Address comments.

snehasish marked 2 inline comments as done.Mar 21 2023, 12:58 PM

Thanks for the review!

snehasish updated this revision to Diff 507098.Mar 21 2023, 1:09 PM

Rebase, remove var only used in assert.

This revision was landed with ongoing or failed builds.Mar 21 2023, 1:13 PM
This revision was automatically updated to reflect the committed changes.