This change supports to run without parsing MMap binary loading events instead it always assumes binary is loaded at the preferred address. This is used when we have assured no binary load address changes or we have pre-processed the addresses resolution. Warn if there's interior mmap event but without leading mmap events.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Having both "--show-mmap-events --ignore-mmap-events" seems a bit counter-intuitive.
If we already preprocessed the mmap events, could we also assume that we have the right pids and hence don't need --show-mmap-events?
Can the mmap events be treated as optional, so when they present we parse them otherwise just move forward to other events?
Thanks for the feedback, the --show-mmap-events here is different from the linux perf script, here is to show the mmap events paring results, only for debugging use I think.
And in this test, see the directive: ; CHECK-IGNORE-NOT: Mmap: Binary /home/a.out loaded at 0x400000 which means --show-mmap-events won't work with --ignore-mmap-events.
I can add a warning here if it's confusing.
That might have issues with the leading samples before we meet a MMAP, they won't know which binary it belongs to. Currently If not specify ignore-mmap-events, we will always check the Addr2Binary table which is empty without mmap events parsed.
Can this be fixed? Without mmap events we can just use the preferred address as the load address.
It sounds a bit unclear to me by giving ignore-mmap-events when the map events are there. That means we basically do not trust those events.
Understood! My concern is if the scripts has mmap events and only the leading sample missed the load address, they still can be loaded at a different base address which is not the preferred address. it's wrong base address in that case,
How about change it to always-trust-mmap-events, with this, the leading no-matched samples will be dropped. otherwise, it will use preferred address as load address.
If that's a real case, we'll need to fix that. For now I think we want to cover to cases: 1. with leading mmaps 2. without leading mmaps (assuming the preferred address is the real load address). Have you seen a third case? I'm wondering if the input is invalid.
Oh, I see, you mean if turn on --show-mmap-events, it will always have the leading mmap event, otherwise it's the invalid input? I indeed haven't seen the third case before. If so, that makes sense to use preferred address when without leading mmaps.
I think --show-mmap-events should always give leading mmap events or no mmap events at all. We should warn or error out when only interior mmap events are seen.
+1. I was thinking about the same..
We can start with assuming all binaries are loaded at preferred based, then on mmap event we update the base for binary, on other event, we process them using the current base (would be preferred base if no relevant mmap is seen before), and warn if there's no mmap before that event?
The above would be the default behavior without any switch. Is that what we're trying to do? If we want we could add a switch to always require mmap and error out if we don't find mmap for a binary.
This is my two cents.
then on mmap event we update the base for binary,
I was thinking about warning or erroring for this. Erroring is probably two strong. An interior mmap event without a corresponding leading event likely means the profiling system doesn't work consistently, i.e., it sometimes grabs mmap event but sometimes not.
addressing reviewers' feedback, default to use preferred adddress as loading address without mmap events, give warning if no leading mmap events
llvm/tools/llvm-profgen/PerfReader.cpp | ||
---|---|---|
320–326 | nit: prefered -> preferred | |
745 | Instead of a prescan for leading mmap events based on limited number of lines, can we compute a flag, say EventSeenFirst, the value of which can be sample events or mmap events, during parseAndAggregateTrace and check it in parseMMap2Event? |
llvm/tools/llvm-profgen/PerfReader.cpp | ||
---|---|---|
745 | Good idea! Fixed. |
llvm/tools/llvm-profgen/PerfReader.cpp | ||
---|---|---|
666 | The term "Interior mmap event" could be confusing.. The warning I was thinking is for when LBR/stack sample is seen before a relevant mmap is seen, in which case we warn about assuming preferred based address due to not seeing corresponding mmap. It doesn't matter whether we found other mmap event later or not. |
llvm/tools/llvm-profgen/PerfReader.cpp | ||
---|---|---|
666 | That sounds reasonable. In that case, the warning should be issued in parseSample for the first sample that does not have a leading mmap event. |
LGTM, thanks.
llvm/tools/llvm-profgen/PerfReader.h | ||
---|---|---|
657 | Nit: add a comment to indicate what the flag is used for. |
llvm/tools/llvm-profgen/PerfReader.cpp | ||
---|---|---|
666 | Sorry if I wasn't clear. But note the "relevant mmap" in my previous comment - the way EventSeen is set here doesn't guarantee that the mmap seen is relevant for the binary we're interested in. Say if we want to generate profile for foo.so, we'd still want to warn if mmap for foo.so isn't seen (we may have seen mmap for other binaries) before we have a sample that is concluded to belong to foo.so's address space based on the preferred base. |
llvm/tools/llvm-profgen/PerfReader.cpp | ||
---|---|---|
666 | That's more precise. EventSeen should be set after the mmap event for the current binary is analyzed. |
llvm/tools/llvm-profgen/PerfReader.cpp | ||
---|---|---|
666 | As for the implementation, it's more like a MMapSeen member of ProfiledBinary. |
Thanks for making the changes. It's almost ready, except two more tweaks.
llvm/tools/llvm-profgen/PerfReader.cpp | ||
---|---|---|
650–657 | Would it be cleaner if we move this to be after line 646 - at that point, we've located the binary using top addresses from stack sample, and we're ready to warn even if we don't get any lbr extracted due to leading outgoing branches. | |
llvm/tools/llvm-profgen/ProfiledBinary.h | ||
145 ↗ | (On Diff #363506) | Do we need a three-way Optional value here? Would it work if we just always initialize it to false, and set it to true when mmap is seen for the binary? |
lgtm with nit on flags. thanks!
llvm/tools/llvm-profgen/ProfiledBinary.h | ||
---|---|---|
145 ↗ | (On Diff #363506) | Ok, I see, thanks for clarification. In that case, after resetting, the meaning of the flag no longer abides to its definition described in the comment above? Mmap is still not seen, but we reset it to unknown just to avoid duplicated warning.. May consider a separate flag (like MissingMMapWarned) is cleaner. Another nit: we use upper case initial for member field - IsLoadedByMMap |
llvm/tools/llvm-profgen/PerfReader.cpp | ||
---|---|---|
325 | With IsLoadedByMMap initialized to false, this is no longer needed? (We don't setMissingMMapWarned here either) |
remove a redundant assignment of IsLoadedByMMap
llvm/tools/llvm-profgen/PerfReader.cpp | ||
---|---|---|
325 | Good catch, fixed! |
Nit: add a comment to indicate what the flag is used for.