Probes of dead functions may be left over in the final binary for some reason, and they should be discarded during decoding. The llvm-profgen profile generation path should already discard them because it decodes probes on demand. This change fixes the --show-disassembly path, which unconditionally decodes all probes.
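To illustrate the idea of discarding dead-function probes on an unconditional decode path, here is a minimal C++ sketch. It is not the code from this patch: `ProbeRecord`, `decodeLiveProbes`, and the GUID-based liveness set are hypothetical names chosen for illustration, and the sketch assumes the decoder can tell which functions are still present in the binary.

```cpp
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical, simplified stand-in for an encoded pseudo probe record.
struct ProbeRecord {
  uint64_t FuncGuid; // GUID of the function that owns the probe
  uint32_t Index;    // probe index within that function
};

// Walk every encoded record, but keep only probes whose owning function
// is still present in the binary (its GUID is in LiveGuids).
// Assumption: the caller can build LiveGuids from the binary's functions.
std::vector<ProbeRecord>
decodeLiveProbes(const std::vector<ProbeRecord> &Encoded,
                 const std::unordered_set<uint64_t> &LiveGuids) {
  std::vector<ProbeRecord> Decoded;
  for (const ProbeRecord &P : Encoded) {
    if (LiveGuids.count(P.FuncGuid) == 0)
      continue; // probe of a dead function: discard it
    Decoded.push_back(P);
  }
  return Decoded;
}
```

An on-demand decoder gets this filtering for free because it only asks for probes of functions it actually visits; a path that decodes everything needs an explicit check like the one above.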
Probes of dead functions may be left over in the final binary for some reason,
How many probes (in terms of % size) belong to dead functions? Can we avoid emitting these probes in the first place?
Very little. Currently probes are emitted along with the binary code for functions and the linker is responsible for removing dead code. For probes, the linker can also remove them with the implementation D146853: [Pseudo Probe] Placing .pseudoprobe section in the same comdat group with .text. However, since dead probes take up very little space, probably because the ThinLTO compiler backend already does a good job of removing dead functions, I'm switching to the implementation D152546: [Pseudo Probe] Placing .pseudoprobe section in a comdat group, which doesn't enable the linker to remove dead probes but has other benefits.
Currently probes are emitted along with the binary code for functions and the linker is responsible for removing dead code. For probes, the linker can also remove them with the implementation
Why do we still have probes for dead functions with the current implementation, though?
Very little.
How much in terms of .pseudo_probe section size?
I'm switching to the implementation D152546: [Pseudo Probe] Placing .pseudoprobe section in a comdat group, which doesn't enable the linker to remove dead probes but has other benefits.
It looks like the trade-off with the new implementation is that we would allow ICF for the probe section, but would also disable REF when its associated code is being removed?
The current implementation doesn't place pseudo probes in the same comdat group as the text section, so a probe section won't be removed when the linker removes the corresponding text section as dead. This is what D146853 is supposed to fix.
Dead probes can be completely removed only by the ThinLTO backend. The only ones that survive are for functions removed by the native linker.
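The effect of grouping can be sketched with a toy model. This is an illustration under stated assumptions, not how any particular linker is implemented: the `Section` struct, the `keptSections` helper, the integer group ids, and the section names in the usage note are all hypothetical.

```cpp
#include <set>
#include <string>
#include <vector>

// Toy section model; every field is a hypothetical simplification.
struct Section {
  std::string Name;
  int Group;   // id shared by sections that live or die together; -1 = none
  bool IsText; // true for code sections, false for probe sections
  bool Live;   // for text sections: is the code considered reachable?
};

// Names of the sections that survive dead-code removal in this toy model.
std::vector<std::string> keptSections(const std::vector<Section> &Sections) {
  // Collect the groups that contain at least one live text section.
  std::set<int> LiveGroups;
  for (const Section &S : Sections)
    if (S.IsText && S.Live && S.Group >= 0)
      LiveGroups.insert(S.Group);

  std::vector<std::string> Kept;
  for (const Section &S : Sections) {
    bool Keep;
    if (S.IsText)
      Keep = S.Live; // dead code is dropped
    else             // an ungrouped probe section is not tied to its code
      Keep = S.Group < 0 || LiveGroups.count(S.Group) > 0;
    if (Keep)
      Kept.push_back(S.Name);
  }
  return Kept;
}
```

In this model, an ungrouped probe section for a dead function `foo` is kept even though the text for `foo` is dropped, which is the leftover case described above. Giving both sections the same group id makes the probe section disappear together with its text.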
Very little.
How much in terms of .pseudo_probe section size?
Less than 1% savings when I enabled D146853.
I'm switching to the implementation D152546: [Pseudo Probe] Placing .pseudoprobe section in a comdat group, which doesn't enable the linker to remove dead probes but has other benefits.
It looks like the trade-off with the new implementation is that we would allow ICF for the probe section, but would also disable REF when its associated code is being removed?
Right, there is no connection between probes and code with the new implementation. It has the benefit of fully deduplicating probes, even for static functions injected by the compiler, which do not have unique linkage names.
Discussed offline. It makes more sense to move on with D146853: [Pseudo Probe] Placing .pseudoprobe section in the same comdat group with .text.