[Internalize] Rename instead of removal if a to-be-internalized comdat has more than one member

Authored by MaskRay on May 25 2021, 2:15 PM.

Besides the comdat any deduplication feature, instrumentations use comdats to
establish dependencies among a group of sections, so that section-based
linker garbage collection cannot discard some members without discarding all.
LangRef acknowledges this usage with the following wording:

All global objects that specify this key will only end up in the final object file if the linker chooses that key over some other key.

On ELF, for PGO instrumentation, a __llvm_prf_cnts section and its associated
__llvm_prf_data section are placed in the same GRP_COMDAT group. A
__llvm_prf_data section is usually not referenced and relies on the liveness
of its associated __llvm_prf_cnts section to be retained.

The setComdat(nullptr) code (added by D10679) in InternalizePass can break this
use case: a __llvm_prf_data section may be dropped while its associated
__llvm_prf_cnts section is retained.
The main goal of this patch is to fix the dependency relationship.
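
The breakage described above can be illustrated with a toy model of
section-based garbage collection. This is only a sketch under simplifying
assumptions, not real linker code: the Section struct and the collectLive
helper are hypothetical names invented for illustration, and the only rule
modeled is that a live group member keeps every other member of its group.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy model of a section seen by the linker (hypothetical, for illustration).
struct Section {
  std::string name;
  std::string group;              // empty: not a member of any section group
  std::vector<std::string> refs;  // names of sections this one references
};

// Returns the names of the sections retained when marking starts at `roots`.
std::set<std::string> collectLive(const std::vector<Section> &sections,
                                  const std::vector<std::string> &roots) {
  std::map<std::string, const Section *> byName;
  for (const Section &s : sections)
    byName[s.name] = &s;

  std::set<std::string> live;
  std::vector<std::string> worklist(roots.begin(), roots.end());
  while (!worklist.empty()) {
    std::string name = worklist.back();
    worklist.pop_back();
    auto it = byName.find(name);
    if (it == byName.end() || !live.insert(name).second)
      continue;  // unknown section, or already marked live
    const Section &s = *it->second;
    for (const std::string &r : s.refs)
      worklist.push_back(r);
    // Group rule: a live member retains all other members of its group.
    if (!s.group.empty())
      for (const Section &other : sections)
        if (other.group == s.group)
          worklist.push_back(other.name);
  }
  return live;
}
```

In this model, a function section that references __llvm_prf_cnts keeps
__llvm_prf_data alive only while both sections share a group. Clearing the
group on both sections, which is the effect of setComdat(nullptr), leaves
__llvm_prf_cnts live and __llvm_prf_data discarded.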

I think it makes sense for InternalizePass to internalize a comdat and thus
suppress the deduplication feature, e.g. a relocatable link of a regular LTO can
create an object file affected by InternalizePass.
Without that suppression, if an internal comdat in b.o prevails over a
non-internal comdat in a.o, the a.o references to the comdat definitions
become unresolvable (references cannot bind to STB_LOCAL definitions in b.o).
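
The a.o/b.o scenario can be sketched with a small model of comdat selection.
The model assumes that the first group seen for a signature prevails, that
signatures are matched purely by name, and that an object whose group is
discarded still references that group's symbols. Group, ObjectFile and
unresolved are hypothetical names for illustration, not part of any linker.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

struct Group {
  std::string signature;             // comdat name, matched by name only
  bool local;                        // true if the definitions are STB_LOCAL
  std::vector<std::string> symbols;  // symbols defined by the group's members
};

struct ObjectFile {
  std::string name;
  std::vector<Group> groups;
};

// Links `objs` in order. Returns "object:symbol" for every symbol whose
// defining group was discarded and that cannot bind to a global definition.
std::vector<std::string> unresolved(const std::vector<ObjectFile> &objs) {
  std::set<std::string> seenSignatures;
  std::set<std::string> globalDefs;
  std::vector<std::pair<std::string, std::string>> discarded;

  for (const ObjectFile &obj : objs) {
    for (const Group &g : obj.groups) {
      if (seenSignatures.insert(g.signature).second) {
        // Prevailing group: only non-local definitions are globally visible.
        if (!g.local)
          for (const std::string &sym : g.symbols)
            globalDefs.insert(sym);
      } else {
        // Discarded group: its symbols become undefined in this object.
        for (const std::string &sym : g.symbols)
          discarded.push_back({obj.name, sym});
      }
    }
  }

  std::vector<std::string> result;
  for (const auto &entry : discarded)
    if (!globalDefs.count(entry.second))
      result.push_back(entry.first + ":" + entry.second);
  return result;
}
```

Linking b.o (internal group foo) before a.o (non-internal group foo) reports
a.o:foo as unresolved in this model. If the internal group in b.o carries a
different signature, both groups are kept and nothing is unresolved.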

On PE-COFF, for a non-external selection symbol, deduplication is naturally
suppressed with link.exe and lld-link. However, the behavior is unclear on ELF,
and I tend to believe the spec authors did not consider this use case (see
D102973).

GNU ld and gold still use the "signature is name based" interpretation.
So even if D102973 for ld.lld is accepted, renaming the comdat is the more
portable approach. A comdat with a single member is the common case, and
keeping it can waste (sizeof(Elf64_Shdr)+4*2) bytes, so we optimize by
deleting the comdat in that case; otherwise we rename the comdat.
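
As a rough sketch of the rule above (not the code in this patch), the
decision and the size estimate can be written as follows. The Elf64Shdr
struct spells out the ELF64 section header fields by hand, the 4*2 term is
read as the group flag word plus one member index, and Global,
internalizeComdats and the suffix parameter are hypothetical names. The
sketch assumes every comdat in the list is being internalized.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// ELF64 section header layout, written out so <elf.h> is not required.
struct Elf64Shdr {
  std::uint32_t sh_name;
  std::uint32_t sh_type;
  std::uint64_t sh_flags;
  std::uint64_t sh_addr;
  std::uint64_t sh_offset;
  std::uint64_t sh_size;
  std::uint32_t sh_link;
  std::uint32_t sh_info;
  std::uint64_t sh_addralign;
  std::uint64_t sh_entsize;
};

// One section header plus two 4-byte words of group section content
// (assumed here: the flag word and a single member index).
constexpr std::size_t kSingleMemberGroupCost = sizeof(Elf64Shdr) + 4 * 2;
static_assert(kSingleMemberGroupCost == 72, "64-byte header + 8 bytes");

// Hypothetical stand-in for a global object and the comdat it belongs to.
struct Global {
  std::string name;
  std::string comdat;  // empty: not in a comdat
};

// Drops a comdat that has exactly one member; renames it otherwise so the
// members stay grouped but no longer share a name with other comdats.
void internalizeComdats(std::vector<Global> &globals,
                        const std::string &suffix) {
  std::map<std::string, int> memberCount;
  for (const Global &g : globals)
    if (!g.comdat.empty())
      ++memberCount[g.comdat];

  for (Global &g : globals) {
    if (g.comdat.empty())
      continue;
    if (memberCount[g.comdat] == 1)
      g.comdat.clear();    // single member: the group adds only overhead
    else
      g.comdat += suffix;  // several members: keep the dependency
  }
}
```

Members of a multi-member comdat all receive the same new name, so the
dependency among them is preserved, while a lone member simply loses its
comdat and saves the 72 bytes computed above.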

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D103043