This is an archive of the discontinued LLVM Phabricator instance.

[Pseudo Probe] Placing .pseudoprobe section in the same comdat group with .text
AbandonedPublic

Authored by hoy on Mar 24 2023, 4:25 PM.

Details

Summary

Previously the .pseudoprobe data section of a function was supposed to be placed in its own comdat group so that it can de deduplicated against same-named data (such as weak symbols). Unfortunately a bug in the group name setting blocked such, therefore probes for different copies of the same function were not deduplicated and duplicated probes were reported for the final binary code.

Fixing the bug should unblock the deduplication. However, it may cause another issue where the final function code and the probe data are mismatched, because the code and the probes are deduplicated independently. I'm fixing this be placing the probes in the same comdat group as the code so that they can be deduplicated synchronously.

As an example below, with the current change we have section layout :

COMDAT group section [    3] `.group' [foo] contains 3 sections:
  [Index]    Name
  [    4]   .text.foo
  [    5]   .rela.text.foo
  [   17]   .pseudo_probe.foo
COMDAT group section [    6] `.group' [foo2] contains 3 sections:
  [Index]    Name
  [    7]   .text.foo2
  [    8]   .rela.text.foo2
  [   18]   .pseudo_probe.foo2

COMDAT group section [   10] `.group' [.pseudo_probe_desc_foo] contains 1 sections:
  [Index]    Name
  [   11]   .pseudo_probe_desc
COMDAT group section [   12] `.group' [.pseudo_probe_desc_foo2] contains 1 sections:
  [Index]    Name
  [   13]   .pseudo_probe_desc

Note that the probe desc data are still placed in separate groups because they are not bound to the final code unlike the probe data. Once a function is inlined, its probe data are included in the caller's probe data, but its probe desc is still there in separate. The probe descs do not need to be deduplicated with the code together.

Test Plan:

Diff Detail

Event Timeline

hoy created this revision.Mar 24 2023, 4:25 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 24 2023, 4:25 PM
hoy requested review of this revision.Mar 24 2023, 4:25 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 24 2023, 4:25 PM
hoy edited the summary of this revision. (Show Details)Mar 24 2023, 4:26 PM
hoy added reviewers: wenlei, wlei.
hoy abandoned this revision.Jun 5 2023, 1:36 PM
hoy reclaimed this revision.Jun 13 2023, 3:11 PM
hoy updated this revision to Diff 531087.Jun 13 2023, 3:12 PM

Updating D146853: [Pseudo Probe] Placing .pseudoprobe section in the same comdat group with .txt

hoy retitled this revision from [Pseudo Probe] Placing .pseudoprobe section in the same comdat group with .txt to [Pseudo Probe] Placing .pseudoprobe section in the same comdat group with .text.Jun 13 2023, 3:13 PM
hoy updated this revision to Diff 531090.Jun 13 2023, 3:14 PM

Updating D146853: [Pseudo Probe] Placing .pseudoprobe section in the same comdat group with .text

hoy updated this revision to Diff 531091.Jun 13 2023, 3:15 PM

Updating D146853: [Pseudo Probe] Placing .pseudoprobe section in the same comdat group with .text

hoy updated this revision to Diff 532202.Jun 16 2023, 9:33 AM

Removing an assert.

Thank you for mentioning lld/ELF, otherwise I may not notice this patch:) The part is not so related to the main functionality and I usually have a strong opinion on lld/test/ELF testing...

Actually, for LLVM metadata sections, I think the convention is not to have a suffix like .text.foo. You may want to use .pseudo_probe without the suffix (MCSectionELF::UniqueID).

hoy added a comment.Jun 16 2023, 9:51 AM

Thank you for mentioning lld/ELF, otherwise I may not notice this patch:) The part is not so related to the main functionality and I usually have a strong opinion on lld/test/ELF testing...

A test on lld side sounds good. I'll add it.

Actually, for LLVM metadata sections, I think the convention is not to have a suffix like .text.foo. You may want to use .pseudo_probe without the suffix (MCSectionELF::UniqueID).

D152546 is one implementation that treats .pseudo_probe as an independent section of .text, but after iterations we think they should be combined when it comes to deduplication and gc-sections, so I'm falling back to this patch.

hoy updated this revision to Diff 532230.Jun 16 2023, 10:52 AM

Adding a lld test.

hoy updated this revision to Diff 532231.Jun 16 2023, 10:53 AM

Adding a lld test.

MaskRay added a comment.EditedJun 16 2023, 1:15 PM

Thank you for mentioning lld/ELF, otherwise I may not notice this patch:) The part is not so related to the main functionality and I usually have a strong opinion on lld/test/ELF testing...

A test on lld side sounds good. I'll add it.

Actually, for LLVM metadata sections, I think the convention is not to have a suffix like .text.foo. You may want to use .pseudo_probe without the suffix (MCSectionELF::UniqueID).

D152546 is one implementation that treats .pseudo_probe as an independent section of .text, but after iterations we think they should be combined when it comes to deduplication and gc-sections, so I'm falling back to this patch.

My idea is that we should not have .pseudo_probe.xxx (waste .strtab size). Just have a number of .pseudo_probe (either differentiated by a comdat key or a UniqueID). This scheme matches many other metadata sections from all sorts of instrumentations.

MaskRay added 1 blocking reviewer(s): MaskRay.Jun 16 2023, 1:15 PM
hoy added a comment.Jun 16 2023, 1:36 PM

Thank you for mentioning lld/ELF, otherwise I may not notice this patch:) The part is not so related to the main functionality and I usually have a strong opinion on lld/test/ELF testing...

A test on lld side sounds good. I'll add it.

Actually, for LLVM metadata sections, I think the convention is not to have a suffix like .text.foo. You may want to use .pseudo_probe without the suffix (MCSectionELF::UniqueID).

D152546 is one implementation that treats .pseudo_probe as an independent section of .text, but after iterations we think they should be combined when it comes to deduplication and gc-sections, so I'm falling back to this patch.

My idea is that we should not have .pseudo_probe.xxx (waste .strtab size). Just have a number of .pseudo_probe (either differentiated by a comdat key or a UniqueID). This scheme matches many other metadata sections from all sorts of instrumentations.

Is it required that sections of a comdat group should have a suffix same to the group name? E.g, .rela.text.foo, is it fine to remove the function name from it?

Thank you for mentioning lld/ELF, otherwise I may not notice this patch:) The part is not so related to the main functionality and I usually have a strong opinion on lld/test/ELF testing...

A test on lld side sounds good. I'll add it.

Actually, for LLVM metadata sections, I think the convention is not to have a suffix like .text.foo. You may want to use .pseudo_probe without the suffix (MCSectionELF::UniqueID).

D152546 is one implementation that treats .pseudo_probe as an independent section of .text, but after iterations we think they should be combined when it comes to deduplication and gc-sections, so I'm falling back to this patch.

My idea is that we should not have .pseudo_probe.xxx (waste .strtab size). Just have a number of .pseudo_probe (either differentiated by a comdat key or a UniqueID). This scheme matches many other metadata sections from all sorts of instrumentations.

Is it required that sections of a comdat group should have a suffix same to the group name? E.g, .rela.text.foo, is it fine to remove the function name from it?

There is no such requirement. LLVM instrumentation generally omits the suffix for metadata sections to save .strtab size (related to relocatable object file sizes).

hoy added a comment.EditedJun 16 2023, 1:42 PM

Thank you for mentioning lld/ELF, otherwise I may not notice this patch:) The part is not so related to the main functionality and I usually have a strong opinion on lld/test/ELF testing...

A test on lld side sounds good. I'll add it.

Actually, for LLVM metadata sections, I think the convention is not to have a suffix like .text.foo. You may want to use .pseudo_probe without the suffix (MCSectionELF::UniqueID).

D152546 is one implementation that treats .pseudo_probe as an independent section of .text, but after iterations we think they should be combined when it comes to deduplication and gc-sections, so I'm falling back to this patch.

My idea is that we should not have .pseudo_probe.xxx (waste .strtab size). Just have a number of .pseudo_probe (either differentiated by a comdat key or a UniqueID). This scheme matches many other metadata sections from all sorts of instrumentations.

Is it required that sections of a comdat group should have a suffix same to the group name? E.g, .rela.text.foo, is it fine to remove the function name from it?

There is no such requirement. LLVM instrumentation generally omits the suffix for metadata sections to save .strtab size (related to relocatable object file sizes).

I see. BTW, why is `UniqueID required? I'm seeing even with MCSection::NonUniqueID the sections are still separate, something like

COMDAT group section [    3] `.group' [foo] contains 2 sections:
   [Index]    Name
   [    4]   .text.foo
   [   11]   .pseudo_probe

COMDAT group section [    5] `.group' [foo2] contains 3 sections:
   [Index]    Name
   [    6]   .text.foo2
   [    7]   .rela.text.foo2
   [   12]   .pseudo_probe
hoy abandoned this revision.Jun 17 2023, 4:46 PM