This is an archive of the discontinued LLVM Phabricator instance.

[lld-macho] Don't emit symbols into the symtab multiple times
AbandonedPublic

Authored by lgrey on Feb 22 2023, 1:39 PM.

Details

Reviewers
int3
JDevlieghere
Group Reviewers
Restricted Project
Summary

https://reviews.llvm.org/D139069 made the entry in symbols for private aliases point directly to the aliasee. This caused a new symbol table entry to be created for the aliasee each time it appeared in symbols which

  • Increases the number of symbols emitted (since the private aliases wouldn't have been emitted as their own symbol)
  • Confuses dsymutil, which says stuff like:
<snip>
warning: (x86_64)  failed to insert symbol '_.str.422' in the debug map.
warning: (x86_64)  failed to insert symbol '_.str.423' in the debug map.
warning: (x86_64)  failed to insert symbol '_.str.424' in the debug map.
warning: (x86_64)  failed to insert symbol '_.str.425' in the debug map.
warning: (x86_64)  failed to insert symbol '_.str.426' in the debug map.
<snip>

This change adds all symbols that have already been added to a set and checks against it to decide whether to add a symbol or not, which has a little bit of overhead (Chromium's base_unittests in ASAN configuration; as one might assume the difference is dwarfed in LTO builds). Not sure if this is significant enough to drop this approach and try something more complex.

+------------------------------------------------------------+
|              x                           +             +   |
|x             *             x             +             +   |
|    |_________A_________|                                   |
|                         |________________A________________||
+------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   5          1.28           1.3          1.29          1.29  0.0070710678
+   5          1.29          1.32          1.31          1.31   0.012247449
Difference at 95.0% confidence
	0.02 +/- 0.0145844
	1.55039% +/- 1.13058%
	(Student's t, pooled s = 0.01)

Diff Detail

Event Timeline

lgrey created this revision.Feb 22 2023, 1:39 PM
Herald added a project: Restricted Project. · View Herald Transcript
Herald added a subscriber: pengfei. · View Herald Transcript
lgrey requested review of this revision.Feb 22 2023, 1:39 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 22 2023, 1:39 PM
int3 added a comment.Feb 23 2023, 12:59 PM

1.5% seems kinda significant. I wonder if I should rethink my approach in D139069... let me try some things

lld/test/MachO/private-label-alias.s
69
lgrey added a comment.Feb 23 2023, 1:39 PM

1.5% seems kinda significant.

FWIW, the margin of error is pretty big, so I'm not sure how real it is. If the other stuff you try out isn't promising, I can try rerunning with a bigger n.

lgrey added a comment.Mar 3 2023, 8:29 AM

Ping :) Have you had a chance to think about alternatives?

int3 added a comment.Mar 3 2023, 10:25 AM

Yeah sorry been a bit swamped but I'll try to put up something by Monday. Basically my plan is to sort the symbols such that weak defs appear after all other definitions, so that when we do weak def coalescing we can bring along all the other local symbols as well. I think it should work, just a matter of how messy the implementation is...

int3 added a comment.Mar 6 2023, 6:05 PM

As promised: D145455: [lld-macho] Coalesce local symbol aliases along with their aliased weak def

Unfortunately I am traveling this week and don't have access to my usual benchmarking machine, so I can't run the numbers, but I think my change should add minimal overhead.

I can do the benchmarks at the EOW and land this then

int3 added a comment.Mar 10 2023, 10:39 PM

Turns out D145455: [lld-macho] Coalesce local symbol aliases along with their aliased weak def is a massive perf win because the old 'solution' was actually a perf regression... heh