This is an archive of the discontinued LLVM Phabricator instance.

[lld][macho] Stop grouping symbols by sections in mapfile.
ClosedPublic

Authored by Roger on Nov 29 2021, 12:29 PM.

Details

Reviewers
int3
Group Reviewers
Restricted Project
Commits
rGf84023a812b6: [lld][macho] Stop grouping symbols by sections in mapfile.
Summary

As per Bug 50689,

2. getSectionSyms() puts all the symbols into a map of section -> symbols, but this seems unnecessary. This was likely copied from the ELF port, which prints a section header before the list of symbols it contains. But the Mach-O map file doesn't print these headers.

This diff removes getSectionSyms() and keeps all symbols in a flat vector.

What does ld64's mapfile look like?

$ llvm-mc -filetype=obj -triple=x86_64-apple-darwin test.s -o test.o
$ llvm-mc -filetype=obj -triple=x86_64-apple-darwin foo.s -o foo.o
$ ld -map map test.o foo.o -o out -L/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/lib -lSystem
# Path: out
# Arch: x86_64
# Object files:
[  0] linker synthesized
[  1] test.o
[  2] foo.o
# Sections:
# Address       Size            Segment Section
0x100003FB7     0x00000001      __TEXT  __text
0x100003FB8     0x00000000      __TEXT  obj
0x100003FB8     0x00000048      __TEXT  __unwind_info
0x100004000     0x00000001      __DATA  __common
# Symbols:
# Address       Size            File  Name
0x100003FB7     0x00000001      [  1] _main
0x100003FB8     0x00000000      [  2] _foo
0x100003FB8     0x00000048      [  0] compact unwind info
0x100004000     0x00000001      [  1] _number

Perf numbers when linking chromium framework on a 16-Core Intel Xeon W Mac Pro:

base           diff           difference (95% CI)
sys_time   1.406 ± 0.020  1.388 ± 0.019  [  -1.9% ..   -0.6%]
user_time  5.557 ± 0.023  5.914 ± 0.020  [  +6.2% ..   +6.6%]
wall_time  4.455 ± 0.041  4.436 ± 0.035  [  -0.8% ..   -0.0%]
samples    35             35

Diff Detail

Event Timeline

Roger created this revision.Nov 29 2021, 12:29 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 29 2021, 12:29 PM
Herald added a reviewer: Restricted Project. · View Herald Transcript
Roger updated this revision to Diff 394916.Dec 16 2021, 10:08 AM

Deleted unnecessary function.

Roger updated this revision to Diff 394917.Dec 16 2021, 10:11 AM

Remove unnecessary using statement.

Roger updated this revision to Diff 394936.Dec 16 2021, 11:28 AM

Update diff name

Roger updated this revision to Diff 394937.Dec 16 2021, 11:32 AM
Roger retitled this revision from [lld][macho] Remove dumping sections in mapfile. to [lld][macho] Stop grouping symbols by sections in mapfile..
Roger edited the summary of this revision. (Show Details)

Update diff details.

Roger updated this revision to Diff 394941.Dec 16 2021, 11:41 AM

Small fixes.

Roger updated this revision to Diff 394954.Dec 16 2021, 12:14 PM

smal fix.

Roger updated this revision to Diff 395003.Dec 16 2021, 2:50 PM

Compilation fix.

Roger updated this revision to Diff 396048.Dec 23 2021, 10:29 AM

test passes

Roger published this revision for review.Dec 23 2021, 10:31 AM
Roger added a reviewer: int3.
Herald added a project: Restricted Project. · View Herald TranscriptDec 23 2021, 10:31 AM
int3 added inline comments.Dec 25 2021, 2:25 AM
lld/test/MachO/map-file.s
49–51 ↗(On Diff #396048)

hmm why did the ordering change? and what does ld64 output here?

Roger edited the summary of this revision. (Show Details)Dec 30 2021, 12:48 PM
Roger added inline comments.Dec 30 2021, 12:51 PM
lld/test/MachO/map-file.s
49–51 ↗(On Diff #396048)

Since we are no longer grouping symbols by their sections, the order in which they appear will change. I updated the description to include what ld64's mapfile looks like for this example. Aside from the "compact unwind info" that ld64 has that lld does not, the ordering is the same. WDYT?

MaskRay added a subscriber: MaskRay.Jan 4 2022, 2:40 PM

Drive-by: the ELF -Map way has the nice property that it is easy to know the relative section of a symbol. The ld64 way, while there is a separate Symbol part, may provide less information (the information can mostly be inferred from llvm-objdump -t)...

int3 accepted this revision.Jan 5 2022, 1:39 AM

@MaskRay makes sense, but I think it's more important that we can act as a drop-in replacement for ld64. We can certainly add support for a different mapfile format in the future.

lld/test/MachO/map-file.s
49–51 ↗(On Diff #396048)

oh so this makes us more correct then -- excellent!

This revision is now accepted and ready to land.Jan 5 2022, 1:39 AM
Roger added a comment.Jan 6 2022, 11:29 AM

@int3 could you submit this for me? thanks :)

@int3 could you submit this for me? thanks :)

do you still need help landing this?

Roger added a comment.Jan 10 2022, 1:37 PM

@int3 could you submit this for me? thanks :)

do you still need help landing this?

no, @int3 suggested offline that I benchmark this before submitting so we'll submit after that. Thanks for checking :)

int3 added a comment.Jan 11 2022, 6:12 PM

So benchmarking instructions... first, download the zipfile in comment #1 on https://bugs.llvm.org/show_bug.cgi?id=48657 and unpack it into a chromium_framework directory. That will serve as our benchmark. Then install cbdr per the instructions here: https://github.com/asayers/cbdr/blob/master/cbdr.md

Now create two builds of LLD, before and after this change. Say the LLD binary before this change is at ~/before/ld64.lld and the one after the change is at ~/after/ld64.lld. Then run the following command from within chromium_framework:

cbdr sample "base:~/before/ld64.lld @response.txt -map beforemap.txt" "diff:~/after/ld64.lld @response.txt -map aftermap.txt" --timeout=300s | tee results.csv | cbdr analyze -s 95

To reduce noise, close as many other processes as possible before kicking off the benchmark. If your readings are still very noisy, try increasing the timeout value in the benchmarking command to get a more precise confidence interval.

So the command above will measure the change in overall LLD runtime. However, since we are running writeMapFile in parallel with other linker work, we should be able to get a more precise estimate of perf improvement by measuring the change in the time it takes to run that function alone. We can collect & extract that data via the --time-trace flag:

#-- bench.sh
RESULT=$(~/${1}/ld64.lld @response.txt -map ${1}map.txt --time-trace-file=${1}trace.txt && cat ${1}trace.txt | python -m json.tool | grep '\"Write map file' -B1 | grep '[0-9]\+' -o)
echo {\"time\":$RESULT}

I used the script as follows:

cbdr sample --bench ~/bench.sh --timeout=300s -- before after

LMK if you have any issues getting this to work :)

$ cbdr sample "base:/tmp/before/ld64.lld @response.txt -map beforemap.txt" "diff:/tmp/after/ld64.lld @response.txt -map aftermap.txt" --timeout=300s | tee results.csv | cbdr analyze -s 95                                                                                           Warming up base...
Warming up diff...
base           diff           difference (95% CI)
sys_time   1.406 ± 0.020  1.388 ± 0.019  [  -1.9% ..   -0.6%]
user_time  5.557 ± 0.023  5.914 ± 0.020  [  +6.2% ..   +6.6%]
wall_time  4.455 ± 0.041  4.436 ± 0.035  [  -0.8% ..   -0.0%]
samples    35             35
int3 edited the summary of this revision. (Show Details)Jan 20 2022, 11:54 AM
This revision was automatically updated to reflect the committed changes.