This is an archive of the discontinued LLVM Phabricator instance.

[lld/mac] Make X86_64::getImplicitAddend not do heap allocations
ClosedPublic

Authored by thakis on Dec 6 2020, 10:52 AM.

Details

Summary

Speeds up linking Chromium's base_unittests by almost 10%. According to ministat:

    N           Min           Max        Median           Avg        Stddev
x   5    0.72193289    0.73073196    0.72560811    0.72565799  0.0032265649
+   5    0.64069581    0.67173195    0.65876389    0.65796089   0.011349451
Difference at 95.0% confidence
	-0.0676971 +/- 0.0121682
	-9.32906% +/- 1.67685%
	(Student's t, pooled s = 0.00834328)
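The reported difference can be re-derived from the per-run summary rows above. The following is a minimal sketch, not ministat's own code: the `Sample` and `Comparison` types and the `compare` function are hypothetical names introduced here, and the critical value 2.306 (two-sided 95% Student's t with 5 + 5 - 2 = 8 degrees of freedom) is taken from a standard t-table and passed in as an assumption.

```cpp
#include <cmath>

// Hypothetical summary of one ministat row: run count, mean, standard deviation.
struct Sample {
  int n;
  double mean;
  double stddev;
};

// Hypothetical result type holding the quantities ministat prints.
struct Comparison {
  double diff;      // after.mean - before.mean, in seconds
  double percent;   // diff relative to the "before" mean, in percent
  double pooled;    // pooled standard deviation
  double halfWidth; // half-width of the confidence interval for diff
};

// Two-sample Student's t comparison with a pooled standard deviation.
// tCritical is an assumed table value for the desired confidence level.
Comparison compare(const Sample &before, const Sample &after, double tCritical) {
  double diff = after.mean - before.mean;
  int dof = before.n + after.n - 2;
  double pooledVar = ((before.n - 1) * before.stddev * before.stddev +
                      (after.n - 1) * after.stddev * after.stddev) /
                     dof;
  double pooled = std::sqrt(pooledVar);
  double halfWidth =
      tCritical * pooled * std::sqrt(1.0 / before.n + 1.0 / after.n);
  return {diff, 100.0 * diff / before.mean, pooled, halfWidth};
}
```

Calling `compare({5, 0.72565799, 0.0032265649}, {5, 0.65796089, 0.011349451}, 2.306)` should give values close to the ones in the ministat output: a difference of about -0.0677 seconds, about -9.33%, and a pooled s of about 0.00834.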

Event Timeline

thakis requested review of this revision. Dec 6 2020, 10:52 AM
thakis created this revision.
smeenai accepted this revision. Dec 6 2020, 9:55 PM
smeenai added subscribers: gkm, smeenai.

LGTM, thanks!

CC @gkm there might be something similar in the outstanding arm64 diff.

This revision is now accepted and ready to land. Dec 6 2020, 9:55 PM
int3 accepted this revision. Dec 7 2020, 3:58 AM
int3 added a subscriber: int3.

oh wow... I'd have thought that compilers would be better at optimizing such a common operation

thakis added a comment. Dec 7 2020, 6:03 AM

Thanks!

> oh wow... I'd have thought that compilers would be better at optimizing such a common operation

The compiler would have to inline validateLength() to do that, and it's called a few times and has a large-ish loop at the bottom. A common technique is to manually outline the cold part of a function, and then the compiler can inline the hot part and that might be enough to let the compiler optimize this (...but I haven't checked).
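As a rough illustration of the manual outlining technique described above (not the code from this revision), the sketch below splits a length check into a small hot predicate and a separate cold function that builds the error message. All names here (`lengthIsValid`, `describeBadLength`, `checkLength`, `SKETCH_COLD_NOINLINE`) are hypothetical, and the valid lengths `{2, 3}` are an arbitrary choice for the example. Whether the compiler then inlines and optimizes the hot part is not guaranteed and would need to be checked in the generated code.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>
#include <string>

// Hypothetical helper macro: ask GCC/Clang to keep the cold path out of line.
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_COLD_NOINLINE __attribute__((noinline, cold))
#else
#define SKETCH_COLD_NOINLINE
#endif

// Cold part: only runs for malformed input. It allocates and loops to build
// a diagnostic, so it is deliberately kept out of the hot function's body.
SKETCH_COLD_NOINLINE
static std::string describeBadLength(uint8_t length,
                                     std::initializer_list<uint8_t> valid) {
  std::string msg =
      "invalid length " + std::to_string(length) + "; expected one of:";
  for (uint8_t v : valid) {
    msg += ' ';
    msg += std::to_string(v);
  }
  return msg;
}

// Hot part: a tiny membership test that is a plausible inlining candidate.
static inline bool lengthIsValid(uint8_t length,
                                 std::initializer_list<uint8_t> valid) {
  for (uint8_t v : valid)
    if (v == length)
      return true;
  return false;
}

// Caller: the common case touches only the hot predicate; the cold function
// is reached only on failure.
static bool checkLength(uint8_t length, std::string *error) {
  assert(error != nullptr);
  if (lengthIsValid(length, {2, 3}))
    return true;
  *error = describeBadLength(length, {2, 3});
  return false;
}
```

With this split, `checkLength(2, &err)` returns true without calling the cold function, while `checkLength(4, &err)` returns false and fills `err` with the diagnostic.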

Herald added a project: Restricted Project. Dec 7 2020, 6:24 AM