This is an archive of the discontinued LLVM Phabricator instance.

Optimize StringTableBuilder
Needs ReviewPublic

Authored by ruiu on Oct 25 2015, 7:29 PM.
This revision needs review, but all specified reviewers are disabled or inactive.

Details

Reviewers
espindola
Summary

This is a patch to improve StringTableBuilder's performance. That class'
finalize function is very hot particularly in LLD because the function
does tail-merge strings for string tables or SHF_MERGE sections.

Generic std::sort-style sorter is not efficient for sorting strings.
The function implemented in this patch seems to be more efficient.

Here's a benchmark of LLD to link Clang with or without this patch.
The numbers are medians of 50 runs.

-O0
real 0m0.455s
real 0m0.430s (5.5% faster)

-O3
real 0m0.487s
real 0m0.452s (7.2% faster)

Diff Detail

Event Timeline

ruiu updated this revision to Diff 38370.Oct 25 2015, 7:29 PM
ruiu retitled this revision from to Optimize StringTableBuilder.
ruiu updated this object.
ruiu added a reviewer: rafael.
ruiu added a subscriber: llvm-commits.
rafael edited edge metadata.Oct 26 2015, 4:08 AM
rafael added a subscriber: rafael.

-O0
real 0m0.455s
real 0m0.430s (5.5% faster)

-O3
real 0m0.487s
real 0m0.452s (7.2% faster)

Nice!

+static void qsort(StringPair Begin, StringPair End, int Pos) {
+ if (End - Begin < 2)
+ return;
+
+ // Safeguard for pathetic input.
+ if (Pos > 10000)
+ return;

We get here if two strings have a lot of characters in common, right?
This if then avoids running out of stack.

+ Partition items. Items in [Begin, P) are less than the pivot, [P, Q)
+
are the same as the pivot, and [Q, End) are greater than the pivot.

We are actually sorting the other way, no? Things larger than the
pivot go to be start of the array.

+ int Pivot = charTailAt(*Begin, Pos);
+ StringPair P = Begin;
+ StringPair
Q = End;
+ for (StringPair **R = Begin + 1; R < Q;) {
+ int C = charTailAt(*R, Pos);
+ if (C > Pivot)
+ std::swap(*P++, *R++);
+ else if (C < Pivot)
+ std::swap(*--Q, *R);
+ else
+ R++;

}
  • return SizeB - SizeA;

+ qsort(Begin, P, Pos);
+ if (Pivot != -1)
+ qsort(P, Q, Pos + 1);

This is the recursive call that can causes us to run out of stack. If
you write this as

qsort(Begin, P, Pos);
qsort(Q, End, Pos);
if (Pivot != -1)
loop back to stort P, Q, Pos + 1.

We should be able to remove the earlier check for Pos being too large.

Thanks you so much for ding this, it should help MC too!

Cheers,
Rafael

ruiu updated this revision to Diff 38396.Oct 26 2015, 5:44 AM
ruiu edited edge metadata.

Address review comments.

Cool. AFAIK, the only reason for the array_pod_sort() before was "seems good enough for now."

lib/MC/StringTableBuilder.cpp
33

Can we use a different name? Function names that collide with the C standard library are bountiful opportunities for confusion later.

Relatedly, perhaps this helper function, or a generalized version of it, should go into STLExtras.h next to array_pod_sort?

espindola edited reviewers, added: espindola; removed: rafael.Mar 15 2018, 10:57 AM