User Details
- User Since
- Mar 29 2021, 3:03 PM (130 w, 19 h)
Today
Yesterday
edited ld.lld.1
nits
nit
tests
minor
Thu, Sep 21
D159526 performed the rename. If this patch is applied, applyCDSort in lld/ELF will need to be updated again. This is exactly the scenario I want to avoid.
Wed, Sep 20
Tue, Sep 19
Mon, Sep 18
renaming the option
comments
Wed, Sep 13
rebasing to latest
Thanks for looking into this. I made the requested changes.
Mon, Sep 11
addressing comments, adding a test for cdsort
Fri, Sep 8
Thanks. We recently discussed a case where stale matching was unable to match any block in a function, so the function exec count was not set. We thought it would still be beneficial to set the exec count in this case for function reordering. I assume the entry block match will also make stale matching set the function exec count in this scenario?
Thu, Aug 31
Wed, Aug 30
rebase
Tue, Aug 29
rebase
Mon, Aug 28
Aug 26 2023
Aug 18 2023
Since the last change wasn't trivial, I'll wait for another review before landing this.
Aug 17 2023
Addressing comments by reverting the changes in MCPlusBuilder.h
Aug 16 2023
rebase & adjust logging
Jul 31 2023
Sorry for overlooking this; here is a fix: https://reviews.llvm.org/D156734
I am adding an internal test for CDS
Jul 27 2023
Ready for review now
rebased past formatting
clang-format
nit
Since the upstream diffs have landed, this is now ready for review
rebase
nit
Jul 25 2023
Updated the part computing matching metrics to make it more readable. Let me know if there is still confusion.
Jul 24 2023
Thanks for the review, Rahman. I'll wait a few more days before landing in case anyone else has suggestions.
rebase
Jul 20 2023
reverting accidentally attached changes
rebase to latest
Jul 19 2023
Jul 18 2023
another typo
a typo
- wrapping binary stats into a struct;
- printing matching stats based on strong hashes, where we have high confidence in correctness.
Jul 14 2023
Moving stats from BinaryFunction to (aggregated) BinaryContext
addressing comments (mostly adding periods)
Jul 13 2023
Jul 10 2023
comments
Jul 7 2023
Jul 6 2023
nit
Added a test. Without the change, the output is
...
BOLT-INFO: Starting pass: reorder-functions
BOLT-INFO: hot func main (400)
BOLT-INFO: hot func func1 (500)
BOLT-INFO: hot func func2 (1500)
BOLT-INFO: hot func func3 (100)
BOLT-INFO: hot func func4 (99)
BOLT-INFO: hot func func5 (110)
...
Jul 5 2023
Jun 23 2023
Thanks for teaching me how to measure the impact of instruction caches. While re-running the experiments with the new events, I realized that my earlier report was not using C^3 as the baseline. Instead, the numbers were on top of an improved code layout (referred to as hfsort+) utilized by BOLT, which is not relevant here; I apologize for the confusion.
Below are details of the latest run on the same clang benchmarks, with and without huge pages. In addition to comparing the new algorithm to C^3, I'm also including the numbers on top of the "input" ordering that comes from the compiler. Here I'm building the binary with LTO and AutoFDO, but I observe similar numbers when using instrumentation counts or other sampling-based profiling approaches (e.g., CSSPGO).
Jun 21 2023
Hmm. I was using "cpu/event=0x85,umask=0x61/u" for i-TLB misses, which we got from https://download.01.org/perfmon, which has even been moved since then. Back in 2017 (when the algorithm was developed) we thought this was the "right" event to look at, but that might not be the case. Which one would you recommend looking at? I see this page has a good description.
Here are my measurements on the clang binary (release_14), obtained by compiling two large cpp files (benchmark1 and benchmark2). Negative values are improvements; bold ones are statistically significant.
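For readers unfamiliar with the sign convention, here is a minimal sketch of how such a delta could be computed, assuming it is a plain percent change of a counter relative to the baseline. The function name and the sample values are hypothetical and are not taken from the measurements.

```python
def relative_change(baseline: float, candidate: float) -> float:
    """Percent change of candidate relative to baseline.

    A negative result means the candidate recorded fewer events
    (e.g., fewer cache or TLB misses) than the baseline.
    """
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (candidate - baseline) * 100.0 / baseline


# Hypothetical counter values, for illustration only (not measured data).
baseline_misses = 2000.0
candidate_misses = 1900.0
print(f"{relative_change(baseline_misses, candidate_misses):+.2f}%")
```

Deciding whether a delta is statistically significant would need repeated runs and a significance test, which this sketch does not attempt.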
Jun 20 2023
@rahmanl @davidxl Our measurements are always on top of C^3 (the one currently utilized by CallGraphSort.cpp). If we compare against "no-function-ordering" (i.e., looking at the order produced by the compiler), then the wins would be well beyond 1% CPU, but that's not a realistic scenario these days. The linked follow-up diffs, D152840 and D153039, enable the ordering in the linker and in BOLT. I'll share more specific numbers on my benchmarks soon.
Jun 16 2023
Jun 15 2023
Jun 14 2023
Jun 13 2023
Jun 12 2023
Do you have a follow-up diff to adjust BOLT's part?
Jun 9 2023
Jun 8 2023
rebasing & fixing the test & adding debug logging