This is an archive of the discontinued LLVM Phabricator instance.

[FuncSpec] Cost model improvements.
Abandoned · Public

Authored by labrinea on Mar 6 2023, 6:29 AM.

Details

Summary
  • Separate user bonus to Latency and CodeSize.
  • Do not re-count the bonus of a visited user (from other arguments).
  • Limit the recursion depth when accumulating user bonus.
  • Penalize latency of loop nests the deeper we get in the use-def chain.
  • Penalize specialization on the same argument.
  • Formulate specialization score based on Latency/CodeSize ratio.

Diff Detail

Event Timeline

labrinea created this revision. Mar 6 2023, 6:29 AM
Herald added a project: Restricted Project. Mar 6 2023, 6:29 AM
labrinea requested review of this revision. Mar 6 2023, 6:29 AM

Old cost model (LTO)

Compilation
FnSpecialization: Specialized 1 functions in module test-suite/CTMark/SPASS/terminator.c
FnSpecialization: Specialized 5 functions in module test-suite/CTMark/ClamAV/libclamav_message.c
FnSpecialization: Specialized 1 functions in module test-suite/CTMark/SPASS/rules-inf.c
FnSpecialization: Specialized 1 functions in module test-suite/CTMark/sqlite3/sqlite3.c

Linking
FnSpecialization: Specialized 1 functions in module ld-temp.o (ClamAV)
FnSpecialization: Specialized 1 functions in module ld-temp.o (sqlite3)
FnSpecialization: Specialized 3 functions in module ld-temp.o (lencod)

New cost model (LTO)

Compilation
FnSpecialization: Created 1 specializations in module test-suite/CTMark/ClamAV/libclamav_message.c

Linking
FnSpecialization: Created 2 specializations in module ld-temp.o (ClamAV)
FnSpecialization: Created 2 specializations in module ld-temp.o (sqlite3)
FnSpecialization: Created 3 specializations in module ld-temp.o (lencod)

Instruction Count % delta

ClamAV            +0.381
7zip              -0.012
tramp3d-v4        -0.007
kimwitu++         +0.097
sqlite3           +0.170
mafft             +0.026
lencod            -0.054
SPASS             -0.673
consumer-typeset  +0.054
Bullet            -0.012
geomean           -0.003

(The specializations happening at link time seem to affect the total compilation time a lot more than those happening pre-link.)

labrinea added inline comments. Mar 7 2023, 2:33 AM
llvm/include/llvm/Transforms/IPO/FunctionSpecialization.h
140–141

Oops, forgot to remove it from here

labrinea updated this revision to Diff 504223. Mar 10 2023, 11:16 AM

Changes from last revision:

  • Removed redefinition of SpecMap type from the header file.
  • Removed @param Cost from the description of findSpecializations() in the header file.
  • Decreased the value of MinScore from 100 to 80 as it no longer specialized mcf_r from SPEC2017.
labrinea updated this revision to Diff 504229. Mar 10 2023, 11:38 AM

Adjusted the test output for recursive-penalty.ll. Decreasing MinScore from 100 to 80 translates to 10 instead of 8 specializations being created for the test.

I am wondering if you're trying to do too much in this patch. But I only had a quick look at this, and need to look again.
But my first request is about the cost/bonus calculation. Since this is crucial for the transformation, it would be useful to document the idea/algorithm in one place, with all the formulas and definitions (costs, bonus, latency, code size, etc.). I think this would greatly improve the readability of these changes and of the code in general.

labrinea updated this revision to Diff 505834. Mar 16 2023, 9:14 AM

Added a description of the cost model in the header file.

SjoerdMeijer added inline comments. Mar 17 2023, 2:01 AM
llvm/include/llvm/Transforms/IPO/FunctionSpecialization.h
51

Thanks for adding these descriptions, that is very helpful.

Just trying to understand the initial code size calculation better, the last part:

[number of inline candidates] x [small function size]

why is that not the [number of instructions] for each inline candidate?

SjoerdMeijer added inline comments. Mar 17 2023, 2:23 AM
llvm/include/llvm/Transforms/IPO/FunctionSpecialization.h
88

It's not clear to me what this range exactly is.

90

Nit: perhaps consider a more informative name for this.

92

Nit: abreviation -> abbreviation
and you can drop shorter, or the comment entirely, up to you.

llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
396

Nit: UM is a bit cryptic.

labrinea added inline comments. Mar 17 2023, 4:22 AM
llvm/include/llvm/Transforms/IPO/FunctionSpecialization.h
51

Because this information is not available in CodeMetrics. We don't know who that inline candidate is at this point, and if we did we would have to compute its CodeMetrics. So instead we use the small function size as an approximation. Knowing the exact number of instructions does not matter that much. What matters is to increase the codesize hit if there are inline candidates, meaning the entire function might end up even larger than it is now.

88

This typedef is unrelated to this patch; I just grouped all the typedefs here. Just ignore it.

llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
396

UniqueMap. Was already named as such before this patch, so unrelated. Please ignore on this review.

SjoerdMeijer added inline comments. Mar 17 2023, 5:11 AM
llvm/include/llvm/Transforms/IPO/FunctionSpecialization.h
51

Thanks, if you can add this explanation to the comments that would help.

88

Okay, so it lived somewhere else before and you just moved it here, but can you explain what the range is while we are at it?

llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
396

Naming is difficult, but it would be a shame if we cannot come up with something better while we are at it.

543–544

Bit of duplication here with the comment above.

567–568

We are not calculating one bonus anymore, but two values: latency and code size.

588

I don't quite get why we are subtracting UserSize here. I guess it is something related to double-counting, but after a quick look it wasn't clear to me; perhaps a comment would be helpful here.

llvm/test/Transforms/FunctionSpecialization/recursive-penalty.ll
10

Do we want to check the resulting IR too?

chill added inline comments. Mar 17 2023, 8:09 AM
llvm/include/llvm/Transforms/IPO/FunctionSpecialization.h
88

Isn't that clear from the comment above?

We have a vector of specializations. In this vector the specializations of any one function are adjacent to one another, IOW they "form a contiguous range". For example, if the specializations of foo are at indices 4, 5, 6, 7 in the vector, we keep just 4 and 8 to save space. The range is [4, 8).

A somewhat longer description of the data structures is given in the commit message of https://reviews.llvm.org/rGe6b9fc4c8be00660c5e1f1605a6a5d92475bdba7

SjoerdMeijer added inline comments. Mar 17 2023, 12:40 PM
llvm/include/llvm/Transforms/IPO/FunctionSpecialization.h
88

Isn't that clear from the comment above?

No, sorry, it wasn't for me. This, however, is crystal clear to me:

We have a vector of specializations. In this vector the specializations of any one function are adjacent to one another, IOW "form a contiguous range". For example, if specializations of foo are at indices 4, 5, 6, 7 in the vector, we keep just 4 and 8, to save space. The range is [4, 8).

As it also answers my follow up question, i.e. what is the relevance/rationale of this. I think exactly this should be part of the comment/explanation here.

A somewhat longer description of the data structures is given in the commit message of https://reviews.llvm.org/rGe6b9fc4c8be00660c5e1f1605a6a5d92475bdba7

So I missed that review (I didn't have time to look into it), but I am now dipping back into FuncSpec, which is why I am missing some of the rationale. I think this is an excellent write-up; it, or parts of it, should go into the comment to make this self-contained (it's a bit of a missed opportunity that this explanation lives only in the commit message).

labrinea added inline comments. Mar 20 2023, 8:03 AM
llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
543–544

Good point, I forgot to remove these comments.

567–568

Will update the wording.

588

We start from Cost CodeSize = Metrics.NumInsts + Metrics.NumInlineCandidates * MinFunctionSize; (see the invocation of getSpecializationBonus). Then we start decreasing it by the footprint of each user.

llvm/test/Transforms/FunctionSpecialization/recursive-penalty.ll
10

Not really. All we care about in this test is to show that the number of specializations created is not linear in funcspec-max-iters.

labrinea added inline comments. Mar 23 2023, 10:30 AM
llvm/include/llvm/Transforms/IPO/FunctionSpecialization.h
51

Will do.

88

I disagree with repeating the write-up in a comment as it may become irrelevant at some point. I think git blame should suffice. I find the existing comment already quite extensive/informative.

llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
396

Agreed. I'll update it.

labrinea updated this revision to Diff 507835. Mar 23 2023, 11:33 AM
labrinea marked 7 inline comments as done.
SjoerdMeijer accepted this revision. Mar 27 2023, 1:40 AM

Cheers, LGTM

This revision is now accepted and ready to land. Mar 27 2023, 1:40 AM
labrinea abandoned this revision. May 9 2023, 3:46 AM

I am having second thoughts about this change. I will merge the trivial NFC changes (variable renaming, type aliasing) and will raise new tickets for the remaining parts. So abandoning the cost model improvements for now.