This is an archive of the discontinued LLVM Phabricator instance.

[FuncSpec] Decouple Cost and Benefit analysis, to sort candidates. NFC.
ClosedPublic

Authored by SjoerdMeijer on Dec 9 2021, 11:30 AM.

Details

Summary

This looks like a big change/rewrite, but it is not as bad as it looks as it mostly is the same code that is refactored to decouple the cost and benefit analysis. The biggest change is top-level function specializeFunctions that now drives the transformation likes this:

specializeFunctions() {
  Cost = getSpecializationCost(F);
  calculateGains(F, Cost);
  specializeFunction(F);
}

while this is just a restructuring and decoupling of the cost and benefit analysis, this separation helps the actual functional change in calculateGains. We now sort the candidates based on the expected specialisation gain, which we didn't do before. For this, a book keeping struct ArgInfo was introduced. If we have a list of N candidates, but we only want specialise less than N as set by option -func-specialization-max-constants, we sort the list and discard the candidates that give the least benefit.

Given a formal argument, this change results in selecting the best actual argument(s). In a follow up, I want to go one step further and compare all functions and all arguments. But that will mostly build on top of this refactoring and change, and will be less change; this is enough change for now.

Diff Detail

Event Timeline

SjoerdMeijer created this revision.Dec 9 2021, 11:30 AM
SjoerdMeijer requested review of this revision.Dec 9 2021, 11:30 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 9 2021, 11:30 AM
snehasish added inline comments.Dec 9 2021, 5:44 PM
llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
98

typo "specialization"

396

nit: We could avoid calling getSpecializationBonus if ForceFunctionSpecialization is true.

438

Use a reference to avoid copies here?

467

nit: just break here instead of return?

llvm/test/Transforms/FunctionSpecialization/function-specialization4.ll
41 ↗(On Diff #393230)

It wasn't clear to me what changed in the code that this function is now specialized. Is it possible to split this into a refactor without functionality change and a smaller patch with the behaviour change?

Do we notice the score/code size change in SPEC after this patch?

llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
97

For a class used only in a cpp file, it may be better to wrap it into anonymous namespace to make sure the linkage is internal.

438

We could use llvm::sort here to use range style sort.

443–450

A note: this change is not NFC (I'm fine with this change) @snehasish

452

How about Worklist.size() < ActualConstArg.size()?

465

It might be better to:

Clone->getArg(AI.Arg->getArgNo);
duan.db added a subscriber: duan.db.Dec 9 2021, 6:23 PM

Thanks for the reviews!
I have replied to 2 comments inline while I work on addressing the other comments.

llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
396

Good point, that will help compile-times. We still need to set Gain though, but what I will do is initialise Gain to the maximum value with ~0 if specialisation is forced.

llvm/test/Transforms/FunctionSpecialization/function-specialization4.ll
41 ↗(On Diff #393230)

Okay, good point. I changed the behaviour how MaxConstantsThreshold is interpreted.

Before, when the number of candidates in the worklist exceeded MaxConstantsThreshold, no specialisation was done at all. I have changed that to specialise up to MaxConstantsThreshold candidates, which seems to me what you would expect from MaxConstantsThreshold. After sorting the candidates on the maximum gain, it was this code getting rid of the remaining candidates with the lower gains:

// Truncate the worklist to 'MaxConstantsThreshold' candidates if                               
// necessary.
if (Worklist.size() > MaxConstantsThreshold) {                                                  
  LLVM_DEBUG(dbgs() << "FnSpecialization: number of constants exceed "                          
                    << "the maximum number of constants threshold.\n"                           
                    << "Truncating worklist to " << MaxConstantsThreshold                       
                    << " candidates.\n");
  Worklist.erase(Worklist.begin() + MaxConstantsThreshold,
                 Worklist.end());
}

I included this change in behaviour here, because otherwise the sorting doesn't serve a purpose if we are going to keep all candidates anyway. But given that I am going to follow up on this, I will keep the sorting but move this change in behaviour in a follow up and dependent patch.

To finally answer your question about this test change: there are 2 candidates here, but -func-specialization-max-constants=1. With the old behaviour, this means no specialisation at all like I mentioned before. With the new interpretation, this specialises up to 1 candidate, which is what you see here.

SjoerdMeijer retitled this revision from [FuncSpec] Decouple Cost and Benefit analysis, to sort candidates. to [FuncSpec] Decouple Cost and Benefit analysis, to sort candidates. NFC..

This addresses the other comments. This should now be NFC, so have added this to the Title.
I will double check the SPEC score remains the same, and will follow up with the functional change that was present of the first revision of this patch which caused the change in the test.

In fact, this might not be a NFC change since it would try to sort the candidates.I would love to see the score/code size change after D115509 applied.

BTW, it might be better to add a test case to show the sort matters.

I can confirm that the SPEC score with this patch and also with D115509 remains the same, as expected and should be.

This is NFC as the resulting code should be the same, which is the definition of NFC I use. But yeah, I do agree of course that under the hood things work slightly different, although that will only have an affect in subsequent patches. Anyway, happy to remove NFC from the title.

BTW, it might be better to add a test case to show the sort matters.

So it shouldn't matter here, but it's a good point and I will add an test-case for this to D115509 as the affect will only be visible there.

ChuanqiXu accepted this revision.Dec 15 2021, 2:35 AM

If the score remains the same, it looks good to me. BTW, I think the change of sort matters. Previously, we banned a lot of cases which could have been specialized due to the consideration of the cost for code size and compilation time. But if we could sort them and assume we could have a good enough cost model (the current one might be coarse), then we could relax the restrictions we made previously to make function specialization more useful.
Nit: format codes before committing

(I suggest to commit this with D115509 in one shot, but it doesn't really matter)

This revision is now accepted and ready to land.Dec 15 2021, 2:35 AM
snehasish accepted this revision.Dec 15 2021, 8:51 AM

lgtm

llvm/test/Transforms/FunctionSpecialization/function-specialization4.ll
41 ↗(On Diff #393230)

Thanks for clarifying the change.

This revision was landed with ongoing or failed builds.Dec 16 2021, 3:58 AM
This revision was automatically updated to reflect the committed changes.