This is an archive of the discontinued LLVM Phabricator instance.

[FuncSpec][NFC] Refactor finding specialisation opportunities
ClosedPublic

Authored by chill on Oct 14 2022, 9:21 AM.

Details

Summary

This patch reorders the traversal of function call sites and function
formal parameters to:

  • do various argument feasibility checks (isArgumentInteresting ) only once per argument, i.e. doing N-args checks instead of N-calls x N-args checks.
  • do hash table lookups only once per call site, i.e. N-calls lookups/inserts instead of N-call x N-args lookups/inserts.

Diff Detail

Event Timeline

chill created this revision.Oct 14 2022, 9:21 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 14 2022, 9:21 AM
chill requested review of this revision.Oct 14 2022, 9:21 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 14 2022, 9:21 AM
labrinea accepted this revision.Oct 14 2022, 9:44 AM
labrinea added inline comments.
llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
464–465

This confused me a bit and then I remembered it's a MapVector. LGTM!

This revision is now accepted and ready to land.Oct 14 2022, 9:44 AM
ChuanqiXu added inline comments.Oct 16 2022, 7:16 PM
llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
671–673

Why this check missed?

chill added inline comments.Oct 17 2022, 1:35 AM
llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
671–673

Oops, that's accidental. I'll put it back.

chill planned changes to this revision.Oct 17 2022, 4:12 AM

Yeah, nice patch.

Just a few nits inlined when I was reading the patch.

According to my definition of NFC, this is not so NFC as things are done quite differently now. I appreciate the codegen should be the same, which is the definition of NFC that some use I believe.

Just out of curiousity, did you measure compile times improvements with this, which I guess is the reason for doing this?

llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
429

Nit: you can also drop these brackets I think

434

Nit: not sure the coding standard says anything about comparing against zero, but personally it's a bit verbose for me and I prefer !Args.size().

671–673

Looks like we are missing a test case? :)

682

After reducing/changing the comments, the function name and comment are a bit out of sync now for me, because specialisation should not only "possible" (the new comment), but also profitable which is why "interesting" is in the function name. Perhaps restoring some words about the profitability explains the "interesting" part.

chill added a comment.Oct 17 2022, 6:09 AM

Unfortunately, this patch exposes a latent issue in the pass - one call site matching two or more specialisations - as can be seen in the failure of llvm/test/Transforms/FunctionSpecialization/identical-specializations.ll.

llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
671–673

Ack!

682

In fact, the old comment was out of sync - neither the old isArgumentInteresting nor the old getPossibleConstants did any cost/benefit analysis.

chill added inline comments.Oct 18 2022, 9:54 AM
llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
671–673

Testcase added in D136184

chill updated this revision to Diff 468603.Oct 18 2022, 9:58 AM
This revision is now accepted and ready to land.Oct 18 2022, 9:58 AM
chill requested review of this revision.Oct 18 2022, 9:58 AM
chill marked 6 inline comments as done.

According to my definition of NFC, this is not so NFC as things are done quite differently now. I appreciate the codegen should be the same, which is the definition of NFC that some use I believe.

AFAIK, the working notion of NFC in LLVM is "does not change any testcase", which is a bit of a compromise, of course. 😁

I think it can try specialisations of the same cost in a different order, which has the potential of changing the codegen, however, it does not
violate any explicit guarantees we make in the pass.

Just out of curiousity, did you measure compile times improvements with this, which I guess is the reason for doing this?

No, not measured yet, but planning to.

This revision is now accepted and ready to land.Oct 18 2022, 7:17 PM
chill added a comment.Oct 21 2022, 8:11 AM

Just out of curiousity, did you measure compile times improvements with this, which I guess is the reason for doing this?

No, not measured yet, but planning to.

I did some measurements on sqlite3 (with the whole series ending in D136332), with -O3 and -O3 -mllvm -enable-function-specialization).
I got about 0.42% overall regression compared to not running the FunctionSpecialization pass and about 0.45% improvement when the pass is enabled in both compilers.
Comparing just the pass (via clang ... -ftime-report 2>&1 | grep FunctionSpecialization) shows about 2.2% improvement.

This revision was landed with ongoing or failed builds.Oct 26 2022, 2:26 AM
This revision was automatically updated to reflect the committed changes.
chill reopened this revision.Oct 26 2022, 5:57 AM
This revision is now accepted and ready to land.Oct 26 2022, 5:57 AM
chill updated this revision to Diff 470835.Oct 26 2022, 9:21 AM
labrinea requested changes to this revision.Oct 26 2022, 3:35 PM

This revision looks similar to 467796, which must be accidental. Can you update it as it was when it landed? (470754)

This revision now requires changes to proceed.Oct 26 2022, 3:35 PM
chill updated this revision to Diff 471103.Oct 27 2022, 3:07 AM
chill planned changes to this revision.Oct 27 2022, 3:11 AM
chill updated this revision to Diff 471107.Oct 27 2022, 3:32 AM
labrinea accepted this revision.Oct 27 2022, 6:00 AM
This revision is now accepted and ready to land.Oct 27 2022, 6:00 AM
This revision was landed with ongoing or failed builds.Oct 28 2022, 3:27 AM
This revision was automatically updated to reflect the committed changes.