This is an archive of the discontinued LLVM Phabricator instance.

[FuncSpec] Track the return values of specializations.
ClosedPublic

Authored by labrinea on Mar 15 2023, 11:08 AM.

Details

Summary

To track the return values of specializations, we need to invalidate all the lattice values across the use-def chain which originates from the callsites, recompute and propagate.

Diff Detail

Event Timeline

labrinea created this revision.Mar 15 2023, 11:08 AM
labrinea requested review of this revision.Mar 15 2023, 11:08 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 15 2023, 11:08 AM
labrinea added a comment.EditedMar 15 2023, 11:17 AM

Alternatively I was thinking to remove the lattice value for the callsites whose called function has just been replaced. However I believe that might be incorrect, as the users of those callsites might be already in a lattice state of less specificity (i.e. overdefined, or wider constant range).

Actually I have a WIP for recursively invalidating the lattice state of the users of such callsites and adding them back to the solver's worklist. Will run some tests and upload soon.

labrinea planned changes to this revision.Mar 16 2023, 11:46 AM
labrinea updated this revision to Diff 506154.Mar 17 2023, 11:44 AM
labrinea edited the summary of this revision. (Show Details)
labrinea added a reviewer: chill.
labrinea added inline comments.
llvm/lib/Transforms/Utils/SCCPSolver.cpp
750

Perhaps we could only run this only when the new lattice is more specific than the old one?

labrinea updated this revision to Diff 507765.Mar 23 2023, 9:32 AM

I found a bug in the previosu revision. When invalidating return instructions we shouldn't be looking for their lattice state in the value map, but in the tracked returns map instead. Then we should invalidate the users of the function which corresponds to that return instruction. I've changed the unit test to verify this scenario.

labrinea updated this revision to Diff 509287.Mar 29 2023, 3:42 AM
labrinea edited the summary of this revision. (Show Details)

The previous revision failed stage2 clang build. The problem was that some invalidated values remained unknown until the end and as a result SCCP would incorrectly zap return values of functions. One solution is to rerun resolveUndefs on the entire module to make sure that the invalidated values will get overdefined. Instead I postpone the invalidation until we've solved the specializations, and therefore we can tell whether they return a constant before we decide to invalidate its users. This (supposedly) guarantees that the entire use-def chain which was invalidated will be revisited but I am not sure.

labrinea added inline comments.Mar 29 2023, 11:15 AM
llvm/lib/Transforms/Utils/SCCPSolver.cpp
750

Release build without assertions emits warning: unused variable here. Should fix.

labrinea updated this revision to Diff 509693.EditedMar 30 2023, 9:10 AM

Changes from last revision:

  • Fixed unused variable warning on release builds without assertions.
  • Erase each value, which is popped from the solver's worklists, from the set of invalidated values.
  • Factored out code from resolvedUndefsIn which applies to an individual instruction, so that we can call it on the remaining invalidated instructions which didn't pop from the solver's worklists.

This guarantees that all the invalidated instructions will become overdefined if they are still unknown after the solver has run.

Testing:

  • Passed stage2 build of clang (with LTO, specialization of constants enabled and max-iters=10).
  • The compile times for the llvm-test-suite have actually gone down too (-0.066% geomean for non-lto, -0.087% geomean for lto)
chill added inline comments.Apr 3 2023, 2:27 AM
llvm/lib/Transforms/Utils/SCCPSolver.cpp
530

It'd be better to not write function recursively, but with an explicit worklist. Both maximum length of the list and the maximum stack depth would be the same (say N), but while the worklist would keep N pointers, the stack would grow to N stack frames, which is easily several times bigger memory usage (resp. cache footprint) and has the potential to overflow in extreme cases.

labrinea updated this revision to Diff 512742.Apr 12 2023, 3:21 AM
labrinea marked an inline comment as done.
chill added inline comments.Apr 19 2023, 6:36 AM
llvm/lib/Transforms/Utils/SCCPSolver.cpp
486

Don't we always call invalidate with CallBase * ? Then the parameter ought to be CallBase *.

493
if (!Invalidated.insert(Inst).second)
  continue;
505

nit: In C++17 one can use

if (auto It = TrackedRetVals.find(F); It != TYrackedRetVals.end()) {
 ...
512

nit: Some (if not all) of these could be turned into {F, I} - less line noise, easier to read.

labrinea updated this revision to Diff 515292.Apr 20 2023, 5:36 AM

Addressed comments and rebased.

labrinea marked 4 inline comments as done.Apr 20 2023, 5:37 AM
chill accepted this revision.Apr 21 2023, 9:28 AM

LGTM, cheers!

This revision is now accepted and ready to land.Apr 21 2023, 9:28 AM
This revision was landed with ongoing or failed builds.Apr 24 2023, 5:24 AM
This revision was automatically updated to reflect the committed changes.