To track the return values of specializations, we need to invalidate all the lattice values across the use-def chain which originates from the callsites, recompute and propagate.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Alternatively I was thinking to remove the lattice value for the callsites whose called function has just been replaced. However I believe that might be incorrect, as the users of those callsites might be already in a lattice state of less specificity (i.e. overdefined, or wider constant range).
Actually I have a WIP for recursively invalidating the lattice state of the users of such callsites and adding them back to the solver's worklist. Will run some tests and upload soon.
llvm/lib/Transforms/Utils/SCCPSolver.cpp | ||
---|---|---|
707 | Perhaps we could only run this only when the new lattice is more specific than the old one? |
I found a bug in the previosu revision. When invalidating return instructions we shouldn't be looking for their lattice state in the value map, but in the tracked returns map instead. Then we should invalidate the users of the function which corresponds to that return instruction. I've changed the unit test to verify this scenario.
The previous revision failed stage2 clang build. The problem was that some invalidated values remained unknown until the end and as a result SCCP would incorrectly zap return values of functions. One solution is to rerun resolveUndefs on the entire module to make sure that the invalidated values will get overdefined. Instead I postpone the invalidation until we've solved the specializations, and therefore we can tell whether they return a constant before we decide to invalidate its users. This (supposedly) guarantees that the entire use-def chain which was invalidated will be revisited but I am not sure.
llvm/lib/Transforms/Utils/SCCPSolver.cpp | ||
---|---|---|
707 | Release build without assertions emits warning: unused variable here. Should fix. |
Changes from last revision:
- Fixed unused variable warning on release builds without assertions.
- Erase each value, which is popped from the solver's worklists, from the set of invalidated values.
- Factored out code from resolvedUndefsIn which applies to an individual instruction, so that we can call it on the remaining invalidated instructions which didn't pop from the solver's worklists.
This guarantees that all the invalidated instructions will become overdefined if they are still unknown after the solver has run.
Testing:
- Passed stage2 build of clang (with LTO, specialization of constants enabled and max-iters=10).
- The compile times for the llvm-test-suite have actually gone down too (-0.066% geomean for non-lto, -0.087% geomean for lto)
llvm/lib/Transforms/Utils/SCCPSolver.cpp | ||
---|---|---|
547 | It'd be better to not write function recursively, but with an explicit worklist. Both maximum length of the list and the maximum stack depth would be the same (say N), but while the worklist would keep N pointers, the stack would grow to N stack frames, which is easily several times bigger memory usage (resp. cache footprint) and has the potential to overflow in extreme cases. |
llvm/lib/Transforms/Utils/SCCPSolver.cpp | ||
---|---|---|
503 | Don't we always call invalidate with CallBase * ? Then the parameter ought to be CallBase *. | |
510 | if (!Invalidated.insert(Inst).second) continue; | |
522 | nit: In C++17 one can use if (auto It = TrackedRetVals.find(F); It != TYrackedRetVals.end()) { ... | |
529 | nit: Some (if not all) of these could be turned into {F, I} - less line noise, easier to read. |
Don't we always call invalidate with CallBase * ? Then the parameter ought to be CallBase *.