- User Since
- Feb 10 2016, 9:51 AM (234 w, 3 d)
Thu, Aug 6
I'm inclined to change the API to pass a second vector of updates instead of the GraphDiff directly (for the PostView), since during updates the GD gets modified and it makes sense to keep it as an internal component. Thoughts?
Wed, Aug 5
Mon, Aug 3
Sat, Aug 1
This is ok to check in to resolve the regression.
I'll go back and check if that invalidate is overly conservative, but don't block in it.
Thu, Jul 30
The additional debug print with "Invalidating all non-preserved analyses for" seems too noisy and not informative.
+1 on rebasing this without the no-lvi-abandon flag on D84981.
Thanks for removing the noise.
Wed, Jul 29
I'm looking into it. If needed this can be reverted as it's not blocking the work for DomTree updates with a CFGView.
Tue, Jul 28
Renamed ChildrenGet to getChildren. The same name only exists in GraphDiff, it's ok to keep a consistent naming.
I'm ok if this is added in either this or a follow-up patch.
I think it's useful to have it in the codebase now, with one test to showcase it (e.g. do another run line in new-pass-manager.ll, with check-prefixes).
Does it make sense to have the option to enable seeing where the adaptors are run?
Some general comments:
- The MemDepAnalysis has been known to be problematic for compile-time, so reducing the 100k implicit threshold seems reasonable.
- It's natural the compiler tracker will be happy, but can we consider runtime implications due to potential missed optimizations?
- Could we evaluate which optimizations are missed, in which passes using MemDepAnalysis along with their run-time impact? (in particular for the benchmarks where we see the large compile-time benefits)
AFAIK, the main passes using MemDepAnalysis are DSE, MemCpyOpt and GVN and there is active work in porting these to MemorySSA. The same analysis of compile-time vs run-time benefits is needed for that switch, so having data from reducing thresholds here will be very valuable for the short term (current patch) on deciding how much to reduce it to, and for the long-term switch to MemorySSA.
Mon, Jul 27
Fri, Jul 24
Great, thank you for confirming :-)
Yes. The difference in perf in instructions and cycles count is due to reversing the update order.
I'm still not seeing meaningful changes in wall-time even with the update order reversed, but reliable wall-time testing is hard.
Thu, Jul 23
The increase in number of instructions and cycles was caused by reversing the order in which the updates are applied by GraphDiff.
I'll look into re-adding some of the original cleanups in a follow-up patch.
Thank you very much for the quick patch! Confirming Halide is fixed too.
IIRC the MIR pipeline is essentially flat (vs the IR one), so at this point it's probably fine to not have the adaptor complexity. However, if this changes to become more than a one time use, then perhaps the Module-Function hierarchy makes sense.
Let me ask this differently, if the CodegenNPM introduced the ModuleToFunctionPassAdaptor now, and it was only used during codegen set-up (assuming negligible overhead for this complexity), is it possible it will be useful long-term in case there are changes in the current restrictions?
Wed, Jul 22
We're also seeing runtime failures in Halide (SIGILL). Either reverting or having a quick forward fix would be very much appreciated.
Tue, Jul 21
Thu, Jul 16
Thanks, this is good.
Wed, Jul 15
lgtm with 2 nits to clarify/fix.
Tue, Jul 14
I'm inclined to favor having the isRequired() method optional with the SFINAE in this patch.
It's up to you two which patch to actually keep, but I'd say update this patch, with renaming to isRequired() (and flipping the bool value), adding the default to true in isRequired() for all adaptors and if, as you mentioned the diff can be simplified, great, but if not, this one makes sense.
FWIW, the patch to abandon in PreserveAnalysis is correct and the right approach IMO.
Mon, Jul 13
Fri, Jul 10
Thank you for the testing. Could you help with with instructions on how to run the tracker myself?
My local testing showed a negligible regression for mafft and a negligible improvement on other benchmarks, so it looked like noise on average.
Jul 9 2020
Nit: re-add consts
Updated to include the part of the patch that's moving the Updates to a CFGDiff object. Splitting off from the clean-up work merging the two branches when BUI is null.
This patch does not exhibit the compile-time regression which caused it to be reverted previously.
Jul 8 2020
LGTM, thank you!
This is an issue I am aware of and the main motivation for the DominatorTree refactoring work. We do need an updated DT here, and currently we're not computing the correct one. This isn't causing issues at the moment because of the scenarios where the updater is being used, but using the current DT in general is not correct.
Jul 7 2020
Thanks you, LGTM.
Jul 6 2020
Jun 29 2020
I think it's great to have a way to specify the exact nesting of managers we want.
Please extend the changes to all tests.
Seems like this needs to be rebased on top of D82290?
Looks like a reasonable cleanup making safety checks reusable.
Please run check-all to check the statistics you're changing are not checked for in unit tests.
LGTM, but please wait to see if nikic@ has additional comments.
Yes, the diff look good. Could you clarify what isSafeToMove does? I don't see it in the included file, only isSafeToMoveBefore. Is there a dependent patch not linked?
Jun 26 2020
The issue with this change is that there is a lot of unnecessary work done now. Before, we could exit early to decide an instruction could never be hoisted/sunk. Now, we're removing all those early checks and only when actually trying to move get the "actually, you can't". I expect this to get a big compile-time penalty.
Could you do some testing and include results in the patch summary?
Jun 25 2020
It's not impossible. It's hard and likely not worth it.
Preserving MemorySSA for this pass means creating MemoryAccesses for all blocks cloned when unrolling. Creating an access and then updating so many times is likely more expensive than recomputing MemorySSA from scratch.
Jun 24 2020
A general comment on the series of patches: it's great to have this means of testing the NPM through the legacy way of specifying passes, but I believe the long term goal is to only rely on specifying via passes=.... Most tests will need to be updated to get there, so having this mass testing available now is very useful!
Just note that when the transition is made to only use passes=... to opt, all additions made solely for matching the legacy -foo-pass arguments will need to be cleaned up (e.g. the API added here in PassBuilder.h)
Jun 23 2020
Thank you for the patch!