- User Since
- Feb 10 2016, 9:51 AM (241 w, 6 d)
Fri, Sep 25
Add to the documentation on NPM passes.
I meant to add about isRequired, but with mentioning the behavior for optnone functions. I didn't mean to document what optnone means.
Could you add a few lines about this in the LLVM documentation?
We've had all the discussion in patches, so it may get unnoticed if it's not documented.
Wed, Sep 23
I think for prune-eh, the NPM inliner was intended to address the same cleanup, but I don't have all the context.
@chandlerc for more details.
Tue, Sep 22
Can you please extend the testing to cover at least some of these cases: multiple predecesssors, multiple exit blocks, multiple predecessors with a coroutine switch?
Mon, Sep 21
Thanks for all these!
I don't think this should be ported to the NPM.
For the NPM, there's -debug-pass-manager, so this would give you the corresponding coverage (with a different CHECK prefix). Some tests already have this.
Optional to add coverage for phi-values-usage.ll and alias-analysis-uses.ll, the others are already covered or are checking for Changed.
I think in general the pass is not trying to do DCE too, but here it looks cheap to clean the dead code left by the conditional.
Please have MSSA updated though.
This looks like a good cleanup. It's possible for DIE to be useful as a "light-weight" DCE, but that could also be achieved by adding a flag to DCE.
Unless there are folks that come forward that they need this for other reasons, I think it's good to clean the codebase.
Fri, Sep 18
The general comment (I mentioned this offline too), is that this is much closer to the direction I'd see the changes going. By this, I mean, the APIs removed here seemed unnecessary, and inlining their functionality at the callsite made more sense.
This seems the right approach for coroutines and outlining. I'm not clear for inlining at this point, Im working to understand the flow better there.
Thu, Sep 17
Wed, Sep 16
Incorrect previous approach. The renaming needs to start at the block with the new def, but it needs to rename all blocks in the IDF, not just those with phis added.
Update to different approach.
I'm in favor of being conservative here and checking for only constant indexes on the geps.
Updated the printing in separate revision. Rebase.
Tue, Sep 15
I checked in a fix in https://reviews.llvm.org/rGfc8200633122, but I haven't yet verified it addresses all the failures reported.
Yes, exactly. The status before this patch was broken anyway, no sense reverting. I'm looking to understand whether there are ways to improve the infrastructure/APIs.
The createNode is only called in addNewFunctionIntoSCC and addNewFunctionIntoRefSCC, where the former is only called in unit tests and the latter is only used by co-routines. I'm trying to understand what the expected changes in the CallGraph are for some of these methods, if this is specific to coroutines or there are other usecases and if we can get better testing.
Is registerOutlinedFunction used out-of tree?
Except the Attributor and coroutines there is no user of the CallGraphUpdater (I know of).
The Attributor does not yet use this function but it is not unreasonable to expect it will be used.
There is a good chance the "shallow wrapper" functionality should actually do so, though that feature is on its own not the most useful anyway.
I'm fine with LLVM-dev, TBH I need to talk about updating the LazyCallGraph soon anyway. I think the current &updateCGAndAnalysisManagerForCGSCCPass in conjunction with the ::run method don't allow you to delete more than one function from the SCC at a time, which is bad. I might simply use it wrong but whatever it is, I'm all for a discussion on how we expose new PM CGSCC updates :) [= how we properly replace/implement something like the CallGraphUpdater]
I'd like to revisit this and open a discussion. The createNode is only called in addNewFunctionIntoSCC and addNewFunctionIntoRefSCC, where the former is only called in unit tests and the latter is only used by co-routines. I'm trying to understand what the expected changes in the CallGraph are for some of these methods, if this is specific to coroutines or there are other usecases and if we can get better testing.
Is registerOutlinedFunction used out-of tree?
I believe the solution is to set the location size to unknown after a phi translation with the same ptr, but I need to properly verify that.
// include/llvm/Analysis/MemorySSA.h:1233 CurrentPair.second = Location.getWithNewSize(LocationSize::unknown()); return
Yes, the load should have the Phi as the defining access. I'm still looking into where this information should come from, but it's not hitting the phi translation.
Thank you for the revert, I'll update with the fix once I have it available.
Mon, Sep 14
I'd suggest reverting if the failures are blocking, but we do need a reproducer so it can be recommitted after a fix is in place.
Thank you for the patch!
I think for clarify this could be split into a patch that updates the current pass to use the CallGraphUpdater, and a patch that only adds the pass for the new pass manager.
Thank you adding this!
Thu, Sep 10
I'm running additional testing in the background. I don't have anything holding this back at this point.
Either check in and if something shows up, I'll let you know to resolve or revert, or wait another 1-2 days for the testing I'm running to complete.
I'm ok with either option, and may even be inclined for the former, because there may be additional parties coming forward if issues arise, and I'd like these to come up earlier rather than later.
Thank you so much Reid for taking the time to look into this, discovering the msvc issue and proposing this work-around!
Wed, Sep 9
I'm running into a crash with this, could you please hold off until tomorrow? I'm working on getting a reproducer.
Could you please prioritize figuring out why the test fails with DSE+MSSA? It's important to have this clarified before the flag flip.
I forgot to add, if you could include the impact on the NPM for compile-time, it would be great.
It may be good to have this in before the DSE+MSSA flag flip to minimize the regression introduced by that patch (assuming there are no meaningful binary changes or run time regressions for the NPM either).
Green light from me as well. I think there's still room for improvement for the NPM, but this is the right step to move forward infrastructure-wise.
Tue, Sep 8
This looks good to me.
Did you investigate a similar change for lib/Passes/PassBuilder.cpp:734->720?
Thu, Sep 3
Diff looks good to me.
Same comment as before about testing with EXPENSIVE_CHECKS to catch potential missed updates.
Thank you for working on this!
Wed, Sep 2
Changes look good.
Please also test with EXPENSIVE_CHECKS. The additional MSSA verifications there can unearth missed changes.
Unless there are complaints from others, I think this is good.
A small note to test with EXPENSIVE_CHECKS as well, as there's extra verification there.
I realized I accidentally reordered passes, but didn't realize the impact of not using phi values in basicaa.
Tue, Sep 1
Thank you for the results! Looking at libclamav_unsp.c, I'm seeing 3.96% in instructions for ThinLTO, which I'm guessing matches the results you see.
The fluctuations may make sense, since with the additional MSSA updates in LoopRotate it can do more or less work, depending on the updates. For instructions per cycle I'm seeing it neutral results (0.5%), and for wall time (average over 20 run), also neutral.
I was working on having MemCpyOpt preserve and use MSSA, in order to replace the use of MemDepAnalysis entirely. But I'm happy with having MSSA preserved first :-).
Rebase, update tests.
The idea to have all passes in sequence preserved and use MemorySSA is great. I'm wondering if changing the pass order affects run time (due to potential missed optimizations).
On the other hand, this only affects the LPM, so it may not be as critical due to the plan to switch to the NPM.
Rebased after the fix to remove DT recomputation in MSSA. I'm still seeing a small regression on one of the tests in the original PRs (0.6% in instructions on mogrify.i), but it's nowhere near the regression that motivated the pipeline split.
I'm not sure if it's worth pushing this forward if the only motivation is re-merging the pipeline in the LPM, unless it's entirely performance neutral.
Rebase at ToT.
Mon, Aug 31
Aug 28 2020
Seems like a good idea. I'm curious if you saw any impact from this change.
Aug 27 2020
lgtm with some nits.
Not exactly the path I had in mind (working on newGVN was), but that's a longer avenue and I'm curious to see the compile-time impact of preserving MSSA :-)
Diff looks reasonable at this point. Thank you for the patch!
Please wait on @nikic for compile-time impact or additional feedback.
A few nits, but lgtm.
Your understanding is correct. I only have one more comment on preserving passes in LICM too.