This makes the new PM not matching the legacy PM. But testing shows that it improves performance because simplification passes especially simplify CFG can affect the accuracy of the debug info.
Diff Detail
- Build Status
Buildable 9256 Build 9256: arc lint + arc unit
Event Timeline
lib/Passes/PassBuilder.cpp | ||
---|---|---|
536–537 | Since you're now adding the only cleanup you need before loading the sample profile, can't you lift this back into the caller to simplify things? (Or does that not simplify things any more?) I'm happy for that to be a separate patch if you want to see what the performance numbers end up looking like first. | |
537–538 | This comment needs updating |
lib/Passes/PassBuilder.cpp | ||
---|---|---|
536–537 | That's a good point. Moved the InstCombine pass to the caller to make the logic clearer. |
Is there a functional test case (i.e., better matching with this change)?
While it does not matter in the longer run, it is probably better to not diverge the sample PGO behavior of the new PM with legacy PM too much, so perhaps make the same change to legacy PM?
Looks like function tests like this is not welcomed in llvm/clang as unittests are supposed to test a single component instead of multiple components. Maybe we can make it into part of the test that we run outside of llvm/clang to validate performance?
While it does not matter in the longer run, it is probably better to not diverge the sample PGO behavior of the new PM with legacy PM too much, so perhaps make the same change to legacy PM?
I just looked at the possibility of changing this in legacy PM, it will be non-trivial to do it because SampleProfileLoader pass is a module pass, which needs to run after AddDiscriminator pass, which is a function pass that runs in populateFunctionPassManager. And the simplification passes also runs in that function. We will need to break populateFunctionPassManager into 2 parts to invoke AddDiscriminator pass first, and then run SampleProfileLoader module pass, and then populateFunctionPassManager. Needs to change a lot across clang and llvm PassManagerBuilder.
The code change LGTM.
To David's point, we do have tests that span basic functionality, so I don't really see a problem with testing this?
The thing we typically try to avoid are tests that span a *frontend* and LLVM proper, but it seems like this could be expressed without that by just having some really minimal IR with debug info that simplify-cfg will destroy, and loading a sample profile into that IR....
That said, if David is OK with it, I'd be happy with the test being in a follow-up commit if it will take a while to craft.
Since you're now adding the only cleanup you need before loading the sample profile, can't you lift this back into the caller to simplify things? (Or does that not simplify things any more?)
I'm happy for that to be a separate patch if you want to see what the performance numbers end up looking like first.