Jan 25 2021
Jan 22 2021
Restrict replay to default advisor only
Nest original advisors in ReplayInlineAdvisor rather than the other way around.
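A minimal sketch of that nesting, assuming a toy advisor interface: the replay advisor owns the original advisor and delegates to it whenever the replay data has no decision for a call site. All names here (ToyAdvisor, ToySizeAdvisor, ToyReplayAdvisor, the string call-site keys) are hypothetical and are not LLVM's InlineAdvisor API; the fallback rule is one plausible reading of the description, not the patch itself.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for an inline advisor; not LLVM's InlineAdvisor API.
struct ToyAdvisor {
  virtual ~ToyAdvisor() = default;
  // Returns true if the call site identified by Key should be inlined.
  virtual bool shouldInline(const std::string &Key) = 0;
};

// Toy "original" advisor: inlines callees below a fixed size threshold.
// The key -> size table is made up for the example.
class ToySizeAdvisor : public ToyAdvisor {
  std::map<std::string, unsigned> CalleeSize;
  unsigned Threshold;

public:
  ToySizeAdvisor(std::map<std::string, unsigned> Sizes, unsigned Threshold)
      : CalleeSize(std::move(Sizes)), Threshold(Threshold) {}

  bool shouldInline(const std::string &Key) override {
    auto It = CalleeSize.find(Key);
    return It != CalleeSize.end() && It->second < Threshold;
  }
};

// Replay advisor that owns the original advisor. Recorded decisions win;
// anything the replay data does not mention is delegated to the original.
class ToyReplayAdvisor : public ToyAdvisor {
  std::unique_ptr<ToyAdvisor> Original;
  std::map<std::string, bool> Recorded;

public:
  ToyReplayAdvisor(std::unique_ptr<ToyAdvisor> Original,
                   std::map<std::string, bool> Recorded)
      : Original(std::move(Original)), Recorded(std::move(Recorded)) {
    assert(this->Original && "replay advisor needs a fallback advisor");
  }

  bool shouldInline(const std::string &Key) override {
    auto It = Recorded.find(Key);
    if (It != Recorded.end())
      return It->second;              // Follow the replayed decision.
    return Original->shouldInline(Key); // Otherwise fall back.
  }
};
```

With this shape the caller only ever holds one advisor, and whether a decision came from the replay data or from the fallback is an internal detail of the wrapper.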
Jan 21 2021
Wrap ReplayInlineAdvisor into InlineAdvisor so that we can fall back to the original Advisor if we don't want to follow the replay. I think the composition of advisors here makes sense but I'm not sure so I'm very open to different approaches.
Jan 15 2021
Jan 14 2021
Add DEFAULT testing to make sure baseline inlining differs from replay. Fix copy-paste error in the flag description for -cgscc-inline-replay.
Jan 12 2021
Jan 11 2021
Add comment about best effort approach of replay.
In D94333#2491835, @mtrofin wrote: I wouldn't block this patch on that - we can do the refactoring subsequently.
The modifications to DefaultInlineAdvice are an attempt to solve the following problem:
- The SampleProfile inliner emits remarks through the "legacy" interface with emitInlinedInto. For replay, if the advice also generates remarks you get duplicate remarks, but if it doesn't you get a single set (which is what happens today and is correct).
- The CGSCC inliner in the new PM emits remarks through InlineAdvice via recordInlining and nowhere else. For replay, if InlineAdvice generates remarks you're good, but if it doesn't you get no remarks.
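The two cases above can be modeled with a toy advice object that carries a flag controlling whether recording an inline also emits a remark. This is only an illustrative sketch: ToyAdvice, EmitRemarks, RemarkLog, legacyStyleInline and adviceOnlyInline are hypothetical names, and the real emitInlinedInto and recordInlining interfaces are not reproduced here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical remark sink; stands in for an optimization remark emitter.
using RemarkLog = std::vector<std::string>;

// Toy advice object. EmitRemarks is the assumed knob: it controls whether
// recording a successful inline also produces a remark.
struct ToyAdvice {
  std::string CallSite;
  bool EmitRemarks;

  void recordInlining(RemarkLog &Log) const {
    assert(!CallSite.empty() && "advice must name a call site");
    if (EmitRemarks)
      Log.push_back("advice: inlined " + CallSite);
  }
};

// Models a driver that always emits its own remark through a separate
// ("legacy") path, in addition to whatever the advice does.
inline void legacyStyleInline(const ToyAdvice &Advice, RemarkLog &Log) {
  Advice.recordInlining(Log);
  Log.push_back("legacy: inlined " + Advice.CallSite);
}

// Models a driver whose only source of remarks is the advice itself.
inline void adviceOnlyInline(const ToyAdvice &Advice, RemarkLog &Log) {
  Advice.recordInlining(Log);
}
```

In this model the legacy-style driver wants the flag off (one remark instead of two), while the advice-only driver wants it on (one remark instead of none), which is why a per-advice switch is one way to serve both callers.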
Jan 8 2021
Move the ReplayInlineAdvisor.cpp/h and SampleProfile.cpp files to D94333 as they need to be atomic with the remarks format change.
In D94333#2487829, @davidxl wrote: The proposal in the patch is to use line:col.discriminator (which is reasonable), but the implementation uses line:col:discriminator -- please update the patch to be consistent.
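For reference, a small sketch of the line:col.discriminator shape being asked for. The helper name formatCallSiteLoc is hypothetical, and always printing the discriminator (even when it is zero) is an assumption of this example rather than something the review states.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: renders a call-site location as line:col.discriminator.
// Note the '.' (not ':') between the column and the discriminator.
inline std::string formatCallSiteLoc(unsigned Line, unsigned Col,
                                     unsigned Discriminator) {
  return std::to_string(Line) + ":" + std::to_string(Col) + "." +
         std::to_string(Discriminator);
}
```

Using a different separator for the discriminator keeps it visually distinct from the line and column fields, so a reader can tell a two-field location from a three-field one at a glance.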
Split changes with D94334 better: now the ReplayInlineAdvisor update which needs to be atomic with the formatting change is moved over to here. Removed ReplayInlineAdvice and modified DefaultInlineAdvice to provide the desired functionality. Fixed up tests.
Dec 3 2020
Appreciate the quick commit! Guess I gotta be faster with D92592 :D.
Good catch @MaskRay. If I'm understanding correctly, I think the correct fix is to move the AliasScopeNode class that I need into Metadata.h?
Certainly, I appreciate the thorough and quick review! Best of luck with your restrict patches!
Dec 2 2020
Updating new merge test to explicitly look for the correct merged scopes. Thanks @jeroen.dobbelaere!
Dec 1 2020
Go with the intersection of domains, then union the scopes within those domains, as discussed. Update tests to match the latest behavior.
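The merge rule described here can be sketched on a toy model where a scope list is a map from domain name to the set of scope names in that domain. ScopeList and mergeScopes are hypothetical names for illustration; the real change operates on metadata nodes in getMostGenericAliasScope, which this does not reproduce.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>

// Toy model of an alias-scope list: domain name -> scope names in that domain.
using ScopeList = std::map<std::string, std::set<std::string>>;

// Keeps only domains present in both inputs (intersection of domains) and,
// for each surviving domain, merges the scopes from both sides (union).
inline ScopeList mergeScopes(const ScopeList &A, const ScopeList &B) {
  ScopeList Result;
  for (const auto &[Domain, ScopesA] : A) {
    auto It = B.find(Domain);
    if (It == B.end())
      continue; // Domain appears only in A, so it is dropped.
    std::set<std::string> Merged = ScopesA;
    Merged.insert(It->second.begin(), It->second.end());
    Result.emplace(Domain, std::move(Merged));
  }
  return Result;
}
```

For example, merging {d1: {s1}, d2: {s2}} with {d1: {s3}} keeps only d1 and yields {d1: {s1, s3}}.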
Nov 30 2020
Move the fix to getMostGenericAliasScope. Renamed metadata in test case.
Nov 19 2020
@jeroen.dobbelaere I think the correct merging in all cases requires this strategy (or one which checks that all the domains match). If true (@hfinkel?) that'll be a larger change that needs more testing.
Changing to conservative merge of metadata rather than bailing on optimization. Updated test case due to that.
Nov 18 2020
Update unit test with correct set of metadata
In D91576#2404054, @jeroen.dobbelaere wrote: Hi Modi,
Following line in your input example is wrong and explains why the resulting alias info is corrupt:
call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 8 %tmp, i8* align 8 %src, i64 1, i1 false), !alias.scope !1
It should be !alias.scope !0
Thanks for verifying! Good catch, I'll update the patch with this fixed up example.
I've looked more into what alias.scopes are and I think I've got a better handle on what's actually wrong. In the meantime let me try to describe the original problem better. The first section is checking my understanding and making sure we're on the same page. Skip to the second part for the bug details.
My understanding of domains
Let's say we have the following (I slightly modified test/Transforms/Inline/noalias.ll for this):
Nov 16 2020
Add more comments in test case
Oct 12 2020
@MaskRay Ah, the action table start is inferred as the end of the call-site table (CST). That saves a pointer but shackles the implementation. These tables will be fairly small (effectively 1 entry per "catch" in a function) and not all CSTs would need a corresponding action table (only those inside "try" do), so duplicating them is not as far-fetched as it would initially sound.
Oct 9 2020
Remove useless "if"
Oct 1 2020
Glad to see this landing! I'm testing locally with my change to enable this purely for landing blocks.
Sep 25 2020
Sep 17 2020
In D73739#2278496, @rahmanl wrote: Although I can definitely work with you to diagnose any potential issues on ARM64, I suggest we don't write tests for this patch before the base feature is well-tested.
(On a related note, a recent commit (rG157cd93b48a9) limited -fbasic-block-sections to X86.)
Sep 16 2020
@rahmanl Have you had a chance to run your added test on ARM64, as another Itanium C++ ABI target, to make sure it works properly? Having testing there is fine as a follow-up, but I wanted to make sure we're in agreement here. Otherwise the changes LGTM; please let @MaskRay have a chance to provide any additional feedback.
Thanks @asbirlea!
Sep 15 2020
Rebase #2
Sep 14 2020
Remove unnecessary period
Fix merge issues with comment, updated description to include benefit we see from this change. Thanks @asbirlea for the quick review!
Sep 11 2020
Rebase
Sep 10 2020
Sep 9 2020
These changes will also apply to other Itanium C++ ABI targets (arm32/arm64/RISCV etc.) so adding testing for at least another target is good. Trying this out for arm64 hits an ICE here:
case TargetOpcode::EH_LABEL:
  if (MBB.isBeginSection() && MBB.isEHPad() &&
      ((*std::prev(MI.getIterator())).getOpcode() == TargetOpcode::CFI_INSTRUCTION)) {
which is worth following up on.
In D73739#2264518, @MaskRay wrote: Taking a closer look at the test (haven't delved into the code yet): the PIC vs non-PIC difference of .gcc_except_table looks strange.
.gcc_except_table (please also test its section flags "a") has an absolute relocation referencing main.2. This is not good. In all targets (except RISC-V -mno-relax), .gcc_except_table is a relocation-free section.
In D73739#2261537, @snehasish wrote: Once this patch is in we can look into splitting ehpads out though I'm more inclined to enhance the static profile count mechanism to account for ehpads appropriately rather than adding a new flag to MFS.
On this front, is there a specific reviewer that we're waiting for here to get this change through?
Sep 3 2020
Looking at https://bugs.llvm.org/show_bug.cgi?id=36578:
Andrew Kelley 2019-08-11 09:00:59 PDT
I'm no longer subscribed to this bug report. I've come to the conclusion that LLVM's coroutine API is not worth using, and resorting to implementing coroutines directly in the frontend.
Sep 2 2020
I think that, with this alongside the recently committed MachineFunctionSplitter (MFS), the MFS phase is a good place to split out EH landing pads (LPs) in both profile and non-profile code. A small change in MFS will enable EH LP splits in profiling mode, given they should always be cold. In non-profile mode, I think a sub-flag can be added, like -mfs-split-LP-only, where EH LPs are always split out. Enabling general static-analysis splitting could also be interesting, but outside of EH LPs I'm not certain there are guaranteed gains in other scenarios.
Sep 1 2020
I'm rebased against 478eb98cd25cb0ebc01fc2c3889ae94d3f1797d3 and I'm seeing both added tests failing. They may need to be updated.
Aug 28 2020
In D86156#2245103, @nikic wrote: I have no familiarity with BFI, so possibly stupid question: There is already some similar handling as part of BFIImpl here: https://github.com/llvm/llvm-project/blob/0f14b2e6cbb54c84ed3b00b0db521f5ce2d1e3f2/llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h#L1043-L1058 What is the difference to that / why are both needed?
Remove redundant VH callback as @nikic helpfully pointed out!
Aug 27 2020
Condition usage of BFI to PGO in newPM as well
In D86156#2242872, @asbirlea wrote: As a general note, it may make sense to include BFI in the set of loop passes always preserved (getLoopPassPreservedAnalyses()), if its nature is to always be preserved (with some potential info loss) due to the callbacks deleting blocks. But since we're only looking at LICM effect for now, this can be a follow up when/if needed.
Only use BFI when a profile is enabled; have LICM mark BFI as preserved
Aug 26 2020
Remove the need for BFI in LPM2 and set unswitching to preserve lazy BPI/BFI so it can remain in the same loop pass manager as LICM
In D86156#2231710, @nikic wrote: This change adds three PDT calculations to the standard pipeline. Please try to avoid the PDT calculations if PGO is not used, possibly by using LazyBPI.
Change to LazyBFI for legacy pass manager to prevent rebuilding the post-dominator tree
Aug 21 2020
@asbirlea Thanks for taking a look!
Aug 18 2020
Commit my changes (crazy I know) so that the diff is actually updated for linting
Linting