- User Since
- May 5 2018, 9:37 AM (134 w, 4 d)
The problem here is that LoopSink is only relevant for PGO, but MemorySSA gets computed unconditionally in the legacy PM. This problem does not affect the new pass manager, where it's possible to compute MemorySSA only if useful. Supporting this properly in LegacyPM would require something like LazyMemorySSA.
Tue, Dec 1
It looks like this change has also enabled the new pass manager by default.
I don't think this change makes a lot of sense. I expect that in nearly all cases we'll be able to produce a better result by having a separate PoisonValue check beforehand (we'll want to return poison rather than undef or some other value). And if we do that, this change to the isUndefValue() API just makes things more confusing.
Mon, Nov 30
Sun, Nov 29
@asbirlea Is this ready to land in the current form?
This will clearly need something more sophisticated. After thinking this over, I also think that the current approach of inserting assumptions at phis is not optimal, as the tests added in https://github.com/llvm/llvm-project/commit/b5e8de9c7903d458b098a8af03939384270c1a5e show. I expect this can also result in BatchAA returning different results, depending on whether a gep or a phi is visited first. We should be inserting everything as NoAlias into the cache in the first place and tracking whether that assumption gets used or not.
Sat, Nov 28
There seem to be still quite a few places where UndefValue can be replaced with PoisonValue in ConstantFold, do you plan to follow up on that?
Fri, Nov 27
@mstorsjo Thanks for the report & revert.
Thu, Nov 26
Looks like this change has a large negative compile-time impact: https://llvm-compile-time-tracker.com/compare.php?from=2254e014a9019bf17c3f5cb27c1dc40ca0f2ffce&to=14f2ad0e3cc54d5eb254b545a469e8ffdb62b119&stat=instructions 2% on average, 6% on mafft.
Tue, Nov 24
Mon, Nov 23
Unless there's something that makes this issue specific to the call slot optimization, wouldn't it be better to adjust the logic in MDNode::getMostGenericAliasScope() to ensure that any code performing alias scope merging is covered?
The logic here looks sound to me. Structurally, I would suggest integrating this with the getPointerDereferenceableBytes() check above, as you are currently duplicating the logic around it. Basically extract the current getPointerDereferenceableBytes() call into a static function/lambda, and add the handling for the allocated object size in there, the remaining logic (including the non-null check and the alignment check) will be shared.
Sun, Nov 22
Rebase. I've committed the not-really-related change that lead to the compile-time improvement. With that gone, the new numbers are: http://localhost:8000/compare.php?from=6f5ef648a57aed3274118dcbd6467e8ac4f6ed1f&to=50d73a31620111c4298cc336ac4682ef7585ed6b&stat=instructions Which does make this a mild compile-time regression, but not too bad.
Sat, Nov 21
Fri, Nov 20
Rebase over D91649, after which this correctness of this change should be more obvious. At this point in BasicAA, all LocationSizes only allow access after the base pointer.
Accept AATags in MemoryLocation::getAfter/getBeforeOrAfter helpers, which makes them usable in more cases.
Thu, Nov 19
Rename the MemoryLocation helpers as well. With the new LocationSize naming, it makes sense to call these MemoryLocation::getAfter() and MemoryLocation::getBeforeOrAfter().
Rename to LocationSize::afterPointer() and LocationSize::beforeOrAfterPointer().
Rebase, address review.
I'm not sure. It might make more sense to make SimplifyCFG be less aggressive about forming switches in this case. Pretty sure that converting if (a == C || B == C2) into a switch is pretty much a strict regression in optimization power, because a lot more passes support branches than switches.
Wed, Nov 18
Looking over the patch, the compile-time regression is likely not related to MemorySSA, but the fact that you moved the AST construction before the !Preheader->getParent()->hasProfileData() check, and the profitability heuristic. This means that expensive AST construction now happens every time, instead of only when useful.
This change showed up as a significant compile-time regression (before it was reverted): https://llvm-compile-time-tracker.com/compare.php?from=85ccdcaa502ee2c478f2d0ba2b1e217117b69032&to=d4ba28bddc89a14885218b9eaa4fbf6654c2a5bd&stat=instructions
@uabelho Thanks for the report! I have reverted the assert in https://github.com/llvm/llvm-project/commit/85ccdcaa502ee2c478f2d0ba2b1e217117b69032. We'll have to fix this asymmetric in phi-phi aliasing before re-enabling it.
Tue, Nov 17
Can't really comment on whether this makes sense approach wise. For testing, I'd suggest to test that this is correctly handled in bitcode serialization/loading as well, and to drop the current hack in PredicateInfo and see if it works fine for that use case.
The approach here doesn't look right to me. Poison flags have no relation to the position where a certain instruction is executed. What's relevant is where the instruction is used.
I don't get this fix. Adding extra scopes to alias.scope should always be conservatively correct. I think there may be some confusion here due to the malformed noalias metadata you're using.
Strip pointer casts in GlobalsModRef. This *might* be the cause for observed differences, as previously the call from BasicAA would have been performed with casts stripped.
Sun, Nov 15
While I believe this change is correct, there are some incorrect AA uses that should be addressed first. There's one in MemorySSA wrt phi translation over decreasing cycles (https://reviews.llvm.org/D87778#inline-852179), and another in ParallelDSP (see test failure).