This is an archive of the discontinued LLVM Phabricator instance.

[BasicAA] Handle two unknown sizes for GEPs
ClosedPublic

Authored by nikic on Dec 1 2020, 9:43 AM.

Details

Summary

If we have two unknown sizes and one GEP operand and one non-GEP operand, then we currently simply return MayAlias. The comment says we can't do anything useful ... but we can! We can still check that the underlying objects are different (and do so for the GEP-GEP case).

To reduce the compile-time impact, this a) checks this early, before doing the relatively expensive GEP decomposition that will not be used and b) doesn't do the check if the other operand is a phi or select. In that case, the phi/select will already recurse, so this would just do two slightly different recursive walks that arrive at the same roots.

Compile-time is still a bit of a mixed bag: https://llvm-compile-time-tracker.com/compare.php?from=624af932a808b363a888139beca49f57313d9a3b&to=845356e14adbe651a553ed11318ddb5e79a24bcd&stat=instructions On average this is a small improvement, but sqlite with ThinLTO has a 0.5% regression (lencod has a 1% improvement).

The BasicAA test case checks this by using two memsets with unknown size. However, the more interesting case where this is useful is the LoopVectorize test case, as analysis of accesses in loops tends to always us unknown sizes.

Diff Detail

Event Timeline

nikic created this revision.Dec 1 2020, 9:43 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 1 2020, 9:43 AM
nikic requested review of this revision.Dec 1 2020, 9:43 AM

This kind of logic should always be valid, regardless if V1 is a GEP or not, right? Is there a way to do this check early or late for any query?

nikic added a comment.Dec 1 2020, 12:58 PM

This kind of logic should always be valid, regardless if V1 is a GEP or not, right? Is there a way to do this check early or late for any query?

The general idea is valid, but the way it is applied depends on the operation. For GEPs we just strip to the base pointer. For phis and selects, we recurse over the phi/select operands (with special casing for the case of identical control dependence).

This kind of logic should always be valid, regardless if V1 is a GEP or not, right? Is there a way to do this check early or late for any query?

The general idea is valid, but the way it is applied depends on the operation. For GEPs we just strip to the base pointer. For phis and selects, we recurse over the phi/select operands (with special casing for the case of identical control dependence).

Hm, I'm confused(, but also not deep enough in the BasicAA code to argue). I assumed that this new logic which is put in a function that has a precondition (isa<GEP>(V1)) could be placed in a function without that precondition. The logic does not utilize the fact that V1 isa GEP. Anyway, it was just a thought.

nikic added a comment.Dec 2 2020, 9:49 AM

This kind of logic should always be valid, regardless if V1 is a GEP or not, right? Is there a way to do this check early or late for any query?

The general idea is valid, but the way it is applied depends on the operation. For GEPs we just strip to the base pointer. For phis and selects, we recurse over the phi/select operands (with special casing for the case of identical control dependence).

Hm, I'm confused(, but also not deep enough in the BasicAA code to argue). I assumed that this new logic which is put in a function that has a precondition (isa<GEP>(V1)) could be placed in a function without that precondition. The logic does not utilize the fact that V1 isa GEP. Anyway, it was just a thought.

BasicAA first does a stripPointerCastsAndInvariantGroups(). After that operation we are left with either a GEP, a Phi, a Select, or an underlying object (in the sense of something that cannot be further inspected). The BasicAA code thus handles these three cases specially with recursive queries. The reason this is in the GEP code is that we need to strip away the GEP there to arrive at the underlying object. So while isa<GEP> is not directly used, if the value weren't a GEP, then this code wouldn't do anything useful (as we'd have already arrived at the underlying object).

Hope that makes some sense...

nikic updated this revision to Diff 309347.Dec 3 2020, 1:09 PM
nikic edited the summary of this revision. (Show Details)

Ruin formatting using clang-format :(

jdoerfert accepted this revision.Dec 10 2020, 12:42 PM

LGTM, with your explanation it makes sense to place it there. The logic seemed sound from the start.

This revision is now accepted and ready to land.Dec 10 2020, 12:42 PM
This revision was automatically updated to reflect the committed changes.
fhahn added a subscriber: fhahn.Dec 18 2020, 4:28 AM

With this change, it appears LLVM gets stuck in a loop when building MultiSource/Benchmarks/MiBench/consumer-lame with -O3 & LTO. To reproduce, build MultiSource/Benchmarks/MiBench/consumer-lame/consumer-lame with -O3 & LTO on X86.

I am not sure if there are any public bots that run with that configuration, but it is causing some issues with our internal testing. Given that, I am inclined to revert this change for now, unless there's a quick fix. I'll try to reduce the input IR for the hang.

I reverted the change for now in a74941da716d and filed https://bugs.llvm.org/show_bug.cgi?id=48553 which contains a smallish reproducer to show the increase in compile-time. I am not sure if the full input actually hangs or just takes a very long time to finish.

@fhahn Thank you! For this test case we end up recursing through a deep "gep of phi of gep of phi of ..." chain, where we split up into two branches at each phi, resulting in exponential runtime. Not recursing in this case, while somewhat arbitrary, ends up cutting off the recursion at the first "gep of phi" where we no longer have known sizes.

I think the right way to address this is to introduce a proper recursion limit in BasicAA, though that would take some care to ensure cache consistency.