This is an archive of the discontinued LLVM Phabricator instance.

[AssumptionCache] caches @llvm.experimental.guard's
ClosedPublic

Authored by caojoshua on Jan 23 2023, 12:48 AM.

Details

Summary

As discussed in https://github.com/llvm/llvm-project/issues/59901

This change is not NFC. There is one SCEV and EarlyCSE test that have an
improved analysis/optimization case. Rest of the tests are not failing.

I've mostly only added cleanup to SCEV since that is where this issue
started. As a follow up, I believe there is more cleanup opportunity in
SCEV and other affected passes.

There could be cases where there are missed registerAssumption of
guards, but this case is not so bad because there will be no
miscompilation. AssumptionCacheTracker should take care of deleted
guards.

Diff Detail

Event Timeline

caojoshua created this revision.Jan 23 2023, 12:48 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 23 2023, 12:48 AM

An additional though: I have a lot of asserts that checks is assume || is guard. I'd like to add a helper in IntrinsicInst, but there is already a isAssumeLikeIntrinsic(), which is completely unrelated to this change. Maybe we can add isCondGuard()?

caojoshua published this revision for review.Jan 23 2023, 12:54 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 23 2023, 12:54 AM
nikic added a comment.Jan 23 2023, 3:01 AM

An additional though: I have a lot of asserts that checks is assume || is guard. I'd like to add a helper in IntrinsicInst, but there is already a isAssumeLikeIntrinsic(), which is completely unrelated to this change. Maybe we can add isCondGuard()?

I would recommend adding an extra instruction type next to here: https://github.com/llvm/llvm-project/blob/ea6e8d4f5913e89a1a6951f017b6509ed32c6be9/llvm/include/llvm/IR/IntrinsicInst.h#L1518 Given that the term "assume-like" is already taken, this could be AssumeOrGuardInst.

This will allow you to retain the more specific type information where currently AssumeInst is used.

caojoshua edited the summary of this revision. (Show Details)
  • added a CondGuardInst that consists of assumes and guards
  • added to AC's basic.ll test
nikic accepted this revision.Jan 24 2023, 12:42 AM

LGTM

llvm/lib/Analysis/ValueTracking.cpp
620

Drop these asserts, they are implied by cast<> (rather than dyn_cast<>).

llvm/lib/Transforms/Utils/CodeExtractor.cpp
1670

This needs to use CondGuardInst.

This revision is now accepted and ready to land.Jan 24 2023, 12:42 AM
This revision was landed with ongoing or failed builds.Jan 24 2023, 8:17 PM
This revision was automatically updated to reflect the committed changes.

This chance has caused us a huge compile time degradation. @caojoshua I wonder, are you using guards for some reason? And why have you chosen to use it instead of new representation based on https://llvm.org/docs/LangRef.html#llvm-experimental-widenable-condition-intrinsic?

I think that experimental.guard wasn't our best design decision, and we in Azul are trying to completely get rid of them (it's moving slowly with pain, but it's moving). The reason of this is the requirement to duplicate every single branch-based transform for guards intrinsic, which is practically impossible. Unless there are strong reasons to stick with them, I would strongly recommend to use branch-based form that doesn't have this problem.

This chance has caused us a huge compile time degradation.

Oh, that's surprising. I would have expected this to be a compile-time improvement. Do you have any information on where the slowdown is coming from?

I have two guesses:

  • We have a number of places manually handling guards, and these might now end up handling guards both explicitly and via AssumptionCache, doing duplicate work. This would be easy to fix by just deleting the guard code.
  • This automatically supports guards everywhere assumes are supported, so maybe we now have a lot more places handling guards.

@caojoshua I wonder, are you using guards for some reason?

No, I don't have specific use cases for guards not using guards, I just saw inconsistencies between how we handle assumptions and guards, and I thought this would be a nice code cleanup. I did not anticipate an increase in compile time.

I think both of nikic's guesses are entirely reasonable. Having information on which passes has the most slowdowns would be very helpful.

If helpful, I would not mind reverting this change. We can revisit this idea. Or since we would like to move away from guards, we could forget about this change all together.

mkazantsev added a comment.EditedFeb 20 2023, 3:33 AM

I plan to investigate that in a span of next few days, but the regression is REALLY huge. @caojoshua I will do the revert and report back when I know what's going on.

As for the intrinsic, my strong preference would be to deprecate it and transit to branch form in near future, but we'll see how it goes.

anna added a subscriber: anna.Feb 24 2023, 5:22 AM

Also, this caused a miscompile when assumption cache recorded guards. Specifically, in LazyValueAnalysis, when it was used by correlated-value-propagation. Took a while to root-cause to this change, since it was a miscompile. I will add a testcase in LVI/CVP to catch this in the future.

Specifically, LVI makes a distinction between guards and assumes (in intersectAssumeOrGuardBlockValueConstantRange), which was not held by this patch.

anna added a comment.EditedFeb 24 2023, 8:31 AM

If this change goes forward, we need to investigate all places where assumes/AssumptionCache is used and see if what is being done for assumes holds for guards as well. In LVI, what was done for assumes did not hold for guards.
Also, in future, when anyone adds assumption cache (under the idea that it handles assumes), we can very easily fall into the problem of having a miscompile with guards. For that reason and given that we are working towards moving away from this implementation, I would suggest we abandon this change altogether.

For the details, here is the code where we miscompile in LVI with this change (It can be fixed by checking that we're indeed intersecting with an assume from the AC) :

void LazyValueInfoImpl::intersectAssumeOrGuardBlockValueConstantRange(
          Value *Val, ValueLatticeElement &BBLV, Instruction *BBI) {
...
BasicBlock *BB = BBI->getParent();
    for (auto &AssumeVH : AC->assumptionsFor(Val)) {
      if (!AssumeVH)
        continue;
  
      // Only check assumes in the block of the context instruction. Other
      // assumes will have already been taken into account when the value was
      // propagated from predecessor blocks.
      auto *I = cast<CallInst>(AssumeVH);
      if (I->getParent() != BB || !isValidAssumeForContext(I, BBI))
        continue;
  
      BBLV = intersect(BBLV, getValueFromCondition(Val, I->getArgOperand(0))); <-- we were intersecting with a guard here (!) without checking where the guard belonged 
    }
  
    // If guards are not used in the module, don't spend time looking for them
    if (GuardDecl && !GuardDecl->use_empty() &&
        BBI->getIterator() != BB->begin()) {
      for (Instruction &I : make_range(std::next(BBI->getIterator().getReverse()),
                                       BB->rend())) {
        Value *Cond = nullptr;
        if (match(&I, m_Intrinsic<Intrinsic::experimental_guard>(m_Value(Cond))))
          BBLV = intersect(BBLV, getValueFromCondition(Val, Cond));
      }
    }
  
..
}

Note that with guards we intersect LVI by reverse traversing from the context instruction upwards till start of basic block. The assumption intersection above it does not do that - it just takes all assumes in the block. isValidAssumeForContext allows assumes which are after the context instruction as long as we will reach that assume (this would be incorrect in the case of guards). With this patch, we were intersecting the constant range with a guard's argument which was after Val and incorrectly folding a compare in correlated-value-propagation.

nikic added a comment.Feb 24 2023, 9:11 AM

Note that with guards we intersect LVI by reverse traversing from the context instruction upwards till start of basic block. The assumption intersection above it does not do that - it just takes all assumes in the block. isValidAssumeForContext allows assumes which are after the context instruction as long as we will reach that assume (this would be incorrect in the case of guards).

Hm, would you mind explaining in more detail why this is not legal for guards (maybe with an IR sample)? It's not completely clear to me what the distinction is.

But in any case, we could easily accommodate this by adjusting isValidAssumeForContext() -- it has access to the call, so it could skip that particular logic for guards.

anna added a comment.Feb 24 2023, 1:10 PM

Note that with guards we intersect LVI by reverse traversing from the context instruction upwards till start of basic block. The assumption intersection above it does not do that - it just takes all assumes in the block. isValidAssumeForContext allows assumes which are after the context instruction as long as we will reach that assume (this would be incorrect in the case of guards).

Hm, would you mind explaining in more detail why this is not legal for guards (maybe with an IR sample)? It's not completely clear to me what the distinction is.

Sure, consider this IR:

%local_48_ = phi i32 [ %172, %bci_664 ], [ 4, %bci_381 ]
...

bb:
 %145 = shl nsw i32 %local_48_, 16, !dbg !96
  %val = add i32 %145, 393216, !dbg !96
  %146 = ashr exact i32 %val, 16, !dbg !96
  %147 = icmp ult i32 %val, 32768000, !dbg !96
  call void (i1, ...) @llvm.experimental.guard(i1 %147, i32 13)

If we had an assume instead of a guard there, we can use the argument of the assume to make the constant range more precise. This works only when we are guaranteed to transfer execution from %val until the assume (which is part of what isValidAssumeForContext checks).

However, with the guard, the icmp ult i32 %val, 32768000 condition is not guaranteed to be true because the guard can fail. By considering the guard's range for the val, we unfortunately used circular logic and folded away the guard based on it (the CR for %val was incorrectly calculated as constantrange<0, 32768000>).
This is why when we try to make the constant range more precise with guards, we do backward traversal from the val until start of BB to find relevant guards. This is correct because by relying on guards *before* val, we know that we can use the guard's condition because we can reach val only if the guard passes.

But in any case, we could easily accommodate this by adjusting isValidAssumeForContext() -- it has access to the call, so it could skip that particular logic for guards.

Yes, that would work and handle all cases where the API of Assumption Cache is used.

If this change goes forward, we need to investigate all places where assumes/AssumptionCache is used and see if what is being done for assumes holds for guards as well. In LVI, what was done for assumes did not hold for guards.
Also, in future, when anyone adds assumption cache (under the idea that it handles assumes), we can very easily fall into the problem of having a miscompile with guards. For that reason and given that we are working towards moving away from this implementation, I would suggest we abandon this
change altogether.

I agree with Anna here. Although it should be possible to make this change work with isValidAssumeForContext(), it will be a lot of work and testing a bunch of passes with little benefit. And we hope to move away from guards anyway.

nikic added a comment.Feb 25 2023, 3:19 PM

@anna Thanks for the example, I understand it now. We can only use the backwards reasoning with assume because assume violation is undefined behavior, while for guards we just exit this control flow path.

@caojoshua I think this depends a lot on when we can expect support for guard intrinsics to be removed. The LangRef note that this mechanism is obsolete has been there for more than 3 years, and as far as I can tell, there hasn't been any (upstream) movement to remove guard intrinsics in that time. If removing these intrinsics is just a vague plan for the future (rather than something people are actively working on right now), then I think it still makes sense to reapply a fixed version of this patch.

It sounds like a big part of the motivation for removing them is that they require special handling everywhere -- but with this patch, this is not really the case anymore, because they get handled as a side-effect of existing assume handling.

Regarding compile time: I made some manual checks and don't see any CT impact from this patch. Our automated triage might have been wrong, I don't trust this anymore. But the miscompile concern still holds.

My advice would still be to migrate to widenable conditions and eventually throw away experimental.guards. It has always been a bad idea.