This is an archive of the discontinued LLVM Phabricator instance.

[ValueTracking] Handle two PHIs in isKnownNonEqual()
ClosedPublic

Authored by jaykang10 on Mar 11 2021, 6:27 AM.

Details

Summary

BasicAA fails to return NoAlias from aliasGEP with PHI index. Let's see code snippet.

typedef struct data
{
  int oldVal;
  int newVal;
} data_t;
  

void alias_issue(data_t *data, int max) {
  int pos = 1;
  int cmp = 3;

  while(cmp < max) {
    data[pos].newVal = data[cmp].newVal;
    data[pos].oldVal = data[cmp].newVal;
    pos = cmp;
    cmp *= 2;
  }

  return;
}

From above code, we can imagine that we need to load data[cmp].newVal one time. Let's see LLVM IR output from above code.

while.cond:                                       ; preds = %while.body, %entry
  %pos.0 = phi i32 [ 1, %entry ], [ %cmp.0, %while.body ]
  %cmp.0 = phi i32 [ 3, %entry ], [ %mul, %while.body ]
  %cmp1 = icmp slt i32 %cmp.0, %max
  br i1 %cmp1, label %while.body, label %while.end

while.body:                                       ; preds = %while.cond
  %sub = add nsw i32 %cmp.0, -1
  %idxprom = sext i32 %sub to i64 
  %newVal = getelementptr inbounds %struct.data, %struct.data* %data, i64 %idxprom, i32 1
  
  %0 = load i32, i32* %newVal, align 4, !tbaa !6   ==> data[cmp].newVal

  %sub2 = add nsw i32 %pos.0, -1
  %idxprom3 = sext i32 %sub2 to i64 
  %arrayidx4 = getelementptr inbounds %struct.data, %struct.data* %data, i64 %idxprom3
  %newVal5 = getelementptr inbounds %struct.data, %struct.data* %data, i64 %idxprom3, i32 1
  
  store i32 %0, i32* %newVal5, align 4, !tbaa !6  ==> data[pos].newVal = data[cmp].newVal;

  %1 = load i32, i32* %newVal, align 4, !tbaa !6  ==> data[cmp].newVal

  %oldVal = getelementptr inbounds %struct.data, %struct.data* %arrayidx4, i32 0, i32 0
  store i32 %1, i32* %oldVal, align 4, !tbaa !11 
  %mul = mul nsw i32 %cmp.0, 2
  br label %while.cond

As you can see, there are %0 and %1 which are same load instruction for data[cmp].newVal. Optimization passes like instcombine, GVN check the the address of store i32 %0, i32* %newVal5, align 4, !tbaa !6 has alias with %1 load. BasicAA returns MayAlias for it. If GEP has index with PHI, it look aliasGEP fails to detect NoAlias. Let's see the situation.

loop:
  %pos.0 = phi i32 [ 1, %entry ], [ %cmp.0, %loop ]
  %cmp.0 = phi i32 [ 3, %entry ], [ %mul, %loop ]
  ...
  %sub = add nsw i32 %cmp.0, -1
  %idxprom = sext i32 %sub to i64
  %newVal = getelementptr inbounds %struct.data, %struct.data* %data, i64 %idxprom, i32 1
  ...
  %sub2 = add nsw i32 %pos.0, -1
  %idxprom3 = sext i32 %sub2 to i64
  %newVal5 = getelementptr inbounds %struct.data, %struct.data* %data, i64 %idxprom3, i32 1
  ...
  %mul = mul nsw i32 %cmp.0, 2
  br label %loop

On above example, %pos.0 uses previous iteration's %cmp.0 with backedge according to PHI's instruction's definition. In this case, I think we can say %pos.0 is not same with %cmp.0. This patch is to detect above pattern.

Diff Detail

Event Timeline

jaykang10 created this revision.Mar 11 2021, 6:27 AM
jaykang10 requested review of this revision.Mar 11 2021, 6:27 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 11 2021, 6:27 AM
jaykang10 edited reviewers, added: efriedma, sanwou01; removed: eli.friedman.Mar 11 2021, 6:41 AM

I am so sorry. I made a mistake on description. I will update it.

jaykang10 edited the summary of this revision. (Show Details)Mar 11 2021, 9:59 AM
jaykang10 edited the summary of this revision. (Show Details)
jaykang10 edited the summary of this revision. (Show Details)Mar 11 2021, 10:02 AM
jaykang10 edited the summary of this revision. (Show Details)Mar 11 2021, 11:36 PM
nikic edited reviewers, added: nikic, reames; removed: SjoerdMeijer, sanwou01.Mar 12 2021, 1:00 PM
nikic added a subscriber: nikic.

Is it possible to phrase this as an enhancement to isKnownNonEqual() of two phi nodes, rather than adding additional logic in BasicAA?

If I'm reading the code right, this looks potentially expensive for compile time. You're recursing on each phi operand individually. Can we narrow the scope any without loosing your motivating example?

I agree that this seems to make more sense in isKnownNonEqual than basic aa. In that context, it would naturally follow to handle two phis with N-1 trivially different values (e.g. constants), and recurse through at most one input pair. Would that handle your example?

Is it possible to phrase this as an enhancement to isKnownNonEqual() of two phi nodes, rather than adding additional logic in BasicAA?

@nikic It is good point! I agree with you. It would be more general solution. Let me try and check it next week.

If I'm reading the code right, this looks potentially expensive for compile time. You're recursing on each phi operand individually. Can we narrow the scope any without loosing your motivating example?

I agree that this seems to make more sense in isKnownNonEqual than basic aa. In that context, it would naturally follow to handle two phis with N-1 trivially different values (e.g. constants), and recurse through at most one input pair. Would that handle your example?

@reames In order to check the recursive operand of PHI node, I thought it is enough to check backedge from PHI node's incoming block to the block with PHI node using dominance relation. I could be mis-understanding what you suggest . If I missed something, please let me know.

jaykang10 updated this revision to Diff 330613.Mar 15 2021, 4:52 AM
jaykang10 retitled this revision from [Alias] Add a ah-hoc pattern for aliasGEP to [Alias] Add a ah-hoc pattern with two PHI for isKnownNonEqual.

Moved the change from aliasGEP to isKnownNonEqual.

@nikic @reames After moving this change to isKnownNonEqual, it still handles the example case with existing patterns in aliasGEP.

If I'm reading the code right, this looks potentially expensive for compile time. You're recursing on each phi operand individually. Can we narrow the scope any without loosing your motivating example?

I agree that this seems to make more sense in isKnownNonEqual than basic aa. In that context, it would naturally follow to handle two phis with N-1 trivially different values (e.g. constants), and recurse through at most one input pair. Would that handle your example?

@reames In order to check the recursive operand of PHI node, I thought it is enough to check backedge from PHI node's incoming block to the block with PHI node using dominance relation. I could be mis-understanding what you suggest . If I missed something, please let me know.

@reames I guess you meant to skip isKnownNonEqual for each operand of PHI. um... I think we need to check all operands of two PHIs are not same... I agree It could increase compile time... um...

Some nits, didn't follow logic fully.

llvm/lib/Analysis/ValueTracking.cpp
2662

I think this whole new block should be split up into it's own helper function just like isAddOfNonZero().

2663

Add a FIXME that this is missing a generalization to handle the case where one is a PHI and another one isn't?

2663–2679

This is only ever true if the PHI nodes are in the same basic block.

2683–2686

This is potentially needlessly costly, because for switches,
each case that branches to the block, will have it's own entry in PHI node.

jaykang10 added inline comments.Mar 15 2021, 5:44 AM
llvm/lib/Analysis/ValueTracking.cpp
2662

Yep, I will move it to new helper function.

2663

Yep, I will add it.

2663–2679

Right! I will update it.

2683–2686

Yep, I agree with you. I will update it.

jaykang10 updated this revision to Diff 330649.Mar 15 2021, 7:14 AM

Following @lebedev.ri comments, moved the change to a new helper function and updated redundant code.

jaykang10 marked 4 inline comments as done.Mar 15 2021, 7:14 AM

Following @lebedev.ri comments, moved the change to a new helper function and updated redundant code.

Did you upload the right diff? I think not all inline comments were addressed.

llvm/lib/Analysis/ValueTracking.cpp
2570

Now that it is in a separate function, early return can be used.

Following @lebedev.ri comments, moved the change to a new helper function and updated redundant code.

Did you upload the right diff? I think not all inline comments were addressed.

Ah, sorry... I am not sure which one I missed. Can you let me know that please?

llvm/lib/Analysis/ValueTracking.cpp
2570

Yep, I will update it.

nikic added inline comments.Mar 15 2021, 10:55 AM
llvm/lib/Analysis/ValueTracking.cpp
2525

If the parents are the same, then you don't need to check the incoming blocks, they must be the same.

You also don't need to check PN1 and PN2 for null, they are expected to be non-null in this function.

Can you please add tests for this functionality in llvm/test/Analysis/ValueTracking/known-non-equal.ll? These are simpler than going through BasicAA.

llvm/lib/Analysis/ValueTracking.cpp
2552

I don't understand why this code is needed. For your example you are comparing %cmp.0 and %mul = mul nsw i32 %cmp.0, 2, which should already be recognized as non-equal. What is this backedge check for?

2557

%inc is not used in the phi nodes, so it's not clear what the relation is. Maybe you meant to replace %mul with %inc?

jaykang10 added inline comments.Mar 15 2021, 12:06 PM
llvm/lib/Analysis/ValueTracking.cpp
2525

You are right! I will update it.

2552

There is a slightly different case as below.

// loop_header:
//   %cmp.0 = phi i32 [ 3, %entry ], [ %inc, %loop_body ]
//   %pos.0 = phi i32 [ 1, %entry ], [ %cmp.0, %loop_body ]
//   ...
// loop_body:
//   ...
//   %inc = add i32 %cmp.0, 1

If the %cmp.0 is ahead of %pos.0, it is not easy to say that %cmp.0 is not same with %pos.0. Following PHI instruction's definition which @dmgreen mentioned, the incoming value comes from incoming block. In this case, if there is backedge from incoming block, I think we can say %pos.0 has previous iteration's %inc value and %com.0 is not same with %pos.0. That was the idea for backedge. I am sorry for making you confused. If you feel something wrong from it, please let me know... I could be wrong...

2557

Sorry for typo... %inc is correct.

nikic added inline comments.Mar 15 2021, 2:01 PM
llvm/lib/Analysis/ValueTracking.cpp
2552

I don't think this reasoning is generally valid, at least not without additional checks. Consider this counter-example:

%p1 = phi(1, %p1.next)
%p2 = phi(3, %p1)
%p1.next = or %p1, 2

On the first iteration %p1 = 1, %p2 = 3, on the second %p1 = 3, %p2 = 1, on the third %p1 = 3, %p2 = 3, so the phis are not known non-equal.

jaykang10 added inline comments.Mar 15 2021, 2:44 PM
llvm/lib/Analysis/ValueTracking.cpp
2552

Yep, that's the reason why I added below FIXME comments. I have wanted to get the idea to fix it and that is the main thing for this patch.

if (PN1 == IV2 || PN2 == IV1) {
  // FIXME: Do we have to check the IdxVal is not constant in detail?
  continue;
}

I think I need to explain the whole situation. If you compile summary's c code, the two PHI's order is first %pos.0 and second %cmp.0. However, the on loop rotation pass, SSAUpdater adds PHIs to newHeader using BB->front() and the order of PHIs is changed. First time, I thought I need to fix the SSAUpdater but @dmgreen pointed out the order of PHIs is not invalid from PHI's definition on LLVM IR ref. I have checked the PHIElimination pass removes the PHI correctly following the LLVM IR ref. It means that we can meet any order of PHIs on LLVM IR code. If we can not detect the PHIs are not same, we can return NoAlias from something like above c example and it blocks optimizations... I thought something like SCEV to resolve it but it could be too expensive for compile time.... The main purpose of this patch was to get the idea to resolve this issue... I am sorry for unclear patch and explanation. If you have idea about this issue, please let me know.

jaykang10 added inline comments.Mar 15 2021, 3:12 PM
llvm/lib/Analysis/ValueTracking.cpp
2552

Oops! Sorry for Typo...
we can return NoAlias ==> we can not return NoAlias.

jaykang10 updated this revision to Diff 330909.Mar 16 2021, 2:29 AM

Updated code following comments of @nikic

I think we could check the IV of loop roughly as below.

 if (Q.DT != nullptr && Q.DT->dominates(PN1->getParent(), Pred)) {
   const SCEV *IVSCEV = nullptr;
   if (PN1 == IV2) 
     IVSCEV = SE.getSCEV(IV2);
   else if (PN2 == IV1) {
     IVSCEV = SE.getSCEV(IV1);

   if (isa<SCEVAddRecExpr>(IVSCEV))
     continue;
}

In this case, we could say the IV is not same among iterations... but it needs ScalarEvolution... @nikic @reames How do you think about it? I wonder whether it is acceptable to add ScalarEvolution to BasicAA and ValueTracking or not...

I think we could check the IV of loop roughly as below.

 if (Q.DT != nullptr && Q.DT->dominates(PN1->getParent(), Pred)) {
   const SCEV *IVSCEV = nullptr;
   if (PN1 == IV2) 
     IVSCEV = SE.getSCEV(IV2);
   else if (PN2 == IV1) {
     IVSCEV = SE.getSCEV(IV1);

   if (isa<SCEVAddRecExpr>(IVSCEV))
     continue;
}

In this case, we could say the IV is not same among iterations... but it needs ScalarEvolution... @nikic @reames How do you think about it? I wonder whether it is acceptable to add ScalarEvolution to BasicAA and ValueTracking or not...

Any comments for SCEV in BasicAA?

nikic added a comment.Mar 18 2021, 9:54 AM

@jaykang10 Using SCEV in BasicAA is not possible (SCEVAA uses SCEV).

@jaykang10 Using SCEV in BasicAA is not possible (SCEVAA uses SCEV).

Thanks for letting me know. @nikic

At this moment, the SCEVAA fails to return NoAlias from the example... If I want to create a ad-hoc pattern with SCEV in AA, I need to add the pattern to SCEVAA and I have to add the SCEVAA to pass manager's AA pipeline... um... I am not sure it is good way... As far as I know, only loop passes run the SCEVAA and other passes do not run the SCEVAA, do they?

I don't understand what the problem here is?

Sure, it's not great that we need to write yet another bicycle,
but it seems to me that the "only" change isKnownNonEqual() needs to support the motivational case
is to implement a variation of isAddOfNonZero() but for multiplications.

nikic added a comment.Mar 18 2021, 1:31 PM

I think the main problem here is that the review tries doing too much. Yes, the BasicAA test case can be handled as a combination of simple phi handling plus some additional mul handling. But from the code comments and discussion the intention is to also handle some kind of additional case involving add recurrences.

It would be great if this patch can be cut down to just basic phi-phi handling, with dedicated ValueTracking tests -- even if that does not cover the motivational case in isolation.

As @reames has mentioned before, the code should also only recurse into one pair of operands, and require trivially non-equal constants for the rest. At least we have been following this rule in the rest of the isKnownNonEqual code.

I don't understand what the problem here is?

Sure, it's not great that we need to write yet another bicycle,
but it seems to me that the "only" change isKnownNonEqual() needs to support the motivational case
is to implement a variation of isAddOfNonZero() but for multiplications.

@lebedev.ri As you suggested, simply, we could add lots of case by case code with checking AddOfNonZero or MulofNonZeroOrOne or something like that whenever we find different example. However, if we consider more general solution, we could check whether the IV's expression is linear or not. For the general solution, I am asking the SCEV or something like that. If it is not possible to find the general solution, I could try what you suggest.

I think the main problem here is that the review tries doing too much. Yes, the BasicAA test case can be handled as a combination of simple phi handling plus some additional mul handling. But from the code comments and discussion the intention is to also handle some kind of additional case involving add recurrences.

It would be great if this patch can be cut down to just basic phi-phi handling, with dedicated ValueTracking tests -- even if that does not cover the motivational case in isolation.

As @reames has mentioned before, the code should also only recurse into one pair of operands, and require trivially non-equal constants for the rest. At least we have been following this rule in the rest of the isKnownNonEqual code.

Ok, I agree with your opinion. Let me cut down this patch to the basic case. Thanks for suggestion @nikic @lebedev.ri

jaykang10 updated this revision to Diff 331860.Mar 19 2021, 7:04 AM

Cut down this patch to basic case.

@nikic @lebedev.ri I have updated the patch for basic cases. If you feel something weird or inefficient, please let me know.

jaykang10 updated this revision to Diff 331890.Mar 19 2021, 8:30 AM

Fixed a bug

Fixed a bug.

  • During optimization, there are unreachable blocks with PHI and they have no predecessors. Do not handle it.
jaykang10 marked 5 inline comments as done.Mar 19 2021, 11:36 AM

This is still trying to handle the phi and the add/mul together. Both of those should be handled separately. I've applied https://github.com/llvm/llvm-project/commit/d11d5d1c5f5a8bafc28be98f43c15a3452abb98b to add the necessary handling for multiplies, and I think with that this should work if you just do the recursive isKnownNonEqual checks without any additional handling.

This is still trying to handle the phi and the add/mul together. Both of those should be handled separately. I've applied https://github.com/llvm/llvm-project/commit/d11d5d1c5f5a8bafc28be98f43c15a3452abb98b to add the necessary handling for multiplies, and I think with that this should work if you just do the recursive isKnownNonEqual checks without any additional handling.

@nikic Thank you very much for the patch. It is good. Can I add slightly something more to your patch please? The isKnownNonZero fails to detect the recurrence with mul. I have created a review for it. https://reviews.llvm.org/D99069 Can you review it is ok or not please?

jaykang10 updated this revision to Diff 332245.Mar 22 2021, 4:11 AM

Following patch and comment of @nikic, use isKnownNonEqual instead of checking add/mul.

jaykang10 updated this revision to Diff 332821.Mar 23 2021, 4:32 PM
nikic added inline comments.Mar 24 2021, 12:51 PM
llvm/lib/Analysis/ValueTracking.cpp
2560

Why is it necessary to explicitly handle this case? It should be fine to report that two empty phis are non-equal (they're poison, so they can be either equal or not equal, or both at the same time).

2570

You need to adjust the context instruction when recursing into phis. Grep for RecQ in this file.

2575

Why do we still need this code at all? With the improved handling for multiplies, we shouldn't need this special check anymore. Your tests still pass if I delete this code entirely.

2640

You are incrementing the Depth twice, once here, and once inside isNonEqualPHIs. I think you want to drop the increment here.

llvm/test/Analysis/ValueTracking/non-equal-two-phis.ll
1 ↗(On Diff #332821)

These tests are still testing the isKnownNonEqual() functionality via load forwarding and BasicAA. It would be better to make these -instsimplify tests that that fold an icmp to true or false.

nikic retitled this revision from [Alias] Add a ah-hoc pattern with two PHI for isKnownNonEqual to [ValueTracking] Handle two PHIs in isKnownNonEqual().Mar 24 2021, 1:00 PM
jaykang10 added inline comments.Mar 25 2021, 1:50 AM
llvm/lib/Analysis/ValueTracking.cpp
2560

There was a regression. Below loop uses predecessors(PN1->getParent()) as range of loop. On optimization pass, there was a unreachable PN's block. In this case, the block did not have predecessors and it caused wrong return value. Let me update it.

2570

Yep, I will update it.

2575

You are right!!! I will delete it.

2640

Yep, I will update it.

llvm/test/Analysis/ValueTracking/non-equal-two-phis.ll
1 ↗(On Diff #332821)

Yep, I will add the tests and delete this test.

jaykang10 updated this revision to Diff 333233.Mar 25 2021, 1:55 AM

Following comments of @nikic, updated code.

Thank you!
This looks about right to me now.

llvm/lib/Analysis/ValueTracking.cpp
2565–2567
jaykang10 added inline comments.Mar 25 2021, 10:10 AM
llvm/lib/Analysis/ValueTracking.cpp
2565–2567

Yep, I will update it.

Following comment of @lebedev.ri, updated code.

This looks mostly good. Only remaining issue is that we need to aggressively limit recursion when it comes to phi nodes. What most other ValueTracking code does is to use MaxAnalysisRecursionDepth - 1 as the depth for recursion over phis, which means that only one more level of recursion is allowed. If that still covers your motivating case, then I'd suggest to follow that pattern. If that doesn't cover your motivating case, then the alternative here would be to only allow full recursion over one pair of phi operands, and only check that the others are trivially non-equal (i.e. distinct ConstantInts). That should definitely cover your case, and would also be in line with existing practice for isKnownNonEqual(). Either way is fine by me.

llvm/lib/Analysis/ValueTracking.cpp
2564

This should be using IncomBB->getTerminator() as the context, not the phi node itself.

llvm/test/Analysis/ValueTracking/monotonic-phi.ll
305 ↗(On Diff #333336)

We should also have a negative test for the case where the start value is the same (e.g. 2, 2 instead of 2. 3).

known-non-equal.ll might be a better fit for these tests.

jaykang10 added inline comments.Mar 25 2021, 11:15 AM
llvm/lib/Analysis/ValueTracking.cpp
2564

Ah... sorry. I will update it.

llvm/test/Analysis/ValueTracking/monotonic-phi.ll
305 ↗(On Diff #333336)

Yep, I will update it.

Following comment of @nikic, updated code.

This looks mostly good. Only remaining issue is that we need to aggressively limit recursion when it comes to phi nodes. What most other ValueTracking code does is to use MaxAnalysisRecursionDepth - 1 as the depth for recursion over phis, which means that only one more level of recursion is allowed. If that still covers your motivating case, then I'd suggest to follow that pattern. If that doesn't cover your motivating case, then the alternative here would be to only allow full recursion over one pair of phi operands, and only check that the others are trivially non-equal (i.e. distinct ConstantInts). That should definitely cover your case, and would also be in line with existing practice for isKnownNonEqual(). Either way is fine by me.

@nikic Thanks for kind comments. I can see my case is fixed with MaxAnalysisRecursionDepth - 1.

This comment was removed by nikic.

Used MaxAnalysisRecursionDepth - 1 for limit of recursion with phi.

nikic added inline comments.Mar 25 2021, 12:28 PM
llvm/lib/Analysis/ValueTracking.cpp
2554

Ah sorry, this isn't what I meant. What the other code does is pass MaxAnalysisRecursionDepth - 1 to the recursive call (i.e. the recursive isKnownNonEqual call in this case). I just tried this, but unfortunately this would not cover your case (it needs to recurse two more levels).

jaykang10 added inline comments.Mar 25 2021, 1:27 PM
llvm/lib/Analysis/ValueTracking.cpp
2554

um... only one pair of phi operands can use full recursion...

bool UsedFullRecursion = false;
for (const BasicBlock *IncomBB : PN1->blocks()) {
  if (!VisitedBBs.insert(IncomBB).second)
    continue; // Don't reprocess blocks that we have dealt with already.
  const Value *IV1 = PN1->getIncomingValueForBlock(IncomBB);
  const Value *IV2 = PN2->getIncomingValueForBlock(IncomBB);
  const ConstantInt *C1 = dyn_cast<ConstantInt>(IV1);
  const ConstantInt *C2 = dyn_cast<ConstantInt>(IV2);
  if (C1 && C2) {
    if (C1->getValue().eq(C2->getValue()))
      return false;
  } else {
    // Only one pair of phi operands is allowed for full recursion.
    if (UsedFullRecursion)
      return false;

    Query RecQ = Q; 
    RecQ.CxtI = IncomBB->getTerminator();
    if (!isKnownNonEqual(IV1, IV2, Depth + 1, RecQ))
      return false;
    UsedFullRecursion = true;
  }    
}

How you you think about above one? Is it acceptable?

nikic added inline comments.Mar 25 2021, 3:16 PM
llvm/lib/Analysis/ValueTracking.cpp
2554

Yeah, that looks reasonable to me. Maybe I would write the start as:

const APInt *C1, *C2;
if (match(IV1, m_APInt(C1)) && match(IV2, m_APInt(C2)) && *C1 != *C2)
  continue;

if (UsedFullRecursion)
  return false;
// etc.
jaykang10 added inline comments.Mar 25 2021, 3:31 PM
llvm/lib/Analysis/ValueTracking.cpp
2554

Yep, It looks better. I will update it.

jaykang10 updated this revision to Diff 333440.Mar 25 2021, 3:31 PM

Allowed only one pair of phi operands for full recursion.

nikic accepted this revision.Mar 25 2021, 3:35 PM

LGTM, thanks for pulling this through!

This revision is now accepted and ready to land.Mar 25 2021, 3:35 PM

LGTM, thanks for pulling this through!

Thanks for good comments and guide. :)

This revision was landed with ongoing or failed builds.Mar 25 2021, 3:57 PM
This revision was automatically updated to reflect the committed changes.