This is an archive of the discontinued LLVM Phabricator instance.

[LiveVariables] Don't set undef reg PHI used as live for FromMBB
ClosedPublic

Authored by ZhangKang on May 17 2020, 1:04 AM.

Details

Summary

In the patch, https://reviews.llvm.org/D73152 , it adds a new function LiveVariables::addNewBlock.
This new function will add the reg which PHI used to the MBB which reg is from.
But the new function may cause LiveVariable Verification failed when the Src reg in PHI is undef.

For example:

bb.0:
  ... // no-definiton for %2
bb.1:
  ...  // definiton for %1
bb.2:
  %3:g8rc = PHI %1, %bb.1, undef %2:g8rc, %bb.0

It's obvious that, the %2 shouldn't live in bb.0, but the patch D73152 will set it live for bb.0.

If you have enabled all the pass verification in llvm/lib/CodeGen/TargetPassConfig.cpp,
then, you can use below command to reproduce the bug.

llvm-lit -a llvm/test/CodeGen/AArch64/shrink-wrap.ll

There is no test case for this patch, because the verification for LiveVariable pass has other errors,
we can't enable the verification for the LiveVariable pass and PHIElimination pass.

From the LiveVariables::analyzePHINodes, we can know we can use getOperand(i).readsReg() to avoid above bug.

Diff Detail

Event Timeline

ZhangKang created this revision.May 17 2020, 1:04 AM
ZhangKang edited the summary of this revision. (Show Details)May 17 2020, 1:05 AM
bb.0:
  ... // no-definiton for %2
bb.1:
  ...
bb.2:
  %3:g8rc = PHI %1, %bb.0, undef %2:g8rc, %bb.0

I am confused with the above PHI node in your description. Is it %3:g8rc = PHI [%1, %bb.0], [undef %2:g8rc, %bb.0]
If so, do you mean:
1: the PHI node get two different values from the same block %bb.0?
2: for undef value we still generate %2:g8rc as its def?

Maybe adding a more runnable test case is better.

1: the PHI node get two different values from the same block %bb.0?

It's a typo, it should be

%3:g8rc = PHI %1, %bb.1, undef %2:g8rc, %bb.0`

I will update it.

2: for undef value we still generate %2:g8rc as its def?

In fact, after the PHIElimination pass, if there is no use for %2:g8rc, it will be removed. But the patch D73152 has added it into LivrVariables for %bb.0, it's not correct.

Maybe adding a more runnable test case is better.

The Verification for LiveVariable pass and PHIElimination pass has been disabled, so it's hard to find a case to hit this bug. Because the two passes has some other errors,we can't enable verification for the two passes.

ZhangKang edited the summary of this revision. (Show Details)May 17 2020, 8:54 PM

In fact, after the PHIElimination pass, if there is no use for %2:g8rc, it will be removed. But the patch D73152 has added it into LivrVariables for %bb.0, it's not correct.

If here the undef is for %2:g8rc, is it possible to move the removal action for %2:g8rc to the time when we create undef for it?

bjope added a comment.May 18 2020, 3:36 AM

Can you add a test case for this? I guess it wouldn't be too hard to create a reduced mir test case running "-run-pass livevars -run-pass phi-node-elimination" and then the machine verifier (either by using -run-pass or by using "-verify-machineinstrs -verify-each-machine-pass").

lkail added a subscriber: lkail.May 19 2020, 1:57 AM

For example:

bb.0:

... // no-definiton for %2

bb.1:

...  // definiton for %1

bb.2:

%3:g8rc = PHI %1, %bb.1, undef %2:g8rc, %bb.0

It's obvious that, the %2 shouldn't live in bb.0, but the patch D73152 will set it live for bb.0.

I think it makes sense that %2 live through bb.0, as comment of VarInfo in LiveVariables.h writes

/// PHI nodes complicate things a bit.  If a PHI node is the last user of a
/// value in one of its predecessor blocks, it is not listed in the kills set,
/// but does include the predecessor block in the AliveBlocks set (unless that
/// block also defines the value).
bjope added a comment.May 19 2020, 2:55 AM

For example:

bb.0:

... // no-definiton for %2

bb.1:

...  // definiton for %1

bb.2:

%3:g8rc = PHI %1, %bb.1, undef %2:g8rc, %bb.0

It's obvious that, the %2 shouldn't live in bb.0, but the patch D73152 will set it live for bb.0.

I think it makes sense that %2 live through bb.0, as comment of VarInfo in LiveVariables.h writes

/// PHI nodes complicate things a bit.  If a PHI node is the last user of a
/// value in one of its predecessor blocks, it is not listed in the kills set,
/// but does include the predecessor block in the AliveBlocks set (unless that
/// block also defines the value).

Since PHI elimination runs after having removed IMPLICIT_DEF:s, the PHI node here is the first reference to %2. So there is no def of %2 in any predecessor blocks.
So the fix actually make sense to me. But it had been nice to have a test case.

Maybe I was unclear in my previous request about a test case. If I understand it correctly the verifier might not always work after PHI elimination (for some reason we don't normally run it, and I guess it has to do with some verifications either assuming that we have SSA form, and some that we don't have SSA form, and here we still are in the SSA deconstruction phase). But my idea was that perhaps a reduced MIR test case could work (identifying the problem that we want to solve here, but being small enough not to trigger any of the problems with running the verifier after PHIElmination).
If that doesn't work, then at least adding a RUN line in llvm/test/CodeGen/AArch64/shrink-wrap.ll that uses -verify-machineinstrs -verify-each-machine-pass could work as a regression test for the fix.

Since PHI elimination runs after having removed IMPLICIT_DEF:s, the PHI node here is the first reference to %2. So there is no def of %2 in any predecessor blocks.

I am not familiar with the PHI elimination pass, but the IR %3:g8rc = PHI [%1, %bb.0], [undef %2:g8rc, %bb.0] seems a little strange to me. We have two incoming values from bb %bb.0, undef and %2:g8rc. Which one is right? and what if %2:g8rc has a definition in predecessor blocks? I think if undef is generated in PHI elimination pass based on that %2:g8rc has no definition, then why don't we just eliminate %2:g8rc when we add undef use?

lkail added a comment.EditedMay 19 2020, 4:21 AM

We have two incoming values from bb %bb.0, undef and %2:g8rc

Should be one, i.e., undef is a flag of the register operand.

We have two incoming values from bb %bb.0, undef and %2:g8rc

Should be one, i.e., undef is a flag of the register operand.

Ah, right, I checked the case shrink-wrap.ll, after pass Process Implicit Definitions, many undef FLAG are added to register operands. I misunderstood the undef flag with undef value in llvm IR representation.

ZhangKang added a comment.EditedMay 19 2020, 8:06 AM

@bjope , we can use below command to reproduce the error:

llc -mtriple=aarch64-linux-gnu shrink-wrap.ll  -o stop-before.processimpdefs.mir -verify-machineinstrs -stop-before=processimpdefs
llc -mtriple=aarch64-linux-gnu stop-before.processimpdefs.mir -verify-machineinstrs -run-pass=processimpdefs,livevars,machine-loops,phi-node-elimination,twoaddressinstruction

Then ,we will get below error:

*** Bad machine code: LiveVariables: Block should not be in AliveBlocks ***
- function:    func
- basic block: %bb.19  (0x1001cf4fde0)
Virtual register %59 is not needed live through the block.
LLVM ERROR: Found 1 machine code errors.
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:

But the stop-before.processimpdefs.mir has more than 500 lines, I am using bugpoint to reduce the case, but it doesn't narrow down it.

ZhangKang updated this revision to Diff 264940.May 19 2020, 9:06 AM

Add a test case.

bjope added a comment.EditedMay 19 2020, 12:44 PM

Maybe it could be considered as cheating a bit, but it seems like PHIElimination has some options that can help trigger the critical edge splitting without even having a loop.
So by adding some extra cmd line options I managed to create a reproducer that is quite similar to your description:

# RUN: llc -mtriple=aarch64-linux-gnu -verify-machineinstrs -o /dev/null %s \
# RUN:   -run-pass=livevars,phi-node-elimination,twoaddressinstruction \
# RUN:   -no-phi-elim-live-out-early-exit=1 -phi-elim-split-all-critical-edges=1

# Used to result in
#
#     *** Bad machine code: LiveVariables: Block should not be in AliveBlocks ***
#
# Just verify that we do not crash (or get verifier error).

---
name: test
tracksRegLiveness: true
body: |
  bb.0:
    liveins: $nzcv, $wzr
    Bcc 8, %bb.2, implicit $nzcv

  bb.1:
    %x:gpr32 = COPY $wzr

  bb.2:
    %y:gpr32 = PHI %x:gpr32, %bb.1, undef %undef:gpr32, %bb.0
    $wzr = COPY %y:gpr32
...
ZhangKang updated this revision to Diff 265110.May 19 2020, 6:20 PM

Use a small test case.

Maybe it could be considered as cheating a bit, but it seems like PHIElimination has some options that can help trigger the critical edge splitting without even having a loop.
So by adding some extra cmd line options I managed to create a reproducer that is quite similar to your description:

Yes, your small case can work well. I have used it. Thank you very much.

lkail added a comment.EditedMay 19 2020, 7:17 PM

It looks like a bug in phi-node-elimination to me. When critical edge bb.0 -> bb.2 is splitted and bb.3 is inserted, undef %undef is transformed into %3 = IMPLICIT_DEF in bb.3. LV in bb.2 is not updated in memory(If we run phi-node-elimination and twoaddressinstruction separately, we won't run into crash).

It looks like a bug in phi-node-elimination to me. When critical edge bb.0 -> bb.2 is splitted and bb.3 is inserted, undef %undef is transformed into %3 = IMPLICIT_DEF in bb.3. LV in bb.2 is not updated in memory(If we run phi-node-elimination and twoaddressinstruction separately, we won't run into crash).

It's not a bug, I have checked the LiveVariables has been updated before exit the machineverifier pass.
If you use below command you can also get the error.

llc -verify-machineinstrs -mtriple=aarch64-linux-gnu -verify-machineinstrs PHIElimination-crash.mir    -run-pass=livevars,phi-node-elimination,phi-node-elimination -no-phi-elim-live-out-early-exit=1 -phi-elim-split-all-critical-edges=1 -o runpass-phi-node-elimination.mir

If the last pass is machineverifier pass, the LiveVariables will not be preserved, so the last machineverifier pass won't get the LiveVariables and it doesn't check LiveVariables.
Above command run phi-node-elimination pass twice.

@bjope Do you have any other comments for this patch?

lkail added a comment.Jun 3 2020, 12:39 AM

If you use below command you can also get the error.

llc -verify-machineinstrs -mtriple=aarch64-linux-gnu -verify-machineinstrs PHIElimination-crash.mir    -run-pass=livevars,phi-node-elimination,phi-node-elimination -no-phi-elim-live-out-early-exit=1 -phi-elim-split-all-critical-edges=1 -o runpass-phi-node-elimination.mir

If the last pass is machineverifier pass, the LiveVariables will not be preserved, so the last machineverifier pass won't get the LiveVariables and it doesn't check LiveVariables.
Above command run phi-node-elimination pass twice.

Thanks for correcting my mistake.

bjope accepted this revision.Jun 3 2020, 6:28 AM

LGTM, thanks!

This revision is now accepted and ready to land.Jun 3 2020, 6:28 AM
This revision was automatically updated to reflect the committed changes.