This is an archive of the discontinued LLVM Phabricator instance.

[MachinePipeliner] Fix Phi generation failure for large stages
ClosedPublic

Authored by ytmukai on Jun 15 2022, 2:54 AM.

Details

Summary

The previous code overwrites VRMap for prologue stages during Phi
generation if a register spans many stages.
As a result, the wrong register is used as the one coming from
the prologue in Phis at later stages. (A process exists to correct
this, but it does not work in all cases.)
In addition, VRMap for prologue must be preserved until addBranches().

This patch fixes them by separating the map for Phis into a different
variable (VRMapPhi).

Diff Detail

Event Timeline

ytmukai created this revision.Jun 15 2022, 2:54 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 15 2022, 2:54 AM
ytmukai requested review of this revision.Jun 15 2022, 2:54 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 15 2022, 2:54 AM
lkail added a reviewer: jsji.Jun 15 2022, 4:28 AM
lkail added a subscriber: lkail.

Here are more details about the failure.
The code below is the original code for the added test case.

void f(float *restrict a, float *b, int n) {
#pragma clang loop pipeline_initiation_interval(3)
  for (int i=0; i<n; i++)
    a[i] = b[i]+1+b[i];
}

The current code produces the following error.

$ clang test.c --target=powerpc64le-unknown-linux-gnu -mcpu=pwr9 -mllvm --ppc-enable-pipeliner -mllvm -pipeliner-max-stages=10 -mllvm -verify-machineinstrs -O -c
...
bb.9.for.body:
...
  %42:f4rc, %43:g8rc_and_g8rc_nox0 = LFSU 4, %39:g8rc_and_g8rc_nox0(tied-def 1) :: (load unknown-size from %ir.3, align 4, !tbaa !5)
...
bb.12:
...
  %72:f4rc = PHI %47:f4rc, %bb.10, %51:f4rc, %bb.11
  %73:f4rc = PHI %42:f4rc, %bb.10, %59:f4rc, %bb.11
  %74:f4rc = PHI %37:f4rc, %bb.10, %60:f4rc, %bb.11
  %75:f4rc = PHI %32:f4rc, %bb.10, %61:f4rc, %bb.11
...
bb.13:
...
  %85:f4rc = PHI %75:f4rc, %bb.9, %72:f4rc, %bb.12
  %86:f4rc = PHI %37:f4rc, %bb.9, %73:f4rc, %bb.12
  %87:f4rc = PHI %32:f4rc, %bb.9, %74:f4rc, %bb.12
  %88:f4rc = PHI %28:f4rc, %bb.9, %75:f4rc, %bb.12
...
*** Bad machine code: PHI operand is not live-out from predecessor ***
- function:    f
- basic block: %bb.13  (0x5645be4f36e0)
- instruction: %85:f4rc = PHI %75:f4rc, %bb.9, %72:f4rc, %bb.12
- operand 1:   %75:f4rc

*** Bad machine code: Virtual register defs don't dominate all uses. ***
- function:    f
- v. register: %75
...

The Phi operand from bb.9 must be %42, but is overwritten by %75 in bb.12.
This error may not occur under the default maximum number of stages (3).
However, I would like to use pipelining with a larger number of stages.

This change does not affect the generated code for the existing success cases (confirmed by the CodeGen tests for Hexagon, PowerPC and ARM).

rovka added a subscriber: rovka.Jul 22 2022, 12:46 AM

@ytmukai There seem to be some OpenMP failures in the premerge CI. I'm guessing they're not related to your patch - could you please rebase? That should give us more up-to-date CI results.

@rovka Thank you for the advice. I re-ran the test on the current revision. It still fails on fuzzer and libomp. However, it is not related to this patch. Other patches have failed as well, and the pipeliner pass does not work on x86 target.
Instead, I locally confirmed the tests for the pipeliner passed for the current revision as follows.

llvm-lit llvm/test/CodeGen/ARM llvm/test/CodeGen/Hexagon llvm/test/CodeGen/PowerPC llvm/test/CodeGen/Thumb2
rovka added a comment.Aug 2 2022, 6:13 AM

Thanks for rebasing! I agree that the test failures are unrelated to your patch.
I'm not familiar with this pass, so I'm probably not the best person to review the code changes, but I think the test needs to be updated (see comment).

llvm/test/CodeGen/PowerPC/sms-large-stages.mir
127

I'm noticing a mix of FileCheck variables (e.g. LFSU) and plain MIR numbered variables (e.g. %29). Is this how the script generated the checks? In general, I think it's preferred to replace numbered variables with either FileCheck variables if they're relevant for the test, or {{.*}} otherwise.

This is an interesting bug. The patch does fix a flaw with the current Phi generation by the pipeliner. I think the approach here is a good one. It is a little difficult to understand the reason needed for the two different VRMaps though (much of the code in generatePhis is difficult to understand). But, the two maps do reduce some of the complexity in the code, which is good. I'll accept it once the test case comment is resolved. Thanks for fix the issue!

ytmukai updated this revision to Diff 449852.Aug 4 2022, 5:14 AM

Thanks for the reviews! I fixed the test case to use FileCheck variables. update_mir_test_checks.py didn't replace MIR numbered variables because it doesn't recognize the instructions with the flag nofpexcept.

ytmukai marked an inline comment as done.Aug 7 2022, 7:12 PM

@bcahoon Thanks for the review. I resolved the test case comment.

llvm/test/CodeGen/PowerPC/sms-large-stages.mir
127

Sorry, I forgot to check for done. I fixed the test case to use FileCheck variables. update_mir_test_checks.py didn't replace MIR numbered variables because it doesn't recognize the instructions with the flag nofpexcept.

ytmukai marked an inline comment as done.Aug 7 2022, 7:14 PM
bcahoon accepted this revision.Aug 8 2022, 7:00 PM

Thanks for the fix. LGTM.

This revision is now accepted and ready to land.Aug 8 2022, 7:00 PM
This revision was landed with ongoing or failed builds.Aug 8 2022, 9:15 PM
This revision was automatically updated to reflect the committed changes.