This is an archive of the discontinued LLVM Phabricator instance.

Fix the side effect of outlined function when the register is implicit use and implicit-def in the same instruction.
ClosedPublic

Authored by DianQK on Nov 1 2021, 2:01 AM.

Details

Summary

This is the diff associated with D95267: Add the use of register r for outlined function when register r is live in and defined later., and we need to mark $x0 as live whether or not $x0 is dead.

The compiler also needs to mark register $x0 as live in for the following case.

$x1 = ADDXri $sp, 16, 0
BL @spam, csr_darwin_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $x0, implicit killed $x1, implicit-def $sp, implicit-def $x0

This change fixes an issue where the wrong registers were used when -machine-outliner-reruns>0.
As an example:

typedef struct {
    double v1;
    double v2;
} D16;

typedef struct {
    D16 v1;
    D16 v2;
} D32;

typedef long long LL8;
typedef struct {
    long long v1;
    long long v2;
} LL16;
typedef struct {
    LL16 v1;
    LL16 v2;
} LL32;

typedef struct {
    LL32 v1;
    LL32 v2;
} LL64;

LL8 needx0(LL8 v0, LL8 v1);

void bar(LL64 v1, LL32 v2, LL16 v3, LL32 v4, LL8 v5, D16 v6, D16 v7, D16 v8);

LL8 foo(LL8 v0, LL64 v1, LL32 v2, LL16 v3, LL32 v4, LL8 v5, D16 v6, D16 v7, D16 v8)
{
  LL8 result = needx0(v0, 0);
  bar(v1, v2, v3, v4, v5, v6, v7, v8);
  return result + 1;
}

As you can see from the foo function, we should not modify the value of x0 until we call needx0.
This code is compiled to give the following instruction MIR code.

$sp = frame-setup SUBXri $sp, 256, 0
frame-setup STPDi killed $d13, killed $d12, $sp, 16
frame-setup STPDi killed $d11, killed $d10, $sp, 18
frame-setup STPDi killed $d9, killed $d8, $sp, 20

frame-setup STPXi killed $x26, killed $x25, $sp, 22
frame-setup STPXi killed $x24, killed $x23, $sp, 24
frame-setup STPXi killed $x22, killed $x21, $sp, 26
frame-setup STPXi killed $x20, killed $x19, $sp, 28
...
$x1 = MOVZXi 0, 0
BL @needx0, csr_darwin_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $x0, implicit $x1, implicit-def $sp, implicit-def $x0
...

Since there are some other instruction sequences that duplicate foo, after the first execution of Machine Outliner you will get:

$sp = frame-setup SUBXri $sp, 256, 0
frame-setup STPDi killed $d13, killed $d12, $sp, 16
frame-setup STPDi killed $d11, killed $d10, $sp, 18
frame-setup STPDi killed $d9, killed $d8, $sp, 20

$x7 = ORRXrs $xzr, $lr, 0
BL @OUTLINED_FUNCTION_0, implicit-def $lr, implicit $sp, implicit-def $lr, implicit $sp, implicit $xzr, implicit $x7, implicit $x19, implicit $x20, implicit $x21, implicit $x22, implicit $x23, implicit $x24, implicit $x25, implicit $x26
$lr = ORRXrs $xzr, $x7, 0
...
BL @OUTLINED_FUNCTION_1, implicit-def $lr, implicit $sp, implicit-def $lr, implicit-def $sp, implicit-def $x0, implicit-def $x1, implicit $sp
...

For the first time we outlined the following sequence:

frame-setup STPXi killed $x26, killed $x25, $sp, 22
frame-setup STPXi killed $x24, killed $x23, $sp, 24
frame-setup STPXi killed $x22, killed $x21, $sp, 26
frame-setup STPXi killed $x20, killed $x19, $sp, 28

and

$x1 = MOVZXi 0, 0
BL @needx0, csr_darwin_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $x0, implicit $x1, implicit-def $sp, implicit-def $x0

When we execute the outline again, we will get:

$x0 = ORRXrs $xzr, $lr, 0 <---- here
BL @OUTLINED_FUNCTION_2_0, implicit-def $lr, implicit $sp, implicit-def $sp, implicit-def $lr, implicit $sp, implicit $xzr, implicit $d8, implicit $d9, implicit $d10, implicit $d11, implicit $d12, implicit $d13, implicit $x0
$lr = ORRXrs $xzr, $x0, 0

$x7 = ORRXrs $xzr, $lr, 0
BL @OUTLINED_FUNCTION_0, implicit-def $lr, implicit $sp, implicit-def $lr, implicit $sp, implicit $xzr, implicit $x7, implicit $x19, implicit $x20, implicit $x21, implicit $x22, implicit $x23, implicit $x24, implicit $x25, implicit $x26
$lr = ORRXrs $xzr, $x7, 0
...
BL @OUTLINED_FUNCTION_1, implicit-def $lr, implicit $sp, implicit-def $lr, implicit-def $sp, implicit-def $x0, implicit-def $x1, implicit $sp

When calling OUTLINED_FUNCTION_2_0, we used x0 to save the lr register.
The reason for the above error appears to be that:

BL @OUTLINED_FUNCTION_1, implicit-def $lr, implicit $sp, implicit-def $lr, implicit-def $sp, implicit-def $x0, implicit-def $x1, implicit $sp

should be:

BL @OUTLINED_FUNCTION_1, implicit-def $lr, implicit $sp, implicit-def $lr, implicit-def $sp, implicit-def $x0, implicit-def $x1, implicit $sp, implicit $x0

When processing the same instruction with both implicit-def $x0 and implicit $x0 we should keep implicit $x0.
A reproducible demo is available at: https://github.com/DianQK/reproduce_outlined_function_use_live_x0.

Diff Detail

Event Timeline

DianQK created this revision.Nov 1 2021, 2:01 AM
DianQK requested review of this revision.Nov 1 2021, 2:01 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 1 2021, 2:01 AM
DianQK edited the summary of this revision. (Show Details)Nov 1 2021, 2:16 AM
DianQK edited the summary of this revision. (Show Details)Nov 1 2021, 2:41 AM
DianQK edited the summary of this revision. (Show Details)Nov 1 2021, 2:44 AM
DianQK updated this revision to Diff 383992.Nov 2 2021, 12:56 AM

remove !MOP.isDead() && and machine-outliner-side-effect-3.mir

DianQK edited the summary of this revision. (Show Details)Nov 2 2021, 12:57 AM
DianQK updated this revision to Diff 384023.Nov 2 2021, 3:10 AM

reformat

DianQK edited the summary of this revision. (Show Details)Nov 6 2021, 3:17 AM
DianQK edited the summary of this revision. (Show Details)
DianQK edited the summary of this revision. (Show Details)Nov 7 2021, 1:31 AM
jinlin added a comment.EditedNov 12 2021, 4:01 PM

Here is the output of first iteration machine outliner.
llc -march=arm64 -run-pass=machine-outliner -machine-outliner-reruns=0 --simplify-mir -verify-machineinstrs build/example.mir -o build/example.outlined.mir0

The implicit use of $x0 for outlined function OUTLINED_FUNCTION_1 is missing in this case. What we should do is to check whether implict-def and implict use of $x0 are in the same instruction or not. If so, we should ignore the implict-def. So the fix does serve this purpose.

before:

$x1 = MOVZXi 0, 0
BL @needx0, csr_darwin_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $x0, implicit killed $x1, implicit-def $sp, implicit-def $x0
renamable $w0 = nsw ADDWri renamable $w0, 2, 0, implicit killed $x0

After:

frame-setup CFI_INSTRUCTION offset $w29, -16 
BL @OUTLINED_FUNCTION_1, implicit-def $lr, implicit $sp, implicit-def $lr, implicit-def $sp, implicit-def $x0, implicit-def $x1, implicit $sp
renamable $x0 = nsw ADDXri killed renamable $x0, 1, 0

name: OUTLINED_FUNCTION_1

body: |

bb.0:
  liveins: $d15, $x19, $x20, $x21, $x22, $x23, $x24, $x25, $x26, $x27, $x28, $d8, $d9, $d10, $d11, $d12, $d13, $d14, $x0
 
  $x1 = MOVZXi 0, 0
  TCRETURNdi @needx0, 0, implicit $sp
jinlin added inline comments.Nov 12 2021, 4:44 PM
llvm/test/CodeGen/AArch64/machine-outliner-side-effect-2.mir
19 ↗(On Diff #384669)

Why do you need to change this test file? With this fix, $x0 should be marked as use.

DianQK updated this revision to Diff 387013.Nov 13 2021, 3:43 AM

remove unnecessary contents

DianQK marked an inline comment as done.Nov 13 2021, 4:04 AM

Here is the output of first iteration machine outliner.
llc -march=arm64 -run-pass=machine-outliner -machine-outliner-reruns=0 --simplify-mir -verify-machineinstrs build/example.mir -o build/example.outlined.mir0

The implicit use of $x0 for outlined function OUTLINED_FUNCTION_1 is missing in this case. What we should do is to check whether implict-def and implict use of $x0 are in the same instruction or not. If so, we should ignore the implict-def. So the fix does serve this purpose.

before:

$x1 = MOVZXi 0, 0
BL @needx0, csr_darwin_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $x0, implicit killed $x1, implicit-def $sp, implicit-def $x0
renamable $w0 = nsw ADDWri renamable $w0, 2, 0, implicit killed $x0

After:

frame-setup CFI_INSTRUCTION offset $w29, -16 
BL @OUTLINED_FUNCTION_1, implicit-def $lr, implicit $sp, implicit-def $lr, implicit-def $sp, implicit-def $x0, implicit-def $x1, implicit $sp
renamable $x0 = nsw ADDXri killed renamable $x0, 1, 0

name: OUTLINED_FUNCTION_1

body: |

bb.0:
  liveins: $d15, $x19, $x20, $x21, $x22, $x23, $x24, $x25, $x26, $x27, $x28, $d8, $d9, $d10, $d11, $d12, $d13, $d14, $x0
 
  $x1 = MOVZXi 0, 0
  TCRETURNdi @needx0, 0, implicit $sp

Yes, we should do is to check whether implict-def and implict use of $x0 are in the same instruction or not.

llvm/test/CodeGen/AArch64/machine-outliner-side-effect-2.mir
19 ↗(On Diff #384669)

I wanted to show that this has nothing to do with the dead attribute, which is unnecessary and has been removed.

19 ↗(On Diff #384669)

Why do you need to change this test file? With this fix, $x0 should be marked as use.

DianQK edited the summary of this revision. (Show Details)Nov 13 2021, 4:08 AM

I have got your point. Please still keep your changes in file machine-outliner-side-effect-2.mir. Please update the comment in that file to state that the case of implict-def and implict use of $x0 in the same instruction are handled correctly.

DianQK updated this revision to Diff 387052.Nov 13 2021, 5:53 PM
DianQK marked an inline comment as done.

update machine-outliner-side-effect-2.mir

This comment was removed by jinlin.
jinlin accepted this revision.Nov 13 2021, 7:04 PM
This revision is now accepted and ready to land.Nov 13 2021, 7:04 PM
jinlin retitled this revision from Fix the side effect of outlined function when register is implicit and implicit-def to Fix the side effect of outlined function when register is implicit use and implicit-def in the same instruction..Nov 13 2021, 7:05 PM
jinlin retitled this revision from Fix the side effect of outlined function when register is implicit use and implicit-def in the same instruction. to Fix the side effect of outlined function when the register is implicit use and implicit-def in the same instruction..

Many thanks for your review @jinlin. I don’t have commit access, can you land this patch for me? Please use “DianQK dianqk@icloud.com” to commit the change.

DianQK updated this revision to Diff 387063.Nov 14 2021, 3:22 AM

rebase the patch to make sure everything is based on an up-to-date origin/main.