This is an archive of the discontinued LLVM Phabricator instance.

Fix the side effect of outlined function when the register is implicit use and implicit-def in the same instruction.
ClosedPublic

Authored by DianQK on Nov 1 2021, 2:01 AM.

Details

Summary

This is the diff associated with D95267: Add the use of register r for outlined function when register r is live in and defined later., and we need to mark $x0 as live whether or not $x0 is dead.

The compiler also needs to mark register $x0 as live in for the following case.

$x1 = ADDXri $sp, 16, 0
BL @spam, csr_darwin_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $x0, implicit killed $x1, implicit-def $sp, implicit-def $x0

This change fixes an issue where the wrong registers were used when -machine-outliner-reruns>0.
As an example:

typedef struct {
    double v1;
    double v2;
} D16;

typedef struct {
    D16 v1;
    D16 v2;
} D32;

typedef long long LL8;
typedef struct {
    long long v1;
    long long v2;
} LL16;
typedef struct {
    LL16 v1;
    LL16 v2;
} LL32;

typedef struct {
    LL32 v1;
    LL32 v2;
} LL64;

LL8 needx0(LL8 v0, LL8 v1);

void bar(LL64 v1, LL32 v2, LL16 v3, LL32 v4, LL8 v5, D16 v6, D16 v7, D16 v8);

LL8 foo(LL8 v0, LL64 v1, LL32 v2, LL16 v3, LL32 v4, LL8 v5, D16 v6, D16 v7, D16 v8)
{
  LL8 result = needx0(v0, 0);
  bar(v1, v2, v3, v4, v5, v6, v7, v8);
  return result + 1;
}

As you can see from the foo function, we should not modify the value of x0 until we call needx0.
This code is compiled to give the following instruction MIR code.

$sp = frame-setup SUBXri $sp, 256, 0
frame-setup STPDi killed $d13, killed $d12, $sp, 16
frame-setup STPDi killed $d11, killed $d10, $sp, 18
frame-setup STPDi killed $d9, killed $d8, $sp, 20

frame-setup STPXi killed $x26, killed $x25, $sp, 22
frame-setup STPXi killed $x24, killed $x23, $sp, 24
frame-setup STPXi killed $x22, killed $x21, $sp, 26
frame-setup STPXi killed $x20, killed $x19, $sp, 28
...
$x1 = MOVZXi 0, 0
BL @needx0, csr_darwin_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $x0, implicit $x1, implicit-def $sp, implicit-def $x0
...

Since there are some other instruction sequences that duplicate foo, after the first execution of Machine Outliner you will get:

$sp = frame-setup SUBXri $sp, 256, 0
frame-setup STPDi killed $d13, killed $d12, $sp, 16
frame-setup STPDi killed $d11, killed $d10, $sp, 18
frame-setup STPDi killed $d9, killed $d8, $sp, 20

$x7 = ORRXrs $xzr, $lr, 0
BL @OUTLINED_FUNCTION_0, implicit-def $lr, implicit $sp, implicit-def $lr, implicit $sp, implicit $xzr, implicit $x7, implicit $x19, implicit $x20, implicit $x21, implicit $x22, implicit $x23, implicit $x24, implicit $x25, implicit $x26
$lr = ORRXrs $xzr, $x7, 0
...
BL @OUTLINED_FUNCTION_1, implicit-def $lr, implicit $sp, implicit-def $lr, implicit-def $sp, implicit-def $x0, implicit-def $x1, implicit $sp
...

For the first time we outlined the following sequence:

frame-setup STPXi killed $x26, killed $x25, $sp, 22
frame-setup STPXi killed $x24, killed $x23, $sp, 24
frame-setup STPXi killed $x22, killed $x21, $sp, 26
frame-setup STPXi killed $x20, killed $x19, $sp, 28

and

$x1 = MOVZXi 0, 0
BL @needx0, csr_darwin_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $x0, implicit $x1, implicit-def $sp, implicit-def $x0

When we execute the outline again, we will get:

$x0 = ORRXrs $xzr, $lr, 0 <---- here
BL @OUTLINED_FUNCTION_2_0, implicit-def $lr, implicit $sp, implicit-def $sp, implicit-def $lr, implicit $sp, implicit $xzr, implicit $d8, implicit $d9, implicit $d10, implicit $d11, implicit $d12, implicit $d13, implicit $x0
$lr = ORRXrs $xzr, $x0, 0

$x7 = ORRXrs $xzr, $lr, 0
BL @OUTLINED_FUNCTION_0, implicit-def $lr, implicit $sp, implicit-def $lr, implicit $sp, implicit $xzr, implicit $x7, implicit $x19, implicit $x20, implicit $x21, implicit $x22, implicit $x23, implicit $x24, implicit $x25, implicit $x26
$lr = ORRXrs $xzr, $x7, 0
...
BL @OUTLINED_FUNCTION_1, implicit-def $lr, implicit $sp, implicit-def $lr, implicit-def $sp, implicit-def $x0, implicit-def $x1, implicit $sp

When calling OUTLINED_FUNCTION_2_0, we used x0 to save the lr register.
The reason for the above error appears to be that:

BL @OUTLINED_FUNCTION_1, implicit-def $lr, implicit $sp, implicit-def $lr, implicit-def $sp, implicit-def $x0, implicit-def $x1, implicit $sp

should be:

BL @OUTLINED_FUNCTION_1, implicit-def $lr, implicit $sp, implicit-def $lr, implicit-def $sp, implicit-def $x0, implicit-def $x1, implicit $sp, implicit $x0

When processing the same instruction with both implicit-def $x0 and implicit $x0 we should keep implicit $x0.
A reproducible demo is available at: https://github.com/DianQK/reproduce_outlined_function_use_live_x0.

Diff Detail

Event Timeline

DianQK created this revision.Nov 1 2021, 2:01 AM
DianQK requested review of this revision.Nov 1 2021, 2:01 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 1 2021, 2:01 AM
DianQK edited the summary of this revision. (Show Details)Nov 1 2021, 2:16 AM
DianQK edited the summary of this revision. (Show Details)Nov 1 2021, 2:41 AM
DianQK edited the summary of this revision. (Show Details)Nov 1 2021, 2:44 AM
DianQK updated this revision to Diff 383992.Nov 2 2021, 12:56 AM

remove !MOP.isDead() && and machine-outliner-side-effect-3.mir

DianQK edited the summary of this revision. (Show Details)Nov 2 2021, 12:57 AM
DianQK updated this revision to Diff 384023.Nov 2 2021, 3:10 AM

reformat

DianQK edited the summary of this revision. (Show Details)Nov 6 2021, 3:17 AM
DianQK edited the summary of this revision. (Show Details)
DianQK edited the summary of this revision. (Show Details)Nov 7 2021, 1:31 AM
jinlin added a comment.EditedNov 12 2021, 4:01 PM

Here is the output of first iteration machine outliner.
llc -march=arm64 -run-pass=machine-outliner -machine-outliner-reruns=0 --simplify-mir -verify-machineinstrs build/example.mir -o build/example.outlined.mir0

The implicit use of $x0 for outlined function OUTLINED_FUNCTION_1 is missing in this case. What we should do is to check whether implict-def and implict use of $x0 are in the same instruction or not. If so, we should ignore the implict-def. So the fix does serve this purpose.

before:

$x1 = MOVZXi 0, 0
BL @needx0, csr_darwin_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $x0, implicit killed $x1, implicit-def $sp, implicit-def $x0
renamable $w0 = nsw ADDWri renamable $w0, 2, 0, implicit killed $x0

After:

frame-setup CFI_INSTRUCTION offset $w29, -16 
BL @OUTLINED_FUNCTION_1, implicit-def $lr, implicit $sp, implicit-def $lr, implicit-def $sp, implicit-def $x0, implicit-def $x1, implicit $sp
renamable $x0 = nsw ADDXri killed renamable $x0, 1, 0

name: OUTLINED_FUNCTION_1

body: |

bb.0:
  liveins: $d15, $x19, $x20, $x21, $x22, $x23, $x24, $x25, $x26, $x27, $x28, $d8, $d9, $d10, $d11, $d12, $d13, $d14, $x0
 
  $x1 = MOVZXi 0, 0
  TCRETURNdi @needx0, 0, implicit $sp
jinlin added inline comments.Nov 12 2021, 4:44 PM
llvm/test/CodeGen/AArch64/machine-outliner-side-effect-2.mir
19–20

Why do you need to change this test file? With this fix, $x0 should be marked as use.

DianQK updated this revision to Diff 387013.Nov 13 2021, 3:43 AM

remove unnecessary contents

DianQK marked an inline comment as done.Nov 13 2021, 4:04 AM

Here is the output of first iteration machine outliner.
llc -march=arm64 -run-pass=machine-outliner -machine-outliner-reruns=0 --simplify-mir -verify-machineinstrs build/example.mir -o build/example.outlined.mir0

The implicit use of $x0 for outlined function OUTLINED_FUNCTION_1 is missing in this case. What we should do is to check whether implict-def and implict use of $x0 are in the same instruction or not. If so, we should ignore the implict-def. So the fix does serve this purpose.

before:

$x1 = MOVZXi 0, 0
BL @needx0, csr_darwin_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $x0, implicit killed $x1, implicit-def $sp, implicit-def $x0
renamable $w0 = nsw ADDWri renamable $w0, 2, 0, implicit killed $x0

After:

frame-setup CFI_INSTRUCTION offset $w29, -16 
BL @OUTLINED_FUNCTION_1, implicit-def $lr, implicit $sp, implicit-def $lr, implicit-def $sp, implicit-def $x0, implicit-def $x1, implicit $sp
renamable $x0 = nsw ADDXri killed renamable $x0, 1, 0

name: OUTLINED_FUNCTION_1

body: |

bb.0:
  liveins: $d15, $x19, $x20, $x21, $x22, $x23, $x24, $x25, $x26, $x27, $x28, $d8, $d9, $d10, $d11, $d12, $d13, $d14, $x0
 
  $x1 = MOVZXi 0, 0
  TCRETURNdi @needx0, 0, implicit $sp

Yes, we should do is to check whether implict-def and implict use of $x0 are in the same instruction or not.

llvm/test/CodeGen/AArch64/machine-outliner-side-effect-2.mir
19–20

I wanted to show that this has nothing to do with the dead attribute, which is unnecessary and has been removed.

19–20

Why do you need to change this test file? With this fix, $x0 should be marked as use.

DianQK edited the summary of this revision. (Show Details)Nov 13 2021, 4:08 AM

I have got your point. Please still keep your changes in file machine-outliner-side-effect-2.mir. Please update the comment in that file to state that the case of implict-def and implict use of $x0 in the same instruction are handled correctly.

DianQK updated this revision to Diff 387052.Nov 13 2021, 5:53 PM
DianQK marked an inline comment as done.

update machine-outliner-side-effect-2.mir

This comment was removed by jinlin.
jinlin accepted this revision.Nov 13 2021, 7:04 PM
This revision is now accepted and ready to land.Nov 13 2021, 7:04 PM
jinlin retitled this revision from Fix the side effect of outlined function when register is implicit and implicit-def to Fix the side effect of outlined function when register is implicit use and implicit-def in the same instruction..Nov 13 2021, 7:05 PM
jinlin retitled this revision from Fix the side effect of outlined function when register is implicit use and implicit-def in the same instruction. to Fix the side effect of outlined function when the register is implicit use and implicit-def in the same instruction..

Many thanks for your review @jinlin. I don’t have commit access, can you land this patch for me? Please use “DianQK dianqk@icloud.com” to commit the change.

DianQK updated this revision to Diff 387063.Nov 14 2021, 3:22 AM

rebase the patch to make sure everything is based on an up-to-date origin/main.