This is an archive of the discontinued LLVM Phabricator instance.

Fix incorrect logic in maintaining the side-effect of compiler generated outliner functions
ClosedPublic

Authored by jinlin on Dec 9 2019, 10:53 AM.

Details

Summary

Fix incorrect logic in maintaining the side-effect of compiler generated outliner functions by adding the up-exposed uses.

Diff Detail

Event Timeline

jinlin created this revision.Dec 9 2019, 10:53 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 9 2019, 10:53 AM

Are tests missing here?

As @lebedev.ri already said, is it possible to add a test for this? I know it is probably covered in your test for D71027 but an independent test would be great!

llvm/lib/CodeGen/MachineOutliner.cpp
1249

This comment is not correct anymore: The helper lambda is gone.

1252–1258

This ; can be removed.

1254

I guess this can be moved into the for-loop since it's not needed outside of the loop scope.

1258–1295

Can this be const &? You could even think about making this type explicit since it's not very verbose but no strong opinion on this.

1262

Can't this just be MachineInstr *MI = Iter;? You could actually just use Iter directly and omit the new variable.

As @lebedev.ri already said, is it possible to add a test for this? I know it is probably covered in your test for D71027 but an independent test would be great!

Hi David, a standalone test would be ideal, but it is extremely difficult to come up with such a test case. I have spent many days thinking about a test case for this incorrect logic.
Given the end of the workflow where it happen, exposing the bug is quite difficult.
Under such circumstance, it would not be prudent to get blocked on a test case. The outline-repeat feature exercises the need for this fix and a test case added there serves the purpose.
Hope you can understand and sympathize with me.

I think that it should be possible to write a test case for this. You don't really have to expose a *bug* here but just show that the exposed uses are added to the outlined function. That's sufficient for a testcase.

I think that you can probably do this by copying + editing one of the existing outliner testcases. A good test to base it off of might be llvm/test/CodeGen/AArch64/machine-outliner-calls.mir, since it's pretty simple.

llvm/lib/CodeGen/MachineOutliner.cpp
1251–1252

Can this be Register instead of unsigned?

1252–1253

Comment?

1254–1255

Add a brace?

1254–1255

Comment?

1258–1295

+1 for making type explicit

1261

Can you pull the variables into the loop header to fit this sort of style?

for (MachineBasicBlock::reverse_iterator Iter = (stuff), Last = (stuff); Iter != Last; ++Iter)

Then it's clear that the variables are only used in the loop.

jinlin marked 11 inline comments as done.Feb 13 2020, 11:39 AM
jinlin added inline comments.
llvm/lib/CodeGen/MachineOutliner.cpp
1262

The iterators you’ll be working with in the LLVM framework are special: they will automatically convert to a ptr-to-instance type whenever they need to. Instead of dereferencing the iterator and then taking the address of the result, you can simply assign the iterator to the proper pointer type and you get the dereference and address-of operation as a result of the assignment. You can refer to the details in http://llvm.org/docs/ProgrammersManual.html.

jinlin updated this revision to Diff 244488.Feb 13 2020, 11:40 AM

I have updated the code based on reviewers' feedback and added one test case. Thanks.

jinlin updated this revision to Diff 244492.Feb 13 2020, 11:46 AM

Fix typos in the comments.

jinlin updated this revision to Diff 244496.Feb 13 2020, 11:52 AM

More minor changes.

I think a MIR testcase would be simpler here. Something like this should work, no?

# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
# RUN: llc -mtriple=aarch64-apple-darwin -run-pass=machine-outliner -verify-machineinstrs  %s -o - | FileCheck %s
--- |
  define void @foo() noredzone {ret void}
...
---

name:            foo
tracksRegLiveness: true
body:             |
  ; CHECK-LABEL: name: foo
  ; CHECK: bb.0:
  ; CHECK:   successors: %bb.1(0x80000000)
  ; CHECK:   liveins: $w4
  ; CHECK:   BL @OUTLINED_FUNCTION_0, implicit-def $lr, implicit $sp, implicit-def $w3, implicit-def $w2, implicit-def $w1, implicit-def $w0, implicit-def $lr, implicit-def $w3, implicit-def $w2, implicit-def $w1, implicit-def $w0, implicit $sp, implicit $wzr, implicit $w4
  ; CHECK: bb.1:
  ; CHECK:   successors: %bb.2(0x80000000)
  ; CHECK:   liveins: $w4
  ; CHECK:   BL @OUTLINED_FUNCTION_0, implicit-def $lr, implicit $sp, implicit-def $w3, implicit-def $w2, implicit-def $w1, implicit-def $w0, implicit-def $lr, implicit-def $w3, implicit-def $w2, implicit-def $w1, implicit-def $w0, implicit $sp, implicit $wzr, implicit $w4
  ; CHECK: bb.2:
  ; CHECK:   successors: %bb.3(0x80000000)
  ; CHECK:   liveins: $w4
  ; CHECK:   BL @OUTLINED_FUNCTION_0, implicit-def $lr, implicit $sp, implicit-def $w3, implicit-def $w2, implicit-def $w1, implicit-def $w0, implicit-def $lr, implicit-def $w3, implicit-def $w2, implicit-def $w1, implicit-def $w0, implicit $sp, implicit $wzr, implicit $w4
  ; CHECK: bb.3:
  ; CHECK:   liveins: $w4
  ; CHECK:   RET_ReallyLR
  bb.0:
  liveins: $w4
    $w0 = ORRWri $wzr, 1
    $w1 = ORRWri $wzr, 2
    $w2 = ORRWri $wzr, 3
    $w3 = ORRWri $w4, 4
  bb.1:
  liveins: $w4
    $w0 = ORRWri $wzr, 1
    $w1 = ORRWri $wzr, 2
    $w2 = ORRWri $wzr, 3
    $w3 = ORRWri $w4, 4
  bb.2:
    liveins: $w4
    $w0 = ORRWri $wzr, 1
    $w1 = ORRWri $wzr, 2
    $w2 = ORRWri $wzr, 3
    $w3 = ORRWri $w4, 4
  bb.3:
    liveins: $w4
    RET_ReallyLR

I think a MIR testcase would be simpler here. Something like this should work, no?

# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
# RUN: llc -mtriple=aarch64-apple-darwin -run-pass=machine-outliner -verify-machineinstrs  %s -o - | FileCheck %s
--- |
  define void @foo() noredzone {ret void}
...
---

name:            foo
tracksRegLiveness: true
body:             |
  ; CHECK-LABEL: name: foo
  ; CHECK: bb.0:
  ; CHECK:   successors: %bb.1(0x80000000)
  ; CHECK:   liveins: $w4
  ; CHECK:   BL @OUTLINED_FUNCTION_0, implicit-def $lr, implicit $sp, implicit-def $w3, implicit-def $w2, implicit-def $w1, implicit-def $w0, implicit-def $lr, implicit-def $w3, implicit-def $w2, implicit-def $w1, implicit-def $w0, implicit $sp, implicit $wzr, implicit $w4
  ; CHECK: bb.1:
  ; CHECK:   successors: %bb.2(0x80000000)
  ; CHECK:   liveins: $w4
  ; CHECK:   BL @OUTLINED_FUNCTION_0, implicit-def $lr, implicit $sp, implicit-def $w3, implicit-def $w2, implicit-def $w1, implicit-def $w0, implicit-def $lr, implicit-def $w3, implicit-def $w2, implicit-def $w1, implicit-def $w0, implicit $sp, implicit $wzr, implicit $w4
  ; CHECK: bb.2:
  ; CHECK:   successors: %bb.3(0x80000000)
  ; CHECK:   liveins: $w4
  ; CHECK:   BL @OUTLINED_FUNCTION_0, implicit-def $lr, implicit $sp, implicit-def $w3, implicit-def $w2, implicit-def $w1, implicit-def $w0, implicit-def $lr, implicit-def $w3, implicit-def $w2, implicit-def $w1, implicit-def $w0, implicit $sp, implicit $wzr, implicit $w4
  ; CHECK: bb.3:
  ; CHECK:   liveins: $w4
  ; CHECK:   RET_ReallyLR
  bb.0:
  liveins: $w4
    $w0 = ORRWri $wzr, 1
    $w1 = ORRWri $wzr, 2
    $w2 = ORRWri $wzr, 3
    $w3 = ORRWri $w4, 4
  bb.1:
  liveins: $w4
    $w0 = ORRWri $wzr, 1
    $w1 = ORRWri $wzr, 2
    $w2 = ORRWri $wzr, 3
    $w3 = ORRWri $w4, 4
  bb.2:
    liveins: $w4
    $w0 = ORRWri $wzr, 1
    $w1 = ORRWri $wzr, 2
    $w2 = ORRWri $wzr, 3
    $w3 = ORRWri $w4, 4
  bb.3:
    liveins: $w4
    RET_ReallyLR

The outlined function in this test case does not serve the purpose since there are no live in registers. I did check this test case before. If I change the arguments 1, 2, 3, 4 to be incoming registers, the machine outliner won't kick in.

The outlined function in this test case does not serve the purpose since there are no live in registers. I did check this test case before. If I change the arguments 1, 2, 3, 4 to be incoming registers, the machine outliner won't kick in.

Maybe I'm misunderstanding something here?

(1) Locally, with the patch, adding

liveins: $w0, $w1, $w2, $w3, $w4

does not change the outliner's behaviour.

(2) If I understand correctly, what you are trying to test here is that implicit $xN is added to the outlined call when $xN is not undefined in the outlined range.

Without this patch, this testcase produces

BL @OUTLINED_FUNCTION_0, implicit-def $lr, implicit $sp, implicit-def $lr, implicit-def $w0, implicit-def $w1, implicit-def $w2, implicit-def $w3

With the patch, it adds implicit $wzr, implicit $w4 at the end:

BL @OUTLINED_FUNCTION_0, implicit-def $lr, implicit $sp, implicit-def $w3, implicit-def $w2, implicit-def $w1, implicit-def $w0, implicit-def $lr, implicit-def $w3, implicit-def $w2, implicit-def $w1, implicit-def $w0, implicit $sp, implicit $wzr, implicit $w4

It's kind of weird that the implicit defs are duplicated though. ($lr appears twice in both cases). I'd expect it to be

BL @OUTLINED_FUNCTION_0, implicit-def $lr, implicit-def $w0, implicit-def $w1, implicit-def $w2, implicit-def $w3, implicit $sp, implicit $wzr, implicit $w4

Also, I guess it would also be good to add a testcase that ensures that a register is not added as implicit when it's undefined in the range.

jinlin updated this revision to Diff 244548.Feb 13 2020, 4:27 PM

The reason that the implicit defs are duplicate is because the compiler traverses the instructions in the reverse order and update the side effect of the new call instruction on the fly.
So the def reg set is introduced to avoid the redundant register.

In fact, the existing compiler already inserts some duplicate defs for call instruction.

BL @OUTLINED_FUNCTION_0, implicit-def $lr, implicit $sp, implicit-def $lr, implicit-def $w0, implicit-def $w1, implicit-def $w2, implicit-def $w3

The outlined function in this test case does not serve the purpose since there are no live in registers. I did check this test case before. If I change the arguments 1, 2, 3, 4 to be incoming registers, the machine outliner won't kick in.

Maybe I'm misunderstanding something here?

(1) Locally, with the patch, adding

liveins: $w0, $w1, $w2, $w3, $w4

does not change the outliner's behaviour.

(2) If I understand correctly, what you are trying to test here is that implicit $xN is added to the outlined call when $xN is not undefined in the outlined range.

Without this patch, this testcase produces

BL @OUTLINED_FUNCTION_0, implicit-def $lr, implicit $sp, implicit-def $lr, implicit-def $w0, implicit-def $w1, implicit-def $w2, implicit-def $w3

With the patch, it adds implicit $wzr, implicit $w4 at the end:

BL @OUTLINED_FUNCTION_0, implicit-def $lr, implicit $sp, implicit-def $w3, implicit-def $w2, implicit-def $w1, implicit-def $w0, implicit-def $lr, implicit-def $w3, implicit-def $w2, implicit-def $w1, implicit-def $w0, implicit $sp, implicit $wzr, implicit $w4

It's kind of weird that the implicit defs are duplicated though. ($lr appears twice in both cases). I'd expect it to be

BL @OUTLINED_FUNCTION_0, implicit-def $lr, implicit-def $w0, implicit-def $w1, implicit-def $w2, implicit-def $w3, implicit $sp, implicit $wzr, implicit $w4

Also, I guess it would also be good to add a testcase that ensures that a register is not added as implicit when it's undefined in the range.

Hi Jessica.
My current test case does not have undefined use register issue. It is very difficult to come up another test case to show the undefined register issue. Is it all right to have one test case for this change?
Thanks,
--Jin

Also, I guess it would also be good to add a testcase that ensures that a register is not added as implicit when it's undefined in the range.

Hi Jessica,

The logic at line 1275 will not introduce any new stability issue compared to the existing compiler since the existing compiler does not generate any implicit use of registers. That is why I think it is fine not to have a test case for this. What do you think?

--Jin

jinlin updated this revision to Diff 245788.Feb 20 2020, 9:22 PM

Updated test case.

Hi Jessica,

Is there anything I can improve with latest change?

Thanks,

Jin

I think a MIR testcase would be simpler here. Something like this should work, no?

# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
# RUN: llc -mtriple=aarch64-apple-darwin -run-pass=machine-outliner -verify-machineinstrs  %s -o - | FileCheck %s
--- |
  define void @foo() noredzone {ret void}
...
---

name:            foo
tracksRegLiveness: true
body:             |
  ; CHECK-LABEL: name: foo
  ; CHECK: bb.0:
  ; CHECK:   successors: %bb.1(0x80000000)
  ; CHECK:   liveins: $w4
  ; CHECK:   BL @OUTLINED_FUNCTION_0, implicit-def $lr, implicit $sp, implicit-def $w3, implicit-def $w2, implicit-def $w1, implicit-def $w0, implicit-def $lr, implicit-def $w3, implicit-def $w2, implicit-def $w1, implicit-def $w0, implicit $sp, implicit $wzr, implicit $w4
  ; CHECK: bb.1:
  ; CHECK:   successors: %bb.2(0x80000000)
  ; CHECK:   liveins: $w4
  ; CHECK:   BL @OUTLINED_FUNCTION_0, implicit-def $lr, implicit $sp, implicit-def $w3, implicit-def $w2, implicit-def $w1, implicit-def $w0, implicit-def $lr, implicit-def $w3, implicit-def $w2, implicit-def $w1, implicit-def $w0, implicit $sp, implicit $wzr, implicit $w4
  ; CHECK: bb.2:
  ; CHECK:   successors: %bb.3(0x80000000)
  ; CHECK:   liveins: $w4
  ; CHECK:   BL @OUTLINED_FUNCTION_0, implicit-def $lr, implicit $sp, implicit-def $w3, implicit-def $w2, implicit-def $w1, implicit-def $w0, implicit-def $lr, implicit-def $w3, implicit-def $w2, implicit-def $w1, implicit-def $w0, implicit $sp, implicit $wzr, implicit $w4
  ; CHECK: bb.3:
  ; CHECK:   liveins: $w4
  ; CHECK:   RET_ReallyLR
  bb.0:
  liveins: $w4
    $w0 = ORRWri $wzr, 1
    $w1 = ORRWri $wzr, 2
    $w2 = ORRWri $wzr, 3
    $w3 = ORRWri $w4, 4
  bb.1:
  liveins: $w4
    $w0 = ORRWri $wzr, 1
    $w1 = ORRWri $wzr, 2
    $w2 = ORRWri $wzr, 3
    $w3 = ORRWri $w4, 4
  bb.2:
    liveins: $w4
    $w0 = ORRWri $wzr, 1
    $w1 = ORRWri $wzr, 2
    $w2 = ORRWri $wzr, 3
    $w3 = ORRWri $w4, 4
  bb.3:
    liveins: $w4
    RET_ReallyLR

This is a lot better, thank you for adding a MIR testcase! :)

I think that the testcase can be simplified a bit more, but this is very close.

llvm/test/CodeGen/AArch64/machine-outliner-side-effect.mir
2

I don't think you need prologepilog for this one

7–46

Most of the IR here can be deleted. Only things that must be directly referenced in the MIR are necessary. (e.g. function calls)

For example, you can replace baz.14 with just

define void @baz.14() { ret void } and it should still work.

61–83

Most of this can be deleted, I'm pretty sure.

87

Are the ADJCALLSTACKs actually necessary here?

88

Do you actually have to use calls to get the behaviour you want?

jinlin marked 5 inline comments as done.Mar 2 2020, 10:25 PM
jinlin added inline comments.
llvm/test/CodeGen/AArch64/machine-outliner-side-effect.mir
2

The flag prologepilog is necessary otherwise the machine outlined function will not be generated.

7–46

The IR here cannot be deleted since all of them are referenced in the MIR.

Replacing baz.14 with define void @baz.14() { ret void } does not work.

61–83

Removed all unnecessary ones.

87

Removed.

88

Replaced with parameter.

jinlin updated this revision to Diff 247790.Mar 2 2020, 10:29 PM

Thank you Jessica for quick feedback. I have updated the test case based on your advice.

The reason you weren't getting outlined functions is probably because there were attributes missing on the function.

I think that we should be able to simplify it further like this:

# RUN: llc -mtriple=aarch64 -run-pass=machine-outliner -verify-machineinstrs %s -o - | FileCheck %s

# The test checks whether the compiler updates the side effect of function @OUTLINED_FUNCTION_0 by adding the use of register x20.

--- |
  declare void @spam() local_unnamed_addr
  define void @baz() optsize minsize noredzone { ret void }
...
---
name:            baz
tracksRegLiveness: true
body:             |
  bb.0:
    liveins: $x0, $x20

    $x0 = COPY renamable $x20
    BL @spam, csr_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $x0, implicit-def $sp, implicit-def $x0
    renamable $x21 = COPY $x0

    $x0 = COPY renamable $x20
    BL @spam, csr_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $x0, implicit-def $sp, implicit-def $x0
    renamable $x22 = COPY $x0

    $x0 = COPY killed renamable $x20
    BL @spam, csr_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $x0, implicit-def $sp, implicit-def $x0
    renamable $x3 = COPY $x0

    RET_ReallyLR

...

# CHECK: BL @OUTLINED_FUNCTION_0, {{.*}}, implicit $x20, {{.*}}
jinlin updated this revision to Diff 248016.Mar 3 2020, 1:44 PM

The reason you weren't getting outlined functions is probably because there were attributes missing on the function.

I think that we should be able to simplify it further like this:

# RUN: llc -mtriple=aarch64 -run-pass=machine-outliner -verify-machineinstrs %s -o - | FileCheck %s

# The test checks whether the compiler updates the side effect of function @OUTLINED_FUNCTION_0 by adding the use of register x20.

--- |
  declare void @spam() local_unnamed_addr
  define void @baz() optsize minsize noredzone { ret void }
...
---
name:            baz
tracksRegLiveness: true
body:             |
  bb.0:
    liveins: $x0, $x20

    $x0 = COPY renamable $x20
    BL @spam, csr_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $x0, implicit-def $sp, implicit-def $x0
    renamable $x21 = COPY $x0

    $x0 = COPY renamable $x20
    BL @spam, csr_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $x0, implicit-def $sp, implicit-def $x0
    renamable $x22 = COPY $x0

    $x0 = COPY killed renamable $x20
    BL @spam, csr_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $x0, implicit-def $sp, implicit-def $x0
    renamable $x3 = COPY $x0

    RET_ReallyLR

...

# CHECK: BL @OUTLINED_FUNCTION_0, {{.*}}, implicit $x20, {{.*}}

Thank you Jessica for helping creating the test case. It is much more simple compared to my old test.

paquette accepted this revision.Mar 3 2020, 2:27 PM

LGTM!

This revision is now accepted and ready to land.Mar 3 2020, 2:27 PM
jinlin added a comment.Mar 3 2020, 2:58 PM

LGTM!

Thanks you Jessica for your great help!

jinlin updated this revision to Diff 248212.Mar 4 2020, 9:18 AM

Hi Jessica,

I have updated the test machine-outliner-noreturn-save-lr.mir due to the changes of side-effect information for outlined functions. Would you please review it?

Thanks,

Jin

paquette accepted this revision.Mar 4 2020, 9:38 AM

The changes to machine-outliner-noreturn-save-lr.mir look fine to me.

jinlin updated this revision to Diff 248236.Mar 4 2020, 10:24 AM

rebase master

The changes to machine-outliner-noreturn-save-lr.mir look fine to me.

Thank you Jessica for your quick update.

Hi, I just tried out this patch locally and I'm seeing failures running the tests:

Failing Tests (3):
    LLVM :: CodeGen/AArch64/machine-outliner-cfi.mir
    LLVM :: CodeGen/AArch64/machine-outliner-noreturn-save-lr.mir
    LLVM :: CodeGen/AArch64/machine-outliner-side-effect.mir

Hi, I just tried out this patch locally and I'm seeing failures running the tests:

Failing Tests (3):
    LLVM :: CodeGen/AArch64/machine-outliner-cfi.mir
    LLVM :: CodeGen/AArch64/machine-outliner-noreturn-save-lr.mir
    LLVM :: CodeGen/AArch64/machine-outliner-side-effect.mir

Thanks for your notification. I did test it last night and did not see the failures. Let me double check.

jinlin added a comment.Mar 4 2020, 1:13 PM

Hi, I just tried out this patch locally and I'm seeing failures running the tests:

Failing Tests (3):
    LLVM :: CodeGen/AArch64/machine-outliner-cfi.mir
    LLVM :: CodeGen/AArch64/machine-outliner-noreturn-save-lr.mir
    LLVM :: CodeGen/AArch64/machine-outliner-side-effect.mir

I am sorry to post the the incorrect version of the file MachineOutliner in the last diff.

When I compared the last diff (10) with diff 9, I found that the file MachineOutliner is in the old version. This morning I tried to use git rebase master in my local branch, somehow this file was changed when the conflict happened.

I am working on the revert now.

jinlin updated this revision to Diff 248300.Mar 4 2020, 1:22 PM
jinlin updated this revision to Diff 248301.Mar 4 2020, 1:26 PM
jinlin added a comment.Mar 4 2020, 1:28 PM

I have updated the diffs. Now when you compare Diff 248212 with Diff 248301, you will see they are the same.

jinlin added a comment.Mar 4 2020, 1:54 PM

All the llvm-lit tests passed without any unexpected fails.

This revision was automatically updated to reflect the committed changes.