Page MenuHomePhabricator

[MachineOutliner][AArch64] WA for cases where AArch64 Outliner needs to do more than one stack fixup.
ClosedPublic

Authored by plotfi on Jul 15 2020, 6:42 PM.

Details

Summary

My take on https://reviews.llvm.org/D83464

Basically it appears that in cases where we have noreturns or calls in an outliner candidate that has no LR, no available free regs, or any use of SP that we end up hitting stack fixup code with the MachineOutlinerDefault set as the FrameID. This triggers an assert of OF.FrameConstructionID != MachineOutlinerDefault && "Can only fix up stack references once" at llvm/lib/Target/AArch64/AArch64InstrInfo.cpp:6595 because a lot of the fixup code is not tested to handle fixing up more than once and needs some better checks for possible illegal code generated.

This diff has test cases to cover cases where you have noreturn functions as well as cases where you can hit this same behavior with a BLR terminator in your outliner candidate (but not have that counted as a thunk).

I have created a new diff instead of appending to D83464 because Phab wont let me.

Diff Detail

Unit TestsFailed

TimeTest
20 mslinux > Flang.Preprocessing::compiler_defined_macros.F90
Script: -- : 'RUN: at line 6'; /mnt/disks/ssd0/agent/llvm-project/build/bin/f18 -intrinsic-module-directory /mnt/disks/ssd0/agent/llvm-project/build/tools/flang/include/flang -E /mnt/disks/ssd0/agent/llvm-project/flang/test/Preprocessing/compiler_defined_macros.F90 | /mnt/disks/ssd0/agent/llvm-project/build/bin/FileCheck --ignore-case /mnt/disks/ssd0/agent/llvm-project/flang/test/Preprocessing/compiler_defined_macros.F90

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
kyulee added inline comments.Jul 15 2020, 6:48 PM
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
6126

Shouldn't FlagsSetInAll & MachineOutlinerMBBFlags::HasCalls cover?

plotfi marked an inline comment as done.Jul 15 2020, 6:51 PM
plotfi added inline comments.
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
6126

Ah nice, I will use that. Thanks!

plotfi updated this revision to Diff 278387.Jul 16 2020, 12:31 AM

Updated based on @kyulee's suggestion to use FlagsSetInAll & MachineOutlinerMBBFlags::HasCalls

plotfi marked an inline comment as done.Jul 16 2020, 12:31 AM
plotfi updated this revision to Diff 278388.Jul 16 2020, 12:36 AM
plotfi marked an inline comment as done.Jul 16 2020, 12:10 PM
plotfi added inline comments.
llvm/test/CodeGen/AArch64/machine-outliner-noreturn-no-stack.mir
2

will drop -run-pass=prologepilog for now, since D76570 has not been landed yet.

Filed a bugzilla task to track progress: https://bugs.llvm.org/show_bug.cgi?id=46767

paquette added inline comments.Jul 17 2020, 1:19 PM
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
6113–6115

It's actually that it hasn't been implemented, not that it's untested. If you fix up twice you can get wrong code, guaranteed, because you don't know if you'll overflow the instruction.

plotfi updated this revision to Diff 278887.Jul 17 2020, 1:19 PM
plotfi marked an inline comment as done.
plotfi added inline comments.
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
6113–6115

Ahh! I will update the comment and the bugzilla task. Do you think the outliner should bail here if it is a Noreturn call or a call or just noreturn call?

plotfi updated this revision to Diff 278888.Jul 17 2020, 1:24 PM
paquette added inline comments.Jul 17 2020, 1:26 PM
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
6113–6115

I think what you want to do is bail as early as possible when you know

(1) The candidate will require stack fixups (it's noreturn, or MachineOutlinerDefault + SP is used in the sequence)

and

(2) The candidate contains calls

The (probably better) alternative is to remove those candidates from the candidate list rather than bail out.

At this point, you don't know if you're in the Default + SP unused case. You *do* know enough to handle the noreturn case at this point though.

plotfi updated this revision to Diff 278902.Jul 17 2020, 2:15 PM
plotfi updated this revision to Diff 278907.Jul 17 2020, 2:47 PM
plotfi updated this revision to Diff 278920.Jul 17 2020, 3:49 PM
plotfi updated this revision to Diff 279350.Jul 20 2020, 2:49 PM

Solidified the cases where we want to bail out after speaking with @paquette some more

plotfi marked 2 inline comments as done.Jul 20 2020, 2:49 PM
plotfi updated this revision to Diff 279352.Jul 20 2020, 2:56 PM
plotfi marked an inline comment as done.
plotfi added inline comments.
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
6190

I can put the above bail-out code here in case we want to only bail if the cases are true and we are sure there isn't enough space to save the LR. @paquette what do you think here?

plotfi updated this revision to Diff 279389.Jul 20 2020, 6:41 PM
plotfi updated this revision to Diff 279446.Jul 21 2020, 2:00 AM
paquette added inline comments.Jul 21 2020, 10:13 AM
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
6139

Why are we saving these instead of the candidates that we could potentially outline?

This part of the code is mostly intended to decide on whether or not we should discard candidates which require stack fixups. (It should probably be factored out into a function to make that clearer, but that's another patch for another day.)

I think that you probably want to drop these candidates from the main list specifically.

I think something like this should work (using the nice range-based erase_if and any_of from STLExtras.h)

if (FlagsSetInAll & MachineOutlinerMBBFlags::HasCalls) {
  // Remove all candidates which would require multiple stack fixups.
  erase_if(RepeatedSequenceLocs, [](Candidate &C) {
    if (C.LRU.available(AArch64::LR) && !IsNoReturn)
      return false;
    return any_of(C, [](const MachineInstr &MI) { return MI.isCall(); });
  });

  if (RepeatedSequenceLocs.size() < 2) {
    // We removed enough candidates that outlining would no longer be
    // beneficial. Bail out.
    RepeatedSequenceLocs.clear();
    return outliner::OutlinedFunction();
  }
}
6155–6158

Maybe an example would be useful here, rather than trying to explain everything in words:

// Fixing up the stack twice is not yet implemented.
// 
// E.g. cases like this are not yet supported:
//
// function_containing_sequence:
//   ...
//   save LR to SP <- Requires stack instr fixups in OUTLINED_FUNCTION_N
//   call OUTLINED_FUNCTION_N
//   restore LR from SP
//   ...
//
// OUTLINED_FUNCTION_N:
//   save LR to SP <- Requires stack instr fixups in OUTLINED_FUNCTION_N
//   ...
//   bl foo
//   restore LR from SP
//   ret
//
// This happens whenever function_containing_sequence
// - Is noreturn
// - Requires that we save LR at the point of the call, but there are no
//   available registers. In this case, we save using SP.
//
// TODO: Check if fixing up twice is safe so we can outline these.
6171

Which file?

6185–6188

I think what we really want to do here is remove the candidates which require multiple fixups from the list.

Let's say we have something like this:

some_noreturn_func:
  ...
  candidate_sequence_containing_call
  ...

dog:
  ...
  candidate_sequence_containing_call
  ...

cat:
  ...
  candidate_sequence_containing_call
  ...

fish:
  ...
  candidate_sequence_containing_call
  ...

The call imposes one round of stack fixups, since we save LR using SP inside the function frame.

Let's say that in dog, cat, and fish, we don't require fixups at the callsite.

For example, we could outline from dog like this:

dog:
  ...
  bl OUTLINED_FUNCTION_N
  ...

This is fine; we fix up the stack instructions once inside OUTLINED_FUNCTION_N.

This patch bails out when we have a situation like this. If you removed the candidate that appeared in some_noreturn_func from the list, however, you'd be able to outline from dog, cat, and fish safely.

So what I'd expect to end up with is, say, something like this:

some_noreturn_func:
  ...
  candidate_sequence_containing_call <-- Don't outline, noreturn means we need to fix up at the callsite.
  ...

dog:
  ...
  bl OUTLINED_FUNCTION_N
  ...

cat:
  ...
  save LR to a register <-- This is fine, it does not impose any stack fixups
  bl OUTLINED_FUNCTION_N
  restore LR from a register
  ...

fish:
  ...
  bl OUTLINED_FUNCTION_N
  ...
plotfi updated this revision to Diff 279596.Jul 21 2020, 11:45 AM

Update based on @paquette's feedback

plotfi marked 3 inline comments as done.Jul 21 2020, 11:45 AM
paquette added inline comments.Jul 21 2020, 1:22 PM
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
6254–6255

Now that I've thought about this a bit more, I don't think we should mention noreturn at all.

The only case we actually care about is the case in the example below. Whether or not, say, function_containing_sequence is noreturn is not relevant. The noreturn case only makes it so that we can never have:

function_containing_sequence:
  ...
  bl OUTLINED_FUNCTION_N
  ...

The actual problem we want to address here is MachineOutlinerDefault + requiring a save + restore to SP.

noreturn functions can force us into that case, but they don't disallow say,

function_containing_sequence:
  ...
  mov xn, lr
  bl OUTLINED_FUNCTION_N
  mov lr, xn
  ...

So there's only one case here: when at the call-site, and inside the function itself, we must modify the stack.

6261–6266

considering the comment I left above, you can probably just remove this

6295

maybe use llvm's any_of from STLExtras?

llvm/test/CodeGen/AArch64/machine-outliner-noreturn-no-stack.mir
2

Can you add a test for the non-noreturn case?

Also can you add a test which verifies that the stuff below works?

f1_doesnt_need_stack:
  ...
  seq containing call <- should be outlined
  ...
f2_doesnt_need_stack:
  ...
  seq containing call <- should be outlined
  ...
needs_stack_not_noreturn:
  ...
  seq containing call <- should not be outlined
  ...
needs_stack_noreturn:
  ...
  seq containing call <- should not be outlined
  ...
plotfi updated this revision to Diff 279644.Jul 21 2020, 3:34 PM
plotfi marked 5 inline comments as done.
plotfi added inline comments.
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
6295

llvm::any_of requires a begin and end not a front and back :-/

plotfi updated this revision to Diff 279649.Jul 21 2020, 3:52 PM
plotfi marked an inline comment as done.
plotfi updated this revision to Diff 279652.Jul 21 2020, 3:56 PM
plotfi updated this revision to Diff 279653.
Harbormaster completed remote builds in B65155: Diff 279653.
plotfi updated this revision to Diff 284174.Aug 8 2020, 11:56 PM

Adding BLR-based test case that explodes LR and all other registers to achieve the same assert without noreturn.

plotfi updated this revision to Diff 284175.Aug 9 2020, 12:30 AM
plotfi updated this revision to Diff 284242.Aug 9 2020, 5:02 PM
plotfi retitled this revision from [MachineOutliner][AArch64] Fix for noreturn functions to [MachineOutliner][AArch64] WA for cases where AArch64 Outliner needs to do more than one stack fixup..Aug 9 2020, 10:49 PM
plotfi edited the summary of this revision. (Show Details)
paquette added inline comments.Aug 10 2020, 9:14 AM
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
6235

typo: purpise

llvm/test/CodeGen/AArch64/machine-outliner-2fixup-blr-terminator.mir
8

I don't think you need this, since it isn't referenced at all in the MIR.

9

Very minor nit: it's probably easier to write

attributes #0 = {...}

This really doesn't matter so you don't have to change it, but in the future, it's probably a little cleaner for when you have multiple attributes.

Also do you need "frame-pointer"="all" and noinline here? At the very least, I'd expect that noinline shouldn't do anything this late in the pass pipeline.

llvm/test/CodeGen/AArch64/machine-outliner-no-noreturn-no-stack.mir
15–17

Nit: can this be consistent with the formatting for @stack_1 and @stack_2?

plotfi added inline comments.Aug 10 2020, 10:49 AM
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
6235

Oops, -> :set spell

llvm/test/CodeGen/AArch64/machine-outliner-2fixup-blr-terminator.mir
8

Ah yeah, you are right. I had a regular BL before that was using it, but not now.

9

Yeah, I was using clang -E to generate these earlier and the # was being problematic.

plotfi updated this revision to Diff 284445.Aug 10 2020, 10:51 AM
plotfi marked 4 inline comments as done.
This revision is now accepted and ready to land.Aug 10 2020, 10:57 AM
plotfi closed this revision.Aug 10 2020, 2:08 PM

Closing, cause phab doesn't seem to be auto-closing.
Landed at https://github.com/llvm/llvm-project/commit/7bc03f55539f7f081daea5363f2e4845b2e75f57