This is an archive of the discontinued LLVM Phabricator instance.

Split PrologEpilogInserter into 2 parts
AbandonedPublic

Authored by dschuff on Jan 22 2016, 12:10 PM.

Details

Reviewers
qcolombet
Summary

The base part finalizes frame layout, inserts prolog/epilog code, and
eliminates FrameIndex operands. It supports virtual registers but does not
support callee-saved registers or scavenging
The CSR-supporting part handles spilling of callee-saved registers and
scavenging, but does not support virtual registers.

Diff Detail

Event Timeline

dschuff updated this revision to Diff 45723.Jan 22 2016, 12:10 PM
dschuff retitled this revision from to Split PrologEpilogInserter into 2 parts.
dschuff updated this object.
dschuff added a reviewer: qcolombet.
dschuff added a subscriber: llvm-commits.
hfinkel added a subscriber: hfinkel.Feb 3 2016, 6:41 AM
hfinkel added inline comments.
lib/CodeGen/PrologEpilogInserter.cpp
264

Did we decide that we need to split the passes this way? It seems like the boilerplate from having two passes seems >= the difference between this runOnMachineFunction implementation and the simpler version.

356

Does having > INT_MAX FIs really work with just these changes?

dschuff added inline comments.Feb 4 2016, 1:43 PM
lib/CodeGen/PrologEpilogInserter.cpp
264

The motivation for this change was the really strong desire expressed to have clear separation at the pass level between passes that support virtual registers and passes that do not. (see also http://reviews.llvm.org/D16483). If we implement that, then we have to have a PEI pass that supports virtual registers.

Of course we could just declare that this PEI pass supports virtual registers. From my own experience and auditing of the code, it seems to work fine for WebAssembly. The subsets that WebAssembly uses (i.e. the parts that I've broken out here into the base PEI) are the only parts that actually run when the target returns an empty set of CSRs; I've basically just made that distinction explicit in this change. I'd be interested in getting an opinion from @qcolombet based on this more concrete change.

356

No, this change was just to make the types (signed vs unsigned) consistent (see also line 664 below), and, if there's going to be a sentinel value, it might as well be the max for the actual type used (unsigned) instead of a different type.

qcolombet edited edge metadata.Mar 8 2016, 9:53 AM

Hi Derek,

I am behind on reviews and I am still fighting to catching up.
I have one high level comment though, see the inlined comment.

Cheers,
-Quentin

lib/CodeGen/PrologEpilogInserter.cpp
77

It feels wrong to me that the high level class needs to have this field.

dschuff added inline comments.Mar 18 2016, 1:26 PM
lib/CodeGen/PrologEpilogInserter.cpp
77

I kind of agree. There are a couple of reasons:

  1. TargetFrameLowering::processFunctionBeforeFrameFinalized() is called by the base PEI::finalizeFrame() and it takes an RS pointer because this is the point where targets (PPC) create a stack slot for the RS to spill into. We could maybe work around that by adding a virtual method on the base class that calls processFunctionBeforeFrameFinalized() with a nullptr and with the real RS in the derived class.
  2. The base PEI::calculateFrameObjectOffsets() takes the scavenging spill slots into account when it lays out the stack frame, and uses the RS to find out where they are. (Likewise it also handles the CSR spill slots, which in principle the base class doesn't need to do). This seems trickier to work around. We could move the code that actually uses RS out to some other function but the logic of where it goes in the frame would still be local.
  3. The base PEI::replaceFrameIndices() keeps the RS up to date as it goes and passes RS to TargetRegisterInfo::eliminateFrameIndex() which is of course where it might actually be wanted by a target.

The root sources of this are that the stack frame layout (which is implemented in this pass, and is always needed) has to be aware of both CSRs and scavenging; and the target hooks (which are also always used) might need RS in any case.
We could try to move the RS/CSR frame layout code and the RS updating code out into some kind of other abstraction, but the RS class is itself essentially the abstraction that you'd want.

Another option might be to use a similar factoring (which splits the code that must deal with virtual registers out from the code that need not), but not actually create 2 different MachineFunctionPasses. We could say that the pass itself supports virtual registers, but we'd have better organization and clearer separation.

Hi Derek,

Another option might be to use a similar factoring (which splits the code that must deal with virtual registers out from the code that need not), but not actually create 2 different MachineFunctionPasses. We could say that the pass itself supports virtual registers, but we'd have better organization and clearer separation.

This may be the cleanest solution.

Could you prepare a patch for that direction?

Thanks,
-Quentin

http://reviews.llvm.org/D18366 has a change to tweak the organization and naming in PEI, with better-targeted asserts. We could also go a bit further and split the calls that are currently in runOnMachineFunction into different functions the way we have e.g. finalizeframe() here.