Page MenuHomePhabricator

[RISCV] Support lowering FrameIndex

Authored by asb on Nov 9 2017, 9:53 AM.



Introduces the AddrFI "addressing mode", which is necessary simple because it's not possible to write a pattern that directly matches a frameindex.

Ensures callee-saved registers are accessed relative to the stackpointer. This is necessary as callee saved register spills are performed before the frame pointer is set.

Diff Detail

Event Timeline

mgrang added a subscriber: mgrang.Nov 9 2017, 10:10 AM
mgrang added inline comments.

nit: Period after comment.

mgrang added inline comments.Nov 9 2017, 10:14 AM

This will have an "unused variable 'Reg'" error in a non-asserts build. No?

asb updated this revision to Diff 122295.Nov 9 2017, 11:48 AM
asb marked an inline comment as done.
asb edited the summary of this revision. (Show Details)

Thanks Mandeep, updated the patch. I've moved a bit of logic from the successor patch (D39849) to this one, as I think it makes more sense to use the correct register in eliminateFrameIndex from the start, rather than adding the extra logic in the next patch.


No, Reg gets used in the code below.

asb updated this revision to Diff 122635.Nov 13 2017, 4:22 AM

Rebase on latest LLVM and remove redundant return in eliminateFrameIndex.

sameer.abuasal edited reviewers, added: sabuasal; removed: sameer.abuasal.Nov 13 2017, 11:46 AM
apazos added inline comments.Nov 13 2017, 5:23 PM

too many ( )


you can make this the default value and eliminate the else.


maybe move default to the first check.

asb updated this revision to Diff 122789.Nov 14 2017, 12:05 AM
asb marked 3 inline comments as done.

Updated patch to address review comments.


The extra parens are actually necessary to avoid a compiler warning "warning: using the result of an assignment as a condition without parentheses [-Wparentheses]". This is a fairly ugly idiom and we can avoid the need for it altogether, so I've refactored the code to do so.

sabuasal edited edge metadata.Nov 16 2017, 12:30 PM

Hi @asb

Can you point me to a documentation for the addressing modes names you are using (ADDRii, MEMii ..) ? I can't find these on the GitHub page.

asb added a comment.Nov 16 2017, 12:40 PM

I should add a clarifying comment. These are codegen-only constructs that as far as I can see are required for frameindex lowering, in order to represent a frameindex+offset pair.

Why do you want the special load/store instructions?


You should implement getFrameIndexReference in TargetFrameLowering (in any case). You could use it here instead of all this logic.

asb updated this revision to Diff 123759.Nov 21 2017, 3:52 AM
asb marked an inline comment as done.

Rebased and updated to implement getFrameIndexReference.

@kparzysz honestly I don't really want the special load/store instructions, but was unable to write a pattern that would allow frameindex+offset to be matched by the normal load/store. Perhaps I'm missing something obvious?

hi Alex,

this doesn't apply for me on ToT. what version is it re based on?

asb added a comment.Nov 21 2017, 12:30 PM

Just tested again with r318797. Applies cleanly, compiles, all tests pass.

FrameIndex cannot be matched directly in patterns, but it can be matched by a ComplexPattern, which then can be used in selection patterns.

We have this in Hexagon (the relevant part is the getTargetFrameIndex):

bool HexagonDAGToDAGISel::SelectAddrFI(SDValue &N, SDValue &R) {
  if (N.getOpcode() != ISD::FrameIndex)
    return false;
  auto &HFI = *HST->getFrameLowering();
  MachineFrameInfo &MFI = MF->getFrameInfo();
  int FX = cast<FrameIndexSDNode>(N)->getIndex();
  if (!MFI.isFixedObjectIndex(FX) && HFI.needsAligna(*MF))
    return false;
  R = CurDAG->getTargetFrameIndex(FX, MVT::i32);
  return true;

Then in

def AddrFI: ComplexPattern<i32, 1, "SelectAddrFI", [frameindex], []>;

And then some Pat:

def: Pat<(VT (Load (add (i32 AddrFI:$fi), ImmPred:$Off))),
         (VT (MI AddrFI:$fi, imm:$Off))>;
asb added a comment.EditedNov 21 2017, 1:32 PM

FrameIndex cannot be matched directly in patterns, but it can be matched by a ComplexPattern, which then can be used in selection patterns.

We have this in Hexagon (the relevant part is the getTargetFrameIndex):

Thanks Krzysztof. This patch does something similar, in that I define a ComplexPattern in order to match a FrameIndex. The bit I remember having problems with was using the operand produced by the complex pattern with my targets standard GPR+imm load instructions. Of course my complexpattern was trying to get both the base and offset from the frameindex. Sticking to _just_ matching the frameindex in the ComplexPattern like your Hexagon snippets does seems more promising. I'll have another look tomorrow.

asb added a comment.Nov 22 2017, 12:56 PM

I've had a good look at the alternative approach, whereby a ComplexPattern is used that _just_ matches the FrameIndex and extra load and store tablegen patterns are written using that. This also requires lowering FrameIndex in RISCVDAGToDAGISel::Select. If going that way, it probably makes sense to move HexagonDAGToDAGISel::isOrEquivalentToAdd to target-independent code rather than copying it in to RISCVDagToDAGISel.

I've managed to achieve identical codegen apart from one regression, where a simple varargs test case starts with:

t0: ch = EntryToken
      t39: i32 = add nuw FrameIndex:i32<-1>, Constant:i32<4>
    t55: ch = store<ST4[%va]> t0, t39, FrameIndex:i32<0>, undef:i32
      t9: i32,ch = CopyFromReg t0, Register:i32 %vreg2
    t11: ch = store<ST4[FixedStack-4](align=8)> t0, t9, FrameIndex:i32<-4>, undef:i32
      t13: i32,ch = CopyFromReg t0, Register:i32 %vreg3
    t15: ch = store<ST4[<unknown>]> t0, t13, FrameIndex:i32<-5>, undef:i32
      t17: i32,ch = CopyFromReg t0, Register:i32 %vreg4
    t19: ch = store<ST4[FixedStack-6](align=16)> t0, t17, FrameIndex:i32<-6>, undef:i32
      t21: i32,ch = CopyFromReg t0, Register:i32 %vreg5
    t23: ch = store<ST4[<unknown>]> t0, t21, FrameIndex:i32<-7>, undef:i32
      t25: i32,ch = CopyFromReg t0, Register:i32 %vreg6
    t27: ch = store<ST4[FixedStack-8](align=8)> t0, t25, FrameIndex:i32<-8>, undef:i32
      t29: i32,ch = CopyFromReg t0, Register:i32 %vreg7
    t31: ch = store<ST4[<unknown>]> t0, t29, FrameIndex:i32<-9>, undef:i32
  t60: ch = TokenFactor t58:1, t55, t11, t15, t19, t23, t27, t31
t44: ch,glue = CopyToReg t60, Register:i32 %X10, t58
    t4: i32,ch = CopyFromReg t0, Register:i32 %vreg1
  t7: ch = store<ST4[<unknown>]> t0, t4, FrameIndex:i32<-3>, undef:i32
t58: i32,ch = load<LD4[%2]> t7, FrameIndex:i32<-1>, undef:i32
t45: ch = RISCVISD::RET_FLAG t44, Register:i32 %X10, t44:1

ends up as the following (two addi when one would suffice):

	%vreg7<def> = COPY %X17; GPR:%vreg7
	%vreg6<def> = COPY %X16; GPR:%vreg6
	%vreg5<def> = COPY %X15; GPR:%vreg5
	%vreg4<def> = COPY %X14; GPR:%vreg4
	%vreg3<def> = COPY %X13; GPR:%vreg3
	%vreg2<def> = COPY %X12; GPR:%vreg2
	%vreg1<def> = COPY %X11; GPR:%vreg1
	SW %vreg1, <fi#-3>, 0; mem:ST4[<unknown>] GPR:%vreg1
	SW %vreg7, <fi#-9>, 0; mem:ST4[<unknown>] GPR:%vreg7
	SW %vreg6, <fi#-8>, 0; mem:ST4[FixedStack-8](align=8) GPR:%vreg6
	SW %vreg5, <fi#-7>, 0; mem:ST4[<unknown>] GPR:%vreg5
	SW %vreg4, <fi#-6>, 0; mem:ST4[FixedStack-6](align=16) GPR:%vreg4
	SW %vreg3, <fi#-5>, 0; mem:ST4[<unknown>] GPR:%vreg3
	SW %vreg2, <fi#-4>, 0; mem:ST4[FixedStack-4](align=8) GPR:%vreg2
	%vreg8<def> = ADDI <fi#-1>, 0; GPR:%vreg8
	%vreg9<def> = ADDI %vreg8<kill>, 4; GPR:%vreg9,%vreg8
	SW %vreg9<kill>, <fi#0>, 0; mem:ST4[%va] GPR:%vreg9
	%vreg10<def> = LW <fi#-1>, 0; mem:LD4[%2] GPR:%vreg10
	%X10<def> = COPY %vreg10; GPR:%vreg10
	PseudoRET %X10<imp-use>

One the one hand, the RISC-V backend is starting by focusing on correct and well tested output, with performance work coming later. This means sub-optimal codegen can be tolerated initially. On the other hand, it seems a shame to regress code generation in this way.

I'll have a look if there's a solution that doesn't result in too much additional complexity.

asb added a comment.Nov 23 2017, 3:26 AM

Just to update on my previous comment and to make sure nobody wastes time on this: looking at the issue with a fresh pair of eyes this morning the selection issues were rather straight-forward to resolve. Krzysztof's proposed solution seems cleaner overall and is now working as a drop-in replacement for the RV32I tests. I'll clean up my implementation, test against the full RV32IMAFD+RV64IMAFD patchset and refresh this patch.

asb updated this revision to Diff 124451.Nov 27 2017, 1:10 PM
asb edited the summary of this revision. (Show Details)

Rewrite to avoid the need for the fake *_FI instructions. I also move isOrEquivalentToAdd from HexagonISelDAGToDAG to the SelectionDAGISel base class, so we can make use of it.

asb updated this revision to Diff 125373.Dec 4 2017, 10:51 AM

No patch changes, just refreshing the diff. If we can get this reviewed, it unblocks the following two patches which already have reviews.

sabuasal accepted this revision.EditedDec 7 2017, 1:27 PM


@kparzysz should probably take a look too

This revision is now accepted and ready to land.Dec 7 2017, 1:27 PM
kparzysz added inline comments.Dec 7 2017, 1:33 PM

This looks unused. Can it be removed?

asb updated this revision to Diff 126137.Dec 8 2017, 6:33 AM
asb marked an inline comment as done.

Rebase the patch and remove MEMii which is no longer needed.

asb added a comment.Dec 10 2017, 9:15 AM

@kparzysz, does this look good to you now?

kparzysz accepted this revision.Dec 10 2017, 2:22 PM

Yes, LGTM.

This revision was automatically updated to reflect the committed changes.