This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Readjusting the framestack for Zcmp
ClosedPublic

Authored by VincentWu on Sep 25 2022, 4:26 AM.

Details

Summary

This patch readjusts the frame stack for the push and pop instructions

co-author: @Lukacma

Diff Detail

Event Timeline

VincentWu created this revision.Sep 25 2022, 4:26 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 25 2022, 4:26 AM
VincentWu requested review of this revision.Sep 25 2022, 4:26 AM
VincentWu updated this revision to Diff 473904.Nov 8 2022, 12:52 AM

Do not emit Push/Pop when FP is in used

Fixed the problem mentioned in https://github.com/plctlab/llvm-project/issues/58

For the reason that the Spec of the ZCMP extension is not currently modified for this purpose, the LLVM implementation is modified first to avoid this problem. When DisableFramePointerElim is set, the generation of Push/Pop instructions is stopped.

KYG added a subscriber: KYG.Feb 14 2023, 12:19 AM
KYG added inline comments.
llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
268

Seems like the def PGPR : RegisterClass ... is missed in RISCVRegisterInfo.td?

llvm/lib/Target/RISCV/RISCVPushPopOptimizer.cpp
106

Um...where is the implementation of this function? (Or use isCopyInstrImpl() instead?)

VincentWu updated this revision to Diff 497334.Feb 14 2023, 7:57 AM

rebase & update to 1.0

VincentWu marked 2 inline comments as done.Feb 14 2023, 7:58 AM
jrtc27 requested changes to this revision.Apr 11 2023, 11:33 AM
jrtc27 added inline comments.
llvm/test/CodeGen/RISCV/cm_mvas_mvsa.ll
2

Please no manual CHECK lines, use update_mir_test_checks.py

3

Indent, and why VALID rather than CHECK?

8

Don't just dump Clang-produced IR into a test, clean it up properly to be minimal and comprehensible. And fix your formatting issues while you're at it (*nocapture is incorrect)

llvm/test/CodeGen/RISCV/push-pop-popret.ll
3

?

15

Again, clean up your IR. Though this attribute isn't even there, so the comment is worse than unnecessary, it's wrong.

This revision now requires changes to proceed.Apr 11 2023, 11:33 AM
llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
448

Any reason why we change it to STL vector?

VincentWu updated this revision to Diff 516040.Apr 22 2023, 1:02 AM
VincentWu marked 5 inline comments as done.

address comments

VincentWu marked an inline comment as done.Apr 22 2023, 1:03 AM
kito-cheng added inline comments.Apr 24 2023, 3:03 AM
llvm/lib/Target/RISCV/RISCVRegisterInfo.td
186 ↗(On Diff #516040)

Zcmp?

VincentWu updated this revision to Diff 516439.EditedApr 24 2023, 9:22 AM
VincentWu marked an inline comment as done.
VincentWu retitled this revision from [RISCV] Add CodeGen support of RISCV zcmp Extension to [RISCV] Add CodeGen support of RISCV zcmp Extension.

rebase

VincentWu updated this revision to Diff 516440.Apr 24 2023, 9:24 AM

address comments

kito-cheng added inline comments.Apr 24 2023, 7:45 PM
llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
281

Zcmp

llvm/lib/Target/RISCV/RISCVMoveOptimizer.cpp
20

Zcmp

llvm/lib/Target/RISCV/RISCVPushPopOptimizer.cpp
21

Zcmp

kito-cheng added inline comments.Apr 24 2023, 7:45 PM
llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
301

Zcmp

KYG added inline comments.Apr 24 2023, 8:31 PM
llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
258

reallocPushStackFream -> reallocPushStackFrame ?
Is it necessary to setObjectOffset for non-pushed registers? I though their offset has already been set in
PEI::calculateFrameObjectOffsets (correct me if I'm wrong).

285

total_sp_adj = base_sp_adj (rlist) + additional_sp_adj (Spimm)
I think it's better to get base_sp_adj through rlist, instead of assuming that the init Spimm value is actually base_sp_adj then correct it to additional_sp_adj here.

565

Already checked in isCSIpushable().

1327

I think we should have something like getNonLibcallCSI(), maybe getNonPushPopCSI()?

1336–1341

Better just addImm(0) then update it in adjSPInPushPop(), since the Spimm field never mean to store the base sp adjustment.

1387–1399

Is it possible to cache the result (isPushalbe, and maybe rlist)?
Since push/pop uses a reversed order, using different rlist between prologue and epilogue will cause issue, so maybe just check isCSIpushable once then cache the rlist value.

1401–1426

Need to notify CM_POP kill (def) callee-saved registers, otherwise MIScheduler could not see the dependency, but I'm not sure how to do this (addDef to each poped register?).

Could you split this patch into 3 different revision for easier review? it also easier to demo what you improved for individual part.

  1. move merging pass
  2. Push/Pop optimization pass
  3. FrameLowering related stuffs

My understating is: 1 can be standalone patch, and 2 has some dependency on 3, but it should be able to split that into separated patch?

VincentWu updated this revision to Diff 521524.May 11 2023, 7:00 PM
VincentWu marked 9 inline comments as done.
VincentWu retitled this revision from [RISCV] Add CodeGen support of RISCV zcmp Extension to [RISCV] Readjusting the framestack for Zcmp.
VincentWu edited the summary of this revision. (Show Details)
VincentWu added a reviewer: Jim.
llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
258

IMHO, I think it might be necessary in some cases.
Please refer to the callee_with_irq in llvm/test/CodeGen/RISCV/push-pop-popret.ll.
It will generate following code in RV32I

addi    sp, sp, -144
sw      ra, 140(sp)                     # 4-byte Folded Spill
sw      t0, 136(sp)                     # 4-byte Folded Spill
sw      t1, 132(sp)                     # 4-byte Folded Spill
sw      t2, 128(sp)                     # 4-byte Folded Spill
sw      s0, 124(sp)                     # 4-byte Folded Spill
sw      s1, 120(sp)                     # 4-byte Folded Spill
sw      a0, 116(sp)                     # 4-byte Folded Spill
sw      a1, 112(sp)                     # 4-byte Folded Spill
sw      a2, 108(sp)                     # 4-byte Folded Spill
sw      a3, 104(sp)                     # 4-byte Folded Spill
sw      a4, 100(sp)                     # 4-byte Folded Spill
sw      a5, 96(sp)                      # 4-byte Folded Spill
sw      a6, 92(sp)                      # 4-byte Folded Spill
sw      a7, 88(sp)                      # 4-byte Folded Spill
sw      s2, 84(sp)                      # 4-byte Folded Spill
sw      s3, 80(sp)                      # 4-byte Folded Spill
sw      s4, 76(sp)                      # 4-byte Folded Spill
sw      s5, 72(sp)                      # 4-byte Folded Spill
sw      s6, 68(sp)                      # 4-byte Folded Spill
sw      s7, 64(sp)                      # 4-byte Folded Spill
sw      s8, 60(sp)                      # 4-byte Folded Spill
sw      s9, 56(sp)                      # 4-byte Folded Spill
sw      s10, 52(sp)                     # 4-byte Folded Spill
sw      s11, 48(sp)                     # 4-byte Folded Spill
sw      t3, 44(sp)                      # 4-byte Folded Spill
sw      t4, 40(sp)                      # 4-byte Folded Spill
sw      t5, 36(sp)                      # 4-byte Folded Spill
sw      t6, 32(sp)

if we don't realloc non-pushed registers in RV32IZcmp, it will generate code like

cm.push {ra, s0-s11}, -112
addi    sp, sp, -32
sw      t0, 136(sp)                     # 4-byte Folded Spill
sw      t1, 132(sp)                     # 4-byte Folded Spill
sw      t2, 128(sp)                     # 4-byte Folded Spill
sw      a0, 116(sp)                     # 4-byte Folded Spill
sw      a1, 112(sp)                     # 4-byte Folded Spill
sw      a2, 108(sp)                     # 4-byte Folded Spill
sw      a3, 104(sp)                     # 4-byte Folded Spill
sw      a4, 100(sp)                     # 4-byte Folded Spill
sw      a5, 96(sp)                      # 4-byte Folded Spill
sw      a6, 92(sp)                      # 4-byte Folded Spill
sw      a7, 88(sp)                      # 4-byte Folded Spill
sw      t3, 44(sp)                      # 4-byte Folded Spill
sw      t4, 40(sp)                      # 4-byte Folded Spill
sw      t5, 36(sp)                      # 4-byte Folded Spill
sw      t6, 32(sp)                      # 4-byte Folded Spill

meanwhile, accronding to Operation of cm.push defiend in spec

You may find both s0 and t0 will be saved to sp+136. It will cause problems.
So I think I have to re-alloc non-pushed registers. please correct me if I'm wrong or there is a better solution )

1401–1426

I think I have got what you mean, but can you tell me which function I should use to kill the callee-saved register?
Or where the LW/LD instruction does?

KYG added inline comments.May 11 2023, 8:01 PM
llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
258

Ah yes, you're right.
However, seems like it's only necessary when push/pop change any of the offset, which happens only in irq handler.
So maybe it's better to reallocPushStackFrame only for functions with interrupt attribute.

1401–1426

Normally we should add those registers to the "Defs" list in .td, but I don't know how to do this since the rlist is a compile time variable.
So I just add implicit def to the MI as a convenient workaround.

for (int x = 0; RegList[x] <= MaxReg && RegList[x] > 0; ++x)
  PopBuilder.addDef(RegList[x], RegState::Implicit);
fakepaper56 added inline comments.
llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
460

Is it better to support Zcmp for RISCVRegisterInfo::hasReservedSpillSlot and use a more general name for getNonLibcallCSI.

llvm/lib/Target/RISCV/RISCVRegisterInfo.td
187 ↗(On Diff #521524)

I suggest to add X26 to PGPR to make Zcmp work for spill register are {ra, s0 - 10}.

fakepaper56 added inline comments.May 12 2023, 1:33 AM
llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
281

How about rewriting to Spimm = std::min(StackAdjust - StackAdjBase, 48); ?

301

Could the for loop be refined to any_of ?

1326

IMHO, is it possible the callee save register needed spilled are not successive, like {ra, s0, s3}?

fakepaper56 added inline comments.May 14 2023, 10:51 PM
llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
258

Is the problem caused by not adjust RISCVRegisterInfo::hasReservedSpillSlot? The frame indexes of the wrong asm looks like that it uses the {ra, s0 - s11} frames.

Should we provide the Zcmp stack size information into getFrameIndexReference and getFirstSPAdjustAmount like lib call ?

craig.topper added inline comments.May 15 2023, 5:35 PM
llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
230

Unexpected*

262

Can this be a const reference? I think that's what getCalleedSavedInfo returns

299

Why can't we called RISCV::PGPRRegClass.contains(Reg)?

469

RISCV::PGPRRegClass.contains(Reg)?

603

Use curly braces here since there is a long comment.

1321

Posh -> Push

1321

unsigned should be sufficient here right?

llvm/lib/Target/RISCV/RISCVMachineFunctionInfo.h
69

zcmp -> Zcmp

evandro removed a subscriber: evandro.May 17 2023, 3:56 PM
VincentWu updated this revision to Diff 523779.May 19 2023, 7:53 AM
VincentWu marked 17 inline comments as done.

address comments
refactor Push/Pop reference to save-restore Libcall
better name for NonLibcallCSI

llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
1326

I think it should not support this syntax.
In spec, it is valid only in the following cases:

1401–1426

is it RegState::Implicit? did you mean RegState::Kill ?

or I have add .setMIFlag(MachineInstr::FrameDestroy), does it useful for this point?

VincentWu updated this revision to Diff 523790.May 19 2023, 8:22 AM

fix compile error

fakepaper56 added inline comments.May 21 2023, 10:49 PM
llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
1326

My idea is your code may raise problem when we need to spill s3 but not spill s1 and s2. The stack size you calculated is register number is 3 but indeed you spill 5 register.

1326

you could use else if (const char *SpillLibCall = getSpillLibCallName(*MF, CSI)) here.

fakepaper56 added inline comments.May 21 2023, 10:52 PM
llvm/lib/Target/RISCV/RISCVRegisterInfo.td
189–190 ↗(On Diff #523790)

Fold the two lines to (sequence "X%u", 18, 27) ?

fakepaper56 added inline comments.May 21 2023, 11:05 PM
llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
987

Support getRVPushStackSize() here.

996

Support getRVPushStackSize() here.

1280

Support getRVPushStackSize() here?

1314

Maybe align the size for accurate stack size. And it can help you not to align for getRVPushStackSize().

KYG added inline comments.May 21 2023, 11:13 PM
llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
1401–1426

It should be RegState::ImplicitDefine or RegState::Define, since CM_POP is defining new values to registers rather than using their values (RegState::Kill means the last use of a register).
I don't think .setMIFlag(MachineInstr::FrameDestroy) is enough for this purpose, it doesn't contain information about which registers are being assigned.

VincentWu updated this revision to Diff 525133.May 24 2023, 6:06 AM
VincentWu marked 9 inline comments as done.

address comments

fakepaper56 added inline comments.May 24 2023, 7:27 AM
llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
1401–1426

Why not use ImplicitDefine here? I think using ImplicitDefine can avoid using variadic operand for CM_PUSH and CM_POP.

fakepaper56 added inline comments.May 24 2023, 5:59 PM
llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
1324

Should we implicit use spill register like CM.POP?

1401–1426

I am sorry that I forgot to mention you this idea.

VincentWu updated this revision to Diff 525533.May 25 2023, 4:00 AM
VincentWu marked 3 inline comments as done.

use ImplicitDefine and set Implicit for push

VincentWu updated this revision to Diff 525536.May 25 2023, 4:04 AM

use PushPopRegs directly

fakepaper56 added inline comments.May 26 2023, 2:26 AM
llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
1331

CM.PUSH does "use" the callee save. I think you should use addUse instead of addDef.

VincentWu updated this revision to Diff 526310.May 27 2023, 9:08 PM

address comments

LGTM. But wait for other reviewers' approve.

craig.topper added inline comments.Jun 7 2023, 9:17 AM
llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
231

Put the register name comments at the end of the line instead of in the middle

277

Can we count PushPopRegs as we go through the loop and only have a special case for RISCV::X26 after the loop?

278

Can we replace this with

MaxPushPopReg = RISCV::X27;
break;

to avoid a return when every other case breaks.

483

Why is this blank line removed?

1308–1309

Emmit -> Emit

1412

This list is repeated in two places. Should it be a global variable?

VincentWu marked 7 inline comments as done.

address comments

craig.topper added inline comments.Jun 21 2023, 9:50 AM
llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
227

Why does this return`int`? Aren't all the enum values positive?

VincentWu marked an inline comment as done.

address comments

VincentWu updated this revision to Diff 536142.Jun 30 2023, 1:49 AM

rebase & update test case

LGTM

the status of this patch is sitll Needs Review.
should I land this patch?
or waiting for @jrtc27 comments?

This revision was not accepted when it landed; it landed in state Needs Review.Jul 6 2023, 8:24 PM
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.