mflr is relatively expensive on Power processors older than Power10, so in the function prologue we should schedule the store that uses the mflr result away from the mflr itself.
In the epilogue, the expensive mtlr has no user of its result, so it does not matter that the load and the mtlr are back-to-back.
Differential D137423
[PowerPC] make expensive mflr be away from its user in the function prologue
Closed, Public. Authored by shchenz on Nov 4 2022, 7:16 AM.
 Event Timeline
 shchenz retitled this revision from "[PowerPC][WIP] make expensive mflr be away from its user in the function prologue" to "[PowerPC] make expensive mflr be away from its user in the function prologue".
 Fix all LIT failures.
 shchenz added a parent revision: D137612: [PowerPC] add a new subtarget feature FastMFLR. Nov 7 2022, 10:47 PM
 shchenz marked an inline comment as done.
 shchenz added inline comments.
 shchenz marked an inline comment as done.
 This revision is now accepted and ready to land. Nov 14 2022, 10:05 AM
 This revision was landed with ongoing or failed builds. Nov 14 2022, 6:14 PM
 Closed by commit rGeb7d16ea2564: [PowerPC] make expensive mflr be away from its user in the function prologue (authored by shchenz).
 This revision was automatically updated to reflect the committed changes.
Revision Contents 
 
Diff 475324
 llvm/lib/Target/PowerPC/PPCFrameLowering.cpp
 llvm/test/CodeGen/PowerPC/2007-09-08-unaligned.ll
 llvm/test/CodeGen/PowerPC/2007-11-16-landingpad-split.ll
 llvm/test/CodeGen/PowerPC/2008-10-28-f128-i32.ll
 llvm/test/CodeGen/PowerPC/2010-05-03-retaddr1.ll
 llvm/test/CodeGen/PowerPC/CSR-fit.ll
 llvm/test/CodeGen/PowerPC/Frames-dyn-alloca-with-func-call.ll
 llvm/test/CodeGen/PowerPC/MCSE-caller-preserved-reg.ll
 llvm/test/CodeGen/PowerPC/P10-stack-alignment.ll
 llvm/test/CodeGen/PowerPC/PR35812-neg-cmpxchg.ll
 llvm/test/CodeGen/PowerPC/aix-cc-abi.ll
 llvm/test/CodeGen/PowerPC/aix-cc-byval-mem.ll
 llvm/test/CodeGen/PowerPC/aix-cc-byval.ll
 llvm/test/CodeGen/PowerPC/aix-cc-ext-vec-abi.ll
 llvm/test/CodeGen/PowerPC/aix-crspill.ll
 llvm/test/CodeGen/PowerPC/aix-csr.ll
 llvm/test/CodeGen/PowerPC/aix-emit-tracebacktable.ll
 llvm/test/CodeGen/PowerPC/aix-framepointer-save-restore.ll
 llvm/test/CodeGen/PowerPC/aix-llvm-intrinsic.ll
 llvm/test/CodeGen/PowerPC/aix-lr.ll
 llvm/test/CodeGen/PowerPC/aix-sret-param.ll
 llvm/test/CodeGen/PowerPC/aix-tls-gd-double.ll
 llvm/test/CodeGen/PowerPC/aix-tls-gd-int.ll
 llvm/test/CodeGen/PowerPC/aix-tls-gd-longlong.ll
 llvm/test/CodeGen/PowerPC/aix-tls-xcoff-reloc-large.ll
 llvm/test/CodeGen/PowerPC/aix-tls-xcoff-reloc.ll
 llvm/test/CodeGen/PowerPC/aix-user-defined-memcpy.ll
 llvm/test/CodeGen/PowerPC/aix-vec-arg-spills.ll
 llvm/test/CodeGen/PowerPC/aix-vector-stack-caller.ll
 llvm/test/CodeGen/PowerPC/aix-xcoff-reloc.ll
 llvm/test/CodeGen/PowerPC/aix-xcoff-symbol-rename.ll
 llvm/test/CodeGen/PowerPC/all-atomics.ll
 llvm/test/CodeGen/PowerPC/alloca-crspill.ll
 llvm/test/CodeGen/PowerPC/atomics-i128-ldst.ll
 llvm/test/CodeGen/PowerPC/atomics-i128.ll
 llvm/test/CodeGen/PowerPC/atomics-indexed.ll
 llvm/test/CodeGen/PowerPC/atomics.ll
 llvm/test/CodeGen/PowerPC/branch-on-store-cond.ll
 llvm/test/CodeGen/PowerPC/byval.ll
 llvm/test/CodeGen/PowerPC/canonical-merge-shuffles.ll
 llvm/test/CodeGen/PowerPC/constant-pool.ll
 llvm/test/CodeGen/PowerPC/csr-split.ll
 llvm/test/CodeGen/PowerPC/ctrloop-constrained-fp.ll
 llvm/test/CodeGen/PowerPC/ctrloop-fp128.ll
 llvm/test/CodeGen/PowerPC/cxx_tlscc64.ll
 llvm/test/CodeGen/PowerPC/disable-ctr-ppcf128.ll
 llvm/test/CodeGen/PowerPC/elf64-byval-cc.ll
 llvm/test/CodeGen/PowerPC/expand-foldable-isel.ll
 llvm/test/CodeGen/PowerPC/f128-aggregates.ll
 llvm/test/CodeGen/PowerPC/f128-arith.ll
 llvm/test/CodeGen/PowerPC/f128-branch-cond.ll
 llvm/test/CodeGen/PowerPC/f128-compare.ll
 llvm/test/CodeGen/PowerPC/f128-conv.ll
 llvm/test/CodeGen/PowerPC/f128-fma.ll
 llvm/test/CodeGen/PowerPC/f128-passByValue.ll
 llvm/test/CodeGen/PowerPC/f128-rounding.ll
 llvm/test/CodeGen/PowerPC/f128-truncateNconv.ll
 llvm/test/CodeGen/PowerPC/fast-isel-branch.ll
 llvm/test/CodeGen/PowerPC/float-load-store-pair.ll
 llvm/test/CodeGen/PowerPC/fmf-propagation.ll
 llvm/test/CodeGen/PowerPC/fminnum.ll
 llvm/test/CodeGen/PowerPC/fp-int128-fp-combine.ll
 llvm/test/CodeGen/PowerPC/fp-strict-conv-f128.ll
 llvm/test/CodeGen/PowerPC/fp-strict-conv-spe.ll
 llvm/test/CodeGen/PowerPC/fp-strict-f128.ll
 llvm/test/CodeGen/PowerPC/fp-strict-fcmp.ll
 llvm/test/CodeGen/PowerPC/fp-strict-round.ll
 llvm/test/CodeGen/PowerPC/fp-strict.ll
 llvm/test/CodeGen/PowerPC/fp128-bitcast-after-operation.ll
 llvm/test/CodeGen/PowerPC/frem.ll
 llvm/test/CodeGen/PowerPC/funnel-shift.ll
 llvm/test/CodeGen/PowerPC/handle-f16-storage-type.ll
 llvm/test/CodeGen/PowerPC/inlineasm-i64-reg.ll
 llvm/test/CodeGen/PowerPC/jump-tables-collapse-rotate.ll
 llvm/test/CodeGen/PowerPC/larger-than-red-zone.ll
 llvm/test/CodeGen/PowerPC/lower-intrinsics-afn-mass_notail.ll
 llvm/test/CodeGen/PowerPC/lower-intrinsics-fast-mass_notail.ll
 llvm/test/CodeGen/PowerPC/machine-pre.ll
 llvm/test/CodeGen/PowerPC/memCmpUsedInZeroEqualityComparison.ll
 llvm/test/CodeGen/PowerPC/no-duplicate.ll
 llvm/test/CodeGen/PowerPC/not-fixed-frame-object.ll
 llvm/test/CodeGen/PowerPC/out-of-range-dform.ll
 llvm/test/CodeGen/PowerPC/pcrel_ldst.ll
 llvm/test/CodeGen/PowerPC/pow-025-075-intrinsic-scalar-mass-fast.ll
 llvm/test/CodeGen/PowerPC/pow-025-075-nointrinsic-scalar-mass-fast.ll
 llvm/test/CodeGen/PowerPC/ppc-prologue.ll
 llvm/test/CodeGen/PowerPC/ppc-shrink-wrapping.ll
 llvm/test/CodeGen/PowerPC/ppc32-nest.ll
 llvm/test/CodeGen/PowerPC/ppc64-P9-setb.ll
 llvm/test/CodeGen/PowerPC/ppc64-byval-larger-struct.ll
 llvm/test/CodeGen/PowerPC/ppc64-byval-multi-store.ll
 llvm/test/CodeGen/PowerPC/ppc64-inlineasm-clobber.ll
 llvm/test/CodeGen/PowerPC/ppc64-nest.ll
 llvm/test/CodeGen/PowerPC/ppc64-notoc-rm-relocation.ll
 llvm/test/CodeGen/PowerPC/ppc64-rop-protection-aix.ll
 llvm/test/CodeGen/PowerPC/ppc64-rop-protection.ll
 llvm/test/CodeGen/PowerPC/ppcf128-constrained-fp-intrinsics.ll
 llvm/test/CodeGen/PowerPC/ppcf128-endian.ll
 llvm/test/CodeGen/PowerPC/pr33547.ll
 llvm/test/CodeGen/PowerPC/pr36292.ll
 llvm/test/CodeGen/PowerPC/pr41088.ll
 llvm/test/CodeGen/PowerPC/pr43527.ll
 llvm/test/CodeGen/PowerPC/pr43976.ll
 llvm/test/CodeGen/PowerPC/pr44183.ll
 llvm/test/CodeGen/PowerPC/pr45301.ll
 llvm/test/CodeGen/PowerPC/pr45432.ll
 llvm/test/CodeGen/PowerPC/pr47373.ll
 llvm/test/CodeGen/PowerPC/pr48519.ll
 llvm/test/CodeGen/PowerPC/pr48527.ll
 llvm/test/CodeGen/PowerPC/pr49092.ll
 llvm/test/CodeGen/PowerPC/pr55463.ll
 llvm/test/CodeGen/PowerPC/pr56469.ll
 llvm/test/CodeGen/PowerPC/read-set-flm.ll
 llvm/test/CodeGen/PowerPC/recipest.ll
 llvm/test/CodeGen/PowerPC/reg-scavenging.ll
 llvm/test/CodeGen/PowerPC/remove-redundant-load-imm.ll
 llvm/test/CodeGen/PowerPC/retaddr.ll
 llvm/test/CodeGen/PowerPC/retaddr2.ll
 llvm/test/CodeGen/PowerPC/retaddr_multi_levels.ll
 llvm/test/CodeGen/PowerPC/scalar-rounding-ops.ll
 llvm/test/CodeGen/PowerPC/sms-cpy-1.ll
 llvm/test/CodeGen/PowerPC/sms-phi-1.ll
 llvm/test/CodeGen/PowerPC/sms-phi-3.ll
 llvm/test/CodeGen/PowerPC/spe.ll
 llvm/test/CodeGen/PowerPC/srem-lkk.ll
 llvm/test/CodeGen/PowerPC/srem-seteq-illegal-types.ll
 llvm/test/CodeGen/PowerPC/stack-restore-with-setjmp.ll
 llvm/test/CodeGen/PowerPC/store_fptoi.ll
 llvm/test/CodeGen/PowerPC/tailcall-speculatable-callee.ll
 llvm/test/CodeGen/PowerPC/testComparesi32gtu.ll
 llvm/test/CodeGen/PowerPC/testComparesi32ltu.ll
 llvm/test/CodeGen/PowerPC/tocSaveInPrologue.ll
 llvm/test/CodeGen/PowerPC/urem-lkk.ll
 llvm/test/CodeGen/PowerPC/vector-constrained-fp-intrinsics.ll
 
 llvm/test/CodeGen/PowerPC/vector-reduce-fadd.ll
 llvm/test/DebugInfo/XCOFF/explicit-section.ll
The cost of an mflr is a characteristic of an implementation, not of the architecture. A future processor might have a slow mflr. I think a new subtarget feature is needed for this.
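To make the suggestion concrete, here is a minimal toy sketch of gating the prologue layout on a subtarget flag. It is not the code from PPCFrameLowering.cpp: `ToySubtarget`, `HasFastMFLR`, `emitToyPrologue`, and `distanceToLRStore` are hypothetical names invented for illustration, and the 48-byte frame and the stack offsets are assumed values for a 64-bit ELF-style frame. The flag only stands in for the FastMFLR feature named in parent revision D137612.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a subtarget query; not an LLVM class.
struct ToySubtarget {
  bool HasFastMFLR; // true when mflr is cheap on this implementation
};

// Emits a toy prologue as assembly strings (assumed 48-byte frame).
std::vector<std::string> emitToyPrologue(const ToySubtarget &ST) {
  std::vector<std::string> Insts;
  Insts.push_back("mflr 0");           // r0 = LR (the expensive def)
  if (ST.HasFastMFLR)
    Insts.push_back("std 0, 16(1)");   // store LR right away
  Insts.push_back("std 31, -8(1)");    // independent callee-saved store
  Insts.push_back("stdu 1, -48(1)");   // allocate the frame
  if (!ST.HasFastMFLR)
    Insts.push_back("std 0, 64(1)");   // same slot, now at 48 + 16
  return Insts;
}

// Number of instructions from the mflr to the store that consumes r0.
std::size_t distanceToLRStore(const std::vector<std::string> &Insts) {
  std::size_t Def = Insts.size(), Use = Insts.size();
  for (std::size_t I = 0; I < Insts.size(); ++I) {
    if (Insts[I] == "mflr 0")
      Def = I;
    else if (Use == Insts.size() && Insts[I].rfind("std 0,", 0) == 0)
      Use = I;
  }
  assert(Def < Insts.size() && Use < Insts.size() && Def < Use);
  return Use - Def;
}
```

In this sketch a slow-mflr subtarget gets two independent instructions between the mflr and the store of r0, while a fast-mflr subtarget keeps them adjacent. Keying the decision on a feature flag rather than on a CPU generation lets a future processor with a slow mflr opt back in.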