This is an archive of the discontinued LLVM Phabricator instance.

Differential D122118

[MachineCopyPropagation] Eliminate spillage copies that might be caused by eviction chain
ClosedPublic

Authored by lkail on Mar 21 2022, 2:21 AM.

Details

Reviewers

aditya_nandakumar
qcolombet

Group Reviewers

Restricted Project

Commits

rG96aaebd12e73: [MachineCopyPropagation] Eliminate spillage copies that might be caused by…

Summary

Remove spill-reload-like copy chains. For example,

r0 = COPY r1
r1 = COPY r2
r2 = COPY r3
r3 = COPY r4
<def-use r4>
r4 = COPY r3
r3 = COPY r2
r2 = COPY r1
r1 = COPY r0

will be folded into

r0 = COPY r1
r1 = COPY r4
<def-use r4>
r4 = COPY r1
r1 = COPY r0
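
The fold above can be sketched in a few lines of standalone C++. This is only an illustration of the rewrite on the example, not the pass's implementation: the `Copy` and `Chain` types, the string register names, and `foldChain` are all hypothetical, and the sketch assumes the chain has already been collected and validated.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// One COPY instruction: Dst = COPY Src. Registers are plain strings here.
struct Copy {
  std::string Dst, Src;
  bool operator==(const Copy &O) const { return Dst == O.Dst && Src == O.Src; }
};

// A spill-reload chain (hypothetical representation).
struct Chain {
  std::vector<Copy> Spills;  // program order, outermost spill first
  std::vector<Copy> Reloads; // program order, innermost reload first
};

// Folds the chain in place. Returns false if there are fewer than three
// spill/reload pairs, in which case nothing can be removed.
bool foldChain(Chain &C) {
  const std::size_t N = C.Spills.size();
  if (N < 3 || C.Reloads.size() != N)
    return false;
  // The outermost pair is kept untouched. The register it frees (r1 in the
  // example) becomes the single temporary slot for the innermost value (r4).
  const std::string Slot = C.Spills[1].Dst;
  const std::string Value = C.Spills[N - 1].Src;
  C.Spills = {C.Spills[0], Copy{Slot, Value}};
  C.Reloads = {Copy{Value, Slot}, C.Reloads[N - 1]};
  return true;
}
```

Applied to the four-pair chain in the summary, this keeps `r0 = COPY r1` / `r1 = COPY r0` and rewrites the inner copies to go through `r1` directly; a two-pair chain is left as is.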

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

lkail created this revision.Mar 21 2022, 2:21 AM

Herald added a project: Restricted Project.Mar 21 2022, 2:21 AM

Herald added subscribers: dmgreen, hiraditya, nemanjai, qcolombet.

lkail requested review of this revision.Mar 21 2022, 2:21 AM

Herald added a project: Restricted Project.Mar 21 2022, 2:21 AM

Herald added a subscriber: llvm-commits.

lkail retitled this revision from [MachineCopyPropagaion][WIP] Eliminate spillage copies that might caused by eviction chain to [MachineCopyPropagation][WIP] Eliminate spillage copies that might caused by eviction chain.Mar 21 2022, 2:21 AM

lkail updated this revision to Diff 416891.Mar 21 2022, 4:23 AM

lkail updated this revision to Diff 416893.Mar 21 2022, 4:32 AM

Harbormaster completed remote builds in B155352: Diff 416893.Mar 21 2022, 5:29 AM

lkail updated this revision to Diff 417201.Mar 22 2022, 12:35 AM

Harbormaster completed remote builds in B155570: Diff 417201.Mar 22 2022, 1:24 AM

At first glance it looks similar to what @aditya_nandakumar implemented internally to get rid of copies produced by eviction chains.

@aditya_nandakumar could you take a look?

qcolombet added a reviewer: qcolombet.Sep 19 2022, 4:21 PM

Talked to Aditya offline (a while back actually) and he told me he doesn't expect to have time to look at this any time soon.
Adding myself as a reviewer.

Hi @lkail ,

Thanks for your patience.
This goes in the right direction.

I think we're missing a few comments and some cleanups (clang-format, remove stale comments, use proper LLVM_DEBUG macros, etc.), and then we're good to go!

Cheers,
-Quentin

llvm/lib/CodeGen/MachineCopyPropagation.cpp
1066	Add comments. What this does: explain the algorithm at a high level.
1074	Not for this patch, but you may want to use `isCopyInstr` instead of `isCopy` to catch more cases. That said, it probably won't make much of a difference, since most of the copies you're trying to remove here come from splitting in regalloc (i.e., we'll have plain `COPY`).
1074	To be on the safe side, you may want to check that the operation has no implicit operand.
1081	Instead of copying the chain, can we hold on to the reference until we're done with the processing?
1088	Maybe worth splitting the reload from the spill chain as this assert is strange at first glance. Perhaps it wouldn't be as problematic when the method is properly documented. Put differently, let's leave it like this for now, add a bunch of comments and we'll see if it still feels weird after that.
1091	We'll need to expand on that comment because naively, a two-pair chain would be beneficial to remove, but this is not something this code can do since we don't recolor outside of the spill chain. I'd suggest putting something like: We need at least 3 pairs of copies for the transformation to apply, because the first outermost pair cannot be removed since we don't recolor outside of the chain, and we need at least one temporary spill slot to shorten the chain. If we only have a chain of two pairs, we already have the shortest sequence this code can handle: the outermost pair for the temporary spill slot, and the pair that uses that temporary spill slot for the other end of the chain.
1103	Here and other places where you have "debug" statements: Put this in `LLVM_DEBUG` macros. Or remove completely.
1108	Move that into its own helper function and call it from an assert:
	static bool LLVM_ATTRIBUTE_UNUSED isValidChain(const SmallVectorImpl<MachineInstr *> &Chain) {
	  // your checks here.
	}
	...
	assert(isValidChain(Chain) && "Invalid chain to process");
1113	I feel that this is a bit late to check that. We should not put a copy in the "candidate" chains if the copy is not foldable. I would suggest to handle that in the main loop.
1120	By construction `Chain[Len - 4]->getOperand(0) == Chain[Len - 3]->getOperand(1)`, so I would instead put that in a variable and use it in both places. E.g., something like:
	// Pull the last spill slot used only within the chain as the final spill slot.
	MCRegister LastReusableRegSpillSlot = Chain[Len - 4]->getOperand(0).getReg();
	// Update the chain to skip all the intermediate register spill slots:
	// Spilling:
	Chain[0]->getOperand(0).setReg(LastReusableRegSpillSlot);
	// Reload:
	Chain[1]->getOperand(1).setReg(LastReusableRegSpillSlot);
1122	Maybe worth adding a comment here that although the variable is called `MaybeDeadCopies`, we really are going to remove the related instructions. The fact that we use `MaybeDeadCopies` to do our code cleanup is slightly confusing because if we don't actually delete the intermediate copies (a.k.a. what remains of the chain at this point) the resulting code would be incorrect.
1187	At first it is strange to see that we look for a copy when `Reg` is a def, but I guess it makes sense because:
	- We are not going to recolor `Reg`.
	- We need to consider this chain before it gets clobbered later in that same loop.
	Assuming I understood that correctly, it deserves its own comment here.
1195	Use `LeadRegs.find` and avoid the double lookups (one in `count` and one in `operator[]`).
1205	I think this statement deserves its own comment. IIUC here we unconditionally clobber all the registers (as opposed to only clobbering the definitions) because we only rewrite the chain itself (i.e., we don't attempt to rewrite uses after the chain). BTW, you need to take into account regmasks too.
1205	Shouldn't we clear the `SpillChains` for defs and for registers not preserved by regmasks at this point?
llvm/test/CodeGen/PowerPC/mcp-elim-eviction-chain.mir
135	Add a test with regmasks.
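
The `find`-based lookup suggested for line 1195 can be sketched as follows. This is a minimal illustration only: `std::unordered_map` stands in for the map used in the patch, and the key/value types, the function names, and the `-1` sentinel are assumptions rather than the patch's actual declarations.

```cpp
#include <unordered_map>

using Reg = unsigned;
using LeaderMap = std::unordered_map<Reg, int>;

// Two hash lookups: one in count(), a second one in at().
int leaderTwoLookups(const LeaderMap &LeadRegs, Reg R) {
  if (LeadRegs.count(R))
    return LeadRegs.at(R);
  return -1; // hypothetical "no leader" sentinel
}

// One hash lookup: find() returns an iterator that is reused for the value.
int leaderOneLookup(const LeaderMap &LeadRegs, Reg R) {
  auto It = LeadRegs.find(R);
  return It == LeadRegs.end() ? -1 : It->second;
}
```

Both functions return the same result; the second simply avoids hashing the key twice.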

Hi @qcolombet, thanks for your detailed comments. I have uploaded another patch which is very different from the previous one. I'm not sure I have addressed all your comments.

lkail marked 7 inline comments as done.Oct 27 2022, 9:04 AM

lkail retitled this revision from [MachineCopyPropagation] Eliminate spillage copies that might caused by eviction chain to [MachineCopyPropagation] Eliminate spillage copies that might be caused by eviction chain.Oct 27 2022, 9:13 AM

Harbormaster completed remote builds in B194670: Diff 471178.Oct 27 2022, 9:59 AM

Ping.

Gentle ping.

Hi @lkail,

I am halfway through.

I'm sharing my comments so far if you want to get started with some of the nitpicks.

Cheers,
-Quentin

llvm/lib/CodeGen/MachineCopyPropagation.cpp
110	Maybe rename to `LastSeenUseInCopy`. Essentially, I would avoid `LastUse` alone, as it carries a lot of expected semantics that I don't think apply here.
201	Use `MI` directly here instead of adding it line 200.
295	Could `Current` be const here?
314	If `Def` is clobbered between `DefCopy` and `Current` I would have expected that `DefCopy` would have been removed from `Copies`. Put differently, I feel that if we have to check that here, the `Copies` map is holding hard-to-reason-about information. Are we missing some call to `clobberRegister`?
1073	I think you can simplify this with a range based loop with `rbegin`/`rend`.
1074	That should be a simple range-based loop:
	for (const MachineInstr *MI : RC)
	  MI->dump();
1093	Technically we could collapse this sequence to:
	// r0 = COPY r4
	// <def-use r4>
	// r4 = COPY r0
	I.e., I think it is worth explaining that the propagation doesn't check whether or not `r0` can be altered outside of the chain and that's why we conservatively keep its value as it was before the rewrite. (Would be a nice follow-up fix BTW :)).
1096	Typo: until
1096	typo: chain uses
1109	typo: encountered
1114	That sounds weird. Would you mind sharing the assertion?
1116	Could you add a comment on what the mappings hold (all three of them)? I haven't read the code past this point yet, but for instance I would have expected that the key in these maps would be a `Register` not `MachineInstr`.
1117	typo: until
1119	Instead of tracking that, should we just invalidate the chain / stop it before that point?
1132	Maybe add a todo that if the outermost pair of copies modifies a register that is dead outside of that pair, we could eliminate one more pair.

lkail added inline comments.Nov 18 2022, 12:07 AM

llvm/lib/CodeGen/MachineCopyPropagation.cpp
201	Correct me if I'm wrong: `Copies.insert` inserts successfully only when the key doesn't already exist. Suppose we have
	L0: R0 = COPY R1
	L1: R2 = COPY R1
	`LastSeenUseInCopy` should track the MI in `L1` rather than `L0`.
314	I can imagine there might be some `RegMask`s that don't implicitly def any registers and just clobber them. When we are traversing the MBB and encounter a `RegMask` without any other implicit-def, we don't know which registers to clobber directly. I'm not sure we have a way to enumerate the registers a `RegMask` clobbers. If there is such a way, I think we should clobber registers while traversing the MBB rather than checking `RegMask` clobbers here.
1114	It hits the assertion in `DenseMapIterator`'s
	pointer operator->() const {
	  assert(isHandleInSync() && "invalid iterator access!");
	  ...
	}
	I dug into it a bit; it looks like it's checking the validity of the iterator, i.e., if the container is updated, an iterator constructed before the update is invalid. Code like
	auto Leader = ChainLeader.find(MaybePrevReload);
	...
	ChainLeader.insert({MaybeSpill, Leader->second});
	ChainLeader.insert({&MI, Leader->second});
	should be avoided.
1116	> I haven't read the code past this point yet, but for instance I would have expected that the key in these maps would be a Register not MachineInstr.
	I separated the algorithm implementation into two stages. Stage 1: collect spill-reload chains. Stage 2: fold the chains. If we used `Register` as the key, we would be unable to track different spill-reload chains that share the same registers.
1119	The implementation doesn't invalidate any chain in stage 1. Compared to the previous implementation, I think the current one is easier to reason about and easier to maintain. When we are iterating over the MIs inside the MBB, we don't know which `COPY` might be part of the innermost spill-reload pair, and we don't want to lose track of that pair. The source of the innermost spill is allowed to be reused and redefined between the innermost spill-reload pair.
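
The point about `Copies.insert` (line 201) can be illustrated with a small standalone sketch. `std::map` stands in for `DenseMap` here (both leave an existing entry untouched on `insert`), instruction ids replace `MachineInstr *`, and the `CopyInfo` layout and function names are simplified assumptions.

```cpp
#include <map>

// Simplified stand-in for the tracker's per-register record.
struct CopyInfo {
  int MI;                // id of the defining copy, -1 if none
  int LastSeenUseInCopy; // id of the latest copy reading this register
};

using CopyMap = std::map<unsigned, CopyInfo>;

// Insert-only variant: if Src is already tracked, insert() is a no-op and
// the record keeps pointing at the first copy that read Src.
void trackUseInsertOnly(CopyMap &Copies, unsigned Src, int MIId) {
  Copies.insert({Src, CopyInfo{-1, MIId}});
}

// Insert-then-assign variant: the field is refreshed on every call, so it
// always names the most recent copy that read Src.
void trackUseInsertThenAssign(CopyMap &Copies, unsigned Src, int MIId) {
  auto I = Copies.insert({Src, CopyInfo{-1, -1}});
  I.first->second.LastSeenUseInCopy = MIId;
}
```

With `L0: R0 = COPY R1` followed by `L1: R2 = COPY R1`, the first variant would still report `L0` for `R1`, while the second reports `L1`.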

Address comments.

lkail marked 10 inline comments as done.Nov 18 2022, 12:09 AM

lkail updated this revision to Diff 476369.Nov 18 2022, 12:17 AM

lkail marked an inline comment as done.

lkail updated this revision to Diff 476371.Nov 18 2022, 12:30 AM

lkail added inline comments.Nov 18 2022, 12:57 AM

llvm/lib/CodeGen/MachineCopyPropagation.cpp
1093	Maybe we can check if `r0` is killed to remove one more COPY.

Harbormaster completed remote builds in B198396: Diff 476371.Nov 18 2022, 1:13 AM

qcolombet added inline comments.Nov 22 2022, 7:20 PM

llvm/lib/CodeGen/MachineCopyPropagation.cpp
201	Ah good point!
314	I think I see our misunderstanding. Given the name of the function, I would expect that this function only does some queries on the tracker, but you're actually using this function to do some bookkeeping as well. So the conclusion is to either rename this function to more accurately represent what it does (I don't have a good name for now) or move the bookkeeping into the main loop (i.e., I thought we were calling clobberRegister from the main loop already).
	Regarding your comment on `RegMask`s, I am not sure I follow:
	- `RegMask`s always list all the registers they preserve/clobber.
	- Liveness sets at basic block boundaries are not represented with `RegMask`s, but anyway we don't care because the tracking is always purely local to a basic block in that pass. (Unless you've changed that and I missed it :)).
1073	You should be able to use an even more compact form: for (auto I : make_range(SC.rbegin(), SC.rend()))
1093	Yep, but for that to be accurate we would need to flip the direction of the analysis (from top-down, to bottom-up) to get proper liveness construction. (Or use the kill flag, but generally speaking we try to avoid relying on this.)
1114	Yep, every time you insert something in the dense map, you may invalidate the iterators.
1116	I see, makes sense.
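
One way to avoid the invalidated-iterator assertion discussed for line 1114 is to copy the mapped value out before inserting. The sketch below is an assumption about how that could look, not the code in the patch: `std::unordered_map` stands in for `DenseMap` (its `insert` can also invalidate iterators when it rehashes), integer ids replace `MachineInstr *`, and `joinChain` is a hypothetical helper.

```cpp
#include <unordered_map>

using InstrId = int;
using LeaderMap = std::unordered_map<InstrId, InstrId>;

// Adds Spill and Reload to the chain that PrevReload belongs to.
// Returns false if PrevReload is not part of any tracked chain.
bool joinChain(LeaderMap &ChainLeader, InstrId PrevReload, InstrId Spill,
               InstrId Reload) {
  auto It = ChainLeader.find(PrevReload);
  if (It == ChainLeader.end())
    return false;
  // Copy the leader by value. After the first insert below, It may no
  // longer be valid, so It->second must not be read again.
  const InstrId Leader = It->second;
  ChainLeader.insert({Spill, Leader});
  ChainLeader.insert({Reload, Leader});
  return true;
}
```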

lkail added inline comments.Nov 23 2022, 1:36 AM

llvm/lib/CodeGen/MachineCopyPropagation.cpp
314	> RegMasks always list all the registers they preserve/clobber
	Ah, I see. Is it a good idea to enumerate them via `TRI->getNumRegs()` and check the `RegMask` to see whether they are preserved or clobbered?
	For the original question:
	> If Def is clobbered between DefCopy and Current I would have expected that DefCopy would have been removed from Copies.
	Does it imply that we should not check the `RegMask` or update the bookkeeping by calling `Tracker::clobberRegister` here, because checking whether a `RegMask` clobbers registers should already have been done in the main loop? Correct me if I still fail to get your point.

qcolombet added inline comments.Nov 24 2022, 12:15 PM

llvm/lib/CodeGen/MachineCopyPropagation.cpp
314	> Ah, I see. Is it a good idea to enumerate them via TRI->getNumRegs() and check RegMask to see if they are preserve/clobber?
	That's the idea. Though we wouldn't need to enumerate all the registers, only the ones that you care about. What you're doing here is already fine. The thing that bothers me in the current code is the call to `clobberRegister`. This modifies the state of the tracker. Usually the `find` methods only query the `RegMask`s directly (with `MachineOperand::clobbersPhysReg`).
	> Correct me if I still fail to get your point.
	You got the point. Let's keep the code as is for now and let me do a full pass on the code so that I have a better model of how the whole thing works. I probably won't have time to do it before next week though.
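
Checking a regmask only for "the registers you care about" can be sketched without any LLVM headers. The bit test below follows the convention that, as far as I know, `MachineOperand::clobbersPhysReg` uses (a set bit means the register is preserved); the function names, the register numbers, and the flat `std::vector` of tracked registers are assumptions made for illustration.

```cpp
#include <cstdint>
#include <vector>

// A register mask is an array of 32-bit words with one bit per physical
// register. A set bit means "preserved"; a clear bit means "clobbered".
bool maskClobbersReg(const std::uint32_t *RegMask, unsigned PhysReg) {
  return !(RegMask[PhysReg / 32] & (1u << (PhysReg % 32)));
}

// Query the mask only for the tracked registers instead of walking every
// register the target defines.
std::vector<unsigned> clobberedAmong(const std::uint32_t *RegMask,
                                     const std::vector<unsigned> &Tracked) {
  std::vector<unsigned> Clobbered;
  for (unsigned Reg : Tracked)
    if (maskClobbersReg(RegMask, Reg))
      Clobbered.push_back(Reg);
  return Clobbered;
}
```

This is a pure query: it reports which tracked registers are clobbered and leaves any tracker state untouched, which matches the expectation stated above for `find`-style methods.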

Hi @lkail,

Thanks for your patience.

Looks mostly good to me.

The only thing that makes me uneasy is the potential impact on compile time of CheckCopyConstraint.
Do you know on average how many chains we see per function?

I know we have a bot tracking compile time. If that doesn't show anything significant, I guess we could enable it by default. I just don't know how extensive the tests are.

Alternatively, to avoid surprising everybody, we could add a way to enable/disable the new folding.
Either like what was done with UseCopyInstr or with a target hook (you can add a command line option too that would override what the target asked for for testing purposes).

Then only enable it for your target and send an RFC asking people to try the new folding for their targets.

What do you think?

Cheers,
-Quentin

llvm/lib/CodeGen/MachineCopyPropagation.cpp
1127	Nit: `const` on `MachineInstr*`
1147	You should be able to use a range loop:
	for (const MachineInstr *Spill : SC) {
	  if (CopySourceInvalid.count(Spill))
	    return;
	}
1152	Nit: range loop
1158	That's going to be pretty expensive to walk all the register classes. I'm guessing you're trying to check if the resulting copy is legal and unfortunately there's no good way to do that. Did you see that showing up in compile time profile?
1188	Nit: range loop
1197	Nit: Here and other places where you use `isCopyInstr`: use the explicit type instead of `auto`. (The return type is hard to infer.)

lkail updated this revision to Diff 485466.Dec 28 2022, 12:18 AM

Herald added subscribers: pcwang-thead, frasercrmck, luismarques and 20 others.Dec 28 2022, 12:18 AM

Add target hook and command line option.
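
How a tri-state command-line option can override a target hook is sketched below. The enum is a stand-in for `cl::boolOrDefault` (unset, true, false) and the function is hypothetical; in the patch the target's answer comes from `enableSpillageCopyElimination()` and the flag is `-enable-spill-copy-elim`.

```cpp
// Stand-in for a tri-state option: not given, forced on, or forced off.
enum class BoolOrDefault { Unset, True, False };

// The flag wins whenever it is given explicitly; otherwise the target's
// hook decides. TargetDefault models the result of the subtarget hook.
bool isSpillageCopyElimEnabled(BoolOrDefault Flag, bool TargetDefault) {
  switch (Flag) {
  case BoolOrDefault::True:
    return true;
  case BoolOrDefault::False:
    return false;
  case BoolOrDefault::Unset:
    return TargetDefault;
  }
  return TargetDefault; // not reached for valid enum values
}
```

This keeps the folding off for targets that have not opted in, while still allowing tests to force it on or off regardless of the target.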

Herald added a subscriber: kbarton.Dec 28 2022, 12:23 AM

Harbormaster completed remote builds in B205056: Diff 485469.Dec 28 2022, 1:16 AM

> Do you know on average how many chains we see per function?

I have run llvm-test-suite; here are the stats:

target	functions	spill chain length	avg spill chain length per function	max spill chain length in one CU	number of spill chains	avg number of spill chains per function	max number of spill chains in one CU
powerpc64-ibm-aix	60426	308	0.005097143613676	52	27	0.000446827524576	3
x86_64-linux-gnu	185938	646	0.003474276371694	130	80	0.000430250943863	14

Compile time on powerpc64-ibm-aix

Tests: 1039
Metric: compile_time

Program                                       compile_time           
                                              baseline     experiment
SingleSour...sts/2002-10-09-ArrayResolution     0.07         0.50    
MultiSourc...marks/Trimaran/enc-pc1/enc-pc1     0.14         0.86    
SingleSour...e/UnitTests/2002-10-13-BadLoad     0.06         0.38    
SingleSour...UnitTests/2002-08-02-CastTest2     0.06         0.37    
SingleSour...nitTests/2002-04-17-PrintfChar     0.06         0.36    
SingleSour.../UnitTests/conditional-gnu-ext     0.07         0.23    
MultiSource/Benchmarks/llubenchmark/llu         0.10         0.35    
SingleSour...tTests/2003-08-05-CastFPToUint     0.06         0.22    
SingleSour...nitTests/2003-05-31-LongShifts     0.07         0.22    
SingleSour...e/UnitTests/Vector/Altivec/lde     0.21         0.68    
SingleSource/UnitTests/blockstret               0.06         0.18    
SingleSour...e/UnitTests/2002-05-03-NotTest     0.07         0.21    
SingleSour...UnitTests/2002-05-02-CastTest2     0.07         0.20    
SingleSour...UnitTests/2003-08-11-VaListArg     0.15         0.43    
SingleSource/UnitTests/StructModifyTest         0.07         0.19    
      compile_time             
run       baseline   experiment
count  1039.000000  1039.000000
mean   1.108578     1.102223   
std    4.496574     4.384102   
min    0.000000     0.000000   
25%    0.000000     0.000000   
50%    0.000000     0.000000   
75%    0.276350     0.278450   
max    80.927800    77.030800

Compile time on x86_64-linux-gnu

Tests: 2991
Metric: compile_time

Program                                       compile_time           
                                              baseline     experiment
UnitTests/...9-04-16-BitfieldInitialization     0.00         0.02    
UnitTests/2002-04-17-PrintfChar                 0.00         0.01    
UnitTests/2002-10-13-BadLoad                    0.00         0.01    
UnitTests/block-copied-in-cxxobj-1              0.00         0.01    
UnitTests/testcase-ExprConstant-1               0.01         0.02    
UnitTests/2010-05-24-BitfieldTest               0.00         0.01    
UnitTests/block-byref-test                      0.00         0.01    
UnitTests/testcase-Expr-1                       0.01         0.02    
UnitTests/byval-alignment                       0.01         0.02    
Benchmarks/Misc/pi                              0.01         0.02    
UnitTests/block-byref-cxxobj-test               0.01         0.01    
UnitTests/2020-01-06-coverage-008               0.01         0.02    
UnitTests/2002-08-02-CastTest2                  0.01         0.01    
UnitTests/2002-05-03-NotTest                    0.02         0.02    
UnitTests/2005-05-13-SDivTwo                    0.01         0.02    
      compile_time             
l/r       baseline   experiment
count  2991.000000  2991.000000
mean   0.291675     0.291211   
std    2.642554     2.642948   
min    0.000000     0.000000   
25%    0.000000     0.000000   
50%    0.000000     0.000000   
75%    0.000000     0.000000   
max    99.370200    99.510200

I don't see significant compile time regressions.

> I just don't know how extensive the tests are.

I have tried bootstrapping stage3 and running llvm-test-suite on powerpc64-ibm-aix and x86_64-linux-gnu; no regressions were found. I'll follow your advice and send an RFC on Discourse. Currently I only enable it on PowerPC by default. https://reviews.llvm.org/D122118?id=485466 shows changes in other targets.

lkail updated this revision to Diff 485479.Dec 28 2022, 2:41 AM

Harbormaster completed remote builds in B205063: Diff 485479.Dec 28 2022, 3:37 AM

lkail updated this revision to Diff 485578.Dec 28 2022, 10:00 PM

Harbormaster completed remote builds in B205131: Diff 485578.Dec 28 2022, 11:01 PM

Compile-time: http://llvm-compile-time-tracker.com/compare.php?from=781eabeb40b8e47e3a46b0b927784e63f0aad9ab&to=0af2744a89bf0ed05e83ac1ed9d21d6d74cdfeca&stat=instructions%3Au

bjope added a subscriber: bjope.Dec 30 2022, 4:39 AM

Use range loop.

In D122118#4019373, @nikic wrote:

Compile-time: http://llvm-compile-time-tracker.com/compare.php?from=781eabeb40b8e47e3a46b0b927784e63f0aad9ab&to=0af2744a89bf0ed05e83ac1ed9d21d6d74cdfeca&stat=instructions%3Au

Your profiling is much appreciated!

Harbormaster completed remote builds in B205389: Diff 485899.Jan 2 2023, 7:39 PM

qcolombet accepted this revision.Jan 24 2023, 6:00 AM

This revision is now accepted and ready to land.Jan 24 2023, 6:00 AM

Matt added a subscriber: Matt.Jan 25 2023, 9:10 AM

This revision was landed with ongoing or failed builds.Feb 7 2023, 7:34 PM

Closed by commit rG96aaebd12e73: [MachineCopyPropagation] Eliminate spillage copies that might be caused by… (authored by lkail).

This revision was automatically updated to reflect the committed changes.

lkail added a commit: rG96aaebd12e73: [MachineCopyPropagation] Eliminate spillage copies that might be caused by….

Revision Contents

Path	Size
llvm/include/llvm/CodeGen/TargetSubtargetInfo.h	5 lines
llvm/lib/CodeGen/MachineCopyPropagation.cpp	391 lines
llvm/lib/Target/PowerPC/PPCSubtarget.h	2 lines
llvm/test/CodeGen/PowerPC/mcp-elim-eviction-chain.mir	265 lines

Diff 495713

llvm/include/llvm/CodeGen/TargetSubtargetInfo.h

(... 312 lines not shown ...)	public:

/// Classify a global function reference. This mainly used to fetch target		/// Classify a global function reference. This mainly used to fetch target
/// special flags for lowering a function address. For example mark a function		/// special flags for lowering a function address. For example mark a function
/// call should be plt or pc-related addressing.		/// call should be plt or pc-related addressing.
virtual unsigned char		virtual unsigned char
classifyGlobalFunctionReference(const GlobalValue *GV) const {		classifyGlobalFunctionReference(const GlobalValue *GV) const {
return 0;		return 0;
}		}

		/// Enable spillage copy elimination in MachineCopyPropagation pass. This
		/// helps removing redundant copies generated by register allocator when
		/// handling complex eviction chains.
		virtual bool enableSpillageCopyElimination() const { return false; }
};		};

} // end namespace llvm		} // end namespace llvm

#endif // LLVM_CODEGEN_TARGETSUBTARGETINFO_H		#endif // LLVM_CODEGEN_TARGETSUBTARGETINFO_H

llvm/lib/CodeGen/MachineCopyPropagation.cpp

(... 74 lines not shown ...)

using namespace llvm;		using namespace llvm;

#define DEBUG_TYPE "machine-cp"		#define DEBUG_TYPE "machine-cp"

STATISTIC(NumDeletes, "Number of dead copies deleted");		STATISTIC(NumDeletes, "Number of dead copies deleted");
STATISTIC(NumCopyForwards, "Number of copy uses forwarded");		STATISTIC(NumCopyForwards, "Number of copy uses forwarded");
STATISTIC(NumCopyBackwardPropagated, "Number of copy defs backward propagated");		STATISTIC(NumCopyBackwardPropagated, "Number of copy defs backward propagated");
		STATISTIC(SpillageChainsLength, "Length of spillage chains");
		STATISTIC(NumSpillageChains, "Number of spillage chains");
DEBUG_COUNTER(FwdCounter, "machine-cp-fwd",		DEBUG_COUNTER(FwdCounter, "machine-cp-fwd",
"Controls which register COPYs are forwarded");		"Controls which register COPYs are forwarded");

static cl::opt<bool> MCPUseCopyInstr("mcp-use-is-copy-instr", cl::init(false),		static cl::opt<bool> MCPUseCopyInstr("mcp-use-is-copy-instr", cl::init(false),
cl::Hidden);		cl::Hidden);
		static cl::opt<cl::boolOrDefault>
		EnableSpillageCopyElimination("enable-spill-copy-elim", cl::Hidden);

namespace {		namespace {

static std::optional<DestSourcePair> isCopyInstr(const MachineInstr &MI,		static std::optional<DestSourcePair> isCopyInstr(const MachineInstr &MI,
const TargetInstrInfo &TII,		const TargetInstrInfo &TII,
bool UseCopyInstr) {		bool UseCopyInstr) {
if (UseCopyInstr)		if (UseCopyInstr)
return TII.isCopyInstr(MI);		return TII.isCopyInstr(MI);

if (MI.isCopy())		if (MI.isCopy())
return std::optional<DestSourcePair>(		return std::optional<DestSourcePair>(
DestSourcePair{MI.getOperand(0), MI.getOperand(1)});		DestSourcePair{MI.getOperand(0), MI.getOperand(1)});

return std::nullopt;		return std::nullopt;
}		}

class CopyTracker {		class CopyTracker {
struct CopyInfo {		struct CopyInfo {
MachineInstr *MI;		MachineInstr *MI, *LastSeenUseInCopy;
SmallVector<MCRegister, 4> DefRegs;		SmallVector<MCRegister, 4> DefRegs;
bool Avail;		bool Avail;
};		};

DenseMap<MCRegister, CopyInfo> Copies;		DenseMap<MCRegister, CopyInfo> Copies;

public:		public:
/// Mark all of the given registers and their subregisters as unavailable for		/// Mark all of the given registers and their subregisters as unavailable for
(... 69 lines not shown ...)	std::optional<DestSourcePair> CopyOperands =
isCopyInstr(*MI, TII, UseCopyInstr);		isCopyInstr(*MI, TII, UseCopyInstr);
assert(CopyOperands && "Tracking non-copy?");		assert(CopyOperands && "Tracking non-copy?");

MCRegister Src = CopyOperands->Source->getReg().asMCReg();		MCRegister Src = CopyOperands->Source->getReg().asMCReg();
MCRegister Def = CopyOperands->Destination->getReg().asMCReg();		MCRegister Def = CopyOperands->Destination->getReg().asMCReg();

// Remember Def is defined by the copy.		// Remember Def is defined by the copy.
for (MCRegUnitIterator RUI(Def, &TRI); RUI.isValid(); ++RUI)		for (MCRegUnitIterator RUI(Def, &TRI); RUI.isValid(); ++RUI)
Copies[*RUI] = {MI, {}, true};		Copies[*RUI] = {MI, nullptr, {}, true};

// Remember source that's copied to Def. Once it's clobbered, then		// Remember source that's copied to Def. Once it's clobbered, then
// it's no longer available for copy propagation.		// it's no longer available for copy propagation.
for (MCRegUnitIterator RUI(Src, &TRI); RUI.isValid(); ++RUI) {		for (MCRegUnitIterator RUI(Src, &TRI); RUI.isValid(); ++RUI) {
auto I = Copies.insert({*RUI, {nullptr, {}, false}});		auto I = Copies.insert({*RUI, {nullptr, nullptr, {}, false}});
auto &Copy = I.first->second;		auto &Copy = I.first->second;
if (!is_contained(Copy.DefRegs, Def))		if (!is_contained(Copy.DefRegs, Def))
Copy.DefRegs.push_back(Def);		Copy.DefRegs.push_back(Def);
		Copy.LastSeenUseInCopy = MI;
}		}
}		}

bool hasAnyCopies() {		bool hasAnyCopies() {
return !Copies.empty();		return !Copies.empty();
}		}

MachineInstr *findCopyForUnit(MCRegister RegUnit,		MachineInstr *findCopyForUnit(MCRegister RegUnit,
(... 72 lines not shown ...)	for (const MachineInstr &MI :
for (const MachineOperand &MO : MI.operands())		for (const MachineOperand &MO : MI.operands())
if (MO.isRegMask())		if (MO.isRegMask())
if (MO.clobbersPhysReg(AvailSrc) || MO.clobbersPhysReg(AvailDef))		if (MO.clobbersPhysReg(AvailSrc) || MO.clobbersPhysReg(AvailDef))
return nullptr;		return nullptr;

return AvailCopy;		return AvailCopy;
}		}

		// Find last COPY that defines Reg before Current MachineInstr.
		MachineInstr *findLastSeenDefInCopy(const MachineInstr &Current,
		MCRegister Reg,
		const TargetRegisterInfo &TRI,
		const TargetInstrInfo &TII,
		bool UseCopyInstr) {
		MCRegUnitIterator RUI(Reg, &TRI);
		auto CI = Copies.find(*RUI);
		if (CI == Copies.end() || !CI->second.Avail)
		return nullptr;

		MachineInstr *DefCopy = CI->second.MI;
		std::optional<DestSourcePair> CopyOperands =
		isCopyInstr(*DefCopy, TII, UseCopyInstr);
		Register Def = CopyOperands->Destination->getReg();
		if (!TRI.isSubRegisterEq(Def, Reg))
		return nullptr;

		for (const MachineInstr &MI :
		make_range(static_cast<const MachineInstr *>(DefCopy)->getIterator(),
		Current.getIterator()))
		for (const MachineOperand &MO : MI.operands())
		if (MO.isRegMask())
		if (MO.clobbersPhysReg(Def)) {
		LLVM_DEBUG(dbgs() << "MCP: Removed tracking of "
		<< printReg(Def, &TRI) << "\n");
		return nullptr;
		}

		return DefCopy;
		}

		// Find last COPY that uses Reg.
		MachineInstr *findLastSeenUseInCopy(MCRegister Reg,
		const TargetRegisterInfo &TRI) {
		MCRegUnitIterator RUI(Reg, &TRI);
		auto CI = Copies.find(*RUI);
		if (CI == Copies.end())
		return nullptr;
		return CI->second.LastSeenUseInCopy;
		}

void clear() {		void clear() {
Copies.clear();		Copies.clear();
}		}
};		};
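
The `RegMask` discussion above hinges on how a register mask answers "is this register clobbered?" and on keeping `find`-style helpers free of side effects. The following is a minimal standalone sketch, assuming the usual LLVM encoding in which a register mask is an array of 32-bit words and a set bit means "preserved". The free functions `clobbersPhysReg` and `isStillAvailable` and the register numbers are hypothetical stand-ins for illustration, not the pass's code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy register mask: bit R set means register R is preserved across the
// instruction; a clear bit means it is clobbered (assumed encoding).
static bool clobbersPhysReg(const std::vector<uint32_t> &RegMask,
                            unsigned PhysReg) {
  assert(PhysReg / 32 < RegMask.size() && "register outside of mask");
  return !(RegMask[PhysReg / 32] & (1u << (PhysReg % 32)));
}

// Pure query: report whether Def survives every mask seen between the
// defining copy and the current instruction. No tracker state is modified,
// which is the behavior the reviewer expects from a `find` method.
static bool isStillAvailable(const std::vector<std::vector<uint32_t>> &Masks,
                             unsigned Def) {
  for (const std::vector<uint32_t> &Mask : Masks)
    if (clobbersPhysReg(Mask, Def))
      return false;
  return true;
}
```

Only the registers the caller cares about are tested, so there is no need to enumerate every register up to `TRI->getNumRegs()`.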

class MachineCopyPropagation : public MachineFunctionPass {		class MachineCopyPropagation : public MachineFunctionPass {
const TargetRegisterInfo *TRI;		const TargetRegisterInfo *TRI;
const TargetInstrInfo *TII;		const TargetInstrInfo *TII;
Show All 23 Lines	public:
}		}

private:		private:
typedef enum { DebugUse = false, RegularUse = true } DebugType;		typedef enum { DebugUse = false, RegularUse = true } DebugType;

void ReadRegister(MCRegister Reg, MachineInstr &Reader, DebugType DT);		void ReadRegister(MCRegister Reg, MachineInstr &Reader, DebugType DT);
void ForwardCopyPropagateBlock(MachineBasicBlock &MBB);		void ForwardCopyPropagateBlock(MachineBasicBlock &MBB);
void BackwardCopyPropagateBlock(MachineBasicBlock &MBB);		void BackwardCopyPropagateBlock(MachineBasicBlock &MBB);
		void EliminateSpillageCopies(MachineBasicBlock &MBB);
bool eraseIfRedundant(MachineInstr &Copy, MCRegister Src, MCRegister Def);		bool eraseIfRedundant(MachineInstr &Copy, MCRegister Src, MCRegister Def);
void forwardUses(MachineInstr &MI);		void forwardUses(MachineInstr &MI);
void propagateDefs(MachineInstr &MI);		void propagateDefs(MachineInstr &MI);
bool isForwardableRegClassCopy(const MachineInstr &Copy,		bool isForwardableRegClassCopy(const MachineInstr &Copy,
const MachineInstr &UseI, unsigned UseIdx);		const MachineInstr &UseI, unsigned UseIdx);
bool isBackwardPropagatableRegClassCopy(const MachineInstr &Copy,		bool isBackwardPropagatableRegClassCopy(const MachineInstr &Copy,
const MachineInstr &UseI,		const MachineInstr &UseI,
unsigned UseIdx);		unsigned UseIdx);
▲ Show 20 Lines • Show All 674 Lines • ▼ Show 20 Lines	for (auto *Copy : MaybeDeadCopies) {
++NumDeletes;		++NumDeletes;
}		}

MaybeDeadCopies.clear();		MaybeDeadCopies.clear();
CopyDbgUsers.clear();		CopyDbgUsers.clear();
Tracker.clear();		Tracker.clear();
}		}

		static void LLVM_ATTRIBUTE_UNUSED printSpillReloadChain(
		qcolombet (Done): Add comments. What this does: explain the algorithm at a high level.
		DenseMap<MachineInstr *, SmallVector<MachineInstr *>> &SpillChain,
		DenseMap<MachineInstr *, SmallVector<MachineInstr *>> &ReloadChain,
		MachineInstr *Leader) {
		auto &SC = SpillChain[Leader];
		auto &RC = ReloadChain[Leader];
		for (auto I = SC.rbegin(), E = SC.rend(); I != E; ++I)
		(*I)->dump();
		qcolombet (Done): I think you can simplify this with a range based loop with `rbegin`/`rend`.
		qcolombet (Not Done): You should be able to use an even more compact form: `for (auto I : make_range(SC.rbegin(), SC.rend()))`
		for (MachineInstr *MI : RC)
		qcolombet (Done): Not for this patch, but you may want to use `isCopyInstr` instead of `isCopy` to catch more cases. That said, it probably won't make much of a difference, since in particular most copies you're trying to remove here come from splitting in regalloc (i.e., we'll have plain `COPY`).
		qcolombet (Done): To be on the safe side, you may want to check that the operation has no implicit operand.
		qcolombet (Done): That should be a simple range based loop: `for (const MachineInstr *MI : RC) MI->dump();`
		MI->dump();
		}
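
The reviewer's suggestion to walk the spill chain with a range-based loop over `rbegin`/`rend` can be sketched outside LLVM as follows. `IteratorRange` and `makeRange` are hypothetical minimal stand-ins for `llvm::iterator_range` and `llvm::make_range`, and the chains are modeled as strings rather than `MachineInstr *`. The sketch assumes both chains are stored innermost-first, as the `push_back` calls in the main loop suggest, so reversing the spill chain yields program order.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Minimal stand-in for llvm::iterator_range (hypothetical, for illustration).
template <typename IterT> struct IteratorRange {
  IterT First, Last;
  IterT begin() const { return First; }
  IterT end() const { return Last; }
};

template <typename IterT>
IteratorRange<IterT> makeRange(IterT First, IterT Last) {
  return {First, Last};
}

// Returns the chain in the order the dump helper would print it:
// spills from outermost to innermost, then reloads from innermost to
// outermost.
std::vector<std::string>
chainInPrintOrder(const std::vector<std::string> &SC,
                  const std::vector<std::string> &RC) {
  assert(SC.size() == RC.size() && "Spill-reload should be paired");
  std::vector<std::string> Out;
  for (const std::string &Spill : makeRange(SC.rbegin(), SC.rend()))
    Out.push_back(Spill);
  for (const std::string &Reload : RC)
    Out.push_back(Reload);
  return Out;
}
```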

		// Remove spill-reload like copy chains. For example
		// r0 = COPY r1
		// r1 = COPY r2
		// r2 = COPY r3
		qcolombet (Not Done): Instead of copying the chain, can we hold on to the reference until we're done with the processing?
		// r3 = COPY r4
		// <def-use r4>
		// r4 = COPY r3
		// r3 = COPY r2
		// r2 = COPY r1
		// r1 = COPY r0
		// will be folded into
		qcolombet (Done): Maybe worth splitting the reload from the spill chain as this assert is strange at first glance. Perhaps it wouldn't be as problematic when the method is properly documented. Put differently, let's leave it like this for now, add a bunch of comments and we'll see if it still feels weird after that.
		// r0 = COPY r1
		// r1 = COPY r4
		// <def-use r4>
		qcolombet (Done): We'll need to expand on that comment because naively, a 2 pairs chain would be beneficial to remove, but this is not something this code can do since we don't recolor outside of the spill chain. I'd suggest putting something like:
		> We need at least 3 pairs of copies for the transformation to apply, because the first outermost pair cannot be removed since we don't recolor outside of the chain and that we need at least one temporary spill slot to shorten the chain. If we only have a chain of two pairs, we already have the shortest sequence this code can handle: the outermost pair for the temporary spill slot, and the pair that use that temporary spill slot for the other end of the chain.
		// r4 = COPY r1
		// r1 = COPY r0
		qcolombet (Done): Technically we could collapse this sequence to:
		    // r0 = COPY r4
		    // <def-use r4>
		    // r4 = COPY r0
		I.e., I think it is worth explaining that the propagation doesn't check whether or not `r0` can be altered outside of the chain and that's why we conservatively keep its value as it was before the rewrite. (Would be a nice follow-up fix BTW :)).
		lkail (Author, Done): Maybe we can check if `r0` is killed to remove one more COPY.
		qcolombet (Not Done): Yep, but for that to be accurate we would need to flip the direction of the analysis (from top-down to bottom-up) to get proper liveness construction. (Or use the kill flag, but generally speaking we try to avoid relying on this.)
		// TODO: Currently we don't track usage of r0 outside the chain, so we
		// conservatively keep its value as it was before the rewrite.
		//
		qcolombet (Done): Typo: until
		qcolombet (Done): typo: chain uses
		// The algorithm is trying to keep
		// property#1: No Def of spill COPY in the chain is used or defined until the
		// paired reload COPY in the chain uses the Def.
		//
		// property#2: No Source of COPY in the chain is used or defined until the next
		// COPY in the chain defines the Source, except the innermost spill-reload
		// pair.
		qcolombet (Done): Here and other places where you have "debug" statements: Put this in `LLVM_DEBUG` macros. Or remove completely.
		//
		// The algorithm is conducted by checking every COPY inside the MBB, assuming
		// the COPY is a reload COPY, and then trying to find the paired spill COPY by
		// searching backward for the COPY that defines the Src of the reload COPY. If
		// such a pair is found,
		// it either belongs to an existing chain or a new chain depends on
		qcolombet (Not Done): Move that into its own helper function and call it from an assert.
		    static bool LLVM_ATTRIBUTE_UNUSED isValidChain(const SmallVectorImpl<MachineInstr *> &Chain) {
		      // your checks here.
		    }
		    ...
		    assert(isValidChain(Chain) && "Invalid chain to process");
		// last available COPY uses the Def of the reload COPY.
		qcolombet (Done): typo: encountered
		// Implementation notes: we use CopyTracker::findLastSeenDefInCopy(Reg, ...)
		// to find the last COPY that defines Reg, and
		// CopyTracker::findLastSeenUseInCopy(Reg, ...) to find the last COPY that
		// uses Reg. When we encounter a non-COPY instruction, we check the
		// registers in the operands of this instruction. If this
		qcolombet (Not Done): I feel that this is a bit late to check that. We should not put a copy in the "candidate" chains if the copy is not foldable. I would suggest to handle that in the main loop.
		// Reg is defined by a COPY, we untrack this Reg via
		qcolombet (Done): That sounds weird. Would you mind sharing the assertion?
		lkail (Author, Done): It hits `DenseMapIterator`'s `pointer operator->() const { assert(isHandleInSync() && "invalid iterator access!"); ... }`. I dived into it a bit; it looks like it's checking the validity of the iterator, i.e., if the container is updated, an iterator constructed before the update is invalid. Code like `auto Leader = ChainLeader.find(MaybePrevReload); ... ChainLeader.insert({MaybeSpill, Leader->second}); ChainLeader.insert({&MI, Leader->second});` should be avoided.
		qcolombet (Not Done): Yep, every time you insert something in the dense map, you may invalidate the iterators.
		// CopyTracker::clobberRegister(Reg, ...).
		void MachineCopyPropagation::EliminateSpillageCopies(MachineBasicBlock &MBB) {
		qcolombet (Done): Could you add a comment on what the mappings hold (all three of them)? I haven't read the code past this point yet, but for instance I would have expected that the key in these maps would be a `Register` not `MachineInstr`.
		lkail (Author, Done):
		> I haven't read the code past this point yet, but for instance I would have expected that the key in these maps would be a Register not MachineInstr.
		I separate the algorithm implementation into two stages. Stage 1: collect spill-reload chains. Stage 2: fold the chains. If using `Register`, we are unable to track different spill-reload chains that share the same registers.
		qcolombet (Not Done): I see, makes sense.
		// ChainLeader maps MI inside a spill-reload chain to its innermost reload COPY.
		qcolombet (Not Done): typo: until
		// Thus we can track if a MI belongs to an existing spill-reload chain.
		DenseMap<MachineInstr *, MachineInstr *> ChainLeader;
		qcolombet (Not Done): Instead of tracking that, should we just invalidate the chain / stop it before that point?
		lkail (Author, Done): The implementation doesn't invalidate any chain in stage 1. Compared to the previous implementation, I think the current one is easier to reason about and easier to maintain. When we are iterating over MIs inside the MBB, we don't know which `COPY` might be part of the innermost spill-reload pair, and we don't want to lose track of the innermost spill-reload pair. The Source of the innermost spill is allowed to be re-used and re-defined between the innermost spill-reload pair.
		// SpillChain maps innermost reload COPY of a spill-reload chain to a sequence
		qcolombet (Not Done): By construction `Chain[Len - 4]->getOperand(0) == Chain[Len - 3]->getOperand(1)`, so I would instead put that in a variable and use it in both places. E.g., something like:
		    // Pull the last spill slot used only within the chain as the final spill slot.
		    MCRegister LastReusableRegSpillSlot = Chain[Len - 4]->getOperand(0).getReg();
		    // Update the chain to skip all the intermediate register spill slots:
		    // Spilling:
		    Chain[0]->getOperand(0).setReg(LastReusableRegSpillSlot);
		    // Reload:
		    Chain[1]->getOperand(1).setReg(LastReusableRegSpillSlot);
		// of COPYs that forms spills of a spill-reload chain.
		// ReloadChain maps innermost reload COPY of a spill-reload chain to a
		qcolombet (Not Done): Maybe worth adding a comment here that although the variable is called `MaybeDeadCopies`, we really are going to remove the related instructions. The fact that we use `MaybeDeadCopies` to do our code cleanup is slightly confusing because if we don't actually delete the intermediate copies (a.k.a. what remains of the chain at this point) the resulting code would be incorrect.
		// sequence of COPYs that forms reloads of a spill-reload chain.
		DenseMap<MachineInstr *, SmallVector<MachineInstr *>> SpillChain, ReloadChain;
		// If a COPY's Source has use or def until next COPY defines the Source,
		// we put the COPY in this set to keep property#2.
		DenseSet<const MachineInstr *> CopySourceInvalid;
		qcolombet (Not Done): Nit: `const` on `MachineInstr*`

		auto TryFoldSpillageCopies =
		[&, this](const SmallVectorImpl<MachineInstr *> &SC,
		const SmallVectorImpl<MachineInstr *> &RC) {
		assert(SC.size() == RC.size() && "Spill-reload should be paired");
		qcolombet (Done): Maybe add a todo that if the outermost pair of copies modifies a register that is dead outside of that pair, we could eliminate one more pair.

		// We need at least 3 pairs of copies for the transformation to apply,
		// because the first outermost pair cannot be removed since we don't
		// recolor outside of the chain and that we need at least one temporary
		// spill slot to shorten the chain. If we only have a chain of two
		// pairs, we already have the shortest sequence this code can handle:
		// the outermost pair for the temporary spill slot, and the pair that
		// use that temporary spill slot for the other end of the chain.
		// TODO: We might be able to simplify to one spill-reload pair if collecting
		// more information about the outermost COPY.
		if (SC.size() <= 2)
		return;

		// If violate property#2, we don't fold the chain.
		for (const MachineInstr *Spill : make_range(SC.begin() + 1, SC.end()))
		qcolombet (Not Done): You should be able to use a range loop: `for (const MachineInstr *Spill : SC) { if (CopySourceInvalid.count(Spill)) return; }`
		if (CopySourceInvalid.count(Spill))
		return;

		for (const MachineInstr *Reload : make_range(RC.begin(), RC.end() - 1))
		if (CopySourceInvalid.count(Reload))
		qcolombet (Not Done): Nit: range loop
		return;

		auto CheckCopyConstraint = [this](Register Def, Register Src) {
		for (const TargetRegisterClass *RC : TRI->regclasses()) {
		if (RC->contains(Def) && RC->contains(Src))
		return true;
		qcolombet (Not Done): That's going to be pretty expensive to walk all the register classes. I'm guessing you're trying to check if the resulting copy is legal and unfortunately there's no good way to do that. Did you see that showing up in compile time profile?
		}
		return false;
		};

		auto UpdateReg = [](MachineInstr *MI, const MachineOperand *Old,
		const MachineOperand *New) {
		for (MachineOperand &MO : MI->operands()) {
		if (&MO == Old)
		MO.setReg(New->getReg());
		}
		};

		std::optional<DestSourcePair> InnerMostSpillCopy =
		isCopyInstr(*SC[0], *TII, UseCopyInstr);
		std::optional<DestSourcePair> OuterMostSpillCopy =
		isCopyInstr(*SC.back(), *TII, UseCopyInstr);
		std::optional<DestSourcePair> InnerMostReloadCopy =
		isCopyInstr(*RC[0], *TII, UseCopyInstr);
		std::optional<DestSourcePair> OuterMostReloadCopy =
		isCopyInstr(*RC.back(), *TII, UseCopyInstr);
		if (!CheckCopyConstraint(OuterMostSpillCopy->Source->getReg(),
		InnerMostSpillCopy->Source->getReg()) ||
		!CheckCopyConstraint(InnerMostReloadCopy->Destination->getReg(),
		OuterMostReloadCopy->Destination->getReg()))
		return;
		return;

		SpillageChainsLength += SC.size() + RC.size();
		NumSpillageChains += 1;
		UpdateReg(SC[0], InnerMostSpillCopy->Destination,
		qcolombet (Not Done): At first it is strange to see that we look for a copy when `Reg` is a def, but I guess it makes sense because:
		- We are not going to recolor `Reg`
		- We need to consider this chain before it gets clobbered later in that same loop
		Assuming I understood that correctly, it deserves its comment here.
		OuterMostSpillCopy->Source);
		qcolombet (Not Done): Nit: range loop
		UpdateReg(RC[0], InnerMostReloadCopy->Source,
		OuterMostReloadCopy->Destination);

		for (size_t I = 1; I < SC.size() - 1; ++I) {
		SC[I]->eraseFromParent();
		RC[I]->eraseFromParent();
		NumDeletes += 2;
		qcolombet (Not Done): Use `LeadRegs.find` and avoid the double lookups (one in `count` and one in `operator[]`).
		}
		};
		qcolombet (Not Done): Nit: Here and other places where you use `isCopyInstr`: use the explicit type instead of `auto`. (The return type is hard to infer.)

		auto IsFoldableCopy = [this](const MachineInstr &MaybeCopy) {
		if (MaybeCopy.getNumImplicitOperands() > 0)
		return false;
		std::optional<DestSourcePair> CopyOperands =
		isCopyInstr(MaybeCopy, *TII, UseCopyInstr);
		if (!CopyOperands)
		return false;
		qcolombet (Not Done): I think this statement deserves its own comment. IIUC here we unconditionally clobber all the registers (as opposed to only clobbering the definitions) because we only rewrite the chain itself (i.e., we don't attempt to rewrite uses after the chain). BTW, you need to take into account regmasks too.
		qcolombet (Not Done): Shouldn't we clear the `SpillChains` here for defs and not-preserved-by-regmasks regs at this point?
		Register Src = CopyOperands->Source->getReg();
		Register Def = CopyOperands->Destination->getReg();
		return Src && Def && !TRI->regsOverlap(Src, Def) &&
		CopyOperands->Source->isRenamable() &&
		CopyOperands->Destination->isRenamable();
		};

		auto IsSpillReloadPair = [&, this](const MachineInstr &Spill,
		const MachineInstr &Reload) {
		if (!IsFoldableCopy(Spill) || !IsFoldableCopy(Reload))
		return false;
		std::optional<DestSourcePair> SpillCopy =
		isCopyInstr(Spill, *TII, UseCopyInstr);
		std::optional<DestSourcePair> ReloadCopy =
		isCopyInstr(Reload, *TII, UseCopyInstr);
		if (!SpillCopy || !ReloadCopy)
		return false;
		return SpillCopy->Source->getReg() == ReloadCopy->Destination->getReg() &&
		SpillCopy->Destination->getReg() == ReloadCopy->Source->getReg();
		};

		auto IsChainedCopy = [&, this](const MachineInstr &Prev,
		const MachineInstr &Current) {
		if (!IsFoldableCopy(Prev) || !IsFoldableCopy(Current))
		return false;
		std::optional<DestSourcePair> PrevCopy =
		isCopyInstr(Prev, *TII, UseCopyInstr);
		std::optional<DestSourcePair> CurrentCopy =
		isCopyInstr(Current, *TII, UseCopyInstr);
		if (!PrevCopy || !CurrentCopy)
		return false;
		return PrevCopy->Source->getReg() == CurrentCopy->Destination->getReg();
		};

		for (MachineInstr &MI : llvm::make_early_inc_range(MBB)) {
		std::optional<DestSourcePair> CopyOperands =
		isCopyInstr(MI, *TII, UseCopyInstr);

		// Update track information via non-copy instruction.
		SmallSet<Register, 8> RegsToClobber;
		if (!CopyOperands) {
		for (const MachineOperand &MO : MI.operands()) {
		if (!MO.isReg())
		continue;
		Register Reg = MO.getReg();
		if (!Reg)
		continue;
		MachineInstr *LastUseCopy =
		Tracker.findLastSeenUseInCopy(Reg.asMCReg(), *TRI);
		if (LastUseCopy) {
		LLVM_DEBUG(dbgs() << "MCP: Copy source of\n");
		LLVM_DEBUG(LastUseCopy->dump());
		LLVM_DEBUG(dbgs() << "might be invalidated by\n");
		LLVM_DEBUG(MI.dump());
		CopySourceInvalid.insert(LastUseCopy);
		}
		// Note that Tracker.clobberRegister(Reg, ...) removes tracking of
		// Reg, i.e., the COPY that defines Reg is removed from the mapping and
		// COPYs that use Reg are marked unavailable.
		// We don't invoke CopyTracker::clobberRegister(Reg, ...) if Reg is not
		// defined by a previous COPY, since we don't want to make COPYs that
		// use Reg unavailable.
		if (Tracker.findLastSeenDefInCopy(MI, Reg.asMCReg(), *TRI, *TII,
		UseCopyInstr))
		// Thus we can keep the property#1.
		RegsToClobber.insert(Reg);
		}
		for (Register Reg : RegsToClobber) {
		Tracker.clobberRegister(Reg, *TRI, *TII, UseCopyInstr);
		LLVM_DEBUG(dbgs() << "MCP: Removed tracking of " << printReg(Reg, TRI)
		<< "\n");
		}
		continue;
		}

		Register Src = CopyOperands->Source->getReg();
		Register Def = CopyOperands->Destination->getReg();
		// Check if we can find a pair spill-reload copy.
		LLVM_DEBUG(dbgs() << "MCP: Searching paired spill for reload: ");
		LLVM_DEBUG(MI.dump());
		MachineInstr *MaybeSpill =
		Tracker.findLastSeenDefInCopy(MI, Src.asMCReg(), *TRI, *TII, UseCopyInstr);
		bool MaybeSpillIsChained = ChainLeader.count(MaybeSpill);
		if (!MaybeSpillIsChained && MaybeSpill &&
		IsSpillReloadPair(*MaybeSpill, MI)) {
		// Check if we already have an existing chain. Now we have a
		// spill-reload pair.
		// L2: r2 = COPY r3
		// L5: r3 = COPY r2
		// Looking for a valid COPY before L5 which uses r3.
		// There are several cases.
		// Case #1:
		// No COPY is found, which can happen when r3 is def-use between (L2, L5);
		// we create a new chain for L2 and L5.
		// Case #2:
		// L2: r2 = COPY r3
		// L5: r3 = COPY r2
		// Such COPY is found and is L2, we create a new chain for L2 and L5.
		// Case #3:
		// L2: r2 = COPY r3
		// L3: r1 = COPY r3
		// L5: r3 = COPY r2
		// we create a new chain for L2 and L5.
		// Case #4:
		// L2: r2 = COPY r3
		// L3: r1 = COPY r3
		// L4: r3 = COPY r1
		// L5: r3 = COPY r2
		// Such COPY won't be found since L4 defines r3. we create a new chain
		// for L2 and L5.
		// Case #5:
		// L2: r2 = COPY r3
		// L3: r3 = COPY r1
		// L4: r1 = COPY r3
		// L5: r3 = COPY r2
		// COPY is found and is L4 which belongs to an existing chain, we add
		// L2 and L5 to this chain.
		LLVM_DEBUG(dbgs() << "MCP: Found spill: ");
		LLVM_DEBUG(MaybeSpill->dump());
		MachineInstr *MaybePrevReload =
		Tracker.findLastSeenUseInCopy(Def.asMCReg(), *TRI);
		auto Leader = ChainLeader.find(MaybePrevReload);
		MachineInstr *L = nullptr;
		if (Leader == ChainLeader.end() ||
		(MaybePrevReload && !IsChainedCopy(*MaybePrevReload, MI))) {
		L = &MI;
		assert(!SpillChain.count(L) &&
		"SpillChain should not have contained newly found chain");
		} else {
		assert(MaybePrevReload &&
		"Found a valid leader through nullptr should not happen");
		L = Leader->second;
		assert(SpillChain[L].size() > 0 &&
		"Existing chain's length should be larger than zero");
		}
		assert(!ChainLeader.count(&MI) && !ChainLeader.count(MaybeSpill) &&
		"Newly found paired spill-reload should not belong to any chain "
		"at this point");
		ChainLeader.insert({MaybeSpill, L});
		ChainLeader.insert({&MI, L});
		SpillChain[L].push_back(MaybeSpill);
		ReloadChain[L].push_back(&MI);
		LLVM_DEBUG(dbgs() << "MCP: Chain " << L << " now is:\n");
		LLVM_DEBUG(printSpillReloadChain(SpillChain, ReloadChain, L));
		} else if (MaybeSpill && !MaybeSpillIsChained) {
		// MaybeSpill is unable to pair with MI. That's to say adding MI makes
		// the chain invalid.
		// The COPY defines Src is no longer considered as a candidate of a
		// valid chain. Since we expect the Def of a spill copy isn't used by
		// any COPY instruction until a reload copy. For example:
		// L1: r1 = COPY r2
		// L2: r3 = COPY r1
		// If we later have
		// L1: r1 = COPY r2
		// L2: r3 = COPY r1
		// L3: r2 = COPY r1
		// L1 and L3 can't be a valid spill-reload pair.
		// Thus we keep the property#1.
		LLVM_DEBUG(dbgs() << "MCP: Not paired spill-reload:\n");
		LLVM_DEBUG(MaybeSpill->dump());
		LLVM_DEBUG(MI.dump());
		Tracker.clobberRegister(Src.asMCReg(), *TRI, *TII, UseCopyInstr);
		LLVM_DEBUG(dbgs() << "MCP: Removed tracking of " << printReg(Src, TRI)
		<< "\n");
		}
		Tracker.trackCopy(&MI, *TRI, *TII, UseCopyInstr);
		}

		for (auto I = SpillChain.begin(), E = SpillChain.end(); I != E; ++I) {
		auto &SC = I->second;
		assert(ReloadChain.count(I->first) &&
		"Reload chain of the same leader should exist");
		auto &RC = ReloadChain[I->first];
		TryFoldSpillageCopies(SC, RC);
		}

		MaybeDeadCopies.clear();
		CopyDbgUsers.clear();
		Tracker.clear();
		}
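
The folding step in `TryFoldSpillageCopies` can be illustrated with a toy model. This is a simplified sketch, not the pass itself: copies are plain `{Dst, Src}` string pairs (the `Copy` struct, `foldChain`, and `programOrder` are hypothetical names), both chains are stored innermost-first, and the legality checks (property#2, the register class constraint, renamable operands) are deliberately omitted.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct Copy {
  std::string Dst, Src;
};

// Folds a spill-reload chain in place. SC and RC are ordered innermost-first.
// Returns false when the chain is too short (two pairs or fewer) to shorten.
bool foldChain(std::vector<Copy> &SC, std::vector<Copy> &RC) {
  assert(SC.size() == RC.size() && "Spill-reload should be paired");
  if (SC.size() <= 2)
    return false;
  // The innermost spill now writes the register freed by the outermost spill.
  SC.front().Dst = SC.back().Src;
  // The innermost reload reads back from that same register.
  RC.front().Src = RC.back().Dst;
  // Drop every pair strictly between the innermost and outermost ones.
  SC.erase(SC.begin() + 1, SC.end() - 1);
  RC.erase(RC.begin() + 1, RC.end() - 1);
  return true;
}

// Renders the chain in program order: spills outermost-first, then reloads.
std::vector<std::string> programOrder(const std::vector<Copy> &SC,
                                      const std::vector<Copy> &RC) {
  std::vector<std::string> Out;
  for (auto I = SC.rbegin(), E = SC.rend(); I != E; ++I)
    Out.push_back(I->Dst + " = COPY " + I->Src);
  for (const Copy &C : RC)
    Out.push_back(C.Dst + " = COPY " + C.Src);
  return Out;
}
```

Applied to the four-pair chain from the summary (`r0 = COPY r1` through `r1 = COPY r0`), this keeps the outermost pair untouched and rewrites the innermost pair to go through `r1`, matching the folded sequence shown in the function comment.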

bool MachineCopyPropagation::runOnMachineFunction(MachineFunction &MF) {		bool MachineCopyPropagation::runOnMachineFunction(MachineFunction &MF) {
if (skipFunction(MF.getFunction()))		if (skipFunction(MF.getFunction()))
return false;		return false;

		bool isSpillageCopyElimEnabled = false;
		switch (EnableSpillageCopyElimination) {
		case cl::BOU_UNSET:
		isSpillageCopyElimEnabled =
		MF.getSubtarget().enableSpillageCopyElimination();
		break;
		case cl::BOU_TRUE:
		isSpillageCopyElimEnabled = true;
		break;
		case cl::BOU_FALSE:
		isSpillageCopyElimEnabled = false;
		break;
		}

Changed = false;		Changed = false;

TRI = MF.getSubtarget().getRegisterInfo();		TRI = MF.getSubtarget().getRegisterInfo();
TII = MF.getSubtarget().getInstrInfo();		TII = MF.getSubtarget().getInstrInfo();
MRI = &MF.getRegInfo();		MRI = &MF.getRegInfo();

for (MachineBasicBlock &MBB : MF) {		for (MachineBasicBlock &MBB : MF) {
		if (isSpillageCopyElimEnabled)
		EliminateSpillageCopies(MBB);
BackwardCopyPropagateBlock(MBB);		BackwardCopyPropagateBlock(MBB);
ForwardCopyPropagateBlock(MBB);		ForwardCopyPropagateBlock(MBB);
}		}

return Changed;		return Changed;
}		}

MachineFunctionPass *		MachineFunctionPass *
llvm::createMachineCopyPropagationPass(bool UseCopyInstr = false) {		llvm::createMachineCopyPropagationPass(bool UseCopyInstr = false) {
return new MachineCopyPropagation(UseCopyInstr);		return new MachineCopyPropagation(UseCopyInstr);
}		}
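
The enablement logic in `runOnMachineFunction` combines a tri-state command-line option with a per-subtarget hook. A minimal standalone sketch of that pattern follows; `BoolOrUnset`, `Subtarget`, and `PPCLikeSubtarget` are hypothetical stand-ins for `cl::boolOrDefault`, `TargetSubtargetInfo`, and `PPCSubtarget`, and the base-class default of `false` is an assumption not visible in this part of the diff.

```cpp
// Stand-in for cl::boolOrDefault (BOU_UNSET / BOU_TRUE / BOU_FALSE).
enum class BoolOrUnset { Unset, True, False };

// Stand-in for the subtarget interface; the default of `false` is assumed.
struct Subtarget {
  virtual ~Subtarget() = default;
  virtual bool enableSpillageCopyElimination() const { return false; }
};

// Stand-in for a target that opts in, as PPCSubtarget does in this patch.
struct PPCLikeSubtarget : Subtarget {
  bool enableSpillageCopyElimination() const override { return true; }
};

// An explicit flag always wins; otherwise defer to the subtarget's choice.
bool isSpillageCopyElimEnabled(BoolOrUnset Flag, const Subtarget &ST) {
  switch (Flag) {
  case BoolOrUnset::Unset:
    return ST.enableSpillageCopyElimination();
  case BoolOrUnset::True:
    return true;
  case BoolOrUnset::False:
    return false;
  }
  return false;
}
```

The design lets a target turn the transformation on by default while still allowing it to be forced on or off from the command line for testing.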

llvm/lib/Target/PowerPC/PPCSubtarget.h

Show First 20 Lines • Show All 234 Lines • ▼ Show 20 Lines	#include "PPCGenSubtargetInfo.inc"
void getCriticalPathRCs(RegClassVector &CriticalPathRCs) const override;		void getCriticalPathRCs(RegClassVector &CriticalPathRCs) const override;

void overrideSchedPolicy(MachineSchedPolicy &Policy,		void overrideSchedPolicy(MachineSchedPolicy &Policy,
unsigned NumRegionInstrs) const override;		unsigned NumRegionInstrs) const override;
bool useAA() const override;		bool useAA() const override;

bool enableSubRegLiveness() const override;		bool enableSubRegLiveness() const override;

		bool enableSpillageCopyElimination() const override { return true; }

/// True if the GV will be accessed via an indirect symbol.		/// True if the GV will be accessed via an indirect symbol.
bool isGVIndirectSymbol(const GlobalValue *GV) const;		bool isGVIndirectSymbol(const GlobalValue *GV) const;

/// True if the ABI is descriptor based.		/// True if the ABI is descriptor based.
bool usesFunctionDescriptors() const {		bool usesFunctionDescriptors() const {
// Both 32-bit and 64-bit AIX are descriptor based. For ELF only the 64-bit		// Both 32-bit and 64-bit AIX are descriptor based. For ELF only the 64-bit
// v1 ABI uses descriptors.		// v1 ABI uses descriptors.
return isAIXABI() || (is64BitELFABI() && !isELFv2ABI());		return isAIXABI() || (is64BitELFABI() && !isELFv2ABI());

llvm/test/CodeGen/PowerPC/mcp-elim-eviction-chain.mir

This file was added.
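
One way to convince oneself that the `test0` CHECK lines in this file describe a correct fold is to simulate both instruction sequences on a toy register file and compare the registers read by the final `BLR8`. The sketch below is an illustration only: `Inst`, `RegFile`, `run`, and `sameLiveOuts` are hypothetical helpers, `ADD8` is modeled as integer addition, and the initial register values are arbitrary.

```cpp
#include <map>
#include <string>
#include <vector>

// Src2 empty means COPY Dst <- Src1; otherwise Dst <- Src1 + Src2 (ADD8).
struct Inst {
  std::string Dst, Src1, Src2;
};

using RegFile = std::map<std::string, long>;

// Executes Body on a copy of Regs. Reading an undefined register throws
// std::out_of_range via map::at, which flags an ill-formed sequence.
RegFile run(RegFile Regs, const std::vector<Inst> &Body) {
  for (const Inst &I : Body) {
    long V = Regs.at(I.Src1);
    if (!I.Src2.empty())
      V += Regs.at(I.Src2);
    Regs[I.Dst] = V;
  }
  return Regs;
}

bool sameLiveOuts(const RegFile &A, const RegFile &B,
                  const std::vector<std::string> &LiveOuts) {
  for (const std::string &R : LiveOuts)
    if (A.at(R) != B.at(R))
      return false;
  return true;
}

// test0 input body, transcribed from the MIR in this file.
std::vector<Inst> test0Before() {
  return {{"x23", "x4", ""},  {"x24", "x23", ""}, {"x23", "x22", ""},
          {"x22", "x21", ""}, {"x21", "x20", ""}, {"x20", "x4", "x5"},
          {"x4", "x20", ""},  {"x20", "x21", ""}, {"x21", "x22", ""},
          {"x22", "x23", ""}, {"x23", "x24", ""}, {"x3", "x4", ""}};
}

// test0 expected body, transcribed from the CHECK lines.
std::vector<Inst> test0After() {
  return {{"x24", "x4", ""}, {"x23", "x20", ""}, {"x20", "x4", "x5"},
          {"x4", "x20", ""}, {"x20", "x23", ""}, {"x23", "x24", ""},
          {"x3", "x4", ""}};
}
```

Running both bodies from the same live-in values and comparing `x3`, `x20`, `x21`, `x22`, and `x23` checks that the shortened chain leaves every register the return instruction reads unchanged.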

				# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
				# RUN: llc -O3 -verify-machineinstrs -mtriple=powerpc64-unknown-unknown \
				# RUN: -simplify-mir -run-pass=machine-cp %s -o - | FileCheck %s

				--- |
				declare void @foo()
				define void @test0() {
				entry:
				ret void
				}

				define void @test1() {
				entry:
				ret void
				}

				define void @test2() {
				entry:
				ret void
				}

				define void @test3() {
				entry:
				ret void
				}

				define void @test4() {
				entry:
				ret void
				}

				define void @test5() {
				entry:
				ret void
				}

				define void @test6() {
				entry:
				ret void
				}

				...
				---
				name: test0
				alignment: 4
				tracksRegLiveness: true
				body: |
				bb.0.entry:
				liveins: $x4, $x5, $x20, $x21, $x22
				; CHECK-LABEL: name: test0
				; CHECK: liveins: $x4, $x5, $x20, $x21, $x22
				; CHECK-NEXT: {{ $}}
				; CHECK-NEXT: renamable $x24 = COPY $x4
				; CHECK-NEXT: $x23 = COPY renamable $x20
				; CHECK-NEXT: renamable $x20 = ADD8 $x4, $x5
				; CHECK-NEXT: renamable $x4 = COPY renamable $x20
				; CHECK-NEXT: renamable $x20 = COPY $x23
				; CHECK-NEXT: renamable $x23 = COPY renamable $x24
				; CHECK-NEXT: $x3 = COPY renamable $x4
				; CHECK-NEXT: BLR8 implicit $lr8, implicit undef $rm, implicit $x3, implicit $x20, implicit $x21, implicit $x22, implicit $x23
				renamable $x23 = COPY renamable $x4
				renamable $x24 = COPY renamable $x23
				renamable $x23 = COPY renamable $x22
				renamable $x22 = COPY renamable $x21
				renamable $x21 = COPY renamable $x20
				renamable $x20 = ADD8 $x4, $x5
				renamable $x4 = COPY renamable $x20
				renamable $x20 = COPY renamable $x21
				renamable $x21 = COPY renamable $x22
				renamable $x22 = COPY renamable $x23
				renamable $x23 = COPY renamable $x24
				$x3 = COPY renamable $x4
				BLR8 implicit $lr8, implicit undef $rm, implicit $x3, implicit $x20, implicit $x21, implicit $x22, implicit $x23

				...

				# Duplicated pairs.
				---
				name: test1
				alignment: 4
				tracksRegLiveness: true
				body: |
				bb.0.entry:
				liveins: $x3, $x20, $x21, $x22, $x23
				; CHECK-LABEL: name: test1
				; CHECK: liveins: $x3, $x20, $x21, $x22, $x23
				; CHECK-NEXT: {{ $}}
				; CHECK-NEXT: renamable $x24 = COPY $x3
				; CHECK-NEXT: renamable $x23 = COPY renamable $x22
				; CHECK-NEXT: renamable $x22 = COPY renamable $x21
				; CHECK-NEXT: renamable $x21 = COPY renamable $x20
				; CHECK-NEXT: BLR8 implicit $lr8, implicit undef $rm, implicit $x3, implicit $x20, implicit $x21, implicit $x22, implicit $x23, implicit $x24
				renamable $x23 = COPY $x3
				renamable $x24 = COPY renamable $x23
				renamable $x23 = COPY renamable $x22
				renamable $x22 = COPY renamable $x21
				renamable $x21 = COPY renamable $x20
				renamable $x20 = COPY renamable $x21
				renamable $x21 = COPY renamable $x22
				renamable $x22 = COPY renamable $x23
				renamable $x23 = COPY renamable $x24
				renamable $x24 = COPY renamable $x23
				renamable $x23 = COPY renamable $x22
				renamable $x22 = COPY renamable $x21
				renamable $x21 = COPY renamable $x20
				BLR8 implicit $lr8, implicit undef $rm, implicit $x3, implicit $x20, implicit $x21, implicit $x22, implicit $x23, implicit $x24

				...

				# Chain one after one.
				---
				name: test2
				alignment: 4
				tracksRegLiveness: true
				body: |
				bb.0.entry:
				liveins: $x3, $x18, $x19, $x20, $x21, $x22, $x23, $x24
				; CHECK-LABEL: name: test2
				; CHECK: liveins: $x3, $x18, $x19, $x20, $x21, $x22, $x23, $x24
				; CHECK-NEXT: {{ $}}
				; CHECK-NEXT: renamable $x21 = COPY renamable $x20
				; CHECK-NEXT: renamable $x20 = COPY renamable $x21
				; CHECK-NEXT: renamable $x25 = COPY renamable $x24
				; CHECK-NEXT: renamable $x24 = COPY renamable $x25
				; CHECK-NEXT: BLR8 implicit $lr8, implicit undef $rm, implicit $x3, implicit $x18, implicit $x19, implicit $x20, implicit $x21, implicit $x22, implicit $x23, implicit $x24, implicit $x25
				renamable $x21 = COPY renamable $x20
				renamable $x20 = COPY renamable $x19
				renamable $x19 = COPY renamable $x18
				renamable $x18 = COPY renamable $x19
				renamable $x19 = COPY renamable $x20
				renamable $x20 = COPY renamable $x21
				renamable $x25 = COPY renamable $x24
				renamable $x24 = COPY renamable $x23
				renamable $x23 = COPY renamable $x22
				renamable $x22 = COPY renamable $x23
Inline comment (qcolombet, marked Done): Add a test with regmasks.
				renamable $x23 = COPY renamable $x24
				renamable $x24 = COPY renamable $x25
				BLR8 implicit $lr8, implicit undef $rm, implicit $x3, implicit $x18, implicit $x19, implicit $x20, implicit $x21, implicit $x22, implicit $x23, implicit $x24, implicit $x25

				...

				# Reorder code in test2, thus we have two chains in build simultaneously.
				---
				name: test3
				alignment: 4
				tracksRegLiveness: true
				body: |
				bb.0.entry:
				liveins: $x3, $x18, $x19, $x20, $x21, $x22, $x23, $x24
				; CHECK-LABEL: name: test3
				; CHECK: liveins: $x3, $x18, $x19, $x20, $x21, $x22, $x23, $x24
				; CHECK-NEXT: {{ $}}
				; CHECK-NEXT: renamable $x21 = COPY renamable $x20
				; CHECK-NEXT: renamable $x25 = COPY renamable $x24
				; CHECK-NEXT: renamable $x20 = COPY renamable $x21
				; CHECK-NEXT: renamable $x24 = COPY renamable $x25
				; CHECK-NEXT: BLR8 implicit $lr8, implicit undef $rm, implicit $x3, implicit $x18, implicit $x19, implicit $x20, implicit $x21, implicit $x22, implicit $x23, implicit $x24, implicit $x25
				renamable $x21 = COPY renamable $x20
				renamable $x25 = COPY renamable $x24
				renamable $x20 = COPY renamable $x19
				renamable $x24 = COPY renamable $x23
				renamable $x19 = COPY renamable $x18
				renamable $x23 = COPY renamable $x22
				renamable $x18 = COPY renamable $x19
				renamable $x22 = COPY renamable $x23
				renamable $x19 = COPY renamable $x20
				renamable $x23 = COPY renamable $x24
				renamable $x20 = COPY renamable $x21
				renamable $x24 = COPY renamable $x25
				BLR8 implicit $lr8, implicit undef $rm, implicit $x3, implicit $x18, implicit $x19, implicit $x20, implicit $x21, implicit $x22, implicit $x23, implicit $x24, implicit $x25

				...

				---
				name: test4
				alignment: 4
				tracksRegLiveness: true
				body: |
				bb.0.entry:
				liveins: $x3, $x4, $x5
				; CHECK-LABEL: name: test4
				; CHECK: liveins: $x3, $x4, $x5
				; CHECK-NEXT: {{ $}}
				; CHECK-NEXT: BLR8 implicit $lr8, implicit undef $rm, implicit $x3
				renamable $x5 = COPY renamable $x3
				renamable $x4 = COPY renamable $x3
				renamable $x2 = COPY renamable $x3
				renamable $x3 = COPY renamable $x2
				renamable $x3 = COPY renamable $x4
				renamable $x3 = COPY renamable $x5
				BLR8 implicit $lr8, implicit undef $rm, implicit $x3

				...

				# Chain across regmask.
				---
				name: test5
				alignment: 4
				tracksRegLiveness: true
				body: |
				bb.0.entry:
				liveins: $x17, $x16, $x15, $x14, $x3
				; CHECK-LABEL: name: test5
				; CHECK: liveins: $x17, $x16, $x15, $x14, $x3
				; CHECK-NEXT: {{ $}}
				; CHECK-NEXT: renamable $x18 = COPY renamable $x17
				; CHECK-NEXT: $x17 = COPY renamable $x3
				; CHECK-NEXT: BL8_NOP @foo, csr_ppc64, implicit-def dead $lr8, implicit $rm, implicit-def $x3, implicit $x3
				; CHECK-NEXT: renamable $x3 = COPY $x17
				; CHECK-NEXT: BLR8 implicit $lr8, implicit undef $rm, implicit $x3
				renamable $x18 = COPY renamable $x17
				renamable $x17 = COPY renamable $x16
				renamable $x16 = COPY renamable $x15
				renamable $x15 = COPY renamable $x14
				renamable $x14 = COPY renamable $x3
				BL8_NOP @foo, csr_ppc64, implicit-def dead $lr8, implicit $rm, implicit-def $x3, implicit $x3
				renamable $x3 = COPY renamable $x14
				renamable $x14 = COPY renamable $x15
				renamable $x15 = COPY renamable $x16
				renamable $x16 = COPY renamable $x17
				renamable $x17 = COPY renamable $x18
				BLR8 implicit $lr8, implicit undef $rm, implicit $x3

				...

				# Two chains across regmask.
				---
				name: test6
				alignment: 4
				tracksRegLiveness: true
				body: |
				bb.0.entry:
				liveins: $x20, $x19, $x17, $x16, $x15, $x14, $x3, $x4
				; CHECK-LABEL: name: test6
				; CHECK: liveins: $x20, $x19, $x17, $x16, $x15, $x14, $x3, $x4
				; CHECK-NEXT: {{ $}}
				; CHECK-NEXT: renamable $x21 = COPY renamable $x20
				; CHECK-NEXT: renamable $x18 = COPY renamable $x17
				; CHECK-NEXT: $x17 = COPY renamable $x3
				; CHECK-NEXT: $x20 = COPY renamable $x4
				; CHECK-NEXT: BL8_NOP @foo, csr_ppc64, implicit-def dead $lr8, implicit $rm, implicit-def $x3, implicit $x3, implicit-def $x4, implicit $x4
				; CHECK-NEXT: renamable $x3 = COPY $x17
				; CHECK-NEXT: renamable $x4 = COPY $x20
				; CHECK-NEXT: BLR8 implicit $lr8, implicit undef $rm, implicit $x3, implicit $x4
				renamable $x21 = COPY renamable $x20
				renamable $x18 = COPY renamable $x17
				renamable $x17 = COPY renamable $x16
				renamable $x16 = COPY renamable $x15
				renamable $x20 = COPY renamable $x19
				renamable $x15 = COPY renamable $x14
				renamable $x14 = COPY renamable $x3
				renamable $x19 = COPY renamable $x4
				BL8_NOP @foo, csr_ppc64, implicit-def dead $lr8, implicit $rm, implicit-def $x3, implicit $x3, implicit-def $x4, implicit $x4
				renamable $x3 = COPY renamable $x14
				renamable $x14 = COPY renamable $x15
				renamable $x4 = COPY renamable $x19
				renamable $x15 = COPY renamable $x16
				renamable $x19 = COPY renamable $x20
				renamable $x16 = COPY renamable $x17
				renamable $x20 = COPY renamable $x21
				renamable $x17 = COPY renamable $x18
				BLR8 implicit $lr8, implicit undef $rm, implicit $x3, implicit $x4

				...
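
For readers who want to check the folding rule from the summary by hand, here is a small standalone model. It is a sketch only and not the code in MachineCopyPropagation.cpp: registers are plain integers, `Copy`, `isMirroredChain`, `foldChain` and `run` are hypothetical names, and it assumes the intermediate registers of the chain are neither read nor written between the spill half and the reload half.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// One register-to-register copy: Dst = COPY Src. Registers are plain ints.
struct Copy {
  int Dst;
  int Src;
};

inline bool operator==(const Copy &A, const Copy &B) {
  return A.Dst == B.Dst && A.Src == B.Src;
}

// True if Spills forms a chain (each copy overwrites the register that the
// previous copy just saved) and Reloads undoes it in exactly reverse order.
bool isMirroredChain(const std::vector<Copy> &Spills,
                     const std::vector<Copy> &Reloads) {
  const std::size_t N = Spills.size();
  if (N == 0 || Reloads.size() != N)
    return false;
  for (std::size_t I = 1; I < N; ++I)
    if (Spills[I].Dst != Spills[I - 1].Src)
      return false;
  for (std::size_t J = 0; J < N; ++J) {
    const Copy &S = Spills[N - 1 - J];
    if (Reloads[J].Dst != S.Src || Reloads[J].Src != S.Dst)
      return false;
  }
  return true;
}

// Folds a mirrored chain down to two spill copies and two reload copies.
// Chains of length <= 2 are returned unchanged: there is nothing to remove.
std::pair<std::vector<Copy>, std::vector<Copy>>
foldChain(const std::vector<Copy> &Spills, const std::vector<Copy> &Reloads) {
  assert(isMirroredChain(Spills, Reloads) && "not a spill-reload chain");
  const std::size_t N = Spills.size();
  if (N <= 2)
    return {Spills, Reloads};
  // Keep the outermost spill; save the innermost value into the register
  // written by the second spill copy.
  std::vector<Copy> NewSpills = {Spills[0],
                                 {Spills[1].Dst, Spills[N - 1].Src}};
  // Restore the innermost register from that same register; keep the
  // outermost reload.
  std::vector<Copy> NewReloads = {{Reloads[0].Dst, Reloads[N - 2].Src},
                                  Reloads[N - 1]};
  return {NewSpills, NewReloads};
}

// Applies copies in order to a register file, for checking equivalence.
void run(std::vector<int> &Regs, const std::vector<Copy> &Copies) {
  for (const Copy &C : Copies)
    Regs[C.Dst] = Regs[C.Src];
}
```

Feeding it the chain from the summary (spills `r0 = r1`, `r1 = r2`, `r2 = r3`, `r3 = r4` and the mirrored reloads) yields `r0 = r1`, `r1 = r4` and `r4 = r1`, `r1 = r0`. Running both versions through `run` with a write to `r4` in between leaves the register file in the same final state.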

Revision Contents

Diff 495713