
[Greedy RegAlloc] Add logic to greedy reg alloc to avoid bad eviction chains
ClosedPublic

Authored by myatsina on Jul 24 2017, 2:01 PM.

Details

Summary

This fixes Bugzilla 26810:
https://bugs.llvm.org/show_bug.cgi?id=26810

This is intended to prevent sequences like:
movl %ebp, 8(%esp) # 4-byte Spill
movl %ecx, %ebp
movl %ebx, %ecx
movl %edi, %ebx
movl %edx, %edi
cltd
idivl %esi
movl %edi, %edx
movl %ebx, %edi
movl %ecx, %ebx
movl %ebp, %ecx
movl 16(%esp), %ebp # 4-byte Reload

Such sequences are created in 2 scenarios:

Scenario #1:
vreg0 is evicted from physreg0 by vreg1
Evictee vreg0 is intended for region splitting with split candidate physreg0 (the reg vreg0 was evicted from)
Region splitting creates a local interval because of interference with the evictor vreg1 (normally region splitting creates two intervals, the "by reg" and "by stack" intervals; a local interval is created when interference occurs)
one of the split intervals ends up evicting vreg2 from physreg1
Evictee vreg2 is intended for region splitting with split candidate physreg1
one of the split intervals ends up evicting vreg3 from physreg2, etc., until someone spills

Scenario #2
vreg0 is evicted from physreg0 by vreg1
vreg2 is evicted from physreg2 by vreg3, etc.
Evictee vreg0 is intended for region splitting with split candidate physreg1
Region splitting creates a local interval because of interference with the evictor vreg1
one of the split intervals ends up evicting back original evictor vreg1 from physreg0 (the reg vreg0 was evicted from)
Another evictee vreg2 is intended for region splitting with split candidate physreg1
one of the split intervals ends up evicting vreg3 from physreg2, etc., until someone spills
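The cascade in both scenarios can be sketched with a toy model (this is illustrative Python, not LLVM code; the vreg/physreg names mirror the scenario descriptions and are invented): each split's local interval evicts the occupant of the next register, and the chain only ends at a free register or at a spill.

```python
# Toy model of an eviction chain: every region split creates a local interval
# that evicts the occupant of its split candidate, which then itself needs
# splitting, and so on until either a free register or a spill stops the chain.

def eviction_chain(assignments, first_evictee, num_physregs):
    """assignments: dict physreg -> vreg currently holding it (None/absent = free)."""
    chain = []
    evictee = first_evictee
    for physreg in range(num_physregs):
        occupant = assignments.get(physreg)
        if occupant is None:
            # A free register ends the chain without a spill.
            return chain, None
        # Splitting `evictee` around `physreg` evicts its occupant,
        # which becomes the next evictee.
        chain.append((evictee, physreg, occupant))
        evictee = occupant
    # No candidate left: the last evictee must spill.
    return chain, evictee

# With every register occupied, each split just shifts the problem one
# register over, producing the shuffle of copies shown in the assembly above.
chain, spilled = eviction_chain({0: "vreg1", 1: "vreg2", 2: "vreg3"}, "vreg0", 3)
print(chain)    # [('vreg0', 0, 'vreg1'), ('vreg1', 1, 'vreg2'), ('vreg2', 2, 'vreg3')]
print(spilled)  # vreg3
```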

As compile time was a concern, I've added a flag to control whether we do cost calculations for local intervals we expect to be created (it's on by default for the X86 target, off for the rest).

Diff Detail

Repository
rL LLVM

Event Timeline

myatsina created this revision.Jul 24 2017, 2:01 PM
qcolombet requested changes to this revision.Jul 24 2017, 3:44 PM

Hi,

The greedy allocator is already very complicated and I am not sure the additional complexity of the eviction tracking is worth it.
Is it something that could be cleaned up in machine copy propagation? The problem is very local so that sounds doable.

I will have a closer look at the patch because fixing the problem from the start is obviously better than patching up later, but given how rare that problem is, I really believe exploring other, less complex avenues is worthwhile.

Cheers,
Quentin

test/CodeGen/X86/bug26810.ll
1 ↗(On Diff #107959)

Could you use a .mir test to make the test more robust?

3 ↗(On Diff #107959)

That sounds wrong for a new test.

Testing should be positive as much as possible IMO.

This revision now requires changes to proceed.Jul 24 2017, 3:44 PM

Thank you for suggesting the machine copy propagation, I've started working on this direction, it definitely seems easier to implement it there.
On the other hand, if I understood correctly, one of the issues with the old LLVM register allocator (linear scan) was that it made a lot of decisions that the rewriter had to clean up afterwards, and Greedy was intended to avoid such decisions. I'm not sure if this eviction chain falls under this category or not.

Thanks,
Marina

test/CodeGen/X86/bug26810.ll
1 ↗(On Diff #107959)

Will do.

3 ↗(On Diff #107959)

I wasn't very satisfied with this check as well.
I'll make it into a positive test indeed.

I've checked the copy propagation pass feasibility -
I was able to catch a few new cases (probably because the increase in weight I did in my Greedy patch wasn't high enough, but that's a heuristic and we might be able to tune it).
On the other hand, I'm now failing to catch all the cases that cross basic blocks, because this pass works at a basic-block level.

Based on this I think the solution should probably be kept in Greedy (+ possibly additional cleanup in the copy propagation pass).

Thanks,
Marina

Based on this I think the solution should probably be kept in Greedy (+ possibly additional cleanup in the copy propagation pass).

Would a super block copy propagation pass work?
I believe the code in that pass should just work in such configuration.

When you say "super blocks" do you refer to restructuring the CFG (using tail duplication) and making the common path linear so that it can be combined into one large basic block?
I haven't really seen this concept in llvm (except for some "SuperBlock" in debug info, which seems to be unrelated), so if you have some references for me that would be great.

If we're talking about somehow flattening and "ordering" the BBs, control flow and loops, and scanning them looking for cross-block chains, then I don't think it's something trivial.
It's not always legal to replace such chains (if someone uses or clobbers one of the registers in the middle of the chain I can no longer do the replacement).
Here's one example: I cannot do the replacement "xmm0 = copy xmm3" + "xmm3 = copy xmm0", because if I reach bb2 from bb1 then xmm0 is part of the copy chain, but if I reach bb2 from bb3, then it is not.

bb1:
xmm0 = copy xmm1
// fall through bb2

bb2:
xmm1 = copy xmm2
xmm2 = copy xmm3
...
xmm3 = copy xmm2
xmm2 = copy xmm1
xmm1 = copy xmm0
test
je bb3

bb3:
xmm0 = /* something */
test
je bb2

In order to properly identify this I need to do liveness analysis for each reg suspected to be in the copy chain. I need to check if any clobbering (or even use of an "intermediate" value) might reach one of the BBs the chain is spread across - if so, I cannot do replacement.

  • Also, I may have several suspected chains in parallel which complicates it even more.
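The path sensitivity of this legality check can be sketched as follows (illustrative Python, not the actual MachineCopyPropagation pass; the CFG encoding and instruction tuples are invented). The idea: if more than one definition of a chain register can reach the top of a block, the copies cannot be removed.

```python
# Sketch of the cross-block check: a copy chain is only removable if *every*
# path into the block carries the chain's value, so a single clobbering
# predecessor kills the transformation.

def defs_reaching(cfg, instrs, block, reg):
    """Return the set of instructions defining `reg` that can reach the top
    of `block`, walking the predecessor edges in `cfg`."""
    seen, out, work = set(), set(), list(cfg[block])
    while work:
        pred = work.pop()
        if pred in seen:
            continue
        seen.add(pred)
        last = next((i for i in reversed(instrs[pred]) if i[1] == reg), None)
        if last is not None:
            out.add(last)           # this predecessor defines reg itself
        else:
            work.extend(cfg[pred])  # keep walking up
    return out

# The example above: bb2 is reached from bb1 (where xmm0 holds the chained
# value) and from bb3 (where xmm0 is clobbered).
cfg = {"bb1": [], "bb2": ["bb1", "bb3"], "bb3": ["bb2"]}
instrs = {
    "bb1": [("copy", "xmm0", "xmm1")],
    "bb2": [("copy", "xmm1", "xmm0")],
    "bb3": [("def", "xmm0", None)],
}
reaching = defs_reaching(cfg, instrs, "bb2", "xmm0")
# Two different definitions of xmm0 reach bb2, so the chain is not removable.
print(len(reaching))  # 2
```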

Please let me know if I understood correctly the "super block copy propagation".

Thanks,
Marina

As far as I read (http://www.eecs.umich.edu/~mahlke/papers/1993/hwu_jsuper93.pdf), in order to create superblocks, we need to identify traces using execution profile information, and then do tail duplication to avoid multiple entrances.
According to the authors of this technique, this transformation by itself takes a significant amount of code and compile time.
I don’t think this transformation is something we should do only for the sake of machine copy propagation pass, as it adds significant complexity.
The decision to support this transformation, and the possible optimizations that can benefit from it, seems like an orthogonal discussion that is not directly related to the bad eviction chains I'm trying to solve.

Even if I do have some transformation to superblocks that have a single entry and multiple exits, I will still need to do liveness tracking over all possible paths to maintain correctness:

bb1:
xmm0 = copy xmm1
xmm1 = copy xmm2
xmm2 = copy xmm3
...

test
je bb3

bb2:
xmm3 = copy xmm2
xmm2 = copy xmm1
xmm1 = copy xmm0
return

bb3:
return

The path bb1->bb2 can benefit from the change, but it is not legal for me to make this change if a path like bb1->bb3 exists – I need to scan all paths.

For each suspected copy chain I will need to track a whole subtree in this superblock CFG which begins with the first copy of the chain.
I will need to make sure all possible paths from that first copy contain the whole chain and that there is no path that clobbers one of the registers in the middle of that chain.
So I find myself doing some sort of liveness tracking here too.
I know my original solution added complexity to Greedy, but Greedy’s decisions are the source of this issue, and it doesn’t seem like we have an elegant way to clean up the consequences of those decisions when we’re talking about cross-BB chains.

Thanks,
Marina

Have you had a chance to look at it yet?

Thanks,
Marina

Hi Marina,

Thanks for reminding me about this patch.

I was not able to look at it yet.
I will try to get to it in the next two weeks.

Cheers,
-Quentin

Hi Marina,

I had a quick look at the patch and I am not sure this is the right approach.
The patch tries to avoid splitting when it might be part of a bad eviction chain, but I would argue there is no such thing as a bad eviction chain. The evictions happened to relax the constraints on the allocation problem, and blocking the splitting won't help.
Now, unless Phabricator is not showing everything (it is acting weird since the patch is quite big), my understanding of the patch is that it actually does not prevent eviction chains; it just resorts to less fancy splitting heuristics, which happen to spread splitting decisions around instead of having them localized at region boundaries. Thus, we may still have eviction chains, but they may be harder to spot.

Generally speaking, split points are not bad. What is bad, though, is the fact that we make poor allocation decisions that prevent us from getting rid of them later. I would focus my effort on that front if I were you.
For instance, in the example from the PR, I believe the bunch of copies don't get coalesced because we choose the color for the less constrained live-range first. That is, if we were to allocate vreg79 first, then I believe vreg80 could use that as a hint, eliminating one of the copies. Same for vreg81 and vreg82 and so on.
However, what happened is that we allocated vreg80 first, and the allocation order makes it such that vreg79 won't be able to satisfy the hint, since what we pick interferes with what vreg79 can use. Given we have the same structure before and after the split point, the live-ranges get allocated in the same order. Thus, the first mistake propagates through all the live-ranges (vreg79 prevents vreg81 from satisfying its hint, and so on).
For instance, if we were to delay vreg79, I believe we would satisfy all hints.
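The ordering effect described here can be reproduced with a small toy allocator (illustrative Python; the vreg names echo the PR discussion but the interference, hints, and register order are invented to make the point): the same greedy assignment either satisfies or breaks a copy hint depending purely on which live-range is colored first.

```python
# Toy greedy assignment: try the copy partner's register first (if it is
# already assigned and not forbidden by interference), else the first
# register in the allocation order.

def allocate(order, hints, interference, assigned, physregs):
    assign = dict(assigned)  # pre-colored ranges (e.g. fixed interferences)
    for v in order:
        forbidden = {assign[u] for u in interference.get(v, ()) if u in assign}
        prefs = [assign[h] for h in (hints.get(v),) if h in assign]
        prefs += [r for r in physregs if r not in prefs]
        assign[v] = next(r for r in prefs if r not in forbidden)
    return assign

physregs = ["r1", "r0"]                  # default allocation order: r1 first
interference = {"vreg79": ["fixed"]}     # vreg79 cannot share with "fixed"
assigned = {"fixed": "r1"}               # so vreg79 can only use r0
hints = {"vreg79": "vreg80", "vreg80": "vreg79"}  # a copy connects them

# vreg80 first: it greedily grabs r1, and vreg79 (forbidden r1) ends up in
# r0 -> the hint is broken and the copy survives.
print(allocate(["vreg80", "vreg79"], hints, interference, assigned, physregs))

# vreg79 (the constrained range) first: it takes r0, and vreg80 follows the
# hint into r0 -> the copy can be coalesced.
print(allocate(["vreg79", "vreg80"], hints, interference, assigned, physregs))
```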

There is a lot of speculation in what I described for the example from the PR. I will try to spend some time verifying if any of those changes would indeed fix this problem.
In particular, I believe the problem can be solved with some tweaks in:

  • TargetRegisterInfo::getRegAllocationHints (e.g., we could give a hint to vreg80 so that it avoids vreg79 interferences)
  • Priority when enqueueing live-ranges
  • Consider the cost of using a register that is going to create a broken hint down the road when assigning a color (similar idea to the first item)
  • Try to reconcile hints that are in the same region instead of one at a time

The last point is probably the one that is going to least affect existing code and thus would probably be the easiest to qualify.

Cheers,
-Quentin

For the record, playing with the order helps but to do the optimal (local) coloring here I would need to spend more time on the heuristic.
Again the easiest fix is probably the hints reconciling by region.

Hi Quentin,

I wouldn’t say my patch tries to avoid splitting, but rather that it tries to improve the calculation of the spill weight of split candidates:
When the register allocator decides to do a region split, it looks for the best physical register candidate for the split.
The best candidate is the one that will cause the minimal spill cost.
When calculating the spill cost of each candidate, the algorithm takes into account interferences at the entry/exit of the basic blocks.
However, there may be interference local to a basic block which, later, during the split itself, will cause the creation of a new local interval (local to the basic block) on top of the “by reg” and “by stack” intervals that are created during the split.
The algorithm currently ignores the fact that this local interval may cause spills (and thus may increase the spill weight of this candidate for the split).

My solution is to try to predict whether this split candidate will cause the creation of local intervals and whether they in turn will cause spills, and to add their spill weights to the total weight.
By doing so, I try to make the spill weight calculation of each candidate more accurate and allow the algorithm to choose a more suitable candidate.

If a local interval is created then we have a few options for its allocation:

  1. The interval will be allocated to some free reg – no additional spill cost needed.
  2. The interval may cause an eviction – in some cases this eviction is "bad" and guaranteed to cause a spill (it's "bad" when you're evicting the interval that evicted you, kind of like a cat-and-mouse game - somebody must lose here) - in this patch I'm trying to predict whether it's "bad" or not, and incorporate the spill weight of this interval.
  3. The interval may spill – I've already encountered a case where the new local interval is in a hot loop and ends up spilling around all uses – this spill cost wasn't considered when the candidate was chosen. I have a solution for this case which is based on parts of this patch.
  4. The interval may split – I guess there might be some spill cost to consider here as well, but I didn’t explore this case yet.
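The adjustment described above could be folded into the candidate choice roughly as follows (a simplified sketch, not the actual RAGreedy cost code; the numbers, dictionary layout, and field names are invented for illustration):

```python
# Sketch: add the predicted spill weight of local intervals a split would
# create into each candidate's cost before picking the cheapest one.

def candidate_cost(boundary_cost, local_intervals, predict_local=True):
    cost = boundary_cost
    if predict_local:
        for itv in local_intervals:
            # Options 2 and 3 above: a local interval that triggers a "bad"
            # eviction, or that must spill (e.g. in a hot loop), adds its
            # own spill weight to the candidate.
            if itv["causes_bad_eviction"] or itv["must_spill"]:
                cost += itv["spill_weight"]
    return cost

candidates = {
    # physreg0 looks cheaper at the block boundaries, but its split would
    # create a hot local interval that is predicted to trigger a bad eviction.
    "physreg0": (10.0, [{"causes_bad_eviction": True, "must_spill": False,
                         "spill_weight": 50.0}]),
    "physreg1": (15.0, []),
}

def best(candidates, predict_local):
    return min(candidates,
               key=lambda r: candidate_cost(*candidates[r], predict_local))

print(best(candidates, predict_local=False))  # physreg0 (boundary cost only)
print(best(candidates, predict_local=True))   # physreg1 (15.0 < 10.0 + 50.0)
```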

I did see nice performance results with my current solution.
I will try to look into the hint reconciling as well, but I do think that the current spill weight calculation of the split candidates is not accurate enough and we need to consider the effects of those local intervals.

Thanks,
Marina

gberry added a subscriber: gberry.Oct 11 2017, 1:16 PM
myatsina updated this revision to Diff 119158.Oct 16 2017, 7:41 AM
myatsina edited edge metadata.
myatsina edited the summary of this revision. (Show Details)

As compile time was a concern, I've added a flag to control whether we do cost calculations for local intervals we expect to be created (it's on by default for the X86 target, off for the rest).
I've fixed the tests and some comments.

Do you have additional comments?

qcolombet accepted this revision.Oct 16 2017, 8:57 AM

Hi Marina,

Looks reasonable to me.
Do a second pass before committing to make sure everything follows the LLVM coding standards. I've highlighted a few problems.

I can help with that if you wish, but I figured you probably don't want to wait for me to do that :P.

Cheers,
-Quentin

lib/CodeGen/RegAllocGreedy.cpp
302 ↗(On Diff #119158)

brief*

303 ↗(On Diff #119158)

Use lower case for the first letter for methods.

310 ↗(On Diff #119158)

Ditto

1377 ↗(On Diff #119158)

Check*

This revision is now accepted and ready to land.Oct 16 2017, 8:57 AM
This revision was automatically updated to reflect the committed changes.
myatsina marked 4 inline comments as done.