This is an archive of the discontinued LLVM Phabricator instance.


Differential D64630

[DebugInfo] Address performance regression with r364515
ClosedPublic

Authored by jmorse on Jul 12 2019, 6:16 AM.

Details

Reviewers

aprantl
vsk
probinson
echristo
rupprecht
bjope
qcolombet

Commits

rGd9c9a4e48d28: [DebugInfo] Avoid register coalesing unsoundly changing DBG_VALUE locations

Summary

Hi; hot on the heels of D56151 being committed in r364515 to fix PR40010, I wound up reverting it as there were reports of performance regressions on llvm-commits. Building AMDGPUDisassembler.cpp with ASAN enabled showed that register coalescing jumped from three seconds to forty seconds. This patch cuts down on the performance cost (there's still a little) while preserving the original behaviour. It is, alas, ugly.

I believe the root cause of the performance problem is that DBG_VALUE insts don't get an entry in the SlotIndex map of Insts to Slots. The current code iterates through each DBG_VALUE for a register, getting its slot, then runs a liveness query. However, getting the slot requires a forwards-walk through the block until a non-debug instruction is found. With ASAN, packs of up to 800 DBG_VALUEs in a row appear (for AMDGPUDisassembler.cpp), each of which gets examined, which ends up having quadratic complexity.
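To make the cost concrete, here is a toy model of the lookup pattern described above. It is not LLVM code: `Instr`, `stepsToFindSlot`, `totalLookupSteps` and `makePack` are hypothetical names, and the only behaviour modelled is "each debug instruction walks forwards to the next non-debug instruction".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a machine instruction: it only records whether it is a
// debug instruction (which, per the summary, has no slot of its own).
struct Instr {
  bool IsDebug;
};

// Number of instructions skipped when walking forwards from position I to
// the next non-debug instruction, whose slot would be used.
std::size_t stepsToFindSlot(const std::vector<Instr> &Block, std::size_t I) {
  std::size_t Steps = 0;
  while (I < Block.size() && Block[I].IsDebug) {
    ++I;
    ++Steps;
  }
  return Steps;
}

// Total work if every debug instruction performs its own forward walk.
std::size_t totalLookupSteps(const std::vector<Instr> &Block) {
  std::size_t Total = 0;
  for (std::size_t I = 0; I < Block.size(); ++I)
    if (Block[I].IsDebug)
      Total += stepsToFindSlot(Block, I);
  return Total;
}

// Build a pack of N debug instructions followed by one real instruction.
std::vector<Instr> makePack(std::size_t N) {
  std::vector<Instr> Block(N, Instr{true});
  Block.push_back(Instr{false});
  return Block;
}
```

In this model a pack of N debug instructions costs N(N+1)/2 steps, so a pack of 800 costs 320,400 steps where a single pass over the block would visit 801 instructions.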

The solution is to not look up slots from DBG_VALUEs, but to instead iterate over slots and check nearby DBG_VALUEs. To this end, I've replaced the per-dbg-value query method with some helper functions that scan a basic block for unsound DBG_VALUEs and maintain a copy of the current SlotIndex instead of looking it up for each DBG_VALUE.
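A minimal sketch of that single-pass idea, under stated assumptions: this is a toy model rather than the patch itself, `markDeadDbgValuesUndef` is a hypothetical name, and liveness is supplied as a plain set of slots after which the register is live instead of a real liveness query.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Toy instruction. Non-debug instructions carry a slot number; debug
// instructions carry the register they describe and an "undef" flag.
struct Instr {
  bool IsDebug;
  unsigned Slot; // meaningful only when !IsDebug
  unsigned Reg;  // meaningful only when IsDebug
  bool Undef;    // meaningful only when IsDebug
};

// One forward pass over the block. LiveSlots is the (hypothetical) set of
// slots after which Reg is live; LiveIn says whether Reg is live on entry.
// Returns how many DBG_VALUEs of Reg were marked undef.
unsigned markDeadDbgValuesUndef(std::vector<Instr> &Block, unsigned Reg,
                                const std::set<unsigned> &LiveSlots,
                                bool LiveIn) {
  bool RegIsLive = LiveIn;
  unsigned NumUndef = 0;
  for (Instr &MI : Block) {
    if (!MI.IsDebug) {
      // Update the cached liveness once per real instruction.
      RegIsLive = LiveSlots.count(MI.Slot) != 0;
      continue;
    }
    // Debug instruction: reuse the cached liveness, no slot lookup needed.
    if (MI.Reg == Reg && !RegIsLive && !MI.Undef) {
      MI.Undef = true;
      ++NumUndef;
    }
  }
  return NumUndef;
}
```

Each instruction is visited once, so a long pack of DBG_VALUEs costs time proportional to its length rather than to its square.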

This isn't pleasant, and just writing more C++ to fix problems isn't great, but:

We now have to query liveness to ensure DBG_VALUE soundness,
I imagine that indexing DBG_VALUE -> slots would be either expensive or risk causing codegen changes with -g,
We're going to wind up iterating over instructions at some point to fix this.

(Note that r364515 was reverted in r365448, this diff is based on r364515, as opposed to what's on trunk right now, to show the "old" and "new" implementation).

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

jmorse created this revision. Jul 12 2019, 6:16 AM

Herald added a project: Restricted Project. Jul 12 2019, 6:16 AM

Herald added subscribers: llvm-commits, tpr, qcolombet, MatzeB.

With ASAN, packs of up to 800 DBG_VALUEs in a row appear (for that file)

That seems excessive; in the past when I've seen this explosion of DBG instructions, the vast majority were redundant. Have you looked at them to see if this is the case? We might rather eliminate duplicates than write code to make it cheaper to have lots of unnecessary instructions.

In D64630#1583174, @probinson wrote:

With ASAN, packs of up to 800 DBG_VALUEs in a row appear (for that file)

That seems excessive; in the past when I've seen this explosion of DBG instructions, the vast majority were redundant. Have you looked at them to see if this is the case? We might rather eliminate duplicates than write code to make it cheaper to have lots of unnecessary instructions.

dbg.value packs are a major issue, ASan aside. I wrote this proposal a while back and never followed through on it:
http://lists.llvm.org/pipermail/llvm-dev/2018-October/127228.html

With the lack of follow-through on my part, I think we should take Chris's suggestion of changing our representation to multiplex dbg.value, so that one instruction can describe an arbitrary number of variable locations without increasing the number of instructions that other passes have to iterate over.

Happily D58453 killing off a large amount of placeDbgValues activity significantly reduces DBG_VALUE grouping -- I don't have the numbers to hand, but I would say the density was almost an order of magnitude lower. The largest pack in the benchmark I referred to was about 120, and other large packs occurred much less frequently.

Reid wrote:

dbg.value packs are a major issue, ASan aside. I wrote this proposal a while back and never followed through on it:
http://lists.llvm.org/pipermail/llvm-dev/2018-October/127228.html

With the lack of follow-through on my part, I think we should take Chris's suggestion of changing our representation to multiplex dbg.value, so that one instruction can describe an arbitrary number of variable locations without increasing the number of instructions that other passes have to iterate over.

While this review isn't the place, this is definitely an area I'd want to invest time into, debug-info in the instruction stream is a frequent pain.

Ping -- this is the last thing (TM) blocking placeDbgValues being removed, more or less. I've edited the summary to make more sense; as a brief recap:

D56151 was about dropping variable locations where we couldn't guarantee that RegisterCoalescing would correctly preserve locations,
The patch made asan builds extremely slow, and was reverted,
This new patch re-implements the idea, avoiding those lookups.

More info in the summary.

aprantl added inline comments. Oct 25 2019, 3:02 PM

lib/CodeGen/RegisterCoalescer.cpp
1978 ↗	(On Diff #209470)	///
2003 ↗	(On Diff #209470)	if (MI.isDebugValue()) { if (!RegIsLive && MI.getOperand(0).isReg() && MI.getOperand(0).getReg() == Reg) { } return; } // If not, update current liveness record. SlotIndex Slot = Slots.getInstructionIndex(MI);
2015 ↗	(On Diff #209470)	return is redundant
3362 ↗	(On Diff #209470)	///
3391 ↗	(On Diff #209470)	This has so much identical boilerplate with CheckDbgValuesInBlockForPhysReg that I wonder if it should be a generalized version that takes a std::function for code that is different?

In your summary you mentioned that:

However, getting the slot requires a forwards-walk through the block until a non-debug instruction is found.

Is it possible to just speed that part up? If so, we could just keep the simpler mergingChangesDbgValue implementation.

Possible options include: 1) computing a slot indexes structure for dbg_values on the side or 2) changing the MIR representation (akin to D51664, or introducing DBG_VALUE_PACKs -- @rnk suggested this a few comments up already, but I assume this is a stretch?). It might be nice to do something along these lines, since the approach in this patch looks like it can walk each MBB in the program (twice) per coalesce pair? Here's a hare-brained sketch of (1):

DVNumbers = map of DBG_VALUE to number
DVSlots = map of number interval to Slot
CurSlot = nullptr
DVPackLen = 0
for (index, MI) in enumerate(rpot(MF)):
  if MI is DBG_VALUE:
    DVNumbers[MI] = index
    DVPackLen++
  else:
    CurSlot = slot for the current MI
    if DVPackLen > 0:
      DVSlots[{index - DVPackLen, index}] = CurSlot
      DVPackLen = 0
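One possible C++ rendering of the sketch above, offered as an illustration only. It flattens the function into a single instruction stream instead of iterating blocks in RPO, stores one entry per pack in a sorted vector, and looks entries up by binary search. `Instr`, `PackSlot`, `buildPackSlots` and `lookupSlot` are hypothetical names, not LLVM APIs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Instr {
  bool IsDebug;
  unsigned Slot; // meaningful only when !IsDebug
};

// A run of consecutive debug instructions [Begin, End) and the slot of the
// first non-debug instruction that follows it.
struct PackSlot {
  std::size_t Begin;
  std::size_t End;
  unsigned Slot;
};

// One pass over the instruction stream, recording a slot per pack rather
// than per DBG_VALUE. Packs are emitted in increasing Begin order.
std::vector<PackSlot> buildPackSlots(const std::vector<Instr> &Instrs) {
  std::vector<PackSlot> Packs;
  std::size_t PackLen = 0;
  for (std::size_t Index = 0; Index < Instrs.size(); ++Index) {
    if (Instrs[Index].IsDebug) {
      ++PackLen;
      continue;
    }
    if (PackLen > 0) {
      Packs.push_back({Index - PackLen, Index, Instrs[Index].Slot});
      PackLen = 0;
    }
  }
  // A trailing pack with no following real instruction gets no entry.
  return Packs;
}

// Find the slot for the debug instruction at position Index by binary
// search. Returns false if Index is not inside any recorded pack.
bool lookupSlot(const std::vector<PackSlot> &Packs, std::size_t Index,
                unsigned &SlotOut) {
  auto It = std::upper_bound(
      Packs.begin(), Packs.end(), Index,
      [](std::size_t I, const PackSlot &P) { return I < P.Begin; });
  if (It == Packs.begin())
    return false;
  --It; // last pack whose Begin is <= Index
  if (Index >= It->End)
    return false;
  SlotOut = It->Slot;
  return true;
}
```

The memory cost is one entry per pack rather than one per DBG_VALUE, which is the trade of "some memory for some performance" discussed in this thread.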

lib/CodeGen/RegisterCoalescer.cpp
3363 ↗	(On Diff #209470)	`CheckDbgValuesInBlock` looks like it's more about making dead dbg_values `undef` than about checking them. Let's call it `makeDeadDbgValsUndef`?

bjope added a subscriber: bjope. Oct 30 2019, 10:29 AM

bjope added inline comments. Oct 30 2019, 10:48 AM

lib/CodeGen/RegisterCoalescer.cpp
1978 ↗	(On Diff #209470)	Not clear to me, just reading the description and the argument list, to understand if `Reg` is the physical register or the other register being coalesced.
1982 ↗	(On Diff #209470)	I think we want Reg to be of type Register here (at least in the future, but maybe we can get it right from the start).
2004 ↗	(On Diff #209470)	Not sure if the old code cared much about it either, but I always wonder if we forget to consider sub registers when I see code that only looks at getReg() but not getSubReg() and not even mentions sub registers in any code comment. But since we are dealing with physical registers, then maybe getSubReg() is out-of-play here (although it is only one side of the coalescing pair that is physical).

Fold two helpers into one; review comments and formatting

Herald added a subscriber: hiraditya. Nov 7 2019, 5:12 AM

Bjorn wrote inline:

Not sure if the old code cared much about it either, but I always wonder if we forget to consider sub registers when I see code that only looks at getReg() but not getSubReg()

In this circumstance I think it's legitimate to ignore the subregisters -- we're not considering the precise location of a value with a reg/vreg, just whether the merging vregs are live or not. The riskiest circumstance would be a non-live DBG_VALUE-of-subreg being undef'd when a live value was merged into a different subregister within the same virtual register. This is conservative at the least; and I suspect that kind of merging would require an undef anyway, although I'm not overly familiar with subregisters.

Vedant wrote:

[A proposal for a different way of doing this]

I think that'd work, trading some memory for some performance. The approach in this patch seems to be Good Enough (TM) though, I don't observe any performance differences on a clang-3.4 RelWithDebInfo build, and only some fractional increases on the (pathological) ASan build. My preference is shipping this, and making DBG_VALUE_PACKs happen sometime soon, as that'll eliminate similar problems elsewhere (PR43855 recently bit me, for example).

I guess there are a few alternatives to consider that call for changing data structures. Perhaps it makes more sense to start with the approach taken here to address the performance issue and to keep an eye out for any more problems. BTW @jmorse is this patch still rebased on top of r364515? I don't see mergingChangesDbgValue anymore.

llvm/lib/CodeGen/RegisterCoalescer.cpp
3336–3337	IIUC there's no need to check for the case where Reg is live & OtherLiveness is not, because makeDeadDbgValsUndef is called once for each vreg in a coalesce pair. (Assuming that's correct) maybe that's worth a comment here, or in the function doc.

Correctly base patch on prior implementation

Vedant wrote:

I guess there are a few alternatives to consider that call for changing data structures. Perhaps it makes more sense to start with the approach taken here to address the performance issue and to keep an eye out for any more problems.

Indeed, I'd much prefer to design it out; the current scenario isn't ideal for a number of reasons.

BTW @jmorse is this patch still rebased on top of r364515? I don't see mergingChangesDbgValue anymore.

*blinks* ah yeah, the latest update should fix that.

+ bjope & Quentin

I think this looks reasonable, but am not yet familiar enough with RegisterCoalescer to confidently lgtm.

bjope added inline comments. Nov 9 2019, 6:08 AM

llvm/lib/CodeGen/RegisterCoalescer.cpp
3382	I think it is enough to do this if the next instruction isDebugValue. So it could perhaps speed up things if we only do these liveness calculations for the first MI in a sequence of DebugValue (or DebugInstr) instructions? One idea would be to keep track of if the previous instruction was a DebugValue, move these calculation into the `if (MI.isDebugValue)` above, and do it conditionally when the previous instruction wasn't a DebugValue. E.g. by saving the MachineInstr* pointing to the previous instruction (indicating that we should update RegIsLive/OtherIsLive) whenever a MI that isn't isDebugValue is found.
3384	Is it correct to use the RegSlot here? Maybe it should be getDeadSlot() to make sure the liveAt call below will get "live out" from the MI rather than "live in" (although I'm still learning about these slot indices myself).
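The first suggestion above (only query liveness when a run of DBG_VALUEs actually starts) could look roughly like the following toy model. The assumptions are loud ones: every debug instruction is taken to refer to the register of interest, the register is assumed not live on block entry, and `IsLiveAfter` is a hypothetical callback standing in for a liveness query at a slot. `scanWithLazyLiveness` is an illustrative name.

```cpp
#include <cassert>
#include <functional>
#include <vector>

struct Instr {
  bool IsDebug;
  unsigned Slot; // meaningful only when !IsDebug
  bool Undef;    // meaningful only when IsDebug
};

// Marks debug instructions undef when the register is not live after the
// preceding real instruction. Returns the number of liveness queries made.
unsigned scanWithLazyLiveness(
    std::vector<Instr> &Block,
    const std::function<bool(unsigned)> &IsLiveAfter) {
  const Instr *PendingMI = nullptr; // last real instruction, not yet queried
  bool RegIsLive = false;           // assumption: not live on block entry
  unsigned NumQueries = 0;
  for (Instr &MI : Block) {
    if (!MI.IsDebug) {
      PendingMI = &MI; // defer the query until a DBG_VALUE needs it
      continue;
    }
    if (PendingMI) {
      RegIsLive = IsLiveAfter(PendingMI->Slot);
      ++NumQueries;
      PendingMI = nullptr; // later DBG_VALUEs in this run reuse the result
    }
    if (!RegIsLive)
      MI.Undef = true;
  }
  return NumQueries;
}
```

With this shape, stretches of real instructions that are not followed by a DBG_VALUE cost no liveness queries at all.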

Sitrep on this -- Vedant's question about early-exits led to me digging further into the pass, and discovering even more bad assumptions I'd made. I've prototyped something based on Vedant's proposal of the other way of doing this; it'll have to wait until next week though.

jmorse marked an inline comment as done. Nov 18 2019, 12:50 PM

jmorse added inline comments.

llvm/lib/CodeGen/RegisterCoalescer.cpp
3336–3337	For completeness: the original intention here / site of this comment, was to detect an early exit. The assumption was that the register coalescer doesn't merge overlapping live ranges; and so if a block was completely covered by a live range, we could assume no invalid coalescing could occur. Digging into that however, it turns out the coalescer really does merge overlapping live ranges, which is great! But not for this implementation. New one up in a few moments; alas it's another redo :/

Hokay, here's another take on this problem. To recap, all my previous implementations were broken because of

a) performance problems looking up slot indexes for DBG_VALUEs, and
b) I wasn't aware that the coalescer will merge overlapping live ranges.

which I'll address in order below.

This implementation is inspired by Vedant's sketch -- we build a data structure for looking up DBG_VALUEs by slot index quickly. For this (bear with me) I've mapped VRegs -> a set of {SlotIndex,MachineInstr*} pairs. In the body of checkMergingChangesDbgValuesImpl, we can then use the set-order to simultaneously advance through:

The live ranges of the VReg being merged, and
The set of all DBG_VALUEs for the other VReg.

to identify those DBG_VALUEs at risk of unsound merging. This avoids having to perform a slot index lookup at all, at the cost of stepping through valid DBG_VALUEs in the process. For an asan build of clang-3.4 I don't observe any change in build time with this change. For the previous worst-case file, AMDGPUDisassembler.cpp on trunk/master, the register coalescing pass increases from 2.9 seconds to roughly 3.5 seconds (out of an overall compile time of 27 seconds). This is, IMO, quite good given that it's dealing with pathological conditions (the packs of 800 DBG_VALUEs in a row).
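The simultaneous walk described above is essentially a two-pointer merge over two sorted sequences. A minimal sketch follows, assuming plain unsigned slots and integer instruction ids in place of SlotIndex and MachineInstr*; `Segment` and `dbgValuesInsideRange` are hypothetical names, and the real function does more than collect candidates.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Half-open live segment [Start, End), in increasing slot order.
struct Segment {
  unsigned Start;
  unsigned End;
};

// {slot, instruction id} pairs, standing in for {SlotIndex, MachineInstr*}.
using DbgValueLoc = std::pair<unsigned, int>;

// Walk both sorted sequences together and return the ids of DBG_VALUEs whose
// slot falls inside a segment of the other register's live range. Those are
// the candidates that need a closer look; everything else is skipped.
std::vector<int> dbgValuesInsideRange(const std::vector<Segment> &Segs,
                                      const std::vector<DbgValueLoc> &Locs) {
  std::vector<int> AtRisk;
  std::size_t S = 0, D = 0;
  while (S < Segs.size() && D < Locs.size()) {
    unsigned Slot = Locs[D].first;
    if (Slot < Segs[S].Start) {
      ++D; // DBG_VALUE lies before this segment: nothing to check
    } else if (Slot >= Segs[S].End) {
      ++S; // segment ends before this DBG_VALUE: advance the range
    } else {
      AtRisk.push_back(Locs[D].second); // inside the segment
      ++D;
    }
  }
  return AtRisk;
}
```

Neither index ever moves backwards, so the walk is linear in the number of segments plus the number of DBG_VALUEs, with no per-instruction slot lookup.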

For the correctness issue, observe the ShouldUndef lambda in the patch. This identifies when overlapping ranges have been merged, and examines RegisterCoalescer's record of how it resolved the conflict (details in the comment). This probably wants attention from people who know the register allocator/coalescer -- the aim is to ensure DBG_VALUEs, which don't contribute to liveness, do not refer to a different live value-number after coalescing. We don't care if they refer to a non-live vreg. (Paging @andreadb, this is what I was going to mention, a uh, while ago).
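As a rough illustration of the decision being described, and not a copy of the lambda in the patch, the sketch below uses a local stand-in enum and a hypothetical free function `shouldUndef`. Treating an erased redundant copy as safe follows the later revision note in this thread; treating a kept value as safe is an assumption read from the CR_Keep doc comment in the diff; resolutions elided from the diff view are lumped into `Other`.

```cpp
#include <cassert>

// Local stand-in for the conflict resolutions visible in the diff. This is
// not the LLVM enum; elided enumerators are represented by Other.
enum class Resolution { Keep, Erase, Other, Unresolved, Impossible };

// Decide whether a DBG_VALUE of Reg at some slot should become undef.
//   OtherLive: is the register being merged with Reg live at that slot?
//   RegLive:   is Reg itself live at that slot?
//   Res:       how the conflict was resolved for Reg's value there
//              (only consulted when both registers are live).
bool shouldUndef(bool OtherLive, bool RegLive, Resolution Res) {
  // The merged register is not live here because of the other vreg, so the
  // DBG_VALUE cannot start referring to a different live value.
  if (!OtherLive)
    return false;
  // Only the other vreg is live: after merging, a previously dead location
  // would silently pick up the other vreg's value.
  if (!RegLive)
    return true;
  // Both live: trust the recorded resolution (assumption, see lead-in).
  return Res != Resolution::Keep && Res != Resolution::Erase;
}
```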

Running through llvm-dwarfdump --statistics, the same as before, a tiny fraction of variables go missing (<0.02%). This is fine IMO, as we're dropping locations that are broken.

Rough edges:

It's not great to make JoinVals and RegisterCoalescer friends; I'll introduce an accessor for examining conflict resolutions later, but I've burnt out of time today.
I haven't made an attempt to address vregs being merged with physregs in this patch -- if this seems to be going in the right direction, I'll extend it to that.

aprantl added inline comments. Nov 19 2019, 9:51 AM

llvm/lib/CodeGen/RegisterCoalescer.cpp
140	This sets off my "this looks expensive" alarm. Could this be more efficiently replaced by a `DenseMap` or a sorted `std::vector` + `std::lower_bound()`, or is this the best choice here? Similarly, should the `std::set` be something more compact?
345	`buildVRegToDbgValueMap()` or something more descriptive?
3483	This flip-flopping of DbgValuesSeen is hard to follow .. is there something more obvious that could be done? Otherwise, can it be documented?
3572	Ah. That might be the answer.

Revision addressing some comments and cutting down on datastructure sizes. Note that I've deleted two tests here: with the additional conflict resolution information, there are some non-live DBG_VALUEs that can be fixed. Specifically, CR_Erase indicating "this was a redundant and dead copy of the other vreg that we're erasing" is something that can be safely resolved.

Herald added a subscriber: mgrang. Nov 22 2019, 10:15 AM

jmorse marked 6 inline comments as done. Nov 22 2019, 10:19 AM

jmorse added inline comments.

llvm/lib/CodeGen/RegisterCoalescer.cpp
140	I was going for strong ordering guarantees -- but it turns out that it isn't really necessary to erase elements in the body of the pass, so a sorted vector works just fine. Thanks for the tip!
3483	I realised I could just rely on there not being any DBG_VALUEs in the ToInsert vector instead of explicitly tracking these things, so I've just deleted that flag.
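The sorted-vector approach mentioned in the reply above can be sketched as follows, assuming integer slots and instruction ids instead of SlotIndex and MachineInstr*. `finalize` and `firstAtOrAfter` are hypothetical helper names; the point is only that a vector sorted once supports the same ordered queries as a `std::set` as long as nothing is erased afterwards.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

using DbgValueLoc = std::pair<unsigned, int>; // {slot, instruction id}

// Sort once after the vector has been filled. Elements are never erased
// afterwards, so the order stays valid for the rest of the pass.
void finalize(std::vector<DbgValueLoc> &Locs) {
  std::sort(Locs.begin(), Locs.end());
}

// First entry whose slot is >= Slot, found by binary search.
std::vector<DbgValueLoc>::const_iterator
firstAtOrAfter(const std::vector<DbgValueLoc> &Locs, unsigned Slot) {
  return std::lower_bound(
      Locs.begin(), Locs.end(), Slot,
      [](const DbgValueLoc &L, unsigned S) { return L.first < S; });
}
```

Compared with a node-based set, the vector keeps its elements contiguous and needs no per-element allocation, which is the compactness the review comment was asking about.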

This looks nice now!

This revision is now accepted and ready to land. Nov 22 2019, 12:10 PM

Thanks for sticking with this through many revisions; hopefully this time it sticks.

Closed by commit rGd9c9a4e48d28: [DebugInfo] Avoid register coalesing unsoundly changing DBG_VALUE locations (authored by jmorse). Nov 25 2019, 5:54 AM

This revision was automatically updated to reflect the committed changes.

aheejin added a subscriber: aheejin. May 12 2023, 5:44 PM

aheejin added inline comments.

llvm/lib/CodeGen/RegisterCoalescer.cpp
146	Hello, this CL's been a while, but this structure doesn't seem to be used anywhere (in the current code as well). What is this for?

Herald added a project: Restricted Project. May 12 2023, 5:44 PM

aheejin mentioned this in D150606: [RegisterCoalescer] Remove DbgMergedVRegNums (NFC). May 15 2023, 1:14 PM

aheejin mentioned this in rG3eccb40fa983: [RegisterCoalescer] Remove DbgMergedVRegNums (NFC). May 18 2023, 4:03 PM

Revision Contents

Path	Size
llvm/lib/CodeGen/RegisterCoalescer.cpp	178 lines
llvm/test/DebugInfo/MIR/X86/regcoalescing-clears-dead-dbgvals.mir	145 lines

Diff 230887

llvm/lib/CodeGen/RegisterCoalescer.cpp

Show First 20 Lines • Show All 114 Lines • ▼ Show 20 Lines	static cl::opt<unsigned> LargeIntervalFreqThreshold(
"large-interval-freq-threshold", cl::Hidden,		"large-interval-freq-threshold", cl::Hidden,
cl::desc("For a large interval, if it is coalesed with other live "		cl::desc("For a large interval, if it is coalesed with other live "
"intervals many times more than the threshold, stop its "		"intervals many times more than the threshold, stop its "
"coalescing to control the compile time. "),		"coalescing to control the compile time. "),
cl::init(100));		cl::init(100));

namespace {		namespace {

		class JoinVals;

class RegisterCoalescer : public MachineFunctionPass,		class RegisterCoalescer : public MachineFunctionPass,
private LiveRangeEdit::Delegate {		private LiveRangeEdit::Delegate {
MachineFunction* MF = nullptr;		MachineFunction* MF = nullptr;
MachineRegisterInfo* MRI = nullptr;		MachineRegisterInfo* MRI = nullptr;
const TargetRegisterInfo* TRI = nullptr;		const TargetRegisterInfo* TRI = nullptr;
const TargetInstrInfo* TII = nullptr;		const TargetInstrInfo* TII = nullptr;
LiveIntervals *LIS = nullptr;		LiveIntervals *LIS = nullptr;
const MachineLoopInfo* Loops = nullptr;		const MachineLoopInfo* Loops = nullptr;
AliasAnalysis *AA = nullptr;		AliasAnalysis *AA = nullptr;
RegisterClassInfo RegClassInfo;		RegisterClassInfo RegClassInfo;

		/// Debug variable location tracking -- for each VReg, maintain an
		/// ordered-by-slot-index set of DBG_VALUEs, to help quick
		/// identification of whether coalescing may change location validity.
		using DbgValueLoc = std::pair<SlotIndex, MachineInstr*>;
		DenseMap<unsigned, std::vector<DbgValueLoc>> DbgVRegToValues;

		/// VRegs may be repeatedly coalesced, and have many DBG_VALUEs attached.
		/// To avoid repeatedly merging sets of DbgValueLocs, instead record
		/// which vregs have been coalesced, and where to. This map is from
		/// vreg => {set of vregs merged in}.
		DenseMap<unsigned, SmallVector<unsigned, 4>> DbgMergedVRegNums;

/// A LaneMask to remember on which subregister live ranges we need to call		/// A LaneMask to remember on which subregister live ranges we need to call
/// shrinkToUses() later.		/// shrinkToUses() later.
LaneBitmask ShrinkMask;		LaneBitmask ShrinkMask;

/// True if the main range of the currently coalesced intervals should be		/// True if the main range of the currently coalesced intervals should be
/// checked for smaller live intervals.		/// checked for smaller live intervals.
bool ShrinkMainRange = false;		bool ShrinkMainRange = false;

▲ Show 20 Lines • Show All 180 Lines • ▼ Show 20 Lines	class RegisterCoalescer : public MachineFunctionPass,
/// Optimizations should use this to make sure that deleted instructions		/// Optimizations should use this to make sure that deleted instructions
/// are always accounted for.		/// are always accounted for.
void deleteInstr(MachineInstr* MI) {		void deleteInstr(MachineInstr* MI) {
ErasedInstrs.insert(MI);		ErasedInstrs.insert(MI);
LIS->RemoveMachineInstrFromMaps(*MI);		LIS->RemoveMachineInstrFromMaps(*MI);
MI->eraseFromParent();		MI->eraseFromParent();
}		}

		/// Walk over function and initialize the DbgVRegToValues map.
		void buildVRegToDbgValueMap(MachineFunction &MF);

		/// Test whether, after merging, any DBG_VALUEs would refer to a
		/// different value number than before merging, and whether this can
		/// be resolved. If not, mark the DBG_VALUE as being undef.
		void checkMergingChangesDbgValues(CoalescerPair &CP, LiveRange &LHS,
		JoinVals &LHSVals, LiveRange &RHS,
		JoinVals &RHSVals);

		void checkMergingChangesDbgValuesImpl(unsigned Reg, LiveRange &OtherRange,
		LiveRange &RegRange, JoinVals &Vals2);

public:		public:
static char ID; ///< Class identification, replacement for typeinfo		static char ID; ///< Class identification, replacement for typeinfo

RegisterCoalescer() : MachineFunctionPass(ID) {		RegisterCoalescer() : MachineFunctionPass(ID) {
initializeRegisterCoalescerPass(*PassRegistry::getPassRegistry());		initializeRegisterCoalescerPass(*PassRegistry::getPassRegistry());
}		}

void getAnalysisUsage(AnalysisUsage &AU) const override;		void getAnalysisUsage(AnalysisUsage &AU) const override;
▲ Show 20 Lines • Show All 1,307 Lines • ▼ Show 20 Lines	if (IsUndef) {
// point so we have to potentially shrink the main range if the		// point so we have to potentially shrink the main range if the
// use was ending a live segment there.		// use was ending a live segment there.
LiveQueryResult Q = Int.Query(UseIdx);		LiveQueryResult Q = Int.Query(UseIdx);
if (Q.valueOut() == nullptr)		if (Q.valueOut() == nullptr)
ShrinkMainRange = true;		ShrinkMainRange = true;
}		}
}		}

void RegisterCoalescer::updateRegDefsUses(unsigned SrcReg,		void RegisterCoalescer::updateRegDefsUses(unsigned SrcReg, unsigned DstReg,
unsigned DstReg,
unsigned SubIdx) {		unsigned SubIdx) {
bool DstIsPhys = Register::isPhysicalRegister(DstReg);		bool DstIsPhys = Register::isPhysicalRegister(DstReg);
LiveInterval *DstInt = DstIsPhys ? nullptr : &LIS->getInterval(DstReg);		LiveInterval *DstInt = DstIsPhys ? nullptr : &LIS->getInterval(DstReg);

if (DstInt && DstInt->hasSubRanges() && DstReg != SrcReg) {		if (DstInt && DstInt->hasSubRanges() && DstReg != SrcReg) {
for (MachineOperand &MO : MRI->reg_operands(DstReg)) {		for (MachineOperand &MO : MRI->reg_operands(DstReg)) {
unsigned SubReg = MO.getSubReg();		unsigned SubReg = MO.getSubReg();
if (SubReg == 0 || MO.isUndef())		if (SubReg == 0 || MO.isUndef())
▲ Show 20 Lines • Show All 529 Lines • ▼ Show 20 Lines	class JoinVals {
LiveIntervals *LIS;		LiveIntervals *LIS;
SlotIndexes *Indexes;		SlotIndexes *Indexes;
const TargetRegisterInfo *TRI;		const TargetRegisterInfo *TRI;

/// Value number assignments. Maps value numbers in LI to entries in		/// Value number assignments. Maps value numbers in LI to entries in
/// NewVNInfo. This is suitable for passing to LiveInterval::join().		/// NewVNInfo. This is suitable for passing to LiveInterval::join().
SmallVector<int, 8> Assignments;		SmallVector<int, 8> Assignments;

		public:
/// Conflict resolution for overlapping values.		/// Conflict resolution for overlapping values.
enum ConflictResolution {		enum ConflictResolution {
/// No overlap, simply keep this value.		/// No overlap, simply keep this value.
CR_Keep,		CR_Keep,

/// Merge this value into OtherVNI and erase the defining instruction.		/// Merge this value into OtherVNI and erase the defining instruction.
/// Used for IMPLICIT_DEF, coalescable copies, and copies from external		/// Used for IMPLICIT_DEF, coalescable copies, and copies from external
/// values.		/// values.
Show All 12 Lines	enum ConflictResolution {

/// Unresolved conflict. Visit later when all values have been mapped.		/// Unresolved conflict. Visit later when all values have been mapped.
CR_Unresolved,		CR_Unresolved,

/// Unresolvable conflict. Abort the join.		/// Unresolvable conflict. Abort the join.
CR_Impossible		CR_Impossible
};		};

		private:
/// Per-value info for LI. The lane bit masks are all relative to the final		/// Per-value info for LI. The lane bit masks are all relative to the final
/// joined register, so they can be compared directly between SrcReg and		/// joined register, so they can be compared directly between SrcReg and
/// DstReg.		/// DstReg.
struct Val {		struct Val {
ConflictResolution Resolution = CR_Keep;		ConflictResolution Resolution = CR_Keep;

/// Lanes written by this def, 0 for unanalyzed values.		/// Lanes written by this def, 0 for unanalyzed values.
LaneBitmask WriteLanes;		LaneBitmask WriteLanes;
▲ Show 20 Lines • Show All 144 Lines • ▼ Show 20 Lines	void eraseInstrs(SmallPtrSetImpl<MachineInstr*> &ErasedInstrs,
SmallVectorImpl<unsigned> &ShrinkRegs,		SmallVectorImpl<unsigned> &ShrinkRegs,
LiveInterval *LI = nullptr);		LiveInterval *LI = nullptr);

/// Remove liverange defs at places where implicit defs will be removed.		/// Remove liverange defs at places where implicit defs will be removed.
void removeImplicitDefs();		void removeImplicitDefs();

/// Get the value assignments suitable for passing to LiveInterval::join.		/// Get the value assignments suitable for passing to LiveInterval::join.
const int *getAssignments() const { return Assignments.data(); }		const int *getAssignments() const { return Assignments.data(); }

		/// Get the conflict resolution for a value number.
		ConflictResolution getResolution(unsigned Num) const {
		return Vals[Num].Resolution;
		}
};		};

} // end anonymous namespace		} // end anonymous namespace

LaneBitmask JoinVals::computeWriteLanes(const MachineInstr *DefMI, bool &Redef)		LaneBitmask JoinVals::computeWriteLanes(const MachineInstr *DefMI, bool &Redef)
const {		const {
LaneBitmask L;		LaneBitmask L;
for (const MachineOperand &MO : DefMI->operands()) {		for (const MachineOperand &MO : DefMI->operands()) {
▲ Show 20 Lines • Show All 899 Lines • ▼ Show 20 Lines	if (LI.valnos.size() < LargeIntervalSizeThreshold)
return false;		return false;
auto &Counter = LargeLIVisitCounter[LI.reg];		auto &Counter = LargeLIVisitCounter[LI.reg];
if (Counter < LargeIntervalFreqThreshold) {		if (Counter < LargeIntervalFreqThreshold) {
Counter++;		Counter++;
return false;		return false;
}		}
return true;		return true;
}		}

bool RegisterCoalescer::joinVirtRegs(CoalescerPair &CP) {		bool RegisterCoalescer::joinVirtRegs(CoalescerPair &CP) {
SmallVector<VNInfo*, 16> NewVNInfo;		SmallVector<VNInfo*, 16> NewVNInfo;
LiveInterval &RHS = LIS->getInterval(CP.getSrcReg());		LiveInterval &RHS = LIS->getInterval(CP.getSrcReg());
LiveInterval &LHS = LIS->getInterval(CP.getDstReg());		LiveInterval &LHS = LIS->getInterval(CP.getDstReg());
bool TrackSubRegLiveness = MRI->shouldTrackSubRegLiveness(*CP.getNewRC());		bool TrackSubRegLiveness = MRI->shouldTrackSubRegLiveness(*CP.getNewRC());
JoinVals RHSVals(RHS, CP.getSrcReg(), CP.getSrcIdx(), LaneBitmask::getNone(),		JoinVals RHSVals(RHS, CP.getSrcReg(), CP.getSrcIdx(), LaneBitmask::getNone(),
NewVNInfo, CP, LIS, TRI, false, TrackSubRegLiveness);		NewVNInfo, CP, LIS, TRI, false, TrackSubRegLiveness);
JoinVals LHSVals(LHS, CP.getDstReg(), CP.getDstIdx(), LaneBitmask::getNone(),		JoinVals LHSVals(LHS, CP.getDstReg(), CP.getDstIdx(), LaneBitmask::getNone(),
NewVNInfo, CP, LIS, TRI, false, TrackSubRegLiveness);		NewVNInfo, CP, LIS, TRI, false, TrackSubRegLiveness);
Show All 28 Lines	if (RHS.hasSubRanges() || LHS.hasSubRanges()) {
} else if (DstIdx != 0) {		} else if (DstIdx != 0) {
// Transform LHS lanemasks to new register class if necessary.		// Transform LHS lanemasks to new register class if necessary.
for (LiveInterval::SubRange &R : LHS.subranges()) {		for (LiveInterval::SubRange &R : LHS.subranges()) {
LaneBitmask Mask = TRI->composeSubRegIndexLaneMask(DstIdx, R.LaneMask);		LaneBitmask Mask = TRI->composeSubRegIndexLaneMask(DstIdx, R.LaneMask);
R.LaneMask = Mask;		R.LaneMask = Mask;
}		}
}		}
LLVM_DEBUG(dbgs() << "\t\tLHST = " << printReg(CP.getDstReg()) << ' ' << LHS		LLVM_DEBUG(dbgs() << "\t\tLHST = " << printReg(CP.getDstReg()) << ' ' << LHS
<< '\n');		<< '\n');

// Determine lanemasks of RHS in the coalesced register and merge subranges.		// Determine lanemasks of RHS in the coalesced register and merge subranges.
		bjope: Is it correct to use the RegSlot here? Maybe it should be getDeadSlot() to make sure the liveAt call below will get "live out" from the MI rather than "live in" (although I'm still learning about these slot indices myself).
unsigned SrcIdx = CP.getSrcIdx();		unsigned SrcIdx = CP.getSrcIdx();
if (!RHS.hasSubRanges()) {		if (!RHS.hasSubRanges()) {
LaneBitmask Mask = SrcIdx == 0 ? CP.getNewRC()->getLaneMask()		LaneBitmask Mask = SrcIdx == 0 ? CP.getNewRC()->getLaneMask()
: TRI->getSubRegIndexLaneMask(SrcIdx);		: TRI->getSubRegIndexLaneMask(SrcIdx);
mergeSubRangeInto(LHS, RHS, Mask, CP, DstIdx);		mergeSubRangeInto(LHS, RHS, Mask, CP, DstIdx);
} else {		} else {
// Pair up subranges and merge.		// Pair up subranges and merge.
for (LiveInterval::SubRange &R : RHS.subranges()) {		for (LiveInterval::SubRange &R : RHS.subranges()) {
[... 22 lines not shown; context: bool RegisterCoalescer::joinVirtRegs(CoalescerPair &CP) { ...]
// Erase COPY and IMPLICIT_DEF instructions. This may cause some external		// Erase COPY and IMPLICIT_DEF instructions. This may cause some external
// registers to require trimming.		// registers to require trimming.
SmallVector<unsigned, 8> ShrinkRegs;		SmallVector<unsigned, 8> ShrinkRegs;
LHSVals.eraseInstrs(ErasedInstrs, ShrinkRegs, &LHS);		LHSVals.eraseInstrs(ErasedInstrs, ShrinkRegs, &LHS);
RHSVals.eraseInstrs(ErasedInstrs, ShrinkRegs);		RHSVals.eraseInstrs(ErasedInstrs, ShrinkRegs);
while (!ShrinkRegs.empty())		while (!ShrinkRegs.empty())
shrinkToUses(&LIS->getInterval(ShrinkRegs.pop_back_val()));		shrinkToUses(&LIS->getInterval(ShrinkRegs.pop_back_val()));

		// Scan and mark undef any DBG_VALUEs that would refer to a different value.
		checkMergingChangesDbgValues(CP, LHS, LHSVals, RHS, RHSVals);

// Join RHS into LHS.		// Join RHS into LHS.
LHS.join(RHS, LHSVals.getAssignments(), RHSVals.getAssignments(), NewVNInfo);		LHS.join(RHS, LHSVals.getAssignments(), RHSVals.getAssignments(), NewVNInfo);

// Kill flags are going to be wrong if the live ranges were overlapping.		// Kill flags are going to be wrong if the live ranges were overlapping.
// Eventually, we should simply clear all kill flags when computing live		// Eventually, we should simply clear all kill flags when computing live
// ranges. They are reinserted after register allocation.		// ranges. They are reinserted after register allocation.
MRI->clearKillFlags(LHS.reg);		MRI->clearKillFlags(LHS.reg);
MRI->clearKillFlags(RHS.reg);		MRI->clearKillFlags(RHS.reg);
[... 15 lines not shown; context: bool RegisterCoalescer::joinVirtRegs(CoalescerPair &CP) { ...]

return true;		return true;
}		}

bool RegisterCoalescer::joinIntervals(CoalescerPair &CP) {		bool RegisterCoalescer::joinIntervals(CoalescerPair &CP) {
return CP.isPhys() ? joinReservedPhysReg(CP) : joinVirtRegs(CP);		return CP.isPhys() ? joinReservedPhysReg(CP) : joinVirtRegs(CP);
}		}

		void RegisterCoalescer::buildVRegToDbgValueMap(MachineFunction &MF)
		{
		const SlotIndexes &Slots = *LIS->getSlotIndexes();
		SmallVector<MachineInstr *, 8> ToInsert;

		// After collecting a block of DBG_VALUEs into ToInsert, enter them into the
		// vreg => DbgValueLoc map.
		auto CloseNewDVRange = [this, &ToInsert](SlotIndex Slot) {
		for (auto *X : ToInsert)
		DbgVRegToValues[X->getOperand(0).getReg()].push_back({Slot, X});

		ToInsert.clear();
		};

		// Iterate over all instructions, collecting them into the ToInsert vector.
		// Once a non-debug instruction is found, record the slot index of the
		// collected DBG_VALUEs.
		for (auto &MBB : MF) {
		SlotIndex CurrentSlot = Slots.getMBBStartIdx(&MBB);

		for (auto &MI : MBB) {
		if (MI.isDebugValue() && MI.getOperand(0).isReg() &&
		MI.getOperand(0).getReg().isVirtual()) {
		ToInsert.push_back(&MI);
		} else if (!MI.isDebugInstr()) {
		CurrentSlot = Slots.getInstructionIndex(MI);
		CloseNewDVRange(CurrentSlot);
		aprantl: This flip-flopping of DbgValuesSeen is hard to follow... is there something more obvious that could be done? Otherwise, can it be documented?
		jmorse (author): I realised I could just rely on there not being any DBG_VALUEs in the ToInsert vector instead of explicitly tracking these things, so I've just deleted that flag.
		}
		}

		// Close range of DBG_VALUEs at the end of blocks.
		CloseNewDVRange(Slots.getMBBEndIdx(&MBB));
		}

		// Sort all DBG_VALUEs we've seen by slot number.
		for (auto &Pair : DbgVRegToValues)
		llvm::sort(Pair.second);
		}
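
The grouping idea in buildVRegToDbgValueMap can be illustrated outside LLVM. The following is only a minimal sketch under stated assumptions: ToyInstr, ToyDbgMap and buildToyDbgMap are hypothetical stand-ins invented for illustration, plain unsigned numbers replace SlotIndex, and a single block is handled. It shows a run of debug values being buffered and then stamped with the slot of the next non-debug instruction, so no per-DBG_VALUE slot lookup is needed.

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// Hypothetical toy instruction, not an LLVM type.
struct ToyInstr {
  bool IsDebugValue; // true for a DBG_VALUE-like instruction
  unsigned Reg;      // register a debug value refers to (unused otherwise)
  unsigned Slot;     // slot number; only meaningful for non-debug instructions
};

// Register number => list of (slot, index of the debug value in the block).
using ToyDbgMap = std::map<unsigned, std::vector<std::pair<unsigned, int>>>;

// Walk the block once. Debug values are buffered until the next non-debug
// instruction, whose slot they all share; leftovers get the block-end slot.
ToyDbgMap buildToyDbgMap(const std::vector<ToyInstr> &Block,
                         unsigned BlockEndSlot) {
  ToyDbgMap Map;
  std::vector<int> Pending; // indices of buffered debug values

  auto Close = [&](unsigned Slot) {
    for (int Idx : Pending)
      Map[Block[Idx].Reg].push_back({Slot, Idx});
    Pending.clear();
  };

  for (int I = 0, E = static_cast<int>(Block.size()); I != E; ++I) {
    if (Block[I].IsDebugValue) {
      Pending.push_back(I);
    } else {
      assert(Block[I].Slot < BlockEndSlot && "instruction past block end");
      Close(Block[I].Slot);
    }
  }

  // Close the trailing run of debug values at the end of the block.
  Close(BlockEndSlot);
  return Map;
}
```

Because each instruction is visited exactly once, a pack of several hundred consecutive debug values costs linear time in this sketch rather than one forward walk per debug value.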

		void RegisterCoalescer::checkMergingChangesDbgValues(CoalescerPair &CP,
		LiveRange &LHS,
		JoinVals &LHSVals,
		LiveRange &RHS,
		JoinVals &RHSVals) {
		auto ScanForDstReg = [&](unsigned Reg) {
		checkMergingChangesDbgValuesImpl(Reg, RHS, LHS, LHSVals);
		};

		auto ScanForSrcReg = [&](unsigned Reg) {
		checkMergingChangesDbgValuesImpl(Reg, LHS, RHS, RHSVals);
		};

		// Scan for potentially unsound DBG_VALUEs: examine first the register number
		// Reg, and then any other vregs that may have been merged into it.
		auto PerformScan = [this](unsigned Reg, std::function<void(unsigned)> Func) {
		Func(Reg);
		if (DbgMergedVRegNums.count(Reg))
		for (unsigned X : DbgMergedVRegNums[Reg])
		Func(X);
		};

		// Scan for unsound updates of both the source and destination register.
		PerformScan(CP.getSrcReg(), ScanForSrcReg);
		PerformScan(CP.getDstReg(), ScanForDstReg);
		}

		void RegisterCoalescer::checkMergingChangesDbgValuesImpl(unsigned Reg,
		LiveRange &OtherLR,
		LiveRange &RegLR,
		JoinVals &RegVals) {
		// Are there any DBG_VALUEs to examine?
		auto VRegMapIt = DbgVRegToValues.find(Reg);
		if (VRegMapIt == DbgVRegToValues.end())
		return;

		auto &DbgValueSet = VRegMapIt->second;
		auto DbgValueSetIt = DbgValueSet.begin();
		auto SegmentIt = OtherLR.begin();

		bool LastUndefResult = false;
		SlotIndex LastUndefIdx;

		// If the "Other" register is live at a slot Idx, test whether Reg can
		// safely be merged with it, or should be marked undef.
		auto ShouldUndef = [&RegVals, &RegLR, &LastUndefResult,
		&LastUndefIdx](SlotIndex Idx) -> bool {
		// Our worst-case performance typically happens with asan, causing very
		// many DBG_VALUEs of the same location. Cache a copy of the most recent
		// result for this edge-case.
		if (LastUndefIdx == Idx)
		return LastUndefResult;

		// If the other range was live, and Reg's was not, the register coalescer
		// will not have tried to resolve any conflicts. We don't know whether
		// the DBG_VALUE will refer to the same value number, so it must be made
		// undef.
		auto OtherIt = RegLR.find(Idx);
		if (OtherIt == RegLR.end())
		return true;

		// Both the registers were live: examine the conflict resolution record for
		// the value number Reg refers to. CR_Keep meant that this value number
		// "won" and the merged register definitely refers to that value. CR_Erase
		// means the value number was a redundant copy of the other value, which
		// was coalesced and Reg deleted. It's safe to refer to the other register
		// (which will be the source of the copy).
		auto Resolution = RegVals.getResolution(OtherIt->valno->id);
		LastUndefResult = Resolution != JoinVals::CR_Keep &&
		Resolution != JoinVals::CR_Erase;
		LastUndefIdx = Idx;
		return LastUndefResult;
		};

		// Iterate over both the live-range of the "Other" register, and the set of
		// DBG_VALUEs for Reg at the same time. Advance whichever one has the lowest
		// slot index. This relies on the DbgValueSet being ordered.
		aprantl: Ah. That might be the answer.
		while (DbgValueSetIt != DbgValueSet.end() && SegmentIt != OtherLR.end()) {
		if (DbgValueSetIt->first < SegmentIt->end) {
		// "Other" is live and there is a DBG_VALUE of Reg: test if we should
		// set it undef.
		if (DbgValueSetIt->first >= SegmentIt->start &&
		DbgValueSetIt->second->getOperand(0).getReg() != 0 &&
		ShouldUndef(DbgValueSetIt->first)) {
		// Mark undef, erase record of this DBG_VALUE to avoid revisiting.
		DbgValueSetIt->second->getOperand(0).setReg(0);
		continue;
		}
		++DbgValueSetIt;
		} else {
		++SegmentIt;
		}
		}
		}
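
The simultaneous walk over the ordered DBG_VALUE slots and the live segments of the other register can be sketched in isolation. This is a simplified illustration, not the patch's code: ToySegment and slotsInsideSegments are hypothetical names, slots are plain unsigned numbers, segments are assumed to be half-open, sorted and non-overlapping, and the conflict-resolution query (ShouldUndef) is left out so that only the iteration pattern remains.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical half-open live segment [Start, End), not an LLVM type.
struct ToySegment {
  unsigned Start;
  unsigned End;
};

// Return the indices of the entries in Slots that fall inside some segment.
// Both inputs are walked in lockstep, advancing whichever is behind.
std::vector<std::size_t>
slotsInsideSegments(const std::vector<unsigned> &Slots,
                    const std::vector<ToySegment> &Segs) {
  assert(std::is_sorted(Slots.begin(), Slots.end()) && "slots must be sorted");

  std::vector<std::size_t> Hits;
  std::size_t S = 0, G = 0;
  while (S != Slots.size() && G != Segs.size()) {
    if (Slots[S] < Segs[G].End) {
      // The slot is before the end of this segment: it is a hit only if it
      // is also at or after the segment start.
      if (Slots[S] >= Segs[G].Start)
        Hits.push_back(S);
      ++S;
    } else {
      // The slot is past this segment: move to the next segment.
      ++G;
    }
  }
  return Hits;
}
```

Each iteration advances one of the two cursors, so the cost is linear in the number of slots plus the number of segments. Repeated identical slots, as in the ASAN case described in the summary, are handled without any extra lookups.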

namespace {		namespace {

/// Information concerning MBB coalescing priority.		/// Information concerning MBB coalescing priority.
struct MBBPriorityInfo {		struct MBBPriorityInfo {
MachineBasicBlock *MBB;		MachineBasicBlock *MBB;
unsigned Depth;		unsigned Depth;
bool IsSplit;		bool IsSplit;

[... 266 lines not shown; context: bool RegisterCoalescer::runOnMachineFunction(MachineFunction &fn) { ...]
JoinSplitEdges = EnableJoinSplits;		JoinSplitEdges = EnableJoinSplits;

LLVM_DEBUG(dbgs() << "******** SIMPLE REGISTER COALESCING ********\n"		LLVM_DEBUG(dbgs() << "******** SIMPLE REGISTER COALESCING ********\n"
<< "********** Function: " << MF->getName() << '\n');		<< "********** Function: " << MF->getName() << '\n');

if (VerifyCoalescing)		if (VerifyCoalescing)
MF->verify(this, "Before register coalescing");		MF->verify(this, "Before register coalescing");

		DbgVRegToValues.clear();
		DbgMergedVRegNums.clear();
		buildVRegToDbgValueMap(fn);

RegClassInfo.runOnMachineFunction(fn);		RegClassInfo.runOnMachineFunction(fn);

// Join (coalesce) intervals if requested.		// Join (coalesce) intervals if requested.
if (EnableJoining)		if (EnableJoining)
joinAllIntervals();		joinAllIntervals();

// After deleting a lot of copies, register classes may be less constrained.		// After deleting a lot of copies, register classes may be less constrained.
// Removing sub-register operands may allow GR32_ABCD -> GR32 and DPR_VFP2 ->		// Removing sub-register operands may allow GR32_ABCD -> GR32 and DPR_VFP2 ->
[... 44 lines not shown ...]

llvm/test/DebugInfo/MIR/X86/regcoalescing-clears-dead-dbgvals.mir

This file was added.

				# RUN: llc -mtriple=x86_64-unknown-unknown %s -o - -run-pass=simple-register-coalescing | FileCheck %s
				# PR40010: DBG_VALUEs do not contribute to the liveness of virtual registers,
				# and the register coalescer would merge new live values on top of DBG_VALUEs,
				# leading to them presenting new (wrong) values to the debugger. Test that
				# when out of liveness, coalescing will mark DBG_VALUEs in non-live locations
				# as undef.
				--- |
				; ModuleID = './test.ll'
				source_filename = "./test.ll"
				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"

				; Function Attrs: nounwind readnone speculatable
				declare void @llvm.dbg.value(metadata, metadata, metadata) #0

				; Original IR source here:
				define i32 @test(i32* %pin) {
				entry:
				br label %start.test1

				start.test1: ; preds = %start, %entry
				%foo = phi i32 [ 0, %entry ], [ %bar, %start.test1 ]
				%baz = load i32, i32* %pin, align 1
				%qux = xor i32 %baz, 1234
				%bar = add i32 %qux, %foo
				call void @llvm.dbg.value(metadata i32 %foo, metadata !3, metadata !DIExpression()), !dbg !5
				%cmp = icmp ugt i32 %bar, 1000000
				br i1 %cmp, label %leave, label %start.test1

				leave: ; preds = %start
				ret i32 %bar
				}

				; Stubs to appease the MIR parser
				define i32 @test2(i32* %pin) {
				entry:
				ret i32 0
				start.test2:
				ret i32 0
				leave:
				ret i32 0
				}

				; Function Attrs: nounwind
				declare void @llvm.stackprotector(i8, i8*) #1

				attributes #0 = { nounwind readnone speculatable }
				attributes #1 = { nounwind }

				!llvm.module.flags = !{!0}
				!llvm.dbg.cu = !{!1}

				!0 = !{i32 2, !"Debug Info Version", i32 3}
				!1 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus, file: !2, producer: "beards", isOptimized: true, runtimeVersion: 4, emissionKind: FullDebug)
				!2 = !DIFile(filename: "bees.cpp", directory: "")
				!3 = !DILocalVariable(name: "bees", scope: !4)
				!4 = distinct !DISubprogram(name: "nope", scope: !1, file: !2, line: 1, spFlags: DISPFlagDefinition, unit: !1)
				!5 = !DILocation(line: 0, scope: !4)

				...
				---
				name: test
				tracksRegLiveness: true
				body: |
				bb.0.entry:
				successors: %bb.1(0x80000000)
				liveins: $rdi

				%2:gr64 = COPY killed $rdi
				%3:gr32 = MOV32r0 implicit-def dead $eflags
				%4:gr32 = MOV32ri 1234
				%7:gr32 = COPY killed %3

				bb.1.start.test1:
				successors: %bb.2(0x04000000), %bb.1(0x7c000000)

				; CHECK-LABEL: name: test
				;
				; We currently expect %1 and %0 to merge into %7
				;
				; CHECK: %[[REG1:[0-9]+]]:gr32 = MOV32rm
				; CHECK-NEXT: %[[REG2:[0-9]+]]:gr32 = XOR32rr %[[REG1]]
				; CHECK-NEXT: %[[REG3:[0-9]+]]:gr32 = ADD32rr %[[REG3]], %[[REG2]]
				; CHECK-NEXT: DBG_VALUE $noreg

				%0:gr32 = COPY killed %7
				%8:gr32 = MOV32rm %2, 1, $noreg, 0, $noreg :: (load 4 from %ir.pin, align 1)
				%5:gr32 = COPY killed %8
				%5:gr32 = XOR32rr %5, %4, implicit-def dead $eflags
				%1:gr32 = COPY killed %0
				%1:gr32 = ADD32rr %1, killed %5, implicit-def dead $eflags
				DBG_VALUE %0, $noreg, !3, !DIExpression(), debug-location !5
				CMP32ri %1, 1000001, implicit-def $eflags
				%7:gr32 = COPY %1
				JCC_1 %bb.1, 2, implicit killed $eflags
				JMP_1 %bb.2

				bb.2.leave:
				$eax = COPY killed %1
				RET 0, killed $eax

				...
				---
				name: test2
				tracksRegLiveness: true
				body: |
				bb.0.entry:
				successors: %bb.1(0x80000000)
				liveins: $rdi

				%2:gr64 = COPY killed $rdi
				%3:gr32 = MOV32r0 implicit-def dead $eflags
				%4:gr32 = MOV32ri 1234
				%7:gr32 = COPY killed %3

				bb.1.start.test2:
				successors: %bb.2(0x04000000), %bb.1(0x7c000000)

				; CHECK-LABEL: name: test2
				;
				; %0 should be merged into %7, but as %0 is live at this location the
				; DBG_VALUE should be preserved and point at the operand of ADD32rr.
				; RegisterCoalescer resolves %0 as CR_Erase: %0 is a redundant copy and
				; can be erased.
				;
				; CHECK: %[[REG11:[0-9]+]]:gr32 = MOV32rm
				; CHECK-NEXT: %[[REG12:[0-9]+]]:gr32 = XOR32rr %[[REG11]]
				; CHECK-NEXT: DBG_VALUE %[[REG13:[0-9]+]]
				; CHECK-NEXT: %[[REG13]]:gr32 = ADD32rr %[[REG13]], %[[REG12]]

				%0:gr32 = COPY killed %7
				%8:gr32 = MOV32rm %2, 1, $noreg, 0, $noreg :: (load 4 from %ir.pin, align 1)
				%8:gr32 = XOR32rr %8, %4, implicit-def dead $eflags
				DBG_VALUE %0, $noreg, !3, !DIExpression(), debug-location !5
				%0:gr32 = ADD32rr %0, killed %8, implicit-def dead $eflags
				CMP32ri %0, 1000001, implicit-def $eflags
				%7:gr32 = COPY %0
				JCC_1 %bb.1, 2, implicit killed $eflags
				JMP_1 %bb.2

				bb.2.leave:
				$eax = COPY killed %7
				RET 0, killed $eax

				...