This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
lib/CodeGen/
-
CodeGen/
9/10
MachineSink.cpp
-
test/DebugInfo/MIR/X86/
-
DebugInfo/
-
MIR/
-
X86/
-
postra-subreg-sink.mir
-
sink-leaves-undef.mir

Differential D58238

[DebugInfo] MachineSink: Insert undef DBG_VALUEs when sinking instructions, try to forward copies
ClosedPublic

Authored by jmorse on Feb 14 2019, 7:56 AM.

Download Raw Diff

Details

Reviewers

aprantl
bjope
vsk

Commits

rGee50590e1684: [DebugInfo] MachineSink: Insert undef DBG_VALUEs when sinking instructions

Summary

When we sink DBG_VALUEs at the moment, we simply move the DBG_VALUE instruction to below the sunk instruction. However, we should also mark the variable as being undef at the source location, to terminate any earlier variable location. This patch does that -- plus, if the instruction being sunk is a copy, it attempts to propagate the copy through the DBG_VALUE, replacing the destination with the source.

To avoid any kind of subregister shennanigans, vreg copy propagation only happens if all the subregisters agree; for physical registers we only propagate if the DBG_VALUE operand is the same as the destination of the copy. So:

%1 = COPY %0
DBG_VALUE %1

Would copy-prop, while

%1 = COPY %0
DBG_VALUE %1.subreg_8bit

Would not. Additional analysis might determine this to be safe, but I haven't implemented it here.

Likewise after regalloc:

$eax = COPY $ecx
DBG_VALUE $eax

Would constant-prop, while

$ax = $cx
DBG_VALUE $eax

Would not.

With this patch, building clang-3.4 yields a fractional number more number of covered variables, and roughly 0.5% more scope-bytes coverage.

Diff Detail

Event Timeline

jmorse created this revision.Feb 14 2019, 7:56 AM

Herald added a project: Restricted Project. · View Herald TranscriptFeb 14 2019, 7:56 AM

Herald added subscribers: llvm-commits, jdoerfert. · View Herald Transcript

jmorse added a parent revision: D58191: [DebugInfo] Make postra sinking of DBG_VALUEs safe in the presence of subregisters.Feb 14 2019, 7:56 AM

aprantl added inline comments.Feb 14 2019, 8:41 AM

lib/CodeGen/MachineSink.cpp
798	I may be misunderstanding, but if it isn't a vreg, don't we need to check that there are no defs of the reg in between? Or is this all pre-ra?
807	Does this get easier to read if you unconditionally call setReg(0) and then overwrite it in the copy case?

Explicitly test for whether we're pre or post regalloc when deciding whether a copy can be propagated, refactor test logic.

jmorse marked 4 inline comments as done.Feb 15 2019, 2:37 AM

jmorse added inline comments.

lib/CodeGen/MachineSink.cpp
798	Good question -- this code is called by both pre and post RA code. For post-ra it's guaranteed that there are no defs in between by the calling code: otherwise it would not be legal to sink the copy in the first place. For pre-ra we can rely on the SSA-ness of the function to guarantee validity. However, I've been using the "both-are-vregs" and "both-are-physregs" tests as a proxy for whether we're pre or post RA, which isn't necessarily sound. I don't believe the pre-ra machine sinker will sink anything with meaningful physreg operations, but I've added an explicit test to make this more robust.
807	It does, I've done that and simplified the tests into two (large) conditionals.

I have some doubt about this. Mainly the part about inserting undefs. Although, some of my doubts are partially based on the big-black-hole regarding how debug info is supposed to work for heavily optimized code (and more specifically how this is supposed to work in LLVM based on what we got today).

I'll try to describe this by an example. Assume the C-code looks like this:

1:  x = y;
2:  z = z & q1;
3:  x = x +3;
4:  z = z | q2;

then we might end up MIR like this as input to MachineSink

%1 = LOAD ...
DBG_VALUE %1, %noreg, "x", ...
%2 = AND ...
%3 = ADD %1, 3
DBG_VALUE %3, %noreg, "x", ...
%4 = OR %2, ...

Now assume that the ADD is sunk (for simplicity in this example just to below the OR and not to another BB).

Isn't it perfectly OK to get

%1 = LOAD ...
DBG_VALUE %1, %noreg, "x", ...
%2 = AND ...
%4 = OR %2, ...
%3 = ADD %1, 3
DBG_VALUE %3, %noreg, "x", ...

making it appear as "x" is equal to "y" up until the ADD has been executed.

As I understand it this patch would introduce a DBG_VALUE saying that we do not know the value of "x" when doing the OR.

%1 = LOAD ...
DBG_VALUE %1, %noreg, "x", ...
%2 = AND ...
DBG_VALUE %noreg, %noreg, "x", ...
%4 = OR %2, ...
%3 = ADD %1, 3
DBG_VALUE %3, %noreg, "x", ...

But we have just delayed the ADD a little bit (so we have not really executed line 3 yet).

It might also be tempting to express the add in the DIExpression (a "salvage" kind of solution):

%1 = LOAD ...
DBG_VALUE %1, %noreg, "x", ...
%2 = AND ...
DBG_VALUE %1, %noreg, "x", !DIExpresssion("%1 + 3")
%4 = OR %2, ...
%3 = ADD %1, 3
DBG_VALUE %3, %noreg, "x", ...

I think the "salvage" kind of solution is incorrect here. We have not optimized away the add, it has just been moved. When debugging it would be confusing if "x" already has the value "y + 3" before executing the ADD (which in this example probably is the only instruction with line 3 as debug location).

The alternative of making "x" appear as optimized out is also weird IMO. We have not lost track of the value of "x". It first gets the value %1 and then after the ADD it has the value %3.

Maybe it is a philosophical question. Is the compiler scheduling/reordering source instructions (in the debugger it will appear as source statements are executed in a random order), or are we scheduling/reordering machine instructions (and in the debugger it still should appear as if we execute variable assignments in the order of the source code). VLIW-scheduling, tail merging, etc, of course make things extra complicated, since we basically will be all over the place all the time.

The "copy-prop" part of this patch might be OK. But couldn't it just be seen as a the same scenario as with the ADD above, where we are delaying the assignment?

(Sorry if my simplified examples doesn't make sense for the actual problem that you attempt to solve.)

Does it matter that we actually sink into a later BB here? Is this fixing some problem where it from a debugging perspective would appear wrong if the variable does not get the new value before the end of the BB?

Hi,

In D58238#1399186, @bjope wrote:

I have some doubt about this. Mainly the part about inserting undefs. Although, some of my doubts are partially based on the big-black-hole regarding how debug info is supposed to work for heavily optimized code (and more specifically how this is supposed to work in LLVM based on what we got today).

No problemo -- my debuginfo knowledge is pretty fresh anyway, most of this is based on educated guesses of how it's supposed to work.

My understanding is that the difficulty comes from the re-ordering of DBG_VALUEs, because it re-orders how variable assignments appear. This can lead to debuggers displaying states of the program (i.e. a set of assignments to variables) that did not exist in the original program. If we take the example and put in DBG_VALUE instructions for the 'z' variable, and then sink the addition as you describe, we get:

%1 = LOAD ...
DBG_VALUE %1, %noreg, "x", ...
%2 = AND ...
DBG_VALUE %2, %noreg, "z", ...
%4 = OR %2, ...
DBG_VALUE %4, %noreg, "z", ...
%3 = ADD %1, 3
DBG_VALUE %3, %noreg, "x", ...

If a debugger landed after the OR instruction (before the sunk ADD is executed) then it would observe the value of "y" from line 4, but the value of "x" from line 1, which was overwritten in the original program. This is a state that didn't originally exist -- and it has the potential to mislead developers, for example if iterating over pairs of something and the two values displayed aren't actually paired.

For the exact example you give where there *isn't* any re-ordering, then we're probably needlessly reducing the range where we have variable locations, however that's an exception to a general rule IMHO (and this patch is implementing the general rule).

One response to reordering would be "Yeah, that's going to happen when your code is optimised", which is totally true. IMHO having variables read "optimized out" is slightly superior because it doesn't represent a nonexistant state of the program. That might force developers to dig deeper to find out what's going on (assuming they can step the debugger), but on the other hand they're guaranteed that when there is a variable location, it's accurate. Otherwise developers might not fully trust the debuginfo they're getting.

Maybe it is a philosophical question. Is the compiler scheduling/reordering source instructions (in the debugger it will appear as source statements are executed in a random order), or are we scheduling/reordering machine instructions (and in the debugger it still should appear as if we execute variable assignments in the order of the source code). VLIW-scheduling, tail merging, etc, of course make things extra complicated, since we basically will be all over the place all the time.

To me it's the latter, my mental model (maybe wrong) is that while the compiler is optimising code, for debuginfo it has a state-machine of variable locations represented by dbg.values that it needs to try and preserve the order of, marking locations undef if they don't exist any more. The logical consequence is that if we had a function where the compiler managed to completely reverse the order of computation, no variable would have a location, which would be sound, but not useful.

Does it matter that we actually sink into a later BB here? Is this fixing some problem where it from a debugging perspective would appear wrong if the variable does not get the new value before the end of the BB?

The sinking is irrelevant to the question I think, in that it's a general question of "should re-ordered variable assignments be visible?".

jmorse added a child revision: D58386: [DebugInfo] Pre-RA MachineSink: sink DBG_VALUEs that don't immediately follow the sunk instruction too.Feb 19 2019, 7:35 AM

NikolaPrica added a subscriber: NikolaPrica.Feb 19 2019, 7:37 AM

NikolaPrica added inline comments.

lib/CodeGen/MachineSink.cpp
786	TargetInstrInfo::isCopyInstr here? It should support pseudo COPY instruction and target specific register copy instructions.

jmorse added a child revision: D58453: [DebugInfo][CGP] Limit placeDbgValues movement of dbg.value intrinsics.Feb 20 2019, 8:23 AM

Use isCopyInstr to detect copy instructions, which will catch more opportunities that just isCopy().

jmorse marked 2 inline comments as done.Feb 21 2019, 5:18 AM

jmorse added inline comments.

lib/CodeGen/MachineSink.cpp
786	Sounds good, updated test to call isCopyInstr.

wolfgangp added a subscriber: wolfgangp.Feb 21 2019, 3:21 PM

So this basically prevents the debugger from displaying an inconsistent program state when some of the sideeffects from a line preceding its current stopping point have not been completed yet. Makes sense to me. One minor concern: If the sunk instruction does not get sunk past the next source line, we may create some unneeded location list entries, e.g.:

x = y; x = x + 3; z = z ? z + 3 : b; // all on the same line

If the x + 3 is sunk past the assignment to z (but before any code attributed to the next line is executed) we generate an additional location list entry for x, where there is a small gap in the covered range for x, even though the user is not able to observe x until the next line.
Doesn't look like a big deal, though.

In D58238#1406518, @wolfgangp wrote:

So this basically prevents the debugger from displaying an inconsistent program state when some of the sideeffects from a line preceding its current stopping point have not been completed yet. Makes sense to me. One minor concern: If the sunk instruction does not get sunk past the next source line, we may create some unneeded location list entries, e.g.:

x = y; x = x + 3; z = z ? z + 3 : b; // all on the same line

If the x + 3 is sunk past the assignment to z (but before any code attributed to the next line is executed) we generate an additional location list entry for x, where there is a small gap in the covered range for x, even though the user is not able to observe x until the next line.
Doesn't look like a big deal, though.

Still nice/potentially worthwhile to have the location accuracy on a per-instruction basis. You could imagine if + was an operator overload and you broke inside that, went up a frame and tried ot print the variable's value - you're at the '+' (or assignment, etc) exactly, within a line, so there's still value in having the location be accurate on that sub-line granularity.

Maybe it is a philosophical question. Is the compiler scheduling/reordering source instructions (in the debugger it will appear as source statements are executed in a random order), or are we scheduling/reordering machine instructions (and in the debugger it still should appear as if we execute variable assignments in the order of the source code). VLIW-scheduling, tail merging, etc, of course make things extra complicated, since we basically will be all over the place all the time.

To me it's the latter, my mental model (maybe wrong) is that while the compiler is optimising code, for debuginfo it has a state-machine of variable locations represented by dbg.values that it needs to try and preserve the order of, marking locations undef if they don't exist any more. The logical consequence is that if we had a function where the compiler managed to completely reverse the order of computation, no variable would have a location, which would be sound, but not useful.

So then we should check if there is any other dbg.value that we sink past? Otherwise there won't be any reordering and no need for an undef location?
It might be that we sink past instructions that has been sunk earlier (or simply been re-scheduled at ISel etc), so it could be that we restore the source order when sinking. Or we could be sinking past some constant materialisation that has no associated source line. Etc.

After ISel we could have something like this:

%1 = LOAD ...
DBG_VALUE %1, %noreg, "x", ...
%2 = AND ...
%3 = ADD %1, 3
DBG_VALUE %3, %noreg, "x", ...
%5 = SUB 0, 0

or we could just as well have the SUB before the ADD afaict

%1 = LOAD ...
DBG_VALUE %1, %noreg, "x", ...
%2 = AND ...
%5 = SUB 0, 0
%3 = ADD %1, 3
DBG_VALUE %3, %noreg, "x", ...

Sinking the ADD past the SUB would introduce an undef location for "x", when done by MachineSink. But not if the instructions were emitted in that order already at ISel.
So minor changes in ISel-scheduling could ripple down, impacting how many undef DBG_VALUE we see after MachineSink. It is not like ISel care that much about where it inserts this SUB afaict.

When for example regalloc inserts spill/reload code, I guess that it in some sense can be seen as the spill is hoisted from the reload, so in effect it sinks lots of instructions past the spilling instruction. But it does not seem correct to sprinkle undef DBG_VALUE instructions all over the code, in such situations.

Anyway, I'm still a little bit confused, and haven't really understood the full consequence of this.
Just got a feeling that some criterias for when to do this is missing, but maybe the aptch just takes a defensive appoach. We do not really need to insert the undef location always, but we need to do it sometimes. So it is better to do it too often compared to too seldom (as having an undef location always is OK).

In D58238#1406919, @bjope wrote:

So then we should check if there is any other dbg.value that we sink past? Otherwise there won't be any reordering and no need for an undef location?

True -- I'd previously been ignoring this issue as I thought it'd require another re-scan of instructions we sink past, but as the calling code walks through blocks backwards, it should be easy to maintain a little extra state recording what's been seen. (I might roll this into the parent patch too).

When for example regalloc inserts spill/reload code, I guess that it in some sense can be seen as the spill is hoisted from the reload, so in effect it sinks lots of instructions past the spilling instruction. But it does not seem correct to sprinkle undef DBG_VALUE instructions all over the code, in such situations.

I don't completely follow this, a vreg that gets spilt-and-reloaded has a location even when it's spilt surely? (How livedebugvalues and debug locations for spilt values works isn't something I've looked into).

Anyway, I'm still a little bit confused, and haven't really understood the full consequence of this.
Just got a feeling that some criterias for when to do this is missing, but maybe the aptch just takes a defensive appoach. We do not really need to insert the undef location always, but we need to do it sometimes. So it is better to do it too often compared to too seldom (as having an undef location always is OK).

It's definitely a defensive approach, as covered above I tend to prioritise not allowing unsound variable locations, even when we could be more complete.

jmorse mentioned this in D59027: [WIP][DebugInfo] When sinking DBG_VALUEs, only insert undef DBG_VALUEs if sinking would re-order assignments.Mar 6 2019, 8:41 AM

I produced another patch (D59027, work-in-progress) that only creates undef DBG_VALUEs when the order of assignments would change... however I then realised I'd misunderstood Bjorns question here:

In D58238#1406919, @bjope wrote:
It might be that we sink past instructions that has been sunk earlier (or simply been re-scheduled at ISel etc), so it could be that we restore the source order when sinking. Or we could be sinking past some constant materialisation that has no associated source line. Etc.

After ISel we could have something like this:
%1 = LOAD ...
DBG_VALUE %1, %noreg, "x", ...
%2 = AND ...
%3 = ADD %1, 3
DBG_VALUE %3, %noreg, "x", ...
%5 = SUB 0, 0
or we could just as well have the SUB before the ADD afaict
%1 = LOAD ...
DBG_VALUE %1, %noreg, "x", ...
%2 = AND ...
%5 = SUB 0, 0
%3 = ADD %1, 3
DBG_VALUE %3, %noreg, "x", ...
Sinking the ADD past the SUB would introduce an undef location for "x", when done by MachineSink. But not if the instructions were emitted in that order already at ISel.

If I understand you correctly, this actually isn't a case that MachineSink deals with: MachineSink sinks insts from parent to child basic blocks, to remove partially redundant computations. Take this example [0], where the load of %ab is only used down one path, MachineSink moves it to only execute in the block which actually uses the load. With that in mind it might be clearer why undefs become necessary: if the DBG_VALUE moves to a different block, one of the source-level assignments to the corresponding variable has disappeared from other paths through the program. Those paths should have the variable appear ``optimised out'' until the variable gets a new location, to stop the old location being used on the other paths which would present a stale variable value (effectively this re-orders the appearance of assignments).

Using the example quoted above, if ADD were sunk into a _successor_ block, then an undef DBG_VALUE would have to go somewhere, to terminate the earlier location of "x" in the other successor blocks. (NB, MachineSink doesn't operate if there's only one successor).

[0] https://github.com/llvm-mirror/llvm/blob/master/test/CodeGen/X86/MachineSink-DbgValue.ll

Herald added a subscriber: ormris. · View Herald TranscriptMar 7 2019, 3:47 AM

In D58238#1421352, @jmorse wrote:
I produced another patch (D59027, work-in-progress) that only creates undef DBG_VALUEs when the order of assignments would change... however I then realised I'd misunderstood Bjorns question here:
In D58238#1406919, @bjope wrote:
It might be that we sink past instructions that has been sunk earlier (or simply been re-scheduled at ISel etc), so it could be that we restore the source order when sinking. Or we could be sinking past some constant materialisation that has no associated source line. Etc.

After ISel we could have something like this:
%1 = LOAD ...
DBG_VALUE %1, %noreg, "x", ...
%2 = AND ...
%3 = ADD %1, 3
DBG_VALUE %3, %noreg, "x", ...
%5 = SUB 0, 0
or we could just as well have the SUB before the ADD afaict
%1 = LOAD ...
DBG_VALUE %1, %noreg, "x", ...
%2 = AND ...
%5 = SUB 0, 0
%3 = ADD %1, 3
DBG_VALUE %3, %noreg, "x", ...
Sinking the ADD past the SUB would introduce an undef location for "x", when done by MachineSink. But not if the instructions were emitted in that order already at ISel.
If I understand you correctly, this actually isn't a case that MachineSink deals with: MachineSink sinks insts from parent to child basic blocks, to remove partially redundant computations. Take this example [0], where the load of %ab is only used down one path, MachineSink moves it to only execute in the block which actually uses the load. With that in mind it might be clearer why undefs become necessary: if the DBG_VALUE moves to a different block, one of the source-level assignments to the corresponding variable has disappeared from other paths through the program. Those paths should have the variable appear ``optimised out'' until the variable gets a new location, to stop the old location being used on the other paths which would present a stale variable value (effectively this re-orders the appearance of assignments).

Using the example quoted above, if ADD were sunk into a _successor_ block, then an undef DBG_VALUE would have to go somewhere, to terminate the earlier location of "x" in the other successor blocks. (NB, MachineSink doesn't operate if there's only one successor).

[0] https://github.com/llvm-mirror/llvm/blob/master/test/CodeGen/X86/MachineSink-DbgValue.ll

Yes, I was just thinking out loud about sinking/scheduling in general and not so much the specific situation with MachineSink. But I also wondered if the sinking done by MachineSink could be seen as a special case of sinking in general, where we sink one instruction at a time. For MachineSink we eventually reach the end of the BB, and then continue into a successor. So how do we determine when it is time to insert "undef" when sinking one instruction at a time? What kind of reorderings should/shouldn't trigger that we insert an "undef"? I guess this is one thing we should try to describe in the documentation (also for DGB_VALUE), to make sure new developers understand the basic logic behind how we implement these things.

I only had a quick look at D59027, but it feels like it tries to handle this a little bit more "carefully" (avoiding <optimized out> in some situations compared to this patch).
Btw, I'll be OOO for a week. So I won't be able to give much more feedback right now. No need to wait for my approval if you get LGTM from someone else while I'm away.

ormris removed a subscriber: ormris.Mar 7 2019, 4:11 PM

In D58238#1422261, @bjope wrote:

Yes, I was just thinking out loud about sinking/scheduling in general and not so much the specific situation with MachineSink. But I also wondered if the sinking done by MachineSink could be seen as a special case of sinking in general, where we sink one instruction at a time. For MachineSink we eventually reach the end of the BB, and then continue into a successor. So how do we determine when it is time to insert "undef" when sinking one instruction at a time? What kind of reorderings should/shouldn't trigger that we insert an "undef"? I guess this is one thing we should try to describe in the documentation (also for DGB_VALUE), to make sure new developers understand the basic logic behind how we implement these things.

Ahhh, now that all fits in my mind, cool. I'll certainly ship a docs patch, when I've convinced myself I know what's going on at the CodeGen level :o

I only had a quick look at D59027, but it feels like it tries to handle this a little bit more "carefully" (avoiding <optimized out> in some situations compared to this patch).
Btw, I'll be OOO for a week. So I won't be able to give much more feedback right now. No need to wait for my approval if you get LGTM from someone else while I'm away.

Cool, now clicking a phab button that allegedly will put this back in for review.

vsk mentioned this in D58191: [DebugInfo] Make postra sinking of DBG_VALUEs safe in the presence of subregisters.Jul 10 2019, 1:15 PM

vsk added inline comments.Jul 10 2019, 2:07 PM

lib/CodeGen/MachineSink.cpp
798	This subreg equality check seems a bit tricky. Mind breaking it up? Currently it reads like '(true\|false) == DstMO->getSubReg()', which is surprising because I expected an unsigned-unsigned comparison.

Split up subregister comparision into pairs, to make it clear we're comparing unsigneds.

While we're here, update JCC insts in the test, and remove some un-necessary attributes.

jmorse marked an inline comment as done.Jul 11 2019, 8:53 AM

aprantl added inline comments.Jul 29 2019, 1:30 PM

lib/CodeGen/MachineSink.cpp
806	There ought to be some simplification of these two conditions possible. Perhaps adding a `continue`?

This update simplifies & flattens the copy-forwarding logic as suggested. Rather than trying to select the condition where copy-forwarding is valid, instead continue around the loop whenever a precondition isn't met.

This saves us a level of indentation and makes the conditionals much easier to understand.

aprantl added inline comments.Sep 3 2019, 9:05 AM

lib/CodeGen/MachineSink.cpp
812	nit: else after continue is redundant.

Nix an un-necessary else.

jmorse marked 4 inline comments as done.Sep 4 2019, 8:25 AM

Ping -- I think this is more or less accepted, but I'd feel happier if this patch was green.

I don't want to preempt any of the other reviewers, but from my point I think this looks good.

lib/CodeGen/MachineSink.cpp
774	Not your code, but this looks like it could be replaced by a range-based-for.

This revision is now accepted and ready to land.Oct 24 2019, 2:47 PM

Closed by commit rGee50590e1684: [DebugInfo] MachineSink: Insert undef DBG_VALUEs when sinking instructions (authored by jmorse). · Explain WhyOct 28 2019, 5:20 AM

This revision was automatically updated to reflect the committed changes.

Herald added a subscriber: hiraditya. · View Herald TranscriptOct 28 2019, 5:20 AM

Hi! This patch seems to drastically increase compile times for some fuchsia builds. I'm working on creating a small reproducer that I can share, but just wanted to raise awareness in the meantime.

Is the extra cost here, or is it more work being generated for LiveDebugValues?

In D58238#1727932, @aprantl wrote:

Is the extra cost here, or is it more work being generated for LiveDebugValues?

I'm not sure what under the hood is causing this longer compile time. The most I can tell from a time trace is that a lot of the time is taken up in the backend. I've filed https://bugs.llvm.org/show_bug.cgi?id=43855 to track this. @jmorse could you look into this when you get the chance? Thanks.

jmorse mentioned this in rGfca41001963c: [DebugInfo] Re-apply two patches to MachineSink.Dec 5 2019, 7:59 AM

Revision Contents

Path

Size

lib/

CodeGen/

MachineSink.cpp

52 lines

test/

DebugInfo/

MIR/

X86/

postra-subreg-sink.mir

3 lines

sink-leaves-undef.mir

105 lines

Diff 218349

lib/CodeGen/MachineSink.cpp

Show First 20 Lines • Show All 733 Lines • ▼ Show 20 Lines	return MBP.LHS.isReg() && MBP.RHS.isImm() && MBP.RHS.getImm() == 0 &&
MBP.LHS.getReg() == BaseOp->getReg();		MBP.LHS.getReg() == BaseOp->getReg();
}		}

/// Sink an instruction and its associated debug instructions. If the debug		/// Sink an instruction and its associated debug instructions. If the debug
/// instructions to be sunk are already known, they can be provided in DbgVals.		/// instructions to be sunk are already known, they can be provided in DbgVals.
static void performSink(MachineInstr &MI, MachineBasicBlock &SuccToSinkTo,		static void performSink(MachineInstr &MI, MachineBasicBlock &SuccToSinkTo,
MachineBasicBlock::iterator InsertPos,		MachineBasicBlock::iterator InsertPos,
SmallVectorImpl<MachineInstr > DbgVals = nullptr) {		SmallVectorImpl<MachineInstr > DbgVals = nullptr) {
		const MachineRegisterInfo &MRI = MI.getMF()->getRegInfo();
		const TargetInstrInfo &TII = *MI.getMF()->getSubtarget().getInstrInfo();

// If debug values are provided use those, otherwise call collectDebugValues.		// If debug values are provided use those, otherwise call collectDebugValues.
SmallVector<MachineInstr *, 2> DbgValuesToSink;		SmallVector<MachineInstr *, 2> DbgValuesToSink;
if (DbgVals)		if (DbgVals)
DbgValuesToSink.insert(DbgValuesToSink.begin(),		DbgValuesToSink.insert(DbgValuesToSink.begin(),
DbgVals->begin(), DbgVals->end());		DbgVals->begin(), DbgVals->end());
else		else
MI.collectDebugValues(DbgValuesToSink);		MI.collectDebugValues(DbgValuesToSink);

// If we cannot find a location to use (merge with), then we erase the debug		// If we cannot find a location to use (merge with), then we erase the debug
// location to prevent debug-info driven tools from potentially reporting		// location to prevent debug-info driven tools from potentially reporting
// wrong location information.		// wrong location information.
if (!SuccToSinkTo.empty() && InsertPos != SuccToSinkTo.end())		if (!SuccToSinkTo.empty() && InsertPos != SuccToSinkTo.end())
MI.setDebugLoc(DILocation::getMergedLocation(MI.getDebugLoc(),		MI.setDebugLoc(DILocation::getMergedLocation(MI.getDebugLoc(),
InsertPos->getDebugLoc()));		InsertPos->getDebugLoc()));
else		else
MI.setDebugLoc(DebugLoc());		MI.setDebugLoc(DebugLoc());

// Move the instruction.		// Move the instruction.
MachineBasicBlock *ParentBlock = MI.getParent();		MachineBasicBlock *ParentBlock = MI.getParent();
SuccToSinkTo.splice(InsertPos, ParentBlock, MI,		SuccToSinkTo.splice(InsertPos, ParentBlock, MI,
++MachineBasicBlock::iterator(MI));		++MachineBasicBlock::iterator(MI));

// Move previously adjacent debug value instructions to the insert position.		// Sink a copy of debug users to the insert position. Mark the original
		// DBG_VALUE location as 'undef', indicating that any earlier variable
		// location should be terminated as we've optimised away the value at this
		// point.
		// If the sunk instruction is a copy, try to forward the copy instead of
		// leaving an 'undef' DBG_VALUE in the original location. Don't do this if
		// there's any subregister weirdness involved.
for (SmallVectorImpl<MachineInstr *>::iterator DBI = DbgValuesToSink.begin(),		for (SmallVectorImpl<MachineInstr *>::iterator DBI = DbgValuesToSink.begin(),
		aprantlUnsubmitted Not Done Reply Inline Actions Not your code, but this looks like it could be replaced by a range-based-for. aprantl: Not your code, but this looks like it could be replaced by a range-based-for.
DBE = DbgValuesToSink.end();		DBE = DbgValuesToSink.end();
DBI != DBE; ++DBI) {		DBI != DBE; ++DBI) {
MachineInstr DbgMI = DBI;		MachineInstr DbgMI = DBI;
SuccToSinkTo.splice(InsertPos, ParentBlock, DbgMI,		MachineInstr NewDbgMI = DbgMI->getMF()->CloneMachineInstr(DBI);
++MachineBasicBlock::iterator(DbgMI));		SuccToSinkTo.insert(InsertPos, NewDbgMI);

		// Copy DBG_VALUE operand and set the original to undef. We then check to
		// see whether this is something that can be copy-forwarded. If it isn't,
		// continue around the loop.
		MachineOperand DbgMO = DbgMI->getOperand(0);
		DbgMI->getOperand(0).setReg(0);

		NikolaPricaUnsubmitted Done Reply Inline Actions TargetInstrInfo::isCopyInstr here? It should support pseudo COPY instruction and target specific register copy instructions. NikolaPrica: TargetInstrInfo::isCopyInstr here? It should support pseudo COPY instruction and target…
		jmorseAuthorUnsubmitted Done Reply Inline Actions Sounds good, updated test to call isCopyInstr. jmorse: Sounds good, updated test to call isCopyInstr.
		const MachineOperand SrcMO = nullptr, DstMO = nullptr;
		if (!TII.isCopyInstr(MI, SrcMO, DstMO))
		continue;

		// Check validity of forwarding this copy.
		bool PostRA = MRI.getNumVirtRegs() == 0;

		// Trying to forward between physical and virtual registers is too hard.
		if (DbgMO.getReg().isVirtual() != SrcMO->getReg().isVirtual())
		continue;

		// Only try virtual register copy-forwarding before regalloc, and physical
		aprantlUnsubmitted Done Reply Inline Actions I may be misunderstanding, but if it isn't a vreg, don't we need to check that there are no defs of the reg in between? Or is this all pre-ra? aprantl: I may be misunderstanding, but if it isn't a vreg, don't we need to check that there are no…
		jmorseAuthorUnsubmitted Done Reply Inline Actions Good question -- this code is called by both pre and post RA code. For post-ra it's guaranteed that there are no defs in between by the calling code: otherwise it would not be legal to sink the copy in the first place. For pre-ra we can rely on the SSA-ness of the function to guarantee validity. However, I've been using the "both-are-vregs" and "both-are-physregs" tests as a proxy for whether we're pre or post RA, which isn't necessarily sound. I don't believe the pre-ra machine sinker will sink anything with meaningful physreg operations, but I've added an explicit test to make this more robust. jmorse: Good question -- this code is called by both pre and post RA code. For post-ra it's guaranteed…
		vskUnsubmitted Done Reply Inline Actions This subreg equality check seems a bit tricky. Mind breaking it up? Currently it reads like '(true\|false) == DstMO->getSubReg()', which is surprising because I expected an unsigned-unsigned comparison. vsk: This subreg equality check seems a bit tricky. Mind breaking it up? Currently it reads like '…
		// register copy-forwarding after regalloc.
		bool arePhysRegs = !DbgMO.getReg().isVirtual();
		if (arePhysRegs != PostRA)
		continue;

		// Pre-regalloc, only forward if all subregisters agree (or there are no
		// subregs at all). More analysis might recover some forwardable copies.
		if (!PostRA && (DbgMO.getSubReg() != SrcMO->getSubReg() \|\|
		aprantlUnsubmitted Done Reply Inline Actions There ought to be some simplification of these two conditions possible. Perhaps adding a `continue`? aprantl: There ought to be some simplification of these two conditions possible. Perhaps adding a…
		DbgMO.getSubReg() != DstMO->getSubReg()))
		aprantlUnsubmitted Done Reply Inline Actions Does this get easier to read if you unconditionally call setReg(0) and then overwrite it in the copy case? aprantl: Does this get easier to read if you unconditionally call setReg(0) and then overwrite it in the…
		jmorseAuthorUnsubmitted Done Reply Inline Actions It does, I've done that and simplified the tests into two (large) conditionals. jmorse: It does, I've done that and simplified the tests into two (large) conditionals.
		continue;
		// Post-regalloc, we may be sinking a DBG_VALUE of a sub or super-register
		// of this copy. Only forward the copy if the DBG_VALUE operand exactly
		// matches the copy destination.
		else if (PostRA && DbgMO.getReg() != DstMO->getReg())
		aprantlUnsubmitted Done Reply Inline Actions nit: else after continue is redundant. aprantl: nit: else after continue is redundant.
		continue;

		DbgMI->getOperand(0).setReg(SrcMO->getReg());
		DbgMI->getOperand(0).setSubReg(SrcMO->getSubReg());
}		}
}		}

/// SinkInstruction - Determine whether it is safe to sink the specified machine		/// SinkInstruction - Determine whether it is safe to sink the specified machine
/// instruction out of its current block into a successor.		/// instruction out of its current block into a successor.
bool MachineSinking::SinkInstruction(MachineInstr &MI, bool &SawStore,		bool MachineSinking::SinkInstruction(MachineInstr &MI, bool &SawStore,
AllSuccsCache &AllSuccessors) {		AllSuccsCache &AllSuccessors) {
// Don't sink instructions that the target prefers not to sink.		// Don't sink instructions that the target prefers not to sink.
▲ Show 20 Lines • Show All 457 Lines • Show Last 20 Lines

test/DebugInfo/MIR/X86/postra-subreg-sink.mir

Show First 20 Lines • Show All 63 Lines • ▼ Show 20 Lines	body: \|
; DBG_VALUE of $bx should sink: a write of its superregister sinks		; DBG_VALUE of $bx should sink: a write of its superregister sinks
; DBG_VALUE of $ecx should sink: a write of one of its subregisters sinks		; DBG_VALUE of $ecx should sink: a write of one of its subregisters sinks

; CHECK-LABEL: name: main		; CHECK-LABEL: name: main
; CHECK: bb.0.entry:		; CHECK: bb.0.entry:
; CHECK: successors: %bb.1(0x80000000)		; CHECK: successors: %bb.1(0x80000000)
; CHECK: liveins: $edi		; CHECK: liveins: $edi
; CHECK: DBG_VALUE $edi, $noreg, ![[BARVAR]]		; CHECK: DBG_VALUE $edi, $noreg, ![[BARVAR]]
		; CHECK-NEXT: DBG_VALUE $edi, $noreg, ![[ARGVAR]]
		; CHECK-NEXT: DBG_VALUE $noreg, $noreg, ![[BAZVAR]]
; CHECK-NEXT: renamable $cl = MOV8ri 1		; CHECK-NEXT: renamable $cl = MOV8ri 1
		; CHECK-NEXT: DBG_VALUE $noreg, $noreg, ![[FOOVAR]]
; CHECK-NEXT: JMP_1 %bb.1		; CHECK-NEXT: JMP_1 %bb.1
; CHECK: bb.1.return:		; CHECK: bb.1.return:
; CHECK: liveins: $cl, $edi		; CHECK: liveins: $cl, $edi
; CHECK: renamable $ebx = COPY $edi		; CHECK: renamable $ebx = COPY $edi
; CHECK-NEXT: DBG_VALUE $bx, $noreg, ![[BAZVAR]]		; CHECK-NEXT: DBG_VALUE $bx, $noreg, ![[BAZVAR]]
; CHECK-NEXT: DBG_VALUE $ebx, $noreg, ![[ARGVAR]]		; CHECK-NEXT: DBG_VALUE $ebx, $noreg, ![[ARGVAR]]
; CHECK-NEXT: renamable $ch = COPY renamable $cl		; CHECK-NEXT: renamable $ch = COPY renamable $cl
; CHECK-NEXT: DBG_VALUE $ecx, $noreg, ![[FOOVAR]]		; CHECK-NEXT: DBG_VALUE $ecx, $noreg, ![[FOOVAR]]
Show All 24 Lines

test/DebugInfo/MIR/X86/sink-leaves-undef.mir

This file was added.

				# RUN: llc %s -o - -run-pass=machine-sink -mtriple=x86_64-- \| FileCheck %s
				# This is a copy of test/CodeGen/X86/MachineSink-DbgValue.ll, where we
				# additionally test that when the MOV32rm defining %0 is sunk, it leaves
				# an 'undef' DBG_VALUE behind to terminate earlier location ranges.
				--- \|
				target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
				target triple = "x86_64-apple-macosx10.7.0"

				; Function Attrs: nounwind readonly ssp uwtable
				define i32 @foo(i32 %i, i32* nocapture %c) !dbg !4 {
				call void @llvm.dbg.value(metadata i32 %i, metadata !9, metadata !DIExpression()), !dbg !14
				%ab = load i32, i32* %c, align 1, !dbg !15
				call void @llvm.dbg.value(metadata i32* %c, metadata !10, metadata !DIExpression()), !dbg !16
				call void @llvm.dbg.value(metadata i32 %ab, metadata !12, metadata !DIExpression()), !dbg !15
				%cd = icmp eq i32 %i, 42, !dbg !17
				br i1 %cd, label %bb1, label %bb2, !dbg !17

				bb1: ; preds = %0
				%gh = add nsw i32 %ab, 2, !dbg !18
				br label %bb2, !dbg !18

				bb2: ; preds = %bb1, %0
				%.0 = phi i32 [ %gh, %bb1 ], [ 0, %0 ]
				ret i32 %.0, !dbg !19
				}

				; Function Attrs: nounwind readnone speculatable
				declare void @llvm.dbg.value(metadata, metadata, metadata)

				; Function Attrs: nounwind
				declare void @llvm.stackprotector(i8, i8*)

				!llvm.dbg.cu = !{!0}
				!llvm.module.flags = !{!3}

				!0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "Apple clang version 3.0 (tags/Apple/clang-211.10.1) (based on LLVM 3.0svn)", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !2, retainedTypes: !2)
				!1 = !DIFile(filename: "a.c", directory: "/private/tmp")
				!2 = !{}
				!3 = !{i32 1, !"Debug Info Version", i32 3}
				!4 = distinct !DISubprogram(name: "foo", scope: !1, file: !1, line: 2, type: !5, virtualIndex: 6, flags: DIFlagPrototyped, spFlags: DISPFlagDefinition \| DISPFlagOptimized, unit: !0, retainedNodes: !8)
				!5 = !DISubroutineType(types: !6)
				!6 = !{!7}
				!7 = !DIBasicType(name: "int", size: 32, align: 32, encoding: DW_ATE_signed)
				!8 = !{!9, !10, !12}
				!9 = !DILocalVariable(name: "i", arg: 1, scope: !4, file: !1, line: 2, type: !7)
				!10 = !DILocalVariable(name: "c", arg: 2, scope: !4, file: !1, line: 2, type: !11)
				!11 = !DIDerivedType(tag: DW_TAG_pointer_type, scope: !0, baseType: !7, size: 64, align: 64)
				!12 = !DILocalVariable(name: "a", scope: !13, file: !1, line: 3, type: !7)
				!13 = distinct !DILexicalBlock(scope: !4, file: !1, line: 2, column: 25)
				!14 = !DILocation(line: 2, column: 13, scope: !4)
				!15 = !DILocation(line: 3, column: 14, scope: !13)
				!16 = !DILocation(line: 2, column: 22, scope: !4)
				!17 = !DILocation(line: 4, column: 3, scope: !13)
				!18 = !DILocation(line: 5, column: 5, scope: !13)
				!19 = !DILocation(line: 7, column: 1, scope: !13)
				; CHECK: ![[VARNUM:[0-9]+]] = !DILocalVariable(name: "a",
				...
				---
				name: foo
				alignment: 4
				tracksRegLiveness: true
				registers:
				- { id: 0, class: gr32 }
				- { id: 1, class: gr32 }
				- { id: 2, class: gr32 }
				- { id: 3, class: gr32 }
				- { id: 4, class: gr64 }
				- { id: 5, class: gr32 }
				- { id: 6, class: gr32 }
				liveins:
				- { reg: '$edi', virtual-reg: '%3' }
				- { reg: '$rsi', virtual-reg: '%4' }
				body: \|
				bb.0 (%ir-block.0):
				successors: %bb.1, %bb.2
				liveins: $edi, $rsi
				; CHECK-LABEL: bb.0 (%ir-block.0):
				; CHECK: DBG_VALUE $noreg, $noreg, ![[VARNUM]]

				DBG_VALUE $edi, $noreg, !9, !DIExpression(), debug-location !14
				DBG_VALUE $rsi, $noreg, !10, !DIExpression(), debug-location !16
				%4:gr64 = COPY $rsi
				DBG_VALUE %4, $noreg, !10, !DIExpression(), debug-location !16
				%3:gr32 = COPY $edi
				DBG_VALUE %3, $noreg, !9, !DIExpression(), debug-location !14
				%0:gr32 = MOV32rm %4, 1, $noreg, 0, $noreg, debug-location !15 :: (load 4 from %ir.c, align 1)
				DBG_VALUE %0, $noreg, !12, !DIExpression(), debug-location !15
				%5:gr32 = MOV32r0 implicit-def dead $eflags
				%6:gr32 = SUB32ri8 %3, 42, implicit-def $eflags, debug-location !17
				JCC_1 %bb.2, 5, implicit $eflags, debug-location !17
				JMP_1 %bb.1, debug-location !17

				bb.1.bb1:
				; CHECK-LABEL: bb.1.bb1:
				; CHECK: %[[VREG:[0-9]+]]:gr32 = MOV32rm
				; CHECK-NEXT: DBG_VALUE %[[VREG]], $noreg, ![[VARNUM]]
				; CHECK-NEXT: ADD32ri8
				%1:gr32 = nsw ADD32ri8 %0, 2, implicit-def dead $eflags, debug-location !18

				bb.2.bb2:
				%2:gr32 = PHI %5, %bb.0, %1, %bb.1
				$eax = COPY %2, debug-location !19
				RET 0, $eax, debug-location !19

				...