Page MenuHomePhabricator

[DebugInfo] DBG_VALUE instruction generated for loop counter is placed incorrectly.
ClosedPublic

Authored by avl on May 30 2019, 3:06 AM.

Details

Summary

512B bb.2.for.body:
520B MOVaddr
524B MOVi32imm

DBG_VALUE $x19, $noreg, !"i", <<<<<<<<<<<<<<<

Above DBG_VALUE should be placed before MOV* instructions. Test case presented after
the problem description.

DBG_VALUE instruction is placed incorrectly because wrong debug variable range
is calculated at "COMPUTING LIVE DEBUG VARIABLES" stage. Range for debug variable
instruction (for "i" variable) is computed according to the current state of instructions
inside of loop basic block. But later - Register Allocator creates new instructions
which are not taken into account when Live Debug Variables are computed. Please check
real data(full log in the end): after COMPUTING LIVE DEBUG VARIABLES there is following
instruction range for "i" variable - [544B;848r). It is shorter than Live Interval for
corresponding register [512B,848r:1). Register Allocator inserts two instructions
before 544B. As the result Debug Value record is inserted after these additional
instructions. That is incorrect. Debug Value for the loop counter should be inserted
before any loop instruction.

related reviews: D35953, https://bugs.llvm.org/show_bug.cgi?id=33730, D46599, https://bugs.llvm.org/show_bug.cgi?id=37149.

Debug variable range become incorrect when it was trimmed
at lib/CodeGen/LiveDebugVariables.cpp:

if (I.start() < RStart) {

// Interval start overlaps range - trim to the scope range.
I.setStartUnchecked(RStart);
// Remember that this interval was trimmed.
trimmedDefs.insert(RStart);

}

The fix is to not trim Debug Variable Range.
It does not brake live-debug-variables.ll(no new .debug_loc entries).
It does not affect compilation time for test case from https://bugs.llvm.org/show_bug.cgi?id=33730.

cat test_debug_val.cpp:

void func(int, ...);

int array[0x100];

int main( int argc, char **argv )
{

int var = 56;

int a1 = array[1]; int a2 = array[2]; int a3 = array[3]; int a4 = array[4];
int a5 = array[5]; int a6 = array[6]; int a7 = array[7]; int a8 = array[8];
int a9 = array[9]; int a10 = array[10];

for( int i = 0; i < 0x100; i++ ) {    

    array[i] = var;

    func(0, i, a1, a2, a3, a4, a5, a6, a7, a8, a9, a10 );
}

return 0;

}

clang -O3 -g -c --target=aarch64-unknown-linux -std=gnu++14 test_debug_val.cpp -S -mllvm -print-after-all -mllvm -debug

test_debug_val.s:
.Ltmp7:

//DEBUG_VALUE: main:a2 <- $w21

.LBB0_1: // %for.body

                                // =>This Inner Loop Header: Depth=1
ldr     w2, [sp, #44]           // 4-byte Folded Reload

.Ltmp8:

.loc    1 16 17 is_stmt 1       // test_debug_val.cpp:16:17
adrp    x8, array
add     x8, x8, :lo12:array
mov     w9, #56

.Ltmp9:

//DEBUG_VALUE: i <- $x19

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

DEBUG_VALUE is reported not in the very beginning of Loop.
It is reported after several instructions.

print-after-all data:

  • COMPUTING LIVE DEBUG VARIABLES: main **
  • DEBUG VARIABLES **

!"i,14" [432B;448B):0 [544B;848r):1 Loc0=0 Loc1=%35

  1. * IR Dump After Debug Variable Analysis *:
  2. Machine code for function main: NoPHIs, TracksLiveness

512B bb.2.for.body:

; predecessors: %bb.0, %bb.2
  successors: %bb.1(0x04000000), %bb.2(0x7c000000); %bb.1(3.12%), %bb.2(96.88%)

544B STRWroX %28:gpr32, %27:gpr64common, %35:gpr64common, 0, 1, debug-location !54 :: (store 4 into %ir.scevgep, !tbaa !38); test_debug_val.cpp:16:17

  • INTERVALS **

%35 [8r,448B:0)[512B,848r:1)[848r,944B:2) 0@8r 1@512B-phi 2@848r weight:2.289400e-01

  1. * IR Dump After Greedy Register Allocator *:
  2. Machine code for function main: NoPHIs, TracksLiveness

512B bb.2.for.body:

; predecessors: %bb.0, %bb.2
  successors: %bb.1(0x04000000), %bb.2(0x7c000000); %bb.1(3.12%), %bb.2(96.88%)

520B %37:gpr64common = MOVaddr target-flags(aarch64-page) @array, target-flags(aarch64-pageoff, aarch64-nc) @array, debug-location !54; test_debug_val.cpp:16:17
524B %39:gpr32 = MOVi32imm 56, debug-location !54; test_debug_val.cpp:16:17
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
544B STRWroX %39:gpr32, %37:gpr64common, %35:gpr64common, 0, 1, debug-location !54 :: (store 4 into %ir.scevgep, !tbaa !38); test_debug_val.cpp:16:17

  1. * IR Dump After Virtual Register Rewriter *:
  2. Machine code for function main: NoPHIs, TracksLiveness, NoVRegs

512B bb.2.for.body:

; predecessors: %bb.0, %bb.2
  successors: %bb.1(0x04000000), %bb.2(0x7c000000); %bb.1(3.12%), %bb.2(96.88%)
  liveins: $w20, $w21, $w22, $w23, $w24, $w25, $w26, $w27, $w28, $x19

520B renamable $x8 = MOVaddr target-flags(aarch64-page) @array, target-flags(aarch64-pageoff, aarch64-nc) @array, debug-location !54; test_debug_val.cpp:16:17
524B renamable $w9 = MOVi32imm 56, debug-location !54; test_debug_val.cpp:16:17
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

DBG_VALUE $x19, $noreg, !"i", !DIExpression(), debug-location !51; test_debug_val.cpp:0 line no:14

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
544B STRWroX killed renamable $w9, killed renamable $x8, renamable $x19, 0, 1, debug-location !54 :: (store 4 into %ir.scevgep, !tbaa !38); test_debug_val.cpp:16:17

Diff Detail

Repository
rL LLVM

Event Timeline

avl created this revision.May 30 2019, 3:06 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 30 2019, 3:06 AM
aprantl added a subscriber: debug-info.
aprantl added inline comments.May 30 2019, 8:41 AM
llvm/test/CodeGen/AArch64/wrong_debug_loc_after_regalloc.ll
104 ↗(On Diff #202138)

Please remove all attributes that aren't necessary for the test. This will make maintenance much easier.

llvm/test/DebugInfo/X86/dbg-addr-dse.ll
103 ↗(On Diff #202138)

We can probably delete the TBAA info without compromising the test.

avl updated this revision to Diff 202404.May 31 2019, 3:09 AM

cleaned up tests.

aprantl added inline comments.May 31 2019, 9:16 AM
llvm/test/CodeGen/AArch64/wrong_debug_loc_after_regalloc.ll
45 ↗(On Diff #202404)

what is the debug location attached to the MOV, the DBG_VALUE, and the instructions in between?
If I understand the original code correctly, the intention was to not move a DBG_VALUE before the lexical scope of its variable starts.

llvm/test/DebugInfo/X86/dbg-addr-dse.ll
35 ↗(On Diff #202404)

I don't think it's appropriate to change all dbg.addr instructions to dbg.values in a test that has dbg.addr in the name.

aprantl added a subscriber: rnk.May 31 2019, 9:20 AM
avl marked 2 inline comments as done.Jun 2 2019, 11:52 PM
avl added inline comments.
llvm/test/CodeGen/AArch64/wrong_debug_loc_after_regalloc.ll
45 ↗(On Diff #202404)

log(before this patch) :

512B bb.2.for.body:

520B renamable $x8 = MOVaddr target-flags(aarch64-page) @array, target-flags(aarch64-pageoff, aarch64-nc) @array, debug-location !54; test_debug_val.cpp:16:17
524B renamable $w9 = MOVi32imm 56, debug-location !54; test_debug_val.cpp:16:17

DBG_VALUE $x19, $noreg, !"i", !DIExpression(), debug-location !51; test_debug_val.cpp:0 line no:14

544B STRWroX killed renamable $w9, killed renamable $x8, renamable $x19, 0, 1, debug-location !54 :: (store 4 into %ir.scevgep, !tbaa !38); test_debug_val.cpp:16:17

!51 = !DILocation(line: 0, scope: !35)
!52 = !DILocation(line: 14, column: 4, scope: !35)
!53 = !DILocation(line: 21, column: 4, scope: !14)
!54 = !DILocation(line: 16, column: 8, scope: !55)

short log :

512B bb.2.for.body:

520B $x8 = MOVaddr scope: !55
524B $w9 = MOVi32imm scope: !55

DBG_VALUE $x19, $noreg, !"i" scope: !35

544B STRWroX killed renamable $w9 scope: !55

i.e. DBG_VALUE was inserted into enclosed scope.

The reason for this is that scope identified by MachineInstr :

LiveDebugVariables.cpp:902

const InsnRange &Range : Scope->getRanges()

using InsnRange = std::pair<const MachineInstr *, const MachineInstr *>;

At the moment when scope was created the very first instruction was STRWroX.
So it become start of range. Start slot index(with type of Slot_Block) for
DBG_VALUE was trimmed to this instruction range. But after Register Allocator
there were inserted new MOV instructions before STRWroX with the same scope as STRWroX.
But Slot indices for DBG_VALUE was not updated and scope boundaries was not updated.

Actually Slot indices for DBG_VALUE looks correct without trimming : 512B,848B.
512B points to start of basic block:

512B bb.2.for.body:

which match with definition of slot index of type Slot_Block:

Basic block boundary. Used for live ranges entering and leaving a
/ block without being live in the layout neighbor. Also used as the
/// def slot of PHI-defs.

i.e. 512B,848B actually match with lexical scope and does not require trimming.

Also, trimming this slot index and leaving it of type Slot_Block is a error. Slot_Block should point to start of basic block.

I.setStartUnchecked(RStart); <<< trimmed I.start() is a Slot_Block

This concrete problem could be solved by avoiding trimming for Slot_Block only.

But it looks like trimming also is not required for the original problem test case : live-debug-variables.ll. With and without trimming it generates the same DBG_VALUE instructions :

	  DBG_VALUE $ecx, $noreg, !"i4", !DIExpression(), debug-location !14; foo.c:3:35 @[ foo.c:8:10 ] line no:3
	  DBG_VALUE $ebx, $noreg, !"i4", !DIExpression(), debug-location !14; foo.c:3:35 @[ foo.c:8:10 ] line no:3

So in short :

  1. For the problem from D62650: SlotIndex was trimmed incorrectly. It does not require trimming and has incorrect value for Slot_Block after trimming.
  1. Trimming for testcase from D35953 looks unnecessary. With and without trimming it generates the same DBG_VALUE instructions.

Thus it looks like trimming could be removed at all.

llvm/test/DebugInfo/X86/dbg-addr-dse.ll
35 ↗(On Diff #202404)

I noticed that current clang generates IR without @llvm.dbg.addr.
That is why I regenerated the test. But I agree that this test should still
check for @llvm.dbg.addr as it was created originally. So I would return it back ...

aprantl added inline comments.Jun 3 2019, 8:29 AM
llvm/test/DebugInfo/X86/dbg-addr-dse.ll
35 ↗(On Diff #202404)

IIRC, there's a flag to change this behavior. It's the default on windows only.

I'm reposting your reply with an easier to read format:

log(before this patch) :

512B bb.2.for.body:

520B renamable $x8 = MOVaddr target-flags(aarch64-page) @array, target-flags(aarch64-pageoff, aarch64-nc) @array, debug-location !54; test_debug_val.cpp:16:17
524B renamable $w9 = MOVi32imm 56, debug-location !54; test_debug_val.cpp:16:17

DBG_VALUE $x19, $noreg, !"i", !DIExpression(), debug-location !51; test_debug_val.cpp:0 line no:14
544B STRWroX killed renamable $w9, killed renamable $x8, renamable $x19, 0, 1, debug-location !54 :: (store 4 into %ir.scevgep, !tbaa !38); test_debug_val.cpp:16:17

!51 = !DILocation(line: 0, scope: !35)
!52 = !DILocation(line: 14, column: 4, scope: !35)
!53 = !DILocation(line: 21, column: 4, scope: !14)
!54 = !DILocation(line: 16, column: 8, scope: !55)

short log :

512B bb.2.for.body:

520B $x8 = MOVaddr scope: !55
524B $w9 = MOVi32imm scope: !55

DBG_VALUE $x19, $noreg, !"i" scope: !35
544B STRWroX killed renamable $w9 scope: !55

i.e. DBG_VALUE was inserted into enclosed scope.

The reason for this is that scope identified by MachineInstr :

LiveDebugVariables.cpp:902

const InsnRange &Range : Scope->getRanges()

using InsnRange = std::pair<const MachineInstr *, const MachineInstr *>;

At the moment when scope was created the very first instruction was STRWroX.
So it become start of range. Start slot index(with type of Slot_Block) for
DBG_VALUE was trimmed to this instruction range. But after Register Allocator
there were inserted new MOV instructions before STRWroX with the same scope as STRWroX.
But Slot indices for DBG_VALUE was not updated and scope boundaries was not updated.

Actually Slot indices for DBG_VALUE looks correct without trimming : 512B,848B.
512B points to start of basic block:

512B bb.2.for.body:

which match with definition of slot index of type Slot_Block:

Basic block boundary. Used for live ranges entering and leaving a
/ block without being live in the layout neighbor. Also used as the
/// def slot of PHI-defs.

i.e. 512B,848B actually match with lexical scope and does not require trimming.

Also, trimming this slot index and leaving it of type Slot_Block is a error. Slot_Block should point to start of basic block.

I.setStartUnchecked(RStart); <<< trimmed I.start() is a Slot_Block

This concrete problem could be solved by avoiding trimming for Slot_Block only.

But it looks like trimming also is not required for the original problem test case : live-debug-variables.ll. With and without trimming it generates the same DBG_VALUE instructions :

DBG_VALUE $ecx, $noreg, !"i4", !DIExpression(), debug-location !14; foo.c:3:35 @[ foo.c:8:10 ] line no:3
DBG_VALUE $ebx, $noreg, !"i4", !DIExpression(), debug-location !14; foo.c:3:35 @[ foo.c:8:10 ] line no:3

So in short :

For the problem from D62650: SlotIndex was trimmed incorrectly. It does not require trimming and has incorrect value for Slot_Block after trimming.
Trimming for testcase from D35953 looks unnecessary. With and without trimming it generates the same DBG_VALUE instructions.
Thus it looks like trimming could be removed at all.

Have you looked at the performance of the example posted in https://bugs.llvm.org/show_bug.cgi?id=33730 ?

Have you looked at the performance of the example posted in https://bugs.llvm.org/show_bug.cgi?id=33730 ?

Just to clarify: I'm asking because this change was introduced to fix a performance problem, not a correctness issue, but the LLVM testsuite only verifies the functional change, not its performance characteristics. I'm worried that reverting the change would regress the performance again.

avl added a comment.Jun 3 2019, 9:58 AM

Have you looked at the performance of the example posted in https://bugs.llvm.org/show_bug.cgi?id=33730 ?

Yes. I checked the performance impact of this patch for test case from https://bugs.llvm.org/show_bug.cgi?id=33730.
It does not increase execution time.

Removing full patch for D35953 affects performance. But removing only trimming of slot indexes does not affect performance.

I will post performance results for both modes shortly.

avl updated this revision to Diff 202801.Jun 3 2019, 2:35 PM

restored dbg-addr-dse.ll.

avl added a comment.Jun 3 2019, 2:38 PM

following performance results with and without this patch. they are equal :

if this patch applied:

time clang foo.c -O2 -c -g -mllvm -stats 2>&1 | grep livedebug
15074 livedebugvars - Number of DBG_VALUEs inserted

real 0m4,819s
user 0m4,799s
sys 0m0,022s

if this patch is not applied:

time clang foo.c -O2 -c -g -mllvm -stats 2>&1 | grep livedebug
15074 livedebugvars - Number of DBG_VALUEs inserted

real 0m4,838s
user 0m4,808s
sys 0m0,032s

Before D35953 that time was over 40min.

aprantl accepted this revision.Jun 4 2019, 8:57 PM

I'm kind of curious where the performance is gained back, but generally, if the performance is the same, we can tentatively land this. Please be quick to revert if complaints about the performance resurface.

This revision is now accepted and ready to land.Jun 4 2019, 8:57 PM

And thanks for doing the measurements!

dstenb added a subscriber: dstenb.Jun 5 2019, 3:50 AM
This revision was automatically updated to reflect the committed changes.