This is an archive of the discontinued LLVM Phabricator instance.

Extending debug ranges
ClosedPublic

Authored by tvvikram on Aug 11 2015, 3:21 AM.

Details

Summary

This patch is an early implementation to extend debug ranges and provide multiple location support for debug variables. This work is based on the idea put forward in the mail thread: http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-June/087109.html

A. Debug Range Extension: Currently, the debug ranges in LLVM end prematurely at the end of every basic block. This patch tries to extend them across basic blocks such that when two predecessors of a basic block have the same variable residing in the same location, a new debug value representing the variable and the location is added at the start of the current basic block.

Example: Consider the following partial code containing only DBG_VALUE instructions:
BB#3:
.
DBG_VALUE %EBX, %noreg, !"n", <!17>; line no:8
.
JMP_1 <BB#5>

BB#4:
.
DBG_VALUE %EBX, %noreg, !"n", <!17>; line no:8
.

BB#5: ; preds - BB#3, BB#4
DBG_VALUE %EBX, %noreg, !"n", <!17>; line no:8 <-- newly inserted
.

Here, BB#5 is a successor of both BB#3 and BB#4. Source variable "n" resides in %EBX on both paths. So a new DBG_VALUE instruction will be added at the start of BB#5 for "n" at location %EBX.

B. Multiple locations: Moved to http://reviews.llvm.org/D11986.

Diff Detail

Event Timeline

tvvikram updated this revision to Diff 31781.Aug 11 2015, 3:21 AM
tvvikram retitled this revision from to Extend debug ranges and provide multiple location support for debug variables.
tvvikram updated this object.
tvvikram added a subscriber: llvm-commits.
tvvikram updated this object.Aug 11 2015, 3:26 AM
tvvikram added a reviewer: loladiro.

Adding a few more people who I'd like to take a look and comment. For background, this work is being done as part of our effort to improve debug info quality in julia. There is still some work left to do, but I'd like to gather feedback on implementation/design early, as well as make people aware that this is being worked on.

dblaikie edited edge metadata.Aug 11 2015, 9:52 AM
dblaikie added a subscriber: dblaikie.

+samsonov, who built DbgValueHistoryCalculator

It might be helpful to have a bit of a high level design describing how the
patch achieves this (you could go through & write comments in the code
review to describe the various parts - if they're not worthy of
sufficiently descriptive inline comments in the source itself)

Is it reasonable/possible/helpful to split the two (A and B, in your
original description) features into separate patches to simplify review
(smaller patches are exponentially easier/faster to review)?

aprantl edited edge metadata.Aug 11 2015, 9:55 AM

First off, thanks for working on this!

The patch as it is doesn't apply cleanly (see DbgValueHistoryMap::print) and 243841 is a CFE revision...?

friss edited edge metadata.Aug 11 2015, 11:32 AM

This patch is an early implementation to extend debug ranges and provide multiple location support for debug variables. This work is based on the idea put forward in the mail thread: http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-June/087109.html

Thank you so much for working on this! I just skimmed through the patch and description. Some questions bellow:

A. Debug Range Extension: Currently, the debug ranges in LLVM end prematurely at the end of every basic block. This patch tries to extend them across basic blocks such that when two predecessors of a basic block have the same variable residing in the same location, a new debug value representing the variable and the location is added at the start of the current basic block.

I think there is an interesting corner case here. I believe that constant values should have a special treatment. When you have code like:

int i = 42;
while (i) i =foo(i)

the DBG_VALUE for i will have a constant on one edge into the loop and a register on the other one. In that scenario, in case a constant competes with a dynamic location at BB entry, I strongly believe the dynamic loc should win.

Also, was it really easier to write a different pass to do the data flow propagation? I saw in the patch that it is because you want to insert new instructions, but couldn't it be done without inserting the instructions, just keeping some temporary data structures up-to-date? I can see some appeal in minimally modifying the current code and reviewing the new pass as a different thing though.

B. Multiple locations: LLVM currently can represent a debug variable only at a single location. This patch provides an infrastructure for earlier passes to express multiple locations for a debug variable by adding a new flag - PreserveDbgValLoc to the DBG_VALUE instruction. The presence of the flag will mean multiple debug ranges for the variable can co-exist together. The patch also infers multiple locations by looking at move instructions.

Could that be split, maybe even multiple times (split out of this patch, then the move inference part)? Why is the flag needed rather than tracking multiple locations unconditionally.
One thing that we discussed with Adrian some time ago is that emitting multiple locations isn't really useful for the debuggers we now, they will just use one location. Thus the generic code shouldn't maybe emit multiple locations as it will cost file size and IO time for nothing.

Example - inferring from move instruction: Consider the following two instructions:

DBG_VALUE %EDI, %noreg, !"n", <!12>
%EBX<def> = MOV32rr %EDI
DBG_VALUE %EBX, %noreg, !"n", <!12>; flags: PreserveDbgValLoc  <-- newly inserted

After the copy instruction, the value “n” will be present in two locations - %EDI and %EBX and the same is represented with the new DBG_VALUE instruction having the flag PreserveDbgValLoc set.

The two ranges continue to exist together until one of the following happens:
Either %EDI or %EBX is clobbered, the two ranges are conservatively ended
A DBG_VALUE instruction for variable “n” without the flag is seen, when the previous ranges are ended.

Test cases: 3 new test cases are added to test the above implementations. There are numerous places in the test cases where debug range extension and addition of PreserveDbgValLoc happens. All test cases under 'ninja check' run, with around 5 miscompare errors. I will correct them once the implementation is okay as the miscompares are volatile with code change.

SVN base revision used: 243841

One thing I'm missing is results. You propose the patch, so I suppose this approach works for you. Could you give some details on what works better, eg does it fix all the issues you were seeing? Also, what's the compile time impact of that change and the effect on binary size?

Fred

One thing that we discussed with Adrian some time ago is that emitting multiple locations isn't really useful for the debuggers we now, they will just use one location. Thus the generic code shouldn't maybe emit multiple locations as it will cost file size and IO time for nothing.

This is a chicken-and-egg problem. I think the compiler needs to take the lead here, so that debuggers will have something to experiment with and take advantage of. Actual debug-info-size data would be worth having, if it seems excessive this could be put under some kind of flag. The primary size cost would be in recording the addresses to delimit the additional ranges.

I didn't have a chance to actually play with the code so far (and the patch didn't apply cleanly), but here are some very high level comments and questions:

  • Why did you choose to insert DBG_VALUEs into the machine function instead of implementing it directly inside DbgValueHistoryCalculator?
  • Having it as a separate pass could actually be good for testability — did you look into using the brand new MIR serialization format for writing MIR->MIR testcases?
  • There should also be a test case with an aggregate variable that is split across multiple DW_OP_pieces to make sure they are handled correctly.
  • Did you run any benchmarks with the new pass? Does it cause any measurable slowdown?

The MultipleLocations handling should probably be split out into a separate patch and reviewed separately.

thanks,
adrian

lib/CodeGen/AsmPrinter/DbgValueHistoryCalculator.cpp
135 ↗(On Diff #31781)

Comment should be above the return.

lib/CodeGen/AsmPrinter/DbgValueHistoryCalculator.h
41 ↗(On Diff #31781)

Just a general remark: Please follow
http://llvm.org/docs/CodingStandards.html#doxygen-use-in-documentation-comments
and use /// doxygen comments without \brief where possible.

lib/CodeGen/DebugValueFixup.cpp
1 ↗(On Diff #31781)

Fixup is not a really descriptive name. Really it inserts additional DEBUG_VALUES to extend the live ranges of debug locations. ExtendDebugLocationLiveRanges, DebugValueLiveRangeExtender, ...?

15 ↗(On Diff #31781)

Please elaborate.

19 ↗(On Diff #31781)

Should be rephrased as "This is a separate pass from DbgValueHistoryCalculator because....".

aprantl edited edge metadata.Aug 11 2015, 3:01 PM
aprantl added a subscriber: arphaman.
Prashanth removed a subscriber: Prashanth.
tvvikram added inline comments.Aug 12 2015, 12:59 AM
lib/CodeGen/DebugValueFixup.cpp
1 ↗(On Diff #31781)

I will rename it to ExtendDebugValueRangeAndLocation.cpp?

+samsonov, who built DbgValueHistoryCalculator

It might be helpful to have a bit of a high level design describing how the
patch achieves this (you could go through & write comments in the code
review to describe the various parts - if they're not worthy of
sufficiently descriptive inline comments in the source itself)

Is it reasonable/possible/helpful to split the two (A and B, in your
original description) features into separate patches to simplify review
(smaller patches are exponentially easier/faster to review)?

I am separating the patches.

This patch is an early implementation to extend debug ranges and provide multiple location support for debug variables. This work is based on the idea put forward in the mail thread: http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-June/087109.html

Thank you so much for working on this! I just skimmed through the patch and description. Some questions bellow:

A. Debug Range Extension: Currently, the debug ranges in LLVM end prematurely at the end of every basic block. This patch tries to extend them across basic blocks such that when two predecessors of a basic block have the same variable residing in the same location, a new debug value representing the variable and the location is added at the start of the current basic block.

I think there is an interesting corner case here. I believe that constant values should have a special treatment. When you have code like:

int i = 42;
while (i) i =foo(i)

the DBG_VALUE for i will have a constant on one edge into the loop and a register on the other one. In that scenario, in case a constant competes with a dynamic location at BB entry, I strongly believe the dynamic loc should win.

Currently, both are treated as separate locations and we do nothing about it.

Also, was it really easier to write a different pass to do the data flow propagation? I saw in the patch that it is because you want to insert new instructions, but couldn't it be done without inserting the instructions, just keeping some temporary data structures up-to-date? I can see some appeal in minimally modifying the current code and reviewing the new pass as a different thing though.

buildLocationLists() need ranges with real instructions. So we had to insert a new instruction.

B. Multiple locations: LLVM currently can represent a debug variable only at a single location. This patch provides an infrastructure for earlier passes to express multiple locations for a debug variable by adding a new flag - PreserveDbgValLoc to the DBG_VALUE instruction. The presence of the flag will mean multiple debug ranges for the variable can co-exist together. The patch also infers multiple locations by looking at move instructions.

Could that be split, maybe even multiple times (split out of this patch, then the move inference part)? Why is the flag needed rather than tracking multiple locations unconditionally.

Flag is provided so that prior passes can set it to convey that a variable exists at a location along with previous locations, if any.

One thing that we discussed with Adrian some time ago is that emitting multiple locations isn't really useful for the debuggers we now, they will just use one location. Thus the generic code shouldn't maybe emit multiple locations as it will cost file size and IO time for nothing.

Example - inferring from move instruction: Consider the following two instructions:

DBG_VALUE %EDI, %noreg, !"n", <!12>
%EBX<def> = MOV32rr %EDI
DBG_VALUE %EBX, %noreg, !"n", <!12>; flags: PreserveDbgValLoc  <-- newly inserted

After the copy instruction, the value “n” will be present in two locations - %EDI and %EBX and the same is represented with the new DBG_VALUE instruction having the flag PreserveDbgValLoc set.

The two ranges continue to exist together until one of the following happens:
Either %EDI or %EBX is clobbered, the two ranges are conservatively ended
A DBG_VALUE instruction for variable “n” without the flag is seen, when the previous ranges are ended.

Test cases: 3 new test cases are added to test the above implementations. There are numerous places in the test cases where debug range extension and addition of PreserveDbgValLoc happens. All test cases under 'ninja check' run, with around 5 miscompare errors. I will correct them once the implementation is okay as the miscompares are volatile with code change.

SVN base revision used: 243841

One thing I'm missing is results. You propose the patch, so I suppose this approach works for you. Could you give some details on what works better, eg does it fix all the issues you were seeing? Also, what's the compile time impact of that change and the effect on binary size?

Fred

I didn't have a chance to actually play with the code so far (and the patch didn't apply cleanly), but here are some very high level comments and questions:

  • Why did you choose to insert DBG_VALUEs into the machine function instead of implementing it directly inside DbgValueHistoryCalculator?

I have written a brief note about it at start of DebugValueFixup.cpp.

  • Having it as a separate pass could actually be good for testability — did you look into using the brand new MIR serialization format for writing MIR->MIR testcases?

I am quite new to MIR serialization. I will try.

  • There should also be a test case with an aggregate variable that is split across multiple DW_OP_pieces to make sure they are handled correctly.

Will try to come up with one.

  • Did you run any benchmarks with the new pass? Does it cause any measurable slowdown?

I have not tried any benchmarks as of yet.

The MultipleLocations handling should probably be split out into a separate patch and reviewed separately.

Creating 2 new patches.

thanks,
adrian

I have created 2 new patches. Should I create a new review or should I update in the current review itself?

Just update this review with the basic patch and create a new one for the multivalue handling and put a links to the other one into the description.

thanks!

tvvikram updated this revision to Diff 31969.Aug 12 2015, 12:21 PM

This patch contains only Extending Debug Range implementation. The patch on multiple location support for debug variables will be a new review as suggested.

tvvikram retitled this revision from Extend debug ranges and provide multiple location support for debug variables to Extending debug ranges.Aug 12 2015, 12:37 PM
tvvikram updated this object.
tvvikram added a comment.EditedAug 12 2015, 12:44 PM
  • There should also be a test case with an aggregate variable that is split across multiple DW_OP_pieces to make sure they are handled correctly.

Will try to come up with one.

There were pieces-*.ll tests under test/DebugInfo/ that handled aggregate variables. I have corrected them to match the new correct ranges that get generated.

Have you had a chance to benchmark the new pass? (Compiling clang with optimizations+debug info would be a good start)

Here's a test case involving constants to play with:

void g(int *);
int f() {
  int x = 23;
  g(&x);
  if (x == 42)
    ++x;
  return x;
}

Compiled with -O1 I get:

define i32 @f() #0 {
entry:
  %x = alloca i32, align 4
  %0 = bitcast i32* %x to i8*, !dbg !14
  call void @llvm.lifetime.start(i64 4, i8* %0) #1, !dbg !14
  tail call void @llvm.dbg.value(metadata i32 23, i64 0, metadata !9, metadata !15), !dbg !16
  store i32 23, i32* %x, align 4, !dbg !16, !tbaa !17
  tail call void @llvm.dbg.value(metadata i32* %x, i64 0, metadata !9, metadata !15), !dbg !16
  call void @g(i32* nonnull %x) #1, !dbg !21
  call void @llvm.dbg.value(metadata i32* %x, i64 0, metadata !9, metadata !15), !dbg !16
  %1 = load i32, i32* %x, align 4, !dbg !22, !tbaa !17
  %cmp = icmp eq i32 %1, 42, !dbg !24
  br i1 %cmp, label %if.then, label %if.end, !dbg !25

if.then:                                          ; preds = %entry
  %inc = add nsw i32 %1, 1, !dbg !26
  call void @llvm.dbg.value(metadata i32 %inc, i64 0, metadata !9, metadata !15), !dbg !16
  store i32 %inc, i32* %x, align 4, !dbg !26, !tbaa !17
  br label %if.end, !dbg !26

if.end:                                           ; preds = %if.then, %entry
  call void @llvm.dbg.value(metadata i32* %x, i64 0, metadata !9, metadata !15), !dbg !16
  %2 = load i32, i32* %x, align 4, !dbg !27, !tbaa !17
  call void @llvm.lifetime.end(i64 4, i8* %0) #1, !dbg !28
  ret i32 %2, !dbg !29
}

and then

# *** IR Dump After DBG_VALUE Fixup Pass ***:
# Machine code for function f: Post SSA
Frame Objects:
  fi#-1: size=8, align=16, fixed, at location [SP-8]
  fi#0: size=4, align=4, at location [SP-12]

BB#0: derived from LLVM BB %entry
    Live Ins: %RBP
	PUSH64r %RBP<kill>, %RSP<imp-def>, %RSP<imp-use>; flags: FrameSetup
	CFI_INSTRUCTION <call frame instruction>
	CFI_INSTRUCTION <call frame instruction>
	%RBP<def> = MOV64rr %RSP; flags: FrameSetup
	CFI_INSTRUCTION <call frame instruction>
	%RSP<def,tied1> = SUB64ri8 %RSP<tied0>, 16, %EFLAGS<imp-def,dead>; flags: FrameSetup dbg:var.c:3:7
	DBG_VALUE 23, 0, !"x", <!15>; line no:3
	MOV32mi %RBP, 1, %noreg, -4, %noreg, 23; mem:ST4[%x](tbaa=!18) dbg:var.c:3:7
	%RDI<def> = LEA64r %RBP, 1, %noreg, -4, %noreg
	DBG_VALUE %RDI, 0, !"x", <!15>; line no:3 indirect
	CALL64pcrel32 <ga:@g>, <regmask>, %RSP<imp-use>, %RDI<imp-use>, %RSP<imp-def>; dbg:var.c:4:3
	%EAX<def> = MOV32rm %RBP, 1, %noreg, -4, %noreg; mem:LD4[%x](tbaa=!18) dbg:var.c:5:7
	CMP32ri8 %EAX, 42, %EFLAGS<imp-def>; dbg:var.c:5:7
	JNE_1 <BB#2>, %EFLAGS<imp-use>
    Successors according to CFG: BB#1(16) BB#2(16)

BB#1: derived from LLVM BB %if.then
    Live Ins: %EAX %RBP
    Predecessors according to CFG: BB#0
	DBG_VALUE %RDI, 0, !"x", <!15>; line no:3 indirect
	DBG_VALUE 23, 0, !"x", <!15>; line no:3
	%EAX<def,tied1> = INC32r %EAX<kill,tied0>, %EFLAGS<imp-def,dead>; dbg:var.c:6:5
	DBG_VALUE %EAX, %noreg, !"x", <!15>; line no:3
	MOV32mr %RBP, 1, %noreg, -4, %noreg, %EAX<kill>; mem:ST4[%x](tbaa=!18) dbg:var.c:6:5
    Successors according to CFG: BB#2

BB#2: derived from LLVM BB %if.end
    Live Ins: %RBP
    Predecessors according to CFG: BB#0 BB#1
	DBG_VALUE 23, 0, !"x", <!15>; line no:3
	%EAX<def> = MOV32rm %RBP, 1, %noreg, -4, %noreg; mem:LD4[%x](tbaa=!18) dbg:var.c:7:10
	%RSP<def,tied1> = ADD64ri8 %RSP<tied0>, 16, %EFLAGS<imp-def,dead>; dbg:var.c:7:3
	%RBP<def> = POP64r %RSP<imp-def>, %RSP<imp-use>; dbg:var.c:7:3
	RETQ %EAX; dbg:var.c:7:3

# End machine code for function f.

Note how the constant value of x is propagated into the last basic block.

Have you had a chance to benchmark the new pass? (Compiling clang with optimizations+debug info would be a good start)

I have not done any benchmarking as of now. I will try to add few "STATISTIC" code which will count the number of DBG_VALUE instrs inserted/removed.

I compiled your test case with clang -O1 and here is the input to DBG_VALUE fixup pass (output of StackMapLivenessAnalysis).

# *** IR Dump After StackMap Liveness Analysis ***:
# Machine code for function f: Post SSA
Frame Objects:
  fi#0: size=4, align=4, at location [SP-4]

BB#0: derived from LLVM BB %entry
  PUSH64r %RAX<undef>, %RSP<imp-def>, %RSP<imp-use>; flags: FrameSetup dbg:r.c:3:7
  CFI_INSTRUCTION <call frame instruction>
  DBG_VALUE 23, 0, !"x", <!14>; line no:3
  MOV32mi %RSP, 1, %noreg, 4, %noreg, 23; mem:ST4[%x](tbaa=!17) dbg:r.c:3:7
  %RDI<def> = LEA64r %RSP, 1, %noreg, 4, %noreg
  DBG_VALUE %RDI, 0, !"x", <!14>; line no:3 indirect
  CALL64pcrel32 <ga:@g>, <regmask>, %RSP<imp-use>, %RDI<imp-use>, %RSP<imp-def>; dbg:r.c:4:3
  %EAX<def> = MOV32rm %RSP, 1, %noreg, 4, %noreg; mem:LD4[%x](tbaa=!17) dbg:r.c:5:7
  CMP32ri8 %EAX, 42, %EFLAGS<imp-def>; dbg:r.c:5:7
  JNE_1 <BB#2>, %EFLAGS<imp-use>
    Successors according to CFG: BB#1(16) BB#2(16)

BB#1: derived from LLVM BB %if.then
    Live Ins: %EAX
    Predecessors according to CFG: BB#0
  DBG_VALUE 23, 0, !"x", <!14>; line no:3
  %EAX<def,tied1> = INC32r %EAX<kill,tied0>, %EFLAGS<imp-def,dead>; dbg:r.c:6:5
  DBG_VALUE %EAX, %noreg, !"x", <!14>; line no:3
  MOV32mr %RSP, 1, %noreg, 4, %noreg, %EAX<kill>; mem:ST4[%x](tbaa=!17) dbg:r.c:6:5
    Successors according to CFG: BB#2

BB#2: derived from LLVM BB %if.end
    Predecessors according to CFG: BB#0 BB#1
  DBG_VALUE 23, 0, !"x", <!14>; line no:3
  %EAX<def> = MOV32rm %RSP, 1, %noreg, 4, %noreg; mem:LD4[%x](tbaa=!17) dbg:r.c:7:10
  %RDX<def> = POP64r %RSP<imp-def>, %RSP<imp-use>; dbg:r.c:7:3
  RETQ %EAX; dbg:r.c:7:3

# End machine code for function f.

The DBG_VALUE instr for "x" is already present in BB#2 and only "x" in RDI got propagated to BB#1.

With respect to handling variables having constant value, an early thought is to have ranges for them until the variable is moved to register. When the variable is moved to register, it will end all previous ranges of that variable because the move to register will mean that it may be operated on and the value may change. The current patch handles only register locations.

tvvikram added inline comments.Aug 17 2015, 7:05 AM
lib/CodeGen/DebugValueFixup.cpp
2 ↗(On Diff #31969)

Is it okay to name the pass as "ExtendDebugValueRangeAndLocation" ?

test/DebugInfo/extend-debug-range.ll
1 ↗(On Diff #31969)

I am checking the generated machine instructions after the new pass to check for propagated debug ranges. But llc will also generate the .s file. Now, I tried using -stop-after option to llc, but it still generates the .s file.

Hi, Sorry about not replying earlier last week,

I hope that you will utilize the new MIR serialization format to test the new pass. I added an inline comment that should explain how you can use the new format to create a test case for this pass.

Cheers,
Alex

test/DebugInfo/extend-debug-range.ll
1 ↗(On Diff #31969)

If you would like to test just the individual DebugValueFixup pass, you can use the new 'run-pass' option in llc. The run-pass option expects an MIR file as input. MIR files contain the serialized machine instructions. I'm still working on the documentation for this format, but you can take a look at the current docs at http://llvm.org/docs/MIRLangRef.html.

To generate an input MIR file for this test, you can use the 'stop-after' option with llc. You can pass the current 'extend-debug-range.ll' file directly to it and llc will print the MIR into stdout. You have to specify a name of a pass that runs before the DebugValueFixup for the 'stop-after' option. I see that the DebugValueFixup pass should run after the StackMapLivenessAnalysis pass, so you can obtain the MIR input file by running the following command:

llc -stop-after stackmap-liveness extend-debug-range.ll -o /dev/null > debug-value-fixup.mir

The generated MIR input file will be target dependent, so you will have to place the test in the appropriate target directory in test/CodeGen.

After you've generated an input MIR file, you can transform it into an actual test that tests your pass.
You can use a RUN line similar to this:

RUN: llc -o /dev/null -run-pass=dbgval-fixup -march=??? %s | FileCheck %s

When running this test case, llc will print the produced MIR with the machine instructions that were transformed by your pass directly to stdout.

tvvikram updated this revision to Diff 32399.Aug 18 2015, 4:35 AM

This patch has the following changes over the previous one:

a. Renamed DebugValueFixup pass as ExtendDebugRangeLocation pass.

b. Improved removal of redundant DBG_VALUE instructions. Added a test case for the same - redundant-dbg_val.ll

c. Add "STATISTIC" to count the number of DBG_VALUE instructions inserted/deleted. I am not sure what benchmarks to exactly test.

d. Updated the test case extend-debug-range.ll to compare against assembly instead of machine instructions. I am getting an error in using MIR serialization, I can send a new patch containing the two new MIR test cases once it works.

test/DebugInfo/extend-debug-range.ll
1 ↗(On Diff #31969)

I tried the above commands but I keep getting the following error with the second command:

$llc -run-pass=extend-dbg-range-loc -march=x86-64 d.mir -o /dev/null
error: d.mir:172:76: expected an implicit register operand 'implicit-def %eflags'
    dead %eax = XOR32rr undef %eax, undef %eax, implicit-def dead %eflags, implicit-def %al, debug-location !25

[Please note that the pass has been renamed to extend-dbg-range-loc]

friss added a comment.Aug 18 2015, 8:18 AM

c. Add "STATISTIC" to count the number of DBG_VALUE instructions inserted/deleted. I am not sure what benchmarks to exactly test.

The kind of numbers that I (and Adrian I'm pretty sure) wanted to see are more compiler performance numbers. You add a new pass that will have a cost, and we care a lot about -O0 -g compile times. So what is the cost of your new code in terms of compile time?

tvvikram added a comment.EditedAug 19 2015, 12:46 AM

c. Add "STATISTIC" to count the number of DBG_VALUE instructions inserted/deleted. I am not sure what benchmarks to exactly test.

The kind of numbers that I (and Adrian I'm pretty sure) wanted to see are more compiler performance numbers. You add a new pass that will have a cost, and we care a lot about -O0 -g compile times. So what is the cost of your new code in terms of compile time?

I used clang -O0 -g followed by llc -O0 -time-passes and noticed that the new pass takes around 0.5% - 0.7% of Wall time.

tvvikram updated this revision to Diff 32512.Aug 19 2015, 12:58 AM

Added an .mir test case under test/DebugInfo/MIR/

Thanks!

test/DebugInfo/MIR/extend-debug-range.mir
1 ↗(On Diff #32512)

This test is target specific, so you have to add the -march or -mtriple option to llc's parameters.

The test should also be placed in a target specific subdirectory, something like 'DebugInfo/MIR/X86' will work well, but feel free to use another one. This target specific test directory also needs special check in the 'lit.local.cfg' file to make sure that the tests in that directory run only when the target is available. For X86, you can use a condition like this:

if not 'X86' in config.root.targets:
    config.unsupported = True
arphaman added inline comments.Aug 19 2015, 5:28 PM
test/DebugInfo/MIR/extend-debug-range.mir
249 ↗(On Diff #32512)

I tweaked the syntax for the memory operands slightly while working on the fix for the problem you reported, so this instruction should be as follows now:

MOV32mr %rip, 1, _, @m, _, %ecx, debug-location !37 :: (store 4 into @m)
tvvikram updated this revision to Diff 32795.Aug 20 2015, 10:00 PM

Updated .mir test cases and added them under test/DebugInfo/MIR/X86

I compiled your test case with clang -O1 and here is the input to DBG_VALUE fixup pass (output of StackMapLivenessAnalysis).

The DBG_VALUE instr for "x" is already present in BB#2 and only "x" in RDI got propagated to BB#1.

Looking into this I found that LiveDebugVariables is the culprit here:

void UserValue::extendDef(SlotIndex Idx, unsigned LocNo,
                          LiveRange *LR, const VNInfo *VNI,
                          SmallVectorImpl<SlotIndex> *Kills,
                          LiveIntervals &LIS, MachineDominatorTree &MDT,
                          UserValueScopes &UVS) {
    ...
    for (unsigned i = 0, e = Children.size(); i != e; ++i) {
      MachineBasicBlock *MBB = Children[i]->getBlock();
      if (UVS.dominates(MBB))
        Todo.push_back(LIS.getMBBStartIdx(MBB));
  }
}

It look like it just unconditionally propagates all DBG_VALUEs down the dominator tree, which happens to work fine if there already is another DBG_VALUE but is wrong if the DBG_VALUE is coming from only one of the predecessors like in this example here.

I filed this as PR24563 (https://llvm.org/bugs/show_bug.cgi?id=24563).

With respect to handling variables having constant value, an early thought is to have ranges for them until the variable is moved to register. When the variable is moved to register, it will end all previous ranges of that variable because the move to register will mean that it may be operated on and the value may change. The current patch handles only register locations.

That sounds reasonable. Memory locations should be treated similarly, too.

I compiled your test case with clang -O1 and here is the input to DBG_VALUE fixup pass (output of StackMapLivenessAnalysis).

The DBG_VALUE instr for "x" is already present in BB#2 and only "x" in RDI got propagated to BB#1.

Looking into this I found that LiveDebugVariables is the culprit here:

void UserValue::extendDef(SlotIndex Idx, unsigned LocNo,
                          LiveRange *LR, const VNInfo *VNI,
                          SmallVectorImpl<SlotIndex> *Kills,
                          LiveIntervals &LIS, MachineDominatorTree &MDT,
                          UserValueScopes &UVS) {
    ...
    for (unsigned i = 0, e = Children.size(); i != e; ++i) {
      MachineBasicBlock *MBB = Children[i]->getBlock();
      if (UVS.dominates(MBB))
        Todo.push_back(LIS.getMBBStartIdx(MBB));
  }
}

It look like it just unconditionally propagates all DBG_VALUEs down the dominator tree, which happens to work fine if there already is another DBG_VALUE but is wrong if the DBG_VALUE is coming from only one of the predecessors like in this example here.

I filed this as PR24563 (https://llvm.org/bugs/show_bug.cgi?id=24563).

Okay!

With respect to handling variables having constant value, an early thought is to have ranges for them until the variable is moved to register. When the variable is moved to register, it will end all previous ranges of that variable because the move to register will mean that it may be operated on and the value may change. The current patch handles only register locations.

That sounds reasonable. Memory locations should be treated similarly, too.

Okay. Is it fine if this implementation is submitted as a separate patch? The current patch handles only register locations.

The LLVM project generally encourages smaller, incremental patches, as they tend to be easier to review and test.

I just commented in the PR:

The debug range extension (http://reviews.llvm.org/D11933) is partially redundant with what LiveDebugVariables is trying to do here. The range extension pass is a full data flow analysis — this means we could simplify LiveDebugVariables and have it no longer attempt to extend the liveness of a DEBUG_VALUE beyond the end of a basic block.

It's possible that simplifying LiveDebugVariables could make up some of the runtime that is spent in the debug range extension pass.

I just commented in the PR:

The debug range extension (http://reviews.llvm.org/D11933) is partially redundant with what LiveDebugVariables is trying to do here. The range extension pass is a full data flow analysis — this means we could simplify LiveDebugVariables and have it no longer attempt to extend the liveness of a DEBUG_VALUE beyond the end of a basic block.

It's possible that simplifying LiveDebugVariables could make up some of the runtime that is spent in the debug range extension pass.

I added a comment in PR (https://llvm.org/bugs/show_bug.cgi?id=24563#c3)

Please let me know if the patch requires any more changes.

Thanks so far. I added a bunch of more detailed comments inline.

lib/CodeGen/ExtendDebugRangeLocation.cpp
333 ↗(On Diff #32795)

Please remove anything multiple locations from this patch it makes reviewing more complicated than necessary.

336 ↗(On Diff #32795)

Maybe: Terminate all open ranges at the end of the current basic block.

353 ↗(On Diff #32795)

remove these comments.

356 ↗(On Diff #32795)

Doxygen (see above)
Also the comment should mention that this joins the analysis results of all incoming edges in a basic block.

359 ↗(On Diff #32795)

I don't understand the first sentence.

393 ↗(On Diff #32795)

range-based for?

404 ↗(On Diff #32795)

drop "currently"

443 ↗(On Diff #32795)

transfer

test/DebugInfo/MIR/X86/extend-debug-range.mir
101 ↗(On Diff #32795)

We usually drop all non-essential attributes from IR testcases (usually everything in quotes).

aprantl added inline comments.Aug 26 2015, 5:33 PM
lib/CodeGen/ExtendDebugRangeLocation.cpp
1 ↗(On Diff #32795)

Ranges are an implementation detail and probably shouldn't show up in the name.

10 ↗(On Diff #32795)

This should ideally be a doxygen comment.

10 ↗(On Diff #32795)

Instead of saying "Fixup", better say:
This pass implements a data flow analysis that propagates debug location information by inserting additional DBG_VALUE instructions into the machine instruction stream.

12 ↗(On Diff #32795)

"This pass internally builds debug location liveness ranges to determine the points where additional DBG_VALUEs need to be inserted."?

16 ↗(On Diff #32795)

Please only document what the code does and don't talk about future extensions.

18 ↗(On Diff #32795)

???

21 ↗(On Diff #32795)

This is not a sound argument. If this were implement in DbgValueHistoryCalculator we wouldn't need to emit any DBG_VALUE intrinsics, because we could just pass the ranges themselves to the backend.

Let's just say that this is a separate pass to facilitate testing and improve modularity.

57 ↗(On Diff #32795)

LLVM build with autobrief, so unless you want more than the first sentence to appear in the brief section, the \brief can/should be omitted from the Doxygen comments.

78 ↗(On Diff #32795)

///

80 ↗(On Diff #32795)

For better readability it might be worth considering using a struct with self-explaining named members instead of an opaque std::pair.

82 ↗(On Diff #32795)

Same here.

139 ↗(On Diff #32795)

Full sentences ending with "." are preferred. That said, this comment essentially repeats the function name... what it should say is that it expects DebugValues as inputs.

159 ↗(On Diff #32795)

Better: \return the instruction following \c MI.

161 ↗(On Diff #32795)

This could be split out into a separate patch and/or tested separately.

179 ↗(On Diff #32795)

Could this be written as a ranged-based for?

190 ↗(On Diff #32795)

Please run the path through clang-format.

238 ↗(On Diff #32795)

missing . at the end

239 ↗(On Diff #32795)

range-based for?

250 ↗(On Diff #32795)

remove the comment.

262 ↗(On Diff #32795)

doxygen

280 ↗(On Diff #32795)

doxygen. (the coding guidelines prefer only a single comment on the declaration, so they don't get out of sync).

281 ↗(On Diff #32795)

please remove references to multiple locations in this patch

289 ↗(On Diff #32795)

Why is this correct?

306 ↗(On Diff #32795)

Are constants and memory locations handled in a safe/correct way?

312 ↗(On Diff #32795)

The enclosing function is very long. I suggest splitting out certain pieces into separate functions with self-explanatory names so a reader can understand what the function does on a higher level. This looks like a good candidate for that.

tvvikram added a comment.EditedAug 27 2015, 5:09 AM

Thanks so far. I added a bunch of more detailed comments inline.

I will make the suggested changes and submit a new patch.

I am not sure about the name for the new pass. Please suggest.
I have a couple of names though - ImproveDebugValues.cpp, DebugValueDFA.cpp (DFA: DataFlow Analysis). Please confirm.

Thanks so far. I added a bunch of more detailed comments inline.

I will make the suggested changes and submit a new patch.

I am not sure about the name for the new pass. Please suggest.
I have a couple of names though - ImproveDebugValues.cpp, DebugValueDFA.cpp (DFA: DataFlow Analysis). Please confirm.

To prevent you from having to rename the file and pass more than once, let's defer this decision until the very end.
Another possible name is "DebugValuePropagation", although "LiveDebugValues", or "DebugValueLiveness" would actually be the most fitting, because we are doing a Liveness analysis. This conflicts with the existing LiveDebugVariables pass that consumes all DBG_VALUEs that describe vregs, builds a side-table that is used during register allocation and writes out new DBG_VALUEs with real registers after regalloc is done. Perhaps we could rename the old LiveDebugVariables to "MaterializeDebugValues", or "RegallocDebugValues?

To recap, my vision is that:

  • the old LiveDebugVariables becomes a simple and fast BB-local pass that emits the bare minimum of DBG_VALUEs and
  • this new pass then performs the control-flow-aware liveness DFA that extends their range beyond BB boundaries by inserting enough DBG_VALUEs that
  • the DWARF backend may again assume that DBG_VALUEs are BB-local (and only performs a very simple merging of adjacent ranges).

This way we will end up with small, understandable, and (via MIR) testable passes with a well-defined scope that do one task well.

tvvikram updated this revision to Diff 33700.Sep 1 2015, 9:22 AM
tvvikram marked 25 inline comments as done.

Made review comment changes.

Thanks so far. I added a bunch of more detailed comments inline.

I will make the suggested changes and submit a new patch.

I am not sure about the name for the new pass. Please suggest.
I have a couple of names though - ImproveDebugValues.cpp, DebugValueDFA.cpp (DFA: DataFlow Analysis). Please confirm.

To prevent you from having to rename the file and pass more than once, let's defer this decision until the very end.

Thanks.

Another possible name is "DebugValuePropagation", although "LiveDebugValues", or "DebugValueLiveness" would actually be the most fitting, because we are doing a Liveness analysis. This conflicts with the existing LiveDebugVariables pass that consumes all DBG_VALUEs that describe vregs, builds a side-table that is used during register allocation and writes out new DBG_VALUEs with real registers after regalloc is done. Perhaps we could rename the old LiveDebugVariables to "MaterializeDebugValues", or "RegallocDebugValues?

The new pass could be a place to do generic code changes related to DBG_VALUE instructions like inferring multiple locations, inserting missing DBG_VALUE instructions if possible, etc. If that is the case, I think we should name the pass with a generic name. Otherwise, I am okay with the name changes you suggested.

To recap, my vision is that:

  • the old LiveDebugVariables becomes a simple and fast BB-local pass that emits the bare minimum of DBG_VALUEs and
  • this new pass then performs the control-flow-aware liveness DFA that extends their range beyond BB boundaries by inserting enough DBG_VALUEs that
  • the DWARF backend may again assume that DBG_VALUEs are BB-local (and only performs a very simple merging of adjacent ranges).

This way we will end up with small, understandable, and (via MIR) testable passes with a well-defined scope that do one task well.

Thanks for the high level view.

lib/CodeGen/ExtendDebugRangeLocation.cpp
16 ↗(On Diff #32795)

Removed.

18 ↗(On Diff #32795)

Removed. Initially, this file was meant to handle any code changes related to DBG_VALUE instructions.

159 ↗(On Diff #32795)

I am not sure about it. MI could already be the last instr in MBB. Anyway, I have removed this routine for now.

161 ↗(On Diff #32795)

Removed. I will submit a separate patch for this.

179 ↗(On Diff #32795)

I was not updating the iterators after erasing 'u' at line 221. I have rewritten the iterators but could not convert it to 'range-based for'.

289 ↗(On Diff #32795)

Unintended comment. Removed.

306 ↗(On Diff #32795)

AFAIK, yes as only register locations are handled now.

333 ↗(On Diff #32795)

Sorry. I had missed out removing this when I separated out the multiple locations patch. Removed.

359 ↗(On Diff #32795)

Removed. I meant that the current implementation support bbs with at most 2 predecessors.

393 ↗(On Diff #32795)

The comparison at line 397 prevented me in converting to 'range based for'.

Thanks, more comments inline!

Side note, the clang coding standards recommend to write every comment in the code as complete sentences, with a "." at the end.

lib/CodeGen/ExtendDebugRangeLocation.cpp
81 ↗(On Diff #33700)

OpenRanges should probably be a local variable in the while (!BBWorklist.empty()) loop in ExtendRanges().

300 ↗(On Diff #33700)

This loop is quadratic and only works for two predecessors. Off the top of my head, what about, e.g, having a SmallDenseMap that stores a separate IncomingLocs for each MBB?

IncomingLocs could be a sorted SmallVector or another SmallDenseMap of

struct IncomingLoc {
  DIVariable* Var;
  MachineInstr Loc;
};

A nullptr for Loc (or a missing entry) means that this value has been determined to be unknown.
Additionally each MBB gets a flag that indicates whether it has been visited or not.

At the end of each MBB, go through the list of successors and immediately join() the OutgoingLocs with the IncomingLocs of the successor. In the join function every input wins over the IncomingLocs of a not-yet-visited MBB, after the first visit the MBB is marked as visited. Likewise, a nullptr (or whatever we use to indicate the kill value) also wins against any other value.

360 ↗(On Diff #33700)

Ok. See my other comment about generalizing the algorithm.

362 ↗(On Diff #33700)

Typically in LLVM transfer functions return bool and this would be written as:

bool Changed = false;
Changed |= transfer(MI)

Thanks, more comments inline!

Side note, the clang coding standards recommend to write every comment in the code as complete sentences, with a "." at the end.

Thanks. I will make the changes and submit a new patch.

Thanks! Any updates?

Thanks! Any updates?

Sorry, I couldn't work on this for quite sometime. I will update and submit the patch as soon as possible.

tvvikram updated this revision to Diff 39892.Nov 11 2015, 3:13 AM

First, sorry for submitting this after a long gap.

The following changes have been made:
a. Make OpenRanges usage local to every mbb.
b. Rewrite 'join' routine - generic implementation by looking at all predecessors of an mbb to extend the debug value ranges across mbb by using IncomingLocs as suggested in the review comment. Previously, the implementation was limited to 1 and 2 predecessors.

tvvikram marked 3 inline comments as done.Nov 11 2015, 3:21 AM
tvvikram added inline comments.
lib/CodeGen/ExtendDebugRangeLocation.cpp
301 ↗(On Diff #39892)

I have maintained InLocs for every MBB. These InLocs will also represent the set of DBG_VALUE instructions already inserted. At every 'join' call, the incoming locations are calculated progressively over the predecessors of the MBB (as suggested). Any new DBG_VALUE instruction calculated is inserted at the start of the MBB as well as to the InLocs of this MBB.

Thanks of the update, this is starting to look really good. I have a couple of comments inline that are mostly about readability. I'll play with the patch and look at the correctness next. Did you make any compile time measurements with the new algorithm? For example, by compiling clang?

lib/CodeGen/ExtendDebugRangeLocation.cpp
67 ↗(On Diff #39892)

I found InlinedVariable to be misleading because the Inlined-At is optional. What about just calling it a DebugVariable and promoting it to a struct?

/// A potentially inlined instance of a variable.
struct DebugVar {
  const DILocalVariable &Var;
  const DILocation *InlinedAt;
}
79 ↗(On Diff #39892)

This should probably be a (Small)DenseSet (=hash map) instead of a std::map.

87 ↗(On Diff #39892)

Instead of having them as members that are cleared for each function, it would be better to make them local variables that are explicitly passed as arguments. It's less error-prone and makes the code easier to read because there is no global state.

199 ↗(On Diff #39892)

This entire loop could be replaced by a use of std::remove_if with the condition in a lambda. I believe it would be more readable this way.

222 ↗(On Diff #39892)

Again, this can be expressed more compactly with std::remove_if.

256 ↗(On Diff #39892)

Since this comparison is duplicated three times why not implement an operator== for VarLoc?

305 ↗(On Diff #39892)

With the operator== (see line 256) this pattern can probably also be collapsed to a std::remove_if.

test/DebugInfo/X86/sret.ll
7 ↗(On Diff #39892)

I believe, we actually want the dwo_id to be hardcoded in this one testcase, because it forces us to think why the hash changed, when it changed.

tvvikram updated this revision to Diff 40162.Nov 13 2015, 10:09 AM
tvvikram marked 9 inline comments as done.

I have made the suggested changes. In few of the test cases I checked using llc --time-passes, the time spent in this pass was <1%.

tvvikram added inline comments.Nov 13 2015, 10:10 AM
test/DebugInfo/X86/sret.ll
7 ↗(On Diff #39892)

I had noticed that the id had changed in one of previous patches. Fortunately, the id getting generated now is same as in trunk. So I have reverted this change in this patch.

I just manually verified that this indeed works as advertised on a couple of old bugreports with debug info + ASAN. Totally awesome!
Couple more inline comments and I think the only major thing left to do here is to agree on a name for the new pass :-)

It worries me a bit that none of the test cases exhibit the case with multiple predecessors that the join() function is implementing (nested ifs without else, switch stmts...). It would be good to add one.

lib/CodeGen/ExtendDebugRangeLocation.cpp
182 ↗(On Diff #40162)

What if MI is, e.g., constant 1 and V.MI is constant 2? Looks like the function would still return true because isDescribedByReg will return 0 of both. We should also compare the kind (const/reg/...) here.

Regression/Unit test that catches these corner cases?

284 ↗(On Diff #40162)

This can now be converted to a range-based for.

339 ↗(On Diff #40162)

Why is it necessary to push it into InLocs here? Wouldn't transfer pick it up automatically from the DBG_VALUE that we just inserted?

386 ↗(On Diff #40162)

This can now be converted to a range-based for.

tvvikram marked 2 inline comments as done.Nov 25 2015, 9:31 AM

I just manually verified that this indeed works as advertised on a couple of old bugreports with debug info + ASAN. Totally awesome!

Good to hear that!

Couple more inline comments and I think the only major thing left to do here is to agree on a name for the new pass :-)

I am okay with any of the names that you had suggested before. Please confirm. Also I will submit the current changes along with pass renaming.

It worries me a bit that none of the test cases exhibit the case with multiple predecessors that the join() function is implementing (nested ifs without else, switch stmts...). It would be good to add one.

I have come up with a test case that does join() with 3 predecessors. Will submit along with pass renaming.

lib/CodeGen/ExtendDebugRangeLocation.cpp
182 ↗(On Diff #40162)

As the TODO in line 203 says, I am handling DBG_VALUEs that describes a register. OpenRanges when built are guaranteed to describe a register.

339 ↗(On Diff #40162)

InLocs also represent the list of already inserted DBG_VALUEs; because we do not want to insert redundant DBG_VALUEs. Only new incoming locs that are calculated during each join() call for the MBB are inserted.

My vote is to call the pass LiveDbgValues. This is analogous to the well known "live variables" data-flow analysis and thus succinctly describes what this pass is doing (Compute the set of live DBG_VALUEs at each basic block and insert them).

Unless somebody else objects IMO this pass is good to land then.
On the road ahead there are several important things that need to be done in short order, but can/should be done in separate commits:

  • Remove the broken propagation in LiveDebugVariables (https://llvm.org/bugs/show_bug.cgi?id=24563) to fix the correctness issue noted in PR24563. This will also reclaim some of the performance we are spending in this new pass.
  • DbgValueHistoryCalculator contains a hack (collectChangingRegs()) to identify the frame pointer. With this pass in place, we should be able to remove it and rely on DBG_VALUEs without any loss of functionality.
  • Implement the missing location types (constant values, memory locations, fp constants ...)

thanks for all your work,
adrian

tvvikram updated this revision to Diff 41897.Dec 4 2015, 10:38 AM

Renamed the pass as "LiveDebugValues".

aprantl accepted this revision.Dec 8 2015, 8:33 AM
aprantl edited edge metadata.

This is ready to land now. Thanks again!

This revision is now accepted and ready to land.Dec 8 2015, 8:33 AM

Commit rL255096. Thank you!

jfb added a subscriber: jfb.Dec 8 2015, 11:57 PM

Hi Vikram,

This seems to break WebAssembly tests:

https://build.chromium.org/p/client.wasm.llvm/builders/linux/builds/726/steps/steps/logs/stdio

Other buildbots also look very sad:

http://lab.llvm.org:8011/console

Could you look into it?

Thanks,

JF

reverted r255101

@joker.eph Thanks for reverting.
@jfb I am looking into it.

tvvikram updated this revision to Diff 42538.Dec 11 2015, 10:27 AM
tvvikram edited edge metadata.

I got to know that I had built and tested only for certain targets (sorry that I didn't realize about it earlier). I had to make a couple of minor fixes to make the patch work for all targets.

Just to clarify, I am using cmake+Ninja build with the the following command to configure cmake:
cmake -G Ninja -DCMAKE_BUILD_TYPE="Debug" -DLLVM_TARGETS_TO_BUILD="all" -DLLVM_EXPERIMENTAL_TARGETS_TO_BUILD="WebAssembly" -DLLVM_BINUTILS_INCDIR=/usr/include -DBUILD_SHARED_LIBS=1 -DCMAKE_CXX_FLAGS=" -gdwarf-2 " /path/to/llvmsrc/

and got the following result with "ninja check":
[46/46] Running the LLVM regression tests

  • Testing: 15343 tests, 8 threads --

Testing: 0 .. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90..
Testing Time: 224.82s

Expected Passes    : 15157
Expected Failures  : 131
Unsupported Tests  : 55

I got similar results with Release build too. Please let me know if the results look fine and if I need to run any more tests before committing.

This sounds reasonable to me.
We can't test every possible configuration before committing anyway (Windows, Linux, OS X, etc.), this is why we revert easily.

Please go ahead and re-commit.