This is an archive of the discontinued LLVM Phabricator instance.

[LiveDebugVariables] Use lexical scope to trim debug value live intervals
ClosedPublic

Authored by rob.lougher on Jul 27 2017, 11:36 AM.

Details

Summary

This is a fix for PR33730.

The following program causes Live Debug Variables to emit unnecessary debug values:

extern int foobar(int, int, int, int, int);

int F(int i1, int i2, int i3, int i4, int i5) {
  return foobar(i1, i2, i3, i4, i5);
}

int foo(int a, int b, int c, int d, int e) {
  return F(a,b,c,d,e) +
         F(a,b,c,d,e) +
         F(a,b,c,d,e) +
         F(a,b,c,d,e);
}

$ clang foo.c -c -O2 -g -emit-llvm
$ llc foo.bc -print-after virtregrewriter 2>&1 | grep DBG_VALUE | grep i4 | grep 8:10

		DBG_VALUE %EBX, %noreg, !"i4", <!17>; line no:3 inlined @[ foo.c:8:10 ]
		DBG_VALUE <fi#2>, 0, !"i4", <!17>; line no:3 inlined @[ foo.c:8:10 ]
		DBG_VALUE %EBP, %noreg, !"i4", <!17>; line no:3 inlined @[ foo.c:8:10 ]

According to this, the first inlined "i4" parameter is in three locations: %EBX, a stack slot and %EBP (in fact, on entry it is only in %EBX). This can also be seen in the DWARF output, where the debug location ends up with three entries:

0x00000110:     DW_TAG_inlined_subroutine [9] *
                  DW_AT_abstract_origin [DW_FORM_ref4]	(cu + 0x0061 => {0x00000061} "F")
                  DW_AT_low_pc [DW_FORM_addr]	(0x000000000000002f)
                  DW_AT_high_pc [DW_FORM_data4]	(0x00000009)
                  DW_AT_call_file [DW_FORM_data1]	("foo.c")
                  DW_AT_call_line [DW_FORM_data1]	(8)

0x0000013e:       DW_TAG_formal_parameter [10]  
                    DW_AT_location [DW_FORM_sec_offset]	(0x00000312)
                    DW_AT_abstract_origin [DW_FORM_ref4]	(cu + 0x008e => {0x0000008e} "i4")
...
.debug_loc contents:
...
0x00000312: Beginning address offset: 0x000000000000002f
               Ending address offset: 0x0000000000000046
                Location description: 53 

            Beginning address offset: 0x0000000000000046
               Ending address offset: 0x0000000000000065
                Location description: 77 0c 

            Beginning address offset: 0x0000000000000065
               Ending address offset: 0x000000000000009a
                Location description: 56
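
To help read the raw location bytes above, here is a minimal sketch of a decoder for the two expression shapes that appear in this dump. It assumes the standard DWARF opcode ranges (DW_OP_reg0 is 0x50, DW_OP_breg0 is 0x70) and the x86-64 System V DWARF register numbering; the function names are hypothetical and this is not how llvm-dwarfdump is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Map a DWARF register number to a name. The table assumes the x86-64
// System V numbering; other targets number their registers differently.
static std::string regName(unsigned Reg) {
  static const char *const Names[] = {"rax", "rdx", "rcx", "rbx",
                                      "rsi", "rdi", "rbp", "rsp"};
  return Reg < 8 ? std::string(Names[Reg]) : "reg" + std::to_string(Reg);
}

// Decode DW_OP_reg<N> (0x50..0x6f) and DW_OP_breg<N> (0x70..0x8f, followed
// by an SLEB128 offset). Anything else is reported as unhandled.
std::string decodeLocation(const std::vector<std::uint8_t> &Expr) {
  assert(!Expr.empty() && "empty location expression");
  const std::uint8_t Op = Expr[0];

  if (Op >= 0x50 && Op <= 0x6f) // value lives in a register
    return "in " + regName(Op - 0x50);

  if (Op >= 0x70 && Op <= 0x8f) { // value lives at register + offset
    std::uint64_t Bits = 0;
    unsigned Shift = 0;
    std::size_t I = 1;
    std::uint8_t Byte = 0;
    do { // accumulate 7 bits per byte, low bits first
      assert(I < Expr.size() && Shift < 64 && "malformed SLEB128");
      Byte = Expr[I++];
      Bits |= static_cast<std::uint64_t>(Byte & 0x7f) << Shift;
      Shift += 7;
    } while (Byte & 0x80);
    if (Shift < 64 && (Byte & 0x40)) // sign-extend a negative offset
      Bits |= ~static_cast<std::uint64_t>(0) << Shift;
    const std::int64_t Offset = static_cast<std::int64_t>(Bits);
    return "at [" + regName(Op - 0x70) + (Offset < 0 ? "" : "+") +
           std::to_string(Offset) + "]";
  }
  return "unhandled opcode";
}
```

Under those assumptions, the bytes 53, 77 0c and 56 decode to a register, a stack-pointer-relative slot and another register, which lines up with the %EBX, <fi#2> and %EBP debug values printed earlier.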

Similarly, the debug values for "i4" in the second, third and fourth inlined instances show the same three DBG_VALUEs:

$ llc foo.bc -print-after virtregrewriter 2>&1 | grep DBG_VALUE | grep i4 | grep 9:10

		DBG_VALUE %EBX, %noreg, !"i4", <!17>; line no:3 inlined @[ foo.c:9:10 ]
		DBG_VALUE <fi#2>, 0, !"i4", <!17>; line no:3 inlined @[ foo.c:9:10 ]
		DBG_VALUE %EBP, %noreg, !"i4", <!17>; line no:3 inlined @[ foo.c:9:10 ]
...

The problem is caused by the live intervals that Live Debug Variables assigns to the debug values. All four inlined debug values for "i4" share the same virtual register, whose live interval spans all the inlined call sites, although each inlined instance accounts for only a region of the full range. The register allocator later splits the virtual register, and because each debug value is associated with the entire live range, each debug value is split as well, creating a number of intervals per value. Debug values are then emitted for every interval, even though most of the intervals lie outside the inlined instance.

Splitting of the debug values can also create a compile-time explosion. If the above program is modified to create 1500 inlined calls, the program takes over 40 minutes to compile with -g (3.5 GHz Core-i7) and Live Debug Variables inserts over 6 million DBG_VALUEs (see PR33730 for further details).
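
The exact stress program is not shown here, so the following is only a sketch of how a source file of that shape could be generated. The function name makeStressSource is hypothetical, and whether every call is actually inlined at -O2 depends on the inliner; the attachment on PR33730 may look different.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Build a C translation unit shaped like the example in the summary, but
// with NumCalls calls to F() summed in foo(). This reproduces the general
// shape only; it is an assumption, not the test case from the bug report.
std::string makeStressSource(unsigned NumCalls) {
  assert(NumCalls > 0 && "need at least one call");
  std::ostringstream OS;
  OS << "extern int foobar(int, int, int, int, int);\n\n"
     << "int F(int i1, int i2, int i3, int i4, int i5) {\n"
     << "  return foobar(i1, i2, i3, i4, i5);\n"
     << "}\n\n"
     << "int foo(int a, int b, int c, int d, int e) {\n"
     << "  return ";
  for (unsigned I = 0; I < NumCalls; ++I) {
    if (I != 0)
      OS << " +\n         "; // continue the sum on a new, aligned line
    OS << "F(a,b,c,d,e)";
  }
  OS << ";\n}\n";
  return OS.str();
}
```

Writing the result of makeStressSource(1500) to a file and compiling it with the clang and llc commands from the summary would be one way to try to reproduce the slowdown.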

This patch fixes the problem by using the lexical scope of the inlined debug locations to "trim" the debug value live intervals. After this we generate a single debug value:

$ llc foo.bc -print-after virtregrewriter 2>&1 | grep DBG_VALUE | grep i4 | grep 8:10

		DBG_VALUE %EBX, %noreg, !"i4", <!17>; line no:3 inlined @[ foo.c:8:10 ]
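
The trimming idea can be sketched with a simplified model. This is not the code from the patch: it uses plain integers in place of LLVM's SlotIndexes, and the Range, Piece and trimToScope names are hypothetical. It only illustrates intersecting each piece of a split debug value with the instruction ranges of the variable's lexical scope.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Half-open range [Start, End) over instruction indexes.
struct Range {
  unsigned Start;
  unsigned End;
};

// One piece of a debug value's live interval and the location it maps to.
struct Piece {
  Range Live;
  std::string Loc;
};

// Keep only the parts of each piece that overlap the variable's scope.
// Pieces that lie entirely outside the scope are dropped, so no debug
// value is emitted for them.
std::vector<Piece> trimToScope(const std::vector<Piece> &Pieces,
                               const std::vector<Range> &ScopeRanges) {
  std::vector<Piece> Trimmed;
  for (const Piece &P : Pieces) {
    assert(P.Live.Start < P.Live.End && "empty live range");
    for (const Range &S : ScopeRanges) {
      const unsigned Start = std::max(P.Live.Start, S.Start);
      const unsigned End = std::min(P.Live.End, S.End);
      if (Start < End) // non-empty overlap with this scope range
        Trimmed.push_back(Piece{Range{Start, End}, P.Loc});
    }
  }
  return Trimmed;
}
```

With illustrative pieces [0,10) in %EBX, [10,20) in <fi#2> and [20,30) in %EBP, and a scope covering [0,5), only the %EBX piece survives, which mirrors the single DBG_VALUE shown for the first inlined instance.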

The DWARF also now contains a single entry in the debug location:

0x00000110:     DW_TAG_inlined_subroutine [9] *
                  DW_AT_abstract_origin [DW_FORM_ref4]	(cu + 0x0061 => {0x00000061} "F")
                  DW_AT_low_pc [DW_FORM_addr]	(0x000000000000002f)
                  DW_AT_high_pc [DW_FORM_data4]	(0x00000009)
                  DW_AT_call_file [DW_FORM_data1]	("foo.c")
                  DW_AT_call_line [DW_FORM_data1]	(8)

                    DW_AT_location [DW_FORM_sec_offset]	(0x0000017f)
                    DW_AT_abstract_origin [DW_FORM_ref4]	(cu + 0x008e => {0x0000008e} "i4")
...
.debug_loc contents:
...
0x0000017f: Beginning address offset: 0x000000000000002f
               Ending address offset: 0x000000000000004c
                Location description: 53

The other inlined debug values for "i4" now also show the correct values:

$ llc foo.bc -print-after virtregrewriter 2>&1 | grep DBG_VALUE | grep i4 | grep 9:10

		DBG_VALUE %EBX, %noreg, !"i4", <!17>; line no:3 inlined @[ foo.c:9:10 ]
		DBG_VALUE <fi#2>, 0, !"i4", <!17>; line no:3 inlined @[ foo.c:9:10 ]

$ llc foo.bc -print-after virtregrewriter 2>&1 | grep DBG_VALUE | grep i4 | grep 10:10

		DBG_VALUE <fi#2>, 0, !"i4", <!17>; line no:3 inlined @[ foo.c:10:10 ]
		DBG_VALUE %EBP, %noreg, !"i4", <!17>; line no:3 inlined @[ foo.c:10:10 ]

$ llc foo.bc -print-after virtregrewriter 2>&1 | grep DBG_VALUE | grep i4 | grep 11:10

		DBG_VALUE %EBP, %noreg, !"i4", <!17>; line no:3 inlined @[ foo.c:11:10 ]

Diff Detail

Repository
rL LLVM

Event Timeline

rob.lougher created this revision. Jul 27 2017, 11:36 AM
davide added a subscriber: davide. Jul 27 2017, 11:39 AM
aprantl added inline comments. Jul 27 2017, 11:42 AM
lib/CodeGen/LiveDebugVariables.cpp
692 ↗(On Diff #108492)

I would personally prefer a short summary of the problem over a reference to a PR.

aprantl added inline comments. Jul 27 2017, 11:50 AM
lib/CodeGen/LiveDebugVariables.cpp
107 ↗(On Diff #108492)

///

694 ↗(On Diff #108492)

Why only do this for inlined variables? Couldn't non-inlined variables also benefit from being trimmed at their lexical scope's end? Or is this situation impossible for non-inlined variables?

704 ↗(On Diff #108492)

Can you add a high-level comment here explaining what the loop is doing?

probinson added inline comments.
lib/CodeGen/LiveDebugVariables.cpp
25 ↗(On Diff #108492)

Not in alphabetical order.

32 ↗(On Diff #108492)

Not in alphabetical order.

aprantl added inline comments. Jul 27 2017, 4:06 PM
lib/CodeGen/LiveDebugVariables.cpp
707 ↗(On Diff #108492)

Could you run the patch through clang-format?

Address review comments. Now enabled for non-inlined variables. This required fixing a test to make it more resilient (it checks that we get the expected number of debug values, but was also enforcing a specific order). I've also changed where the LexicalScopes are generated, as they're only needed when computing the intervals.

aprantl accepted this revision. Aug 1 2017, 12:34 PM

Small nitpick on the test, but otherwise this looks good now, thanks!

test/DebugInfo/X86/live-debug-variables.ll
24 ↗(On Diff #109160)

If you are only checking for i4, why are all the other dbg.values necessary? I would expect that we either check for them, too, or that we can remove them to keep the test simple.

This revision is now accepted and ready to land. Aug 1 2017, 12:34 PM
This revision was automatically updated to reflect the committed changes.