Page MenuHomePhabricator

Fix debug_loc offset difference with basic block sections
Needs ReviewPublic

Authored by tmsriram on Sat, Aug 1, 9:42 PM.

Details

Summary

debug_loc for params not used in entry block need to be computed with the section start as the base label.

The attached test will not assemble without the patch as the assembler will complain that we are trying to compute a label difference from 2 different sections.

If the basic block containing the first mention of the param is not in the same section as the entry basic block, we cannot use the CurrentFnBegin label. To fix it, we use the Section begin label instead. Using the Section Label will result in loss of debug loc fidelity as we do not capture the basic blocks before the start of this section.

Right now, LabelsBeforeInsn just stores one label for each insn. We can fix the fidelity problem by storing all the basic block section labels that appear before the insn and generating a difference for each of them. Hence, I have made this as a FIXME.

Is this alright? I am not very familiar with debug_loc, other better suggestions are much appreciated here.

Diff Detail

Event Timeline

tmsriram created this revision.Sat, Aug 1, 9:42 PM
Herald added a project: Restricted Project. · View Herald TranscriptSat, Aug 1, 9:42 PM
tmsriram requested review of this revision.Sat, Aug 1, 9:42 PM

Trying to reproduce this (by taking the .ll test case and compiling it with clang -c) I don't get any debug_loc section or DW_AT_locations. Is there something I'm missing about how to reproduce this?

(trying to see what the non-bb-sections location description looks like - if it depends on basic block ordering, it's probably incorrect anyway, and might be worth fixing so it doesn't depend on that non-guaranteed property & then there'll be less divergence between bb-sections and not)

Trying to reproduce this (by taking the .ll test case and compiling it with clang -c) I don't get any debug_loc section or DW_AT_locations. Is there something I'm missing about how to reproduce this?

(trying to see what the non-bb-sections location description looks like - if it depends on basic block ordering, it's probably incorrect anyway, and might be worth fixing so it doesn't depend on that non-guaranteed property & then there'll be less divergence between bb-sections and not)

Yes, you are right. The debug_loc section is not generated without basic block sections, I will dig more.

Trying to reproduce this (by taking the .ll test case and compiling it with clang -c) I don't get any debug_loc section or DW_AT_locations. Is there something I'm missing about how to reproduce this?

(trying to see what the non-bb-sections location description looks like - if it depends on basic block ordering, it's probably incorrect anyway, and might be worth fixing so it doesn't depend on that non-guaranteed property & then there'll be less divergence between bb-sections and not)

I see why debug_loc is not produced in the default case. This is what happens:

+ For the default case (no bbsections), DebugLoc in https://github.com/llvm/llvm-project/blob/master/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp#L1744 , it has exactly one value that contains the whole function scope as the start label is lfunc_begin and the end label is the end of the function.
+ Such location lists are handled specially here when the scope is the whole function : https://github.com/llvm/llvm-project/blob/master/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp#L1841 and the debug_loc is not generated. Diffing the asm by forcing a debug_loc entry, what happens here is that a DW_AT_location is replaced with a DW_AT_const_value. I don't understand this too well but this seems like an optimization to me.

Does this make sense?

Trying to reproduce this (by taking the .ll test case and compiling it with clang -c) I don't get any debug_loc section or DW_AT_locations. Is there something I'm missing about how to reproduce this?

(trying to see what the non-bb-sections location description looks like - if it depends on basic block ordering, it's probably incorrect anyway, and might be worth fixing so it doesn't depend on that non-guaranteed property & then there'll be less divergence between bb-sections and not)

I see why debug_loc is not produced in the default case. This is what happens:

+ For the default case (no bbsections), DebugLoc in https://github.com/llvm/llvm-project/blob/master/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp#L1744 , it has exactly one value that contains the whole function scope as the start label is lfunc_begin and the end label is the end of the function.
+ Such location lists are handled specially here when the scope is the whole function : https://github.com/llvm/llvm-project/blob/master/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp#L1841 and the debug_loc is not generated.

OK ^ and with bb sections do we see the same behavior? If not, why not?

Diffing the asm by forcing a debug_loc entry,

Not sure what you mean by this ^ could you explain further?

what happens here is that a DW_AT_location is replaced with a DW_AT_const_value. I don't understand this too well but this seems like an optimization to me.

Does this make sense?

I'm still a bit confused. I think if non-bb-sections doesn't need a location list, then bb-sections shouldn't either. Perhaps that issue should be fixed first, if it's an issue?

In any case/any order - perhaps this patch is still reflective of a bug that exists even in this ^ issue is fixed, but maybe it's that the test case isn't sufficient to demonstrate the remaining bug - perhaps the test case could be complicated a bit further to ensure it's not testing a behavior that shouldn't be happening for other reasons anyway?

Trying to reproduce this (by taking the .ll test case and compiling it with clang -c) I don't get any debug_loc section or DW_AT_locations. Is there something I'm missing about how to reproduce this?

(trying to see what the non-bb-sections location description looks like - if it depends on basic block ordering, it's probably incorrect anyway, and might be worth fixing so it doesn't depend on that non-guaranteed property & then there'll be less divergence between bb-sections and not)

I see why debug_loc is not produced in the default case. This is what happens:

+ For the default case (no bbsections), DebugLoc in https://github.com/llvm/llvm-project/blob/master/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp#L1744 , it has exactly one value that contains the whole function scope as the start label is lfunc_begin and the end label is the end of the function.
+ Such location lists are handled specially here when the scope is the whole function : https://github.com/llvm/llvm-project/blob/master/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp#L1841 and the debug_loc is not generated.

OK ^ and with bb sections do we see the same behavior? If not, why not?

Please bear with my longish response here:

First, let me explain what happens with the test by default. There is no .debug_loc section by default and here is why:

Without bbsections, the ASM looks like this:

	.text
	.file	"locdspnm.cpp"
...
	.type	_ZL4ncatPcjz,@function
_ZL4ncatPcjz:                           # @_ZL4ncatPcjz
.Lfunc_begin0:
	.file	3 "/proc/self/cwd/third_party/icu/source/common/locdspnm.cpp"
	.loc	3 37 0                          # third_party/icu/source/common/locdspnm.cpp:37:0
	.cfi_startproc
# %bb.0:                                # %.critedge3
	subq	$56, %rsp
...
	je	.LBB0_2
# %bb.1:                                # %.critedge3
...
	movaps	%xmm7, 32(%rsp)
.LBB0_2:                                # %.critedge3
...
	leaq	-128(%rsp), %rax
	#DEBUG_VALUE: ncat:buflen <- 157
.Ltmp0:
	.loc	3 47 3 prologue_end             # third_party/icu/source/common/locdspnm.cpp:47:3
...
	movl	$16, (%rax)
.Ltmp1:
	.p2align	4, 0x90
.LBB0_3:                                # =>This Inner Loop Header: Depth=1
	#DEBUG_VALUE: ncat:buflen <- 157
	.loc	3 0 3 is_stmt 0                 # third_party/icu/source/common/locdspnm.cpp:0:3
	jmp	.LBB0_3
.Lfunc_end0:
	.size	_ZL4ncatPcjz, .Lfunc_end0-_ZL4ncatPcjz
	.cfi_endproc

The function buildLocationList : https://github.com/llvm/llvm-project/blob/master/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp#L1638 is called and it adds two entries to DebugLoc with StartLabel and Label pairs as follows:

  1. .Lfunc_begin0, .Ltmp1
  2. .Ltmp1, .Lfunc_end0

The size of DebugLoc is 2 but these two ranges are merged into one range with MergeRanges here: https://github.com/llvm/llvm-project/blob/master/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp#L1733

There is now one entry in DebugLoc with full function scope .Lfunc_begin0, .Lfunc_end0

whose StartLabel is the start of the function and the EndLabel is the function end.

There is exactly one entry in the Location List DebugLoc in function buildLocationList. Before returning, there is a check to see if this one entry is valid throughout the scope of the function: https://github.com/llvm/llvm-project/blob/master/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp#L1744 This returns true in this case.

This leads to an optimization to use DW_AT_const_value instead of DW_AT_location and the dwarf standard here seems to suggest this : http://dwarfstd.org/ShowIssue.php?issue=161109.2

This optimization is triggered right here: https://github.com/llvm/llvm-project/blob/master/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp#L1841

At this point isValidSingleLocation is the return value of buildLocationList and when true does not create a .debug_loc section but instead creates a DW_AT_const_value.

Now, I tried to force a .debug_loc for this by simply making the value of isValidSingleLocation false always and the .debug_loc list shows up as expected. Diffing the asm without and with .debug_loc, the asm differs exactly like this :

Without debug_locWith debug_loc
.section .debug_loc
...
......
.byte 28 # DW_AT_const_value.byte 2 # DW_AT_location
.byte 15 # DW_FORM_udata.byte 23 # DW_FORM_sec_offset

To summarize, it seems like DW_AT_const_value can be used instead of DW_AT_location under certain conditions and that's what seems to be happening here. Please correct me if this is a wrong assumption.

This is for the default case.

Now, let's look at what happens with bbsections, first without this patch. With bb sections, buildLocationList is called again. Here is the ASM with basic block sections:

	.text
	.file	"locdspnm.cpp"
	.type	_ZL4ncatPcjz,@function
_ZL4ncatPcjz:                           # @_ZL4ncatPcjz
.Lfunc_begin0:
	.loc	3 37 0                          # third_party/icu/source/common/locdspnm.cpp:37:0
	.cfi_startproc
# %bb.0:                                # %.critedge3
...
	je	_ZL4ncatPcjz.2
	jmp	_ZL4ncatPcjz.1
	.cfi_endproc
	.section	.text,"ax",@progbits,unique,1
_ZL4ncatPcjz.1:                         # %.critedge3
	.cfi_startproc
	.cfi_def_cfa %rsp, 64
...
	jmp	_ZL4ncatPcjz.2
.LBB_END0_1:
	.size	_ZL4ncatPcjz.1, .LBB_END0_1-_ZL4ncatPcjz.1
	.cfi_endproc
	.section	.text,"ax",@progbits,unique,2
_ZL4ncatPcjz.2:                         # %.critedge3
	.cfi_startproc
	.cfi_def_cfa %rsp, 64
...
	leaq	-128(%rsp), %rax
	#DEBUG_VALUE: ncat:buflen <- 157
.Ltmp0:
	.loc	3 47 3 prologue_end             # third_party/icu/source/common/locdspnm.cpp:47:3
...
	jmp	_ZL4ncatPcjz.3
.Ltmp1:
.LBB_END0_2:
	.size	_ZL4ncatPcjz.2, .LBB_END0_2-_ZL4ncatPcjz.2
	.cfi_endproc
	.p2align	4, 0x90
	.section	.text,"ax",@progbits,unique,3
_ZL4ncatPcjz.3:                         # =>This Inner Loop Header: Depth=1
	.cfi_startproc
	.cfi_def_cfa %rsp, 64
	#DEBUG_VALUE: ncat:buflen <- 157
	.loc	3 0 3 is_stmt 0                 # third_party/icu/source/common/locdspnm.cpp:0:3
	jmp	_ZL4ncatPcjz.3
.LBB_END0_3:
	.size	_ZL4ncatPcjz.3, .LBB_END0_3-_ZL4ncatPcjz.3
	.cfi_endproc
	.text
.Lfunc_end0:
	.size	_ZL4ncatPcjz, .Lfunc_end0-_ZL4ncatPcjz

Again, buildLocationList at https://github.com/llvm/llvm-project/blob/master/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp#L1638 adds two entries but their start and end label pairs are now:

  1. .Lfunc_begin0, .Ltmp1 // This is the same as the default case.
  2. _ZL4ncatPcjz.3, .LBB_END0_3 // Note that the start label is the start of the next section and the end label is the end of the next section

Now, merge ranges will not merge these two into a single range as .Ltmp1 is not equal to _ZL4ncatPcjz3.

So, buildLocationList returns false at : https://github.com/llvm/llvm-project/blob/master/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp#L1741

That is, the optimization to not make a .debug_loc is not applicable in this case as that only holds if there is a valid single location here : https://github.com/llvm/llvm-project/blob/master/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp#L1841

So, a loc list is constructed but without this patch this triggers a problem at : https://github.com/llvm/llvm-project/blob/master/llvm/lib/CodeGen/AsmPrinter/DebugLocStream.h#L99

This is because the startSym and the endSym have to be in the same section. .Ltmp1 and .Lfunc_begin0 are in different sections. The assembler will complain if we try to assemble this.

So, what triggered this problem? LabelsBeforeInsn[insn] cannot be loosely assigned to be the start of the function with basic block sections if the instruction and the start of the function are in different sections.

Now, this patch fixes it by making LabelsBeforeInsn to be the start of the section. So, the two entries in DebugLoc would become:

  1. _ZL4ncatPcjz.2, .Ltmp1
  2. _ZL4ncatPcjz.3, .Lfunc_end0

This is good, but we are missing the entries corresponding to (.Lfunc_begin0 ... _ZL4ncatPcjz.2) which denote the 2 basic blocks before. We could handle this if we extend LabelsBeforeInsn to have multiple values in the case of basic block sections.

The way this patch exists, it loses the debug_loc fidelity as it does not capture the other blocks.

Thoughts?

Apologies for the very long reply!

Diffing the asm by forcing a debug_loc entry,

Not sure what you mean by this ^ could you explain further?

what happens here is that a DW_AT_location is replaced with a DW_AT_const_value. I don't understand this too well but this seems like an optimization to me.

Does this make sense?

I'm still a bit confused. I think if non-bb-sections doesn't need a location list, then bb-sections shouldn't either. Perhaps that issue should be fixed first, if it's an issue?

In any case/any order - perhaps this patch is still reflective of a bug that exists even in this ^ issue is fixed, but maybe it's that the test case isn't sufficient to demonstrate the remaining bug - perhaps the test case could be complicated a bit further to ensure it's not testing a behavior that shouldn't be happening for other reasons anyway?