This is an archive of the discontinued LLVM Phabricator instance.

[DebugInfo] More precise variable range for stack locations
ClosedPublic

Authored by NikolaPrica on May 6 2019, 8:14 AM.

Details

Summary

Variable's stack location can stretch longer than they should. If variable is placed at stack in some nested basic block its range can be calculated to be up to next occurrence of variable's DBG_VALUE or up to the end of function thus covering basic blocks that should not be covered by location range. This is because DbgEntityHistoryCalculator ends register locations at the end of basic block only if register has been changed through course of function which is not case for register used to reference stack objects.
This patch also tries to produce single value location if location list builder managed to merge all locations into one.

Diff Detail

Event Timeline

NikolaPrica created this revision.May 6 2019, 8:14 AM
ormris removed a subscriber: ormris.May 6 2019, 9:06 AM
dstenb added inline comments.May 6 2019, 11:56 PM
lib/CodeGen/AsmPrinter/DwarfDebug.cpp
1399

It seems that this can give incorrect single value locations for cases where we have produced location list entries with empty ranges.

For example:

frame-setup PUSH64r undef $rax, implicit-def $rsp, implicit $rsp
DBG_VALUE 123, $noreg, !12, !DIExpression(), debug-location !13 
DBG_VALUE 456, $noreg, !12, !DIExpression(), debug-location !13 
CALL64pcrel32 @foo, csr_64, implicit $rsp, implicit $ssp, implicit-def $rsp, implicit-def $ssp, debug-location !14 
$eax = XOR32rr undef $eax, undef $eax, implicit-def dead $eflags, debug-location !15                                                                   
$rcx = frame-destroy POP64r implicit-def $rsp, implicit $rsp, debug-location !15 
RETQ killed $eax, debug-location !15

Here, the first debug value will result in a location list entry with an empty range, since it is immediately followed by the second, and such entries are omitted from the location list, resulting in a single entry in Entries for the constant value 456. However, MInsn is the first debug value, so we will create a single location description with the wrong value:

0x00000043:     DW_TAG_variable
                  DW_AT_const_value	(123)

I'm not sure what the check should be to avoid these sort of cases. Perhaps check that all debug value entries in HistoryMapEntries are described by the same location, and have the same expression?

NikolaPrica marked an inline comment as done.May 7 2019, 12:59 AM
NikolaPrica added inline comments.
lib/CodeGen/AsmPrinter/DwarfDebug.cpp
1399

Such check could be return value of buildLocationList? Or we can just take the last DebugInstr from HistoryMapEntries which could be the last entry or the entry before the last one?

dstenb added inline comments.May 7 2019, 3:43 AM
lib/CodeGen/AsmPrinter/DwarfDebug.cpp
1399

Or we can just take the last DebugInstr from HistoryMapEntries which could be the last entry or the entry before the last one?

That is perhaps possible. I tried to think if we can land in a situation where the last DebugInstr in HistoryMapEntries produces a location list entry with an empty range. For example something like this:

DBG_VALUE 123, $noreg, !12, !DIExpression(), debug-location !13 
[...]
DBG_VALUE 123, $noreg, !12, !DIExpression(), debug-location !13
DBG_VALUE 456, $noreg, !12, !DIExpression(), debug-location !13 
[end of function]

However, AFAIK we can't insert debug values after terminators currently (the machine verifier will complain about that), and with a presence of a terminator in the last block then the range will not be empty, so perhaps this is guaranteed? I'm not completely sure.

NikolaPrica added inline comments.May 7 2019, 5:32 AM
lib/CodeGen/AsmPrinter/DwarfDebug.cpp
1399

I'm not sure either. But I suppose that such situation can appear for local variable that ends at some basic block? Maybe it is safer perform such verification in buildLocationList and return it as result?

(Adding Jeremy as reviewer as he has also been working in this area recently.)

lib/CodeGen/AsmPrinter/DbgEntityHistoryCalculator.cpp
359–360

Perhaps this must not be done in this patch, but I wonder if we should close all register-described values here?

For example:

volatile int g;

int baz(int p) {
  int local = 123;

  if (g > 1) {
    local = p;
    g += p;
  }

  return 3;
}

when compiling the above with clang -O1 -g --target=x86_64-unknown-linux-gnu we land in the following MIR after livedebugvalues:

bb.0.entry:
  successors: %bb.1(0x40000000), %bb.2(0x40000000)
  liveins: $edi

  DBG_VALUE $edi, $noreg, !16, !DIExpression(), debug-location !18
  DBG_VALUE $edi, $noreg, !16, !DIExpression(), debug-location !18
  DBG_VALUE 123, $noreg, !17, !DIExpression(), debug-location !19
  renamable $eax = MOV32rm $rip, 1, $noreg, @g, $noreg, debug-location !20 :: (volatile dereferenceable load 4 from @g, !tbaa !22)
  CMP32ri8 killed renamable $eax, 2, implicit-def $eflags, debug-location !26
  JCC_1 %bb.2, 12, implicit $eflags, debug-location !27

bb.1.if.then:
  successors: %bb.2(0x80000000)
  liveins: $edi

  DBG_VALUE 123, $noreg, !17, !DIExpression(), debug-location !19
  DBG_VALUE $edi, $noreg, !16, !DIExpression(), debug-location !18
  DBG_VALUE $edi, $noreg, !17, !DIExpression(), debug-location !19
  ADD32mr $rip, 1, $noreg, @g, $noreg, killed renamable $edi, implicit-def dead $eflags, debug-location !28 :: (volatile store 4 into @g, !tbaa !22), (volatile dereferenceable load 4 from @g, !tbaa !22)

bb.2.if.end:
  DBG_VALUE $edi, $noreg, !16, !DIExpression(), debug-location !18
  $eax = MOV32ri 3, debug-location !31
  RETQ $eax, debug-location !31

As seen, in the g > 1 branch the local variable is described by edi. As edi is not modified, it will not be contained in ChangingRegs.

As we don't close the range here, this means that we'll incorrectly describe local using edi for the rest of the function:

DW_AT_location	(0x00000000
   [0x0000000000000000,  0x000000000000000b): DW_OP_consts +123, DW_OP_stack_value
   [0x000000000000000b,  0x0000000000000017): DW_OP_reg5 RDI)
0000000000000000 baz:
       0: 8b 05 00 00 00 00            	movl	(%rip), %eax
       6: 83 f8 02                     	cmpl	$2, %eax
       9: 7c 06                        	jl	6 <baz+0x11>
       b: 01 3d 00 00 00 00            	addl	%edi, (%rip)
      11: b8 03 00 00 00               	movl	$3, %eax
      16: c3                           	retq
jmorse added inline comments.May 7 2019, 9:01 AM
lib/CodeGen/AsmPrinter/DbgEntityHistoryCalculator.cpp
359–360

Perhaps this must not be done in this patch, but I wonder if we should close all register-described values here?

I'd say definite yes -- when I dug into the history [0] for PR41600 (which this patch fixes I think) it sounded like ChangingRegs was a mild hack to work around variable locations being unreliable ~5 years ago. IMHO (YMMV) we might be able to delete it in another patch as a now redundant feature. (Plus as you demonstrate, it can be broken).

[0] https://reviews.llvm.org/D3933

lib/CodeGen/AsmPrinter/DwarfDebug.cpp
1399

Great question @ empty ranges; would those be filtered out by this clause [0]?

I believe there are certain circumstances where LLVM won't have a (Machine IR) terminator after a function call declared attribute((noreturn)), often at the end of a function. A quick experiment shows that during IR optimisation another part of the compiler places an 'unreachable' inst after the noreturn call and drops any dbg.values, so I doubt this scenario is accessible from source languages.

[0] https://github.com/llvm/llvm-project/blob/da82ce99b7460164ffb841619aadb44f397d2106/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp#L1189

test/DebugInfo/MIR/X86/dbg-stack-value-range.mir
4–9

IMHO the second sentence should describe the desired behaviour ("DbgEntityHistoryCalculator should close the variable's range at the end of the basic block..."), then afterwards describe the error that this test is for. It's not obvious right now that the second sentence is about the error case, until you read the third sentence.

73

I understand attributes are a pain to maintain in tests; can they be dropped?

106

Nit - most of the default MIR data can be omitted, pretty much every attribute that's false. (--simplify-mir to llc will drop it automatically).

NikolaPrica marked 5 inline comments as done.May 8 2019, 4:05 AM
NikolaPrica added inline comments.
lib/CodeGen/AsmPrinter/DbgEntityHistoryCalculator.cpp
359–360

This is similar problem that I am targeting. It would be solved by closing register-described values at the end of block. The only difference is that local is dead after the branch, but the idea is the same: Having BB1 that branches to BB2 and BB3, different DBG_VALUE from BB2 takes range across BB3 due to lack of variable's DBG_VALUE in it (variable might be dead or it just does lack location at that block).

I stumbled on a case when frame register is not in ChangingRegs but as you shown it can happen for other registers. I needed to modify DwarfDebug part in order to preserve single value variable locations because of partitioning of variable's range. Current condition for recognizing single value locations would not be recognized as single value location).

lib/CodeGen/AsmPrinter/DwarfDebug.cpp
1399

Great question @ empty ranges; would those be filtered out by this clause [0]?

They would, but the result location will end up as a DebugLocEntry::Value. The catch is that initialization of DbgVariable takes DBG_VALUE instruction. Maybe we should refactor DbgVariable to have DebugLocEntry::Value instead of DBG_VALUE instruction?

test/DebugInfo/MIR/X86/dbg-stack-value-range.mir
4–9

Sure. I will rewrite it!

73

I usually drop them from .ll but now I needed "no-frame-pointer" for test case so I missed to remove them from .mir. I will drop it.

I've posted new approach for fixing this problem at https://reviews.llvm.org/D61890. If that approach turns to be better I will abandon this change.

Just to note: the patch this is based on (r359426) was reverted in r360301 due to compile time issues. I've got a patch incoming (tomorrow) that relies on this patch to fix it: they'll need to be un-reverted / committed in a certain order and possibly all together. (We can work through that when this is approved though).

lib/CodeGen/AsmPrinter/DbgEntityHistoryCalculator.cpp
359–360

NB: I think the ChangingRegs situation can be addressed separately. and this patch is fine to go ahead with the other tweaks.

aprantl added inline comments.May 15 2019, 8:54 AM
lib/CodeGen/AsmPrinter/DwarfDebug.cpp
1395

Tiny nit: Can we use the actual class name instead of paraphrasing it as "location list builder"?

-Take last HistoryMapEntries as a representative single value location.
-Update test explanation.
-Strip uneccessary parts of test.

jmorse accepted this revision.May 17 2019, 9:09 AM

LGTM (I think I can approve?). The source IR needing no-frame-pointer-elim seems like enough reason to keep that attribute around.

As mentioned above, r360301 needs to be reverted before this is committed, and then D61940 is needed to fix the compile time regression that will cause. Best to wait for D61940 to be approved before committing, so that we can do that all at once.

This revision is now accepted and ready to land.May 17 2019, 9:09 AM

Nikola mentioned in D61890 that the changes to fission-ranges.ll result in reduced variable location ranges, which is unfortunate. IMO, eyeballing the test, I think the input IR we have at the moment is wrong, and it was only the bug that this patch fixes that caused us to get the output right.

In fission-ranges.ll the dbg.value for !18 / the "e" variable, has a !DILocation !39, which corresonds to the lexical block where 'b' is iterated over. This causes LiveDebugValues to not propagate the variable location out of the block modifying "e", because it believes "e" isn't valid outside of that lexical block. In the source code at the top of the file though we can see that "e" has a function-wide scope. Thus the IR in the test is wrong, the dbg.value should have a different DILocation! We only covered the whole function before because DbgEntityHistoryCalculator was extending variable locations too far.

We should check whether clang still generates this faulty IR, but I've run out of time for today.

For the sample code in fisson-range.ll we now generate slightly different IR (few more llvm.dbg.values). With such IR, LLVM generates one block larger range but nevertheless the problem there is different nature and it could be used as a test case for further improvement of LiveDebugValues (value from stack gets loaded into a callee clobberabble register and at the entry of a block we have two variable locations and we choose the one which is inserted later - see LiveDebugvalues::transferDebugValue). I will take a look into that.

I suppose that this is ready to land?

Actually I've just produce fix for fisson-ranges.ll that will have better ranges. I'll post it as child patch of this one.

I suppose that this is ready to land?

Indeed, although as mentioned that requires one patch to be unreverted, and D61940 to be ready to fix the performance regression in it :/. There shouldn't be a serious delay, hopefully.

-Rebase
-Change referenced scope to function scope for variable "b" in fisson-ranges.ll

Adding a comment about another hand-written example.

lib/CodeGen/AsmPrinter/DwarfDebug.cpp
1408–1409

If we have input like this:

!13 = !DILocalVariable(name: "local", scope: !7, file: !1, line: 5, type: !14)
!14 = !DIBasicType(name: "unsigned int", size: 32, encoding: DW_ATE_unsigned)
[...]
  bb.0.entry:
    DBG_VALUE 123, $noreg, !13, !DIExpression(), debug-location !16
    DBG_VALUE 456, $noreg, !13, !DIExpression(DW_OP_LLVM_fragment, 0, 16), debug-location !18
    DBG_VALUE 789, $noreg, !13, !DIExpression(DW_OP_LLVM_fragment, 16, 16), debug-location !18
    CALL64pcrel32 @bar, csr_64, implicit $rsp, implicit $ssp, implicit-def $rsp, implicit-def $ssp, debug-location !17
    RETQ debug-location !18

buildLocationList() will produce a single entry (a composite location description containing the 456 and 789 fragment). Then MInsn will be:

DBG_VALUE 123, $noreg, !13, !DIExpression(), debug-location !16

and DbgMI:

DBG_VALUE 789, $noreg, !13, !DIExpression(DW_OP_LLVM_fragment, 16, 16), debug-location !18

and End null. As MInsn covers the whole scope, we will produce a single location description, but with DbgMI, which is only one of the two fragmented debug values which the variable is described by:

DW_AT_location	(DW_OP_piece 0x2, DW_OP_constu 0x315, DW_OP_stack_value, DW_OP_piece 0x2)

That input is hand-modified, and perhaps we can never get such debug information here in DwarfDebug when running through the normal pass chain. Still, it feels a bit iffy that a size optimization may result in us dropping information. My feeling is that this framework code should be able to handle that type of silly input.

Maybe we should be conservative and only try to produce single location descriptions in the most obvious cases. The most conservative approach I guess would be to only produce single location descriptions when all debug values are the same. Another is perhaps only doing this if all debug values are non-fragmented?

  • Represent DbgVariable's single value representation by DebugLocEntry::Value insted of representing it with MachineInstr. Such representation will allow us to produce fragment single value representation (for now it is not supported).
  • Update DwarfDebug::buildLocationList to check if produced list has single entry that is valid throughout whole variable's scope.

If this seems as a proper way, I suppose that we can make DebugLoc::Value now as a separate class ValueLoc or ValueLocation? Such refactor could go as a separate NFC patch if it is fine?

aprantl added inline comments.May 24 2019, 8:41 AM
lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
610

typo: Check and has a single

lib/CodeGen/AsmPrinter/DwarfDebug.cpp
247

///

270

per coding style this comment should be on the decl inside the class not on the implementation

1177

If the resulting list

  • Update comments.

Thanks a lot! This "location list -> single location description" detection seems more robust, and should help us moving towards supporting rewrites of single-entry location lists containing fragments also.

Nit: Perhaps we should move the Value class out of DebugLocEntry now that we use it for DbgVariable also? However, I don't have any good ideas on in where to place it (e.g. which header file). I think we can land this without doing such a refactoring though.

lib/CodeGen/AsmPrinter/DwarfDebug.h
114

Nit: Perhaps change single instruction -> single location?

Nit: Perhaps we should move the Value class out of DebugLocEntry now that we use it for DbgVariable also? However, I don't have any good ideas on in where to place it (e.g. which header file). I think we can land this without doing such a refactoring though.

Sorry, I saw now that you had already added a comment about that.

  • Updated comment (single instruction -> single location)

@dstenb Nevermind. Thanks for the review anyway! Do you think that this is ready for landing now?

dstenb accepted this revision.May 27 2019, 7:08 AM

Do you think that this is ready for landing now?

LGTM, but I think it would be good to see if @aprantl has any more comments on this.

aprantl accepted this revision.May 28 2019, 9:39 AM

LGTM. Have you tested the performance when compiling Clang with ASAN enabled to make sure we're not accidentally regressing because of the more thorough single-location check?

lib/CodeGen/AsmPrinter/DbgEntityHistoryCalculator.cpp
364

personally I'd write this as CurElem->first == FrameReg not a big deal, though

lib/CodeGen/AsmPrinter/DwarfDebug.h
119

NFC commit for later, make this Optional<unsigned>

Have you tested the performance when compiling Clang with ASAN enabled to make sure we're not accidentally regressing because of the more thorough single-location check?

I've built LLVM with '-DLLVM_USE_SANITIZER=Address' using downloaded clang binary to produce private clang binary. I've run tests and compiled latest version of gdb with new private binary and I didn't see any regression there. Is that what you meant?

dstenb added a comment.Jun 3 2019, 2:58 AM

Have you tested the performance when compiling Clang with ASAN enabled to make sure we're not accidentally regressing because of the more thorough single-location check?

I've built LLVM with '-DLLVM_USE_SANITIZER=Address' using downloaded clang binary to produce private clang binary. I've run tests and compiled latest version of gdb with new private binary and I didn't see any regression there. Is that what you meant?

I think that the idea was to see if this patch increases the compilation time when building such a Clang binary.

(Just make sure to use the RelWithDebInfo build type also.)

Have you tested the performance when compiling Clang with ASAN enabled to make sure we're not accidentally regressing because of the more thorough single-location check?

I've built LLVM with '-DLLVM_USE_SANITIZER=Address' using downloaded clang binary to produce private clang binary. I've run tests and compiled latest version of gdb with new private binary and I didn't see any regression there. Is that what you meant?

I think that the idea was to see if this patch increases the compilation time when building such a Clang binary.

(Just make sure to use the RelWithDebInfo build type also.)

Yes. Enabling ASAN on a large C++ code base such as Clang is a great way to see compiler performance problems.
Can you post the output of "ninja clean && time ninja clang" before and after applying your patch to the *host* compiler?

Have you tested the performance when compiling Clang with ASAN enabled to make sure we're not accidentally regressing because of the more thorough single-location check?

I've built LLVM with '-DLLVM_USE_SANITIZER=Address' using downloaded clang binary to produce private clang binary. I've run tests and compiled latest version of gdb with new private binary and I didn't see any regression there. Is that what you meant?

I think that the idea was to see if this patch increases the compilation time when building such a Clang binary.

(Just make sure to use the RelWithDebInfo build type also.)

Yes. Enabling ASAN on a large C++ code base such as Clang is a great way to see compiler performance problems.
Can you post the output of "ninja clean && time ninja clang" before and after applying your patch to the *host* compiler?

Thanks for clarification!

before

real 46m29,670s
user 325m27,488s
sys 6m17,255s

after

real 46m41,562s
user 326m16,497s
sys 6m18,981s

Considering user + sys there is 0.25% increment with patch.

Thanks, that looks like an acceptable performance/accuracy tradeoff.

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJun 10 2019, 1:37 AM

Thanks for the review and patience!