This is an archive of the discontinued LLVM Phabricator instance.

[DebugInfo] Drop location ranges for variables which exist entirely outside the variable's scope
ClosedPublic

Authored by Orlando on Jun 18 2020, 3:04 PM.

Details

Summary

This patch reduces file size by dropping variable locations a debugger user
will not see.

Background

PR45889 [1] describes a bug where we get a variable location for an instruction
range that exists outside of the instruction range of the scope that it is
declared in.

We discovered the root cause of that particular problem lies in the handling of
parameters and argument values through SelectionDAG. Locations like these are
superfluous. Test llvm/test/DebugInfo/COFF/register-variables.ll shows the same
problem occuring as a result of a seemingly incorrect scope range (more on this
later). In cases like this the variable locations appear to be sane and it is
that the scope ranges are wrong or misleading.

A debugger user will not get to see these out of scope locations. Not only do
they increase the size of the binary in and of themselves, but having more than
one location will prevent us from emitting locations that cover their entire
scope as single locations. Single locations are desirable because they take up
less space than location lists with single entries.

Solution

This patch drops variable locations which exist entirely outside of the
variable's scope. The way it is implemented is simple: after building the debug
entity history map we loop through it. For each variable we look at each entry.
If the entry opens a location range which does not intersect any of the
variable's scope's ranges then we mark it for removal. After visiting the
entries for each variable we also mark any clobbering entries which will no
longer be referenced for removal, and then finally erase the marked entries.
This all requires the ability to query the order of instructions, so before
this runs we number them.

Results

Building CTMark with CMAKE_BUILD_TYPE=RelWithDebInfo without (base) and with
(patched) this patch yields a geomean binary size difference of -1.9%, with the
.debug_loc sections up to nearly 14% smaller in some cases. For a clang-3.4
build I see similar percentage savings to sqlite3 in the suite below.

Program                                        base      patched   diff 
 test-suite :: CTMark/SPASS/SPASS.test          3971320   3814544  -3.9%
 test-suite...ark/tramp3d-v4/tramp3d-v4.test    14357584  13812000 -3.8%
 test-suite...TMark/7zip/7zip-benchmark.test    9190760   8885512  -3.3%
 test-suite :: CTMark/Bullet/bullet.test        7403096   7182752  -3.0%
 test-suite...:: CTMark/sqlite3/sqlite3.test    4077552   4003072  -1.8%
 test-suite :: CTMark/kimwitu++/kc.test         5117728   5038336  -1.6%
 test-suite...:: CTMark/ClamAV/clamscan.test    2545312   2526184  -0.8%
 test-suite :: CTMark/lencod/lencod.test        2643960   2631568  -0.5%
 test-suite...Mark/mafft/pairlocalalign.test    1530904   1526760  -0.3%
 test-suite...-typeset/consumer-typeset.test    1527120   1524040  -0.2%
 Geomean difference                                                -1.9%

My machine is not set up for precise performance measurements, but I observed
no significant change in compile time (0.0x% change in either direction).

The function validThroughout in DwarfDebug.cpp returns true when a location can
be considered a single location for a variable. Earlier I mentioned that single
locations are desirable. Here are some numbers from a clang-3.4 build that show
the patch improving our abillity to detect them:

                                    base    patched
times validThroughout is called     3676550 3638269
times validThroughout returns true  1470517 1541733
percentage of calls returning true  40.0%   42.4%

For these benchmarks 'base' is clang at 772349de887, and 'patched' is that
commit with the patch applied on top.

Tests

Added llvm/test/DebugInfo/X86/trim-var-locs.mir

Modified llvm/test/DebugInfo/X86/live-debug-variables.ll
Modified llvm/test/DebugInfo/ARM/PR26163.ll
In both tests an out of scope location is now removed. The remaining location
covers the entire scope of the variable allowing us to emit it as a single
location.

Modified llvm/test/DebugInfo/COFF/register-variables.ll
Branch folding merges the tails of if.then and if.else into if.else. Each
blocks' debug-locations point to different scopes so when they're merged we
can't use either. Because of this the variable 'c' ends up with a location
range which doesn't cover any instructions in its scope; with the patch applied
the location range is dropped and its flag changes to IsOptimizedOut.

Related future work?

The simple instruction numbering added in this patch can be used to improve how
we detect single locations in validThroughout (DwarfDebug.cpp). With a small
change on top of this patch we can reduce binary size even further, and
potentially by a similar magnitude. In addition it allows us to replaces a
linear scan in validThroughout with a map lookup.

Summary

This patch reduces the binary size of RelWithDebInfo builds by an average of
1.9%, and by nearly 4% in one case, across 11 applications by dropping variable
locations which a debugger user will never see. Some of these locations exist
in the first place as a result of bugs in clang so there's an argument that
this patch should not land, and instead we should make a verifer pass (either
in clang and llc, or in llvm-dwarfdump), and focus on fixing the issues they
reveal. OTOH this patch has immediate and tangible wins, and doesn't preclude
work on fixing those fundamental issues.

What do you think?

Thank you for taking the time to read this,
Orlando

[1] https://bugs.llvm.org/show_bug.cgi?id=45889

Diff Detail

Event Timeline

Orlando created this revision.Jun 18 2020, 3:04 PM

Thanks for working this up/sending it out.

The textual description doesn't /sound/ like it handles partial overlap ("If the entry opens a location range which does not intersect any of the variable's scope's ranges then we mark it for removal.") how are they handled? Are locations trimmed down to match the scope range too?

I'm guessing they're actually covered and tested - overlap that extends beyond the end or start of the scope, etc.

Personally: I'm marginally in favor. Though I wouldn't mind seeing more data about cases where this turns up (which should be easier to find with this prototype) to see if they're readily fixable bugs.

Oh, also: did you try running llvm-dwarfdump statistics before/after? I believe it tracks number of bytes of variable location relative to the enclosing scope. Assuming it doesn't count bytes of variable location extending beyond the enclosing scope (if it does, that bug should be fixed - maybe flagging those "extra bytes" in a separate penalty bucket - and could have a separate penalty bucket for cases where a single location could've been used but a location list was used instead (either because of these extended scopes - or otherwise)) - the number of covered bytes should remain the same before/after this patch? Does it?

@Orlando Thanks for this!
As @dblaikie pointed out, I think this should not affect locations coverage, so it'd be nice if you check/share the results of either llvm-dwarfdump --statistics or llvm-locstats.

@djtodoro, @dblaikie, thank you both for your comments.

Thanks for working this up/sending it out.

The textual description doesn't /sound/ like it handles partial overlap ("If the entry opens a location range which does not intersect any of the variable's scope's ranges then we mark it for removal.") how are they handled?

Partial overlaps are ignored for now. i.e. if the location range partially overlaps a scope range then we do not drop it.

Are locations trimmed down to match the scope range too?

Not in this patch - I have a TODO comment in there for exactly that though. The change required would be somewhat invasive so I thought it best to leave it for now.

I'm guessing they're actually covered and tested - overlap that extends beyond the end or start of the scope, etc.

They've been considered and handled in the code, but my test has room for improvement here.

Personally: I'm marginally in favor. Though I wouldn't mind seeing more data about cases where this turns up (which should be easier to find with this prototype) to see if they're readily fixable bugs.

I don't think these numbers are what you had in mind, but I had them to hand, and may be interesting?

clang-3.4 RelWithDebInfo -trim-var-locs=true build
variables analysed                3693356
variables with dropped locations  512874   (13.89% of variables)  
locations analysed                8798121
locations dropped                 665043   (7.56% of locations)

Oh, also: did you try running llvm-dwarfdump statistics before/after? I believe it tracks number of bytes of variable location relative to the enclosing scope. Assuming it doesn't count bytes of variable location extending beyond the enclosing scope (if it does, that bug should be fixed - maybe flagging those "extra bytes" in a separate penalty bucket - and could have a separate penalty bucket for cases where a single location could've been used but a location list was used instead (either because of these extended scopes - or otherwise)) - the number of covered bytes should remain the same before/after this patch? Does it?

Good idea. I tried this just now and was alarmed to see a difference in scope bytes covered! But then I had a look at the statistics code and IIUC it looks like no distinction is made between which scope the bytes cover:

in llvm/tools/llvm-dwarfdump/Statistics.cpp

 ...
 @ https://github.com/llvm/llvm-project/blob/master/llvm/tools/llvm-dwarfdump/Statistics.cpp#L280
    for (auto Entry : *Loc) {
      uint64_t BytesEntryCovered = Entry.Range->HighPC - Entry.Range->LowPC;
      BytesCovered += BytesEntryCovered;
      if (IsEntryValue(Entry.Expr))
        BytesEntryValuesCovered += BytesEntryCovered;
    }
...
@ https://github.com/llvm/llvm-project/blob/master/llvm/tools/llvm-dwarfdump/Statistics.cpp#L317
// Turns out we have a lot of ranges that extend past the lexical scope.
GlobalStats.ScopeBytesCovered += std::min(BytesInScope, BytesCovered);

I'm a bit surprised how much code is necessary for this, but if there are clear size benefits of doing this, we should probably do it. Thanks for the detailed writeup!

llvm/lib/CodeGen/AsmPrinter/DbgEntityHistoryCalculator.cpp
100

Nit: It's nice to use Doxygen-style /// comments here, since IDEs can format it specially and integrate it in help browsers etc.

130

Is this an inclusive interval on both sides or should this be [StartMI, EndMI)?

202

this looks like it could be a range-based for loop?

270

range-based for?

Orlando marked 2 inline comments as done.Jun 19 2020, 3:06 PM

Hi @aprantl, thanks for taking a look.

llvm/lib/CodeGen/AsmPrinter/DbgEntityHistoryCalculator.cpp
130

This is meant to be inclusive on both sides, yeah. Looks like the second if inside the function body should actually be:

if (EndMI && !isBefore(RangesI->second, EndMI, Ordering))
   return RangesI;

I'll fix this mistake. Though luckily this won't have made a noticeable difference since AFAICT we're only incorrectly dropping 0 length location ranges because of it, and these wouldn't be emitted anyway.

202

I agree that this loop is a little bit ugly; I'm doing it this way to update StartIndex.

aprantl added inline comments.Jun 19 2020, 3:36 PM
llvm/lib/CodeGen/AsmPrinter/DbgEntityHistoryCalculator.cpp
202

I see. I guess that is better than

for (auto EI : HistoryMapEntries) {
  LLVM_DEFER { StartIndex++; };
  ...
Orlando updated this revision to Diff 272375.Jun 22 2020, 4:11 AM
Orlando marked 6 inline comments as done.

Addressed comments from @aprantl

To continue to scope-bytes-covered conversation a little further: Even if we did report the coverage stat correctly (i.e. only parent scope bytes covered count) we'd still see slightly different results. There's at least one place I can think of where some special case code is triggered because of the fewer number of location ranges.

This patch depends on the ranges for all scopes to be (reasonably) correct, but I think there's one modified test where that's not the case; a variable location is dropped because (AFAICT looking at the equivalent DWARF output) its containing scope isn't pointing to the right instructions.
If we take the stance that these cases are bugs, then this analysis is better off done in a verifier, so we can find and fix those cases, rather than papering over compiler bugs.

I might be wrong about that test, and maybe we aren't going to take that stance, but I thought it was worth bringing up here.

llvm/test/DebugInfo/COFF/register-variables.ll
111

I believe this one is being marked as OptimizedOut because the containing scope is pointing to the wrong instructions. The variable's instruction range looks not unreasonable.

jmorse added a subscriber: jmorse.Jun 23 2020, 6:28 AM

FWIW: I'd enjoy this patch landing. AFAIUI there's no meaningful information communicated to the DWARF consumers by out-of-scope ranges (even if it indicates a compiler bug somewhere), and we may as well save them disk space and debugger-load-time.

If we're looking to make no-out-of-scope-ranges a verification property, it's probably best to use this code to produce warnings, then errors, then push further up the compilation pipeline. I'd much prefer connecting source/variable locations and designing this problem out, though.

llvm/test/DebugInfo/X86/trim-var-locs.mir
117

nit: scooooope

Orlando marked an inline comment as done.Jun 24 2020, 2:14 AM

This patch depends on the ranges for all scopes to be (reasonably) correct,

I'd say instead that 'variable locations depend on the ranges for all scopes to be (reasonably) correct'. And that this patch just acknowledges that relationship and clears away what we cannot use/see in a debugger.

but I think there's one modified test where that's not the case; a variable location is dropped because (AFAICT looking at the equivalent DWARF output) its containing scope isn't pointing to the right instructions.
If we take the stance that these cases are bugs, then this analysis is better off done in a verifier, so we can find and fix those cases, rather than papering over compiler bugs.

I might be wrong about that test, and maybe we aren't going to take that stance, but I thought it was worth bringing up here.

llvm/lib/CodeGen/AsmPrinter/DbgEntityHistoryCalculator.cpp
202

I wasn't aware of LLVM_DEFER (and I can't see it with grep?). I've left it as is for now but I'm happy to change it.

llvm/test/DebugInfo/X86/trim-var-locs.mir
117

I'll update this here if I need to make other changes, or when I land it otherwise, thanks!

This patch depends on the ranges for all scopes to be (reasonably) correct,

I'd say instead that 'variable locations depend on the ranges for all scopes to be (reasonably) correct'. And that this patch just acknowledges that relationship and clears away what we cannot use/see in a debugger.

Either way, the question remains: when we find cases where we need to "clear away" something, is that a bug, or is this merely a cleanup pass? In the case of the test I commented on, it's a bug, and I'd rather not be hiding bugs.

This patch depends on the ranges for all scopes to be (reasonably) correct,

I'd say instead that 'variable locations depend on the ranges for all scopes to be (reasonably) correct'. And that this patch just acknowledges that relationship and clears away what we cannot use/see in a debugger.

Either way, the question remains: when we find cases where we need to "clear away" something, is that a bug, or is this merely a cleanup pass? In the case of the test I commented on, it's a bug, and I'd rather not be hiding bugs.

+1.

This patch depends on the ranges for all scopes to be (reasonably) correct,

I'd say instead that 'variable locations depend on the ranges for all scopes to be (reasonably) correct'. And that this patch just acknowledges that relationship and clears away what we cannot use/see in a debugger.

Either way, the question remains: when we find cases where we need to "clear away" something, is that a bug, or is this merely a cleanup pass? In the case of the test I commented on, it's a bug, and I'd rather not be hiding bugs.

Looking closer at COFF/register-variables.ll, it doesn't look like a bug but instead another victim of how we model debug info. Before running -branch-folder (Control Flow Optimizer) all the instructions in the else block belong to the else block scope. The branch folder merges the common tails from 'then' and 'else' into 'else', merging the debug-locations while it does so. @jmorse summarised the situation well offline: Every time we call getMergedLocation, we are creating the conditions where this occurs, and eliminating it during compilation would be a high burden.

Looking closer at COFF/register-variables.ll, it doesn't look like a bug but instead another victim of how we model debug info. Before running -branch-folder (Control Flow Optimizer) all the instructions in the else block belong to the else block scope. The branch folder merges the common tails from 'then' and 'else' into 'else', merging the debug-locations while it does so. @jmorse summarised the situation well offline: Every time we call getMergedLocation, we are creating the conditions where this occurs, and eliminating it during compilation would be a high burden.

And yet, the variable was allocated to a register, and the variable's location information pointed to the correct instruction range.
Inadequacies in our ability to represent the scope properly shouldn't cause us to eliminate *correct* location information for variables.

Paul wrote:

And yet, the variable was allocated to a register, and the variable's location information pointed to the correct instruction range.
Inadequacies in our ability to represent the scope properly shouldn't cause us to eliminate *correct* location information for variables.

My understanding is that we haven't failed to represent the scope here, instead it's been destroyed by optimisation. Here's the MIR before -branch-folder, minus some long lines:

bb.1.if.then: 
 ; predecessors: %bb.0 
   liveins: $eax 
   DBG_VALUE $eax, $noreg, !"a", !DIExpression(), debug-location !27; t.cpp:11:9 line no:11 
   DBG_VALUE $eax, $noreg, !"a", !DIExpression(), debug-location !34; t.cpp:4:33 @[ t.cpp:12:13 ] line no:4 
   renamable $eax = nsw ADD32ri8 killed renamable $eax(tied-def 0), 1, implicit-def dead $eflags, debug-location !36; t.cpp:5:13 @[ t.cpp:12:13 ] 
   DBG_VALUE $eax, $noreg, !"b", !DIExpression(), debug-location !37; t.cpp:5:7 @ [ t.cpp:12:13 ] line no:5 
   ADD32mi8 $rip, 1, $noreg, @x, $noreg, 1, implicit-def dead $eflags, debug-location !38 :: (deref stuff), (more deref stuff); t.cpp:6:3 @[ t.cpp:12:13 ] 
   DBG_VALUE $eax, $noreg, !"b", !DIExpression(), debug-location !43; t.cpp:12:9 line no:12 
   $ecx = COPY killed renamable $eax, debug-location !44; t.cpp:13:5 
   SEH_Epilogue debug-location !44; t.cpp:13:5 
   $rsp = frame-destroy ADD64ri8 $rsp(tied-def 0), 32, implicit-def dead $eflags, debug-location !44; t.cpp:13:5 
   $rsi = frame-destroy POP64r implicit-def $rsp, implicit $rsp, debug-location !44; t.cpp:13:5 
   TCRETURNdi64 @putint, 0, <regmask blah blah blah>, implicit $rsp, implicit $ssp, implicit $ecx, debug-location !44; t.cpp:13:5 
 
 bb.2.if.else: 
 ; predecessors: %bb.0 
   liveins: $eax 
   DBG_VALUE $eax, $noreg, !"c", !DIExpression(), debug-location !46; t.cpp:15:9 line no:15 
   $ecx = COPY killed renamable $eax, debug-location !47; t.cpp:16:5 
   SEH_Epilogue debug-location !47; t.cpp:16:5 
   $rsp = frame-destroy ADD64ri8 $rsp(tied-def 0), 32, implicit-def dead $eflags, debug-location !47; t.cpp:16:5 
   $rsi = frame-destroy POP64r implicit-def $rsp, implicit $rsp, debug-location !47; t.cpp:16:5 
   TCRETURNdi64 @putint, 0, <regmask blah blah blah>, implicit $rsp, implicit $ssp, implicit $ecx, debug-location !47; t.cpp:16:5

And here it is after:

[More bits of bb.1]
  DBG_VALUE $eax, $noreg, !"b", !DIExpression(), debug-location !37; t.cpp:5:7 @ [ t.cpp:12:13 ] line no:5
  ADD32mi8 $rip, 1, $noreg, @x, $noreg, 1, implicit-def dead $eflags, debug-location !38 :: (deref stuff), (more deref stuff); t.cpp:6:3 @[ t.cpp:12:13 ]
  DBG_VALUE $eax, $noreg, !"b", !DIExpression(), debug-location !43; t.cpp:12:9 line no:12

bb.2.if.else:
; predecessors: %bb.0, %bb.1
  liveins: $eax
  DBG_VALUE $eax, $noreg, !"c", !DIExpression(), debug-location !46; t.cpp:15:9 line no:15
  $ecx = COPY killed renamable $eax, debug-location !DILocation(line: 0, scope: !19); t.cpp:0
  SEH_Epilogue debug-location !DILocation(line: 0, scope: !19); t.cpp:0
  $rsp = frame-destroy ADD64ri8 $rsp(tied-def 0), 32, implicit-def dead $eflags, debug-location !DILocation(line: 0, scope: !19); t.cpp:0
  $rsi = frame-destroy POP64r implicit-def $rsp, implicit $rsp, debug-location !DILocation(line: 0, scope: !19); t.cpp:0
  TCRETURNdi64 @putint, 0, <regmask blah blah blah>, implicit $rsp, implicit $ssp, implicit $ecx, debug-location !DILocation(line: 0, scope: !19); t.cpp:0

The tail of both blocks have been de-duplicated into the same block (bb.2), from the COPY to ecx, to the tail call. For every pair of duplicate instructions, there was no agreement on the source location, so they've all been given a zero line-number with the scope set to the parent scope of each pair, so for the original source:

void f(int p) {
   if (p) {
     int a = getint();
     int b = inlineinc(a);
     putint(b);
   } else {
     int c = getint();
     putint(c);
   }
}

Those instructions are considered in the scope of the "if" rather than for either of the blocks. The patch here then decides that, because the variable location is outside of a scope where it makes sense, it should be deleted. When you say,

Inadequacies in our ability to represent the scope properly shouldn't cause us to eliminate *correct* location information for variables.

Is there something that we can be doing better to represent the scopes in this scenario? I think it comes back to being able to describe instructions that correspond to multiple source locations, which as far as I'm aware is an open question right now.

Looking closer at COFF/register-variables.ll, it doesn't look like a bug but instead another victim of how we model debug info. Before running -branch-folder (Control Flow Optimizer) all the instructions in the else block belong to the else block scope. The branch folder merges the common tails from 'then' and 'else' into 'else', merging the debug-locations while it does so. @jmorse summarised the situation well offline: Every time we call getMergedLocation, we are creating the conditions where this occurs, and eliminating it during compilation would be a high burden.

Hmm - could you explain that in more detail? If we merge the locations both if and else scopes would cease to exist (since we can't represent that ambiguity), right? But the dbg.value doesn't use/care about its !dbg, so it continues existing/describing a variable location over some unrelated instructions?

Fair enough.

And yet, the variable was allocated to a register, and the variable's location information pointed to the correct instruction range.
Inadequacies in our ability to represent the scope properly shouldn't cause us to eliminate *correct* location information for variables.

This is I think a point of disagreement (between you and I) - I don't think it's useful to emit DWARF that describes variables outside their scope. I don't think any consumer should do anything useful with that data & it seems like wasted bytes to me. (name lookup wouldn't find the variable at any point where it has a location, etc)

If we've failed to track a variable's location within its scope, we shouldn't emit any location for it. I don't think that variable location information is correct if it's not in the scope of the variable - in this case, there is no scope of the variable (or its been reduced) - so no range of instructions over which to describe the location of the variable, since it's not in-scope.

If that's the case - that the merged instructions drop the scope and leave behind dbg.values that describe the variable even though it's not in scope - where the fix would be to remove the dbg.values (because now they describe the location of a variable outside that variable's scope) would be impractical/ie: it's cheaper to remove it later - then I'm OK with that.

Though I worry about that this would leave around a lot of dead dbg.value intrinsics, perhaps? That we'd be better off cleaning up earlier, not just for the sake of the resulting DWARF.

From my perspective it's a question of whether we should actively drop that erroneous debug info at the end now - knowing that (so far as I've seen in the discussion) all such instances of it /might/ be bugs (if someone can show one example where we don't consider it a bug in an LLVM optimization

Looking closer at COFF/register-variables.ll, it doesn't look like a bug but instead another victim of how we model debug info. Before running -branch-folder (Control Flow Optimizer) all the instructions in the else block belong to the else block scope. The branch folder merges the common tails from 'then' and 'else' into 'else', merging the debug-locations while it does so. @jmorse summarised the situation well offline: Every time we call getMergedLocation, we are creating the conditions where this occurs, and eliminating it during compilation would be a high burden.

Hmm - could you explain that in more detail? If we merge the locations both if and else scopes would cease to exist (since we can't represent that ambiguity), right? But the dbg.value doesn't use/care about its !dbg, so it continues existing/describing a variable location over some unrelated instructions?

That's right.

Fair enough.

And yet, the variable was allocated to a register, and the variable's location information pointed to the correct instruction range.
Inadequacies in our ability to represent the scope properly shouldn't cause us to eliminate *correct* location information for variables.

This is I think a point of disagreement (between you and I) - I don't think it's useful to emit DWARF that describes variables outside their scope. I don't think any consumer should do anything useful with that data & it seems like wasted bytes to me. (name lookup wouldn't find the variable at any point where it has a location, etc)

If we've failed to track a variable's location within its scope, we shouldn't emit any location for it. I don't think that variable location information is correct if it's not in the scope of the variable - in this case, there is no scope of the variable (or its been reduced) - so no range of instructions over which to describe the location of the variable, since it's not in-scope.

If that's the case - that the merged instructions drop the scope and leave behind dbg.values that describe the variable even though it's not in scope - where the fix would be to remove the dbg.values (because now they describe the location of a variable outside that variable's scope) would be impractical/ie: it's cheaper to remove it later - then I'm OK with that.

That summarises the situation as I see it, yeah.

Though I worry about that this would leave around a lot of dead dbg.value intrinsics, perhaps? That we'd be better off cleaning up earlier, not just for the sake of the resulting DWARF.

I don't think we could do it before isel. One reason being that in IR we only know where locations start and not exactly how far they extend which means we can't do very precise scope range overlap checks. I suppose it could be possible to do it earlier post-isel? Though trimming here at the end is very safe because we have the final program structure to work with; knowing nothing is going to move around afterwards is nice.

From my perspective it's a question of whether we should actively drop that erroneous debug info at the end now - knowing that (so far as I've seen in the discussion) all such instances of it /might/ be bugs (if someone can show one example where we don't consider it a bug in an LLVM optimization

I'm not sure I follow here, please could you rephrase this part?

Orlando edited the summary of this revision. (Show Details)Jul 6 2020, 1:50 AM

Does anyone have any concerns with this patch that they feel have not been addressed?

I've slightly reworded the patch description following the discussion on the nature of the changes to the register-variables.ll test.

I think I didn't fully grasp that the blocks were being (tail-)merged, which makes the scope ambiguous, and all the rest. So I withdraw the objection on that basis. DWARF is fine with multiple variables pointing to the same location, but it's less forgiving about scopes IIRC, much like it can't describe multiple source attributions for an instructions. This all makes me sad, but that's how DWARF is at the moment.

Is there still an open question about whether this wants to be a cleanup pass or a verifier check? I apologize for losing track.

I think I didn't fully grasp that the blocks were being (tail-)merged, which makes the scope ambiguous, and all the rest. So I withdraw the objection on that basis. DWARF is fine with multiple variables pointing to the same location, but it's less forgiving about scopes IIRC, much like it can't describe multiple source attributions for an instructions. This all makes me sad, but that's how DWARF is at the moment.

Is there still an open question about whether this wants to be a cleanup pass or a verifier check? I apologize for losing track.

My take on it is that it's probably not practical to do this as a cleanup - it'd mean any time we merge debug locations, etc, we'd have to go check for isolated variable locations that have become invalid.

(though, inversely: I worry that not cleaning up those variable locations might be a source of IR bloat and algorithmic scaling problems when the debug locations are scanned... )

I think I didn't fully grasp that the blocks were being (tail-)merged, which makes the scope ambiguous, and all the rest. So I withdraw the objection on that basis. DWARF is fine with multiple variables pointing to the same location, but it's less forgiving about scopes IIRC, much like it can't describe multiple source attributions for an instructions. This all makes me sad, but that's how DWARF is at the moment.

Is there still an open question about whether this wants to be a cleanup pass or a verifier check? I apologize for losing track.

I think we have ruled out a MIR/IR verifier pass, but flagging it in llvm-dwarfdump somehow would still be nice and I wrote a ticket for fixing up the --statistics (PR46575). Instead, I think the question is now whether this should happen earlier in some way to reduce the number of redundant intrinsics, as David says in his reply below.

My take on it is that it's probably not practical to do this as a cleanup - it'd mean any time we merge debug locations, etc, we'd have to go check for isolated variable locations that have become invalid.

(though, inversely: I worry that not cleaning up those variable locations might be a source of IR bloat and algorithmic scaling problems when the debug locations are scanned... )

I chose to do the trimming here because I can say with confidence that it won't cause any coverage or correctness regressions. I agree that it would be nice to remove redundant intrinsics, though I'm not exactly sure what that implementation would entail without further investigation. Is anyone able to offer any insight on this? If not, I suppose it might be reasonable to attempt to do this earlier (in IR) to see if there are any problems, and compare the results. Though I won't be able to get on this for a little while myself.

I think I didn't fully grasp that the blocks were being (tail-)merged, which makes the scope ambiguous, and all the rest. So I withdraw the objection on that basis. DWARF is fine with multiple variables pointing to the same location, but it's less forgiving about scopes IIRC, much like it can't describe multiple source attributions for an instructions. This all makes me sad, but that's how DWARF is at the moment.

Is there still an open question about whether this wants to be a cleanup pass or a verifier check? I apologize for losing track.

I think we have ruled out a MIR/IR verifier pass, but flagging it in llvm-dwarfdump somehow would still be nice and I wrote a ticket for fixing up the --statistics (PR46575). Instead, I think the question is now whether this should happen earlier in some way to reduce the number of redundant intrinsics, as David says in his reply below.

My take on it is that it's probably not practical to do this as a cleanup - it'd mean any time we merge debug locations, etc, we'd have to go check for isolated variable locations that have become invalid.

(though, inversely: I worry that not cleaning up those variable locations might be a source of IR bloat and algorithmic scaling problems when the debug locations are scanned... )

I chose to do the trimming here because I can say with confidence that it won't cause any coverage or correctness regressions. I agree that it would be nice to remove redundant intrinsics, though I'm not exactly sure what that implementation would entail without further investigation. Is anyone able to offer any insight on this? If not, I suppose it might be reasonable to attempt to do this earlier (in IR) to see if there are any problems, and compare the results. Though I won't be able to get on this for a little while myself.

I don't have any particular insight on that, no. If no one else is stepping up, this patch as-is (though I haven't reviewed the implementation in detail) seems like a reasonable improvement at least, and should be acceptable.

Thanks everyone for the review and discussion so far. The general idea of the patch has been okayed and it just needs a code review now. Could anyone please take a look?

Thanks,
Orlando

aprantl accepted this revision.Jul 21 2020, 2:01 PM

LGTM once Paul's comment is addressed.

This revision is now accepted and ready to land.Jul 21 2020, 2:01 PM
Orlando marked an inline comment as done.Jul 22 2020, 1:43 AM

LGTM once Paul's comment is addressed.

Thanks! If you're referring to Paul's inline comment in test llvm/test/DebugInfo/COFF/register-variables.ll we resolved that in the non-inline comments and I should've marked it as done, oops. I can't find another unaddressed comment so I'll land this soon.

This revision was automatically updated to reflect the committed changes.