This is an archive of the discontinued LLVM Phabricator instance.

[DebugInfo][InstrRef] Add a max-stack-slots-to-track limitation / cut-out
ClosedPublic

Authored by jmorse on Jan 31 2022, 5:31 AM.

Details

Summary

Over in D116821, there's a reproducer that stimulates an interesting scenario in InstrRefBasedLDV: with thousands of local variables and asan enabled, the max working set (i.e. maximum number of values live at the same time) is very large, and get forced onto the stack. InstrRefBasedLDV tries to track all those variable locations, allocates some massive tables of values, and can then run out of memory (think ~6k values across 15k blocks -> massive allocations). This is undesirable.

I believe this kind of input is rare, because it's very unlikely that a developer is able to reason about thousands of live, named variables, concurrently while writing their software. Much more likely is that the input is autogenerated, or affected by some kind of instrumentation. In that scenario we should fail gracefully, by ceasing to track stack slots past a certain number: which is what this patch does. That puts an upper bound on the maximum working set that we try to track, and avoids un-necessarily crashing. The downside is that we might un-necessarily drop some variable locations, but the developer is probably happy with something rather than nothing.

This can certainly be improved so that the quality of data tracked degrades gracefully, however this is a starting point that avoids excessive consumption.

In terms of the patch, this adds a cl::opt for max number of stack slots to track, and has the stack-slot-numbering code optionally return None. That then filters through a number of code paths, which can then chose to not track a spill / restore if it touches an untracked spill slot. The added test checks that we drop variable locations that are on the stack, if we set the limit to zero.

Diff Detail

Event Timeline

jmorse created this revision.Jan 31 2022, 5:31 AM
jmorse requested review of this revision.Jan 31 2022, 5:31 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 31 2022, 5:31 AM
jmorse updated this revision to Diff 404532.Jan 31 2022, 7:49 AM

Fix an off-by-one (The limit should be hit when the number of seen locations is >= the limit, otherwise setting a limit of 0 will never work). Add the inverse test to the MIR file added, to check that stack locations would actually be found if the limit wasn't on.

Orlando added inline comments.Jan 31 2022, 10:03 AM
llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.cpp
159

nit: developers -> developer's

163

I don't know what constitutes a reasonable cut-off for this. What made you choose 250?

llvm/test/DebugInfo/MIR/InstrRef/spill-slot-limits.mir
27–28

I'm confused - why don't we see this initial non-spill location for the -livedebugvalues-max-stack-slots=0 case (in regular CHECK directives above)?

jmorse added inline comments.Jan 31 2022, 10:10 AM
llvm/test/DebugInfo/MIR/InstrRef/spill-slot-limits.mir
27–28

The test spills to the stack and restores to $esi -- if we don't track it through the stack, there'll be no location. I suppose a clearer test would test each stage.

jmorse added inline comments.Jan 31 2022, 10:12 AM
llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.cpp
163

It's an educated guess -- I've been building some large C++ codebases with different limits (and with a compiler that aborts when it hits the limit) to try and find out what's reasonable. So far I've reached a limit of 30 without any "normal" builds failing. Still waiting for LTO builds to complete, which is the much more interesting case.

Orlando accepted this revision.Jan 31 2022, 10:27 AM

LGTM

llvm/test/DebugInfo/MIR/InstrRef/spill-slot-limits.mir
27–28

Oops what I really meant was "why don't we see a $eax location" and, now I look again, I can answer the myself: $eax is clobbered before we get to the debug instruction (and, as per the point of this patch, we don't follow the value into the stack). Sorry for the noise!

This revision is now accepted and ready to land.Jan 31 2022, 10:27 AM
This revision was landed with ongoing or failed builds.Feb 1 2022, 11:25 AM
This revision was automatically updated to reflect the committed changes.

It is not just the PPC bots. This is breaking multiple bots (presumably due to missing triple). A non-PPC example: https://lab.llvm.org/buildbot/#/builders/179/builds/2845

FWIW, @kda beat me to it with his revert, but I was just about to commit a fix that simply adds -march=x86-64 to the llc invocations in the test case (just as the original test case that the new one is based on).

Thanks for reverting it to bring the bots back to green Kevin.

jmorse added a comment.Feb 2 2022, 2:39 AM

/me blinks

Yup, missing triple there, sorry for the noise :( and thanks for the recovery.