This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
llvm/lib/CodeGen/
-
lib/
-
CodeGen/
-
CommandFlags.cpp
-
LiveDebugValues/
-
LiveDebugValues.h
-
LiveDebugValues.cpp
-
MachineFunction.cpp

Differential D116821

[DebugInfo][InstrRef] Move instr-ref controlling flag out of TargetOptions
ClosedPublic

Authored by jmorse on Jan 7 2022, 10:11 AM.

Download Raw Diff

Details

Reviewers

Orlando
StephenTozer
djtodoro
TWeaver

Commits

rG6a605b97a200: [DebugInfo] Move flag for instr-ref to LLVM option, from TargetOptions

Summary

Hi,

This patch moves the flag controlling instruction referencing out of TargetOptions and into a plain old LLVM option. I don't think TargetOptions is the right place for this to be, mostly because I've tried twice [0, 1] to have instruction referencing on-by-default in clang, and it's still not working out (which is highly unfortunate). Both clang, and seemingly other frontends, populate TargetOptions with their own opinions on what codegen should look like, rather than starting with the defaults from codegen::InitTargetOptionsFromCodeGenFlags. This doesn't make it a good place to control an *optional* extra part of LLVM. Plus, it's harder to control from LTO [2].

This patch switches the truth-test for using instruction referencing from a field in TargetOptions to the added debuginfoShouldUseDebugInstrRef method, which performs the same checks on Triple + command line flags. This will actually turn on instruction referencing for x86 on all frontends, like I intended in [0] :/. I would add an end-to-end test checking that clang does the right thing (produces DBG_INSTR_REF after SelectionDAG), but there doesn't seem to be a right place to put it (cross-project-tests maybe, but it doesn't have any proper compiler tests there). CC @dblaikie in case he has an opinion on such testing.

It's massively rubbish that this has cut out a large amount of time where I thought this was getting test exposure (possibly LTO is still testing it), I just didn't understand how TargetOptions worked. That might mean this has to be reverted / disabled on the release branch (sad faces) if there's turbulence. Too bad.

[0] D114631
[1] rGea22fdd120
[2] https://lists.llvm.org/pipermail/llvm-dev/2020-December/147090.html

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

jmorse created this revision.Jan 7 2022, 10:11 AM

Herald added subscribers: pengfei, hiraditya. · View Herald TranscriptJan 7 2022, 10:11 AM

jmorse requested review of this revision.Jan 7 2022, 10:11 AM

Herald added a project: Restricted Project. · View Herald TranscriptJan 7 2022, 10:11 AM

Herald added a subscriber: llvm-commits. · View Herald Transcript

(cross-project-tests maybe, but it doesn't have any proper compiler tests there

Not sure what you mean by "it doesn't have any proper compiler tests there", proper in what sense? I expect you could do something like my recent simplified template names roundtrip test ( cross-project-tests/debuginfo-tests/clang_llvm_roundtrip/simplified_template_names.cpp ) which runs clang and checks the result with llvm-dwarfdump. Presumably the same could be done for this functionality - validating that some certain locations that are a signal that instruction referencing was used.

In D116821#3227958, @dblaikie wrote:

Not sure what you mean by "it doesn't have any proper compiler tests there", proper in what sense? I expect you could do something like my recent simplified template names roundtrip test ( cross-project-tests/debuginfo-tests/clang_llvm_roundtrip/simplified_template_names.cpp ) which runs clang and checks the result with llvm-dwarfdump. Presumably the same could be done for this functionality - validating that some certain locations that are a signal that instruction referencing was used.

Thanks for the example; by "proper compiler tests" I was meaning that the existing tests involve examining or running the produced binary, rather than stopping compilation half way through and looking at internal representation details. i.e.: clang test.c -mllvm -stop-after=finalize-isel -o - | FileCheck blah. That's an easy test to write, but there's no similar prior art as far as I understand it.

I'll have a look at detecting a "signal" that the right code-paths were used, although it's ultimately a proxy for what I want to test.

In D116821#3227990, @jmorse wrote:

In D116821#3227958, @dblaikie wrote:

Not sure what you mean by "it doesn't have any proper compiler tests there", proper in what sense? I expect you could do something like my recent simplified template names roundtrip test ( cross-project-tests/debuginfo-tests/clang_llvm_roundtrip/simplified_template_names.cpp ) which runs clang and checks the result with llvm-dwarfdump. Presumably the same could be done for this functionality - validating that some certain locations that are a signal that instruction referencing was used.

Thanks for the example; by "proper compiler tests" I was meaning that the existing tests involve examining or running the produced binary, rather than stopping compilation half way through and looking at internal representation details. i.e.: clang test.c -mllvm -stop-after=finalize-isel -o - | FileCheck blah. That's an easy test to write, but there's no similar prior art as far as I understand it.

Oh, that seems OK to test too/instead - though I'd sort of lean towards cross project tests being more end-to-end "functional" tests, like what's the external behavior we expect to see for this change (I assume that's some improved location quality/coverage/etc). It might be that this. stop-after test might be suitable to live in clang. (tests for things that aren't IR serialized are imperfect - MCoptions (like DWARF type units), etc, - either are untested in clang, or are tested by going a bit further into LLVM than we usually would for a clang test, but I think that's a limitation we live with)

Harbormaster completed remote builds in B142114: Diff 398180.Jan 7 2022, 11:44 AM

Questions around testing aside, the code changes here LGTM.

Cheers Stephen; I'm leaning towards not adding or altering tests here -- there's already a test for this option, and that LLVM does the right thing in LLC. What changes is a matter of storage: instead of the flag being stored in TargetOptions, which every frontend initializes itself it seems, it's now a (global) flag to LLVM. I'd prefer to stay away from parsing MIR in clang tests!.

This revision was not accepted when it landed; it landed in state Needs Review.Jan 12 2022, 5:28 AM

This revision was landed with ongoing or failed builds.

Closed by commit rG6a605b97a200: [DebugInfo] Move flag for instr-ref to LLVM option, from TargetOptions (authored by jmorse). · Explain Why

This revision was automatically updated to reflect the committed changes.

jmorse added a commit: rG6a605b97a200: [DebugInfo] Move flag for instr-ref to LLVM option, from TargetOptions.

+- 6% compile time regresion observed. This is kinda huge change.

http://llvm-compile-time-tracker.com/compare.php?from=23e8a4d16018076e6eed4aa821c020b7fe348c69&to=6a605b97a2006bd391f129a606483656b7c6fb28&stat=instructions

(Apologies that this didn't arrive in the patch that said "turn instr-ref on by default", there were a few mistakes there),

In D116821#3253911, @xbolva00 wrote:

+- 6% compile time regresion observed. This is kinda huge change.

http://llvm-compile-time-tracker.com/compare.php?from=23e8a4d16018076e6eed4aa821c020b7fe348c69&to=6a605b97a2006bd391f129a606483656b7c6fb28&stat=instructions

Yup this enables a significantly different mechanism for tracking variable locations [0] -- if you switch to the "size-file" difference on the compile time tracker, there's a corresponding (in one sample 13.5%) increase in DWARF + location lists produced. It significantly increases our variable location coverage: on the one hand there will definitely be a performance cost caused by producing more information, on the other hand, maybe it can be faster.

[0] https://www.youtube.com/watch?v=yxuwfNnp064

... although when I tested this prior to landing [0] there weren't any double-digit instruction-count regressions, I'll check whether I've missed a patch or something.

[0] http://llvm-compile-time-tracker.com/compare.php?from=849b17949f18b5753592588a2290f2f3dde73ac0&to=f484f03178d47dfc06442b1878462e7c2b7fedf6&stat=instructions

In D116821#3254088, @jmorse wrote:

... although when I tested this prior to landing [0] there weren't any double-digit instruction-count regressions, I'll check whether I've missed a patch or something.

^ Ignore that, turns out I can't read, this pretty much matches what I got pre-landing.

tl;dr: this is working as intended -- whether it's a worthwhile trade-off is open to discussion. For testing / profiling examples / opting out, the option is "-mllvm -experimental-debug-variable-locations=false".

Yeah, the compile-time impact here is not great :( I have some hope that some of this can be mitigated. I've applied a low-hanging optimization in https://github.com/llvm/llvm-project/commit/cbaae614224211d01b385227e7efb447c87fc3ce and another is in D117575, but someone who actually understands what this code is doing algorithmically could probably identify more optimization opportunities.

Looks like I can do some work reduction for a common case that cuts time off the worst regression [0], will have that up later today, although it's not NFC.

[0] http://llvm-compile-time-tracker.com/compare.php?from=1b09d0c42b42be219dd0984e0714d68b4a36cd3e&to=ac78c9434e1a33a137c07054f1c2300083524ca0&stat=instructions

uabelho added a subscriber: uabelho.Jan 21 2022, 4:22 AM

I wrote
https://github.com/llvm/llvm-project/issues/53336
about an X86 verifier error that popped up with this patch.

D117877, I'll try and cook up a few similar patches elsewhere too.

In D116821#3261011, @uabelho wrote:

I wrote
https://github.com/llvm/llvm-project/issues/53336
about an X86 verifier error that popped up with this patch.

Cheers, will get on that next.

I ran into a regression with this patch, which triggers the error "fatal error: error in backend: unknown codeview register ST0" on code that built fine before that.

Here's a reduced reproducer:

$ cat dither.c 
a() {
  long double b;
  asm volatile("" : "=t"(b));
}
$ clang -target x86_64-w64-mingw32 -c -O2 -g -gcodeview dither.c 
fatal error: error in backend: unknown codeview register ST0
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.	Program arguments: bin/clang -target x86_64-w64-mingw32 -c -O2 -g -gcodeview dither.c
1.	<eof> parser at end of file
2.	Code generation
3.	Running pass 'Function Pass Manager' on module 'dither.c'.
4.	Running pass 'X86 Assembly Printer' on function '@a'

This makes it impossible to build VLC for x86_64 windows with PDB debug info, which worked fine before.

In D116821#3261109, @mstorsjo wrote:

I ran into a regression with this patch, which triggers the error "fatal error: error in backend: unknown codeview register ST0" on code that built fine before that.

Here's a reduced reproducer:

$ cat dither.c 
a() {
  long double b;
  asm volatile("" : "=t"(b));
}
$ clang -target x86_64-w64-mingw32 -c -O2 -g -gcodeview dither.c 
fatal error: error in backend: unknown codeview register ST0
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.	Program arguments: bin/clang -target x86_64-w64-mingw32 -c -O2 -g -gcodeview dither.c
1.	<eof> parser at end of file
2.	Code generation
3.	Running pass 'Function Pass Manager' on module 'dither.c'.
4.	Running pass 'X86 Assembly Printer' on function '@a'

Do @rnk or @aganea have any clue about this?

In D116821#3261205, @mstorsjo wrote:
In D116821#3261109, @mstorsjo wrote:
I ran into a regression with this patch, which triggers the error "fatal error: error in backend: unknown codeview register ST0" on code that built fine before that.

Here's a reduced reproducer:
$ cat dither.c 
a() {
  long double b;
  asm volatile("" : "=t"(b));
}
$ clang -target x86_64-w64-mingw32 -c -O2 -g -gcodeview dither.c 
fatal error: error in backend: unknown codeview register ST0
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.	Program arguments: bin/clang -target x86_64-w64-mingw32 -c -O2 -g -gcodeview dither.c
1.	<eof> parser at end of file
2.	Code generation
3.	Running pass 'Function Pass Manager' on module 'dither.c'.
4.	Running pass 'X86 Assembly Printer' on function '@a'
Do @rnk or @aganea have any clue about this?

Some mappings seem to be missing, but I don't know why the Location->Register didn't return the FPU stack ST0-ST7 before? Here: https://github.com/llvm/llvm-project/blob/main/llvm/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp#L1348

diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp b/llvm/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp
index 9da0a8129f23..c83c211b0c58 100644
--- a/llvm/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp
@@ -111,6 +111,15 @@ void X86_MC::initLLVMToSEHAndCVRegMapping(MCRegisterInfo *MRI) {

       {codeview::RegisterId::EFLAGS, X86::EFLAGS},

+      {codeview::RegisterId::ST0, X86::ST0},
+      {codeview::RegisterId::ST1, X86::ST1},
+      {codeview::RegisterId::ST2, X86::ST2},
+      {codeview::RegisterId::ST3, X86::ST3},
+      {codeview::RegisterId::ST4, X86::ST4},
+      {codeview::RegisterId::ST5, X86::ST5},
+      {codeview::RegisterId::ST6, X86::ST6},
+      {codeview::RegisterId::ST7, X86::ST7},
+
       {codeview::RegisterId::ST0, X86::FP0},
       {codeview::RegisterId::ST1, X86::FP1},
       {codeview::RegisterId::ST2, X86::FP2},

In D116821#3261927, @aganea wrote:

Some mappings seem to be missing, but I don't know why the Location->Register didn't return the FPU stack ST0-ST7 before?

Ah, I might have an explanation -- the x87 FP stackifier pass complete ignores debug instructions, and so DBG_VALUEs of $fp0 pass through to the DWARF/CodeView emitter untouched. Wheras the "new" variable location tracking extracts physreg locations from the instructions that define values, so it'll produce references to ST0-7 taken from "real" instructions, instead of unstackified FP0-7.

I'm no expert on x87, as I understand it the registers are referred to in a stack but the old variable location tracking paid no attention to the stack changing depth, and that probably lead to lots of wrong variable locations. I haven't tried to fix this in "instruction referencing" mode, but it's technically do-able. I was rather hoping that everyone just used the SSE registers on x86_64, but I guess that doesn't happen when people us "long double".

Hi Jeremy,
after this patch we see significantly increased compile-time memory usage, which makes a large number of targets to fail to compile with the (already quite large) memory limit we have on our build cluster. I'll try to get specific numbers and maybe an example, but it seems like this patch has raised enough concerns to maybe flip the default back while all of these are being addressed?

In D116821#3261927, @aganea wrote:

In D116821#3261205, @mstorsjo wrote:
In D116821#3261109, @mstorsjo wrote:
I ran into a regression with this patch, which triggers the error "fatal error: error in backend: unknown codeview register ST0" on code that built fine before that.

Here's a reduced reproducer:
$ cat dither.c 
a() {
  long double b;
  asm volatile("" : "=t"(b));
}
$ clang -target x86_64-w64-mingw32 -c -O2 -g -gcodeview dither.c 
fatal error: error in backend: unknown codeview register ST0
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.	Program arguments: bin/clang -target x86_64-w64-mingw32 -c -O2 -g -gcodeview dither.c
1.	<eof> parser at end of file
2.	Code generation
3.	Running pass 'Function Pass Manager' on module 'dither.c'.
4.	Running pass 'X86 Assembly Printer' on function '@a'
Do @rnk or @aganea have any clue about this?

diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp b/llvm/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp
index 9da0a8129f23..c83c211b0c58 100644
--- a/llvm/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp
@@ -111,6 +111,15 @@ void X86_MC::initLLVMToSEHAndCVRegMapping(MCRegisterInfo *MRI) {

       {codeview::RegisterId::EFLAGS, X86::EFLAGS},

+      {codeview::RegisterId::ST0, X86::ST0},
+      {codeview::RegisterId::ST1, X86::ST1},
+      {codeview::RegisterId::ST2, X86::ST2},
+      {codeview::RegisterId::ST3, X86::ST3},
+      {codeview::RegisterId::ST4, X86::ST4},
+      {codeview::RegisterId::ST5, X86::ST5},
+      {codeview::RegisterId::ST6, X86::ST6},
+      {codeview::RegisterId::ST7, X86::ST7},
+
       {codeview::RegisterId::ST0, X86::FP0},
       {codeview::RegisterId::ST1, X86::FP1},
       {codeview::RegisterId::ST2, X86::FP2},

Thanks @aganea! Would you mind posting such a patch for review, or should I? I’m not sure how to create a simple enough and robust testcase for it, but maybe it’s not strictly necessary when it’s just adding more enum entries in a table?

In D116821#3263011, @alexfh wrote:

Hi Jeremy,
after this patch we see significantly increased compile-time memory usage, which makes a large number of targets to fail to compile with the (already quite large) memory limit we have on our build cluster. I'll try to get specific numbers and maybe an example, but it seems like this patch has raised enough concerns to maybe flip the default back while all of these are being addressed?

He explained that there is nothing much what is possible to do - you get better debug info but also CT and MU is a bit worse. But if I want debug info (-g), I want the best possible one, so …

Why to pull every useful work out of trunk just because somebody has some issue? There is a flag, consider using it.

In D116821#3263418, @xbolva00 wrote:

In D116821#3263011, @alexfh wrote:

Hi Jeremy,
after this patch we see significantly increased compile-time memory usage, which makes a large number of targets to fail to compile with the (already quite large) memory limit we have on our build cluster. I'll try to get specific numbers and maybe an example, but it seems like this patch has raised enough concerns to maybe flip the default back while all of these are being addressed?

He explained that there is nothing much what is possible to do - you get better debug info but also CT and MU is a bit worse. But if I want debug info (-g), I want the best possible one, so …

Why to pull every useful work out of trunk just because somebody has some issue? There is a flag, consider using it.

For what it's worth, this comes off a bit snarky/hyperbolic.

There's certainly tradeoffs to be made - "no regression ever" isn't a useful stance (there's some amount of compile time, memory usage, etc, that folks would likely be willing to pay for some benefit), but equally there is some level of regression that's "too much".

I don't think anyone's intending to suggest this work is dead/must be thrown out, but if the regression is more than expected (that's not clear to me - I think this patch should've included more numbers about expected regressions, and those suggesting that it might be suitable to revert should be documenting the level of regressions they're seeing - hopefully demonstrating more significant regressions than what's expected) it's probably reasonable to revert, analyze, and see if there are some extreme cases that weren't encountered/understood before the patch was committed, etc.

@alexfh - do you have some numbers to help clarify how significant these regressions are?
@jmorse - do you have some numbers to help clarify what sort of regressions you were anticipating this patch causing?

it's probably reasonable to revert, analyze, and see if there are some extreme cases

Revert… So (silent) users, which are now or would be (soon new release) happy with better debug info, would just lose this feature?

Otherwise I dont know why many new bigger things in llvm go under own flag if not for situations like this one.

In D116821#3263746, @xbolva00 wrote:

it's probably reasonable to revert, analyze, and see if there are some extreme cases

Revert… So (silent) users, which are now or would be (soon new release) happy with better debug info, would just lose this feature?

Similarly, they could use the flag to continue to opt in to the behavior.

Otherwise I dont know why many new bigger things in llvm go under own flag if not for situations like this one.

Generally if it ends up being a feature where a small subset of users have an esoteric need compared to the norm, they could continue to use the flag to opt out.

But if the tradeoff is more broadly unfavorable, then that probably indicates the default should reflect that.

It's unclear where we are in this balance without numbers in either direction.

But also if the observed regressions are larger than the documented/intended, that's probably reflective of some considerations that hadn't been accounted before before committing, and so worth reverting to known good and investigating those considerations to make sure they're understood before moving forward.

Similarly, they could use the flag to continue to opt in to the behavior.

Ah yeah, if you meant "revert = set this as false by default" than it is a reasonable thing to do if regressions are too large for common codebases.

In D116821#3263871, @xbolva00 wrote:

Similarly, they could use the flag to continue to opt in to the behavior.

Ah yeah, if you meant "revert = set this as false by default" than it is a reasonable thing to do if regressions are too large for common codebases.

Yeah, that's what I mean - "revert the observable effect of this change". Mechanically I'm open to ideas and I don't think we need to churn this whole patch. Probably would've been good to make the behavior changing patch smaller/simpler so it'd be easy to revert/toggle off, but 'tis what it is. So, yeah, might be suitable to commit a change to the comparisons in debuginfoShouldUseDebugInstrRef to disable this functionality for now for further investigation.

I agree with @dblaikie that a bit more time for @alexfh's numbers would be good -- I'd be especially interested in the code characteristics (i.e.: is this a large C++ codebase under LTO, with things like exceptions?). If there are any runs where the build completes, a comparison of how much debug-info is in the before/after files would be great, if that's possible.

@jmorse - do you have some numbers to help clarify what sort of regressions you were anticipating this patch causing?

For compile time on large C++ codebases, something in the order of 1% regression as the clang frontend took up the vast majority of compile time; on smaller benchmarks, pretty much in line with what CTMark on the compile-time tracker was reporting. I think with the patches from @nikic (cheers!), D117877 and something else I've got cooking then the regression on CTMark should be 2-3%. Caroline kindly ran some tests too [0], unfortunately one of the algorithms changed a lot between now and then :/,

For memory consumption, I only observed negligible changes in max-rss when testing C++ codebases, CTMarks numbers for max-rss are roughly 1%. My general expectation is that the instr-ref LiveDebugValues implementation shouldn't be responsible for peak memory usage as it only fully analyses one variable at a time, instead any greater memory usage Should (TM) come from storing the location information for all variables before DWARF emissions happens. If that holds, any increase in max-rss should be somewhat proportionate to a change in the output file size / output debug-info. I can see a few outcomes here:

Optimistic: the increase in memory usage is in proportion with more debug-info being produced. Discussion should then move to whether we should generate all that data,
Pessimistic: there's little change in how much debug-info is produced, but lots of extra memory used. That's almost certainly a bug or design problem, to be reverted and investigated,
Complete misery: I'm aware of a few pathological scenarios that LLVM can't cope with, usually extremely large functions with lots of inlining, where describing all variables locations at the same time simply won't fit in memory. In the past that's been "fixed" by truncating the debug-info for such functions [1]. If that kind of scenario is occurring it's revert-and-write-more-code.

[0] https://lists.llvm.org/pipermail/llvm-dev/2021-September/152457.html
[1] https://reviews.llvm.org/D80662

In D116821#3265685, @jmorse wrote:

For memory consumption, I only observed negligible changes in max-rss when testing C++ codebases, CTMarks numbers for max-rss are roughly 1%.

Just a quick note on this: The max-rss numbers reported by the compile-time tracker are quite misleading in this case, because they add up max-rss from all compilations and the final link, while in this case only link memory usage increases. Here are the numbers from just the ReleaseLTO-g link steps:

sqlite3.link 	191MiB 	195MiB (+2.26%)
consumer-typeset.link 	155MiB 	155MiB (+0.07%)
bullet.link 	208MiB 	207MiB (-0.04%)
tramp3d-v4.link 	613MiB 	620MiB (+1.02%)
pairlocalalign.link 	99MiB 	126MiB (+26.85%)
clamscan.link 	213MiB 	216MiB (+1.23%)
lencod.link 	257MiB 	275MiB (+6.90%)
SPASS.link 	254MiB 	343MiB (+35.24%)
7zip-benchmark.link 	439MiB 	442MiB (+0.67%)

The 25% and 35% increases on pairlocalalign and SPASS stand out as outliers here and might be good (relatively small) test cases to double check.

Hi,

In D116821#3265861, @nikic wrote:

Just a quick note on this: The max-rss numbers reported by the compile-time tracker are quite misleading in this case, because they add up max-rss from all compilations and the final link, while in this case only link memory usage increases. Here are the numbers from just the ReleaseLTO-g link steps:

Ah, right!, that explains a few things, thanks for the tip,

SPASS.link 	254MiB 	343MiB (+35.24%)

Now that's unexpected, I'll dig in there. This certainly moves the direction of travel towards disabling/reverting.

jmorse added a reverting change: rG74db5c8c95e2: Revert rG6a605b97a200 due to excessive memory use.Jan 24 2022, 9:08 AM

Pushed up 74db5c8c95 that effectively reverts this patch, through disabling the on-by-default-for-x86 portion of code.

I've dug through the CTMark SPASS sample and found some now-obvious problems. I reckon they can be addressed, but not immediately, so it's best to revert for the moment.

This raises some uncertainty regarding the LLVM14 branch date; I'll post back here closer to the time.

In D116821#3266620, @jmorse wrote:

Pushed up 74db5c8c95 that effectively reverts this patch, through disabling the on-by-default-for-x86 portion of code.

I've dug through the CTMark SPASS sample and found some now-obvious problems. I reckon they can be addressed, but not immediately, so it's best to revert for the moment.

This raises some uncertainty regarding the LLVM14 branch date; I'll post back here closer to the time.

You could default this option to just false, instead of ifdefing it out, so anybody could enable it explicitly.

In D116821#3266623, @xbolva00 wrote:

You could default this option to just false, instead of ifdefing it out, so anybody could enable it explicitly.

Possibly I'm misunderstanding but: below the #if 0'd block, there's a test of the cl::opt that allows that:

return ValueTrackingVariableLocations == cl::boolOrDefault::BOU_TRUE;

So that anyone can specify -mllvm -experimental-debug-variable-locations=true to get the instr-ref behaviour. The command-line-flag defaults to "unset", causing llvm::debuginfoShouldUseDebugInstrRef to return false on "main" right now.

It's mostly that we try very hard to not have #if 0 blocks in code please :)

The XFAIL * means that it also won't show up when you try to turn it on or
if something breaks. An option will let you explicitly keep that code
tested.

Thanks!

-eric

jmorse mentioned this in rGd27f02261442: [NFC][DebugInfo] Strip out an undesired #if 0 block.Jan 24 2022, 10:05 AM

(Block stripped out in d27f022614)

In D116821#3263405, @mstorsjo wrote:

Thanks @aganea! Would you mind posting such a patch for review, or should I? I’m not sure how to create a simple enough and robust testcase for it, but maybe it’s not strictly necessary when it’s just adding more enum entries in a table?

Feel free to go ahead with a patch!

mstorsjo mentioned this in D118072: [X86] [CodeView] Add codeview mappings for registers ST0-ST7.Jan 24 2022, 1:13 PM

I found one particularly bad input. Here what time -v clang -fno-omit-frame-pointer -fcommon -no-canonical-prefixes -fdiagnostics-show-option -fmessage-length=0 -fno-exceptions -fbracket-depth=768 -fno-strict-aliasing -fmerge-all-constants -flax-vector-conversions=all -funsigned-char -Xclang=-fuse-ctor-homing -fcolor-diagnostics --cuda-host-only -faligned-allocation -fnew-alignment=8 -fshow-overloads=best -mllvm -disable-binop-extract-shuffle -fdebug-types-section -O2 -momit-leaf-frame-pointer -ffp-contract=off --target=x86_64-unknown-linux-gnu -fno-unique-section-names -maes -m64 -mcx16 -msse4.2 -mpclmul -mprefer-vector-width=128 -gmlt -gz=zlib -fdebug-default-version=5 -fsanitize=address,bool,bounds,builtin,float-cast-overflow,integer-divide-by-zero,null,return,returns-nonnull-attribute,shift-exponent,signed-integer-overflow,unreachable,vla-bound -fsanitize-address-globals-dead-stripping -fsanitize-trap=null -fno-sanitize-recover=all -fno-omit-frame-pointer -fsized-deallocation -fPIE -fdirect-access-external-data -ffunction-sections -fdata-sections -pthread -fno-sanitize-trap=all -std=gnu++17 -stdlib=libc++ -Wno-deprecated-declarations -Wno-inconsistent-missing-override -save-stats -c q.cc -o q.o says 1. at rG74db5c8c95e2aed40d288bb2df92eb859b87c827 (with the new behavior disabled) and 2. at rG80532ebb508d0ca62f96df5f253db8caed969397 (right before rG74db5c8c95e2aed40d288bb2df92eb859b87c827):

1

User time (seconds): 190.92
System time (seconds): 2.16
Percent of CPU this job got: 99%
Elapsed (wall clock) time (h:mm:ss or m:ss): 3:13.10
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 4926000
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 27080
Voluntary context switches: 8
Involuntary context switches: 579
Swaps: 0
File system inputs: 0
File system outputs: 54168
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0

2

User time (seconds): 255.60
System time (seconds): 8.09
Percent of CPU this job got: 99%
Elapsed (wall clock) time (h:mm:ss or m:ss): 4:23.71
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 27272308
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 32213
Voluntary context switches: 8
Involuntary context switches: 779
Swaps: 0
File system inputs: 0
File system outputs: 54168
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0

The most problematic are User time (seconds): 190.92 -> 255.60 (+34%) and Maximum resident set size (kbytes): 4926000 -> 27272308 (5.5x).

As for the nature of the code, it's a large generated file (preprocessed size ~22MB) with a few kinds of boilerplate (e.g. a method with a ton of v->push_back({some, constant}); calls and other stuff). I can try to find out which of them is responsible for most of the increase.

mstorsjo mentioned this in rG70cb8daed45f: [X86] [CodeView] Add codeview mappings for registers ST0-ST7.Jan 25 2022, 12:10 AM

I reduced our problematic file to a function with a couple thousand variables of a type containing a few std::function<>s. This can be demonstrated by this synthetic test case (compiled on x86-64 linux with libc++):

#include <functional>
#include <vector>
namespace n {

using F = std::function<void()>;

struct S {
  F f1;
  F f2;
  F f3;
};

std::vector<S> f() {
  std::vector<S> v;
  #define X(s) S s; v.push_back(s)
  #define XX(s) \
    X(s ## 0); \
    X(s ## 1); \
    X(s ## 2); \
    X(s ## 3); \
    X(s ## 4); \
    X(s ## 5); \
    X(s ## 6); \
    X(s ## 7); \
    X(s ## 8); \
    X(s ## 9)
  #define XXX(s) \
    XX(s ## 0); \
    XX(s ## 1); \
    XX(s ## 2); \
    XX(s ## 3); \
    XX(s ## 4); \
    XX(s ## 5); \
    XX(s ## 6); \
    XX(s ## 7); \
    XX(s ## 8); \
    XX(s ## 9)
  #define XXXX(s) \
    XXX(s ## 0); \
    XXX(s ## 1); \
    XXX(s ## 2); \
    XXX(s ## 3); \
    XXX(s ## 4); \
    XXX(s ## 5); \
    XXX(s ## 6); \
    XXX(s ## 7); \
    XXX(s ## 8); \
    XXX(s ## 9)

  XXXX(a);
  return v;
}
}

$ time -v ./clang-74db5c8c95e2aed40d288bb2df92eb859b87c827 -O1 --target=x86_64-generic-linux-gnu -g -fsanitize=address -fsized-deallocation -std=c++17 -c q.cc -o q.o
        Command being timed: "./clang-74db5c8c95e2aed40d288bb2df92eb859b87c827 -O1 --target=x86_64-generic-linux-gnu -g -fsanitize=address -fsized-deallocation -std=c++17 -c q.cc -o q.o"
        User time (seconds): 20.13
        System time (seconds): 0.31
        Percent of CPU this job got: 99%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:20.46
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 1029064
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 8101
        Voluntary context switches: 8
        Involuntary context switches: 42
        Swaps: 0
        File system inputs: 0
        File system outputs: 5104
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0
$ time -v ./clang-80532ebb508d0ca62f96df5f253db8caed969397 -O1 --target=x86_64-generic-linux-gnu -g -fsanitize=address -fsized-deallocation -std=c++17 -c q.cc -o q.o
        Command being timed: "./clang-80532ebb508d0ca62f96df5f253db8caed969397 -O1 --target=x86_64-generic-linux-gnu -g -fsanitize=address -fsized-deallocation -std=c++17 -c q.cc -o q.o"
        User time (seconds): 66.67
        System time (seconds): 2.06
        Percent of CPU this job got: 99%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 1:08.75
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 7632172
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 12537
        Voluntary context switches: 8
        Involuntary context switches: 223
        Swaps: 0
        File system inputs: 0
        File system outputs: 4872
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0

Note the 3.3x run time and ~7.5x maximum rss with this patch. But what's surprising to me is the high run time and memory consumption of clang even without this patch =\

jmorse mentioned this in D118460: [DebugInfo][InstrRef] Use a depth-first search to reduce the lifetime of tracking information.Jan 28 2022, 5:26 AM

Hi @alexfh , first many thanks for coming back with details and a reproducer!,

The most problematic are User time (seconds): 190.92 -> 255.60 (+34%) and Maximum resident set size (kbytes): 4926000 -> 27272308 (5.5x).

Oooooff :/,

As for the nature of the code, it's a large generated file (preprocessed size ~22MB) with a few kinds of boilerplate (e.g. a method with a ton of v->push_back({some, constant}); calls and other stuff). I can try to find out which of them is responsible for most of the increase.

As ever, generated code can easily stimulates limitations, and in this case it's really exciting: the culprit is ASAN having thousands of live pointers to wherever it is asan stores things. They all get spilt to the stack: and because my working assumption when writing this code was that LLVM would usually seek to minimise the working set size for obvious reasons, InstrRefBasedLDV tries to track all the spilt values in every block, which in this case is thousands of slots. That leads to the ballooning memory consumption, which won't be fixed by the other improvements I've made. The cherry on the cake is that, because asan pointers don't have any variables associated with them, there's zero improvement in debug experience from instr-ref on this sample.

VarLocBasedLDV does just fine because it only tracks locations after they're assigned to a variable; instr-ref tracks everything to increase the number of locations that are known to contain a value, hence this explosion.

An easy way out would be to set a hard limit on the number of spill slots being tracked, on the (to me) fairly reasonable principle of "If your working set is over (say) 1000 Values, you probably autogenerated this code and probably won't be debugging it by hand", much like existing cut-offs in LiveDebugValues. I instinctively don't want to add arbitrary limits though.

I'll give it some more thought over the weekend. The assumption about the working set being small is pretty heavily baked into what I've been writing though.

In D116821#3279890, @jmorse wrote:

An easy way out would be to set a hard limit on the number of spill slots being tracked, on the (to me) fairly reasonable principle of "If your working set is over (say) 1000 Values, you probably autogenerated this code and probably won't be debugging it by hand", much like existing cut-offs in LiveDebugValues. I instinctively don't want to add arbitrary limits though.

(... or maybe for large inputs reduce the (currently highly detailed) quality of information about stack slots, down to the level VarLocBasedLDV works at. You probably don't care about individual subregister-fields within slots if you have so many Values. Quickly hacking shows this reduces memory blow-up to 10%, time to 15%).

Regarding a value cutoff, I think that makes sense. To date, LLVM has never intentionally thrown away correct debug info, because historically we've had so preciously little of it. However, as we get better at tracking debug info, I think we're going to need more heuristics like this to cutoff excessive amounts of debug information.

Anecdotally, debug info for mostly trivial inlined call frames (unique_ptr::get, operator*, etc) takes up a huge chunk of PDBs for Chrome. I think it might be time to start developing heuristics, modes, config files, or flags that discard trivial information like this to save file size and compile time.

jmorse mentioned this in D118601: [DebugInfo][InstrRef] Add a max-stack-slots-to-track limitation / cut-out.Jan 31 2022, 5:31 AM

Sitrep on this:

I've added a cut-off in D118601 that limits max-rss growth in the sample from @alexfh to "only" about 12%, which given its autogenerated nature, seems reasonable to me. I've got some scripts working out what the cut-off value would be for some LTO builds of large codebases, I think the best plan is "pick a reasonable number and double it".
Other patches in the related (largely approvedish) stack address most of the growth in max-rss for CTMark, although it still goes up by 10% for SPASS. That's in a function with 220k instructions, 50k of which are DBG_VALUEs, 90k DBG_VALUEs with instr-ref.
On the performance front, the regression in compile-time is now down to less than 2% geomean [0], there are still one or two tricks I can pull.

However I'm out of the (figurative) office for most of tomorrow, so not all of this is going to hit by branch time. I'll wave the situation in Tom Stellards direction and see what happens.

[0] http://llvm-compile-time-tracker.com/compare.php?from=30efee764d95d425b480758ed02682fab6426d28&to=2de5ce8245a00d52e397057d5976d89f3002220d&stat=instructions

@alexfh - while it's great to have highlighted this issue upstream (& I think it's been valuable to have motivated the improvements so far) - if you want to CC me on the internal bug or whatever, maybe we can see if __attribute__((nodebug)) or the like might be suitable on some of this large autogenerated code.

In D116821#3280137, @rnk wrote:

Regarding a value cutoff, I think that makes sense. To date, LLVM has never intentionally thrown away correct debug info, because historically we've had so preciously little of it. However, as we get better at tracking debug info, I think we're going to need more heuristics like this to cutoff excessive amounts of debug information.

Anecdotally, debug info for mostly trivial inlined call frames (unique_ptr::get, operator*, etc) takes up a huge chunk of PDBs for Chrome. I think it might be time to start developing heuristics, modes, config files, or flags that discard trivial information like this to save file size and compile time.

FTR, Sony downstream has implemented an option that will automatically treat certain methods and functions as nodebug. The criteria, IIRC, are: method with in-class definition; explicitly marked inline; or attribute always_inline. It gets rid of a lot of the fluff. If there's interest I can put it on my to-do list to post a patch.

In D116821#3285685, @probinson wrote:

In D116821#3280137, @rnk wrote:

Regarding a value cutoff, I think that makes sense. To date, LLVM has never intentionally thrown away correct debug info, because historically we've had so preciously little of it. However, as we get better at tracking debug info, I think we're going to need more heuristics like this to cutoff excessive amounts of debug information.

Anecdotally, debug info for mostly trivial inlined call frames (unique_ptr::get, operator*, etc) takes up a huge chunk of PDBs for Chrome. I think it might be time to start developing heuristics, modes, config files, or flags that discard trivial information like this to save file size and compile time.

FTR, Sony downstream has implemented an option that will automatically treat certain methods and functions as nodebug. The criteria, IIRC, are: method with in-class definition; explicitly marked inline; or attribute always_inline. It gets rid of a lot of the fluff. If there's interest I can put it on my to-do list to post a patch.

Hmm - I'd at least be curious to know more about the criteria used. Those seem a bit more aggressive than I'd have figured - non-trivial inline functions in headers seem like they could be quite confusing to treat as nodebug.

jmorse mentioned this in rG3fab2d138e30: [DebugInfo][InstrRef] Add a max-stack-slots-to-track cut-out.Feb 1 2022, 11:26 AM

In D116821#3286031, @dblaikie wrote:

Hmm - I'd at least be curious to know more about the criteria used. Those seem a bit more aggressive than I'd have figured - non-trivial inline functions in headers seem like they could be quite confusing to treat as nodebug.

Well, those are the criteria we landed on. And of course it's controlled by a command-line option, so you can get everything if you want.

The argument (made by licensees; we didn't make this up out of thin air) is that in-class defined methods, and those explicitly marked as inline or always_inline in the source, do tend to be small and not the source of real bugs. Also, making it obvious from the source whether a given function would be affected was a positive point. Yes, a team could spend a lot of effort adding nodebug everywhere, but having the quick and easy ability to _get_ debug info for everything (by fiddling a command-line option, no tedious source modifications required) was also a win.

You can invent a counter example easily, but in our experience the counter-examples aren't the norm, at least for the licensees who asked for this.

In D116821#3288655, @probinson wrote:

In D116821#3286031, @dblaikie wrote:

Hmm - I'd at least be curious to know more about the criteria used. Those seem a bit more aggressive than I'd have figured - non-trivial inline functions in headers seem like they could be quite confusing to treat as nodebug.

Well, those are the criteria we landed on. And of course it's controlled by a command-line option, so you can get everything if you want.

The argument (made by licensees; we didn't make this up out of thin air) is that in-class defined methods, and those explicitly marked as inline or always_inline in the source, do tend to be small and not the source of real bugs. Also, making it obvious from the source whether a given function would be affected was a positive point. Yes, a team could spend a lot of effort adding nodebug everywhere, but having the quick and easy ability to _get_ debug info for everything (by fiddling a command-line option, no tedious source modifications required) was also a win.

You can invent a counter example easily, but in our experience the counter-examples aren't the norm, at least for the licensees who asked for this.

cool cool - good to know!

In D116821#3288655, @probinson wrote:

The argument (made by licensees; we didn't make this up out of thin air) is that in-class defined methods, and those explicitly marked as inline or always_inline in the source, do tend to be small and not the source of real bugs. Also, making it obvious from the source whether a given function would be affected was a positive point. Yes, a team could spend a lot of effort adding nodebug everywhere, but having the quick and easy ability to _get_ debug info for everything (by fiddling a command-line option, no tedious source modifications required) was also a win.

I've heard about this behavior in the Sony compiler, but for some reason I never thought about it as a flag. A flag to get this behavior seems like something we could reasonably accept upstream, for all the reasons you describe. It's a reasonable heuristic for some codebases. However, in my experience, there are plenty of complex methods defined inline in class templates, and the existing behavior of retaining all the inlined call site info seems like a better default.

You may recall this came up in the -gno-inline-line-tables patch https://reviews.llvm.org/D67723, which was really driven by debug info size concerns rather than debug experience concerns.

jmorse mentioned this in rG14aaaa12366f: Re-apply 3fab2d138e30, now with a triple added.Feb 2 2022, 3:04 AM

jmorse mentioned this in rG6e03a68b776d: [DebugInfo] Re-enable instruction referencing for x86_64.Feb 2 2022, 11:43 AM

Sitrep: I've re-enabled instr-ref for x86_64 in rG6e03a68b776d. I believe all the issues reported here so far are fixed / optimised further, so I've filed a cherry-pick request in https://github.com/llvm/llvm-project/issues/53548 , which I believe is the correct procedure.

Any more difficulties -> we'll take them as they come.

We noticed another crash with this change that still repros on ToT: https://github.com/llvm/llvm-project/issues/54190

@jmorse can you please check?

Herald added a project: Restricted Project. · View Herald TranscriptMar 3 2022, 6:20 PM

Revision Contents

Path

Size

llvm/

lib/

CodeGen/

CommandFlags.cpp

18 lines

LiveDebugValues/

LiveDebugValues.h

4 lines

LiveDebugValues.cpp

14 lines

MachineFunction.cpp

4 lines

Diff 399297

llvm/lib/CodeGen/CommandFlags.cpp

Show First 20 Lines • Show All 84 Lines • ▼ Show 20 Lines
CGOPT(bool, UniqueBasicBlockSectionNames)		CGOPT(bool, UniqueBasicBlockSectionNames)
CGOPT(EABI, EABIVersion)		CGOPT(EABI, EABIVersion)
CGOPT(DebuggerKind, DebuggerTuningOpt)		CGOPT(DebuggerKind, DebuggerTuningOpt)
CGOPT(bool, EnableStackSizeSection)		CGOPT(bool, EnableStackSizeSection)
CGOPT(bool, EnableAddrsig)		CGOPT(bool, EnableAddrsig)
CGOPT(bool, EmitCallSiteInfo)		CGOPT(bool, EmitCallSiteInfo)
CGOPT(bool, EnableMachineFunctionSplitter)		CGOPT(bool, EnableMachineFunctionSplitter)
CGOPT(bool, EnableDebugEntryValues)		CGOPT(bool, EnableDebugEntryValues)
CGOPT_EXP(bool, ValueTrackingVariableLocations)
CGOPT(bool, ForceDwarfFrameSection)		CGOPT(bool, ForceDwarfFrameSection)
CGOPT(bool, XRayOmitFunctionIndex)		CGOPT(bool, XRayOmitFunctionIndex)
CGOPT(bool, DebugStrictDwarf)		CGOPT(bool, DebugStrictDwarf)
CGOPT(unsigned, AlignLoops)		CGOPT(unsigned, AlignLoops)

codegen::RegisterCodeGenFlags::RegisterCodeGenFlags() {		codegen::RegisterCodeGenFlags::RegisterCodeGenFlags() {
#define CGBINDOPT(NAME) \		#define CGBINDOPT(NAME) \
do { \		do { \
▲ Show 20 Lines • Show All 326 Lines • ▼ Show 20 Lines	#define CGBINDOPT(NAME) \
CGBINDOPT(EmitCallSiteInfo);		CGBINDOPT(EmitCallSiteInfo);

static cl::opt<bool> EnableDebugEntryValues(		static cl::opt<bool> EnableDebugEntryValues(
"debug-entry-values",		"debug-entry-values",
cl::desc("Enable debug info for the debug entry values."),		cl::desc("Enable debug info for the debug entry values."),
cl::init(false));		cl::init(false));
CGBINDOPT(EnableDebugEntryValues);		CGBINDOPT(EnableDebugEntryValues);

static cl::opt<bool> ValueTrackingVariableLocations(
"experimental-debug-variable-locations",
cl::desc("Use experimental new value-tracking variable locations"),
cl::init(false));
CGBINDOPT(ValueTrackingVariableLocations);

static cl::opt<bool> EnableMachineFunctionSplitter(		static cl::opt<bool> EnableMachineFunctionSplitter(
"split-machine-functions",		"split-machine-functions",
cl::desc("Split out cold basic blocks from machine functions based on "		cl::desc("Split out cold basic blocks from machine functions based on "
"profile information"),		"profile information"),
cl::init(false));		cl::init(false));
CGBINDOPT(EnableMachineFunctionSplitter);		CGBINDOPT(EnableMachineFunctionSplitter);

static cl::opt<bool> ForceDwarfFrameSection(		static cl::opt<bool> ForceDwarfFrameSection(
▲ Show 20 Lines • Show All 84 Lines • ▼ Show 20 Lines	codegen::InitTargetOptionsFromCodeGenFlags(const Triple &TheTriple) {
Options.EmitAddrsig = getEnableAddrsig();		Options.EmitAddrsig = getEnableAddrsig();
Options.EmitCallSiteInfo = getEmitCallSiteInfo();		Options.EmitCallSiteInfo = getEmitCallSiteInfo();
Options.EnableDebugEntryValues = getEnableDebugEntryValues();		Options.EnableDebugEntryValues = getEnableDebugEntryValues();
Options.ForceDwarfFrameSection = getForceDwarfFrameSection();		Options.ForceDwarfFrameSection = getForceDwarfFrameSection();
Options.XRayOmitFunctionIndex = getXRayOmitFunctionIndex();		Options.XRayOmitFunctionIndex = getXRayOmitFunctionIndex();
Options.DebugStrictDwarf = getDebugStrictDwarf();		Options.DebugStrictDwarf = getDebugStrictDwarf();
Options.LoopAlignment = getAlignLoops();		Options.LoopAlignment = getAlignLoops();

if (auto Opt = getExplicitValueTrackingVariableLocations())
Options.ValueTrackingVariableLocations = *Opt;
else
Options.ValueTrackingVariableLocations =
getDefaultValueTrackingVariableLocations(TheTriple);

Options.MCOptions = mc::InitMCTargetOptionsFromFlags();		Options.MCOptions = mc::InitMCTargetOptionsFromFlags();

Options.ThreadModel = getThreadModel();		Options.ThreadModel = getThreadModel();
Options.EABIVersion = getEABIVersion();		Options.EABIVersion = getEABIVersion();
Options.DebuggerTuning = getDebuggerTuningOpt();		Options.DebuggerTuning = getDebuggerTuningOpt();
Options.SwiftAsyncFramePointer = getSwiftAsyncFramePointer();		Options.SwiftAsyncFramePointer = getSwiftAsyncFramePointer();
return Options;		return Options;
}		}
▲ Show 20 Lines • Show All 137 Lines • ▼ Show 20 Lines
/// Set function attributes of functions in Module M based on CPU,		/// Set function attributes of functions in Module M based on CPU,
/// Features, and command line flags.		/// Features, and command line flags.
void codegen::setFunctionAttributes(StringRef CPU, StringRef Features,		void codegen::setFunctionAttributes(StringRef CPU, StringRef Features,
Module &M) {		Module &M) {
for (Function &F : M)		for (Function &F : M)
setFunctionAttributes(CPU, Features, F);		setFunctionAttributes(CPU, Features, F);
}		}

bool codegen::getDefaultValueTrackingVariableLocations(const llvm::Triple &T) {
if (T.getArch() == llvm::Triple::x86_64)
return true;
return false;
}

llvm/lib/CodeGen/LiveDebugValues/LiveDebugValues.h

	//===- LiveDebugValues.cpp - Tracking Debug Value MIs ---------- C++ ----===//			//===- LiveDebugValues.cpp - Tracking Debug Value MIs ---------- C++ ----===//
	//			//
	// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.			// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
	// See https://llvm.org/LICENSE.txt for license information.			// See https://llvm.org/LICENSE.txt for license information.
	// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception			// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	#ifndef LLVM_LIB_CODEGEN_LIVEDEBUGVALUES_LIVEDEBUGVALUES_H			#ifndef LLVM_LIB_CODEGEN_LIVEDEBUGVALUES_LIVEDEBUGVALUES_H
	#define LLVM_LIB_CODEGEN_LIVEDEBUGVALUES_LIVEDEBUGVALUES_H			#define LLVM_LIB_CODEGEN_LIVEDEBUGVALUES_LIVEDEBUGVALUES_H

	#include "llvm/CodeGen/MachineDominators.h"			#include "llvm/CodeGen/MachineDominators.h"
	#include "llvm/CodeGen/MachineFunction.h"			#include "llvm/CodeGen/MachineFunction.h"
	#include "llvm/CodeGen/TargetPassConfig.h"			#include "llvm/CodeGen/TargetPassConfig.h"
				#include "llvm/ADT/Triple.h"

	namespace llvm {			namespace llvm {

	// Inline namespace for types / symbols shared between different			// Inline namespace for types / symbols shared between different
	// LiveDebugValues implementations.			// LiveDebugValues implementations.
	inline namespace SharedLiveDebugValues {			inline namespace SharedLiveDebugValues {

	// Expose a base class for LiveDebugValues interfaces to inherit from. This			// Expose a base class for LiveDebugValues interfaces to inherit from. This
	// allows the generic LiveDebugValues pass handles to call into the			// allows the generic LiveDebugValues pass handles to call into the
	// implementation.			// implementation.
	class LDVImpl {			class LDVImpl {
	public:			public:
	virtual bool ExtendRanges(MachineFunction &MF, MachineDominatorTree *DomTree,			virtual bool ExtendRanges(MachineFunction &MF, MachineDominatorTree *DomTree,
	TargetPassConfig *TPC, unsigned InputBBLimit,			TargetPassConfig *TPC, unsigned InputBBLimit,
	unsigned InputDbgValLimit) = 0;			unsigned InputDbgValLimit) = 0;
	virtual ~LDVImpl() {}			virtual ~LDVImpl() {}
	};			};

	} // namespace SharedLiveDebugValues			} // namespace SharedLiveDebugValues

	// Factory functions for LiveDebugValues implementations.			// Factory functions for LiveDebugValues implementations.
	extern LDVImpl *makeVarLocBasedLiveDebugValues();			extern LDVImpl *makeVarLocBasedLiveDebugValues();
	extern LDVImpl *makeInstrRefBasedLiveDebugValues();			extern LDVImpl *makeInstrRefBasedLiveDebugValues();

				extern bool debuginfoShouldUseDebugInstrRef(const Triple &T);

	} // namespace llvm			} // namespace llvm

	#endif // LLVM_LIB_CODEGEN_LIVEDEBUGVALUES_LIVEDEBUGVALUES_H			#endif // LLVM_LIB_CODEGEN_LIVEDEBUGVALUES_LIVEDEBUGVALUES_H

llvm/lib/CodeGen/LiveDebugValues/LiveDebugValues.cpp

Show All 34 Lines
using namespace llvm;		using namespace llvm;

static cl::opt<bool>		static cl::opt<bool>
ForceInstrRefLDV("force-instr-ref-livedebugvalues", cl::Hidden,		ForceInstrRefLDV("force-instr-ref-livedebugvalues", cl::Hidden,
cl::desc("Use instruction-ref based LiveDebugValues with "		cl::desc("Use instruction-ref based LiveDebugValues with "
"normal DBG_VALUE inputs"),		"normal DBG_VALUE inputs"),
cl::init(false));		cl::init(false));

		static cl::opt<cl::boolOrDefault> ValueTrackingVariableLocations(
		"experimental-debug-variable-locations",
		cl::desc("Use experimental new value-tracking variable locations"));

// Options to prevent pathological compile-time behavior. If InputBBLimit and		// Options to prevent pathological compile-time behavior. If InputBBLimit and
// InputDbgValueLimit are both exceeded, range extension is disabled.		// InputDbgValueLimit are both exceeded, range extension is disabled.
static cl::opt<unsigned> InputBBLimit(		static cl::opt<unsigned> InputBBLimit(
"livedebugvalues-input-bb-limit",		"livedebugvalues-input-bb-limit",
cl::desc("Maximum input basic blocks before DBG_VALUE limit applies"),		cl::desc("Maximum input basic blocks before DBG_VALUE limit applies"),
cl::init(10000), cl::Hidden);		cl::init(10000), cl::Hidden);
static cl::opt<unsigned> InputDbgValueLimit(		static cl::opt<unsigned> InputDbgValueLimit(
"livedebugvalues-input-dbg-value-limit",		"livedebugvalues-input-dbg-value-limit",
▲ Show 20 Lines • Show All 61 Lines • ▼ Show 20 Lines	if (InstrRefBased) {
DomTree = &MDT;		DomTree = &MDT;
MDT.calculate(MF);		MDT.calculate(MF);
TheImpl = &*InstrRefImpl;		TheImpl = &*InstrRefImpl;
}		}

return TheImpl->ExtendRanges(MF, DomTree, TPC, InputBBLimit,		return TheImpl->ExtendRanges(MF, DomTree, TPC, InputBBLimit,
InputDbgValueLimit);		InputDbgValueLimit);
}		}

		bool llvm::debuginfoShouldUseDebugInstrRef(const Triple &T) {
		// Enable by default on x86_64, disable if explicitly turned off on cmdline.
		if (T.getArch() == llvm::Triple::x86_64 &&
		ValueTrackingVariableLocations != cl::boolOrDefault::BOU_FALSE)
		return true;

		// Otherwise: enable if explicitly requestedo n command line.
		return ValueTrackingVariableLocations == cl::boolOrDefault::BOU_TRUE;
		}

llvm/lib/CodeGen/MachineFunction.cpp

Show First 20 Lines • Show All 70 Lines • ▼ Show 20 Lines
#include <cstddef>		#include <cstddef>
#include <cstdint>		#include <cstdint>
#include <iterator>		#include <iterator>
#include <string>		#include <string>
#include <type_traits>		#include <type_traits>
#include <utility>		#include <utility>
#include <vector>		#include <vector>

		#include "LiveDebugValues/LiveDebugValues.h"

using namespace llvm;		using namespace llvm;

#define DEBUG_TYPE "codegen"		#define DEBUG_TYPE "codegen"

static cl::opt<unsigned> AlignAllFunctions(		static cl::opt<unsigned> AlignAllFunctions(
"align-all-functions",		"align-all-functions",
cl::desc("Force the alignment of all functions in log2 format (e.g. 4 "		cl::desc("Force the alignment of all functions in log2 format (e.g. 4 "
"means align on 16B boundaries)."),		"means align on 16B boundaries)."),
▲ Show 20 Lines • Show All 1,146 Lines • ▼ Show 20 Lines	bool MachineFunction::useDebugInstrRef() const {
// should not suffer.		// should not suffer.
if (getTarget().getOptLevel() == CodeGenOpt::None)		if (getTarget().getOptLevel() == CodeGenOpt::None)
return false;		return false;

// Don't use instr-ref if this function is marked optnone.		// Don't use instr-ref if this function is marked optnone.
if (F.hasFnAttribute(Attribute::OptimizeNone))		if (F.hasFnAttribute(Attribute::OptimizeNone))
return false;		return false;

if (getTarget().Options.ValueTrackingVariableLocations)		if (llvm::debuginfoShouldUseDebugInstrRef(getTarget().getTargetTriple()))
return true;		return true;

return false;		return false;
}		}

// Use one million as a high / reserved number.		// Use one million as a high / reserved number.
const unsigned MachineFunction::DebugOperandMemNumber = 1000000;		const unsigned MachineFunction::DebugOperandMemNumber = 1000000;

▲ Show 20 Lines • Show All 269 Lines • Show Last 20 Lines