This is an archive of the discontinued LLVM Phabricator instance.

Differential D143463

[X86] Use the CFA when appropriate for better variable locations around calls.
ClosedPublic

Authored by khuey on Feb 6 2023, 10:18 PM.

Details

Reviewers

MaskRay
scott.linder
jhenderson
jryans
jmorse

Group Reviewers

debug-info

Commits

rG3be667ae5a10: [X86] Use the CFA when appropriate for better variable locations around calls.
rGd421f5226048: [X86] Use the CFA as the DWARF frame base for better variable locations around…

Summary

Without frame pointers, the locations of variables on the stack are emitted
relative to the stack pointer (via the stack pointer being the value of
DW_AT_frame_base on the subprogram). If a call modifies the stack pointer
this results in the locations being wrong and the debugger displaying the
wrong values for variables.

By using DW_OP_call_frame_cfa in these situations the emitted location for
the variable will automatically handle changes in the stack pointer
(provided LLVM is emitting the correct CFI directives elsewhere, of course).
The CFA needs to be adjusted for the size of the stack frame (including the
return address) to allow the variable locations themselves to remain
unchanged by this patch.

Certain LLDB features cannot cope with DW_OP_call_frame_cfa, so this change
is heuristically limited to the cases where it's necessary for correctness
to minimize the fallout there.
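
The arithmetic behind the summary can be sketched as follows. This is a minimal model, assuming a hypothetical x86-64 function without a frame pointer, with 24 bytes of locals and an 8-byte return address; the struct, function names, and addresses are invented for illustration and are not LLVM code. It shows that an SP-based frame base goes stale after a push, while a CFA-based frame base (adjusted by the frame size including the return address) keeps the same DW_OP_fbreg offset valid.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the unwinder's view at one program point.
struct FrameState {
  std::int64_t Rsp;       // current stack pointer value
  std::int64_t CfaOffset; // CFI state: CFA = Rsp + CfaOffset
  constexpr std::int64_t cfa() const { return Rsp + CfaOffset; }
};

// Assumed frame layout (illustrative numbers only).
constexpr std::int64_t StackSize = 24;
constexpr std::int64_t ReturnAddrSize = 8;
constexpr std::int64_t FbregOffset = 8; // variable location: DW_OP_fbreg 8

// Offset applied to the CFA so the frame base equals the post-prologue %rsp,
// leaving the variable's DW_OP_fbreg offset unchanged.
constexpr std::int64_t FrameBaseAdjust = -(StackSize + ReturnAddrSize);

// Old scheme: DW_AT_frame_base is the stack pointer register.
constexpr std::int64_t locViaSp(const FrameState &S) {
  return S.Rsp + FbregOffset;
}

// New scheme: DW_AT_frame_base is CFA + Adjust.
constexpr std::int64_t locViaCfa(const FrameState &S, std::int64_t Adjust) {
  return S.cfa() + Adjust + FbregOffset;
}

// State right after the prologue: CFA = rsp + 32.
constexpr FrameState AfterPrologue{0x7000, StackSize + ReturnAddrSize};
constexpr std::int64_t TrueAddr = AfterPrologue.Rsp + FbregOffset;

// State after pushing 8 bytes for a call; the CFI offset grows to match.
constexpr FrameState AfterPush{AfterPrologue.Rsp - 8,
                               AfterPrologue.CfaOffset + 8};

static_assert(locViaSp(AfterPrologue) == TrueAddr, "both agree after prologue");
static_assert(locViaCfa(AfterPrologue, FrameBaseAdjust) == TrueAddr,
              "both agree after prologue");
static_assert(locViaSp(AfterPush) != TrueAddr, "SP-based location is stale");
static_assert(locViaCfa(AfterPush, FrameBaseAdjust) == TrueAddr,
              "CFA-based location is stable");
```

The CFA-based result is only as good as the CFI state it reads, which is the caveat the summary makes about LLVM emitting correct CFI directives.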

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

khuey created this revision. Feb 6 2023, 10:18 PM

Herald added a project: Restricted Project. Feb 6 2023, 10:18 PM

Herald added subscribers: pengfei, hiraditya, dschuff.

khuey requested review of this revision. Feb 6 2023, 10:18 PM

Herald added subscribers: llvm-commits, aheejin.

Harbormaster completed remote builds in B212280: Diff 495380. Feb 6 2023, 10:59 PM

Herald added a subscriber: ormris. Feb 6 2023, 10:59 PM

Poke?

Pretty sure those test failures aren't me.

MaskRay added a reviewer: debug-info. Mar 14 2023, 8:45 PM

Poke again? This has been sitting for two months now.

Thank you for the patch! Sorry for the long delay in review.

As a bit of a side-note, I think X86 always maintains a "precise CFA" to refer to, but that may not always be true (see https://reviews.llvm.org/D14948).

I'm curious about a couple points:

If we do always have the CFA available, then is there any reason not to use it unconditionally?
If we cannot use it unconditionally, can the condition be more direct, i.e. can we remove the heuristic?
Is the "frame base" value ABI-stable? If not, is there a strong reason not to rework the variable expressions to instead be CFA-relative, and then just define the "frame base" to always be equal to the CFA? It would avoid the extra offset needed in the current version of the patch.

llvm/lib/CodeGen/CFIInstrInserter.cpp
153	Fairly harmless (and NFC in this case) but I don't believe `Register` is intended to be constructed with a dwarf register ordinal
llvm/lib/Target/X86/X86FrameLowering.cpp
3808	wrt. the type confusion here, it seems to be a bug introduced during a bulk update which replaced many instances of `unsigned` with `Register` (2481f26ac3f22)
3817–3818	Can the new `X86MachineFunctionInfo` field be eliminated, and this instead check the same condition the rest of the X86 code checks before creating a `OpAdjustCfaOffset`? It seems to consistently just be that `!X86FrameLowering::hasFP(MF)`.
llvm/lib/Target/X86/X86MCInstLower.cpp
2122	I think the fact that we need this extra call is evidence that the heuristic-based approach and a field in `X86MachineFunctionInfo` is not ideal

scott.linder added a reviewer: scott.linder. Apr 11 2023, 2:52 PM

In D143463#4259475, @scott.linder wrote:

As a bit of a side-note, I think X86 always maintains a "precise CFA" to refer to, but that may not always be true (see https://reviews.llvm.org/D14948).

This is relying on the CFA being correct when it's present. CFI is not always present (see X86FrameLowering::needsDwarfCFI which at a minimum excludes some Win64 stuff) but it might be present in all cases that matter. I'm not sure.

I'm curious about a couple points:

If we do always have the CFA available, then is there any reason not to use it unconditionally?

In decreasing order of persuasiveness to me

If the frame pointer is present (or the stack pointer remains unchanged) using it is simpler for the debug info consumer (as there's no need to consult the CFI and run through its state machine). In theory there might also be old tools out there that don't support CFI as well.
If the frame pointer is present (or the stack pointer remains unchanged) using it works today. Don't fix what isn't broke.
If there are any bugs out there that result in an incorrect CFA the machine registers are more likely to be correct as they're actually used by the program and thus issues are more likely to be noticed.

Although it seems the gcc maintainers were not persuaded by any of this. gcc emits DW_OP_call_frame_cfa in every circumstance I can think of (at least on x86-64).

If we cannot use it unconditionally, can the condition be more direct, i.e. can we remove the heuristic?

I'm not aware of any reason we cannot use the CFA unconditionally (at least on x86) if it is present. I believe it is possible to remove the setHasCFIAdjustCfa heuristic and use the CFA in place of the stack pointer unconditionally if the above arguments are not persuasive. I believe it's also possible to use the CFA in place of the frame pointer even when it's present, if desired.

Is the "frame base" value ABI-stable?

In the debug info? I don't see why it would be. As mentioned above gcc emits DW_AT_frame_base = DW_OP_call_frame_cfa.

If not, is there a strong reason not to rework the variable expressions to instead be CFA-relative, and then just define the "frame base" to always be equal to the CFA? It would avoid the extra offset needed in the current version of the patch.

The reason I didn't do this originally was that I expected it to require updating a massive number of tests, but it turns out git grep DW_OP_fbreg | wc -l has fewer than 100 hits in LLVM so it probably wouldn't be that bad to update the tests. Looking into it a bit more, it would require changes in DwarfExpression::addMachineRegExpression which is 1) already pretty complex and 2) shared across architectures. It's not entirely clear to me what's going on with CFI on other arches, CFIInstrInserter is definitely x86 specific but other arches seem to have at least some CFI stuff in the target specific frame lowering code. Untangling all this seems like a significantly bigger task than merely switching to using the CFA + offset.

khuey added inline comments. Apr 11 2023, 9:17 PM

llvm/lib/Target/X86/X86FrameLowering.cpp
3817–3818	The idea here was that if the frame register is the stack pointer and a CFA adjustment was emitted we'd switch to the CFA. This proposal would get rid of the latter, which I discussed in more detail in the top-level comment.

In D143463#4259947, @khuey wrote:

In D143463#4259475, @scott.linder wrote:

As a bit of a side-note, I think X86 always maintains a "precise CFA" to refer to, but that may not always be true (see https://reviews.llvm.org/D14948).

This is relying on the CFA being correct when it's present. CFI is not always present (see X86FrameLowering::needsDwarfCFI which at a minimum excludes some Win64 stuff) but it might be present in all cases that matter. I'm not sure.

I'm curious about a couple points:

If we do always have the CFA available, then is there any reason not to use it unconditionally?

In decreasing order of persuasiveness to me

If the frame pointer is present (or the stack pointer remains unchanged) using it is simpler for the debug info consumer (as there's no need to consult the CFI and run through its state machine). In theory there might also be old tools out there that don't support CFI as well.

Very reasonable, but if GCC is already using the CFA unconditionally it seems likely that implies a pretty good support base?

If the frame pointer is present (or the stack pointer remains unchanged) using it works today. Don't fix what isn't broke.

I definitely think this holds a lot of weight, but just having one concept (CFA) instead of several (stack-pointer, frame-pointer, frame-base, CFA) is also very attractive. The fact that the patch is even necessary also demonstrates something is broken, although it might be the exception that proves the rule.

If there are any bugs out there that result in an incorrect CFA the machine registers are more likely to be correct as they're actually used by the program and thus issues are more likely to be noticed.

If the CFA is incorrect, unwinding seems like it is doomed to fail, which should constitute a program execution bug if e.g. exceptions are enabled, right? I don't necessarily think we will end up with more elusive debug-info bugs as a result of relying on the CFA.

Although it seems the gcc maintainers were not persuaded by any of this. gcc emits DW_OP_call_frame_cfa in every circumstance I can think of (at least on x86-64).

If we cannot use it unconditionally, can the condition be more direct, i.e. can we remove the heuristic?

I'm not aware of any reason we cannot use the CFA unconditionally (at least on x86) if it is present. I believe it is possible to remove the setHasCFIAdjustCfa heuristic and use the CFA in place of the stack pointer unconditionally if the above arguments are not persuasive. I believe it's also possible to use the CFA in place of the frame pointer even when it's present, if desired.

I meant to propose another middle-ground, where we use the CFA whenever !hasFP, which (IIUC) is a precondition for x86 ever adjusting the CFA. It would mean we use the CFA more than with the heuristic, but still not unconditionally.

I think I am mostly concerned with the statefulness of the heuristic, coupled with the fact that it seems to not be completely contained. The majority of it is handled by X86FrameLowering, but there is at least the one instance where it has to be explicitly managed in X86MCInstLower. If others are not as concerned with this I'm happy to concede on it, but it is the main concern I have.

Is the "frame base" value ABI-stable?

In the debug info? I don't see why it would be. As mentioned above gcc emits DW_AT_frame_base = DW_OP_call_frame_cfa.

If not, is there a strong reason not to rework the variable expressions to instead be CFA-relative, and then just define the "frame base" to always be equal to the CFA? It would avoid the extra offset needed in the current version of the patch.

The reason I didn't do this originally was that I expected it to require updating a massive number of tests, but it turns out git grep DW_OP_fbreg | wc -l has fewer than 100 hits in LLVM so it probably wouldn't be that bad to update the tests. Looking into it a bit more, it would require changes in DwarfExpression::addMachineRegExpression which is 1) already pretty complex and 2) shared across architectures. It's not entirely clear to me what's going on with CFI on other arches, CFIInstrInserter is definitely x86 specific but other arches seem to have at least some CFI stuff in the target specific frame lowering code. Untangling all this seems like a significantly bigger task than merely switching to using the CFA + offset.

I definitely understand the complexity in DwarfExpression::addMachineRegExpression makes any change very difficult. I am happy with continuing to apply the offset to the CFA in the frame base, and if some other brave soul wants to change that in a future patch they won't be any worse off.

My current leaning is to change to using the CFA unconditionally, but offset to maintain the same frame base value. It will result in a slightly larger expression in the default case, but I would be much more confident that the result is always correct.

I would appreciate any other opinions from @aprantl @dblaikie @probinson et al.

In D143463#4262748, @scott.linder wrote:

Very reasonable, but if GCC is already using the CFA unconditionally it seems likely that implies a pretty good support base?

Indeed.

I meant to propose another middle-ground, where we use the CFA whenever !hasFP, which (IIUC) is a precondition for x86 ever adjusting the CFA. It would mean we use the CFA more than with the heuristic, but still not unconditionally.

!hasFP is a precondition for x86 emitting CFA offset adjustment directives solely because if there is a frame pointer LLVM simply defines the CFA to be the frame pointer. And then the frame pointer remains the same regardless of adjustments to %rsp, so the CFA doesn't need to be adjusted simultaneously. See the createDefCfaRegister call inside X86FrameLowering::emitPrologue (which is gated on, at the top level, HasFP).

I think I am mostly concerned with the statefulness of the heuristic, coupled with the fact that it seems to not be completely contained. The majority of it is handled by X86FrameLowering, but there is at least the one instance where it has to be explicitly managed in X86MCInstLower. If others are not as concerned with this I'm happy to concede on it, but it is the main concern I have.

The not-being-contained is just because X86::MOVPC32r is really weird.

I definitely understand the complexity in DwarfExpression::addMachineRegExpression makes any change very difficult. I am happy with continuing to apply the offset to the CFA in the frame base, and if some other brave soul wants to change that in a future patch they won't be any worse off.

Yeah I don't really want to do that here but I agree that folding the offset between the CFA and the stack pointer into the variable expressions rather than leaving it in the DW_AT_frame_base expression is the ideal end point.

My current leaning is to change to using the CFA unconditionally, but offset to maintain the same frame base value. It will result in a slightly larger expression in the default case, but I would be much more confident that the result is always correct.

So, to be clear, you're leaning towards using a CFA-based DW_AT_frame_base in all three of these scenarios?

There is a CFA, and LLVM is currently using %rsp as the DW_AT_frame_base, and %rsp changes throughout the function.
There is a CFA, and LLVM is currently using %rsp as the DW_AT_frame_base, and %rsp remains constant (after the prologue).
There is a CFA, and LLVM is currently using %rbp as the DW_AT_frame_base.

Where using the CFA for 1 is what this patch as originally proposed does, and 1 + 2 is the "middle ground" you suggested previously, as I understand it.

I think using the CFA for 1 + 2 + 3 makes more sense than doing just 1 + 2, so I think we more or less agree here.

In D143463#4263367, @khuey wrote:

In D143463#4262748, @scott.linder wrote:

Very reasonable, but if GCC is already using the CFA unconditionally it seems likely that implies a pretty good support base?

Indeed.

I meant to propose another middle-ground, where we use the CFA whenever !hasFP, which (IIUC) is a precondition for x86 ever adjusting the CFA. It would mean we use the CFA more than with the heuristic, but still not unconditionally.

!hasFP is a precondition for x86 emitting CFA offset adjustment directives solely because if there is a frame pointer LLVM simply defines the CFA to be the frame pointer. And then the frame pointer remains the same regardless of adjustments to %rsp, so the CFA doesn't need to be adjusted simultaneously. See the createDefCfaRegister call inside X86FrameLowering::emitPrologue (which is gated on, at the top level, HasFP).

I think I am mostly concerned with the statefulness of the heuristic, coupled with the fact that it seems to not be completely contained. The majority of it is handled by X86FrameLowering, but there is at least the one instance where it has to be explicitly managed in X86MCInstLower. If others are not as concerned with this I'm happy to concede on it, but it is the main concern I have.

The not-being-contained is just because X86::MOVPC32r is really weird.

OK, I may have just been overly concerned about this; my familiarity with the X86 target in LLVM is very limited.

I definitely understand the complexity in DwarfExpression::addMachineRegExpression makes any change very difficult. I am happy with continuing to apply the offset to the CFA in the frame base, and if some other brave soul wants to change that in a future patch they won't be any worse off.

Yeah I don't really want to do that here but I agree that folding the offset between the CFA and the stack pointer into the variable expressions rather than leaving it in the DW_AT_frame_base expression is the ideal end point.

My current leaning is to change to using the CFA unconditionally, but offset to maintain the same frame base value. It will result in a slightly larger expression in the default case, but I would be much more confident that the result is always correct.

So, to be clear, you're leaning towards using a CFA-based DW_AT_frame_base in all three of these scenarios?

There is a CFA, and LLVM is currently using %rsp as the DW_AT_frame_base, and %rsp changes throughout the function.

There is a CFA, and LLVM is currently using %rsp as the DW_AT_frame_base, and %rsp remains constant (after the prologue).

There is a CFA, and LLVM is currently using %rbp as the DW_AT_frame_base.

Where using the CFA for 1 is what this patch as originally proposed does, and 1 + 2 is the "middle ground" you suggested previously, as I understand it.

I think using the CFA for 1 + 2 + 3 makes more sense than doing just 1 + 2, so I think we more or less agree here.

I do think my leaning is towards using the CFA for 1 + 2 + 3, but I also appreciate that your patch as-is addresses a real bug that affects debug-info correctness. I am OK with accepting your patch with a FIXME or TODO describing the desire to eventually define the frame base as equal to the CFA.

I would just ask that you give others at least another day to respond. I recognize it has already been months, but one more day would make me a little less concerned that I missed something obvious that the more experienced LLVM debug-info devs would spot :)

This revision is now accepted and ready to land. Apr 13 2023, 11:27 AM

I'm happy to update this to do 1 + 2 + 3 too, it's not hard.

In D143463#4266127, @khuey wrote:

I'm happy to update this to do 1 + 2 + 3 too, it's not hard.

If you don't mind, I do think it is still my preferred approach; I'll leave it up to you

khuey planned changes to this revision. Apr 13 2023, 1:02 PM

Use the CFA as the DWARF frame base whenever it's present.

As discussed, this drops the previous heuristic for determining
when we need to use the CFA for accurate results and simply uses
the CFA in all cases.

This revision is now accepted and ready to land. Apr 14 2023, 3:06 PM

Herald added a reviewer: jhenderson. Apr 14 2023, 3:06 PM

Herald added a subscriber: cmtice.

Other than dbg-baseptr.ll, the test changes are all adjusting for the slightly larger DW_AT_frame_base value.

Harbormaster completed remote builds in B225738: Diff 513760. Apr 14 2023, 4:14 PM

LGTM, thank you!

This revision is now accepted and ready to land. Apr 17 2023, 1:02 PM

Thanks. I don't have commit access so after whoever else you want to look at it is satisfied someone will need to land this :)

Poke. If nobody else is going to review this can someone land it?

This looks good to me as well. Thanks for working on this! 🙂

I'll land these changes momentarily.

Closed by commit rGd421f5226048: [X86] Use the CFA as the DWARF frame base for better variable locations around… (authored by khuey, committed by jryans). May 15 2023, 7:10 AM

This revision was automatically updated to reflect the committed changes.

jryans added a commit: rGd421f5226048: [X86] Use the CFA as the DWARF frame base for better variable locations around….

This appears to have broken a bunch of LLDB tests. Can you please investigate and potentially revert the patch until we find a solution?

https://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/55133/changes#d421f5226048e4a5d88aab157d0f4d434c43f208

https://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/55133/testReport/

aprantl added a subscriber: jasonmolenda. May 15 2023, 8:46 AM

Thanks for the test failure report, I'll revert this for now then.

jryans added a reverting change: rGd6e4c4f8c172: Revert "[X86] Use the CFA as the DWARF frame base for better variable locations…. May 15 2023, 8:55 AM

I didn't have time to look at the test failure in detail. Since it's in the frame diagnose command, maybe it just needs to be updated for this. Let me know if you need any help with figuring out a path forward!

How do I run these tests? make check either doesn't include them, or they pass on my machine.

Or maybe I need to explicitly enable lldb first ... I don't see an lldb binary in my build.

Yeah, you’ll need to enable it by giving cmake something like -D LLVM_ENABLE_PROJECTS="lldb". See https://llvm.org/docs/CMake.html#frequently-used-llvm-related-variables for more details.

Even after enabling and building lldb make check-lldb-api says all the tests are unsupported.

If you're testing the API you might need -DLLDB_ENABLE_PYTHON=On too which enables the LLDB python bindings. That option is present in the Cmake step of the logs. (Not an LLDB dev, just passing by).

Yeah, that gets some of the lldb-api tests to run.

The ones that are failing are still not running though. After looking into it a bit, they're all Mac-only tests, and I don't have a Mac machine.

I'll read the code and see if there's anything obvious that frame diagnose is doing.

So this is actually a case of what I said earlier about there possibly being old tools out there that don't know how to cope with CFI. Turns out one of them is this "frame diagnose" feature in lldb.

Based on code inspection what I think is happening is that we're getting to this point https://github.com/llvm/llvm-project/blob/d9610b4a56c532614545eef5995362e99b776535/lldb/source/Expression/DWARFExpression.cpp#L2676. DWARFExpression::MatchesOperand takes the operand from a crashing instruction (e.g. [rbp + 42]) and tries to match that up with the locations of the various variables provided in the DWARF. It does this symbolically, looking for exactly two forms of DWARF expressions in variable locations, DW_OP_regN/x and DW_OP_fbreg <offset> where DW_AT_frame_base is itself DW_OP_regN/x. Because I changed DW_AT_frame_base to be DW_OP_call_frame_cfa DW_OP_consts <offset> DW_OP_plus, nothing on the stack is recognized anymore. There's no code here that knows how to deal with DW_OP_call_frame_cfa.

I don't see any easy way to fix this. The CFA is only available at this point in value form (via StackFrame's StackID m_cfa). Despite the comments, StackFrame::GetFrameBaseExpression really does return the DW_AT_frame_base expression, not the CFA (the author of the comment describing the function seems not to have understood that the frame base and the canonical frame address are not the same thing). With the symbolic form of the CFA available we could recognize the sequence DW_AT_location = DW_OP_fbreg <offset1>, DW_AT_frame_base = DW_OP_call_frame_cfa DW_OP_consts <offset2> DW_OP_plus, CFA = rXX + offset3 and match it to e.g. mov [%rXX + offset4], %rax where offset4 = offset1 + offset2 + offset3, but that would require a bunch of work in lldb to plumb the symbolic form of the CFA up out of the unwinding layer to somewhere where it's available to CommandObjectFrameDiagnose.
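
The offset folding described above (offset4 = offset1 + offset2 + offset3) can be sketched like this. It is only an illustration of the arithmetic under assumed inputs: the `RegPlusOffset` type and both functions are hypothetical, and nothing here corresponds to an existing LLDB API.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical symbolic form "register + constant"; not an LLDB type.
struct RegPlusOffset {
  std::string Reg;
  std::int64_t Offset;
};

// Fold the three pieces into one register-relative address:
//   DW_AT_location   = DW_OP_fbreg <Offset1>
//   DW_AT_frame_base = DW_OP_call_frame_cfa DW_OP_consts <Offset2> DW_OP_plus
//   CFA rule         = CfaRule.Reg + CfaRule.Offset   (Offset3)
RegPlusOffset foldLocation(std::int64_t Offset1, std::int64_t Offset2,
                           const RegPlusOffset &CfaRule) {
  return {CfaRule.Reg, Offset1 + Offset2 + CfaRule.Offset};
}

// Compare the folded location against a memory operand taken from the
// crashing instruction, e.g. [rsp + 16].
bool matchesOperand(const RegPlusOffset &Loc, const RegPlusOffset &Operand) {
  return Loc.Reg == Operand.Reg && Loc.Offset == Operand.Offset;
}
```

For example, with DW_OP_fbreg 8, a frame base of CFA - 32, and a CFA rule of rsp + 40, the folded location is rsp + 16, which would match an operand of [rsp + 16] but not [rbp + 16]. The hard part noted in the comment is not this arithmetic but getting the symbolic CFA rule out of the unwinding layer.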

Without that, the main alternative I see is to revert back from using the CFA unconditionally to only using it in the cases where it's necessary for correctness. Then most functions out there will continue to have frame pointer expressions that fit the format "frame diagnose" is expecting, and the only functions it won't be able to handle are the ones that currently have wrong locations anyways.

As an aside, is this frame diagnose feature actually used? I've never heard of it, and some cursory googling (e.g. '"frame diagnose" lldb') suggests nobody else has either. The code looks like it's been basically untouched since it was written in 2016. Is there someone who can say it's ok to break it? It's already not working with any code gcc produced in the last decade or so.

It might be worth posting your question on discourse in the LLDB subcategory for greater visibility.

Side note (no action required): I think we might be able to improve variable availability as well as correctness with this patch if we could teach LLVM to use fbreg rather than the SP/BP for location list entries too (see this issue).

Thanks for working on this!

khuey reopened this revision. May 21 2023, 10:59 AM

This revision is now accepted and ready to land. May 21 2023, 10:59 AM

Revert back to the heuristic based approach for using the CFA only when it affects correctness.

I've realized that it's not *just* the frame diagnose tests that fail,
TestStdFunctionStepIntoCallable.py is also failing. Since not all of lldb's features can cope
with CFA-based DW_AT_frame_bases, let's go back to using the CFA only when the locations are
currently wrong. Then we won't be breaking anything that currently works.

khuey requested review of this revision. May 21 2023, 11:26 AM

I filed https://github.com/llvm/llvm-project/issues/62840 and https://github.com/llvm/llvm-project/issues/62841 on the LLDB issues.

Harbormaster completed remote builds in B233442: Diff 524120. May 21 2023, 12:20 PM

LGTM, thank you again for attempting the switch to doing this unconditionally! I was clearly wrong about there being no dependence if even an in-tree project is breaking.

This revision is now accepted and ready to land. May 22 2023, 4:00 PM

Sorry for having missed this; it sounds like a good direction to take, and will fix various variable locations.

Just to confirm my understanding, this should have no interaction with the shrink-wrapping optimisation pass because there shouldn't be any stack-stored variables before frame setup occurs, yes?

llvm/test/DebugInfo/X86/stack_adjustments_trigger_cfa_frame_base.ll
143–146	NB -- if these attributes aren't necessary for the test to operate, it's better to remove them to avoid future maintenance burdens

Closed by commit rG3be667ae5a10: [X86] Use the CFA when appropriate for better variable locations around calls. (authored by khuey, committed by scott.linder). May 23 2023, 1:25 PM

This revision was automatically updated to reflect the committed changes.

scott.linder added a commit: rG3be667ae5a10: [X86] Use the CFA when appropriate for better variable locations around calls..

scott.linder added inline comments. May 23 2023, 1:26 PM

llvm/test/DebugInfo/X86/stack_adjustments_trigger_cfa_frame_base.ll
143–146	I removed these before landing! @khuey let me know if you have concerns with dropping the attributes

khuey added inline comments. May 23 2023, 2:43 PM

llvm/test/DebugInfo/X86/stack_adjustments_trigger_cfa_frame_base.ll
143–146	Should be fine, I believe.

In D143463#4364427, @jmorse wrote:

Just to confirm my understanding, this should have no interaction with the shrink-wrapping optimisation pass because there shouldn't be any stack-stored variables before frame setup occurs, yes?

I'm not familiar with how that optimization works, but I don't think it will matter at all since it just changes how the frame base is computed by the debugger. Assuming there are tests for that optimization, they passed on the earlier version of this patch that made the change unconditionally.

skan added a subscriber: skan. May 23 2023, 7:24 PM

Revision Contents

Path                                                                  Size
llvm/include/llvm/CodeGen/TargetFrameLowering.h                       7 lines
llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp                      5 lines
llvm/lib/CodeGen/CFIInstrInserter.cpp                                 3 lines
llvm/lib/Target/X86/X86FrameLowering.h                                2 lines
llvm/lib/Target/X86/X86FrameLowering.cpp                              22 lines
llvm/lib/Target/X86/X86MCInstLower.cpp                                2 lines
llvm/lib/Target/X86/X86MachineFunctionInfo.h                          8 lines
llvm/test/DebugInfo/X86/stack_adjustments_trigger_cfa_frame_base.ll   233 lines

Diff 524120

llvm/include/llvm/CodeGen/TargetFrameLowering.h

[48 lines not shown; enclosing context: public:]

   // Maps a callee saved register to a stack slot with a fixed offset.
   struct SpillSlot {
     unsigned Reg;
     int Offset; // Offset relative to stack pointer on function entry.
   };

   struct DwarfFrameBase {
-    // The frame base may be either a register (the default), the CFA,
-    // or a WebAssembly-specific location description.
+    // The frame base may be either a register (the default), the CFA with an
+    // offset, or a WebAssembly-specific location description.
     enum FrameBaseKind { Register, CFA, WasmFrameBase } Kind;
     struct WasmFrameBase {
       unsigned Kind; // Wasm local, global, or value stack
       unsigned Index;
     };
     union {
+      // Used with FrameBaseKind::Register.
       unsigned Reg;
+      // Used with FrameBaseKind::CFA.
+      int Offset;
       struct WasmFrameBase WasmLoc;
     } Location;
   };

 private:
   StackDirection StackDir;
   Align StackAlignment;
   Align TransientStackAlignment;

[392 lines not shown]

llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp

[525 lines not shown; enclosing context: case TargetFrameLowering::DwarfFrameBase::Register: {]

       MachineLocation Location(FrameBase.Location.Reg);
       addAddress(*SPDie, dwarf::DW_AT_frame_base, Location);
     }
     break;
   }
   case TargetFrameLowering::DwarfFrameBase::CFA: {
     DIELoc *Loc = new (DIEValueAllocator) DIELoc;
     addUInt(*Loc, dwarf::DW_FORM_data1, dwarf::DW_OP_call_frame_cfa);
+    if (FrameBase.Location.Offset != 0) {
+      addUInt(*Loc, dwarf::DW_FORM_data1, dwarf::DW_OP_consts);
+      addSInt(*Loc, dwarf::DW_FORM_sdata, FrameBase.Location.Offset);
+      addUInt(*Loc, dwarf::DW_FORM_data1, dwarf::DW_OP_plus);
+    }
     addBlock(*SPDie, dwarf::DW_AT_frame_base, Loc);
     break;
   }
   case TargetFrameLowering::DwarfFrameBase::WasmFrameBase: {
     // FIXME: duplicated from Target/WebAssembly/WebAssembly.h
     const unsigned TI_GLOBAL_RELOC = 3;
     if (FrameBase.Location.WasmLoc.Kind == TI_GLOBAL_RELOC) {
       // These need to be relocatable.

[1,098 lines not shown]
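
To make the resulting DW_AT_frame_base expression concrete, here is a standalone sketch of the byte sequence the CFA case above would be expected to produce. It does not use LLVM's DIE classes; `appendSLEB128` and `encodeCfaFrameBase` are hypothetical helpers written for this illustration. The opcode values (DW_OP_call_frame_cfa = 0x9c, DW_OP_consts = 0x11, DW_OP_plus = 0x22) and the signed LEB128 operand of DW_OP_consts come from the DWARF specification.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Opcode values as defined by the DWARF specification.
constexpr std::uint8_t DW_OP_consts = 0x11;
constexpr std::uint8_t DW_OP_plus = 0x22;
constexpr std::uint8_t DW_OP_call_frame_cfa = 0x9c;

// Append Value in signed LEB128 form (hypothetical helper, not LLVM's).
// Relies on >> being an arithmetic shift for negative values, which is
// guaranteed from C++20 and is the behaviour of mainstream compilers before.
void appendSLEB128(std::vector<std::uint8_t> &Out, std::int64_t Value) {
  bool More = true;
  while (More) {
    std::uint8_t Byte = static_cast<std::uint8_t>(Value & 0x7f);
    Value >>= 7;
    const bool SignBitSet = (Byte & 0x40) != 0;
    if ((Value == 0 && !SignBitSet) || (Value == -1 && SignBitSet))
      More = false; // remaining bits are pure sign extension
    else
      Byte |= 0x80; // continuation bit
    Out.push_back(Byte);
  }
}

// Mirror of the CFA case: emit the offset only when it is non-zero.
std::vector<std::uint8_t> encodeCfaFrameBase(std::int64_t Offset) {
  std::vector<std::uint8_t> Expr{DW_OP_call_frame_cfa};
  if (Offset != 0) {
    Expr.push_back(DW_OP_consts);
    appendSLEB128(Expr, Offset);
    Expr.push_back(DW_OP_plus);
  }
  return Expr;
}
```

Under these assumptions an offset of -32 encodes as the four bytes 9c 11 60 22, and an offset of 0 as the single byte 9c, which is the "slightly larger" frame base expression mentioned in the discussion.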

llvm/lib/CodeGen/CFIInstrInserter.cpp

@@
 char CFIInstrInserter::ID = 0;
 INITIALIZE_PASS(CFIInstrInserter, "cfi-instr-inserter",
                 "Check CFA info and insert CFI instructions if needed", false,
                 false)
 FunctionPass *llvm::createCFIInstrInserter() { return new CFIInstrInserter(); }

 void CFIInstrInserter::calculateCFAInfo(MachineFunction &MF) {
+  const TargetRegisterInfo &TRI = *MF.getSubtarget().getRegisterInfo();
   // Initial CFA offset value i.e. the one valid at the beginning of the
   // function.
   int InitialOffset =
       MF.getSubtarget().getFrameLowering()->getInitialCFAOffset(MF);
   // Initial CFA register value i.e. the one valid at the beginning of the
   // function.
   Register InitialRegister =
       MF.getSubtarget().getFrameLowering()->getInitialCFARegister(MF);
-  const TargetRegisterInfo &TRI = *MF.getSubtarget().getRegisterInfo();
+  InitialRegister = TRI.getDwarfRegNum(InitialRegister, true);

scott.linder (inline comment, Not Done): Fairly harmless (and NFC in this case), but I don't believe `Register` is intended to be constructed with a DWARF register ordinal.

   unsigned NumRegs = TRI.getNumRegs();

   // Initialize MBBMap.
   for (MachineBasicBlock &MBB : MF) {
     MBBCFAInfo &MBBInfo = MBBVector[MBB.getNumber()];
     MBBInfo.MBB = &MBB;
     MBBInfo.IncomingCFAOffset = InitialOffset;
     MBBInfo.OutgoingCFAOffset = InitialOffset;
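The reviewer's concern about storing a DWARF register number in a `Register` can be illustrated with a small sketch that keeps the two numbering spaces in distinct types. Everything here is hypothetical: `TargetReg`, `DwarfRegNum`, `toDwarf`, and the internal id `54` are made-up stand-ins and are not LLVM's `Register` class or `getDwarfRegNum`. The only real-world fact used is that the stack pointer has DWARF register number 7 on x86-64.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-ins for the two numbering spaces (not LLVM types).
struct TargetReg { unsigned Id; };   // target-internal register id
struct DwarfRegNum { int Num; };     // DWARF register number

// Arbitrary, made-up internal id for the stack pointer.
constexpr TargetReg StackPtr{54};

// Hypothetical mapping: only the stack pointer is known, everything else
// maps to -1 ("no DWARF number").
constexpr DwarfRegNum toDwarf(TargetReg R) {
  return R.Id == StackPtr.Id ? DwarfRegNum{7} : DwarfRegNum{-1};
}

// With distinct types, writing a DWARF number back into a target register
// variable does not compile, which is the confusion the comment points at.
static_assert(!std::is_convertible<DwarfRegNum, TargetReg>::value,
              "DWARF numbers must not convert to target registers");
static_assert(!std::is_assignable<TargetReg &, DwarfRegNum>::value,
              "DWARF numbers must not be assigned to target registers");
```

In this sketch the conversion has to be requested explicitly through `toDwarf`, so a line such as `InitialRegister = TRI.getDwarfRegNum(InitialRegister, true);` would have no equivalent that type-checks.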

llvm/lib/Target/X86/X86FrameLowering.h

@@ restoreWin32EHStackPointers(MachineBasicBlock &MBB,
                               const DebugLoc &DL, bool RestoreSP = false) const;

   void restoreWinEHStackPointersInParent(MachineFunction &MF) const;

   int getInitialCFAOffset(const MachineFunction &MF) const override;

   Register getInitialCFARegister(const MachineFunction &MF) const override;

+  DwarfFrameBase getDwarfFrameBase(const MachineFunction &MF) const override;
+
   /// Return true if the function has a redzone (accessible bytes past the
   /// frame of the top of stack function) as part of it's ABI.
   bool has128ByteRedZone(const MachineFunction& MF) const;

 private:
   bool isWin64Prologue(const MachineFunction &MF) const;

   bool needsDwarfCFI(const MachineFunction &MF) const;

llvm/lib/Target/X86/X86FrameLowering.cpp

@@
 void X86FrameLowering::BuildCFI(MachineBasicBlock &MBB,
                                 MachineBasicBlock::iterator MBBI,
                                 const DebugLoc &DL,
                                 const MCCFIInstruction &CFIInst,
                                 MachineInstr::MIFlag Flag) const {
   MachineFunction &MF = *MBB.getParent();
   unsigned CFIIndex = MF.addFrameInst(CFIInst);
+  if (CFIInst.getOperation() == MCCFIInstruction::OpAdjustCfaOffset)
+    MF.getInfo<X86MachineFunctionInfo>()->setHasCFIAdjustCfa(true);
   BuildMI(MBB, MBBI, DL, TII.get(TargetOpcode::CFI_INSTRUCTION))
       .addCFIIndex(CFIIndex)
       .setMIFlag(Flag);
 }

 /// Emits Dwarf Info specifying offsets of callee saved registers and
 /// frame pointer. This is called only when basic block sections are enabled.
 void X86FrameLowering::emitCalleeSavedFrameMovesFullCFA(
@@ MachineBasicBlock::iterator X86FrameLowering::restoreWin32EHStackPointers(
   }
   return MBBI;
 }

 int X86FrameLowering::getInitialCFAOffset(const MachineFunction &MF) const {
   return TRI->getSlotSize();
 }

scott.linder (inline comment, Not Done): Regarding the type confusion here, it seems to be a bug introduced during a bulk update that replaced many instances of unsigned with Register (2481f26ac3f22).

 Register
 X86FrameLowering::getInitialCFARegister(const MachineFunction &MF) const {
-  return TRI->getDwarfRegNum(StackPtr, true);
+  return StackPtr;
+}
+
+TargetFrameLowering::DwarfFrameBase
+X86FrameLowering::getDwarfFrameBase(const MachineFunction &MF) const {
+  const TargetRegisterInfo *RI = MF.getSubtarget().getRegisterInfo();
+  if (getInitialCFARegister(MF) == FrameRegister &&
+      MF.getInfo<X86MachineFunctionInfo>()->hasCFIAdjustCfa()) {

scott.linder (inline comment, Done):

    - if (getInitialCFARegister(MF) == FrameRegister &&
    -     MF.getInfo<X86MachineFunctionInfo>()->hasCFIAdjustCfa()) {
    + if (!hasFP(MF)) {
        DwarfFrameBase FrameBase;

Can the new X86MachineFunctionInfo field be eliminated, and this instead check the same condition that the rest of the X86 code checks before creating an OpAdjustCfaOffset? It seems to consistently just be !X86FrameLowering::hasFP(MF).

khuey (author, Done): The idea here was that if the frame register is the stack pointer *and* a CFA adjustment was emitted, we'd switch to the CFA. This proposal would get rid of the latter condition, which I discussed in more detail in the top-level comment.

+    DwarfFrameBase FrameBase;
+    FrameBase.Kind = DwarfFrameBase::CFA;
+    FrameBase.Location.Offset =
+        -MF.getFrameInfo().getStackSize() - getInitialCFAOffset(MF);
+    return FrameBase;
+  }

scott.linder (inline comment, Done):

      if (getInitialCFARegister(MF) == FrameRegister &&
    -     MF.getInfo<X86MachineFunctionInfo>()->hasCFIAdjustCfa()) {
    -   DwarfFrameBase FrameBase;
    -   FrameBase.Kind = DwarfFrameBase::CFA;
    -   FrameBase.Location.Offset =
    -       -MF.getFrameInfo().getStackSize() - getInitialCFAOffset(MF);
    -   return FrameBase;
    - }
    +     MF.getInfo<X86MachineFunctionInfo>()->hasCFIAdjustCfa())
    +   return {DwarfFrameBase::CFA, -MF.getFrameInfo().getStackSize() - getInitialCFAOffset(MF)};
      return DwarfFrameBase{DwarfFrameBase::Register, {FrameRegister}};

+  return DwarfFrameBase{DwarfFrameBase::Register, {FrameRegister}};
 }

 namespace {
 // Struct used by orderFrameObjects to help sort the stack objects.
 struct X86FrameSortingObject {
   bool IsValid = false;         // true if we care about this Object.
   unsigned ObjectIndex = 0;     // Index of Object into MFI list.
   unsigned ObjectSize = 0;      // Size of Object in bytes.
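The offset chosen in `getDwarfFrameBase`, `-StackSize - InitialCFAOffset`, can be checked with a little arithmetic. The sketch below is a simplified model rather than LLVM code: `FrameModel` and its members are hypothetical names, and it assumes that the CFA at function entry is the entry stack pointer plus one slot (the return address) and that the prologue lowers the stack pointer by exactly `StackSize` bytes.

```cpp
#include <cstdint>

// Hypothetical model of a frame-pointer-less x86 frame (not LLVM's API).
struct FrameModel {
  std::int64_t EntrySP;   // SP at function entry (points at the return address)
  std::int64_t SlotSize;  // size of the return-address slot; 8 on x86-64
  std::int64_t StackSize; // bytes the prologue subtracts from SP

  // The CFA is fixed for the lifetime of the frame.
  std::int64_t cfa() const { return EntrySP + SlotSize; }

  // SP once the prologue has run, i.e. the old SP-based frame base.
  std::int64_t spAfterPrologue() const { return EntrySP - StackSize; }

  // Offset used in the patch: -StackSize - InitialCFAOffset.
  std::int64_t frameBaseOffset() const { return -StackSize - SlotSize; }

  // New frame base: CFA plus the constant offset.
  std::int64_t cfaFrameBase() const { return cfa() + frameBaseOffset(); }

  // What an SP-based frame base evaluates to after Pushed bytes of outgoing
  // call arguments have been pushed.
  std::int64_t spFrameBase(std::int64_t Pushed) const {
    return spAfterPrologue() - Pushed;
  }
};
```

In this model, with `EntrySP = 0x7000`, `SlotSize = 8`, and `StackSize = 24`, the CFA-based frame base is `0x7008 - 32 = 0x6fe8`, which equals the stack pointer after the prologue, so existing variable offsets stay valid. Once arguments are pushed for a call, `spFrameBase` drifts while `cfaFrameBase` does not, which is the mismatch the patch summary describes.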

llvm/lib/Target/X86/X86MCInstLower.cpp

@@
 #include "MCTargetDesc/X86ATTInstPrinter.h"
 #include "MCTargetDesc/X86BaseInfo.h"
 #include "MCTargetDesc/X86EncodingOptimization.h"
 #include "MCTargetDesc/X86InstComments.h"
 #include "MCTargetDesc/X86ShuffleDecode.h"
 #include "MCTargetDesc/X86TargetStreamer.h"
 #include "X86AsmPrinter.h"
+#include "X86MachineFunctionInfo.h"
 #include "X86RegisterInfo.h"
 #include "X86ShuffleDecodeConstantPool.h"
 #include "X86Subtarget.h"
 #include "llvm/ADT/SmallString.h"
 #include "llvm/ADT/iterator_range.h"
 #include "llvm/CodeGen/MachineConstantPool.h"
 #include "llvm/CodeGen/MachineFunction.h"
 #include "llvm/CodeGen/MachineModuleInfoImpls.h"
@@ case X86::MOVPC32r: {
     // TODO: This is needed only if we require precise CFA.
     bool HasActiveDwarfFrame = OutStreamer->getNumFrameInfos() &&
                                !OutStreamer->getDwarfFrameInfos().back().End;

     int stackGrowth = -RI->getSlotSize();

     if (HasActiveDwarfFrame && !hasFP) {
       OutStreamer->emitCFIAdjustCfaOffset(-stackGrowth);
+      MF->getInfo<X86MachineFunctionInfo>()->setHasCFIAdjustCfa(true);

scott.linder (inline comment, Not Done): I think the fact that we need this extra call is evidence that the heuristic-based approach and a field in `X86MachineFunctionInfo` is not ideal.

     }

     // Emit the label.
     OutStreamer->emitLabel(PICBase);

     // popl $reg
     EmitAndCountInstruction(
         MCInstBuilder(X86::POP32r).addReg(MI->getOperand(0).getReg()));

llvm/lib/Target/X86/X86MachineFunctionInfo.h

@@ class X86MachineFunctionInfo : public MachineFunctionInfo {
   /// addr]. If so, bit 60 of the in-memory frame pointer will be 1 to enable
   /// other tools to detect the extended record.
   bool HasSwiftAsyncContext = false;

   /// True if this function has tile virtual register. This is used to
   /// determine if we should insert tilerelease in frame lowering.
   bool HasVirtualTileReg = false;

+  /// True if this function has CFI directives that adjust the CFA.
+  /// This is used to determine if we should direct the debugger to use
+  /// the CFA instead of the stack pointer.
+  bool HasCFIAdjustCfa = false;
+
   MachineInstr *StackPtrSaveMI = nullptr;

   std::optional<int> SwiftAsyncContextFrameIdx;

   // Preallocated fields are only used during isel.
   // FIXME: Can we find somewhere else to store these?
   DenseMap<const Value *, size_t> PreallocatedIds;
   SmallVector<size_t, 0> PreallocatedStackSizes;
@@ public:
   void setHasPreallocatedCall(bool v) { HasPreallocatedCall = v; }

   bool hasSwiftAsyncContext() const { return HasSwiftAsyncContext; }
   void setHasSwiftAsyncContext(bool v) { HasSwiftAsyncContext = v; }

   bool hasVirtualTileReg() const { return HasVirtualTileReg; }
   void setHasVirtualTileReg(bool v) { HasVirtualTileReg = v; }

+  bool hasCFIAdjustCfa() const { return HasCFIAdjustCfa; }
+  void setHasCFIAdjustCfa(bool v) { HasCFIAdjustCfa = v; }
+
   void setStackPtrSaveMI(MachineInstr *MI) { StackPtrSaveMI = MI; }
   MachineInstr *getStackPtrSaveMI() const { return StackPtrSaveMI; }

   std::optional<int> getSwiftAsyncContextFrameIdx() const {
     return SwiftAsyncContextFrameIdx;
   }
   void setSwiftAsyncContextFrameIdx(int v) { SwiftAsyncContextFrameIdx = v; }

llvm/test/DebugInfo/X86/stack_adjustments_trigger_cfa_frame_base.ll

This file was added.

				; RUN: llc < %s -filetype=obj -o %t
				; RUN: llvm-dwarfdump -v -debug-info %t | FileCheck %s
				;
				; use core::hint::black_box;
				;
				; #[inline(never)]
				; fn callee(
				; s1: &(),
				; s2: &(),
				; s3: &(),
				; s4: &(),
				; s5: &(),
				; s6: &(),
				; s7: &(),
				; s8: &(),
				; s9: &mut (),
				; ) {
				; black_box(s1);
				; black_box(s2);
				; black_box(s3);
				; black_box(s4);
				; black_box(s5);
				; black_box(s6);
				; black_box(s7);
				; black_box(s8);
				; black_box(s9);
				; }
				;
				; pub fn caller() {
				; let s = ();
				; let mut t = ();
				; callee(&s, &s, &s, &s, &s, &s, &s, &s, &mut t);
				; }
				;
				; Test that if a call requires fiddling with the stack pointer we switch to
				; using a CFA-based DW_AT_frame_base

				; CHECK: DW_AT_frame_base [DW_FORM_exprloc] (DW_OP_call_frame_cfa, DW_OP_consts -{{[0-9]+}}, DW_OP_plus)
				; CHECK-NOT: DW_TAG
				; CHECK: _ZN10playground6caller

				; ModuleID = 'playground.71f4e8b5-cgu.0'
				source_filename = "playground.71f4e8b5-cgu.0"
				target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
				target triple = "x86_64-unknown-linux-gnu"

				; core::hint::black_box
				; Function Attrs: inlinehint nonlazybind uwtable
				define align 1 ptr @_ZN4core4hint9black_box17h9f9a3aab786d67e0E(ptr align 1 %dummy) unnamed_addr #0 !dbg !6 {
				start:
				%0 = alloca ptr, align 8
				%dummy.dbg.spill = alloca ptr, align 8
				store ptr %dummy, ptr %dummy.dbg.spill, align 8
				call void @llvm.dbg.declare(metadata ptr %dummy.dbg.spill, metadata !15, metadata !DIExpression()), !dbg !18
				store ptr %dummy, ptr %0, align 8, !dbg !19
				call void asm sideeffect "", "r,~{memory}"(ptr %0), !dbg !19, !srcloc !20
				%1 = load ptr, ptr %0, align 8, !dbg !19, !nonnull !21, !align !22, !noundef !21
				ret ptr %1, !dbg !23
				}

				; core::hint::black_box
				; Function Attrs: inlinehint nonlazybind uwtable
				define align 1 ptr @_ZN4core4hint9black_box17hff24a8f6cdc261d0E(ptr align 1 %dummy) unnamed_addr #0 !dbg !24 {
				start:
				%0 = alloca ptr, align 8
				%dummy.dbg.spill = alloca ptr, align 8
				store ptr %dummy, ptr %dummy.dbg.spill, align 8
				call void @llvm.dbg.declare(metadata ptr %dummy.dbg.spill, metadata !29, metadata !DIExpression()), !dbg !32
				store ptr %dummy, ptr %0, align 8, !dbg !33
				call void asm sideeffect "", "r,~{memory}"(ptr %0), !dbg !33, !srcloc !20
				%1 = load ptr, ptr %0, align 8, !dbg !33, !nonnull !21, !align !22, !noundef !21
				ret ptr %1, !dbg !34
				}

				; playground::callee
				; Function Attrs: noinline nonlazybind uwtable
				define internal void @_ZN10playground6callee17hf55947d3dfc887f4E(ptr align 1 %s1, ptr align 1 %s2, ptr align 1 %s3, ptr align 1 %s4, ptr align 1 %s5, ptr align 1 %s6, ptr align 1 %s7, ptr align 1 %s8, ptr align 1 %s9) unnamed_addr #1 !dbg !35 {
				start:
				%s9.dbg.spill = alloca ptr, align 8
				%s8.dbg.spill = alloca ptr, align 8
				%s7.dbg.spill = alloca ptr, align 8
				%s6.dbg.spill = alloca ptr, align 8
				%s5.dbg.spill = alloca ptr, align 8
				%s4.dbg.spill = alloca ptr, align 8
				%s3.dbg.spill = alloca ptr, align 8
				%s2.dbg.spill = alloca ptr, align 8
				%s1.dbg.spill = alloca ptr, align 8
				store ptr %s1, ptr %s1.dbg.spill, align 8
				call void @llvm.dbg.declare(metadata ptr %s1.dbg.spill, metadata !41, metadata !DIExpression()), !dbg !50
				store ptr %s2, ptr %s2.dbg.spill, align 8
				call void @llvm.dbg.declare(metadata ptr %s2.dbg.spill, metadata !42, metadata !DIExpression()), !dbg !51
				store ptr %s3, ptr %s3.dbg.spill, align 8
				call void @llvm.dbg.declare(metadata ptr %s3.dbg.spill, metadata !43, metadata !DIExpression()), !dbg !52
				store ptr %s4, ptr %s4.dbg.spill, align 8
				call void @llvm.dbg.declare(metadata ptr %s4.dbg.spill, metadata !44, metadata !DIExpression()), !dbg !53
				store ptr %s5, ptr %s5.dbg.spill, align 8
				call void @llvm.dbg.declare(metadata ptr %s5.dbg.spill, metadata !45, metadata !DIExpression()), !dbg !54
				store ptr %s6, ptr %s6.dbg.spill, align 8
				call void @llvm.dbg.declare(metadata ptr %s6.dbg.spill, metadata !46, metadata !DIExpression()), !dbg !55
				store ptr %s7, ptr %s7.dbg.spill, align 8
				call void @llvm.dbg.declare(metadata ptr %s7.dbg.spill, metadata !47, metadata !DIExpression()), !dbg !56
				store ptr %s8, ptr %s8.dbg.spill, align 8
				call void @llvm.dbg.declare(metadata ptr %s8.dbg.spill, metadata !48, metadata !DIExpression()), !dbg !57
				store ptr %s9, ptr %s9.dbg.spill, align 8
				call void @llvm.dbg.declare(metadata ptr %s9.dbg.spill, metadata !49, metadata !DIExpression()), !dbg !58
				; call core::hint::black_box
				%_10 = call align 1 ptr @_ZN4core4hint9black_box17h9f9a3aab786d67e0E(ptr align 1 %s1), !dbg !59
				; call core::hint::black_box
				%_12 = call align 1 ptr @_ZN4core4hint9black_box17h9f9a3aab786d67e0E(ptr align 1 %s2), !dbg !60
				; call core::hint::black_box
				%_14 = call align 1 ptr @_ZN4core4hint9black_box17h9f9a3aab786d67e0E(ptr align 1 %s3), !dbg !61
				; call core::hint::black_box
				%_16 = call align 1 ptr @_ZN4core4hint9black_box17h9f9a3aab786d67e0E(ptr align 1 %s4), !dbg !62
				; call core::hint::black_box
				%_18 = call align 1 ptr @_ZN4core4hint9black_box17h9f9a3aab786d67e0E(ptr align 1 %s5), !dbg !63
				; call core::hint::black_box
				%_20 = call align 1 ptr @_ZN4core4hint9black_box17h9f9a3aab786d67e0E(ptr align 1 %s6), !dbg !64
				; call core::hint::black_box
				%_22 = call align 1 ptr @_ZN4core4hint9black_box17h9f9a3aab786d67e0E(ptr align 1 %s7), !dbg !65
				; call core::hint::black_box
				%_24 = call align 1 ptr @_ZN4core4hint9black_box17h9f9a3aab786d67e0E(ptr align 1 %s8), !dbg !66
				; call core::hint::black_box
				%_26 = call align 1 ptr @_ZN4core4hint9black_box17hff24a8f6cdc261d0E(ptr align 1 %s9), !dbg !67
				ret void, !dbg !68
				}

				; playground::caller
				; Function Attrs: nonlazybind uwtable
				define void @_ZN10playground6caller17h0397b5030166733dE() unnamed_addr #2 !dbg !69 {
				start:
				%t = alloca {}, align 1
				%s = alloca {}, align 1
				call void @llvm.dbg.declare(metadata ptr %s, metadata !73, metadata !DIExpression()), !dbg !77
				call void @llvm.dbg.declare(metadata ptr %t, metadata !75, metadata !DIExpression()), !dbg !78
				; call playground::callee
				call void @_ZN10playground6callee17hf55947d3dfc887f4E(ptr align 1 %s, ptr align 1 %s, ptr align 1 %s, ptr align 1 %s, ptr align 1 %s, ptr align 1 %s, ptr align 1 %s, ptr align 1 %s, ptr align 1 %t), !dbg !79
				ret void, !dbg !80
				}

				; Function Attrs: nocallback nofree nosync nounwind readnone speculatable willreturn
				declare void @llvm.dbg.declare(metadata, metadata, metadata) #3

				attributes #0 = { inlinehint nonlazybind uwtable "probe-stack"="__rust_probestack" "target-cpu"="x86-64" }
				attributes #1 = { noinline nonlazybind uwtable "probe-stack"="__rust_probestack" "target-cpu"="x86-64" }
				attributes #2 = { nonlazybind uwtable "probe-stack"="__rust_probestack" "target-cpu"="x86-64" }
				attributes #3 = { nocallback nofree nosync nounwind readnone speculatable willreturn }
jmorse (inline comment, Not Done): NB: if these attributes aren't necessary for the test to operate, it's better to remove them to avoid future maintenance burdens.

scott.linder (inline comment, Not Done): I removed these before landing! @khuey, let me know if you have concerns with dropping the attributes.

khuey (author, Done): Should be fine, I believe.

				!llvm.module.flags = !{!0, !1, !2, !3}
				!llvm.dbg.cu = !{!4}

				!0 = !{i32 7, !"PIC Level", i32 2}
				!1 = !{i32 2, !"RtLibUseGOT", i32 1}
				!2 = !{i32 2, !"Dwarf Version", i32 4}
				!3 = !{i32 2, !"Debug Info Version", i32 3}
				!4 = distinct !DICompileUnit(language: DW_LANG_Rust, file: !5, producer: "clang LLVM (rustc version 1.69.0-nightly (e1eaa2d5d 2023-02-06))", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, splitDebugInlining: false)
				!5 = !DIFile(filename: "src/lib.rs/@/playground.71f4e8b5-cgu.0", directory: "/playground")
				!6 = distinct !DISubprogram(name: "black_box<&()>", linkageName: "_ZN4core4hint9black_box17h9f9a3aab786d67e0E", scope: !8, file: !7, line: 294, type: !10, scopeLine: 294, flags: DIFlagPrototyped, spFlags: DISPFlagLocalToUnit | DISPFlagDefinition, unit: !4, templateParams: !16, retainedNodes: !14)
				!7 = !DIFile(filename: "/rustc/e1eaa2d5d4d1f5b7b89561a940718058d414e89c/library/core/src/hint.rs", directory: "", checksumkind: CSK_MD5, checksum: "2eba1ee5b9c26bf5eea6ed3dac7a7b79")
				!8 = !DINamespace(name: "hint", scope: !9)
				!9 = !DINamespace(name: "core", scope: null)
				!10 = !DISubroutineType(types: !11)
				!11 = !{!12, !12}
				!12 = !DIDerivedType(tag: DW_TAG_pointer_type, name: "&()", baseType: !13, size: 64, align: 64, dwarfAddressSpace: 0)
				!13 = !DIBasicType(name: "()", encoding: DW_ATE_unsigned)
				!14 = !{!15}
				!15 = !DILocalVariable(name: "dummy", arg: 1, scope: !6, file: !7, line: 294, type: !12)
				!16 = !{!17}
				!17 = !DITemplateTypeParameter(name: "T", type: !12)
				!18 = !DILocation(line: 294, column: 27, scope: !6)
				!19 = !DILocation(line: 295, column: 5, scope: !6)
				!20 = !{i32 382361}
				!21 = !{}
				!22 = !{i64 1}
				!23 = !DILocation(line: 296, column: 2, scope: !6)
				!24 = distinct !DISubprogram(name: "black_box<&mut ()>", linkageName: "_ZN4core4hint9black_box17hff24a8f6cdc261d0E", scope: !8, file: !7, line: 294, type: !25, scopeLine: 294, flags: DIFlagPrototyped, spFlags: DISPFlagLocalToUnit | DISPFlagDefinition, unit: !4, templateParams: !30, retainedNodes: !28)
				!25 = !DISubroutineType(types: !26)
				!26 = !{!27, !27}
				!27 = !DIDerivedType(tag: DW_TAG_pointer_type, name: "&mut ()", baseType: !13, size: 64, align: 64, dwarfAddressSpace: 0)
				!28 = !{!29}
				!29 = !DILocalVariable(name: "dummy", arg: 1, scope: !24, file: !7, line: 294, type: !27)
				!30 = !{!31}
				!31 = !DITemplateTypeParameter(name: "T", type: !27)
				!32 = !DILocation(line: 294, column: 27, scope: !24)
				!33 = !DILocation(line: 295, column: 5, scope: !24)
				!34 = !DILocation(line: 296, column: 2, scope: !24)
				!35 = distinct !DISubprogram(name: "callee", linkageName: "_ZN10playground6callee17hf55947d3dfc887f4E", scope: !37, file: !36, line: 4, type: !38, scopeLine: 4, flags: DIFlagPrototyped, spFlags: DISPFlagLocalToUnit | DISPFlagDefinition, unit: !4, templateParams: !21, retainedNodes: !40)
				!36 = !DIFile(filename: "src/lib.rs", directory: "/playground", checksumkind: CSK_MD5, checksum: "bb1df4ba7c42e8987c349ab2cbe5f6b6")
				!37 = !DINamespace(name: "playground", scope: null)
				!38 = !DISubroutineType(types: !39)
				!39 = !{null, !12, !12, !12, !12, !12, !12, !12, !12, !27}
				!40 = !{!41, !42, !43, !44, !45, !46, !47, !48, !49}
				!41 = !DILocalVariable(name: "s1", arg: 1, scope: !35, file: !36, line: 5, type: !12)
				!42 = !DILocalVariable(name: "s2", arg: 2, scope: !35, file: !36, line: 6, type: !12)
				!43 = !DILocalVariable(name: "s3", arg: 3, scope: !35, file: !36, line: 7, type: !12)
				!44 = !DILocalVariable(name: "s4", arg: 4, scope: !35, file: !36, line: 8, type: !12)
				!45 = !DILocalVariable(name: "s5", arg: 5, scope: !35, file: !36, line: 9, type: !12)
				!46 = !DILocalVariable(name: "s6", arg: 6, scope: !35, file: !36, line: 10, type: !12)
				!47 = !DILocalVariable(name: "s7", arg: 7, scope: !35, file: !36, line: 11, type: !12)
				!48 = !DILocalVariable(name: "s8", arg: 8, scope: !35, file: !36, line: 12, type: !12)
				!49 = !DILocalVariable(name: "s9", arg: 9, scope: !35, file: !36, line: 13, type: !27)
				!50 = !DILocation(line: 5, column: 5, scope: !35)
				!51 = !DILocation(line: 6, column: 5, scope: !35)
				!52 = !DILocation(line: 7, column: 5, scope: !35)
				!53 = !DILocation(line: 8, column: 5, scope: !35)
				!54 = !DILocation(line: 9, column: 5, scope: !35)
				!55 = !DILocation(line: 10, column: 5, scope: !35)
				!56 = !DILocation(line: 11, column: 5, scope: !35)
				!57 = !DILocation(line: 12, column: 5, scope: !35)
				!58 = !DILocation(line: 13, column: 5, scope: !35)
				!59 = !DILocation(line: 15, column: 5, scope: !35)
				!60 = !DILocation(line: 16, column: 5, scope: !35)
				!61 = !DILocation(line: 17, column: 5, scope: !35)
				!62 = !DILocation(line: 18, column: 5, scope: !35)
				!63 = !DILocation(line: 19, column: 5, scope: !35)
				!64 = !DILocation(line: 20, column: 5, scope: !35)
				!65 = !DILocation(line: 21, column: 5, scope: !35)
				!66 = !DILocation(line: 22, column: 5, scope: !35)
				!67 = !DILocation(line: 23, column: 5, scope: !35)
				!68 = !DILocation(line: 24, column: 2, scope: !35)
				!69 = distinct !DISubprogram(name: "caller", linkageName: "_ZN10playground6caller17h0397b5030166733dE", scope: !37, file: !36, line: 26, type: !70, scopeLine: 26, flags: DIFlagPrototyped, spFlags: DISPFlagDefinition, unit: !4, templateParams: !21, retainedNodes: !72)
				!70 = !DISubroutineType(types: !71)
				!71 = !{null}
				!72 = !{!73, !75}
				!73 = !DILocalVariable(name: "s", scope: !74, file: !36, line: 27, type: !13, align: 1)
				!74 = distinct !DILexicalBlock(scope: !69, file: !36, line: 27, column: 5)
				!75 = !DILocalVariable(name: "t", scope: !76, file: !36, line: 28, type: !13, align: 1)
				!76 = distinct !DILexicalBlock(scope: !74, file: !36, line: 28, column: 5)
				!77 = !DILocation(line: 27, column: 9, scope: !74)
				!78 = !DILocation(line: 28, column: 9, scope: !76)
				!79 = !DILocation(line: 29, column: 5, scope: !76)
				!80 = !DILocation(line: 30, column: 2, scope: !81)
				!81 = !DILexicalBlockFile(scope: !69, file: !36, discriminator: 0)


Revision Contents

Diff 524120
