This is an archive of the discontinued LLVM Phabricator instance.

[DebugInfo] Some fields do not need relocations even relax is enabled.
ClosedPublic

Authored by HsiangKai on May 6 2019, 12:34 AM.

Details

Summary

In debug frame information, some fields, e.g., Length in CIE/FDE and
Offset in FDE are attributes to describe the structure of CIE/FDE. They
are not related to the relaxed code. However, these attributes are
symbol differences. So, in current design, these attributes will be
filled as zero and LLVM generates relocations for them.

Diff Detail

Repository
rL LLVM

Event Timeline

HsiangKai created this revision.May 6 2019, 12:34 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 6 2019, 12:34 AM

I don't claim to be an MCExpr expert, but I think this should not be necessary. There are other situations where symbol differences are computed and emitted as constants, for example when DW_AT_high_pc is a length rather than an address, and they clearly don't require this tactic.

I don't claim to be an MCExpr expert, but I think this should not be necessary. There are other situations where symbol differences are computed and emitted as constants, for example when DW_AT_high_pc is a length rather than an address, and they clearly don't require this tactic.

This patch is provided for the issue discussed in https://reviews.llvm.org/D45181#1470143.
When relaxation is enabled, linker may modify the code sequence according to relocation types. So, DW_AT_high_pc will be changed even it is used as a length instead of an address.

In current implementation, LLVM will generate relocations for ALL binary expressions if relaxation is enabled. However, not all binary expressions need relocations even relaxation is enabled. This attributes in MCExpr is designed for the values which are not expected to be changed after relaxation.

If the flag in MCExpr is an acceptable way to solve the problem, I will refine the patch, e.g., add test cases, until it is ready to land.

Thanks for the reference, that helps some.
However I still think this isn't the right way to go about fixing the problem. This started because RISC-V decided not to resolve certain kinds of expressions, and then it turns out that applies to too many expressions. Somebody suggested that the asm backend decision should be more refined, based on symbol section perhaps, and that feels like a more principled way to go about it.

Thanks for the reference, that helps some.
However I still think this isn't the right way to go about fixing the problem. This started because RISC-V decided not to resolve certain kinds of expressions, and then it turns out that applies to too many expressions. Somebody suggested that the asm backend decision should be more refined, based on symbol section perhaps, and that feels like a more principled way to go about it.

I don't think to refine the backend decision is a solution for the problem. I have explained the reason in https://reviews.llvm.org/D45181#1495288.

In` EmitFDE` example, Length value will be converted to a Fixup. It is reasonable to convert the expression to fixup due to the symbol values are unknown then.

When finalization, these fixups will be converted to relocations if relaxation is enabled. However, there is no way to decide to generate relocations or generate absolute value in MCAssembler::evaluateFixup(). There is no information to distinguish these two cases.

The call stack when evaluating Length in FDE is

MCExpr::evaluateAsAbsolute()
MCObjectStreamer::EmitValueImpl()  <= generate MCFixup
MCELFStreamer::EmitValueImpl()
MCStreamer::EmitValue()
emitAbsValue()

Another way to solve the problem is to modify the interfaces of emitAbsValue, EmitValue, and EmitValueImpl. Carry the information to MCObjectStreamer::EmitValueImpl and attach the information to MCFixup when generating fixups.

HsiangKai updated this revision to Diff 199357.May 13 2019, 7:37 PM

Attach a test case.

HsiangKai updated this revision to Diff 199412.May 14 2019, 5:44 AM
HsiangKai added reviewers: rogfer01, jrtc27.

Add a flag to MCFixup instead of MCExpr.

It still bothers me that MCDwarf has to know exactly when to override the target's decision to do relaxation.

Why is it appropriate to do that in the two methods where you have made that change in this patch, and not in many other places within MCDwarf that call the same helper? What guides me to know it's correct in the first three calls to emitAbsValue within FrameEmitterImpl::EmitFDE, but not in the fourth? Without becoming an expert in all the fine details of every target.

include/llvm/MC/MCFixup.h
92 ↗(On Diff #199412)

"Require the assembler to evaluate the fixup."

(It is going to become an absolute value regardless, in the contexts where this is set to true; the only question is where the evaluation happens.)

include/llvm/MC/MCStreamer.h
621 ↗(On Diff #199412)

Need to add Fixed to the doxygen comment for this method.

It still bothers me that MCDwarf has to know exactly when to override the target's decision to do relaxation.

Why is it appropriate to do that in the two methods where you have made that change in this patch, and not in many other places within MCDwarf that call the same helper? What guides me to know it's correct in the first three calls to emitAbsValue within FrameEmitterImpl::EmitFDE, but not in the fourth? Without becoming an expert in all the fine details of every target.

It is not about target's decision. It is about whether the symbol difference will be changed after linking. For example, length field in FDE is used to describe the size of this FDE structure. It will not be affected by linker even relaxation is enabled. However, address_range in FDE is used to describe the range of instructions related to this FDE. If the linker modifies the instructions in the range, address_range will need be updated by linker accordingly. That's why it needs to set Fixed flag in length field instead of address_range field in FDE.

I will review other uses of emitAbsValue to ensure the flag is set correctly.

Set Fixed flag to length of debug_info and debug_line. These fields should be fixed even the linker has modified the instructions in the program.

Update comments.

HsiangKai updated this revision to Diff 199765.May 16 2019, 1:44 AM

Check Fixup is nullptr or not before calling its member functions.

Why do we need this flag; can we not already determine this (either in generic code or the target backends)? Anything where all symbols referenced are not in SHF_EXECINSTR sections can be evaluated even with relaxations, which covers all these cases:

  1. MCDwarfLineTableHeader::Emit's LineStartSym/LineEndSym/ProEndSym are in .debug_line.dwo which is just SHF_EXCLUDE
  2. EmitGenDwarfInfo's InfoStart/InfoEnd are in .debug_info.dwo which is just SHF_EXCLUDE
  3. FrameEmitterImpl::EmitCIE's sectionStart/sectionEnd are either in .eh_frame which is SHF_ALLOC (and also SHF_WRITE on non-x86 Solaris), or in .debug_frame which has no flags
  4. FrameEmitterImpl::EmitFDE's fdeStart/fdeEnd/cieStart/SectionStart should all also be in the same section as above

Why do we need this flag; can we not already determine this (either in generic code or the target backends)? Anything where all symbols referenced are not in SHF_EXECINSTR sections can be evaluated even with relaxations, which covers all these cases:

  1. MCDwarfLineTableHeader::Emit's LineStartSym/LineEndSym/ProEndSym are in .debug_line.dwo which is just SHF_EXCLUDE
  2. EmitGenDwarfInfo's InfoStart/InfoEnd are in .debug_info.dwo which is just SHF_EXCLUDE
  3. FrameEmitterImpl::EmitCIE's sectionStart/sectionEnd are either in .eh_frame which is SHF_ALLOC (and also SHF_WRITE on non-x86 Solaris), or in .debug_frame which has no flags
  4. FrameEmitterImpl::EmitFDE's fdeStart/fdeEnd/cieStart/SectionStart should all also be in the same section as above

I got it. I missed the information in MCSymbol. Every symbol should have information about the section it is located. So, we could use something like sym.getSection().getKind().isText() to determine the symbol is located in text or not. You are right. I will abandon this revision. Thanks.

HsiangKai updated this revision to Diff 202082.May 29 2019, 5:35 PM
HsiangKai edited the summary of this revision. (Show Details)

Fix the bug according to @jrtc27's suggestions.

edward-jones added inline comments.Jun 19 2019, 7:03 AM
lib/MC/MCExpr.cpp
610

I found a bug with these latest changes when compiling the following program:

func:
  .cfi_startproc
  .cfi_endproc

With the command:

llvm-mc -filetype=obj tmp.s -o tmp.o -triple riscv32

The calls to ...->getSymbol().getSection() can trigger an assertion if the symbol is not defined in a section, which seems to happen in the above testcase.

Just a quick comment, I'm assuming this depends on D58335 based on the test file you've modified?

HsiangKai marked an inline comment as done.Jun 22 2019, 9:02 AM

Just a quick comment, I'm assuming this depends on D58335 based on the test file you've modified?

Yes, the test case is based on D58335.

lib/MC/MCExpr.cpp
610

Thank you for your information. I have fixed the bug. Please help me to review it and verify the solution. Thanks.

edward-jones added inline comments.Jun 25 2019, 8:06 AM
lib/MC/MCExpr.cpp
610

Yep, this all works for me now. Looks good!

lenary added a subscriber: lenary.Jul 5 2019, 7:58 AM

I added D58335 as a parent revision, as the relax-debug-frame.ll file is created in that revision, and this patch does not apply without it.

Does this solution extend/address other label deltas, like those that would appear in the other DWARF/non-frame sections? (eg: currently we use a length in the DWARF high_pc because it means not having to use a relocation for high_pc, which reduces object size - so, does this fix cause relocations to be used for that length expression to ensure they're correct if the linker changes the instructions in that range?)

Does this solution extend/address other label deltas, like those that would appear in the other DWARF/non-frame sections? (eg: currently we use a length in the DWARF high_pc because it means not having to use a relocation for high_pc, which reduces object size - so, does this fix cause relocations to be used for that length expression to ensure they're correct if the linker changes the instructions in that range?)

After D45181 committed, LLVM will generate relocation pairs for symbol difference expressions if target enables relaxation. So, it will generate relocations for high_pc due to it is a symbol difference expression of end symbol and start symbol of the function range. Such as,

.word   .Lfunc_begin0           # DW_AT_low_pc
.word   .Lfunc_end4-.Lfunc_begin0 # DW_AT_high_pc

This commit modified the condition to generate relocations for symbol difference expressions. Not all symbol difference expressions are related to executable code. However, only executable code will be modified by linker for relaxation. So, we do not need to generate relocations for symbol differences not located in executable code sections.

Does this solution extend/address other label deltas, like those that would appear in the other DWARF/non-frame sections? (eg: currently we use a length in the DWARF high_pc because it means not having to use a relocation for high_pc, which reduces object size - so, does this fix cause relocations to be used for that length expression to ensure they're correct if the linker changes the instructions in that range?)

After D45181 committed, LLVM will generate relocation pairs for symbol difference expressions if target enables relaxation. So, it will generate relocations for high_pc due to it is a symbol difference expression of end symbol and start symbol of the function range. Such as,

.word   .Lfunc_begin0           # DW_AT_low_pc
.word   .Lfunc_end4-.Lfunc_begin0 # DW_AT_high_pc

This commit modified the condition to generate relocations for symbol difference expressions. Not all symbol difference expressions are related to executable code. However, only executable code will be modified by linker for relaxation. So, we do not need to generate relocations for symbol differences not located in executable code sections.

Good to know - so has LLVM been creating incorrect DWARF in these cases so far - or has no target enabled linker relaxation yet?

Is there any way to make this more targeted - does linker relaxation allow arbitrary machine code changes? Or is there something more fine-grained - so we could compute compile-time constants for some label differences, but only use linker relocations for cases where we know the instruction sequence includes something relaxable?

Does this solution extend/address other label deltas, like those that would appear in the other DWARF/non-frame sections? (eg: currently we use a length in the DWARF high_pc because it means not having to use a relocation for high_pc, which reduces object size - so, does this fix cause relocations to be used for that length expression to ensure they're correct if the linker changes the instructions in that range?)

After D45181 committed, LLVM will generate relocation pairs for symbol difference expressions if target enables relaxation. So, it will generate relocations for high_pc due to it is a symbol difference expression of end symbol and start symbol of the function range. Such as,

.word   .Lfunc_begin0           # DW_AT_low_pc
.word   .Lfunc_end4-.Lfunc_begin0 # DW_AT_high_pc

This commit modified the condition to generate relocations for symbol difference expressions. Not all symbol difference expressions are related to executable code. However, only executable code will be modified by linker for relaxation. So, we do not need to generate relocations for symbol differences not located in executable code sections.

Good to know - so has LLVM been creating incorrect DWARF in these cases so far - or has no target enabled linker relaxation yet?

Is there any way to make this more targeted - does linker relaxation allow arbitrary machine code changes? Or is there something more fine-grained - so we could compute compile-time constants for some label differences, but only use linker relocations for cases where we know the instruction sequence includes something relaxable?

AFAIK only RISC-V enables this feature. Currently, .debug_frame/.eh_frame is wrong after relaxation. I already fixed parts of the problem in D58335. In D58335, .debug_frame/.eh_frame will generate relocations if relaxation is enabled. Otherwise, it will not aware relaxation at all and it doesn't work to do frame inspection when debugging.

Linker relaxation processes some patterns of instruction sequences. For example, RISC-V could use auipc/jalr to jump to 32-bits pc-relative address. If linker knows the jump target is within +/-1 Mib, it could replace auipc/jalr to jal. The replacement will shorten the instruction sequence. So, the label differences between these instruction sequences may be changed after relaxation. That is why we need to annotate relocation types for these symbol differences in code sections.

Currently, there is no way to make it more targeted. When relaxation is enabled, it assumes all code sequences may be changed after relaxation. If we want to make placement of relocations more specific, we need to inspect every instruction between two symbols when we create symbol difference expressions for them. I think it will do a lot of work to gain few benefits. (Fewer relocations, smaller object code, less linker efforts, etc.)

Does this solution extend/address other label deltas, like those that would appear in the other DWARF/non-frame sections? (eg: currently we use a length in the DWARF high_pc because it means not having to use a relocation for high_pc, which reduces object size - so, does this fix cause relocations to be used for that length expression to ensure they're correct if the linker changes the instructions in that range?)

After D45181 committed, LLVM will generate relocation pairs for symbol difference expressions if target enables relaxation. So, it will generate relocations for high_pc due to it is a symbol difference expression of end symbol and start symbol of the function range. Such as,

.word   .Lfunc_begin0           # DW_AT_low_pc
.word   .Lfunc_end4-.Lfunc_begin0 # DW_AT_high_pc

This commit modified the condition to generate relocations for symbol difference expressions. Not all symbol difference expressions are related to executable code. However, only executable code will be modified by linker for relaxation. So, we do not need to generate relocations for symbol differences not located in executable code sections.

Good to know - so has LLVM been creating incorrect DWARF in these cases so far - or has no target enabled linker relaxation yet?

Is there any way to make this more targeted - does linker relaxation allow arbitrary machine code changes? Or is there something more fine-grained - so we could compute compile-time constants for some label differences, but only use linker relocations for cases where we know the instruction sequence includes something relaxable?

AFAIK only RISC-V enables this feature. Currently, .debug_frame/.eh_frame is wrong after relaxation. I already fixed parts of the problem in D58335. In D58335, .debug_frame/.eh_frame will generate relocations if relaxation is enabled. Otherwise, it will not aware relaxation at all and it doesn't work to do frame inspection when debugging.

Linker relaxation processes some patterns of instruction sequences. For example, RISC-V could use auipc/jalr to jump to 32-bits pc-relative address. If linker knows the jump target is within +/-1 Mib, it could replace auipc/jalr to jal. The replacement will shorten the instruction sequence. So, the label differences between these instruction sequences may be changed after relaxation. That is why we need to annotate relocation types for these symbol differences in code sections.

Currently, there is no way to make it more targeted. When relaxation is enabled, it assumes all code sequences may be changed after relaxation. If we want to make placement of relocations more specific, we need to inspect every instruction between two symbols when we create symbol difference expressions for them. I think it will do a lot of work to gain few benefits. (Fewer relocations, smaller object code, less linker efforts, etc.)

Thanks for helping me understand the context/support here.

I'm aware of some folks (should be announced/discussed upstream soon) working on a feature that'd require linker relaxation for x86 - but only under some fairly narrow circumstances, so there would be potentially more benefit to be gained by avoiding these label difference relocations where the address range is known not to be modifiable by the linker.

But all good - not your puzzle to solve, good for me to understand so I can keep it in mind as further work for this x86 situation/project.

asb accepted this revision.Jul 15 2019, 10:16 PM

LGTM, thanks! Some minor rebasing is needed for relax-debug-frame.ll. Please go ahead and commit.

This revision is now accepted and ready to land.Jul 15 2019, 10:16 PM
This revision was automatically updated to reflect the committed changes.
MaskRay added subscribers: reames, compnerd.EditedMay 11 2023, 2:12 PM

@reames for D150373

For MCObjectStreamer::emitDwarfAdvanceLineAddr and MCObjectStreamer::emitDwarfAdvanceFrameAddr, we have:

  • fast path: used by assembly input (commit 57ab708bdd3231b23a8ef4978b11ff07616034a2 (2010))
  • slow path (a relaxable fragment): used by code generation (much more important case)

This piece of code (canFold) blocks label difference folding, so that we will take the slow path.
The code was removed by @compnerd's linker relaxation refactoring D103539 (RISCV: adjust handling of relocation emission for RISCV)

I think we really just want to get rid of the "fast path": D150004.

With both fast path and slow path, we have diverging behaviors for RISC-V: how RISC-V linker relaxation leads to relocations happens in different code paths.

The slow path tends to be tested more and more robust, since it controls direct object code emission (more likely noticed by debuggers, etc), while the fast path is only for assembly output (assembly input files are less common & clang -S a.c; clang -c a.s users are uncommon).

I have tried reading MCExpr.cpp multiple times and I always feel that some parts of it need more love to address some problems (some are documented as FIXME).

(If the fast path is for simple operations like LEB128 (see D150004's analysis), I will more likely accept.
However, I'd prefer that we pay more attention to make the "slow path" faster.)