This is an archive of the discontinued LLVM Phabricator instance.

Allow epilogue_begin to be emitted when generating DWARF
ClosedPublic

Authored by Ariel-Burton on Sep 6 2022, 2:05 PM.

Details

Summary

We identify epilogue code by looking for instructions tagged
with FrameDestroy.

A function may have more than one epilogue, e.g., because of early
returns or code duplicated during optimization. We need only track
the current block, and emit epilogie_begin at most once per block.

We reduce the number of entries in the line table by combining
epilogue_begin with other flags instead of emitting a separate
entry just for epilogue_begin.

Diff Detail

Event Timeline

Ariel-Burton created this revision.Sep 6 2022, 2:05 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 6 2022, 2:05 PM
Ariel-Burton requested review of this revision.Sep 6 2022, 2:05 PM
MatzeB added a comment.Sep 6 2022, 2:17 PM

What is this epilogue information going to be used for?

Generally I find the notion of "epilogue" a bit sketchy, since you can easily imagine the scheduler mixing CSR restore instructions (and whatever else you consider epilogue) with normal computation so you'll have epilogue and non-epilogue code mixed together so a notion of "epilogue starts here" doesn't really mean much...

What is this epilogue information going to be used for?

Generally I find the notion of "epilogue" a bit sketchy, since you can easily imagine the scheduler mixing CSR restore instructions (and whatever else you consider epilogue) with normal computation so you'll have epilogue and non-epilogue code mixed together so a notion of "epilogue starts here" doesn't really mean much...

As per the DWARF standard, it is intended to identify a location at which am exit breakpoint may be placed before the return.

One use case is for "run-to-end-of-funciton" (which some (e.g., z/OS), but not all debuggers support), instead of the gdb "finish"-like semantics, which will run to just after the current function returns (i.e., it will return to the caller).

I hear that the notion of epilogue might be a bit sketchy with some architectures, or optimization settings. I don't think that matters. Supporting epilogue_begin isn't mandatory, and from what I see in the sources, FrameDestory is not used on all platforms. On those platforms epilogue_begin will not be emitted at all. This change will not affect existing tools which ignore epilogue_begin for the reasons you give, or indeed, any other reasons. But it is useful for tools which do care, and on platforms where the epilogue is identifiable.

MatzeB removed a reviewer: MatzeB.Sep 7 2022, 9:48 AM

I think this is a reasonable feature to support. As @MatzeB pointed out, it may not be useful in optimized code. Are you primarily targeting optimized or unoptimized code? How much does the debug info size grow in % if you build, e.g., clang?

I think this is a reasonable feature to support. As @MatzeB pointed out, it may not be useful in optimized code. Are you primarily targeting optimized or unoptimized code? How much does the debug info size grow in % if you build, e.g., clang?

I sampled the binaries generated by a x86-64 linux build of clang. As one might expect, the section that grew is .debug_line. I see an increases of about 2%. However, the overall size of just the debug info increases by considerably less, about 0.3%. The increase of the overall file size is even less, at about 0,2%. Interestingly, I saw small reductions in the sizes of the .debug_line_str and .debug_str sections.

Thanks for this. On OpenVMS, our calling convention wants to know about both prologue and epilogue sections. For asych-error handling, we won't call a routine's handler (ie, personality-routine) if it is in its prologue or epilogue since the routine's frame is no longer active. In those cases, we walk back up the stack one frame to begin our handler search. Today, we just look at the instruction sequences and trying to guess as best we can. Having something in the debug_line table will be a big help.

Thanks for this. On OpenVMS, our calling convention wants to know about both prologue and epilogue sections. For asych-error handling, we won't call a routine's handler (ie, personality-routine) if it is in its prologue or epilogue since the routine's frame is no longer active. In those cases, we walk back up the stack one frame to begin our handler search. Today, we just look at the instruction sequences and trying to guess as best we can. Having something in the debug_line table will be a big help.

that seems concerning - sounds like the correctness/behavior of the platform depends on the presence/contents of .debug_line? That doesn't sound right. I thought anything relevant to that should be encoded in .debug_frame/.eh_frame, no?

Thanks for this. On OpenVMS, our calling convention wants to know about both prologue and epilogue sections. For asych-error handling, we won't call a routine's handler (ie, personality-routine) if it is in its prologue or epilogue since the routine's frame is no longer active. In those cases, we walk back up the stack one frame to begin our handler search. Today, we just look at the instruction sequences and trying to guess as best we can. Having something in the debug_line table will be a big help.

that seems concerning - sounds like the correctness/behavior of the platform depends on the presence/contents of .debug_line? That doesn't sound right. I thought anything relevant to that should be encoded in .debug_frame/.eh_frame, no?

Sorry for the delay, been on the road and only catching up. Yes my bad, we have that in eh_frame, not in the debug_line. The weakness of epilogue support has always been a concern of mine and I'm happy that things are catching up. The debug_line indicators of epilogue will help our debugger and performance analysis tools.

ping for approval

As one might expect, the section that grew is .debug_line. I see an increases of about 2%. However, the overall size of just the debug info increases by considerably less, about 0.3%. The increase of the overall file size is even less, at about 0,2%.

Under what build configuration/example target? These numbers can vary pretty widely depending on compression, split dwarf, optimized/unoptimized, etc. Might be good to have a few scenarios covered, at least with clang or something similarly large-ish from the LLVM project.

Interestingly, I saw small reductions in the sizes of the .debug_line_str and .debug_str sections.

that seems really suspicious - is there any reason why this change would reduce those?

As one might expect, the section that grew is .debug_line. I see an increases of about 2%. However, the overall size of just the debug info increases by considerably less, about 0.3%. The increase of the overall file size is even less, at about 0,2%.

Under what build configuration/example target? These numbers can vary pretty widely depending on compression, split dwarf, optimized/unoptimized, etc. Might be good to have a few scenarios covered, at least with clang or something similarly large-ish from the LLVM project.

This was a vanilla x86-64 linux build with debug.

Interestingly, I saw small reductions in the sizes of the .debug_line_str and .debug_str sections.

that seems really suspicious - is there any reason why this change would reduce those?

I've looked more closely at the strings, and I believe this is a false alarm. The reason for the reduction in size is that the path in which the with-the-change was built was shorter than of the without-the-change build.

dblaikie accepted this revision.Dec 1 2022, 1:27 PM

I still suspect more data would be useful, but sure - let's give it a go.

This revision is now accepted and ready to land.Dec 1 2022, 1:27 PM
This revision was automatically updated to reflect the committed changes.
labath added a subscriber: labath.Dec 12 2022, 12:36 AM

I don't think that this indicates a problem with this patch, but you might be interested to know that this broke a test in lldb which was trying to change the PC value to a given line (and it failed because that line was now associated with multiple line entries). I have worked around the problem with a06342d250ec7bee37dc93477f233e43e597aca5 and filed PR59458 to track possible improvements in the lldb implementation.