This is an archive of the discontinued LLVM Phabricator instance.

[DWARF] Expose raw bytes in DWARFExpression
ClosedPublic

Authored by rafauler on Aug 4 2021, 5:28 PM.

Details

Summary

This information is necessary for clients of DebugInfo that
do not want to process a DWARF expression, but just treat it as a blob
of data. In BOLT, for example, we need to read these expressions in
CFIs and write them back to the binary, unchanged, so having access to
the original expression encoding is a shortcut to avoid the need to
re-encode the entire expression when re-writing exception handling
info (CFIs).

This patch is an alternative to https://reviews.llvm.org/D98301, in
which we implement the support to re-encode these expressions. But
since we don't really need to change anything in these expressions,
we can just copy their bytes.

Diff Detail

Event Timeline

rafauler requested review of this revision.Aug 4 2021, 5:28 PM
rafauler created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptAug 4 2021, 5:28 PM

On the surface, this seems reasonable, and is simple enough that a test is probably superfluous. However, my slight concern is that this appears to be dead code to an LLVM developer and therefore a candidate for removal, since there are no clients. I'd like one of those who more regular contributes to the library maintenance to chime in with their thoughts.

This will work fine for many DWARF expressions, but if they contain a DW_OP_convert which references a Type DIE by offset, or a DW_OP_call (currently not generated by LLVM, you may still need to rewrite the expression. If you are okay with these limitations, this patch is fine, I just would echo @jhenderson and recommend adding a unit test.

That's a valid point. For context, this is the full patch in our repo that is a mirror of current LLVM monorepo with BOLT in it (in folder /bolt), and shows how BOLT as a client uses this info:

https://github.com/facebookincubator/BOLT/pull/196

@aprantl interesting point, thanks for the heads up on this corner case. I read about it in the DWARF manual and here are my 2c. BOLT is currently forced to deal with DWARF expressions because they appear in the CFI (unwind metadata). They rarely appear, but they do appear. For unwinding purposes, I've checked the DWARF manual and it says that these operations are forbidden in an expression (probably because we don't want unwind info depending on debug info). [1]

So BOLT does rewrite all unwind info related to CFIs because their encoding is very dependent on the code layout (something BOLT is completely changing). But BOLT does not rewrite stuff that doesn't depend on layout because that's usually very expensive at post-link phase. So most of debug_info is just copied, and we definitely preserve original DIE offsets (but, for this case, I wouldn't worry about it, since this operation is not permitted to exist in CFIs anyway).

Regarding the unit test, I think it makes sense, let me write one.

[1] DWARF 5 manual sec 6.4.2 Call Frame Instructions:

Some call frame instructions have operands that are encoded as DWARF
expressions (see Section 2.5.1 on page 26). The following DWARF operators
cannot be used in such operands:
DW_OP_addrx, DW_OP_call2, DW_OP_call4, DW_OP_call_ref, DW_OP_const_type, DW_OP_constx, DW_OP_convert, DW_OP_deref_type, DW_OP_regval_type and DW_OP_reinterpret operators are not allowed in an operand of these instructions because the call frame information must not depend on other debug sections.
DW_OP_push_object_address is not meaningful in an operand of these instructions because there is no object context to provide a value to push.
DW_OP_call_frame_cfa is not meaningful in an operand of these instructions because its use would be circular.
Call frame instructions to which these restrictions apply include DW_CFA_def_cfa_expression, DW_CFA_expression and DW_CFA_val_expression.
aprantl accepted this revision.Aug 6 2021, 10:47 AM

Sounds good then! LGTM with a unit test.

This revision is now accepted and ready to land.Aug 6 2021, 10:47 AM
rafauler updated this revision to Diff 365881.Aug 11 2021, 5:11 PM

Adding a test case that explains how BOLT leverages this API.
Let me know if should I go for a simpler check that doesn't
involve a round trip to the assembler/cfi parser. Thanks!

I had something using yaml2obj input in mind, but this works, too.

jhenderson added inline comments.Aug 16 2021, 4:28 AM
llvm/unittests/DebugInfo/DWARF/DWARFExpressionCopyBytesTest.cpp
147

Fix lint issue.

rafauler updated this revision to Diff 367044.Aug 17 2021, 3:51 PM

Fixing linter issue

rafauler marked an inline comment as done.Aug 17 2021, 3:52 PM
This revision was automatically updated to reflect the committed changes.