This is an archive of the discontinued LLVM Phabricator instance.

PostfixExpression: Introduce CFANode
ClosedPublic

Authored by labath on Apr 26 2019, 4:43 AM.

Details

Summary

This node represents the "Canonical Frame Address" of the current frame,
and is used by various DWARF expressions to express adresses of various
objects relative to the frame.

The motivation for this is the ability to translate breakpad register
unwind rules (strings like: ".cfa 8 - ^") back into dwarf.

This patch introduces the new node type, and teaches the dwarf codegen
how to handle it. A slight complication here is that the value of CFA is
provided differently depending on the context where the dwarf expression
is used. In unwind rules, it is made available implicitly via the
initial dwarf stack. It variable location expressions, it needs to be
referred to explicitly via a special opcode.

As currently, the only usage of this is in the unwind rules, I made the
dwarf generator use this convention, but this can be easily made
configurable, if necessary. The implementation does this via keeping
track of the DWARF stack depth an any given point, and then copying the
CFA value from the bottom of the stack via the DW_OP_pick opcode. This
could be made more efficient for simple expressions, but here I chose to
start with the most general implementation possible.

Event Timeline

labath created this revision.Apr 26 2019, 4:43 AM

PS: Although this patch is technically dependency free, for this to work as a part of the bigger picture, we'll need to get lldb's dwarf evaluator to actually push the CFA value to the dwarf stack (D61018) and fix its implementation of DW_OP_pick (D61182).

FYI: for all DWARF expressions found in EH frame stuff, they expect the CFA address to be pushed onto the stack prior to executing the expression. Do we just want to follow suit with the these expressions and avoid the extra needed node?

FYI: for all DWARF expressions found in EH frame stuff, they expect the CFA address to be pushed onto the stack prior to executing the expression. Do we just want to follow suit with the these expressions and avoid the extra needed node?

That is exactly what I am doing, and that's why I need D61018 for this to work, because right now we actually *aren't* pushing the cfa value to the dwarf stack.

However, I still need some kind of a node in the expression AST, so that the DWARF generator knows that it should insert a reference to the initial value.

Now theoretically, the dwarf codegen could be made smarter and try to optimize the generated expression. So for instance for an expression like ".cfa 8 +" it could be made to generate "DW_OP_lit8, DW_OP_plus" instead of the current "DW_OP_pick 0, DW_OP_uconst 8, DW_OP_plus". However, I don't see a need for that now, particularly, as this expression can be completely optimized away by using the "isCFAPlusOffset" mode of the UnwindPlan (which is something I plan to look at).

However, this got me thinking about the name of the special node, and I concluded that a better name would be the "InitialValueNode", reflecting it's usage instead of what it is supposed to contain. That way, it can be used in other situations where an initial value is present on the dwarf stack, and the name CFANode can remain free for the cases where one has to emit an explicit opcode (DW_OP_call_frame_cfa) in order to retrieve it.

labath updated this revision to Diff 197086.Apr 29 2019, 4:20 AM

s/CFA/InitialValue/

clayborg accepted this revision.Apr 29 2019, 7:12 AM
This revision is now accepted and ready to land.Apr 29 2019, 7:12 AM
amccarth accepted this revision.Apr 29 2019, 1:01 PM

LGTM, but I found one comment a bit confusing for me.

source/Symbol/PostfixExpression.cpp
150

I'm having trouble understanding this comment.

"will contain their value" -- What does "their" refer to here? I guessed InitialValueNodes, but then I'd expect "value" to be "values" (plural), and I'm not sure how that relates to the stack depth.

labath marked an inline comment as done.Apr 30 2019, 6:30 AM
labath added inline comments.
source/Symbol/PostfixExpression.cpp
150

I've rephrased this so that it is (hopefully) more understandable. The reason for the singular-plural schism is that the expression can theoretically contain multiple InitialValueNodes (I'm not sure why would anyone want to do that in practice, but the option is there), but they will all refer to the same value, which is the value pushed onto the dwarf stack before the expression evaluation began.

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptApr 30 2019, 6:31 AM