This is an archive of the discontinued LLVM Phabricator instance.

Fix for PR33930. Short-circuit metadata mapping when cloning a varargs thunk.
ClosedPublic

Authored by wolfgangp on Oct 27 2017, 4:59 PM.

Details

Summary

The problem in PRR33930 comes about because clang is cloning a function (to generate varargs thunks) before all the Metadata nodes are resolved. The value mapper, which is used by the cloner to deal with Medatdata nodes, asserts when it encounters temporary otherwise unresolved nodes.

Ideally we would improve varargs thunk generation to not use cloning, but in the meantime the fix I'm proposing here is to clone the function's DISubprogram node explicitly (the verifier insists on ever function having its own). The cloned node is then entered into the value map that is supplied to the cloner (which feeds it to the value mapper). This has the effect that the value mapper does not look any further when it's trying to map all the rest of the MD nodes and therefore does not run into any temporary MD nodes that still exist at than point (in the form of forward references to struct declarations).

Furthermore, there are unresolved DILocalVariable nodes referenced by dbg.value intrinsics, which causes the value mapper to assert when it is remapping the function's instructions' MDNodes. These nodes are normally resolved at the catch-all at the end of the module, but they can be safely resolved here, since they won't change any more. The value mapper will safely clone them (if necessary) and remap their operand nodes during the cloning process.

The only drawback is that the cloned function's (the thunk's) DILocalVariableNodes will not appear in the newly cloned DISubprogram's node variable list. Instead, the new node points to the original function's variable list. However, since the only difference between the original function and the thunk is the this-ptr adjustment, the resulting debug information is correct, at least in the test cases I have looked at.

Unfortunately this isn't the cleanest solution in the world. A different approach, deferring the creation of varargs thunks until the end of the module, is more invasive. The approach proposed here duplicates Metadata. If anybody has any better ideas, please let me know.

Note that there are two patches here. One is an llvm patch that is NFC (making resolve() public) while the main one is a clang patch.

Diff Detail

Repository
rL LLVM

Event Timeline

wolfgangp created this revision.Oct 27 2017, 4:59 PM
aprantl edited edge metadata.Oct 30 2017, 8:56 AM

The only drawback is that the cloned function's (the thunk's) DILocalVariableNodes will not appear in the newly cloned DISubprogram's node variable list. Instead, the new node points to the original function's variable list. However, since the only difference between the original function and the thunk is the this-ptr adjustment, the resulting debug information is correct, at least in the test cases I have looked at.

Does the thunk contain any local variables other than function arguments?
What I'm getting at is: If this is only a thunk, do we need any local variables at all?

include/llvm/IR/Metadata.h
961 ↗(On Diff #120710)

The \brief is redundant and can be dropped.

lib/CodeGen/CGVTables.cpp
130 ↗(On Diff #120710)

///

140 ↗(On Diff #120710)

debug.declare -> (llvm.)dbg.declare.

143 ↗(On Diff #120710)

does this work?
for (auto &BB : Fn->getBasicBlockList())

145 ↗(On Diff #120710)

if (auto *DII ...

wolfgangp updated this revision to Diff 120876.Oct 30 2017, 1:05 PM

Incorporated review comments.

wolfgangp marked 5 inline comments as done.Oct 30 2017, 1:06 PM
wolfgangp added inline comments.
lib/CodeGen/CGVTables.cpp
143 ↗(On Diff #120710)

Yep, thanks. I wasn't aware of this.

aprantl accepted this revision.Oct 30 2017, 5:29 PM

This works for me, but as I said previously, perhaps we can get by with just not having any variables described in the thunk to further simplify the code.

This revision is now accepted and ready to land.Oct 30 2017, 5:29 PM
wolfgangp marked an inline comment as done.Oct 30 2017, 6:23 PM

This works for me, but as I said previously, perhaps we can get by with just not having any variables described in the thunk to further simplify the code.

I remember, but this "thunk" isn't really a thunk in the sense that it makes some adjustments and then branches to (or calls) the real function. It effectively becomes the real function (hence the cloning), so if you drop the variables you lose the ability to debug it.

I see. Then it makes sense to do it this way. Thanks for clarifying!

This revision was automatically updated to reflect the committed changes.