This is an archive of the discontinued LLVM Phabricator instance.

[lld] Destroy CommonLinkerContext inside lld::*::link after D108850
AbandonedPublic

Authored by MaskRay on Jan 30 2023, 6:51 PM.

Details

Reviewers
aganea
bjoo
jhuber6
mstorsjo
Group Reviewers
Restricted Project
Summary

Close https://github.com/llvm/llvm-project/issues/59874

D108850 changes ports to allocate CommonLinkerContext on the heap and destroys
lctx in lld/tools/lld/lld.cpp:lldMain. For API users of lld::*::link, they
need to manually call CommonLinkerContext::destroy() to prevent states from a
previous invocation. This is not ergonomic.

This patch moves CommonLinkerContext destruction into lld::*::link by using
local variables. COFFLinkerContext consumes ~5KiB stack space but it should be
fine.

Only one CommonLinkerContext::destroy() remains: in safeLldMain for
destruction after longjmp in the presence of a fatal error.

Diff Detail

Event Timeline

MaskRay created this revision.Jan 30 2023, 6:51 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptJan 30 2023, 6:51 PM
Herald added a reviewer: Restricted Project. · View Herald Transcript
MaskRay requested review of this revision.Jan 30 2023, 6:51 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 30 2023, 6:51 PM

The OP in PR59874 could use safeLldMain instead of lld::elf::link and that will solve their issue (but lld.cpp.obj has to be linked in). With D119049 we should probably decide if we want to expose the naked lld::*::link functions outside of LLD, or if users should always use safeLldMain, or something else.

The reason for not allocating ctx on the stack is to survive calls to lld::error() in LLD-as-lib scenarios. lld::error() calls lld::exitLld() which calls llvm::Process::Exit(), which, if a llvm::CrashRecoveryContext exists on the stack, restores the flow back in lld::safeLldMain. As soon as longjmp jumps back into safeLldMain, the stack below it becomes invalid in theory. At that point it is not safe anymore to call CommonLinkerContext::destroy(), right? (even if that could work in some implementations).

The reason for not allocating ctx on the stack is to survive calls to lld::error() in LLD-as-lib scenarios.

Just another quick note -- in theory we can survive crashes as well here. But there's at least a case in lldELF which throws an error() inside a constructor, and that corrupts the heap on ctx deletion, on Windows at least, see rG45b8a741fbbf271e0fb71294cb7cdce3ad4b9bf3

Requiring safeLldMain looks good to me as well. Perhaps we should just remove declarations of lld::*::link.

MaskRay added a comment.EditedJan 31 2023, 11:29 AM

The reason for not allocating ctx on the stack is to survive calls to lld::error() in LLD-as-lib scenarios.

Just another quick note -- in theory we can survive crashes as well here. But there's at least a case in lldELF which throws an error() inside a constructor, and that corrupts the heap on ctx deletion, on Windows at least, see rG45b8a741fbbf271e0fb71294cb7cdce3ad4b9bf3

Do you happen to have more information about which specific line may trigger a problem on Windows? Does this patch regress it?

Do you happen to have more information about which specific line may trigger a problem on Windows? Does this patch regress it?

The previous issue was caused by this call to fatal but you fixed it in rGd9dbf9e30a581fcadd667b6d8e5827a4003b85a2, see problem description in rG45b8a741fbbf.
My whole point is that we don't need this current patch, users can just call lld::safeLldMain().

bjoo added a comment.Jan 31 2023, 3:17 PM

I will try that

Do you happen to have more information about which specific line may trigger a problem on Windows? Does this patch regress it?

The previous issue was caused by this call to fatal but you fixed it in rGd9dbf9e30a581fcadd667b6d8e5827a4003b85a2, see problem description in rG45b8a741fbbf.
My whole point is that we don't need this current patch, users can just call lld::safeLldMain().

How is auto *ctx = new CommonLinkerContext (with a later cleanup) different from an on-stack COFFLinkerContext ctx; whose destructor is not called?

aganea added a comment.Feb 5 2023, 8:11 AM

How is auto *ctx = new CommonLinkerContext (with a later cleanup) different from an on-stack COFFLinkerContext ctx; whose destructor is not called?

How is the destructor not called in this patch? ctx on the stack in this patch will call ~CommonLinkerContext() and the would destroy all the instances, the string saver, the bump allocator, etc. We want to avoid that in the "normal" scenario, for a quick exit.

You're absolutely right in the sense that as code stands right now, ctx could very well live on the stack. But since your fix rGd9dbf9e30a58, we could revert rG45b8a741fbbf and catch fatal() and exceptions in LLD-as-lib usages. Do we want that? Or do we keep assuming that on the first fatal error, we return SafeReturn { ... canRunAgain = false } to force the enclosing app. to exit? I'd prefer something more resilient if possible (that is, revert rG45b8a741fbbf and not commit this patch).

How is auto *ctx = new CommonLinkerContext (with a later cleanup) different from an on-stack COFFLinkerContext ctx; whose destructor is not called?

How is the destructor not called in this patch? ctx on the stack in this patch will call ~CommonLinkerContext() and the would destroy all the instances, the string saver, the bump allocator, etc. We want to avoid that in the "normal" scenario, for a quick exit.

You're absolutely right in the sense that as code stands right now, ctx could very well live on the stack. But since your fix rGd9dbf9e30a58, we could revert rG45b8a741fbbf and catch fatal() and exceptions in LLD-as-lib usages. Do we want that? Or do we keep assuming that on the first fatal error, we return SafeReturn { ... canRunAgain = false } to force the enclosing app. to exit? I'd prefer something more resilient if possible (that is, revert rG45b8a741fbbf and not commit this patch).

Reverting 45b8a741fbbf271e0fb71294cb7cdce3ad4b9bf3 seems reasonable. I need to play more with your another patch...

MaskRay abandoned this revision.Jul 22 2023, 6:31 PM

Abandon after D119049