This is an archive of the discontinued LLVM Phabricator instance.

Change to use ValueHandle for associated data of GlobalCtors
Needs ReviewPublic

Authored by wlei on Sep 7 2021, 12:47 PM.

Details

Summary

We found one use-after-free case that the data(GV) in GlobalCtors is used after it's freed in GetOrCreateLLVMGlobal:Entry->eraseFromParent();.

As mentioned in the comments, when there is a conflict with the global creation, it will replace old one and update all the old usages with the new one, so here this patch tries to fix it by using ValueHandle for associated data of GlobalCtors.

Diff Detail

Event Timeline

wlei created this revision.Sep 7 2021, 12:47 PM
wlei requested review of this revision.Sep 7 2021, 12:47 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 7 2021, 12:47 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
wlei edited the summary of this revision. (Show Details)Sep 7 2021, 1:09 PM
wlei added reviewers: hoy, wenlei.
wlei edited the summary of this revision. (Show Details)Sep 7 2021, 6:05 PM
wlei added reviewers: bruno, tra, andreybokhanko, rjmccall, rnk.
rnk added a comment.Sep 7 2021, 6:22 PM

This seems like a good use for a ValueHandle, even if it is slightly more code.

I agree with Reid. IRGen should never be holding naked references (across different invocations as an ASTConsumer) to a replaceable entity. There are some entities that are known not to be replaceable, but in general the associated data of a global constructor doesn't seem like one of them. We should be using a value handle for this.

wenlei added a comment.Sep 8 2021, 8:00 AM

Yes, this looks like the kind of stuff VH is designed for. @wlei here's an example of how VH is used to handle deletion of BFI of basic blocks: https://reviews.llvm.org/D75341.

wlei updated this revision to Diff 371455.Sep 8 2021, 2:50 PM

change to use ValueHandle for associated data

wlei added a comment.Sep 8 2021, 2:50 PM

@rnk @rjmccall Thanks for the suggestion! @wenlei Thanks for the helpful example!

Changed to use ValueHandle(WeakTrackingVH) for this, please see if this looks good to you.

wlei retitled this revision from Fix use-after-free from GlobalCtors associated data to Change to use ValueHandle for associated data of GlobalCtors .Sep 8 2021, 2:52 PM
wlei edited the summary of this revision. (Show Details)
rnk added a comment.Sep 8 2021, 3:20 PM

This seems incorrect: a weak VH won't have the same semantics, it will just drop the comdat associativity when the underlying global changes type. BTW, can this be tested, or is this only possible when clang is being used as a library, perhaps as in swift?

wlei added a comment.Sep 8 2021, 4:30 PM

This seems incorrect: a weak VH won't have the same semantics, it will just drop the comdat associativity when the underlying global changes type.

@rnk Thanks. Here I tried to use WeakTrackingVH working together with replaceAllUsesWith which is already there (in CodeGenModule.cpp:3904), my debugging dump showed like:

GlobalCtors = [..,OldPtr];
OldPtr->replaceAllUsesWith(NewPtr);
GlobalCtors = [..,NewPtr];

I think this is what we desire. Do you think this can cause other issues or anything I missed?

BTW, can this be tested, or is this only possible when clang is being used as a library, perhaps as in swift?

Not swift, it seems this use-after-free needs a large program to repro, I tried to simplify it but failed to expose it by small program.

clang/lib/CodeGen/CodeGenModule.cpp
3904

Here is the replaceAllUsesWith .

It shouldn't need a large program to duplicate, but the test will need to exercise some unusual, arguably invalid conditions.

We need to add an entry with associated data to GlobalCtors or GlobalDtors and then replace the associated data. We only actually use associated data in either of these lists when we have a C++ global variable with a dynamic initializer that needs a COMDAT key. So first of all, the target needs to use COMDAT. Also, the variable needs to satisfy the conditions for using a COMDAT key, which is apparently just being externally visible (which is surprising to me — I would expect this to only be useful on entities with vague linkage?). If all of that is true, then the COMDAT key we use for these entities is just the address of the variable.

Okay, so the test needs to trigger a replacement of the llvm::GlobalVariable that we emitted a constructor for. The main reason we replace global variables is that we emitted a use of it with one IR type and then, when we got around to emitting its definition, we realized it needed to have a different IR type. (This is really just a bad representation in LLVM: external variable declarations don't actually need a value type, and it just causes problems.) This can happen in several different situations, like if the type of the variable was initially incomplete, or in some cases involving unions. But none of that should apply here, because we actually compute a constant initializer for the global before we register a dynamic constructor function for it, so I can't see why we'd need to replace it.

My guess is that somehow your program is causing a second variable to be emitted with the same name, maybe using asm labels, or maybe due to some oversight in the mangling rules.

rnk added a comment.Sep 8 2021, 7:09 PM

Not swift, it seems this use-after-free needs a large program to repro, I tried to simplify it but failed to expose it by small program.

If you use an ASan build to detect the use-after-free, you may be able to CReduce it down automatically.

rnk added a comment.Sep 8 2021, 7:10 PM

I think this is what we desire. Do you think this can cause other issues or anything I missed?

I forgot to mention, yes, sounds good, I see, it is only weak when the value is erased without being replaced.