This is an archive of the discontinued LLVM Phabricator instance.

[Verifier][WinEH] Check funclet tokens on intrinsic calls that may lower to function calls
ClosedPublic

Authored by sgraenitz on Nov 16 2022, 5:05 AM.

Details

Summary

WinEHPrepare requires funclet operand bundles ("tokens") on function calls from EH funclets to prevent them from getting removed as "implausible" calls. This includes calls to intrinsic functions that lower to function calls in the course of IR transformations (e.g. ObjC ARC runtime calls).

We can not detect such cases in WinEHPrepare itself, because at this point they mixed up with valid implausible calls. These must be removed to guarantee that the EH backend can assign unique colors and EH state numbers to all blocks.

This patch allows the IR Verifier to detect missing and dangling funclet tokens. Non-conforming IR becomes illegal and miscompilations are detected early. In order to find funclet pad instructions for funclets that extend over multiple blocks, we have to calculate EH funclet colors. As coloring can be expensive, it runs on-demand and results are cached per function.

Diff Detail

Event Timeline

sgraenitz created this revision.Nov 16 2022, 5:05 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 16 2022, 5:05 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
sgraenitz requested review of this revision.Nov 16 2022, 5:05 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 16 2022, 5:05 AM
sgraenitz added inline comments.Nov 16 2022, 5:57 AM
llvm/test/Transforms/ObjCARC/invoke-2.ll
42

The verifier already caught this, which I assume is a bug that sat here unnoticed for years -- catchpad without catchret?

rnk added inline comments.Nov 18 2022, 10:52 AM
llvm/lib/IR/Verifier.cpp
4928

This property should be validated for invokes, calls, and callbr, right?

5807

I think these comments should discuss the purpose of this verification. Maybe we can say:

// Verify that each function call belongs to exactly one funclet (or the main function),
// and that there aren't any unmediated control transfers between funclets.
llvm/test/Transforms/ObjCARC/invoke-2.ll
42

Looks like a bug to me

sgraenitz updated this revision to Diff 476879.Nov 21 2022, 6:26 AM

Update comment

llvm/lib/IR/Verifier.cpp
4928

If I understand correctly, we can guarantee that for IR right after codegen, but not in between transformations -- because transformations must be allowed to introduce implausible calls? If we could guarantee it for calls across all transformations, then WinEHPrepare::removeImplausibleInstructions() doesn't seem to be necessary.

5807

belongs to exactly one funclet (or the main function)

As mentioned in D137944, I am not sure we can guarantee that when running the verifier in between transformations

Is there any more feedback on this extension of the verifier?

This review belongs to a series of patches. I am planning to land it towards the end of next week, so that it's ready for the next release branching in February. If you have any more remarks, please let me know soon.

Would it make sense to check calls to non-intrinsic functions here? I don't see any meaningful distinction between an intrinsic that will be lowered to a non-intrinsic call, vs. an actual call.

Would it make sense to allow dangling funclet tokens in unreachable code? In particular, I'm concerned about cases where a bit of code is logically inside a funclet, but control flow simplification cuts it off from the funclet entry. (WinEHPrepare will throw away unreachable code anyway, so it shouldn't care.)

Thanks for your feedback @efriedma

Would it make sense to check calls to non-intrinsic functions here? I don't see any meaningful distinction between an intrinsic that will be lowered to a non-intrinsic call, vs. an actual call.

Front-ends that target WinEH are responsible for setting up funclet tokens for function calls. In Clang this happens in generic code paths and it works reliably. While we could check it here for all cases, it doesn't seem necessary and it would add overhead. The compromise to check funclet tokens for certain intrinsic calls arised from the fact that middle-end passes can add/replace intrinsics and sometimes miss to set up funclet tokens. One current example is here: https://reviews.llvm.org/D137944#C3801853OL1776

It would be handy to catch these cases early in the verifier. Reid explained some more background here: https://reviews.llvm.org/D134866#3824796

Would it make sense to allow dangling funclet tokens in unreachable code? In particular, I'm concerned about cases where a bit of code is logically inside a funclet, but control flow simplification cuts it off from the funclet entry. (WinEHPrepare will throw away unreachable code anyway, so it shouldn't care.)

Good point. I am not an expert for details in simplifycfg. If it produces such cases legally and doesn't clean them up (and instead rely on WinEHPrepare), that's odd but it needs to be considered. Dangling tokens were the symptom I observed in https://reviews.llvm.org/D134866 (which turned out to be a front-end codegen issue) and catching them here would be great, but they used to be less frequent than missing token cases and I guess we could live without them for now.

Thanks for your feedback @efriedma

Would it make sense to check calls to non-intrinsic functions here? I don't see any meaningful distinction between an intrinsic that will be lowered to a non-intrinsic call, vs. an actual call.

Front-ends that target WinEH are responsible for setting up funclet tokens for function calls. In Clang this happens in generic code paths and it works reliably. While we could check it here for all cases, it doesn't seem necessary and it would add overhead. The compromise to check funclet tokens for certain intrinsic calls arised from the fact that middle-end passes can add/replace intrinsics and sometimes miss to set up funclet tokens. One current example is here: https://reviews.llvm.org/D137944#C3801853OL1776

It would be handy to catch these cases early in the verifier. Reid explained some more background here: https://reviews.llvm.org/D134866#3824796

Okay

Would it make sense to allow dangling funclet tokens in unreachable code? In particular, I'm concerned about cases where a bit of code is logically inside a funclet, but control flow simplification cuts it off from the funclet entry. (WinEHPrepare will throw away unreachable code anyway, so it shouldn't care.)

Good point. I am not an expert for details in simplifycfg. If it produces such cases legally and doesn't clean them up (and instead rely on WinEHPrepare), that's odd but it needs to be considered. Dangling tokens were the symptom I observed in https://reviews.llvm.org/D134866 (which turned out to be a front-end codegen issue) and catching them here would be great, but they used to be less frequent than missing token cases and I guess we could live without them for now.

I'm concerned specifically about cases where a block is unreachable in the sense of isReachableFromEntry. I'm not concerned about SimplifyCFG itself (that cleans up dead blocks); more concerned about, for example, clang omitting some control flow due to an if (0), but we emit the body of the if because it contains a label.

sgraenitz updated this revision to Diff 491754.Jan 24 2023, 6:37 AM

Rebase and drop check for dangling funclet tokens

Thanks for your quick reply!

more concerned about, for example, clang omitting some control flow due to an if (0), but we emit the body of the if because it contains a label

I didn't manage to reproduce this quickly. If you have more specific ideas for a reproducer, please let me know. Maybe next time I work on this, I can investigate in more detail!

In the meantime I dropped the dangling funclet checks and rebased to TOT. Dou think we can give this a try?

efriedma accepted this revision.Jan 25 2023, 9:50 AM

LGTM

I didn't manage to reproduce this quickly. If you have more specific ideas for a reproducer, please let me know.

Something like the following:

void g1(), g2();
void f() {
  try {
    g();
  } catch (...) {
    return;
    Z: g(); goto Z;
  }
}
This revision is now accepted and ready to land.Jan 25 2023, 9:50 AM
This revision was landed with ongoing or failed builds.Jan 27 2023, 9:06 AM
This revision was automatically updated to reflect the committed changes.

Thanks! Unfortunately the release branched earlier this year so this didn't make it into 16.x