This is an archive of the discontinued LLVM Phabricator instance.

[OpaquePtr] Support opaque pointers in intrinsic type check
ClosedPublic

Authored by nikic on Jun 29 2021, 2:59 PM.

Details

Summary

This adds partial support for opaque pointers in intrinsic type checks. Specifically, this handles the case where a fixed pointer type is specified.

This is less straight-forward than it might initially seem, because we should only accept opaque pointers here in --force-opaque-pointers mode. Otherwise, there would be more than one valid type signature for a given intrinsic name.

I'm accessing the flag on LLVMContextImpl, but possibly the ForceOpaquePointers mode should be part of the public LLVM context API?

Diff Detail

Event Timeline

nikic created this revision.Jun 29 2021, 2:59 PM
nikic requested review of this revision.Jun 29 2021, 2:59 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 29 2021, 2:59 PM
nikic edited the summary of this revision. (Show Details)Jun 29 2021, 3:00 PM

I'm accessing the flag on LLVMContextImpl, but possibly the ForceOpaquePointers mode should be part of the public LLVM context API?

I agree that adding something to LLVMContext itself would be cleaner. I suspect this flag will be needed in other places too!

BTW, the proposed API might be more forward-looking / clear if it talked about what's supported rather than what's being forced. Something like inverting to supportsTypedPointers() (or using onlySupportsOpaquePointers()). Forcing implies a change from a default state. I figure there'll be some period of time where the default will be NOT to support typed pointers, and this flag becomes a way to turn off an on-by-default, until eventually typed pointers are eliminated entirely... at which point maybe the command-line option can/should be renamed or removed, but it'd be nice not to have churn in the LLVMContext API.

llvm/lib/IR/Function.cpp
1412–1417

Your approach might be the right / pragmatic way forward, but wanted to check whether the following alternatives had been considered/discussed somewhere for when --force-opaque-pointers=false:

  1. Change mangling for intrinsics to differentiate "opaque pointer" vs other pointers.
  2. Allow either signature, but require all pointers to be the same as each other within an intrinsic (all i8* or all ptr), and require an intrinsic to be used consistently across a module (whatever client / use-site adds the declaration to the module is "correct", everyone else needs to bitcast).
llvm/test/Assembler/remangle-intrinsic-opaque-ptr.ll
2–5

It'd be good to add a positive RUN: line (using --force-opaque-pointers and a different CHECK prefix) to confirm this testcase doesn't bitrot.

nikic updated this revision to Diff 355574.Jun 30 2021, 8:56 AM

Add LLVMContext method.

nikic added inline comments.Jun 30 2021, 9:04 AM
llvm/lib/IR/Function.cpp
1412–1417

For 1: We do already mangle opaque pointers differently. However, this only applies to cases where multiple pointer types are accepted. This already works, and we have a couple tests for stuff like @llvm.memset.p0(ptr). Here the pointer type is fixed, so it's not part of the mangled name. We would have to change the name mangling to include a typed/opaque indicator where none exists right now, This would be a lot of churn for functionality that is useful neither before nor after the opaque pointer migration.

For 2: I'm pretty sure this is going to crash and burn, because a lot of code expects that an intrinsic is actually a Function, not a ConstantExpr.

nikic updated this revision to Diff 355587.Jun 30 2021, 9:15 AM

Fix typos.

nikic updated this revision to Diff 355602.Jun 30 2021, 9:45 AM

Also handles PointerToElt case. It looks like PointerTo is unused in-tree and VectorOfAnyPointersToElt is part of type mangling, so can directly allow opaque pointers.

dexonsmith added inline comments.Jun 30 2021, 10:30 AM
llvm/lib/IR/Function.cpp
1412–1417

On 1: agreed that this sounds like excessive churn.

On 2: I see; and updating code to see look through the bitcast would also be a ton of churn that would eventually just be deleted.

Stepping back, I wonder if it'd be useful to transition such intrinsics over to using opaque pointers instead of i8*, as a way of breaking out a manageable chunk of the overall transition (reducing the impact of --force-opaque-pointers). For example, intrinsics could be moved incrementally one-at-time or in logical groups (e.g., change @llvm.va_start and @llvm.va_end to use ptr instead of i8*). Seems like any code changes for this would need to be done anyway, and probably aren't that hard... and this way there's less code multiplexing between the two modes.

Not suggesting you do that instead of this patch, more of a general suggestion that could be done separately... although if this implementation makes it illegal for an intrinsic to choose ptr its fixed pointer type outside of --force-opaque-pointers, that'd have to massaged somehow...

nikic added inline comments.Jun 30 2021, 11:05 AM
llvm/lib/IR/Function.cpp
1412–1417

The core problem with doing that is that as soon as we generate opaque pointers somewhere, these might propagate into some place that doesn't yet support them, so that's something we could only do relatively late-stage in the migration with confidence. So I don't think that's a useful consideration at this point in time.

I did play with the idea of having something like a "convert to opaque pointers" pass that could be inserted early in the pipeline to allow more exposure to opaque pointers by converting things like function arguments. Ultimately, I'm not sure how useful something like that would be vs a full switch, because the mixed mode differs significantly from the fully opaque one. It might be useful if clang continues to lag behind badly.

dexonsmith accepted this revision.Jun 30 2021, 12:08 PM

LGTM! Thanks for walking me through the reasoning. (BTW, the phab description is out of date re: LLVMContextImpl, maybe fix it for the commit message.)

llvm/lib/IR/Function.cpp
1412–1417

Yup, you're right, sounds like it'd need to be late stage.

This revision is now accepted and ready to land.Jun 30 2021, 12:08 PM
aeubanks added inline comments.
llvm/lib/IR/Function.cpp
1412–1417

I never intended for --force-opaque-pointers to be used for anything other than checking if we're good to go to update all tests. The plan was to turn on --force-opaque-pointers locally, run check-llvm, check-clang, build a large codebase with clang, etc., and see if there are crashes. Once we think we're all good, we start updating llvm tests to use opaque pointers. After that, remove --force-opaque-pointers.

My opinion is that we should go with #2 and expect either a declaration of the intrinsic with a ptr return type, or a declaration with an i8* return type, but not both.

I don't think supporting mixed opaque pointer and non-opaque pointer modules is something we explicitly want to support. It may happen to work in some (maybe even many) cases, but in cases with intrinsics like this it won't really make sense to. And we definitely don't want to support the same intrinsic twice with different return types in the same module.

nikic added inline comments.Jun 30 2021, 1:05 PM
llvm/lib/IR/Function.cpp
1412–1417

Currently, --force-opaque-pointers is just there to find assertions failure. However, when we actually want to make the switch, I believe that has to happen by flipping the flag, not by converting individual pointer types in tests. Importantly, converting individual types still leaves you with typed pointers for allocas and for globals/functions, because the desired pointer type is never explicitly mentioned for them.

I don't think that your option #2 can work without some major changes to how we handle intrinsics. We rely on things being unique by design. E.g. if some code creates a call to Intrinsic::stackprotector, then it does so using Intrinsic::getDeclaration() and expects to get back a function of a certain type. It will generate invalid code if it gets back one with opaque pointer parameters (when we're talking about non-force-opaque-pointers mode) because one happens to be declared in the module already. We also can't just insert a bitcast, because the method returns a Function* and changing that to Constant*would have some major impact because now it no longer counts as a FunctionCallee.

I don't think this is something we should try to pursue, especially given how it is only useful for mixed typed and opaque pointer mode. None of this is relevant when working with fully opaque pointers.

nikic added inline comments.Jun 30 2021, 2:08 PM
llvm/lib/IR/Function.cpp
1412–1417

In any case, I'd like to land this patch to unblock further work, as this is one of the most common assertion failures right now. If someone figures out how to make the mixed type + opaque pointer case work nicely, we can always lift the restriction at that point.

aeubanks accepted this revision.Jun 30 2021, 2:10 PM
aeubanks added inline comments.
llvm/lib/IR/Function.cpp
1412–1417

Initially I had thought that we'd have separate flags for bitcode/text IR readers to use opaque pointers, but yeah that doesn't address fundamental things like alloca.

This revision was automatically updated to reflect the committed changes.