This is an archive of the discontinued LLVM Phabricator instance.

Mangling for intrinsic names w/function type parameters
ClosedPublic

Authored by reames on Jul 21 2014, 3:32 PM.

Details

Summary

This diff is not intended to be submitted in it's current form, but is primarily to spark discussion.

Currently, we have a type parameter mechanism for intrinsics. Rather than having to specify a separate intrinsic for each combination of argument and return types, we can specify a single intrinsic with one or more type parameters. These type parameters are passed explicitly to Intrinsic::getDeclaration or can be specified implicitly in the naming of the intrinsic function in an LL file.

Today, the types are limited to integer, floating point, and pointer types. With a goal of supporting symbolic targets for patchpoints and statepoints (out of tree GC support), I would like to extend this mechanism to handle other types. In particular, I want to parametrize the intrinsics by function types.

I'm looking for feedback on what mangling scheme do we want to use? In particular, how do we make it play well with type and function renaming in the IR? I'm utterly unfamiliar with this code, so I'm hoping for feedback from folks who may understand the current scheme, it's motivations, and usage.

I've implemented a straw man proposal, but I suspect we'll want to find something better.

Note: The interesting implementation is in Function.cpp. The test case is useful for understanding the context and the mangling scheme. I've included a change to the patchpoint intrinsic to highlight the intent, but that change is incomplete and currently breaks every other patchpoint test.

Diff Detail

Repository
rL LLVM

Event Timeline

reames updated this revision to Diff 11721.Jul 21 2014, 3:32 PM
reames retitled this revision from to Mangling for intrinsic names w/function type parameters.
reames updated this object.
reames edited the test plan for this revision. (Show Details)
reames added reviewers: atrick, ributzka, nlewycky, chandlerc.
reames added a subscriber: Unknown Object (MLST).
atrick edited edge metadata.Jul 21 2014, 6:06 PM

I might be missing something. But I think that LLVM should have a small finite number of intrinsics for patchpoint, and the frontend (JIT client) should be able to pass it a symbolic target with any signature. In other words, we should always treat the target arguments as varargs from the intrinsic's point of view. That doesn't change the actual calling convention used.

lhames edited edge metadata.Aug 7 2014, 11:11 AM

Sorry for the delayed reply. If I understand this correctly, what you're proposing is just a generalization of the existing name-mangling scheme to include function-pointer types?

I understand this from a consistency stand-point - we already parameterize intrinsics like llvm.memcpy this way. On the other hand, it's not clear to me what it would actually buy us. Why not just stick to bit-casting the function pointer to i8*? I think that's actually just as type-safe in the end?

atrick added a comment.Oct 9 2014, 2:02 PM

This is my current thinking on this patch:

We do want to allow defining a single intrinsic in tablegen that can support any return type. Having a single intrinsic opcode will cleanup the code.

The only issue with the approach taken in this patch is that it mangles the entire signature of the target function into the patchpoint symbol. This seems unnecessary to me. I would prefer to see simple declarations for the intrinsics and use a bitcast when needed to pass the target function as i8*. I'm not aware of any use for the complex manglings, but we can debate that in the larger statepoint infrastructure review.

I believe Juergen has a patch that takes a simpler approach and I've asked him to post it.

ributzka accepted this revision.Oct 9 2014, 2:52 PM
ributzka edited edge metadata.

Just a few coding style nitpicks. Otherwise LGTM.

lib/IR/Function.cpp
427–433 ↗(On Diff #11721)

No {} needed

451–453 ↗(On Diff #11721)

No {} needed

460 ↗(On Diff #11721)

assert or llvm_unreachable?

683 ↗(On Diff #11721)

isVarArg -> IsVarArg

694 ↗(On Diff #11721)

nullptr ;-)

744 ↗(On Diff #11721)

ditto

764–766 ↗(On Diff #11721)

No {} needed

769–771 ↗(On Diff #11721)

ditto

This revision is now accepted and ready to land.Oct 9 2014, 2:52 PM
reames added a comment.Oct 9 2014, 3:33 PM

Thanks for the code comments. I'm going to hold off fixing these until
we have the overall direction settled.

Philip

reames updated this revision to Diff 15883.Nov 6 2014, 11:15 AM
reames edited edge metadata.

Juergen & Andy, I believe this change is good to go, but given it's been a while, I just wanted to confirm. Are we all comfortable with the mangling support? (I've included a cleaned up version of the patch.)

Juergen - I left the {} around the single statement if since they're multi line.

atrick added a comment.Nov 6 2014, 2:05 PM

Ok with me, as long as Juergen and Lang don’t see a problem. I don’t think this changes symbols for any existing intrinsics does it?

-Andy

Yes, please commit.

reames closed this revision.Nov 11 2014, 4:13 PM
reames updated this revision to Diff 16069.

Closed by commit rL221742 (authored by @reames).

pcc added a subscriber: pcc.Apr 7 2016, 2:03 PM
pcc added inline comments.
llvm/trunk/lib/IR/Function.cpp
470

Sorry for the late review, but this mangling doesn't seem sound to me. It may fail in the presence of IR linking.

merge1.ll

%mystruct = type { i16, i16, [1 x i8*] }

declare <16 x %mystruct*> @llvm.masked.load.v16p0mystruct(<16 x %mystruct*>*, i32, <16 x i1>, <16 x %mystruct*>)

define void @foo() {
  %x = call <16 x %mystruct*> @llvm.masked.load.v16p0mystruct(<16 x %mystruct*>* undef, i32 undef, <16 x i1> undef, <16 x %mystruct*> undef)
  ret void
}

merge2.ll

%mystruct = type { i32, i16, [1 x i8*] }

declare <16 x %mystruct*> @llvm.masked.load.v16p0mystruct(<16 x %mystruct*>*, i32, <16 x i1>, <16 x %mystruct*>)

define void @bar() {
  %x = call <16 x %mystruct*> @llvm.masked.load.v16p0mystruct(<16 x %mystruct*>* undef, i32 undef, <16 x i1> undef, <16 x %mystruct*> undef)
  ret void
}
$ ra/bin/llvm-link -o merge.bc merge1.ll merge2.ll
Intrinsic name not mangled correctly for type arguments! Should be: llvm.masked.load.v16p0mystruct.0
<16 x %mystruct.0*> (<16 x %mystruct.0*>*, i32, <16 x i1>, <16 x %mystruct.0*>)* @llvm.masked.load.v16p0mystruct
ra/bin/llvm-link: merge2.ll: error: input module is broken!