This is an archive of the discontinued LLVM Phabricator instance.

Mangling for intrinsic names w/function type parameters
Closed, Public

Authored by reames on Jul 21 2014, 3:32 PM.

Details

Reviewers

chandlerc
atrick
nlewycky
ributzka
lhames

Commits

rG319c48eb2d03: Extend intrinsic name mangling to support arrays, named structs, and function…
rL221742: Extend intrinsic name mangling to support arrays, named structs, and function…

Summary

This diff is not intended to be submitted in its current form; it is primarily meant to spark discussion.

Currently, we have a type parameter mechanism for intrinsics. Rather than having to specify a separate intrinsic for each combination of argument and return types, we can specify a single intrinsic with one or more type parameters. These type parameters are passed explicitly to Intrinsic::getDeclaration or can be specified implicitly in the naming of the intrinsic function in an LL file.

Today, the types are limited to integer, floating point, and pointer types. With a goal of supporting symbolic targets for patchpoints and statepoints (out of tree GC support), I would like to extend this mechanism to handle other types. In particular, I want to parametrize the intrinsics by function types.

I'm looking for feedback on which mangling scheme we want to use. In particular, how do we make it play well with type and function renaming in the IR? I'm utterly unfamiliar with this code, so I'm hoping for feedback from folks who understand the current scheme, its motivations, and its usage.

I've implemented a straw man proposal, but I suspect we'll want to find something better.

Note: The interesting implementation is in Function.cpp. The test case is useful for understanding the context and the mangling scheme. I've included a change to the patchpoint intrinsic to highlight the intent, but that change is incomplete and currently breaks every other patchpoint test.
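To make the straw-man scheme easier to follow without reading the diff, here is a minimal standalone sketch of the same recursive rules. It is only a toy model: ToyType, its helper constructors, toyMangle, and toyIntrinsicName are hypothetical names invented for illustration and are not part of LLVM. Scalar types are assumed to carry their own spelling (e.g. "i32") in place of the EVT string.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy type model; this is NOT llvm::Type.
struct ToyType {
  enum Kind { Scalar, Pointer, Array, NamedStruct, Function };
  Kind kind = Scalar;
  std::string name;              // scalar spelling ("i32") or struct name
  unsigned number = 0;           // address space (Pointer) or length (Array)
  std::vector<ToyType> children; // pointee, element, or {return, params...}
  bool isVarArg = false;         // Function only
};

ToyType scalar(const std::string &spelling) {
  ToyType t;
  t.name = spelling;
  return t;
}

ToyType pointerTo(const ToyType &pointee, unsigned addrSpace = 0) {
  ToyType t;
  t.kind = ToyType::Pointer;
  t.number = addrSpace;
  t.children.push_back(pointee);
  return t;
}

ToyType arrayOf(unsigned count, const ToyType &element) {
  ToyType t;
  t.kind = ToyType::Array;
  t.number = count;
  t.children.push_back(element);
  return t;
}

ToyType namedStruct(const std::string &name) {
  ToyType t;
  t.kind = ToyType::NamedStruct;
  t.name = name;
  return t;
}

ToyType functionType(const ToyType &ret, const std::vector<ToyType> &params,
                     bool isVarArg) {
  ToyType t;
  t.kind = ToyType::Function;
  t.children.push_back(ret);
  t.children.insert(t.children.end(), params.begin(), params.end());
  t.isVarArg = isVarArg;
  return t;
}

// Mirrors the shape of the proposed mangling: "p<addrspace><pointee>",
// "a<count><element>", the bare name for named structs, and
// "f_<ret><params...>[vararg]f" for function types.
std::string toyMangle(const ToyType &t) {
  switch (t.kind) {
  case ToyType::Pointer:
    assert(t.children.size() == 1 && "pointer needs exactly one pointee");
    return "p" + std::to_string(t.number) + toyMangle(t.children[0]);
  case ToyType::Array:
    assert(t.children.size() == 1 && "array needs exactly one element type");
    return "a" + std::to_string(t.number) + toyMangle(t.children[0]);
  case ToyType::NamedStruct:
    return t.name;
  case ToyType::Function: {
    assert(!t.children.empty() && "function needs a return type");
    std::string result = "f_" + toyMangle(t.children[0]);
    for (std::size_t i = 1; i < t.children.size(); ++i)
      result += toyMangle(t.children[i]);
    if (t.isVarArg)
      result += "vararg";
    return result + "f"; // trailing "f" closes the function type
  }
  case ToyType::Scalar:
    return t.name;
  }
  return std::string();
}

// Appends one ".<mangled type>" suffix per type parameter to the base name.
std::string toyIntrinsicName(const std::string &base,
                             const std::vector<ToyType> &tys) {
  std::string result = base;
  for (const ToyType &t : tys)
    result += "." + toyMangle(t);
  return result;
}
```

Under this sketch, a pointer to a function type i32 (i8*, ...) mangles to p0f_i32p0i8varargf, and pointer and integer parameters keep the familiar form seen in names such as llvm.memcpy.p0i8.p0i8.i64.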

Diff Detail

Repository: rL LLVM

Event Timeline

reames updated this revision to Diff 11721. Jul 21 2014, 3:32 PM

reames retitled this revision to "Mangling for intrinsic names w/function type parameters".

reames updated this object.

reames edited the test plan for this revision.

reames added reviewers: atrick, ributzka, nlewycky, chandlerc.

reames added a subscriber: Unknown Object (MLST).

I might be missing something. But I think that LLVM should have a small finite number of intrinsics for patchpoint, and the frontend (JIT client) should be able to pass it a symbolic target with any signature. In other words, we should always treat the target arguments as varargs from the intrinsic's point of view. That doesn't change the actual calling convention used.

atrick added a reviewer: lhames. Jul 22 2014, 11:44 AM

Sorry for the delayed reply. If I understand this correctly, what you're proposing is just a generalization of the existing name-mangling scheme to include function-pointer types?

I understand this from a consistency standpoint: we already parameterize intrinsics like llvm.memcpy this way. On the other hand, it's not clear to me what it would actually buy us. Why not just stick to bit-casting the function pointer to i8*? I think that's actually just as type-safe in the end?

This is my current thinking on this patch:

We do want to allow defining a single intrinsic in tablegen that can support any return type. Having a single intrinsic opcode will clean up the code.

The only issue with the approach taken in this patch is that it mangles the entire signature of the target function into the patchpoint symbol. This seems unnecessary to me. I would prefer to see simple declarations for the intrinsics and use a bitcast when needed to pass the target function as i8*. I'm not aware of any use for the complex manglings, but we can debate that in the larger statepoint infrastructure review.

I believe Juergen has a patch that takes a simpler approach and I've asked him to post it.

Just a few coding style nitpicks. Otherwise LGTM.

lib/IR/Function.cpp
427–433 (On Diff #11721): No {} needed
451–453 (On Diff #11721): No {} needed
460 (On Diff #11721): assert or llvm_unreachable?
683 (On Diff #11721): isVarArg -> IsVarArg
694 (On Diff #11721): nullptr ;-)
744 (On Diff #11721): ditto
764–766 (On Diff #11721): No {} needed
769–771 (On Diff #11721): ditto

This revision is now accepted and ready to land. Oct 9 2014, 2:52 PM

Thanks for the code comments. I'm going to hold off fixing these until
we have the overall direction settled.

Philip

Juergen & Andy, I believe this change is good to go, but given it's been a while, I just wanted to confirm. Are we all comfortable with the mangling support? (I've included a cleaned up version of the patch.)

Juergen - I left the {} around the single-statement ifs, since those statements span multiple lines.

Ok with me, as long as Juergen and Lang don’t see a problem. I don’t think this changes symbols for any existing intrinsics, does it?

-Andy

Yes, please commit.

Closed by commit rL221742 (authored by @reames).

pcc added a subscriber: pcc. Apr 7 2016, 2:03 PM

pcc added inline comments.

llvm/trunk/lib/IR/Function.cpp, line 470

Sorry for the late review, but this mangling doesn't seem sound to me. It may fail in the presence of IR linking.

merge1.ll

%mystruct = type { i16, i16, [1 x i8*] }

declare <16 x %mystruct*> @llvm.masked.load.v16p0mystruct(<16 x %mystruct*>*, i32, <16 x i1>, <16 x %mystruct*>)

define void @foo() {
  %x = call <16 x %mystruct*> @llvm.masked.load.v16p0mystruct(<16 x %mystruct*>* undef, i32 undef, <16 x i1> undef, <16 x %mystruct*> undef)
  ret void
}

merge2.ll

%mystruct = type { i32, i16, [1 x i8*] }

declare <16 x %mystruct*> @llvm.masked.load.v16p0mystruct(<16 x %mystruct*>*, i32, <16 x i1>, <16 x %mystruct*>)

define void @bar() {
  %x = call <16 x %mystruct*> @llvm.masked.load.v16p0mystruct(<16 x %mystruct*>* undef, i32 undef, <16 x i1> undef, <16 x %mystruct*> undef)
  ret void
}

$ ra/bin/llvm-link -o merge.bc merge1.ll merge2.ll
Intrinsic name not mangled correctly for type arguments! Should be: llvm.masked.load.v16p0mystruct.0
<16 x %mystruct.0*> (<16 x %mystruct.0*>*, i32, <16 x i1>, <16 x %mystruct.0*>)* @llvm.masked.load.v16p0mystruct
ra/bin/llvm-link: merge2.ll: error: input module is broken!
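The failure above can be modelled in a few lines. The sketch below is a toy, not LLVM code: ToyStruct, ToyTypeTable, toySuffix, and toyNameStillValid are hypothetical names, and the renaming rule (append ".N" when a struct name is already taken by a different body) is an assumption inferred from the %mystruct.0 seen in the llvm-link output. It shows why a suffix derived from the struct name goes stale once the linker renames the type.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical model of a named struct: a name plus the text of its body.
struct ToyStruct {
  std::string name;
  std::string body;
};

// Suffix for <16 x %S*> in address space 0, derived from the struct's name.
std::string toySuffix(const ToyStruct &s) { return "v16p0" + s.name; }

// Toy stand-in for the destination module's type table during linking.
// Assumption: a name collision with a different body is resolved by
// appending ".N" to the incoming type's name.
class ToyTypeTable {
  std::map<std::string, std::string> bodies_;

public:
  ToyStruct add(const ToyStruct &s) {
    auto it = bodies_.find(s.name);
    if (it == bodies_.end() || it->second == s.body) {
      bodies_[s.name] = s.body;
      return s; // no conflict: the name is kept
    }
    for (unsigned n = 0;; ++n) {
      std::string candidate = s.name + "." + std::to_string(n);
      auto c = bodies_.find(candidate);
      if (c == bodies_.end() || c->second == s.body) {
        bodies_[candidate] = s.body;
        return ToyStruct{candidate, s.body}; // renamed copy
      }
    }
  }
};

// A verifier-style check: does the declared intrinsic name still match the
// name that would be recomputed from the (possibly renamed) type?
bool toyNameStillValid(const std::string &declared, const std::string &base,
                       const ToyStruct &typeAfterLink) {
  assert(!base.empty() && "base intrinsic name must not be empty");
  return declared == base + "." + toySuffix(typeAfterLink);
}
```

In this model, adding the { i16, i16, [1 x i8*] } version of mystruct first and the { i32, i16, [1 x i8*] } version second renames the latter to mystruct.0. The declaration from merge2.ll still carries the suffix v16p0mystruct, while the recomputed name ends in v16p0mystruct.0, which is the mismatch reported in the error message.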

Revision Contents

Path: llvm/trunk/lib/IR/Function.cpp
Size: 34 lines

Diff 16069

llvm/trunk/lib/IR/Function.cpp

 #define GET_FUNCTION_RECOGNIZER
 #include "llvm/IR/Intrinsics.gen"
 #undef GET_FUNCTION_RECOGNIZER

   return 0;
 }

+/// Returns a stable mangling for the type specified for use in the name
+/// mangling scheme used by 'any' types in intrinsic signatures.
+static std::string getMangledTypeStr(Type* Ty) {
+  std::string Result;
+  if (PointerType* PTyp = dyn_cast<PointerType>(Ty)) {
+    Result += "p" + llvm::utostr(PTyp->getAddressSpace()) +
+      getMangledTypeStr(PTyp->getElementType());
+  } else if (ArrayType* ATyp = dyn_cast<ArrayType>(Ty)) {
+    Result += "a" + llvm::utostr(ATyp->getNumElements()) +
+      getMangledTypeStr(ATyp->getElementType());
+  } else if (StructType* STyp = dyn_cast<StructType>(Ty)) {
+    if (!STyp->isLiteral())
+      Result += STyp->getName();
+    else
+      llvm_unreachable("TODO: implement literal types");
+  } else if (FunctionType* FT = dyn_cast<FunctionType>(Ty)) {
+    Result += "f_" + getMangledTypeStr(FT->getReturnType());
+    for (size_t i = 0; i < FT->getNumParams(); i++)
+      Result += getMangledTypeStr(FT->getParamType(i));
+    if (FT->isVarArg())
+      Result += "vararg";
+    Result += "f"; //ensure distinguishable
+  } else if (Ty)
+    Result += EVT::getEVT(Ty).getEVTString();
+  return Result;
+}
+
 std::string Intrinsic::getName(ID id, ArrayRef<Type*> Tys) {
   assert(id < num_intrinsics && "Invalid intrinsic ID!");
   static const char * const Table[] = {
     "not_intrinsic",
 #define GET_INTRINSIC_NAME_TABLE
 #include "llvm/IR/Intrinsics.gen"
 #undef GET_INTRINSIC_NAME_TABLE
   };
   if (Tys.empty())
     return Table[id];
   std::string Result(Table[id]);
   for (unsigned i = 0; i < Tys.size(); ++i) {
-    if (PointerType* PTyp = dyn_cast<PointerType>(Tys[i])) {
-      Result += ".p" + llvm::utostr(PTyp->getAddressSpace()) +
-                EVT::getEVT(PTyp->getElementType()).getEVTString();
-    }
-    else if (Tys[i])
-      Result += "." + EVT::getEVT(Tys[i]).getEVTString();
+    Result += "." + getMangledTypeStr(Tys[i]);
   }
   return Result;
 }


 /// IIT_Info - These are enumerators that describe the entries returned by the
 /// getIntrinsicInfoTableEntries function.
 ///