This is an archive of the discontinued LLVM Phabricator instance.


Differential D128839

[DirectX backend] Add createHandle BufferLoad/Store DXIL operation
Needs Review · Public

Authored by python3kgae on Jun 29 2022, 10:32 AM.


Details

Reviewers

MaskRay
tstellar
pete
jdoerfert
sheredom
antiagainst
nhaehnle
rnk
beanz
pow2clk
bogner
kuhar
nikic

Summary

New DXIL operations for createHandle and BufferLoad/Store are added.
A new helper class, DXILOpBuilder, is added to create DXIL op function calls.

The TableGen backend for DXILOperation will create a table of DXIL op function parameter types.
When a DXIL op function is created, these parameter types are used to build its function type.
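
To illustrate the table-driven approach the summary describes, here is a minimal standalone sketch in plain C++ with no LLVM headers. The names `ParamKind`, `OpParamTable`, and `buildSignature` are hypothetical and are not the patch's actual data structures; the sketch only shows how a per-op table of parameter types could be turned into a function declaration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical parameter kinds; the real TableGen output may differ.
enum class ParamKind { I1, I8, I32, Handle };

// One row of a hypothetical per-op table: entry 0 is the return type,
// entry 1 is the i32 DXIL opcode, the rest are the op's own arguments.
struct OpParamTable {
  std::string FnName;
  std::vector<ParamKind> Params;
};

static std::string typeName(ParamKind K) {
  switch (K) {
  case ParamKind::I1:     return "i1";
  case ParamKind::I8:     return "i8";
  case ParamKind::I32:    return "i32";
  case ParamKind::Handle: return "%dx.types.Handle";
  }
  return "";
}

// Render the declaration implied by a table row.
std::string buildSignature(const OpParamTable &Op) {
  assert(!Op.Params.empty() && "row must at least name a return type");
  std::string Sig =
      "declare " + typeName(Op.Params[0]) + " @" + Op.FnName + "(";
  for (std::size_t I = 1; I < Op.Params.size(); ++I) {
    if (I > 1)
      Sig += ", ";
    Sig += typeName(Op.Params[I]);
  }
  return Sig + ")";
}
```

Feeding it a row for createHandle (handle return type, then i32 opcode, i8, i32, i32, i1) yields the same declaration that the test in resources.ll checks for.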

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

python3kgae created this revision. Jun 29 2022, 10:32 AM

Herald added a project: Restricted Project. Jun 29 2022, 10:32 AM

Herald added subscribers: StephenFan, hiraditya.

python3kgae requested review of this revision. Jun 29 2022, 10:32 AM

Herald added a project: Restricted Project. Jun 29 2022, 10:32 AM

Herald added a subscriber: llvm-commits.

Harbormaster completed remote builds in B172816: Diff 441076. Jun 29 2022, 10:33 AM

python3kgae added a parent revision: D127990: [DirectX] add thread/group id DXIL operations. Jun 29 2022, 10:33 AM

beanz added inline comments. Jun 29 2022, 10:55 AM

llvm/include/llvm/IR/IntrinsicsDXIL.td
22	Do you have a plan for taking LLVM load instructions and converting them to these intrinsics? I think we need to think about how we want to translate LLVM gep/load/store instructions into DXIL ops, and I don't think we should add these intrinsics until we know what that is going to look like.

python3kgae added inline comments. Jun 29 2022, 11:30 AM

llvm/include/llvm/IR/IntrinsicsDXIL.td
22	These intrinsics are meant to shorten the distance from HLSL to DXIL. They are just wrappers for DXIL operation functions, so that generating DXIL is easier. I experimented with generating DXIL directly from GEP/load/store and found that creating intrinsics might help the translation.

beanz added inline comments. Jun 29 2022, 11:32 AM

llvm/include/llvm/IR/IntrinsicsDXIL.td
22	I still don't know that these are the _right_ intrinsics. How are we going to map GEP/load/store to these intrinsics?

nikic resigned from this revision. Jun 30 2022, 12:32 AM

python3kgae added inline comments. Jun 30 2022, 4:51 AM

llvm/include/llvm/IR/IntrinsicsDXIL.td
22	It will not be a simple map; we'll need a pass to translate GEP/load/store to these intrinsics. These intrinsics make that pass easier to write and leave details such as the DXIL opcode and DXIL struct types to the DXILOpLowering pass. Maybe we can allow GEP/load/store in final DXIL for a future DXIL version, but for generating early versions of DXIL these intrinsics will be helpful.

beanz added inline comments. Jun 30 2022, 6:24 AM

llvm/include/llvm/IR/IntrinsicsDXIL.td
22	I didn't mean to imply it would be a simple map (as in map data structure), but it is a mapping operation. GEPs get folded in with loads and stores to form load and store DXIL Ops. Clang will generate GEPs, loads, and stores through known handle pointers. Unlike in DXC we won't map those to "high level" intrinsics during codegen, instead we'll emit the GEPs, loads and stores. That will allow LLVM's optimization passes (like SROA) to run without needing to be taught about all of the special intrinsics for HLSL. If the input to our backend is expected to be GEPs, loads and stores, I fail to see why we would translate those to an intrinsic that has an identical signature to the DXIL Op (minus the opcode) instead of just translating it to the DXIL Op directly.
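
The folding described in this comment can be pictured with a small standalone model. This is only a sketch under stated assumptions: `Gep`, `Load`, `BufferLoadOp`, and `foldLoad` are hypothetical toy types and are not LLVM classes, and a real pass would also have to handle multi-index GEPs, stores, and pointers that do not come from a known handle. The opcode value 68 is the one DXIL.td in this revision assigns to BufferLoad.

```cpp
#include <cstdint>

// Toy stand-ins for IR values (hypothetical, not LLVM's classes).
struct Gep {
  std::uint32_t HandleId;     // which resource handle the pointer refers to
  std::uint32_t ElementIndex; // element offset computed by the GEP
};

struct Load {
  const Gep *Address; // a load through a GEP'd handle pointer, or null
};

// Toy DXIL buffer load: opcode, resource handle, element index.
struct BufferLoadOp {
  std::uint32_t OpCode;
  std::uint32_t HandleId;
  std::uint32_t Index;
};

constexpr std::uint32_t BufferLoadOpCode = 68;

// Fold "load (gep handle, index)" into a single buffer-load op.
// Returns false and leaves Out untouched when there is no GEP to fold.
bool foldLoad(const Load &L, BufferLoadOp &Out) {
  if (!L.Address)
    return false;
  Out = BufferLoadOp{BufferLoadOpCode, L.Address->HandleId,
                     L.Address->ElementIndex};
  return true;
}
```

The point of the model is that the GEP and the load disappear together and only the DXIL op remains, which is why an intermediate intrinsic with the same signature adds little.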

Remove intrinsic, add DXILOpBuilder to create DXIL op function calls.

Herald added subscribers: nlopes, mgorny. Jul 2 2022, 1:45 PM

python3kgae edited the summary of this revision. Jul 2 2022, 1:45 PM

Harbormaster completed remote builds in B173409: Diff 441904. Jul 2 2022, 2:27 PM

Please avoid using UndefValue::get whenever possible as we are trying to get rid of undef. Please use PoisonValue. Thank you!

Change UndefValue to PoisonValue.

Harbormaster completed remote builds in B173425: Diff 441921. Jul 2 2022, 9:16 PM

nhaehnle added inline comments. Jul 4 2022, 5:26 AM

llvm/include/llvm/IR/IntrinsicsDXIL.td
22	FWIW, we have a similar issue in LLPC, our SPIR-V-to-AMDGPU-backend shader compiler. The backend has a family of `llvm.amdgcn.buffer.load/store` intrinsics that take a buffer descriptor and offset arguments. We generate those from load/store/atomic/gep on a "fat pointer" address space in this pass: https://github.com/GPUOpen-Drivers/llpc/blob/dev/lgc/patch/PatchBufferOp.cpp I don't really have more thoughts on the issue right now, but I believe it is a very similar problem and so a future exchange of thoughts may well be helpful. For example, it is not clear to me what the "correct" place for lowering these loads and stores is. What AMDGPU and LLPC do has evolved historically. I'd say it's fairly reasonable; we did learn that pushing the lowering later is helpful in our case, but pushing it all the way to a MIR pass (which the DXIL backend doesn't use anyway) would have been a painful amount of work.

This change adds a bunch of functionality with no tests. How are the new dxil ops generated?

I feel like you have two changes going in here at once. One is a refactor that is probably NFC, and the other is adding some operations that aren't used anywhere. These should be two separate changes, and we shouldn't add the operations until we have codegen in place so that we know how we're going to generate them.

kuhar resigned from this revision. Jun 11 2023, 1:15 PM

Revision Contents

Path                                          Size

llvm/include/llvm/IR/IntrinsicsDXIL.td        8 lines
llvm/lib/Target/DirectX/DXIL.td               44 lines
llvm/lib/Target/DirectX/DXILOpLowering.cpp    240 lines
llvm/test/CodeGen/DirectX/resources.ll        44 lines
llvm/utils/TableGen/DXILEmitter.cpp           39 lines

Diff 441076

llvm/include/llvm/IR/IntrinsicsDXIL.td

	Show All 11 Lines

	let TargetPrefix = "dxil" in {			let TargetPrefix = "dxil" in {

	def int_dxil_thread_id : Intrinsic<[llvm_i32_ty], [llvm_i32_ty], [IntrNoMem, IntrWillReturn]>;			def int_dxil_thread_id : Intrinsic<[llvm_i32_ty], [llvm_i32_ty], [IntrNoMem, IntrWillReturn]>;
	def int_dxil_group_id : Intrinsic<[llvm_i32_ty], [llvm_i32_ty], [IntrNoMem, IntrWillReturn]>;			def int_dxil_group_id : Intrinsic<[llvm_i32_ty], [llvm_i32_ty], [IntrNoMem, IntrWillReturn]>;
	def int_dxil_thread_id_in_group : Intrinsic<[llvm_i32_ty], [llvm_i32_ty], [IntrNoMem, IntrWillReturn]>;			def int_dxil_thread_id_in_group : Intrinsic<[llvm_i32_ty], [llvm_i32_ty], [IntrNoMem, IntrWillReturn]>;
	def int_dxil_flattened_thread_id_in_group : Intrinsic<[llvm_i32_ty], [], [IntrNoMem, IntrWillReturn]>;			def int_dxil_flattened_thread_id_in_group : Intrinsic<[llvm_i32_ty], [], [IntrNoMem, IntrWillReturn]>;

				def int_dxil_create_handle : Intrinsic<[ llvm_i64_ty ], [llvm_i8_ty, llvm_i32_ty, llvm_i32_ty, llvm_i1_ty], [IntrNoMem, IntrWillReturn]>;

				def int_dxil_buffer_load : Intrinsic<[ llvm_any_ty, LLVMMatchType<0>, LLVMMatchType<0>, LLVMMatchType<0>, llvm_i32_ty ],
				[ llvm_i64_ty, llvm_i32_ty, llvm_i32_ty], [IntrReadMem, IntrWillReturn]>;
				def int_dxil_buffer_store : Intrinsic<[ ], [ llvm_i64_ty, llvm_i32_ty, llvm_i32_ty, llvm_any_ty, LLVMMatchType<0>, LLVMMatchType<0>, LLVMMatchType<0>, llvm_i8_ty],
				[ IntrWriteMem, IntrWillReturn]>;


	}			}

llvm/lib/Target/DirectX/DXIL.td

	Show All 20 Lines
	}			}

	def Unary : dxil_class<"Unary">;			def Unary : dxil_class<"Unary">;
	def Binary : dxil_class<"Binary">;			def Binary : dxil_class<"Binary">;
	def FlattenedThreadIdInGroupClass : dxil_class<"FlattenedThreadIdInGroup">;			def FlattenedThreadIdInGroupClass : dxil_class<"FlattenedThreadIdInGroup">;
	def ThreadIdInGroupClass : dxil_class<"ThreadIdInGroup">;			def ThreadIdInGroupClass : dxil_class<"ThreadIdInGroup">;
	def ThreadIdClass : dxil_class<"ThreadId">;			def ThreadIdClass : dxil_class<"ThreadId">;
	def GroupIdClass : dxil_class<"GroupId">;			def GroupIdClass : dxil_class<"GroupId">;
				def BufferLoadClass : dxil_class<"BufferLoad">;
				def BufferStoreClass : dxil_class<"BufferStore">;
				def CreateHandleClass : dxil_class<"CreateHandle">;

	def binary_uint : dxil_category<"Binary uint">;			def binary_uint : dxil_category<"Binary uint">;
	def unary_float : dxil_category<"Unary float">;			def unary_float : dxil_category<"Unary float">;
	def ComputeID : dxil_category<"Compute/Mesh/Amplification shader">;			def ComputeID : dxil_category<"Compute/Mesh/Amplification shader">;
				def Resources : dxil_category<"Resources">;

	// The parameter description for a DXIL instruction			// The parameter description for a DXIL instruction
	class dxil_param<int _pos, string type, string _name, string _doc,			class dxil_param<int _pos, string type, string _name, string _doc,
	bit _is_const = 0, string _enum_name = "",			bit _is_const = 0, string _enum_name = "",
	int _max_value = 0> {			int _max_value = 0> {
	int pos = _pos; // position in parameter list			int pos = _pos; // position in parameter list
	string llvm_type = type; // llvm type name, $o for overload, $r for resource			string llvm_type = type; // llvm type name, $o for overload, $r for resource
	// type, $cb for legacy cbuffer, $u4 for u4 struct			// type, $cb for legacy cbuffer, $u4 for u4 struct
	▲ Show 20 Lines • Show All 95 Lines • ▼ Show 20 Lines

	def FlattenedThreadIdInGroup :dxil_op< "FlattenedThreadIdInGroup", 96, FlattenedThreadIdInGroupClass, ComputeID,			def FlattenedThreadIdInGroup :dxil_op< "FlattenedThreadIdInGroup", 96, FlattenedThreadIdInGroupClass, ComputeID,
	"provides a flattened index for a given thread within a given group (SV_GroupIndex)", "i32;", "rn",			"provides a flattened index for a given thread within a given group (SV_GroupIndex)", "i32;", "rn",
	[			[
	dxil_param<0, "i32", "", "result">,			dxil_param<0, "i32", "", "result">,
	dxil_param<1, "i32", "opcode", "DXIL opcode">			dxil_param<1, "i32", "opcode", "DXIL opcode">
	]>,			]>,
	dxil_map_intrinsic<int_dxil_flattened_thread_id_in_group>;			dxil_map_intrinsic<int_dxil_flattened_thread_id_in_group>;

				def BufferLoad : dxil_op< "BufferLoad", 68, BufferLoadClass,Resources, "reads from a TypedBuffer", "half;float;i16;i32;", "ro",
				[
				dxil_param<0, "dx.types.ResRet", "", "the loaded value">,
				dxil_param<1, "i32", "opcode", "DXIL opcode">,
				dxil_param<2, "dx.types.Handle", "srv", "handle of TypedBuffer SRV to sample">,
				dxil_param<3, "i32", "index", "element index">,
				dxil_param<4, "i32", "wot", "coordinate">
				],
				["tex_load"]>,
				dxil_map_intrinsic<int_dxil_buffer_load>;

				def BufferStore : dxil_op< "BufferStore", 69, BufferStoreClass, Resources, "writes to a RWTypedBuffer", "half;float;i16;i32;", "",
				[
				dxil_param<0, "v", "", "">,
				dxil_param<1, "i32", "opcode", "DXIL opcode">,
				dxil_param<2, "dx.types.Handle", "uav", "handle of UAV to store to">,
				dxil_param<3, "i32", "coord0", "coordinate in elements">,
				dxil_param<4, "i32", "coord1", "coordinate (unused?)">,
				dxil_param<5, "$o", "value0", "value">,
				dxil_param<6, "$o", "value1", "value">,
				dxil_param<7, "$o", "value2", "value">,
				dxil_param<8, "$o", "value3", "value">,
				dxil_param<9, "i8", "mask", "written value mask">
				],
				["tex_store"]>,
				dxil_map_intrinsic<int_dxil_buffer_store>;

				def CreateHandle : dxil_op< "CreateHandle", 57, CreateHandleClass, Resources, "creates the handle to a resource",
				"void;", "ro",
				[
				dxil_param<0, "dx.types.Handle", "", "the handle to the resource">,
				dxil_param<1, "i32", "opcode", "DXIL opcode">,
				dxil_param<2, "i8", "resourceClass", "the class of resource to create (SRV, UAV, CBuffer, Sampler)", 1>, // maps to DxilResourceBase::Class
				dxil_param<3, "i32", "rangeId", "range identifier for resource", 1>,
				dxil_param<4, "i32", "index", "zero-based index into range">,
				dxil_param<5, "i1", "nonUniformIndex", "non-uniform resource index", 1>
				]>,
				dxil_map_intrinsic<int_dxil_create_handle>;
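
Each `dxil_op` above lists its valid overload types as a semicolon-separated string such as "half;float;i16;i32;". The following is a minimal sketch of turning such a string into a bitmask of the kind that a check like `Prop->OverloadTys & Kind` could test, and of deriving a has-overload flag from it. The bit values, `OverloadInfo`, and `parseOverloadTypes` are assumptions made for illustration; the actual TableGen emitter may encode these differently.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical one-bit-per-kind encoding (assumed, not taken from the patch).
enum OverloadBit : std::uint16_t {
  Void = 1 << 0, Half = 1 << 1, Float = 1 << 2, Double = 1 << 3,
  I1 = 1 << 4,   I8 = 1 << 5,   I16 = 1 << 6,   I32 = 1 << 7,
  I64 = 1 << 8,
};

static std::uint16_t bitFor(const std::string &Name) {
  if (Name == "void")   return Void;
  if (Name == "half")   return Half;
  if (Name == "float")  return Float;
  if (Name == "double") return Double;
  if (Name == "i1")     return I1;
  if (Name == "i8")     return I8;
  if (Name == "i16")    return I16;
  if (Name == "i32")    return I32;
  if (Name == "i64")    return I64;
  return 0; // unknown names contribute no bits
}

struct OverloadInfo {
  std::uint16_t Mask = 0; // union of all valid overload kinds
  unsigned Count = 0;     // number of distinct recognised kinds
  // With a single valid type there is nothing to overload on.
  bool hasOverload() const { return Count > 1; }
};

OverloadInfo parseOverloadTypes(const std::string &Types) {
  OverloadInfo Info;
  std::istringstream In(Types);
  std::string Name;
  while (std::getline(In, Name, ';')) {
    std::uint16_t Bit = bitFor(Name);
    if (Bit == 0 || (Info.Mask & Bit))
      continue; // skip unknown or repeated names
    Info.Mask |= Bit;
    ++Info.Count;
  }
  return Info;
}
```

Under this encoding, BufferLoad's "half;float;i16;i32;" produces a four-bit mask with an overload, while CreateHandle's "void;" produces a single bit and no overload.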

llvm/lib/Target/DirectX/DXILOpLowering.cpp

struct OpCodeProperty {
DXIL::OpCode OpCode;		DXIL::OpCode OpCode;
// Offset in DXILOpCodeNameTable.		// Offset in DXILOpCodeNameTable.
unsigned OpCodeNameOffset;		unsigned OpCodeNameOffset;
DXIL::OpCodeClass OpCodeClass;		DXIL::OpCodeClass OpCodeClass;
// Offset in DXILOpCodeClassNameTable.		// Offset in DXILOpCodeClassNameTable.
unsigned OpCodeClassNameOffset;		unsigned OpCodeClassNameOffset;
uint16_t OverloadTys;		uint16_t OverloadTys;
llvm::Attribute::AttrKind FuncAttr;		llvm::Attribute::AttrKind FuncAttr;
		bool HasOverload; // If only has one type in OverloadTypes, HasOverload will
		// be false.
		int OverloadParamIndex; // parameter index which control the overload
		bool OverloadTypeInStruct; // The overload parameter type is struct, the
		// overload type is first field type of the struct
		// type. This happens for things like buffer load.
		SmallVector<std::pair<unsigned, StringRef>>
		StructParams; // Param which is struct type and need to mutate to DXIL
		// type.
};		};

// Include getOpCodeClassName getOpCodeProperty and getOpCodeName which		// Include getOpCodeClassName getOpCodeProperty and getOpCodeName which
// generated by tableGen.		// generated by tableGen.
#define DXIL_OP_OPERATION_TABLE		#define DXIL_OP_OPERATION_TABLE
#include "DXILOperation.inc"		#include "DXILOperation.inc"
#undef DXIL_OP_OPERATION_TABLE		#undef DXIL_OP_OPERATION_TABLE

static std::string constructOverloadName(OverloadKind Kind, Type *Ty,		static std::string constructOverloadName(OverloadKind Kind, Type *Ty,
const OpCodeProperty &Prop) {		const OpCodeProperty &Prop) {
if (Kind == OverloadKind::VOID) {		if (Kind == OverloadKind::VOID) {
return (Twine(DXILOpNamePrefix) + getOpCodeClassName(Prop)).str();		return (Twine(DXILOpNamePrefix) + getOpCodeClassName(Prop)).str();
}		}
return (Twine(DXILOpNamePrefix) + getOpCodeClassName(Prop) + "." +		return (Twine(DXILOpNamePrefix) + getOpCodeClassName(Prop) + "." +
getTypeName(Kind, Ty))		getTypeName(Kind, Ty))
.str();		.str();
}		}

		static Type *getOverloadType(const OpCodeProperty *Prop, FunctionType *FT) {
		if (!Prop->HasOverload) {
		auto &Ctx = FT->getContext();
		// When only has 1 overload type, just return it.
		switch (Prop->OverloadTys) {
		case OverloadKind::VOID:
		return Type::getVoidTy(Ctx);
		case OverloadKind::HALF:
		return Type::getHalfTy(Ctx);
		case OverloadKind::FLOAT:
		return Type::getFloatTy(Ctx);
		case OverloadKind::DOUBLE:
		return Type::getDoubleTy(Ctx);
		case OverloadKind::I1:
		return Type::getInt1Ty(Ctx);
		case OverloadKind::I8:
		return Type::getInt8Ty(Ctx);
		case OverloadKind::I16:
		return Type::getInt16Ty(Ctx);
		case OverloadKind::I32:
		return Type::getInt32Ty(Ctx);
		case OverloadKind::I64:
		return Type::getInt64Ty(Ctx);
		default:
		break;
		}
		}

		// Prop->OverloadParamIndex is 0, overload type is FT->getReturnType().
		Type *OverloadType = FT->getReturnType();
		if (Prop->OverloadParamIndex != 0) {
		// Skip Return Type and Type for DXIL opcode.
		OverloadType = FT->getParamType(Prop->OverloadParamIndex - 2);
		}

		if (Prop->OverloadTypeInStruct) {
		auto *ST = cast<StructType>(OverloadType);
		OverloadType = ST->getElementType(0);
		}
		return OverloadType;
		}

		static std::string constructOverloadTypeName(OverloadKind Kind,
		StringRef TypeName) {
		if (Kind == OverloadKind::VOID)
		return TypeName.str();

		assert(Kind < OverloadKind::UserDefineType && "invalid overload kind");
		return (Twine(TypeName) + getOverloadTypeName(Kind)).str();
		}

		static StructType *getOrCreateStructType(StringRef Name,
		ArrayRef<Type *> EltTys,
		LLVMContext &Ctx) {
		StructType *ST = StructType::getTypeByName(Ctx, Name);
		if (ST)
		return ST;

		return StructType::create(Ctx, EltTys, Name);
		}

		static StructType *getResRetType(Type *OverloadTy, LLVMContext &Ctx) {
		OverloadKind Kind = getOverloadKind(OverloadTy);
		std::string TypeName = constructOverloadTypeName(Kind, "dx.types.ResRet.");
		Type *FieldTypes[5] = {OverloadTy, OverloadTy, OverloadTy, OverloadTy,
		Type::getInt32Ty(Ctx)};
		return getOrCreateStructType(TypeName, FieldTypes, Ctx);
		}

		static StructType *getHandleType(LLVMContext &Ctx) {
		return getOrCreateStructType("dx.types.Handle", Type::getInt8PtrTy(Ctx), Ctx);
		}

		static StructType *getDXILStructType(StringRef Name, Type *CurTy,
		LLVMContext &Ctx) {
		StructType *ST = StructType::getTypeByName(Ctx, Name);
		if (ST)
		return ST;

		if (Name == "dx.types.Handle")
		return getHandleType(Ctx);

		if (Name == "dx.types.ResRet")
		return getResRetType(CurTy, Ctx);

		llvm_unreachable("invalid DXIL struct type");
		return nullptr;
		}

static FunctionCallee createDXILOpFunction(DXIL::OpCode DXILOp, Function &F,		static FunctionCallee createDXILOpFunction(DXIL::OpCode DXILOp, Function &F,
Module &M) {		Module &M) {
const OpCodeProperty *Prop = getOpCodeProperty(DXILOp);		const OpCodeProperty *Prop = getOpCodeProperty(DXILOp);

// Get return type as overload type for DXILOp.		FunctionType *FT = F.getFunctionType();
// Only simple mapping case here, so return type is good enough.		Type *OverloadTy = getOverloadType(Prop, FT);
Type *OverloadTy = F.getReturnType();

OverloadKind Kind = getOverloadKind(OverloadTy);		OverloadKind Kind = getOverloadKind(OverloadTy);
// FIXME: find the issue and report error in clang instead of check it in		// FIXME: find the issue and report error in clang instead of check it in
// backend.		// backend.
if ((Prop->OverloadTys & (uint16_t)Kind) == 0) {		if ((Prop->OverloadTys & (uint16_t)Kind) == 0) {
llvm_unreachable("invalid overload");		llvm_unreachable("invalid overload");
}		}

std::string FnName = constructOverloadName(Kind, OverloadTy, *Prop);		std::string FnName = constructOverloadName(Kind, OverloadTy, *Prop);
assert(!M.getFunction(FnName) && "Function already exists");		assert(!M.getFunction(FnName) && "Function already exists");

auto &Ctx = M.getContext();		auto &Ctx = M.getContext();
Type *OpCodeTy = Type::getInt32Ty(Ctx);		Type *OpCodeTy = Type::getInt32Ty(Ctx);

		auto StructParamIt = Prop->StructParams.begin();
		auto StructParamEnd = Prop->StructParams.end();
		auto *RetTy = FT->getReturnType();
		// Change struct type to DXIL type.
		if (StructParamIt != StructParamEnd && StructParamIt->first == 0) {
		RetTy = getDXILStructType(StructParamIt->second, OverloadTy, Ctx);
		++StructParamIt;
		}

SmallVector<Type *> ArgTypes;		SmallVector<Type *> ArgTypes;
// DXIL has i32 opcode as first arg.		// DXIL has i32 opcode as first arg.
ArgTypes.emplace_back(OpCodeTy);		ArgTypes.emplace_back(OpCodeTy);
FunctionType *FT = F.getFunctionType();
ArgTypes.append(FT->param_begin(), FT->param_end());		for (unsigned I = 0; I < FT->getNumParams(); ++I) {
FunctionType *DXILOpFT = FunctionType::get(OverloadTy, ArgTypes, false);		auto *ParamTy = FT->getParamType(I);
		// i+2 to skip RetType and DXIL opcode.
		if (StructParamIt != StructParamEnd && StructParamIt->first == (I + 2)) {
		// Change struct type to DXIL type.
		ArgTypes.emplace_back(
		getDXILStructType(StructParamIt->second, OverloadTy, Ctx));
		++StructParamIt;
		} else {
		ArgTypes.emplace_back(ParamTy);
		}
		}

		FunctionType *DXILOpFT = FunctionType::get(RetTy, ArgTypes, false);

return M.getOrInsertFunction(FnName, DXILOpFT);		return M.getOrInsertFunction(FnName, DXILOpFT);
}		}

static void lowerIntrinsic(DXIL::OpCode DXILOp, Function &F, Module &M) {		static FunctionCallee getOrCreateCastFunction(Type *FromTy, Type *ToTy,
		Module &M) {
		std::string CastFnName = "Tmp.Cast.";
		llvm::raw_string_ostream OS(CastFnName);
		FromTy->print(OS);
		OS << ".";
		ToTy->print(OS);
		OS.flush();
		if (auto Fn = M.getFunction(CastFnName))
		return Fn;
		FunctionType *FT = FunctionType::get(ToTy, FromTy, false);

		return M.getOrInsertFunction(CastFnName, FT);
		}

		static void lowerIntrinsic(DXIL::OpCode DXILOp, Function &F, Module &M,
		SmallDenseSet<Function *> &TmpCastFnSet) {
auto DXILOpFn = createDXILOpFunction(DXILOp, F, M);		auto DXILOpFn = createDXILOpFunction(DXILOp, F, M);
IRBuilder<> B(M.getContext());		IRBuilder<> B(M.getContext());
Value *DXILOpArg = B.getInt32(static_cast<unsigned>(DXILOp));		Value *DXILOpArg = B.getInt32(static_cast<unsigned>(DXILOp));
		FunctionType *FT = DXILOpFn.getFunctionType();
		Type *RetTy = FT->getReturnType();

		// When RetTy or ParamTy for DXILOpFn is DXIL struct type, it will be mismatch
		// from the type in intrinsic. The size will be the same. Create bitcast to
		// make it match. These bitcast will be removed later.
for (User *U : make_early_inc_range(F.users())) {		for (User *U : make_early_inc_range(F.users())) {
CallInst *CI = dyn_cast<CallInst>(U);		CallInst *CI = dyn_cast<CallInst>(U);
if (!CI)		if (!CI)
continue;		continue;
		if (TmpCastFnSet.contains(CI->getCalledFunction()))
		continue;
SmallVector<Value *> Args;		SmallVector<Value *> Args;
Args.emplace_back(DXILOpArg);		Args.emplace_back(DXILOpArg);
Args.append(CI->arg_begin(), CI->arg_end());		auto ParamTyIt = FT->param_begin();
		// Skip param for DXIL opcode.
		++ParamTyIt;
B.SetInsertPoint(CI);		B.SetInsertPoint(CI);
CallInst *DXILCI = B.CreateCall(DXILOpFn, Args);		for (auto &Arg : CI->args()) {
		auto *ArgTy = Arg->getType();
		auto *ParamTy = *(ParamTyIt++);
		if (ArgTy == ParamTy) {
		Args.emplace_back(Arg);
		continue;
		}
		auto CastFn = getOrCreateCastFunction(ArgTy, ParamTy, M);
		TmpCastFnSet.insert(cast<Function>(CastFn.getCallee()));
		auto *Cast = B.CreateCall(CastFn, {Arg});
		Args.emplace_back(Cast);
		}

		Value *DXILCI = B.CreateCall(DXILOpFn, Args);
LLVM_DEBUG(DXILCI->setName(getOpCodeName(DXILOp)));		LLVM_DEBUG(DXILCI->setName(getOpCodeName(DXILOp)));
		if (CI->getType() != RetTy) {
		auto CastFn = getOrCreateCastFunction(RetTy, CI->getType(), M);
		TmpCastFnSet.insert(cast<Function>(CastFn.getCallee()));
		DXILCI = B.CreateCall(CastFn, {DXILCI});
		}
CI->replaceAllUsesWith(DXILCI);		CI->replaceAllUsesWith(DXILCI);
CI->eraseFromParent();		CI->eraseFromParent();
}		}
if (F.user_empty())		if (F.user_empty())
F.eraseFromParent();		F.eraseFromParent();
}		}

static bool lowerIntrinsics(Module &M) {		static bool lowerIntrinsics(Module &M) {
bool Updated = false;		bool Updated = false;

#define DXIL_OP_INTRINSIC_MAP		#define DXIL_OP_INTRINSIC_MAP
#include "DXILOperation.inc"		#include "DXILOperation.inc"
#undef DXIL_OP_INTRINSIC_MAP		#undef DXIL_OP_INTRINSIC_MAP

		SmallDenseSet<Function *> TmpCastFnSet;

for (Function &F : make_early_inc_range(M.functions())) {		for (Function &F : make_early_inc_range(M.functions())) {
if (!F.isDeclaration())		if (!F.isDeclaration())
continue;		continue;
Intrinsic::ID ID = F.getIntrinsicID();		Intrinsic::ID ID = F.getIntrinsicID();
if (ID == Intrinsic::not_intrinsic)		if (ID == Intrinsic::not_intrinsic)
continue;		continue;
auto LowerIt = LowerMap.find(ID);		auto LowerIt = LowerMap.find(ID);
if (LowerIt == LowerMap.end())		if (LowerIt == LowerMap.end())
continue;		continue;
lowerIntrinsic(LowerIt->second, F, M);		lowerIntrinsic(LowerIt->second, F, M, TmpCastFnSet);
Updated = true;		Updated = true;
}		}
		// Remove cast functions.
		for (auto *CastFn : TmpCastFnSet) {
		Type *RetTy = CastFn->getReturnType();
		// Skip cast to non dxil struct type.
		StructType *ST = dyn_cast<StructType>(RetTy);
		if (!ST)
		continue;
		if (!ST->hasName()) {
		// Replace extractvalue use.
		for (User *U : make_early_inc_range(CastFn->users())) {
		CallInst *CI = cast<CallInst>(U);
		Value *Arg = CI->getArgOperand(0);

		// FIXME: support cases where the Arg is not CallInst.
		CallInst *CIArg = dyn_cast<CallInst>(Arg);
		if (!CIArg) {
		llvm_unreachable("unsupported DXIL struct type cast");
		break;
		}
		for (User *CastU : make_early_inc_range(CI->users())) {
		ExtractValueInst *EVI = dyn_cast<ExtractValueInst>(CastU);
		if (EVI)
		EVI->setOperand(0, CIArg);
		}
		if (CI->user_empty())
		CI->eraseFromParent();
		}

		continue;
		}

		for (User *U : make_early_inc_range(CastFn->users())) {
		CallInst *CI = cast<CallInst>(U);
		Value *Arg = CI->getArgOperand(0);
		// FIXME: support cases where the Arg is not CallInst.
		CallInst *CIArg = dyn_cast<CallInst>(Arg);
		if (!CIArg) {
		llvm_unreachable("unsupported DXIL struct type cast");
		break;
		}
		assert(TmpCastFnSet.contains(CIArg->getCalledFunction()));
		Value *InputArg = CIArg->getArgOperand(0);
		assert(InputArg->getType() == RetTy);
		CI->replaceAllUsesWith(InputArg);
		CI->eraseFromParent();
		if (CIArg->user_empty())
		CIArg->eraseFromParent();
		}
		}
		// All CastFn should be user empty now.
		for (auto *CastFn : TmpCastFnSet) {
		CastFn->eraseFromParent();
		}

return Updated;		return Updated;
}		}

namespace {		namespace {
/// A pass that transforms external global definitions into declarations.		/// A pass that transforms external global definitions into declarations.
class DXILOpLowering : public PassInfoMixin<DXILOpLowering> {		class DXILOpLowering : public PassInfoMixin<DXILOpLowering> {
public:		public:
PreservedAnalyses run(Module &M, ModuleAnalysisManager &) {		PreservedAnalyses run(Module &M, ModuleAnalysisManager &) {
Show All 28 Lines
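
The naming and overload-selection rules used in this file can be condensed into a standalone sketch. It uses plain strings instead of `llvm::Type`, and `overloadName` and `intrinsicParamForOverload` are hypothetical helpers that mirror what the diff appears to do; the "dx.op." prefix and the lower-case class names are taken from the expected output in resources.ll.

```cpp
#include <cassert>
#include <string>

// Build the DXIL op function name. Ops whose overload kind is void get no
// type suffix; every other kind gets ".<type>" appended.
std::string overloadName(const std::string &ClassName,
                         const std::string &TypeName) {
  const std::string Prefix = "dx.op.";
  if (TypeName == "void")
    return Prefix + ClassName;
  return Prefix + ClassName + "." + TypeName;
}

// DXIL parameter positions count the return value as 0 and the i32 opcode
// as 1, so DXIL position P (P >= 2) corresponds to intrinsic parameter P - 2.
// Returns -1 when the overload type is taken from the return type.
int intrinsicParamForOverload(int OverloadParamIndex) {
  assert((OverloadParamIndex == 0 || OverloadParamIndex >= 2) &&
         "position 1 is the opcode and cannot carry the overload");
  return OverloadParamIndex == 0 ? -1 : OverloadParamIndex - 2;
}
```

For BufferStore, whose first `$o` parameter sits at DXIL position 5, the second helper returns 3, which is the position of the first value operand in the intrinsic's own parameter list.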

llvm/test/CodeGen/DirectX/resources.ll

This file was added.

				; RUN: opt -S -dxil-op-lower < %s | FileCheck %s

				; Make sure dxil operation function calls for createHandle bufferLoad/Store operations are generated.

				target datalayout = "e-m:e-p:32:32-i1:32-i8:8-i16:16-i32:32-i64:64-f16:16-f32:32-f64:64-n8:16:32:64"
				target triple = "dxil-pc-shadermodel6.3-library"

				%struct.anon = type { float, float, float, float, i32 }


				; CHECK-LABEL:test_buffer_load_f32
				; CHECK: %[[HDL:.+]] = call %dx.types.Handle @dx.op.createHandle(i32 57, i8 %res_class, i32 %range_id, i32 %index, i1 %non_uniform_index)
				; CHECK: call %dx.types.ResRet.f32 @dx.op.bufferLoad.f32(i32 68, %dx.types.Handle %[[HDL]], i32 %idx, i32 undef)
				define float @test_buffer_load_f32(i32 %idx, i8 %res_class, i32 %range_id, i32 %index, i1 %non_uniform_index) #0 {
				%hdl = call i64 @llvm.dxil.create.handle(i8 %res_class, i32 %range_id, i32 %index, i1 %non_uniform_index)
				%1 = call %struct.anon @llvm.dxil.buffer.load.f32(i64 %hdl, i32 %idx, i32 undef)
				%2 = extractvalue %struct.anon %1, 0
				ret float %2
				}
				; CHECK-LABEL:test_buffer_store_f32
				; CHECK: %[[HDL2:.+]] = call %dx.types.Handle @dx.op.createHandle(i32 57, i8 %res_class, i32 %range_id, i32 %index, i1 %non_uniform_index)
				; CHECK: call void @dx.op.bufferStore.f32(i32 69, %dx.types.Handle %[[HDL2]], i32 %idx, i32 undef, float %v, float undef, float undef, float undef, i8 1)
				define void @test_buffer_store_f32(i32 %idx, float %v, i8 %res_class, i32 %range_id, i32 %index, i1 %non_uniform_index) #0 {
				%hdl = call i64 @llvm.dxil.create.handle(i8 %res_class, i32 %range_id, i32 %index, i1 %non_uniform_index)
				call void @llvm.dxil.buffer.store.f32(i64 %hdl, i32 %idx, i32 undef, float %v, float undef, float undef, float undef, i8 1)
				ret void
				}

				; CHECK-DAG:declare %dx.types.Handle @dx.op.createHandle(i32, i8, i32, i32, i1)
				declare i64 @llvm.dxil.create.handle(i8, i32, i32, i1) #1

				; CHECK-DAG:declare %dx.types.ResRet.f32 @dx.op.bufferLoad.f32(i32, %dx.types.Handle, i32, i32)
				declare %struct.anon @llvm.dxil.buffer.load.f32(i64, i32, i32) #2

				; CHECK-DAG:declare void @dx.op.bufferStore.f32(i32, %dx.types.Handle, i32, i32, float, float, float, float, i8)
				declare void @llvm.dxil.buffer.store.f32(i64, i32, i32, float, float, float, float, i8) #3

				; Make sure no other function declaration.
				; CHECK-NOT:declare

				attributes #0 = { noinline nounwind }
				attributes #1 = { nounwind readnone willreturn }
				attributes #2 = { nounwind readonly willreturn }
				attributes #3 = { nounwind willreturn }

llvm/utils/TableGen/DXILEmitter.cpp

[69 lines not shown]
struct DXILOperationData {
  bool RequiresUniformInputs; // whether this operation requires that all
                              // of its inputs are uniform across the wave
  SmallVector<StringRef, 4>
      ShaderStages; // shader stages to which this applies, empty for all.
  DXILShaderModel ShaderModel;           // minimum shader model required
  DXILShaderModel ShaderModelTranslated; // minimum shader model required with
                                         // translation by linker
  SmallVector<StringRef, 4> counters;    // counters for this inst.

  bool HasOverload;          // If only has one type in OverloadTypes, HasOverload will
                             // be false.
  int OverloadParamIndex;    // parameter index which control the overload
  bool OverloadTypeInStruct; // The overload parameter type is struct, the
                             // overload type is first field type of the struct
                             // type. This happens for things like buffer load.
  SmallVector<std::pair<int, StringRef>> StructParams;

  DXILOperationData(const Record *R) {
    Name = R->getValueAsString("name");
    DXILOp = R->getValueAsString("dxil_op");
    DXILOpID = R->getValueAsInt("dxil_opid");
    DXILClass = R->getValueAsDef("op_class")->getValueAsString("name");
    Category = R->getValueAsDef("category")->getValueAsString("name");

    if (R->getValue("llvm_intrinsic")) {
      auto *IntrinsicDef = R->getValueAsDef("llvm_intrinsic");
      auto DefName = IntrinsicDef->getName();
      assert(DefName.startswith("int_") && "invalid intrinsic name");
      // Remove the int_ from intrinsic name.
      Intrinsic = DefName.substr(4);
    }

    Doc = R->getValueAsString("doc");

    OverloadParamIndex = -1;
    OverloadTypeInStruct = false;
    ListInit *ParamList = R->getValueAsListInit("ops");
    for (unsigned i = 0; i < ParamList->size(); ++i) {
      Record *Param = ParamList->getElementAsRecord(i);
      Params.emplace_back(DXILParam(Param));
      auto &CurParam = Params.back();
      if (CurParam.Type == "$o" || CurParam.Type == "udt" ||
          CurParam.Type == "obj") {
        OverloadParamIndex = i;
      } else if (CurParam.Type == "dx.types.CBufRet" ||
                 CurParam.Type == "dx.types.ResRet") {
        OverloadParamIndex = i;
        OverloadTypeInStruct = true;
      }
      if (CurParam.Type.startswith("dx.types."))
        StructParams.emplace_back(std::make_pair(i, CurParam.Type));
    }
    OverloadTypes = R->getValueAsString("oload_types");
    FnAttr = R->getValueAsString("fn_attr");

    SmallVector<StringRef> OverloadStrs;
    OverloadTypes.split(OverloadStrs, ';', /*MaxSplit*/ -1,
                        /*KeepEmpty*/ false);
    HasOverload = OverloadStrs.size() > 1;
  }
};

} // end anonymous namespace

static void emitDXILOpEnum(DXILOperationData &DXILOp, raw_ostream &OS) {
  // Name = ID, // Doc
  OS << DXILOp.Name << " = " << DXILOp.DXILOpID << ", // " << DXILOp.Doc
     << "\n";
}
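
The parameter scan in the DXILOperationData constructor above can be read in isolation. The following is a minimal sketch, not code from this patch: it uses std::string and std::vector instead of the LLVM ADT types, and the names OverloadInfo, classifyParams, and hasOverload are hypothetical. The parameter lists used to exercise it are also assumptions about what a buffer load or store record might contain.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the overload fields of DXILOperationData.
struct OverloadInfo {
  int OverloadParamIndex = -1;
  bool OverloadTypeInStruct = false;
  std::vector<std::pair<int, std::string>> StructParams;
};

static bool startsWith(const std::string &S, const std::string &Prefix) {
  return S.compare(0, Prefix.size(), Prefix) == 0;
}

// Mirrors the loop over "ops": the last overload-like parameter wins, and
// every "dx.types." parameter is recorded together with its index.
OverloadInfo classifyParams(const std::vector<std::string> &ParamTypes) {
  OverloadInfo Info;
  for (int I = 0, E = static_cast<int>(ParamTypes.size()); I < E; ++I) {
    const std::string &Ty = ParamTypes[I];
    if (Ty == "$o" || Ty == "udt" || Ty == "obj") {
      Info.OverloadParamIndex = I;
    } else if (Ty == "dx.types.CBufRet" || Ty == "dx.types.ResRet") {
      Info.OverloadParamIndex = I;
      Info.OverloadTypeInStruct = true;
    }
    if (startsWith(Ty, "dx.types."))
      Info.StructParams.emplace_back(I, Ty);
  }
  return Info;
}

// Mirrors the split on ';' with empty pieces dropped: an op counts as
// overloaded only when more than one type name remains.
bool hasOverload(const std::string &OverloadTypes) {
  int Count = 0;
  std::size_t Pos = 0;
  while (Pos <= OverloadTypes.size()) {
    std::size_t End = OverloadTypes.find(';', Pos);
    if (End == std::string::npos)
      End = OverloadTypes.size();
    if (End > Pos)
      ++Count;
    Pos = End + 1;
  }
  return Count > 1;
}
```

One property the sketch makes visible: OverloadTypeInStruct is never reset, so if a "$o" parameter follows a dx.types.ResRet parameter, the index moves to the later parameter while the flag stays true. Whether any real DXIL op has that shape is not established by this page.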

[186 lines not shown]
static void emitDXILOperationTable(std::vector<DXILOperationData> &DXILOps,

  OS << " static const OpCodeProperty OpCodeProps[] = {\n";
  for (auto &DXILOp : DXILOps) {
    OS << " { DXIL::OpCode::" << DXILOp.DXILOp << ", "
       << OpStrings.get(DXILOp.DXILOp.str())
       << ", OpCodeClass::" << DXILOp.DXILClass << ", "
       << OpClassStrings.get(getDXILOpClassName(DXILOp.DXILClass)) << ", "
       << getDXILOperationOverload(DXILOp.OverloadTypes) << ", "
       << emitDXILOperationFnAttr(DXILOp.FnAttr) << ", " << DXILOp.HasOverload
       << ", " << DXILOp.OverloadParamIndex << ", "
       << DXILOp.OverloadTypeInStruct << ", ";
    OS << "{ ";
    for (auto &StructParam : DXILOp.StructParams) {
      OS << " { " << StructParam.first << ", \"" << StructParam.second
         << "\" } ,";
    }
    OS << "} ";
    OS << " },\n";
  }
  OS << " };\n";

  OS << " // FIXME: change search to indexing with\n";
  OS << " // DXILOp once all DXIL op is added.\n";
  OS << " OpCodeProperty TmpProp;\n";
  OS << " TmpProp.OpCode = DXILOp;\n";
  OS << " const OpCodeProperty *Prop =\n";
[60 lines not shown]
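
To see what the new StructParams portion of each OpCodeProps row looks like once emitted, the inner loop can be reproduced with a std::ostringstream. This is a sketch only: formatStructParams is a hypothetical helper, the real emitter writes to a raw_ostream, and the spacing inside the string literals is copied from this page as rendered, which may have collapsed whitespace.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: builds the brace-enclosed initializer that the emitter
// appends for an op's (parameter index, struct type name) pairs.
std::string formatStructParams(
    const std::vector<std::pair<int, std::string>> &StructParams) {
  std::ostringstream OS;
  OS << "{ ";
  for (const auto &StructParam : StructParams) {
    OS << " { " << StructParam.first << ", \"" << StructParam.second
       << "\" } ,";
  }
  OS << "} ";
  return OS.str();
}
```

Each pair is followed by a trailing comma, which C++ accepts in an aggregate initializer list, and an op with no dx.types parameters yields an empty initializer.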