This is an archive of the discontinued LLVM Phabricator instance.

[DirectX backend] Add createHandle BufferLoad/Store DXIL operation
Needs ReviewPublic

Authored by python3kgae on Jun 29 2022, 10:32 AM.

Details

Summary

New DXIL operations for createHandle BufferLoad/Store are added.
A new helper class DXILOpBuilder is added to create DXIL op function calls.

TableGen backend for DXILOperation will create table for DXIL op function parameter types.
When create DXIL op function, these parameter types will used to create the function type.

Diff Detail

Event Timeline

python3kgae created this revision.Jun 29 2022, 10:32 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 29 2022, 10:32 AM
python3kgae requested review of this revision.Jun 29 2022, 10:32 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 29 2022, 10:32 AM
beanz added inline comments.Jun 29 2022, 10:55 AM
llvm/include/llvm/IR/IntrinsicsDXIL.td
22 ↗(On Diff #441076)

Do you have a plan for taking LLVM load instructions and converting them to these intrinsics?

I think we need to think about how we want to translate LLVM gep/load/store instructions into DXIL ops, and I don't think we should add these intrinsics until we know what that is going to look like.

python3kgae added inline comments.Jun 29 2022, 11:30 AM
llvm/include/llvm/IR/IntrinsicsDXIL.td
22 ↗(On Diff #441076)

These intrinsics are trying to make the distance from hlsl to DXIL shorter.
They're just wrapper for DXIL operation functions so generate DXIL is easier.

I did experiment to generate DXIL directly from GEP/load/store, then found create intrinsic might help the translation.

beanz added inline comments.Jun 29 2022, 11:32 AM
llvm/include/llvm/IR/IntrinsicsDXIL.td
22 ↗(On Diff #441076)

I still don't know that these are the _right_ intrinsics. How are we going to map GEP/load/store to these intrinsics?

nikic resigned from this revision.Jun 30 2022, 12:32 AM
python3kgae added inline comments.Jun 30 2022, 4:51 AM
llvm/include/llvm/IR/IntrinsicsDXIL.td
22 ↗(On Diff #441076)

It will not be a simple map, we'll need a pass to translate GEP/load/store to these intrinsics.
These intrinsics are to make the pass easier to write and leave the details like DXIL opcode, DXIL struct type to DXILOpLowering pass.

Maybe we can allow GEP/load/store in final DXIL for future DXIL version, but to generate early version of DXIL, these intrinsics will be
helpful.

beanz added inline comments.Jun 30 2022, 6:24 AM
llvm/include/llvm/IR/IntrinsicsDXIL.td
22 ↗(On Diff #441076)

I didn't mean to imply it would be a simple map (as in map data structure), but it is a mapping operation. GEPs get folded in with loads and stores to form load and store DXIL Ops.

Clang will generate GEPs, loads, and stores through known handle pointers. Unlike in DXC we won't map those to "high level" intrinsics during codegen, instead we'll emit the GEPs, loads and stores. That will allow LLVM's optimization passes (like SROA) to run without needing to be taught about all of the special intrinsics for HLSL.

If the input to our backend is expected to be GEPs, loads and stores, I fail to see why we would translate those to an intrinsic that has an identical signature to the DXIL Op (minus the opcode) instead of just translating it to the DXIL Op directly.

Remove intrinsic, add DXILOpBuilder to create DXIL op function calls.

python3kgae edited the summary of this revision. (Show Details)Jul 2 2022, 1:45 PM
nlopes added a comment.Jul 2 2022, 3:24 PM

Please avoid using UndefValue::get whenever possible as we are trying to get rid of undef. Please use PoisonValue. Thank you!

python3kgae edited the summary of this revision. (Show Details)

Change UndefValue to PoisonValue.

nhaehnle added inline comments.Jul 4 2022, 5:26 AM
llvm/include/llvm/IR/IntrinsicsDXIL.td
22 ↗(On Diff #441076)

FWIW, we have a similar issue in LLPC, our SPIR-V-to-AMDGPU-backend shader compiler. The backend has a family of llvm.amdgcn.buffer.load/store intrinsics that take a buffer descriptor and offset arguments. We generate those from load/store/atomic/gep on a "fat pointer" address space in this pass: https://github.com/GPUOpen-Drivers/llpc/blob/dev/lgc/patch/PatchBufferOp.cpp

I don't really have more thoughts on the issue right now, but I believe it is a very similar problem and so a future exchange of thoughts may well be helpful.

For example, it is not clear to me what the "correct" place for lowering these loads and store is. What AMDGPU and LLPC do has evolved historically. I'd say it's fairly reasonable, we did learn that pushing the lowering to be later is helpful in our case but pushing it all the way to a MIR pass (which the DXIL backend doesn't use anyway) would have been a painful amount of work.

beanz added a comment.Jul 21 2022, 7:10 AM

This change adds a bunch of functionality with no tests. How are the new dxil ops generated?

I feel like you have two changes going in here at once. One is a refactor that is probably NFC, and the other is adding some operations that aren't used anywhere. These should be two separate changes, and we shouldn't add the operations until we have codegen in place so that we know how we're going to generate them.

kuhar resigned from this revision.Jun 11 2023, 1:15 PM