mlir/include/mlir/Dialect/GPU/GPUOps.td
def GPU_LaunchFuncOp : GPU_Op<"launch_func">, | ||||
let summary = "Launches a function as a GPU kernel"; | let summary = "Launches a function as a GPU kernel"; | ||||
let description = [{ | let description = [{ | ||||
Launch a kernel function on the specified grid of thread blocks. | Launch a kernel function on the specified grid of thread blocks. | ||||
`gpu.launch` operations are lowered to `gpu.launch_func` operations by | `gpu.launch` operations are lowered to `gpu.launch_func` operations by | ||||
outlining the kernel body into a function in a dedicated module, which | outlining the kernel body into a function in a dedicated module, which | ||||
reflects the separate compilation process. The kernel function is required | reflects the separate compilation process. The kernel function is required | ||||
to have the `gpu.kernel` attribute. The module containing the kernel | to have the `gpu.kernel` attribute. The module containing the kernel | ||||
function is required to have the `gpu.kernel_module` attribute and must be | function is required to be a gpu.module. And finally, the module containing | ||||
named. And finally, the module containing the kernel module (which thus | the kernel module (which thus cannot be the top-level module) is required | ||||
cannot be the top-level module) is required to have the | to have the `gpu.container_module` attribute. The `gpu.launch_func` | ||||
`gpu.container_module` attribute. The `gpu.launch_func` operation has a | operation has a symbol attribute named `kernel` to identify the fully specified | ||||
frgossen: The kernel is now identified with a single nested symbol attribute `kernel` (https://github. | |||||
string attribute named `kernel` to specify the name of the kernel function | kernel function to launch (both the gpu.module and func). | ||||
to launch and an attribute named `kernel_module` to specify the name of the | |||||
module containing that kernel function. | |||||
The operation takes at least six operands, with the first three operands | The operation takes at least six operands, with the first three operands | ||||
being grid sizes along x,y,z dimensions and the following three being block | being grid sizes along x,y,z dimensions and the following three being block | ||||
sizes along x,y,z dimensions. When a lower-dimensional kernel is required, | sizes along x,y,z dimensions. When a lower-dimensional kernel is required, | ||||
unused sizes must be explicitly set to `1`. The remaining operands are | unused sizes must be explicitly set to `1`. The remaining operands are | ||||
passed as arguments to the kernel function. | passed as arguments to the kernel function. | ||||
A custom syntax for this operation is currently not available. | A custom syntax for this operation is currently not available. | ||||
Example: | Example: | ||||
```mlir | ```mlir | ||||
module attributes {gpu.container_module} { | module attributes {gpu.container_module} { | ||||
// This module creates a separate compilation unit for the GPU compiler. | // This module creates a separate compilation unit for the GPU compiler. | ||||
module @kernels attributes {gpu.kernel_module} { | gpu.module @kernels { | ||||
func @kernel_1(%arg0 : f32, %arg1 : !llvm<"float*">) | func @kernel_1(%arg0 : f32, %arg1 : !llvm<"float*">) | ||||
attributes { nvvm.kernel = true } { | attributes { nvvm.kernel = true } { | ||||
// Operations that produce block/thread IDs and dimensions are | // Operations that produce block/thread IDs and dimensions are | ||||
// injected when outlining the `gpu.launch` body to a function called | // injected when outlining the `gpu.launch` body to a function called | ||||
// by `gpu.launch_func`. | // by `gpu.launch_func`. | ||||
%tIdX = "gpu.thread_id"() {dimension = "x"} : () -> (index) | %tIdX = "gpu.thread_id"() {dimension = "x"} : () -> (index) | ||||
%tIdY = "gpu.thread_id"() {dimension = "y"} : () -> (index) | %tIdY = "gpu.thread_id"() {dimension = "y"} : () -> (index) | ||||
frgossen: The kernel is now identified with a single nested symbol attribute `kernel` (https://github.com/llvm/llvm-project/commit/0372db05bb1552c2b39fc735f949977e0a863a25). I should have updated this, sorry.
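For illustration, a minimal sketch of how a launch might look with the nested symbol reference. It assumes the generic op form (the description notes that no custom syntax is available) and reuses the `@kernels` and `@kernel_1` names from the example in this file; `%cst`, `%arg0`, and `%arg1` are placeholder values assumed to be defined earlier, and the snippet is not taken from the patch itself:

```mlir
// Sketch only: launch @kernel_1, which lives inside gpu.module @kernels.
// %cst, %arg0 and %arg1 are assumed to be defined by earlier operations.
"gpu.launch_func"(%cst, %cst, %cst,  // Grid sizes along x, y, z.
                  %cst, %cst, %cst,  // Block sizes along x, y, z.
                  %arg0, %arg1)      // Arguments passed to the kernel.
      { kernel = @kernels::@kernel_1 }  // Nested symbol: module, then function.
      : (index, index, index, index, index, index, f32, !llvm<"float*">) -> ()
```

The single `kernel` attribute would replace the earlier pair of `kernel_module` plus string-valued `kernel`, so the module and the function are resolved through one symbol reference.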