Create a gpu memset op and corresponding CUDA and ROCm wrappers.
I would probably infer the type of $value from the element type of $dst (custom<GpuMemsetType>(type($dst), type($value)) in the assembly format, and provide print/parseGpuMemSetType()).
Alternatively, use AnyType for $value and check that it matches the element type of $dst in the verifier.
Updating D107548: [mlir] create gpu memset op
Made changes based on internal review:
- explicitly support only 32-bit memset for now
- CUDA wrappers call cuMemsetD32Async directly and no longer import the CUDA runtime API
- extracted getNumElements for GPU memrefs, covering the logic that was common to memset and memcpy
- MLIR memset op takes a scalar of AnyType instead of AnyMemRef
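
For reviewers who want a picture of the shared helper, here is a rough host-only sketch of what a getNumElements for a ranked memref could look like. The MemRefDescriptor struct and its field names are assumptions made for illustration, not the descriptor type used in the patch. The idea is just that the element count is the product of the sizes, and that count is what a 32-bit memset (e.g. the count argument of cuMemsetD32Async) would consume.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-in for a ranked memref descriptor.
// The field names are assumptions for this sketch only.
template <typename T, int Rank>
struct MemRefDescriptor {
  T *basePtr;            // allocated pointer
  T *data;               // aligned pointer
  int64_t offset;        // offset into data, in elements
  int64_t sizes[Rank];   // extent of each dimension
  int64_t strides[Rank]; // stride of each dimension, in elements
};

// Number of elements described by the memref: the product of all sizes.
// Strides are ignored, so this assumes a contiguous buffer.
template <typename T, int Rank>
int64_t getNumElements(const MemRefDescriptor<T, Rank> &memref) {
  int64_t count = 1;
  for (int i = 0; i < Rank; ++i) {
    assert(memref.sizes[i] >= 0 && "memref sizes must be non-negative");
    count *= memref.sizes[i];
  }
  return count;
}
```

With a helper along these lines, memset and memcpy wrappers could both derive their counts from the same function; memcpy would additionally multiply by the element size to get a byte count.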
Thanks for the approval!
I've tried to land this, but I don't seem to have GitHub permissions for LLVM (Permission to llvm/llvm-project.git denied to lorenrose1013.)
Is this something I can request, or is someone else able to land the change instead?