On AMD target, printf is replaced and call to printf_alloc are inserted
in order to get the buffer offset to write to
Does the printf expansion work for all amdgpu triples, including amdgcn--mesa3d?
will it work for non-ROC capable GPUs?
since it uses gcn specific builtin it should be in amdgcn/ rather than amdgpu/
Really I would prefer if this just cloned rocm-device-libs. The backend doesn't need to maintain two ABIs for dnot
Nothing in the compiler actually depends on roc capable GPUs. The limitations are all runtime/driver related
The exception to this would be places in the libraries that assume flat instructions are available. Theoretically we could handle these in the backend for SI with tagged pointers or something
so there are no hidden assumptions; like the printf buffer being cache coherent with host CPU, or the implementation using s_sendmsg to trigger CPU draining the buffer before the kernel finishes execution?
the main difference is atomic usage (I use those already in liclc), implict arg pos and asking to set the offset to 8 at buffer init instead of adding 8 on every call.
I can easily change the last 2
The implementation use a regular device buffer that is read at the end of the execution. Mostly like a clEnqueueReadBuffer
Is that on ROCm side or clover? will it stay that way on the ROCm side in the future?
The idea of using different ABI (and -mesa3d triple) was that clover is not tied to changes AMD makes for ROCm.
it can make independent decisions, it's not exposed to potential breakage when ROCm does incompatible changes and does not force ROCm ABI on other users of clover (like r600 or nouveau).
It might be OK for printf (let's see what curro says on the mesa side), but I'd expect that attempting to closely follow ROCm will speed up efforts to switch mesa opencl to SPIRV->NIR path.
I think we should probably have an #if guard around this declaration to make sure the CLC language version is 1.2 or higher.
With this patch applied, mesa/clover on a CL1.1 device can no longer build CL kernels.
Have you checked that returning NULL is handled properly?
Should the buffer wrap around instead?
Calling the file printf_alloc or similar would make it clearer that it doesn't actually implement printf
'replace' -> 'replaces'
since the return value is a pointer, NULL would be more appropriate
'store' -> 'stores'
pls add a short table of the expected buffer layout
This would be more readable with a temp var buffer_offset_ptr since you're using it below as well.
return type is a pointer, NULL would be more appropriate