Emit .actual_access metadata for the deduced argument access qualifier,
and .access for kernel_arg_access_qual.
Details
- Reviewers
arsenm bcahoon scott.linder kzhuravl rampitec b-sumner - Group Reviewers
Restricted Project - Commits
- rGffa7c7897c14: [AMDGPU] Emit .actual_access metadata
Diff Detail
Event Timeline
llvm/test/CodeGen/AMDGPU/hsa-metadata-deduce-ro-arg-v3.ll | ||
---|---|---|
26 ↗ | (On Diff #548385) | This looks like lost test coverage, need to test the access of the metadata |
llvm/test/CodeGen/AMDGPU/hsa-metadata-deduce-ro-arg-v3.ll | ||
---|---|---|
26 ↗ | (On Diff #548385) | Under V3, if the access is default, it won't print out "default", (will not print out the access qualifier at all). But I am curious because "noalias readonly %in", why the access is not "readonly"? Does it mean the original access qualifier got lost? |
llvm/test/CodeGen/AMDGPU/hsa-metadata-deduce-ro-arg-v3.ll | ||
---|---|---|
26 ↗ | (On Diff #548385) | the point of this is introspection of the source program, not the actual properties. if someone defines something as readwrite but is only ever read, the optimizer can conclude it's readonly even though it wasn't declared that way |
llvm/test/CodeGen/AMDGPU/hsa-metadata-deduce-ro-arg-v3.ll | ||
---|---|---|
26 ↗ | (On Diff #548385) | Under V3 or above, it is designed NOT to print out the access qualifier if it is "default". |
Update the LIT tests with kernel_arg_access_qual specified to "read_only: for the first argument.
llvm/test/CodeGen/AMDGPU/hsa-metadata-deduce-ro-arg-v3.ll | ||
---|---|---|
2 ↗ | (On Diff #548385) | I am not sure about the actual uses. But apparently the conformance test get_kernel_arg_info checks that. |
The first iteration of this test in D32091 suggests this actually does something, so it seems unfortunate if we're limited by introspection tests
llvm/test/CodeGen/AMDGPU/hsa-metadata-deduce-ro-arg-v3.ll | ||
---|---|---|
26 ↗ | (On Diff #548385) | Right, this can be deduced and had quite positive performance impact on some kernels. |
llvm/test/CodeGen/AMDGPU/hsa-metadata-deduce-ro-arg-v3.ll | ||
---|---|---|
26 ↗ | (On Diff #548385) | It seems we are using the same AccQual field for two purposes: 1) optimization, 2) get_kernel_arg_info.AccessQualifier, and |
llvm/test/CodeGen/AMDGPU/hsa-metadata-deduce-ro-arg-v3.ll | ||
---|---|---|
26 ↗ | (On Diff #548385) | This is the same thing, one is because developer marked it (and language allowed this marking), another because we were able to prove it from compiler. Net impact is still the same, RT may omit copying memory before every dispatch. |
llvm/test/CodeGen/AMDGPU/hsa-metadata-deduce-ro-arg-v3.ll | ||
---|---|---|
26 ↗ | (On Diff #548385) | We are seeing a conformance test failure with test_api get_kernel_arg_info. |
llvm/test/CodeGen/AMDGPU/hsa-metadata-deduce-ro-arg-v3.ll | ||
---|---|---|
26 ↗ | (On Diff #548385) | I meant if the language allows us to change, it should not check for the original ones. |
llvm/test/CodeGen/AMDGPU/hsa-metadata-deduce-ro-arg-v3.ll | ||
---|---|---|
26 ↗ | (On Diff #548385) | Sigh. I do not see why the same qualifier cannot apply to other types of memory. Readonly is readonly, and the optimization is useful. I understand that the standard did not say it is allowed, but conformance did pass for a long time this optimization is there. |
llvm/test/CodeGen/AMDGPU/hsa-metadata-deduce-ro-arg-v3.ll | ||
---|---|---|
26 ↗ | (On Diff #548385) | There is a kernel arg metadata “.actual_access” which may be used instead: “.actual_access” string “read_only” https://llvm.org/docs/AMDGPUUsage.html#code-object-v3-metadata |
llvm/test/CodeGen/AMDGPU/hsa-metadata-deduce-ro-arg-v3.ll | ||
---|---|---|
26 ↗ | (On Diff #548385) | Makes senses as long as agreed with RT. However, I would argue this shall be set in the same patch. |
llvm/test/CodeGen/AMDGPU/hsa-metadata-deduce-ro-arg.ll | ||
---|---|---|
2–3 | We currently only emit read_only for .actual_access. Do you know what are the exact if (Arg.getType()->isPointerTy() && Arg.onlyReadsMemory() && Arg.hasNoAliasAttr()) { ActAccQual = "read_only"; } |
llvm/test/CodeGen/AMDGPU/hsa-metadata-deduce-ro-arg-v3.ll | ||
---|---|---|
28 ↗ | (On Diff #548385) | This is the test for code object version 2. Do you think I should still update for V2? |
llvm/test/CodeGen/AMDGPU/hsa-metadata-deduce-ro-arg.ll | ||
---|---|---|
2–3 | Arg.onlyReadsMemory and Arg.hasAttribute(Attribute::WriteOnly) (don't know if you really need the noalias part here, it's probably a runtime workaround if you do) |
llvm/test/CodeGen/AMDGPU/hsa-metadata-deduce-ro-arg.ll | ||
---|---|---|
2–3 | The fact that you do not write through this pointer and pointers derived from it does not tell that you do not write this memory. You can tell you don't if there can be no other pointers to that memory as well. |
llvm/test/CodeGen/AMDGPU/hsa-metadata-deduce-ro-arg.ll | ||
---|---|---|
2–3 | Like: void foo(const int *p1, int *p2) { *p2 = p1[1]; } This is wrong to assume that memory under p1 was not modified because you can call it like this: foo(p, p); |
llvm/test/CodeGen/AMDGPU/hsa-metadata-deduce-ro-arg.ll | ||
---|---|---|
28 | also should test without the noaliases |
Add tests without noalias attribute.
Also plan to delete the following test in commit because all cases have been included in the added file.
llvm/test/CodeGen/AMDGPU/hsa-metadata-deduce-ro-arg.ll
Seems like the mirror write only deduction test is missing