Details
- Reviewers
rampitec
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/test/CodeGen/AMDGPU/perfhint.ll | ||
---|---|---|
33 | This one is memory bound; there are practically only memory operations here. I think it needs some ALU instructions in between so that it catches only the large stride, as intended. | |
87 | Looks like it did not :( Anyway, this case is not memory bound even though it is indirect, because we have a single load followed by multiple stores. That was the point of the check. |
I've committed the easy bits as 2dc3d1b3136522e7c8e92d742d8ecc3405b9b4bb.
llvm/test/CodeGen/AMDGPU/perfhint.ll | ||
---|---|---|
33 | OK, fixed in f05bce86af32d7b5cf1ab28b3abf6ee473bf3ef1. |
llvm/test/CodeGen/AMDGPU/perfhint.ll | ||
---|---|---|
87 | The problem is that after AMDGPULowerKernelArguments, the load from `%arg` looks like this: `%arg.load = load float addrspace(1)*, float addrspace(1)* addrspace(4)* %arg.kernarg.offset.cast, align 4, !invariant.load !0` followed by `%load = load float, float addrspace(1)* %arg.load, align 8`, which is indirect. Any ideas? |
This check fails.
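To make the line 87 discussion concrete, here is a toy Python sketch of why the lowered kernel argument makes `%load` look indirect. It is not the actual AMDGPUPerfHintAnalysis code: the `Inst` class, the `is_indirect_load` helper, and the rule "a load is indirect if its address is derived from another load" are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Inst:
    """Hypothetical, simplified stand-in for an IR value."""
    name: str                                      # SSA name this value defines
    opcode: str                                    # e.g. "arg", "cast", "load"
    operands: list = field(default_factory=list)   # names of operand values


def is_indirect_load(inst, defs):
    """Return True if `inst` is a load whose address depends on another load.

    `defs` maps SSA names to their defining Inst. Names without an entry
    are treated as opaque leaves.
    """
    if inst.opcode != "load":
        return False
    worklist = list(inst.operands)
    seen = set()
    while worklist:
        name = worklist.pop()
        if name in seen:
            continue
        seen.add(name)
        definition = defs.get(name)
        if definition is None:
            continue
        if definition.opcode == "load":
            return True
        worklist.extend(definition.operands)
    return False


def index(insts):
    """Build the name -> Inst map used by is_indirect_load."""
    return {i.name: i for i in insts}


# Before lowering: the pointer is a plain kernel argument.
before = index([
    Inst("%arg", "arg"),
    Inst("%load", "load", ["%arg"]),
])

# After lowering: the pointer itself is loaded from the kernarg segment.
after = index([
    Inst("%arg.kernarg.offset.cast", "cast"),
    Inst("%arg.load", "load", ["%arg.kernarg.offset.cast"]),
    Inst("%load", "load", ["%arg.load"]),
])

indirect_before = is_indirect_load(before["%load"], before)
indirect_after = is_indirect_load(after["%load"], after)
```

Under this simplified rule, `%load` is direct before the lowering and indirect after it, even though the source-level access pattern has not changed. One conceivable way around that, again only as an assumption, would be to treat the kernarg load itself as a leaf rather than as a real memory dependency.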