This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Add missing test prefixes
AbandonedPublic

Authored by foad on Jul 15 2020, 4:24 AM.

Details

Reviewers
rampitec

Diff Detail

Event Timeline

foad created this revision.Jul 15 2020, 4:24 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 15 2020, 4:24 AM
foad marked 2 inline comments as done.Jul 15 2020, 4:26 AM

This can't be committed as-is because some of the checks fail. @rampitec how should they be fixed? Should I just update the checks, or do we need to fix some real bugs?

llvm/test/CodeGen/AMDGPU/perfhint.ll
33

This check fails.

87

This check fails. Perhaps D47740 never worked?

rampitec added inline comments.Jul 15 2020, 11:43 AM
llvm/test/CodeGen/AMDGPU/perfhint.ll
33

This one is memory bound, there are practically only memory operations here. I think it needs some ALU in between to catch large stride only as intended.

87

Looks like it did not :(

Anyway, this case is not memory bound even though it is indirect. This is because we have a single load followed by multiple stores, that was the point of the check.

foad marked an inline comment as done.Jul 17 2020, 5:00 AM

I've committed the easy bits as 2dc3d1b3136522e7c8e92d742d8ecc3405b9b4bb.

llvm/test/CodeGen/AMDGPU/perfhint.ll
33
foad marked an inline comment as done.Jul 17 2020, 5:43 AM
foad added inline comments.
llvm/test/CodeGen/AMDGPU/perfhint.ll
87

The problem is that after AMDGPULowerKernelArguments, the load from %arg looks like this:

%arg.load = load float addrspace(1)*, float addrspace(1)* addrspace(4)* %arg.kernarg.offset.cast, align 4, !invariant.load !0
%load = load float, float addrspace(1)* %arg.load, align 8

which is indirect. Any ideas?

rampitec added inline comments.Jul 17 2020, 9:17 AM
llvm/test/CodeGen/AMDGPU/perfhint.ll
33

Thank you!

87

A-ha! The representation changed, but we did not catch it because of the broken test.

The first load is from constant, so it is uniform. I suppose we can ignore constant address space for this purpose. It creates much less memory traffic.

foad abandoned this revision.Jul 22 2020, 3:03 AM
foad marked an inline comment as done.

All changes have been committed or split out into another review.

llvm/test/CodeGen/AMDGPU/perfhint.ll
87

I've done this in D84301.