This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU][InferAddressSpaces] Only rewrite address-spaces that can be trivially casted to flat for llvm.amdgcn.flat.atomic.{fadd,fmax,fmin} (2/2)
ClosedPublic

Authored by jmmartinez on May 5 2023, 1:45 AM.

Details

Summary

The intrinsic @llvm.amdgcn.flat.atomic.{fadd,fmax,fmin} can only be
selected for flat address spaces (constant, flat and global).

This patch restricts the cases over which GCNTTIImpl::rewriteIntrinsicWithAddressSpace
rewrites the intrinsic.

Diff Detail

Event Timeline

jmmartinez created this revision.May 5 2023, 1:45 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 5 2023, 1:45 AM
jmmartinez requested review of this revision.May 5 2023, 1:45 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 5 2023, 1:45 AM
arsenm added inline comments.May 9 2023, 8:25 AM
llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
1088–1089

But the not-noop address space casts are the cases we're interested in handling. Do you mean to check for valid casts?

llvm/test/Transforms/InferAddressSpaces/AMDGPU/flat-fadd-fmin-fmax-intrinsics.ll
4

Can you precommit the test to show the diff?

166

Add a test with a random number, other unknown address spaces should generally act like global.

Also test the new fat pointer address spaces?

arsenm added inline comments.May 9 2023, 8:57 AM
llvm/test/Transforms/InferAddressSpaces/AMDGPU/flat-fadd-fmin-fmax-intrinsics.ll
166

Also constant 32 bit

  • Taking into account remarks
jmmartinez retitled this revision from [AMDGPU][InferAddressSpaces] Only rewrite address-spaces that can be trivially casted to flat for llvm.amdgcn.flat.atomic.{fadd,fmax,fmin} to [AMDGPU][InferAddressSpaces] Only rewrite address-spaces that can be trivially casted to flat for llvm.amdgcn.flat.atomic.{fadd,fmax,fmin} (2/2).May 10 2023, 2:55 AM
jmmartinez marked 3 inline comments as done.May 10 2023, 3:11 AM
jmmartinez added inline comments.
llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
1088–1089

The problem is that currently we cannot lower this builtin for other address spaces other than flat.

I agree that the rewrite of this intrinsic loses almost all interest for this intrinsics. Maybe we shouldn't even rewrite them at all, which would simplify the code.

Handling CONSTANT_ADDRESS_32BIT gets awkward since it's not handled in isFlatGlobalAddrSpace. I modified isFlatGlobalAddrSpace to take it into account but I've got several fails in other tests (I haven't looked at the details yet).

llvm/test/Transforms/InferAddressSpaces/AMDGPU/flat-fadd-fmin-fmax-intrinsics.ll
4

I've added https://reviews.llvm.org/D150259 as a parent patch.

jmmartinez marked an inline comment as done.May 10 2023, 8:40 AM
jmmartinez added inline comments.May 15 2023, 2:10 AM
llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
1088–1089

Ping

arsenm added inline comments.May 15 2023, 2:43 PM
llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
1092–1093

Is this not just isExtendedGlobalAddrSpace?

1097

I don't understand why you are checking both address spaces. The old address space had to have been flat?

  • Remarks taken into account.
jmmartinez marked 3 inline comments as done.May 16 2023, 6:27 AM
jmmartinez added inline comments.
llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
1097

You're right. I fixed the code to match.

jmmartinez marked an inline comment as done.
  • Reduced diff
arsenm accepted this revision.May 16 2023, 6:59 AM
This revision is now accepted and ready to land.May 16 2023, 6:59 AM