This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Stop mulhi from doing 24 bit mul for uniform values
ClosedPublic

Authored by dstuttard on May 28 2021, 8:34 AM.

Details

Summary

Added support to check if architecture supports s_mulhi which is used as part of
the decision whether or not to use valu 24 bit mul (if the mulhi gets
transformed to a valu op anyway, then may as well use it).

This is an extension of the work in D97063

Diff Detail

Event Timeline

dstuttard created this revision.May 28 2021, 8:34 AM
dstuttard requested review of this revision.May 28 2021, 8:34 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 28 2021, 8:34 AM
dstuttard retitled this revision from Stop mulhi from doing 24 bit mul for uniform values to [AMDGPU] Stop mulhi from doing 24 bit mul for uniform values.May 28 2021, 8:35 AM

Wasn't sure about the new HasSMulHi - any thoughts?

foad added inline comments.May 28 2021, 9:08 AM
llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h
166

I would just put the >= GFX9 test in here. No need for a HasSMulHi field. There are plenty of precedents for this.

dstuttard added inline comments.Jun 18 2021, 3:39 AM
llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h
166

Having given this some thought I'm not sure that I agree - the precedent in the implementation is to do something like I've implemented. That's the consistent thing to do.
Is there a good reason not to (other than it is slightly less code?) - implemented this way it is a lot clearer too.

foad added inline comments.Jun 18 2021, 4:01 AM
llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h
166

Both implementations ultimately say that s_mul_hi is supported on gfx9+. Mine just has fewer moving parts. I don't find yours clearer.

dstuttard added inline comments.Jun 18 2021, 9:17 AM
llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h
166

Until there's a variant at some point in the future that doesn't have SMulHi (although I guess the implementation can be changed if that ever occurs).

dstuttard marked an inline comment as done.Jun 21 2021, 6:07 AM
dstuttard added inline comments.
llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h
166

Actually - this is the way that this is handled in other cases. Because this test is inside AMDGPUISelLowering it has to use the SubTarget - so the test function has to be exposed from AMDGPUSubtarget - the getGeneration is only supported in GCNSubtarget (and defaults to false for the others - which is correct).

foad accepted this revision.Jun 21 2021, 6:30 AM

LGTM with some extra test coverage for s_mul_hi_u32.

llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h
166

Fair enough.

llvm/test/CodeGen/AMDGPU/mul_int24.ll
3

Could you also add GFX9 coverage in mul_uint24-amdgcn.ll?

This revision is now accepted and ready to land.Jun 21 2021, 6:30 AM
dstuttard updated this revision to Diff 353360.Jun 21 2021, 7:07 AM

Adding mul_uint24-amdgcn.ll gfx9 variant

dstuttard marked 2 inline comments as done.Jun 21 2021, 7:08 AM
This revision was landed with ongoing or failed builds.Jul 5 2021, 2:34 AM
This revision was automatically updated to reflect the committed changes.