This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Add A16/G16 to InstCombine
ClosedPublic

Authored by Flakebi on Aug 13 2020, 3:03 AM.

Details

Summary

When sampling from images with coordinates that only have 16 bit
accuracy, convert the image intrinsic call to use a16 or g16.
This does only happen if the target hardware supports it.

An alternative would be to always apply this combination, independent of
the target hardware and extend 16 bit arguments to 32 bit arguments
during legalization. To me, this sounds like an unnecessary roundtrip
that could prevent some further InstCombine optimizations.

Diff Detail

Event Timeline

Flakebi created this revision.Aug 13 2020, 3:03 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 13 2020, 3:03 AM
Flakebi requested review of this revision.Aug 13 2020, 3:03 AM
arsenm added inline comments.Aug 13 2020, 6:02 AM
llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp
56

Typo loosing

63

No else after return

70

No else after return. Also use match()?

85

No else after return

91

Dead code

96

Don't need llvm::

113

No else after return

824–826

The subtarget is already available in GCNTTI

Flakebi updated this revision to Diff 285388.Aug 13 2020, 8:45 AM

Fix review comments

arsenm added inline comments.Aug 13 2020, 1:02 PM
llvm/test/Transforms/InstCombine/AMDGPU/amdgcn-intrinsics.ll
3815

Can you add a test making sure this preserves fast math flags? We should be able to mark these as nnan / ninf etc. and have it be preserved

Flakebi updated this revision to Diff 285568.Aug 14 2020, 12:36 AM

Preserve fast-math flags and add test that ensures a16 combining is not done on gfx8.

arsenm accepted this revision.Aug 19 2020, 11:22 AM
This revision is now accepted and ready to land.Aug 19 2020, 11:22 AM
This revision was automatically updated to reflect the committed changes.
foad added a subscriber: foad.Aug 21 2020, 12:14 AM