This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: fold fmed3 of fpext sources to f16 fmed3
ClosedPublic

Authored by arsenm on May 5 2023, 5:29 PM.

Details

Reviewers
foad
b-sumner
Pierre-vh
Group Reviewers
Restricted Project
Summary

InstCombine already does this for minnum/maxnum. If we
also apply this to fmed3, we don't need to explicitly
use 16-bit fmed3 if we're not sure the target
supports 16-bit instructions yet.
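
The fold relies on fpext being exact and order-preserving, so picking the median before or after widening yields the same value. A minimal sketch of that property follows, using float to double as a stand-in for half to float because standard C++ has no portable half type. The names `med3` and `foldIsExact` are hypothetical and are not taken from the patch, and NaN handling is deliberately left out.

```cpp
#include <algorithm>
#include <cassert>

// Median of three values. NaN inputs are not handled in this sketch.
template <typename T>
T med3(T a, T b, T c) {
  return std::max(std::min(a, b), std::min(std::max(a, b), c));
}

// Hypothetical helper: compares "widen, then take the median" against
// "take the median in the narrow type, then widen".
bool foldIsExact(float x, float y, float z) {
  // Analogue of fmed3(fpext x, fpext y, fpext z).
  double wide = med3<double>(x, y, z);
  // Analogue of fpext(fmed3(x, y, z)).
  double narrow = static_cast<double>(med3<float>(x, y, z));
  return wide == narrow;
}
```

Because widening never rounds, the two orders agree for any non-NaN inputs, which is why the narrower operation can be substituted when all three sources are extensions.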

Event Timeline

arsenm created this revision. May 5 2023, 5:29 PM
Herald added a project: Restricted Project. May 5 2023, 5:29 PM
arsenm requested review of this revision. May 5 2023, 5:29 PM
Herald added a subscriber: wdng.
arsenm updated this revision to Diff 520214. May 7 2023, 3:03 PM

Rebase on more tests

foad added inline comments. May 9 2023, 7:39 AM
llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp, line 731

For constants, you need to check that they are exactly representable as half, otherwise this could change the result.
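
To illustrate the concern: a constant such as 0.1f has no exact half encoding, so narrowing it would round and could change which operand is the median. The sketch below is a standalone check, not the code from the patch. The helper name `fitsInHalf` is hypothetical, and NaN is conservatively rejected as an assumption of this example. Within LLVM, `APFloat::convert` reports the same condition through its `losesInfo` out-parameter.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper: true if v converts to IEEE binary16 (half) without
// rounding. NaN is conservatively rejected in this sketch.
bool fitsInHalf(float v) {
  if (std::isnan(v)) return false;
  if (std::isinf(v) || v == 0.0f) return true;
  float mag = std::fabs(v);
  if (mag > 65504.0f) return false;  // 65504 is the largest finite half
  int exp = 0;
  std::frexp(mag, &exp);             // mag = m * 2^exp with m in [0.5, 1)
  // Unbiased exponent is exp - 1; clamp to -14 to cover half subnormals.
  int e = std::max(exp - 1, -14);
  // Half values near mag are spaced 2^(e - 10) apart, so mag is
  // representable exactly when mag / 2^(e - 10) is an integer.
  float scaled = std::ldexp(mag, 10 - e);  // power-of-two scaling is exact
  return scaled == std::floor(scaled);
}
```

For example, `fitsInHalf(1.0f)` and `fitsInHalf(65504.0f)` hold, while `fitsInHalf(0.1f)` and `fitsInHalf(2049.0f)` do not, since half carries only 11 bits of significand precision.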

arsenm updated this revision to Diff 520707. May 9 2023, 8:18 AM
arsenm marked an inline comment as done.

Check conversion

foad accepted this revision. May 9 2023, 12:07 PM

Please add a test case for a constant that can't be losslessly converted. OK with that.

This revision is now accepted and ready to land. May 9 2023, 12:07 PM
arsenm added a comment. May 9 2023, 1:04 PM

> Please add a test case for a constant that can't be losslessly converted. OK with that.

It’s already there