This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Fix interaction of tfe and d16
ClosedPublic

Authored by arsenm on Jan 17 2020, 3:53 PM.

Details

Summary

This was using the wrong result register, and dropping the result entirely
for v2f16. This would fail to select on the scalar case. I believe it
was also mishandling packed/unpacked subtargets.
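
For context on the packed/unpacked point, here is a minimal sketch of how the result size of a d16 load changes once tfe is enabled. It is not the code from this patch: `resultDwords` and its parameters are hypothetical names chosen for illustration. It assumes that packed subtargets store two 16-bit components per dword, that unpacked subtargets store one per dword, and that tfe appends a single status dword after the data.

```cpp
#include <cassert>

// Hypothetical helper for illustration only; not an LLVM API.
// Returns how many 32-bit result registers a load with numElts
// components is assumed to write.
unsigned resultDwords(unsigned numElts, bool isD16, bool packedD16, bool tfe) {
  assert(numElts >= 1 && numElts <= 4 && "expected 1 to 4 components");

  // Packed d16: two 16-bit components share one dword (rounded up).
  // Unpacked d16 and 32-bit loads: one component per dword.
  unsigned dataDwords = (isD16 && packedD16) ? (numElts + 1) / 2 : numElts;

  // tfe adds one trailing status dword after the data dwords.
  return dataDwords + (tfe ? 1u : 0u);
}
```

Under these assumptions a v2f16 load with tfe occupies two dwords on a packed subtarget (one data, one status) and three on an unpacked one. The status dword therefore sits at a different register index in each case, which is the kind of mismatch that picks the wrong result register.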

Event Timeline

arsenm created this revision. Jan 17 2020, 3:53 PM
nhaehnle accepted this revision. Jan 22 2020, 4:20 AM

One question, but apart from that LGTM.

The register initialization code is suboptimal, but I'm going to write up a patch for that.

llvm/lib/Target/AMDGPU/SIISelLowering.cpp
5311

When does this case actually happen?

This revision is now accepted and ready to land. Jan 22 2020, 4:20 AM

> The register initialization code is suboptimal, but I'm going to write up a patch for that.

Never mind, I believe at least some API use cases actually need it like that... so it would require the frontend to give us more information, and it's not high impact anyway.

arsenm marked an inline comment as done. Jan 22 2020, 5:20 AM
arsenm added inline comments.
llvm/lib/Target/AMDGPU/SIISelLowering.cpp
5311

This is the normal case with a load. The chainless case is the weird one above, for the non-loading intrinsics.