This was using the wrong result register, and dropping the result entirely
for v2f16. This would fail to select on the scalar case. I believe it
was also mishandling packed/unpacked subtargets.
Details
Diff Detail
Event Timeline
Comment Actions
One question, but apart from that LGTM.
The register initialization code is suboptimal, but I'm going to write up a patch for that.
llvm/lib/Target/AMDGPU/SIISelLowering.cpp | ||
---|---|---|
5311 | When does this case actually happen? |
Comment Actions
Never mind, I believe at least some API use cases actually need it like that... so it would require the frontend to give us more information, and it's not high impact anyway.
llvm/lib/Target/AMDGPU/SIISelLowering.cpp | ||
---|---|---|
5311 | This is the normal case with a load. The chainless case is the weird one above, for the non-loading intrinsics. |