v3i16 and v3f16 currently cannot be legalized and lowered, so they should
not be emitted by inst combining.
Moved the check down to still allow extracting 1 or 2 elements via the dmask.
Fixes image intrinsics being combined to return v3x16.
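The ordering described above can be sketched in isolation. This is a minimal, standalone illustration and not the code from the patch: the function name `narrowedElementCount`, its parameters, and the bitmask representation of demanded elements are all hypothetical, and the real combine works on LLVM IR types rather than plain integers. The point it shows is that the 16-bit/3-element check runs only after the demanded elements have been counted, so narrowing to 1 or 2 elements is still possible.

```cpp
#include <bitset>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for the narrowing decision; not the LLVM implementation.
// scalarBits:   bit width of one vector element (e.g. 16 for i16/f16).
// numElts:      element count of the intrinsic's current return type (at most 32 here).
// demandedMask: bit i is set if element i of the result is actually used.
// Returns the narrowed element count, or std::nullopt to leave the call unchanged.
std::optional<unsigned> narrowedElementCount(unsigned scalarBits,
                                             unsigned numElts,
                                             std::uint32_t demandedMask) {
  if (numElts <= 1 || numElts >= 32)
    return std::nullopt; // nothing to narrow (or outside this sketch's range)

  // Count how many of the existing elements are demanded.
  std::uint32_t inRange = demandedMask & ((1u << numElts) - 1u);
  unsigned demanded = static_cast<unsigned>(std::bitset<32>(inRange).count());

  // This sketch does not model the "nothing demanded" case, and there is
  // nothing to shrink when every element is demanded.
  if (demanded == 0 || demanded >= numElts)
    return std::nullopt;

  // The check sits here, after counting: a 3-element vector of 16-bit
  // scalars (v3i16/v3f16) is refused, while 1 or 2 elements still pass.
  if (scalarBits == 16 && demanded == 3)
    return std::nullopt;

  return demanded;
}
```

Under these assumptions, a 4 x 16-bit result with three demanded elements is left alone, while the same result with one or two demanded elements, or a 4 x 32-bit result with three demanded elements, is still narrowed.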
Differential D84223
[AMDGPU] Don't combine memory intrs to v3i16
Closed, Public
Authored by Flakebi on Jul 21 2020, 2:14 AM.
Event Timeline
I’m also trying to get it working properly (currently for SDag). I think I got the legalization/widening part working, but I’m still trying to figure out how to select the right instruction patterns. The next two weeks I’m on vacation, so it will still take a while. I think Marek wants a slightly quicker fix, probably something in mesa hit this.
This revision is now accepted and ready to land. Jul 21 2020, 10:01 AM
Closed by commit rG2c659082bda6: [AMDGPU] Don't combine memory intrs to v3i16 (authored by sebastian-ne). Jul 22 2020, 3:44 AM
This revision was automatically updated to reflect the committed changes.
Revision Contents
Diff 279757
llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
llvm/test/Transforms/InstCombine/AMDGPU/amdgcn-demanded-vector-elts.ll