Details
- Reviewers: arsenm
- Commits: rGbb1fe369774a: [AMDGPU] Make v8i16/v8f16 legal
Event Timeline
llvm/test/CodeGen/AMDGPU/vector_shuffle.packed.ll | ||
---|---|---|
1273–1282 ↗ | (On Diff #401420) | What happened here? |
1273–1282 ↗ | (On Diff #401420) | I believe this is the result of extract_subvector inserted by the extract_elt lowering for v8f16. |
1273–1282 ↗ | (On Diff #401420) | There's some special casing of other f16 vectors in LowerEXTRACT_SUBVECTOR, which I would assume these should follow as well. |
1273–1282 ↗ | (On Diff #401420) | Thanks, found it. I will probably need to add special cases for v8 too. |
1273–1282 ↗ | (On Diff #401420) | Hm... I ran into a DAG lowering loop. Maybe I need to find another way to do extract_elt here, without extract_subvector. |
There are upcoming intrinsics that use these types (v8i16/v8f16). Without a legal type these would use custom lowering.
Added custom lowering for vector_shuffle.
llvm/test/CodeGen/AMDGPU/vector_shuffle.packed.ll | ||
---|---|---|
1273–1282 ↗ | (On Diff #401420) | Well, it was missing custom lowering for vector_shuffle. |
- Added extract_subvector custom lowering which was missing.
- Added extract_subvector patterns for sub0_sub1 and sub2_sub3.
- Switched extract_elt to use a cast to v2i64 for splitting instead of extract_subvector, to avoid the DAG lowering loop.
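
For readers following the last point, here is a minimal standalone sketch of the splitting idea. It is a plain C++ model and not the code from this patch: the type aliases `V8I16` and `V2I64` and the helpers `castToV2I64` and `extractElt` are hypothetical names introduced only for illustration. It assumes the cast simply reinterprets the same 128 bits, so each 64-bit half holds four consecutive 16-bit lanes.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// Illustrative model only (not LLVM code): a v8i16 value as eight 16-bit
// lanes, and the same 128 bits viewed as two 64-bit elements.
using V8I16 = std::array<uint16_t, 8>;
using V2I64 = std::array<uint64_t, 2>;

static_assert(sizeof(V2I64) == sizeof(V8I16),
              "both views must cover the same 128 bits");

// Hypothetical helper modelling the cast: reinterpret the bits, change nothing.
V2I64 castToV2I64(const V8I16 &v) {
  V2I64 out;
  std::memcpy(out.data(), v.data(), sizeof(out));
  return out;
}

// Extract lane `idx` in two steps: pick the 64-bit half that contains it,
// then pick the 16-bit lane inside that half. No subvector extraction of the
// original 8-lane vector is needed.
uint16_t extractElt(const V8I16 &v, unsigned idx) {
  assert(idx < 8 && "lane index out of range");
  V2I64 halves = castToV2I64(v);
  uint64_t half = halves[idx / 4];   // lanes 0-3 -> half 0, lanes 4-7 -> half 1
  std::array<uint16_t, 4> lanes;
  std::memcpy(lanes.data(), &half, sizeof(half));
  return lanes[idx % 4];             // position of the lane within its half
}
```

The point of the model is that the split into halves goes through a 2-element extract on the cast value, so the lowering of an element extract never has to create an extract_subvector node that could in turn be lowered back into element extracts.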