An extractelement with a non-constant index will be lowered either to
scratch or to a movrel loop in most cases. This patch converts such an
instruction into a set of selects if the vector size is not too big.
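To make the summary concrete, here is a minimal scalar sketch of the select-chain idea in plain C++. It is not the code from the patch and does not use any LLVM API. The function name `extractViaSelects` and the choice to fall back to element 0 are assumptions made for illustration only.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Scalar model of the idea: instead of indexing memory with a dynamic
// index, compare the index against every constant lane number and pick the
// matching element with a select (modelled here by the ternary operator).
// extractViaSelects is a hypothetical name, not a function from the patch.
template <typename T, std::size_t N>
T extractViaSelects(const std::array<T, N> &Vec, std::uint32_t Idx) {
  static_assert(N > 0, "vector must have at least one element");
  // Assumed default: start from lane 0. This value is also what an
  // out-of-range index yields in this sketch.
  T Result = Vec[0];
  for (std::size_t Lane = 1; Lane < N; ++Lane)
    Result = (Idx == Lane) ? Vec[Lane] : Result; // one compare + one select
  return Result;
}
```

For a 4-element vector this chain is three compares and three selects, and the cost grows linearly with the element count, which is presumably why the conversion is limited to vectors that are not too big. In LLVM IR an out-of-range extractelement index produces poison, so returning lane 0 in that case is just one permissible choice.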
Repository: rL LLVM
Event Timeline
lib/Target/AMDGPU/SIISelLowering.cpp | ||
---|---|---|
8057–8058 ↗ | (On Diff #173434) | Is this a combine instead of custom lowering to handle illegal typed vectors? |
8060–8061 ↗ | (On Diff #173434) | Grammar, too many |
test/CodeGen/AMDGPU/extract_vector_dynelt.ll | ||
2 ↗ | (On Diff #173434) | Should have some 8 and 16-bit element vectors (and 1-bit since those always break things) |
lib/Target/AMDGPU/SIISelLowering.cpp | ||
---|---|---|
8057–8058 ↗ | (On Diff #173434) | This is to allow further combining of the selects. Ideally I would have done this in an IR pass, but AMDGPU codegen prepare runs too early to catch the extract vectors created by promote alloca. A DAG combine works just fine. |
Fixed comment and added tests.
test/CodeGen/AMDGPU/extract_vector_dynelt.ll | ||
---|---|---|
2 ↗ | (On Diff #173434) | Added tests except for i1. Extracting an i1 element was already broken in DAGTypeLegalizer::SplitVecOp_EXTRACT_VECTOR_ELT() before my patch. I have a test case, but the fix needs to be a separate, independent patch to the legalizer. |
test/CodeGen/AMDGPU/extract_vector_dynelt.ll | ||
---|---|---|
2 ↗ | (On Diff #173434) | I have fixed i1 splitvector, it is here in the review. |