An extractelement with a non-constant index will in most cases be lowered
either to scratch or to a movrel loop. This patch converts such an
instruction into a set of selects if the vector size is not too big.
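
As a rough illustration of the idea, a dynamic extract can be modelled as a chain of compare-and-select steps, one per lane. The sketch below is a plain scalar C++ model, not the code from this patch; the function name `extractViaSelects` and the use of `std::array` are assumptions made only for the example.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model of the transformation (not LLVM code).
// Each loop iteration stands for one compare + select pair, so an
// N-element vector costs N-1 selects instead of a scratch access or
// a movrel loop.
template <typename T, std::size_t N>
T extractViaSelects(const std::array<T, N> &Vec, std::uint32_t Idx) {
  static_assert(N > 0, "vector must be non-empty");
  T Result = Vec[0]; // lane 0 is the fallback value
  for (std::size_t I = 1; I < N; ++I)
    Result = (Idx == I) ? Vec[I] : Result; // select(Idx == I, Vec[I], Result)
  return Result;
}
```

In this model an out-of-range index simply yields lane 0. That is acceptable here because an extractelement with an out-of-range index produces poison in LLVM IR, so any value is a valid result. The chain grows linearly with the number of lanes, which is why the conversion only makes sense for small vectors.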
lib/Target/AMDGPU/SIISelLowering.cpp | ||
---|---|---|
8057–8058 | This is to allow further combining of the selects. Ideally I would have done this in an IR pass, but amdgpu codegen prepare runs too early to catch the vector extracts created by promote alloca. A DAG combine works just fine. |
Fixed comment and added tests.
test/CodeGen/AMDGPU/extract_vector_dynelt.ll | ||
---|---|---|
2 | Added tests except for i1. Extracting an i1 element was already broken in DAGTypeLegalizer::SplitVecOp_EXTRACT_VECTOR_ELT() before my patch. I have a testcase, but that needs to be a separate, independent patch to the legalizer. |
test/CodeGen/AMDGPU/extract_vector_dynelt.ll | ||
---|---|---|
2 | I have fixed the i1 splitvector case; it is here in the review. |
Is this a combine rather than custom lowering so that it can handle illegally typed vectors?