This patch adds patterns and tests for subvector insert/extract
intrinsics to/from all legal predicate types.
Repository: rG LLVM Github Monorepo
Event Timeline
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td | ||
---|---|---|
1542 | nit: whitespace change | |
llvm/test/CodeGen/AArch64/sve-insert-vector.ll | ||
711 | Is there value in adding a simple test that inserts <vscale x 1 x i1> into a <vscale x 1 x i1> vector? I assume that standard DAG combines will treat this as a simple copy? | |
992 | I don't think this is a problem with your patch, but the spill and fill in this output look unnecessary. I think we have enough registers to support this without spilling? Also, something odd seems to be happening with the offset for the spill/fill, i.e. "#7, mul vl". I assume that translates to offset = 7 x vscale x 2? It seems to fit into the stack space we've allocated, but I wonder if this is just pure luck? |
llvm/test/CodeGen/AArch64/sve-insert-vector.ll | ||
---|---|---|
992 | Ah, perhaps we're actually just storing it into the top of the temporary stack space we've allocated, i.e. the top part of a "vscale x 16" byte object. |
llvm/test/CodeGen/AArch64/sve-insert-vector.ll | ||
---|---|---|
711 | I think there's little value in that because it becomes a copy straight away when building the DAG. | |
992 | The spill and fill happen because p4-p15 are callee-saved, and this function needs p4 as a scratch register. On the stack slot: correct. The space it allocates is aligned to <vscale x 16 x i8>, so that's the smallest space that gets allocated, but then it only stores a <vscale x 16 x i1>. |
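
For reference, the offset arithmetic from the 992 thread can be sketched as below. This is a minimal sketch and not code from the patch. It assumes, as suggested in the thread, that "#imm, mul vl" on a predicate spill/fill scales the immediate by the predicate size (vscale x 2 bytes), and that the stack object is a single vscale x 16 byte slot. The constant and function names are hypothetical.

```python
# Sizes per unit of vscale (assumptions taken from the review thread).
PREDICATE_BYTES_PER_VSCALE = 2   # <vscale x 16 x i1>: 16 bits = 2 bytes
VECTOR_BYTES_PER_VSCALE = 16     # <vscale x 16 x i8>: 16 bytes


def predicate_slot_range(imm, vscale):
    """Return the [start, end) byte range of a predicate stored at '#imm, mul vl'."""
    predicate_bytes = PREDICATE_BYTES_PER_VSCALE * vscale
    start = imm * predicate_bytes
    return start, start + predicate_bytes


def fits_in_object(imm, vscale):
    """Check that the predicate slot lies inside one vscale x 16 byte object."""
    start, end = predicate_slot_range(imm, vscale)
    return start >= 0 and end <= VECTOR_BYTES_PER_VSCALE * vscale


if __name__ == "__main__":
    for vscale in (1, 2, 4, 16):
        start, end = predicate_slot_range(7, vscale)
        print(f"vscale={vscale}: bytes [{start}, {end}), fits={fits_in_object(7, vscale)}")
```

Under these assumptions, an immediate of 7 places the predicate in bytes [14 x vscale, 16 x vscale), which is the top slice of the object for every vscale, so the fit is not a coincidence. An immediate of 8 would fall outside the object.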
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td | ||
---|---|---|
1542 | That was actually intentional, to cluster all the two-stage unpacks together. |