Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
9051 | We can but currently EXTRACT_VECTOR_ELT doesn't support any of the unpacked types so I didn't bother considering them because I'm mainly concerned with maintaining code quality for D87843. Although this means the code is wrong because it's incorrectly reporting the nodes as legal, so I'll just remove this block. | |
9074 | I just figured it was better to reuse the existing patterns? |
Remove block that is incorrectly reporting unpacked and predicate EXTRACT_VECTOR_ELT as legal.
Please make sure we have test coverage for the SVE-only extracts (e.g. extracting the third element of a <vscale x 2 x i64>).
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
9074 | The EXTRACT_SUBVECTOR doesn't really have any semantic meaning here: it's just to model the fact that the underlying instructions are "NEON" instructions. From a DAGCombine perspective, that isn't really useful information. Given that, I'd lean towards adding more patterns, even if they're sort of redundant. That said, the current approach is okay, I guess. Is there any reason to prefer NEON extract instructions over SVE ones if we're producing a floating-point value? |
Replaced custom selection with isel patterns. Since we're going the isel route I figures I may was well add the missing patterns for unpacked floating point types.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
9074 |
I was thinking about it the other way round, namely why would we prefer SVE over NEON? This was based on an assumption that a NEON extract might be cheaper than a full register SVE dup. At this stage though I'm happy to wait for proof one way or the other, so have just omitted the floating-point patterns for this patch. |
LGTM with one minor fix.
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td | ||
---|---|---|
2202 ↗ | (On Diff #292827) | Typo for i64? Should be nxv2i64. I think this affects one of the tests. |
I assume we could handle nxv2f16 if we wanted to? I guess the indexing gets a little more complicated.