This is an archive of the discontinued LLVM Phabricator instance.

[CodeGen] Add support for extracting elements of scalable vectors
ClosedPublic

Authored by david-arm on May 19 2020, 6:17 AM.

Details

Summary

I have tried to ensure that SelectionDAG and DAGCombiner do
sensible things for scalable vectors, and added support for a
limited number of simple folds. Codegen support for the vector
extract patterns have also been added to the AArch64 backend.

New vector extract tests have been added here:

CodeGen/AArch64/sve-extract-element.ll

and I have also added new folds using inserts and extracts here:

CodeGen/AArch64/sve-insert-element.ll

Diff Detail

Event Timeline

david-arm created this revision.May 19 2020, 6:17 AM
david-arm added a reviewer: kmclaughlin.
efriedma added inline comments.
llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
5379

Please note why this doesn't work for scalable vectors.

5432

Please note why this doesn't work for scalable vectors.

llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
1875

Maybe better to use lastb(whilels(0, idx), vec) or something like that?

david-arm updated this revision to Diff 265720.May 22 2020, 5:56 AM
david-arm marked 3 inline comments as done.May 22 2020, 6:00 AM

I've used the patterns you suggested for the extracts, since you're right they are more efficient. I've left the insert elements alone for now, but I'll change these in a future patch to use lastb(whilels ..) as well.

efriedma added inline comments.May 22 2020, 11:13 AM
llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
5433

This comment doesn't seem quite right.

For fixed-width vectors, we only do this for vectors with exactly one element, but that's not because it's semantically incorrect for vectors with more than one element. It's easy to convert indexing on a subvector to indexing on the original vector. The issue is just that it would make legalization more complex: we don't want to do the transform for vectors with more than one element without a profitability check.

llvm/test/CodeGen/AArch64/sve-extract-element.ll
172

This doesn't assemble. (I assume you meant to refer to xzr?)

david-arm marked 2 inline comments as done.May 25 2020, 11:42 PM
david-arm added inline comments.
llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
5433

OK fair enough. For the extract_vector_elt we're completely ignoring the index, which is fine for fixed width vectors, but not fine for scalable vectors, since we'd then need to generate code taking the index into account, which isn't worth it here. If the index > 0 for fixed width vectors the result is undefined anyway so we can simply return anything. I'll try to update the comment to be indicate this,

llvm/test/CodeGen/AArch64/sve-extract-element.ll
172

OK, I'll take a look. I built and ran a test for variants of v16i8, but missed this,

david-arm marked 2 inline comments as done.May 25 2020, 11:56 PM
david-arm added inline comments.
llvm/test/CodeGen/AArch64/sve-extract-element.ll
172

Hi Eli, I tried assembling this and it worked for me with this command:

llc --filetype=obj -mtriple=aarch64-linux-gnu -mattr=+sve < ../llvm/test/CodeGen/AArch64/sve-extract-element.ll > test.o

What command did you use?

efriedma added inline comments.May 26 2020, 11:53 AM
llvm/test/CodeGen/AArch64/sve-extract-element.ll
172

echo "whilels p0.d, #0, x8" | llvm-mc -triple=aarch64 -mattr=+sve

david-arm updated this revision to Diff 266450.May 27 2020, 1:25 AM
efriedma added inline comments.May 27 2020, 2:29 PM
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
17433–17434

Should this be if ((VecOp.getOpcode() == ISD::SPLAT_VECTOR || (IndexC && VecOp.getOpcode() == ISD::BUILD_VECTOR) && [...])?

david-arm marked an inline comment as done.
This revision is now accepted and ready to land.May 28 2020, 1:39 PM
This revision was automatically updated to reflect the committed changes.