Page MenuHomePhabricator

[CodeGen] Add support for extracting elements of scalable vectors

Authored by david-arm on May 19 2020, 6:17 AM.



I have tried to ensure that SelectionDAG and DAGCombiner do
sensible things for scalable vectors, and added support for a
limited number of simple folds. Codegen support for the vector
extract patterns have also been added to the AArch64 backend.

New vector extract tests have been added here:


and I have also added new folds using inserts and extracts here:


Diff Detail

Event Timeline

david-arm created this revision.May 19 2020, 6:17 AM
david-arm added a reviewer: kmclaughlin.
efriedma added inline comments.

Please note why this doesn't work for scalable vectors.


Please note why this doesn't work for scalable vectors.


Maybe better to use lastb(whilels(0, idx), vec) or something like that?

david-arm updated this revision to Diff 265720.May 22 2020, 5:56 AM
david-arm marked 3 inline comments as done.May 22 2020, 6:00 AM

I've used the patterns you suggested for the extracts, since you're right they are more efficient. I've left the insert elements alone for now, but I'll change these in a future patch to use lastb(whilels ..) as well.

efriedma added inline comments.May 22 2020, 11:13 AM

This comment doesn't seem quite right.

For fixed-width vectors, we only do this for vectors with exactly one element, but that's not because it's semantically incorrect for vectors with more than one element. It's easy to convert indexing on a subvector to indexing on the original vector. The issue is just that it would make legalization more complex: we don't want to do the transform for vectors with more than one element without a profitability check.


This doesn't assemble. (I assume you meant to refer to xzr?)

david-arm marked 2 inline comments as done.May 25 2020, 11:42 PM
david-arm added inline comments.

OK fair enough. For the extract_vector_elt we're completely ignoring the index, which is fine for fixed width vectors, but not fine for scalable vectors, since we'd then need to generate code taking the index into account, which isn't worth it here. If the index > 0 for fixed width vectors the result is undefined anyway so we can simply return anything. I'll try to update the comment to be indicate this,


OK, I'll take a look. I built and ran a test for variants of v16i8, but missed this,

david-arm marked 2 inline comments as done.May 25 2020, 11:56 PM
david-arm added inline comments.

Hi Eli, I tried assembling this and it worked for me with this command:

llc --filetype=obj -mtriple=aarch64-linux-gnu -mattr=+sve < ../llvm/test/CodeGen/AArch64/sve-extract-element.ll > test.o

What command did you use?

efriedma added inline comments.May 26 2020, 11:53 AM

echo "whilels p0.d, #0, x8" | llvm-mc -triple=aarch64 -mattr=+sve

david-arm updated this revision to Diff 266450.May 27 2020, 1:25 AM
efriedma added inline comments.May 27 2020, 2:29 PM

Should this be if ((VecOp.getOpcode() == ISD::SPLAT_VECTOR || (IndexC && VecOp.getOpcode() == ISD::BUILD_VECTOR) && [...])?

david-arm marked an inline comment as done.
This revision is now accepted and ready to land.May 28 2020, 1:39 PM
This revision was automatically updated to reflect the committed changes.