This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Cost-model vector concatenation
Needs RevisionPublic

Authored by SjoerdMeijer on Aug 5 2021, 1:41 AM.

Details

Summary

Vector concatenation is cheap as it can be achieved with a zip or mov.

Diff Detail

Event Timeline

SjoerdMeijer created this revision.Aug 5 2021, 1:41 AM
SjoerdMeijer requested review of this revision.Aug 5 2021, 1:41 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 5 2021, 1:41 AM
david-arm added inline comments.Aug 5 2021, 7:16 AM
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
2105

I think that getVectorNumElements will assert if SubLT.second is not a vector, so the additional isVector() call below is redundant here. If you specifically want to guard against the case where it isn't a vector it's probably better to write this as:

if (SubLT.second.isVector() && (Index + SubLT.second.getVectorNumElements()) == NumElts)
SjoerdMeijer added inline comments.Aug 5 2021, 7:38 AM
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
2105

Ah yeah, thanks, you're right about that. I was a bit too enthusiastic hoisting that getVectorNumElements() out of the if. Will fix this.

Will also fix the failing sve-intrinsics.ll that I forgot to update.

Can we also add some more tests, for things that are insert_subvector's but not just the simple concat's?

Can we also add some more tests, for things that are insert_subvector's but not just the simple concat's?

Yeah, I have tried, but unsuccessfully. The shuffle cost for a SK_InsertSubvector is guarded by isInsertSubvectorMask():

if (Shuffle->isInsertSubvectorMask(NumSubElts, SubIndex))
  return TargetTTI->getShuffleCost(
    TTI::SK_InsertSubvector, VecTy, Shuffle->getShuffleMask(),
    SubIndex,
    FixedVectorType::get(VecTy->getScalarType(), NumSubElts));

And this isInsertSubvectorMask() seems to only return true for subverter masks that are identity masks, so I haven't been able to trigger this with other masks.

  • fixed the if-statement,
  • fixed sve-intrinsics.ll.

See my previous comment about subvector insertions that are not concats; haven't been able to trigger that and find test cases for that.

SjoerdMeijer added inline comments.Aug 6 2021, 5:19 AM
llvm/test/Analysis/CostModel/AArch64/sve-intrinsics.ll
7 ↗(On Diff #364756)

I have just reran the script, but this is the only real change in this test.

david-arm added inline comments.Aug 6 2021, 8:02 AM
llvm/test/Analysis/CostModel/AArch64/sve-intrinsics.ll
7 ↗(On Diff #364756)

The cost for this looks very optimistic judging by the codegen as I can see around 30 instructions generated! Not sure why this has changed as I'd have expected an assert to fire in getVectorNumElements for scalable vectors.

SjoerdMeijer added inline comments.Aug 6 2021, 8:19 AM
llvm/test/Analysis/CostModel/AArch64/sve-intrinsics.ll
7 ↗(On Diff #364756)

Not sure why this has changed as I'd have expected an assert to fire in getVectorNumElements for scalable vectors.

Because of this:

unsigned getVectorNumElements() const {
  // TODO: Check that this isn't a scalable vector.
  return getVectorMinNumElements();
}

So it is happily returning 4 for this case. :-(

I think I will add a check for scalable vectors and a TODO for this.

Matt added a subscriber: Matt.Aug 6 2021, 8:55 AM

Can we also add some more tests, for things that are insert_subvector's but not just the simple concat's?

Yeah, I have tried, but unsuccessfully. The shuffle cost for a SK_InsertSubvector is guarded by isInsertSubvectorMask():

if (Shuffle->isInsertSubvectorMask(NumSubElts, SubIndex))
  return TargetTTI->getShuffleCost(
    TTI::SK_InsertSubvector, VecTy, Shuffle->getShuffleMask(),
    SubIndex,
    FixedVectorType::get(VecTy->getScalarType(), NumSubElts));

And this isInsertSubvectorMask() seems to only return true for subverter masks that are identity masks, so I haven't been able to trigger this with other masks.

This would count too %X = shufflevector <8 x i16> undef, <8 x i16> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 8, i32 9>. It has an identity for the first vector, with the second inserted into it.

I think the condition can be more specific to a concat if those are the cases that are really cheap. Two 64bit vectors combined together into a 128bit vector.

david-arm resigned from this revision.May 16 2022, 6:22 AM

Nothing has happened with this patch since last August, so I'm resigning as a reviewer for now!

Herald added a project: Restricted Project. · View Herald TranscriptMay 16 2022, 6:22 AM
fhahn requested changes to this revision.Sep 27 2022, 1:13 AM

Marking as changes requested as it looks like there are pending comments.

This revision now requires changes to proceed.Sep 27 2022, 1:13 AM