This is an archive of the discontinued LLVM Phabricator instance.

[clang][AArch64][SVE] Avoid going through memory for fixed/scalable predicate casts
ClosedPublic

Authored by bsmith on Jul 27 2021, 3:56 AM.

Details

Summary

For fixed SVE types, predicates are represented using vectors of i8,
where as for scalable types they are represented using vectors of i1. We
can avoid going through memory for casts between these by bitcasting the
i1 scalable vectors to/from a scalable i8 vector of matching size, which
can then use the existing vector insert/extract logic.

Diff Detail

Event Timeline

bsmith created this revision.Jul 27 2021, 3:56 AM
bsmith requested review of this revision.Jul 27 2021, 3:56 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 27 2021, 3:56 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
Matt added a subscriber: Matt.Jul 27 2021, 4:54 AM
c-rhodes added inline comments.Jul 27 2021, 6:25 AM
clang/lib/CodeGen/CGExprScalar.cpp
2065–2076

should const be dropped now the type might change?

2110–2128

With the predicate casting now using the intrinsics I don't think this is needed any longer. Perhaps we should add an unreachable above if the element type doesn't match?

bsmith added inline comments.Jul 28 2021, 5:18 AM
clang/lib/CodeGen/CGExprScalar.cpp
2110–2128

Don't we still need this for casting between vectors with different element types, or are these guaranteed to not hit this code path?

c-rhodes added inline comments.Jul 28 2021, 8:28 AM
clang/lib/CodeGen/CGExprScalar.cpp
2110–2128

Don't we still need this for casting between vectors with different element types, or are these guaranteed to not hit this code path?

Apologies I was wrong, you're right we'll still need it for casting between vectors with different element types to support lax vector conversions, although it's no longer tested. Might be worth adding a codegen test for ~:

svint64_t lax_cast(fixed_int32_t t) {
    return t;
}

and the comment could also be updated

junparser added inline comments.Jul 28 2021, 7:00 PM
clang/lib/CodeGen/CGExprScalar.cpp
2102

I think this may also works for casting between vectors with different element types.

bsmith added inline comments.Jul 29 2021, 8:44 AM
clang/lib/CodeGen/CGExprScalar.cpp
2102

A similar argument applies here as the other related ticket, in principal we could, however it's not clear that there is a good use case for writing code that would make use of this. So for now it's probably best to just deal with predicates which are definitely a problem and other cases as they arise.

junparser added inline comments.Jul 29 2021, 7:31 PM
clang/lib/CodeGen/CGExprScalar.cpp
2102

Although i believe this generates better code than using memory load/store. Thanks for explaining this.

bsmith updated this revision to Diff 363093.Jul 30 2021, 7:58 AM
bsmith marked an inline comment as done.
  • Update comment to reflect changes
  • Add new test for lax casting via memory
c-rhodes accepted this revision.Aug 2 2021, 6:29 AM

thanks @bsmith, just left one minor nit but otherwise LGTM

clang/test/CodeGen/attr-arm-sve-vector-bits-cast.c
10

nit: unused?

This revision is now accepted and ready to land.Aug 2 2021, 6:29 AM
This revision was landed with ongoing or failed builds.Aug 4 2021, 9:10 AM
This revision was automatically updated to reflect the committed changes.