This is an archive of the discontinued LLVM Phabricator instance.

[SveEmitter] Add builtins for svdupq and svdupq_lane
ClosedPublic

Authored by sdesmalen on Apr 23 2020, 12:34 PM.

Details

Summary
  • svdupq builtins that duplicate scalars to every quadword of a vector are defined using builtins for svld1rq (load and replicate quadword).
  • svdupq builtins that duplicate boolean values to fill a predicate vector are defined using svcmpne.

Diff Detail

Event Timeline

sdesmalen created this revision.Apr 23 2020, 12:34 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 23 2020, 12:34 PM
Herald added a subscriber: tschuett. · View Herald Transcript

Is there some reason you decided to implement these particular functions as inline functions directly in the header?

sdesmalen updated this revision to Diff 261537.May 1 2020, 2:13 PM
  • Moved dupq implementation to CGBuiltins

Is there some reason you decided to implement these particular functions as inline functions directly in the header?

Earlier implementations of the ACLE had a lot more things done in the header file (such as the predication with zero/undef, immediate checks, etc). We've moved most of that to CGBuiltins. These macros were a bit specific and hadn't been moved to CGBuiltin yet, but are now kind of remnants of the old implementation. These are better expressed like the other builtins, so that we can further simplify the header file and maybe get rid of it entirely at some point. I've updated the patch accordingly.

sdesmalen updated this revision to Diff 261545.May 1 2020, 2:18 PM
  • Re-added patch with diff context.
sdesmalen marked an inline comment as done.May 1 2020, 2:22 PM
sdesmalen added inline comments.
clang/utils/TableGen/SveEmitter.cpp
98

note: the changes to isScalarPredicate and to SVEType::str() are fixes to allow expressing bool, which weren't previously exhibited because there wasn't yet a builtin that used a scalar bool as argument.

efriedma added inline comments.May 1 2020, 6:19 PM
clang/lib/CodeGen/CGBuiltin.cpp
8054

Please use something like CreateTempAlloca(llvm::ArrayType::get(EltTy, NumOpnds), CharUnits::fromQuantity(16)).

(In particular, the way you've written it, the code allocates stack memory dynamically. Might want to add a test with some control flow to demonstrate that you're putting the alloca into the entry block.)

8058

Builder.getInt64().

clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_dupq.c
258

Please CHECK the alignment of all these operations.

sdesmalen updated this revision to Diff 261853.May 4 2020, 9:53 AM
  • Use CreateTempAlloca instead of CreateAlloca
  • Added checks for alignment to the stores.
  • Added test with control-flow to check the alloca is added to the entry-block.
sdesmalen marked 4 inline comments as done.May 4 2020, 9:54 AM
sdesmalen added inline comments.
clang/lib/CodeGen/CGBuiltin.cpp
8054

Good shout, I didn't realise that. I've fixed that in the latest revision.

This revision is now accepted and ready to land.May 4 2020, 11:22 AM
This revision was automatically updated to reflect the committed changes.
sdesmalen marked an inline comment as done.

I suspect some 'CHECK' lines should be CHECK-DAG lines, but I'll investigate that tomorrow. Thanks for pointing out!

I have reverted the patch in the meantime