Page MenuHomePhabricator

[AArch64][SVE] Add a pass for SVE intrinsic optimisations
ClosedPublic

Authored by kmclaughlin on Mar 12 2020, 10:12 AM.

Details

Summary

Creates the SVEIntrinsicOpts pass. In this patch, the pass tries
to remove unnecessary reinterpret intrinsics which convert to
and from svbool_t (llvm.aarch64.sve.convert.[to|from].svbool)

For example, the reinterprets below are redundant:

%1 = call <vscale x 16 x i1> @llvm.aarch64.sve.convert.to.svbool.nxv4i1(<vscale x 4 x i1> %a)
%2 = call <vscale x 4 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv4i1(<vscale x 16 x i1> %1)

The pass also looks for ptest intrinsics and phi instructions where
the operands are being needlessly converted to and from svbool_t.

Diff Detail

Event Timeline

kmclaughlin created this revision.Mar 12 2020, 10:12 AM
Herald added a project: Restricted Project. · View Herald Transcript
efriedma added inline comments.Mar 12 2020, 12:43 PM
llvm/lib/Target/AArch64/AArch64TargetMachine.cpp
444

unused bool?

llvm/lib/Target/AArch64/SVEIntrinsicOpts.cpp
57

processPhiNode(IntrinsicInst *I)?

110

Please use getArgOperand() to get the arguments of calls.

235

You might want to check whether the module actually declares any of the SVE intrinsics before you iterate over the whole function.

Cheers for working on this @kmclaughlin ! Have you considered adding an interface for the new PM? You could check this for reference: https://reviews.llvm.org/rGd6de5f12d485a85504bc99d384a85634574a27e2 (also implements a FunctionPass).

llvm/lib/Target/AArch64/SVEIntrinsicOpts.cpp
31

In AArch64TargetMachine.cpp it's:

static cl::opt<bool> EnableSVEIntrinsicOpts(
    "aarch64-sve-intrinsic-opts", cl::Hidden,
    cl::desc("Enable SVE intrinsic opts"),
    cl::init(true));

so it probably make sense to use see-intrinsic-opts here as well?

111

Given where this method is called, this is guaranteed to be always non-null. Perhaps assert instead?

121

Isn't it guaranteed that RequiredType == Reinterpret->getArgOperand(0)->getType() is always true? I.e., PN and the incoming values have identical type.

126

[nit] Perhaps Ctx instead of C1?

132

i -> I

188

I'd be tempted to rewrite this to use early exit (https://llvm.org/docs/CodingStandards.html#use-early-exits-and-continue-to-simplify-code):

auto *Y = isReinterpretToSVBool(I->getArgOperand(0));
if (nullptr == Y)
  return false;

// The rest of the function
225
  1. Could this be a for-range loop instead?
  1. This loop seems to be a perfect candidate for make_early_inc_range (https://github.com/llvm/llvm-project/blob/172f1460ae05ab5c33c757142c8bdb10acfbdbe1/llvm/include/llvm/ADT/STLExtras.h#L499), e.g.
for (Instruction &I : make_early_inc_range(BB))
 Changed |= optimizeIntrinsic(&I);
241

AFAIK, this iterates over BBs so I should be replaced with BB.

Also, perhaps for (auto *BB : RPOT) { instead?

llvm/test/CodeGen/AArch64/sve-intrinsic-opts-ptest.ll
19

What's %1 and %2? Is it worth adding the calls that generated them in the expected output?

Have you considered adding an interface for the new PM?

Please ignore that comment ^^^. I incorrectly assumed that the backend has also been ported to the new PM, but that's not the case. In particular, llc will only use the legacy PM anyway: https://github.com/llvm/llvm-project/blob/55b92dcb35a0758bea294ab6e42f9954ac06cac3/llvm/tools/llc/llc.cpp#L510

Apologies for the confusion!

kmclaughlin marked 13 inline comments as done.
  • Changed this from a function pass to a module pass & now check if any of the relevant SVE intrinsics are declared first before iterating over functions
  • Added more checks on expected output in the tests

Thanks for reviewing this, @efriedma & @andwar!

llvm/lib/Target/AArch64/AArch64TargetMachine.cpp
444

Removed

llvm/lib/Target/AArch64/SVEIntrinsicOpts.cpp
121

The incoming values to PN will all have the same type, but this is making sure that the reinterprets are all converting from the same type (there is a test for this in sve-intrinsic-opts-reinterpret.ll called reinterpret_reductions_1, where the arguments to convert.to.svbool are a mix of nxv2i1 and nxv4i1)

225

Changed this to use make_early_inc_range as suggested

235

Thanks for the suggestion - I changed this to a module pass so that we can check if any of the SVE intrinsics we are interested in are declared first.

llvm/test/CodeGen/AArch64/sve-intrinsic-opts-ptest.ll
19

I think that would make sense. I've added %1 and %2 to the expected output and added more checks to the other tests here.

Thanks for the updates @kmclaughlin ! Would you mind adding a comment to clearly mark the negative tests? E.g. for @reinterpret_reductions_1? Maybe also a comment why a particular case is not optimised? You've already done that for some tests, but not all of them.

llvm/lib/Target/AArch64/SVEIntrinsicOpts.cpp
11

I don't see any documentation for the pass that's being added here (apart from the commit msg). Perhaps it's worth expanding this comment?

222

I think that you could simplify things a bit by using SmallPtrSet instead: http://llvm.org/docs/ProgrammersManual.html#dss-smallptrset.

With a set you can avoid explicit checks like this:

std::find(Functions.begin(), Functions.end(), Inst->getFunction()) == Functions.end()

With SmallPtrSet you can write less code :)

242

Could you decorate this for loop with a comment explaining what kind of data is generated here and why? Ta!

kmclaughlin marked 3 inline comments as done.

Use SmallPtrSet instead of SmallVector for storing functions found by runOnModule
Add more comments to clarify the purpose of the pass and some of the negative reinterpret tests

efriedma added inline comments.Mar 24 2020, 10:56 AM
llvm/lib/Target/AArch64/SVEIntrinsicOpts.cpp
234

Iterating over a SmallPtrSet is non-deterministic.

In this context, probably SetVector is the right data structure.

kmclaughlin marked an inline comment as done.

Use SmallSetVector for the list of functions gathered by runOnModule to preserve the order of iteration

efriedma accepted this revision.Apr 9 2020, 1:37 PM

LGTM

This revision is now accepted and ready to land.Apr 9 2020, 1:37 PM
This revision was automatically updated to reflect the committed changes.