This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
llvm/
-
include/llvm/Analysis/
-
llvm/
-
Analysis/
3/3
TargetLibraryInfo.h
40/40
VectorUtils.h
-
lib/
-
Analysis/
-
LazyCallGraph.cpp
5/5
LoopAccessAnalysis.cpp
-
Transforms/Vectorize/
-
Vectorize/
-
LoopVectorizationLegality.cpp
1/1
LoopVectorize.cpp

Differential D67572

[VectorUtils] Introduce the Vector Function Database (VFDatabase).
ClosedPublic

Authored by fpetrogalli on Sep 13 2019, 2:40 PM.

Download Raw Diff

Details

Reviewers

sdesmalen
jdoerfert
simoll
hsaito
ABataev
fhahn
rengolin
hfinkel

Commits

rG0be81968a283: [VectorUtils] Introduce the Vector Function Database (VFDatabase).

Summary

This patch introduced the VFDatabase, the framework proposed in
http://lists.llvm.org/pipermail/llvm-dev/2019-June/133484.html. [*]

In this patch the VFDatabase is used to bridge the TargetLibraryInfo
(TLI) calls that were previously used to query for the availability of
vector counterparts of scalar functions.

The VFISAKind field `ISA` of VFShape have been moved into into VFInfo,
under the assumption that different vector ISAs may provide the same
vector signature. At the moment, the vectorizer accepts any of the
available ISAs as long as the signature provided by the VFDatabase
matches the one expected in the vectorization process. For example,
when targeting AVX or AVX2, which both have 256-bit registers, the IR
signature of the two vector functions associated to the two ISAs is
the same. The `getVectorizedFunction` method at the moment returns the
first available match. We will need to add more heuristics to the
search system to decide which of the available version (TLI, AVX,
AVX2, ...)  the system should prefer, when multiple versions with the
same VFShape are present.

Some of the code in this patch is based on the work done by Sumedh
Arani in https://reviews.llvm.org/D66025.

[*] Notice that in the proposal the VFDatabase was called SVFS. The
name VFDatabase is more in line with LLVM recommendations for
naming classes and variables.

Differential Revision: https://reviews.llvm.org/D67572

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

fpetrogalli created this revision.Sep 13 2019, 2:40 PM

Herald added a project: Restricted Project. · View Herald TranscriptSep 13 2019, 2:40 PM

Herald added subscribers: llvm-commits, hiraditya. · View Herald Transcript

This is a follow up patch to https://reviews.llvm.org/D66024

It is a WIP becase I want to seek feedback before doing a full implementation with tests.

The current tests/unittests pass without failures, so at least it seems I haven't broken anything :)

I am in particular interested to know what you think about the way I have deferred the question "isFunctionVectorizable" and related "get" method from the TargetLibraryInfo (TLI). I have marched such methods as private, so that with this patch the only way to ask such questions around vectorizable function is to go through the SearchVFSystem class.

Notice that this patch doesn't implement the interface explained in the RFC at http://lists.llvm.org/pipermail/llvm-dev/2019-June/133484.html (yet).

One more thing: in the RFC we decided to pass the CallInst to the methods of the class instead of the Function definition, because we wanted to make the attribute vector-function-abi-variant to be local to the call site, not the declaration. Unfortunately, I couldn't find a way to attach an attribute to a CallInst. Is this the case? Or have I missed somethign around Instructions and attributes?

Kind regards,

Francesco

I'm fine with the general idea. The non-WIP patch needs documentation and a test for the new functionality (non-TLI-based selection).

In D67572#1670052, @fpetrogalli wrote:

One more thing: in the RFC we decided to pass the CallInst to the methods of the class instead of the Function definition, because we wanted to make the attribute vector-function-abi-variant to be local to the call site, not the declaration. Unfortunately, I couldn't find a way to attach an attribute to a CallInst. Is this the case? Or have I missed somethign around Instructions and attributes?

Could you add it to the attribute list of the call site at AttributeList::FunctionIndex?

In D67572#1725229, @simoll wrote:

In D67572#1670052, @fpetrogalli wrote:

One more thing: in the RFC we decided to pass the CallInst to the methods of the class instead of the Function definition, because we wanted to make the attribute vector-function-abi-variant to be local to the call site, not the declaration. Unfortunately, I couldn't find a way to attach an attribute to a CallInst. Is this the case? Or have I missed somethign around Instructions and attributes?

Could you add it to the attribute list of the call site at AttributeList::FunctionIndex?

Thank you for the suggestion. I have looked into this but it seems that adding an attribute to a CallInst in the AttributeList::FunctionIndex has the effect of juts adding the attribute to the function being called.

In D67572#1727678, @fpetrogalli wrote:

In D67572#1725229, @simoll wrote:

In D67572#1670052, @fpetrogalli wrote:

One more thing: in the RFC we decided to pass the CallInst to the methods of the class instead of the Function definition, because we wanted to make the attribute vector-function-abi-variant to be local to the call site, not the declaration. Unfortunately, I couldn't find a way to attach an attribute to a CallInst. Is this the case? Or have I missed somethign around Instructions and attributes?

Could you add it to the attribute list of the call site at AttributeList::FunctionIndex?

Thank you for the suggestion. I have looked into this but it seems that adding an attribute to a CallInst in the AttributeList::FunctionIndex has the effect of juts adding the attribute to the function being called.

The attribute lists are different. You might have added it to the function one but you can add it to the CallBase one as well.

In D67572#1727921, @jdoerfert wrote:

In D67572#1727678, @fpetrogalli wrote:

In D67572#1725229, @simoll wrote:

In D67572#1670052, @fpetrogalli wrote:

One more thing: in the RFC we decided to pass the CallInst to the methods of the class instead of the Function definition, because we wanted to make the attribute vector-function-abi-variant to be local to the call site, not the declaration. Unfortunately, I couldn't find a way to attach an attribute to a CallInst. Is this the case? Or have I missed somethign around Instructions and attributes?

Could you add it to the attribute list of the call site at AttributeList::FunctionIndex?

Thank you for the suggestion. I have looked into this but it seems that adding an attribute to a CallInst in the AttributeList::FunctionIndex has the effect of juts adding the attribute to the function being called.

The attribute lists are different. You might have added it to the function one but you can add it to the CallBase one as well.

Exactly

In D67572#1732710, @simoll wrote:

In D67572#1727921, @jdoerfert wrote:

In D67572#1727678, @fpetrogalli wrote:

In D67572#1725229, @simoll wrote:

In D67572#1670052, @fpetrogalli wrote:

One more thing: in the RFC we decided to pass the CallInst to the methods of the class instead of the Function definition, because we wanted to make the attribute vector-function-abi-variant to be local to the call site, not the declaration. Unfortunately, I couldn't find a way to attach an attribute to a CallInst. Is this the case? Or have I missed somethign around Instructions and attributes?

Could you add it to the attribute list of the call site at AttributeList::FunctionIndex?

Thank you for the suggestion. I have looked into this but it seems that adding an attribute to a CallInst in the AttributeList::FunctionIndex has the effect of juts adding the attribute to the function being called.

The attribute lists are different. You might have added it to the function one but you can add it to the CallBase one as well.

Exactly

Yep! I have a working version that I cleaning up for submission.

fpetrogalli updated this revision to Diff 227770.Nov 4 2019, 1:17 PM

fpetrogalli retitled this revision from [SVFS] SearchVFSystem interface (WIP). to [SVFS] The Search Vector Function System..

fpetrogalli edited the summary of this revision. (Show Details)

fpetrogalli added reviewers: hsaito, ABataev.

fpetrogalli marked an inline comment as done.Nov 4 2019, 1:22 PM

fpetrogalli added inline comments.

llvm/include/llvm/Analysis/VectorUtils.h
57	I have added this to make it work with the TLI. An alternative would be to add an extra field to the `VecDesc` instances stored in the TLI to hold also the ISA information. I am happy to do so (or to investigate any other alternative approach) is the solution proposed in this patch is not convincing.

fpetrogalli marked an inline comment as done.Nov 4 2019, 1:32 PM

fpetrogalli added inline comments.

llvm/include/llvm/Analysis/VectorUtils.h
246	The `VFISAKind::LLVM_INTERNAL_TLI` value, at the moment, is restricting the use of the SVFS to only those functions that are listed in the TLI. This makes me wonder whether it would be better to move the VFISAKind from the VFShape to the VFInfo. This could be justified under the assumption that a scalar function can be mapped to multiple vector functions with the same vectorization shape (same `FunctionType`), just with a different underlying ISA. Any thoughts?

fpetrogalli marked an inline comment as done.Nov 5 2019, 11:05 AM

fpetrogalli added inline comments.

llvm/include/llvm/Analysis/VectorUtils.h
221–223	This method `fixUpVFABINames` could be invoked once directly in the LoopVectorizer (runOnFunction), so that: it avoids calling it every time the SVFS is instantiated, by doing it once for all the calls in the module (or function). it allows the SVFS to be independent on the TLI. This seems a better solution to me - any opinions?

JonChesterfield added a subscriber: JonChesterfield.Nov 6 2019, 7:59 AM

jdoerfert added inline comments.Nov 6 2019, 12:03 PM

llvm/include/llvm/Analysis/TargetLibraryInfo.h
201	Documentation missing, and please do not call it "fixup", maybe "adjust" or something but not "fixup".
262	Why do you privatize these redirects? I would expect them to be used, thus public, or unused, thus deleted.
llvm/include/llvm/Analysis/VectorUtils.h
55	I generally dislike non-scoped macros. Is there a reason a `static constexpr char*` doesn't work? I would also opt for a shorter name, "_llvm_" or "_llvm_tli_" but that is debatable.
132–156	I doubt this change is necessary. Only if you need to own the underlying data you need to replace StringRef with std::string.
164	I'd prefer `void getVectorVariantNames(CallInst CI, SmallSet<std::string, 8> &VariantMappings);` but for sure you want a reference here: void setVectorVariantNames(CallInst CI, const SmallSet<std::string, 8> &VariantMappings);

fpetrogalli marked 5 inline comments as done.Nov 6 2019, 2:00 PM

fpetrogalli added inline comments.

llvm/include/llvm/Analysis/TargetLibraryInfo.h
262	I am privatizing these because `isFunctionVectorizable` and `getVectorizedFunction` to be asked to any component other than the `SearchVFSystem` introduced in this patch.

fpetrogalli updated this revision to Diff 228136.Nov 6 2019, 2:00 PM

fpetrogalli added reviewers: fhahn, rengolin.Nov 6 2019, 2:03 PM

After the last rework, I think this patch deserves a bit of a summary.

The patch as it is should be split in at least the following components, that needs to be done one by one in sequence:

Methods in the VFABI namespace to read and set the vector-function-abi-variant attribute. In the current patch, this is done with the methods [get|set]VectorVariantNames.
An early pass to be run early in the opt pass sequence, at least before the LoopVectorize pass and any of the analysis passes needed for vectorization, that populates the module with the attribute that describes the TLI mappings. This is currently _hacked_ in this patch by invoking addMappingsFromTLI in LoopVectorizer.cpp.
Once we have the TLI mappings in the IR, we can add the SearchVFSystem and use it's search mechanism to perform vectorization (an example of how this works is already in this patch)

Now, once we have this first item in places, we can extend the getVectorizedFunction interface to be more clever, and ask questions also areound properties of the vector function like "linear" and "uniform" parameters.

I would like to ask to the reviewers if this implementation sequence I propose makes sense. Please le t me know if you have any question.

Also, I have added a couple of people that I know have opinion on the vectorizers. I am sure there are more out there, but the list of contributors to all the places in LLVM that this patch touches is quite extended and it is difficult for me to make a choice.

fpetrogalli added a reviewer: hfinkel.Nov 6 2019, 2:14 PM

I have implemented the first item of the breakdown described in https://reviews.llvm.org/D67572#1736230 in patch https://reviews.llvm.org/D69976 - read/write methods for the VFABI attribute.

Thanks all,

Francesco

This is an update of the code after extracting the read/write function of the VFABI attribute in https://reviews.llvm.org/D69976.

jdoerfert added inline comments.Nov 7 2019, 6:02 PM

llvm/include/llvm/Analysis/VectorUtils.h
218	Arguably, you could just call `getVFMappings` and cache the result. Whoever created a SearchVFSystem will need the values eventually and this way the `isFucntionVectorizable` call is free. `CI->getModule()` The explicit and delete seems a lot of overhead, make the member a `CallInst &CI` and you should not need any of it.
260	Typos in the above TODO.
llvm/lib/Analysis/LoopAccessAnalysis.cpp
2428–2429	Leftover?
llvm/lib/Analysis/TargetLibraryInfo.cpp
22 ↗	(On Diff #228337)	Leftover?

Changes:

Update the patch to use the current version of the VFABI read/write methods at https://reviews.llvm.org/D69976
Address review from @jdoerfert

Note: in the next step I will extract from this patch the changes that are needed for the internal ISA for the TLI functions.

fpetrogalli marked 2 inline comments as done.Nov 8 2019, 3:12 PM

Internal vector function mangling for TLI added at https://reviews.llvm.org/D70089

fpetrogalli updated this revision to Diff 229627.Nov 15 2019, 1:14 PM

I have rebased the changes after the addition of the pass for injecting the TLI calls.

The vectorization process itself works, but I don't undertstand why a
bunch of tests are failing - I must have done somethign wrong in the
initialization of the pass.

Most (if not all) failures assert on the following:

Unable to schedule 'Scalar Evolution Analysis' required by 'Loop Vectorization'
Unable to schedule pass
UNREACHABLE executed at /home/frapet01/projects/upstream-clang/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1289!

Herald added a subscriber: mehdi_amini. · View Herald TranscriptNov 15 2019, 1:14 PM

Harbormaster completed remote builds in B41057: Diff 229627.Nov 15 2019, 1:21 PM

Down to two failures.

Failing Tests (2):

LLVM :: Other/pass-pipelines.ll
LLVM :: Transforms/SimplifyCFG/HoistCode.ll

The latter probably caused by a recent rebase.

The former is a genuine failure of this patch.

Harbormaster completed remote builds in B41065: Diff 229646.Nov 15 2019, 2:58 PM

In this patch:

The SearchVFSystem has been renamed to VFDatabase.

The query interface of the VFDatabase has been reduced to only two functions.

The field VFIsaKind ISA of VFShape has been moved to VFInfo. This change is justified in the new commit message.

fpetrogalli retitled this revision from [SVFS] The Search Vector Function System. to [VectorUtils] Introduce the Vector Function Database (VFDatabase)..Nov 18 2019, 3:48 PM

fpetrogalli edited the summary of this revision. (Show Details)

Harbormaster completed remote builds in B41144: Diff 229942.Nov 18 2019, 3:50 PM

steleman added a subscriber: steleman.Nov 18 2019, 4:00 PM

Thanks @fpetrogalli for updating this patch!

Down to two failures.
Failing Tests (2):

Just checking, are these failures resolved in the latest revision?

llvm/include/llvm/Analysis/VectorUtils.h
119	nit: retrive -> retrieve
124	I don't really understand what you mean by a "flat" vectorization shape. Is this function supposed to return a 'widened' (vector version of a) function? If so, can we please rename this to `getVectorShapeForCall` ?
127	nit: unnecessary curly braces
130	is `HasGlobalPred` the same as `isPredicated`? Also, is the predicate always known/expected to be the last parameter by the Vector ABI?
167	nit: in a class members are private by default.
168	nit: put comments above variable
177	StringRef ?
186	nit: These two if-statements can be merged with a `&&`
188	should `CI.getModule()->getFunction(Shape.getValue().VectorName)` be an assert? When would it ever happen that the vector function is not declared * in the IR? note that I'm specifically saying declared* and not defined here.
215	Is it worth implementing a `operator<` for VFShape and sorting the result for `getMappings()`? That way we can use binary search using llvm::lower_bound instead of looping through each shape in `ScalarToVectorMappings`.
llvm/lib/Analysis/LoopAccessAnalysis.cpp
1847	`getMappings` is quite an expensive call, so you'll want to add a special function here that bails out earlier, and doesn't have to demangle string and populate (and possibly sort) an array.
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
3214	the call to `isFunctionVectorizable` is expensive, so please reorder this after `CI->isNoBuiltin()`, so that the function can bail out more cheaply.

fpetrogalli added inline comments.Nov 20 2019, 7:00 AM

llvm/include/llvm/Analysis/VectorUtils.h
124	Yeah, the name is misleading. This is supposed to return a widenened version of the function... but I don't like the name `getVectorShapeForCall`, because a Shape for a Vector Call can have Linear modifiers for example, while this function is returning only the shape that uses `VFParamKind::Vector` for all `VFPArameter`s. How about `getAllVectorsShape`?

In D67572#1753104, @sdesmalen wrote:

Thanks @fpetrogalli for updating this patch!

Down to two failures.
Failing Tests (2):

Just checking, are these failures resolved in the latest revision?

Yes, all good:

[1440/1440] Running the LLVM regression tests

Testing Time: 269.21s
  Expected Passes    : 33842
  Expected Failures  : 149
  Unsupported Tests  : 451

Address review comments from @sdesmalen. Thank you!

llvm/include/llvm/Analysis/VectorUtils.h
124	I explained the meaning of the function in the comment.
124	I have renamed to `VFShape::getAllVectorsParams`.
130	is HasGlobalPred the same as isPredicated? Th GlobalPredicate is listed in the `VFParamKind` enum class: GlobalPredicate, // Global logical predicate that acts on all lanes // of the input and output mask concurrently. For // example, it is implied by the `M` token in the // Vector Function ABI mangled name. It is a special predicate because it differ from the parameter masks that can be individually attached to the parameters. These kind of masks are not handled yet by the VFParamKind, but I understand that they were needed by @simoll when discussing the RFC. Also, is the predicate always known/expected to be the last parameter by the Vector ABI? For all the Vector Function ABIs supported at the moment in LLVM, yes. If vendor X decides to produce an ABI were the mask is the first parameter, we will have to change this code. But for now, we will assume the global predicate is the last parameter.
188	should CI.getModule()->getFunction(Shape.getValue().VectorName) be an assert? When would it ever happen that the vector function is not declared * in the IR? Yes, done. In fact, and IR where there is a VFABI attribute with a mapping to function X, but doesn't have X declared, shoudl be considered broken. *note that I'm specifically saying declared and not defined here. You are reaching enlightenment! :)
215	sort them by what field? VF? Also, I don't expect an attribute to hold more than 8 functions , which seems to be the worst case scenario when all X86 vector extensions are being used... are you sure you want to add such optimization. How about we leave it as it is (slow but simple) and leave optimizations for the future if it turns out we need to speed up the search?

Harbormaster completed remote builds in B41249: Diff 230277.Nov 20 2019, 9:08 AM

I have optimized the getVFABIMappings method of VFDatabase with an
early exit if the VFABI attribute is empty.

fpetrogalli marked 2 inline comments as done.Nov 20 2019, 9:23 AM

fpetrogalli added inline comments.

llvm/lib/Analysis/LoopAccessAnalysis.cpp
1847	Done, check the implementation of `getVFABIMappings`.

Harbormaster completed remote builds in B41250: Diff 230280.Nov 20 2019, 9:26 AM

Remove commented code...

Harbormaster completed remote builds in B41252: Diff 230282.Nov 20 2019, 9:36 AM

fpetrogalli marked an inline comment as done.Nov 20 2019, 2:35 PM

fpetrogalli added inline comments.

llvm/include/llvm/Analysis/VectorUtils.h
124	I have extracted the VFShape API here: https://reviews.llvm.org/D70513 (it is a work in progress as of now, I might finish up things later today).

sdesmalen added inline comments.Nov 21 2019, 3:15 AM

llvm/include/llvm/Analysis/VectorUtils.h
124	I think this can take `FunctionType` instead of `CallInst`. I have renamed to VFShape::getAllVectorsParams nit: I know this is bikeshedding, but what do you think of `VFShape::widenAllParams`?
125	these parameters shouldn't use `const`.
125	Is there a reason to pass VF and IsScalable separately, instead of passing an `ElementCount` ?
130	It is a special predicate because it differ from the parameter masks that can be individually attached to the parameters. I'd expect the operation to be predicated, not the individual operands. What would be the meaning of a predicated operand? For all the Vector Function ABIs supported at the moment in LLVM, yes. If vendor X decides to produce an ABI were the mask is the first parameter, we will have to change this code. Can we add an assert somewhere to enforce the assumption that the global predicate is passed as the last operand?
195	Please add a message to the assert.
203	this method doesn't do anything more than invoke `getVFABIMappings` so has no value (other than being public).
215	Alright, if it only has a few elements and is constructed to do a single lookup it is probably not worth the overhead of sorting.
llvm/lib/Analysis/LoopAccessAnalysis.cpp
1847	The early exit in getMappings doesn't stop a SmallVector from having to be created/destroyed. It would be better to create a new method such as `bool hasVectorVariants()` that answers the question directly.

fpetrogalli marked 16 inline comments as done.Nov 21 2019, 1:22 PM

fpetrogalli added inline comments.

llvm/include/llvm/Analysis/VectorUtils.h
124	I think this can take FunctionType instead of CallInst Yes, but... what's the point? We will have to introduce one or two extra calls to get the FunctionType... All we need is the number of arguments of the function, so in fact the function could just take an unsigned integer. But I don't like it. The fact that I use CallInst is cleaner and not worse than having an optimized interface. But this is my personal preference, so if you ask nicely :P I will change the interface. nit: I know this is bike shedding, but what do you think of VFShape::widenAllParams? I know where you come from, the Vectorizer :). I'd rather not, I really want to have a static `get` method attached to the VFShape. It seems to be in the style of LLVM to use `get` for static public member functions that build objects (see http://llvm.org/doxygen/classllvm_1_1VectorType.html). Since this is the only `get` method of `VFShape`, I have renamed it to `get`. For the record, the changes have been applied to https://reviews.llvm.org/D70513 This function this disappear from this revision.
124	Please take any other further review of `getAllVectorParams` to https://reviews.llvm.org/D70513
125	these parameters shouldn't use const. I disagree, especially for the CI argument. The method is not supposed to change the reference. I have anyway removed the const from the other parameters. (again, in https://reviews.llvm.org/D70513, not here, but I will update this patch)
130	I'd expect the operation to be predicated, not the individual operands. What would be the meaning of a predicated operand? This was a requirement from @simoll, he needs to have masking associated to each of the operands. Can we add an assert somewhere to enforce the assumption that the global predicate is passed as the last operand? Is is done in https://reviews.llvm.org/D70513
203	this method doesn't do anything more than invoke getVFABIMappings so has no value (other than being public). Yeah, but the value is int he comment inside of it, stating that other VFShapes can be build outside of a VFABI context. I'd prefer to keep it.
llvm/lib/Analysis/LoopAccessAnalysis.cpp
1847	I understand your concerns in performance here, but I am not keen in doing this. The attribute might contain junk that don't demangle correctly to a VFInfo - in that case the function would return "yes, this call has vector variant", but that wouldn't make sense because there would be no variant. I'd rather play it safe.

I have addressed the comments (some of the changes relative to the VFShape API are reflected in https://reviews.llvm.org/D70513).

I have also rebased the code on top of https://reviews.llvm.org/D70513.

Gentle ping after the merge of https://reviews.llvm.org/D70513.

Thank you,

Francesco

ABataev added inline comments.Dec 4 2019, 1:18 PM

llvm/include/llvm/Analysis/VectorUtils.h
124–125	Use `\\\` style of comment here
162	No need for `\brief`
188	`const auto &`

Address review comments from @ABataev.

Thank you,

Francesco

fpetrogalli marked an inline comment as done.Dec 4 2019, 1:57 PM

Harbormaster completed remote builds in B41881: Diff 232201.Dec 4 2019, 2:00 PM

Thanks Francesco, looks ok to me, but I'll leave to @jdoerfert or @sdesmalen to approve.

LGTM!

This revision is now accepted and ready to land.Dec 10 2019, 5:39 AM

Closed by commit rG0be81968a283: [VectorUtils] Introduce the Vector Function Database (VFDatabase). (authored by fpetrogalli). · Explain WhyDec 10 2019, 8:43 AM

This revision was automatically updated to reflect the committed changes.

MaskRay mentioned this in rG83b79f8a186f: [VectorUtils] Fix -Wunused-private-field after D67572.Dec 10 2019, 9:41 AM

Hi @fpetrogalli ,

A question regarding this patch.
For my out-of-tree target vectorization of intrinsics added for my target seems to have stopped working with this patch.
Is there something/what do I have to do to make the vectorizer understand my intrinsics are vectorizable?

Looking at this code in LoopVectorizationLegality.cpp:

// We handle calls that:
//   * Are debug info intrinsics.
//   * Have a mapping to an IR intrinsic.
//   * Have a vector version available.
auto *CI = dyn_cast<CallInst>(&I);
if (CI && !getVectorIntrinsicIDForCall(CI, TLI) &&
    !isa<DbgInfoIntrinsic>(CI) &&
    !(CI->getCalledFunction() && TLI &&
      !VFDatabase::getMappings(*CI).empty())) {

VFDatabase::getMappings(*CI).empty() is indeed true for my intrisic, and if I dig further, I take this return in

static void getVFABIMappings(const CallInst &CI,
                             SmallVectorImpl<VFInfo> &Mappings) {
  const StringRef ScalarName = CI.getCalledFunction()->getName();
  const StringRef S =
      CI.getAttribute(AttributeList::FunctionIndex, VFABI::MappingsAttrName)
          .getValueAsString();
  if (S.empty())
    return;

Is there some existing commit where in-tree targets have been modified already to work with the new VFDatabase?

Thanks!

Hi @uabelho

In D67572#1779762, @uabelho wrote:

Hi @fpetrogalli ,

A question regarding this patch.
For my out-of-tree target vectorization of intrinsics added for my target seems to have stopped working with this patch.

Ops... sorry!

Is there something/what do I have to do to make the vectorizer understand my intrinsics are vectorizable?

Yes, you need to add an attribute in the IR that maps the (scalar) attribute to its vector counterpart.

The attribute is called vector-function-abi-variant.

You can add it by using the following method in llvm/include/llvm/Transforms/Utils/ModuleUtils.h:

namespace VFABI {
/// Overwrite the Vector Function ABI variants attribute with the names provide
/// in \p VariantMappings.
void setVectorVariantNames(CallInst *CI,
                           const SmallVector<std::string, 8> &VariantMappings);
} // End VFABI namespace

The VariantMappins are strings that need to be generated according to some Vector Function ABI (VFABI). If your target doesn't have such ABI, you can use the LLVM internal mangling.

For example, say that your attribute is double @llvm.funky.intrinsic (double), and you need to map it to an unmasked vector function with a vectorization factor of two, say custom_vector_function, the string that you need to add in the attribute is __ZGV_LLVM_N2v_llvm.funky.intrinsic(custom_vector_function).

The name mangling rules are admittedly not well documented for the internal mangling, but other than for the ISA token (which is _LLVM_), they correspond to the ones of x86 and AArch64, which are the same (you can browse the latter here: https://github.com/ARM-software/software-standards/blob/master/abi/vfabia64/vfabia64.rst#vector-function-name-mangling)

I will definitely add some docs to explain more in detail the mangling rules, but for the moment you can look at the tests in llvm/unittests/Analysis/VectorFunctionABITest.cpp to get a sense of the meaning of the different tokens in the mangled name, especially the use of the <parameters> in _ZGV<isa><mask><vlen><parameters>_<scalarname>[(<redirection>)].

Looking at this code in LoopVectorizationLegality.cpp:

// We handle calls that:
//   * Are debug info intrinsics.
//   * Have a mapping to an IR intrinsic.
//   * Have a vector version available.
auto *CI = dyn_cast<CallInst>(&I);
if (CI && !getVectorIntrinsicIDForCall(CI, TLI) &&
    !isa<DbgInfoIntrinsic>(CI) &&
    !(CI->getCalledFunction() && TLI &&
      !VFDatabase::getMappings(*CI).empty())) {

VFDatabase::getMappings(*CI).empty() is indeed true for my intrisic, and if I dig further, I take this return in

static void getVFABIMappings(const CallInst &CI,
                             SmallVectorImpl<VFInfo> &Mappings) {
  const StringRef ScalarName = CI.getCalledFunction()->getName();
  const StringRef S =
      CI.getAttribute(AttributeList::FunctionIndex, VFABI::MappingsAttrName)
          .getValueAsString();
  if (S.empty())
    return;

Is there some existing commit where in-tree targets have been modified already to work with the new VFDatabase?

Unless I have been missing something, all targets in the in-tree version are using VFDatabase now. The patch in this revision is what introduced the change.

Thanks!

I hope you find this useful. Let me know if you need more help with this, I am generally available on IRC and discord too.

Francesco

Hi,

Thanks for the reply!

Ok, I think I understand what is happening now at least.

We have a bunch of target intrinsics that we say are vectorizable, but we don't provide a name of the vector version of the intrinsic.

This meant that before this patch

LoopVectorizationLegality::canVectorizeInstrs()

accepted to vectorize the loop since

TLI->isFunctionVectorizable(CI->getCalledFunction()->getName())

returned true for the intrinsic.

Then in LoopVectorizationCostModel::getVectorCallCost we decided that the call to the intrinsic should be scalarized, since

TLI->isFunctionVectorizable(FnName, VF)

returned false.

So the loop was vectorized, but we got VF calls to the scalar version of the intrinsic, just as we wanted.

However, with this patch, the check in

LoopVectorizationLegality::canVectorizeInstrs()

now says false, since we do

VFDatabase::getMappings(*CI).empty()

and we indeed get empty mappings since we don't provide any vector version.

So the presence of an intrinsic that we don't provide a vector version for prevents vectorization of the entire loop, even if it would we totally ok to do VF calls to the scalar version instead.

Is this change in behavior intended?

In D67572#1781286, @uabelho wrote:
Hi,

Thanks for the reply!

Ok, I think I understand what is happening now at least.

We have a bunch of target intrinsics that we say are vectorizable, but we don't provide a name of the vector version of the intrinsic.

This meant that before this patch
LoopVectorizationLegality::canVectorizeInstrs()
accepted to vectorize the loop since
TLI->isFunctionVectorizable(CI->getCalledFunction()->getName())
returned true for the intrinsic.

Then in LoopVectorizationCostModel::getVectorCallCost we decided that the call to the intrinsic should be scalarized, since
TLI->isFunctionVectorizable(FnName, VF)
returned false.

So the loop was vectorized, but we got VF calls to the scalar version of the intrinsic, just as we wanted.

However, with this patch, the check in
LoopVectorizationLegality::canVectorizeInstrs()
now says false, since we do
VFDatabase::getMappings(*CI).empty()
and we indeed get empty mappings since we don't provide any vector version.

So the presence of an intrinsic that we don't provide a vector version for prevents vectorization of the entire loop, even if it would we totally ok to do VF calls to the scalar version instead.

Is this change in behavior intended?

This change in behavior sounds a bit worrying for down-stream targets.

I think we should have at least an assertion making sure the check in LoopVEctorizationLegality succeeds in all cases it did previously with isFunctionVectorizable, otherwise down-stream targets will silently miss out on vectorization. But I think ideally the patch would preserve the existing behavior in the cases @uabelho described.

In D67572#1781363, @fhahn wrote:

[...]

Is this change in behavior intended?

This change in behavior sounds a bit worrying for down-stream targets.

I think we should have at least an assertion making sure the check in LoopVEctorizationLegality succeeds in all cases it did previously with isFunctionVectorizable, otherwise down-stream targets will silently miss out on vectorization.

Yes that's exactly what happened to us. Or well, "silently", since our benchmark numbers went crazy after the merge :)

But I think ideally the patch would preserve the existing behavior in the cases @uabelho described.

Sounds good!

In D67572#1781286, @uabelho wrote:
...

This meant that before this patch
LoopVectorizationLegality::canVectorizeInstrs()
accepted to vectorize the loop since
TLI->isFunctionVectorizable(CI->getCalledFunction()->getName())
returned true for the intrinsic.

Then in LoopVectorizationCostModel::getVectorCallCost we decided that the call to the intrinsic should be scalarized, since
TLI->isFunctionVectorizable(FnName, VF)
returned false.

Ah, OK. I see what you mean. I think we could solve this by scalarizing after vectorization, instead of doing it in the vectorizer.

We could have a IR pass that runs after the vectorizer that looks for intrinsics calls .When it sees an intrinsics that operates on vector, it checks whether the target is able to lower it to some function or instruction. If not, it scalarizes it. With this we wouldn't have to introduce special behavior in the vectorizer for handling intrinsics: it could just vectorize any call to intrinsics for which vectorization make sense.

In D67572#1781286, @uabelho wrote:

We have a bunch of target intrinsics that we say are vectorizable, but we don't provide a name of the vector version of the intrinsic.

You can still use the name mangling of the VFABI attribute (for internal LLVM mangling) to map a scalar attribute to its vector version:

_ZGV_LLVM_N2v_llvm.custom.attribute(llvm.custom.attribute)

With this, the VFDatabase::getMappings(*CI).empty() woudl return false, so that the intrinsic would be vectorized as a regular function. Then, you could use the post-vectorization pass I mentioned in the previous message to scalarize it.

The mappings in the IR could be added by the frontend, or in a pre-vectorization pass if you don't want to touch the frontend.

It sounds like resolving this will require some extra thought. It would probably be good to revert this patch until then.

In D67572#1782113, @fhahn wrote:

It sounds like resolving this will require some extra thought. It would probably be good to revert this patch until then.

I am happy to do so. Shall I just revert it in git and push the change, or is there a formal way to do it via phabricator or arc?

@uabelho: would it be possible for you to provide me a minimal reproducer that I could use to craft the (wip) solution I have in mind?

Thank you,

Francesco

In D67572#1782337, @fpetrogalli wrote:

In D67572#1782113, @fhahn wrote:

It sounds like resolving this will require some extra thought. It would probably be good to revert this patch until then.

I am happy to do so. Shall I just revert it in git and push the change, or is there a formal way to do it via phabricator or arc?

Just pushing the revert in git is fine.

@uabelho: would it be possible for you to provide me a minimal reproducer that I could use to craft the (wip) solution I have in mind?

I'm not sure what kind of reproducer you expect since my reproducer requires our out-of-tree target and intrinsics but I can at least try to show something.

What we have in our target is that in initialize() in TargetLibraryInfo.cpp we add our target intrinsics that we allow in vectorized loops.
So we have like:

const VecDesc VecIntrinsics[] = {
  {"llvm.phx.abs.i32", "", 4}
};

TLI.addVectorizableFunctions(VecIntrinsics);

where we say that it's ok to vectorize a loop containing a call to the intrinsic llvm.phx.abs.i32, but we don't provide a vector version that should be used when it's vectorized.

I think in-tree targets did like this before, I'm not sure if they do anymore or if that has changed now.

Then if I run -loop-vectorize on the following input

define i32 @f() {
  br label %bb1

bb1:                                              ; preds = %bb1, %0
  %sum = phi i32 [ 0, %0 ], [ %sum_next, %bb1 ]
  %i = phi i16 [ 0, %0 ], [ %i_inc, %bb1 ]
  %call = tail call i32 @llvm.phx.abs.i32(i32 0)
  %sum_next = add i32 %sum, %call
  %i_inc = add nuw nsw i16 %i, 1
  %exit = icmp eq i16 %i_inc, 100
  br i1 %exit, label %bb3, label %bb1

bb3:                                              ; preds = %bb1
  ret i32 %sum_next
}

declare i32 @llvm.phx.abs.i32(i32)

I used to get a vectorized loop like

vector.body:                                      ; preds = %vector.body, %vector.ph
  %index = phi i32 [ 0, %vector.ph ], [ %index.next, %vector.body ]
  %vec.phi = phi <4 x i32> [ zeroinitializer, %vector.ph ], [ %10, %vector.body ]
  %offset.idx = trunc i32 %index to i16
  %broadcast.splatinsert = insertelement <4 x i16> undef, i16 %offset.idx, i32 0
  %broadcast.splat = shufflevector <4 x i16> %broadcast.splatinsert, <4 x i16> undef, <4 x i32> zeroinitializer
  %induction = add <4 x i16> %broadcast.splat, <i16 0, i16 1, i16 2, i16 3>
  %1 = add i16 %offset.idx, 0
  %2 = tail call i32 @llvm.phx.abs.i32(i32 0)
  %3 = tail call i32 @llvm.phx.abs.i32(i32 0)
  %4 = tail call i32 @llvm.phx.abs.i32(i32 0)
  %5 = tail call i32 @llvm.phx.abs.i32(i32 0)
  %6 = insertelement <4 x i32> undef, i32 %2, i32 0
  %7 = insertelement <4 x i32> %6, i32 %3, i32 1
  %8 = insertelement <4 x i32> %7, i32 %4, i32 2
  %9 = insertelement <4 x i32> %8, i32 %5, i32 3
  %10 = add <4 x i32> %vec.phi, %9
  %index.next = add i32 %index, 4
  %11 = icmp eq i32 %index.next, 100
  br i1 %11, label %middle.block, label %vector.body, !llvm.loop !0

but with this patch LoopVectorizationLegality bails out with

LV: Not vectorizing: Found a non-intrinsic callsite   %call = tail call i32 @llvm.phx.abs.i32(i32 0)

Right now I've done a hacky workaround in LoopVectorizationLegality to get the old behavior for our target so we still get vectorization for the above case:

@@ -704,7 +704,12 @@ bool LoopVectorizationLegality::canVectorizeInstrs() {
       if (CI && !getVectorIntrinsicIDForCall(CI, TLI) &&
           !isa<DbgInfoIntrinsic>(CI) &&
           !(CI->getCalledFunction() && TLI &&
-            !VFDatabase::getMappings(*CI).empty())) {
+            (!VFDatabase::getMappings(*CI).empty() ||
+             // Hack: Allow vectorization even if we didn't provide
+             // a vector version of the intrinsic.
+             (CI->getParent()->getModule()->isTargetPhoenix() &&
+              TLI->isFunctionVectorizable(CI->getCalledFunction()
+                                          ->getName()))))) {

uabelho added a subscriber: bjope.Dec 13 2019, 12:14 AM

In D67572#1781924, @fpetrogalli wrote:

Ah, OK. I see what you mean. I think we could solve this by scalarizing after vectorization, instead of doing it in the vectorizer.

We could have a IR pass that runs after the vectorizer that looks for intrinsics calls .When it sees an intrinsics that operates on vector, it checks whether the target is able to lower it to some function or instruction. If not, it scalarizes it. With this we wouldn't have to introduce special behavior in the vectorizer for handling intrinsics: it could just vectorize any call to intrinsics for which vectorization make sense.

That would mean we would have to introduce vector versions of of all intrinsics, that would just be used between the vectorizer and the scalarizer, right? Sounds a little bit cumbersome since those vector versions don't exist today, but perhaps it's the best way anyway, I don't know.

Btw, in case you didn't know, there is already an existing scalarizer pass, though I don't think it's widely used.

In D67572#1783041, @uabelho wrote:
@uabelho: would it be possible for you to provide me a minimal reproducer that I could use to craft the (wip) solution I have in mind?

I'm not sure what kind of reproducer you expect since my reproducer requires our out-of-tree target and intrinsics but I can at least try to show something.

What we have in our target is that in initialize() in TargetLibraryInfo.cpp we add our target intrinsics that we allow in vectorized loops.
So we have like:
const VecDesc VecIntrinsics[] = {
  {"llvm.phx.abs.i32", "", 4}
};

TLI.addVectorizableFunctions(VecIntrinsics);
where we say that it's ok to vectorize a loop containing a call to the intrinsic llvm.phx.abs.i32, but we don't provide a vector version that should be used when it's vectorized.

I think in-tree targets did like this before, I'm not sure if they do anymore or if that has changed now.

I cannot find anything like that in the current TargetLibraryInfo.cpp.

Then if I run -loop-vectorize on the following input

define i32 @f() {
  br label %bb1

bb1:                                              ; preds = %bb1, %0
  %sum = phi i32 [ 0, %0 ], [ %sum_next, %bb1 ]
  %i = phi i16 [ 0, %0 ], [ %i_inc, %bb1 ]
  %call = tail call i32 @llvm.phx.abs.i32(i32 0)
  %sum_next = add i32 %sum, %call
  %i_inc = add nuw nsw i16 %i, 1
  %exit = icmp eq i16 %i_inc, 100
  br i1 %exit, label %bb3, label %bb1

bb3:                                              ; preds = %bb1
  ret i32 %sum_next
}

declare i32 @llvm.phx.abs.i32(i32)

I used to get a vectorized loop like

vector.body:                                      ; preds = %vector.body, %vector.ph
  %index = phi i32 [ 0, %vector.ph ], [ %index.next, %vector.body ]
  %vec.phi = phi <4 x i32> [ zeroinitializer, %vector.ph ], [ %10, %vector.body ]
  %offset.idx = trunc i32 %index to i16
  %broadcast.splatinsert = insertelement <4 x i16> undef, i16 %offset.idx, i32 0
  %broadcast.splat = shufflevector <4 x i16> %broadcast.splatinsert, <4 x i16> undef, <4 x i32> zeroinitializer
  %induction = add <4 x i16> %broadcast.splat, <i16 0, i16 1, i16 2, i16 3>
  %1 = add i16 %offset.idx, 0
  %2 = tail call i32 @llvm.phx.abs.i32(i32 0)
  %3 = tail call i32 @llvm.phx.abs.i32(i32 0)
  %4 = tail call i32 @llvm.phx.abs.i32(i32 0)
  %5 = tail call i32 @llvm.phx.abs.i32(i32 0)
  %6 = insertelement <4 x i32> undef, i32 %2, i32 0
  %7 = insertelement <4 x i32> %6, i32 %3, i32 1
  %8 = insertelement <4 x i32> %7, i32 %4, i32 2
  %9 = insertelement <4 x i32> %8, i32 %5, i32 3
  %10 = add <4 x i32> %vec.phi, %9
  %index.next = add i32 %index, 4
  %11 = icmp eq i32 %index.next, 100
  br i1 %11, label %middle.block, label %vector.body, !llvm.loop !0

but with this patch LoopVectorizationLegality bails out with

LV: Not vectorizing: Found a non-intrinsic callsite   %call = tail call i32 @llvm.phx.abs.i32(i32 0)

Right now I've done a hacky workaround in LoopVectorizationLegality to get the old behavior for our target so we still get vectorization for the above case:

@@ -704,7 +704,12 @@ bool LoopVectorizationLegality::canVectorizeInstrs() {
       if (CI && !getVectorIntrinsicIDForCall(CI, TLI) &&
           !isa<DbgInfoIntrinsic>(CI) &&
           !(CI->getCalledFunction() && TLI &&
-            !VFDatabase::getMappings(*CI).empty())) {
+            (!VFDatabase::getMappings(*CI).empty() ||
+             // Hack: Allow vectorization even if we didn't provide
+             // a vector version of the intrinsic.
+             (CI->getParent()->getModule()->isTargetPhoenix() &&
+              TLI->isFunctionVectorizable(CI->getCalledFunction()
+                                          ->getName()))))) {

Hi @uabelho and @fhahn ,

I have reverted the change to avoid disruption in your work.

@uabelho, the example you posted here is very useful, I will send you a modified version of the code for review, so that you can verify it works for you.

Kind regards,

Francesco

rogfer01 added a subscriber: rogfer01.Jan 1 2020, 1:14 PM

@uabelho ,

I am working on a solution for the problem you reported.

So far, I have decided to use a special name in the mappings scalar-to-vector that informs the compiler that the function that is being vectorized by the vectorizer should be scalarized after vectorization.
That would require modifying the mappings from

const VecDesc VecIntrinsics[] = {
  {"llvm.phx.abs.i32", "", 4}
};

to the following form:

const VecDesc VecIntrinsics[] = {
  {"llvm.phx.abs.i32", "__LLVM_scalarize__", 4}
};

The method bool VFDatabase::shouldFunctionScalarize(VFShape Shape) would be used to test such situation.

Would that work for you?

Kind regards,

Francesco

In D67572#1817461, @fpetrogalli wrote:

Would that work for you?

That sounds ok to me.

(Note that I don't really know this code, so I don't know if this is acceptable from a general design point of view. But for my target, changing the mappings in that way is ok.)

Thanks!

Reworked in https://reviews.llvm.org/D72734

Revision Contents

Path

Size

llvm/

include/

llvm/

Analysis/

TargetLibraryInfo.h

6 lines

VectorUtils.h

72 lines

lib/

Analysis/

LazyCallGraph.cpp

8 lines

LoopAccessAnalysis.cpp

2 lines

Transforms/

Vectorize/

LoopVectorizationLegality.cpp

2 lines

LoopVectorize.cpp

7 lines

Diff 220174

llvm/include/llvm/Analysis/TargetLibraryInfo.h

Show First 20 Lines • Show All 192 Lines • ▼ Show 20 Lines	public:
void setShouldSignExtI32Param(bool Val) {		void setShouldSignExtI32Param(bool Val) {
ShouldSignExtI32Param = Val;		ShouldSignExtI32Param = Val;
}		}

/// Returns the size of the wchar_t type in bytes or 0 if the size is unknown.		/// Returns the size of the wchar_t type in bytes or 0 if the size is unknown.
/// This queries the 'wchar_size' metadata.		/// This queries the 'wchar_size' metadata.
unsigned getWCharSize(const Module &M) const;		unsigned getWCharSize(const Module &M) const;
};		};

		jdoerfertUnsubmitted Done Reply Inline Actions Documentation missing, and please do not call it "fixup", maybe "adjust" or something but not "fixup". jdoerfert: Documentation missing, and please do not call it "fixup", maybe "adjust" or something but not…
		class SearchVFSystem;
/// Provides information about what library functions are available for		/// Provides information about what library functions are available for
/// the current target.		/// the current target.
///		///
/// This both allows optimizations to handle them specially and frontends to		/// This both allows optimizations to handle them specially and frontends to
/// disable such optimizations through -fno-builtin etc.		/// disable such optimizations through -fno-builtin etc.
class TargetLibraryInfo {		class TargetLibraryInfo {
friend class TargetLibraryAnalysis;		friend class TargetLibraryAnalysis;
friend class TargetLibraryInfoWrapperPass;		friend class TargetLibraryInfoWrapperPass;
		friend class SearchVFSystem;
const TargetLibraryInfoImpl *Impl;		const TargetLibraryInfoImpl *Impl;

public:		public:
explicit TargetLibraryInfo(const TargetLibraryInfoImpl &Impl) : Impl(&Impl) {}		explicit TargetLibraryInfo(const TargetLibraryInfoImpl &Impl) : Impl(&Impl) {}

// Provide value semantics.		// Provide value semantics.
TargetLibraryInfo(const TargetLibraryInfo &TLI) : Impl(TLI.Impl) {}		TargetLibraryInfo(const TargetLibraryInfo &TLI) : Impl(TLI.Impl) {}
TargetLibraryInfo(TargetLibraryInfo &&TLI) : Impl(TLI.Impl) {}		TargetLibraryInfo(TargetLibraryInfo &&TLI) : Impl(TLI.Impl) {}
Show All 24 Lines	bool getLibFunc(ImmutableCallSite CS, LibFunc &F) const {
return !CS.isNoBuiltin() && CS.getCalledFunction() &&		return !CS.isNoBuiltin() && CS.getCalledFunction() &&
getLibFunc(*(CS.getCalledFunction()), F);		getLibFunc(*(CS.getCalledFunction()), F);
}		}

/// Tests whether a library function is available.		/// Tests whether a library function is available.
bool has(LibFunc F) const {		bool has(LibFunc F) const {
return Impl->getState(F) != TargetLibraryInfoImpl::Unavailable;		return Impl->getState(F) != TargetLibraryInfoImpl::Unavailable;
}		}

		private:
bool isFunctionVectorizable(StringRef F, unsigned VF) const {		bool isFunctionVectorizable(StringRef F, unsigned VF) const {
return Impl->isFunctionVectorizable(F, VF);		return Impl->isFunctionVectorizable(F, VF);
}		}
bool isFunctionVectorizable(StringRef F) const {		bool isFunctionVectorizable(StringRef F) const {
return Impl->isFunctionVectorizable(F);		return Impl->isFunctionVectorizable(F);
}		}
StringRef getVectorizedFunction(StringRef F, unsigned VF) const {		StringRef getVectorizedFunction(StringRef F, unsigned VF) const {
return Impl->getVectorizedFunction(F, VF);		return Impl->getVectorizedFunction(F, VF);
}		}
		jdoerfertUnsubmitted Done Reply Inline Actions Why do you privatize these redirects? I would expect them to be used, thus public, or unused, thus deleted. jdoerfert: Why do you privatize these redirects? I would expect them to be used, thus public, or unused…
		fpetrogalliAuthorUnsubmitted Done Reply Inline Actions I am privatizing these because `isFunctionVectorizable` and `getVectorizedFunction` to be asked to any component other than the `SearchVFSystem` introduced in this patch. fpetrogalli: I am privatizing these because `isFunctionVectorizable` and `getVectorizedFunction` to be…

		public:
/// Tests if the function is both available and a candidate for optimized code		/// Tests if the function is both available and a candidate for optimized code
/// generation.		/// generation.
bool hasOptimizedCodeGen(LibFunc F) const {		bool hasOptimizedCodeGen(LibFunc F) const {
if (Impl->getState(F) == TargetLibraryInfoImpl::Unavailable)		if (Impl->getState(F) == TargetLibraryInfoImpl::Unavailable)
return false;		return false;
switch (F) {		switch (F) {
default: break;		default: break;
case LibFunc_copysign: case LibFunc_copysignf: case LibFunc_copysignl:		case LibFunc_copysign: case LibFunc_copysignf: case LibFunc_copysignl:
▲ Show 20 Lines • Show All 131 Lines • Show Last 20 Lines

llvm/include/llvm/Analysis/VectorUtils.h

Show All 9 Lines
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#ifndef LLVM_ANALYSIS_VECTORUTILS_H		#ifndef LLVM_ANALYSIS_VECTORUTILS_H
#define LLVM_ANALYSIS_VECTORUTILS_H		#define LLVM_ANALYSIS_VECTORUTILS_H

#include "llvm/ADT/MapVector.h"		#include "llvm/ADT/MapVector.h"
#include "llvm/Analysis/LoopAccessAnalysis.h"		#include "llvm/Analysis/LoopAccessAnalysis.h"
		#include "llvm/Analysis/TargetLibraryInfo.h"
#include "llvm/IR/IRBuilder.h"		#include "llvm/IR/IRBuilder.h"
#include "llvm/Support/CheckedArithmetic.h"		#include "llvm/Support/CheckedArithmetic.h"

namespace llvm {		namespace llvm {

/// Describes the type of Parameters		/// Describes the type of Parameters
enum class VFParamKind {		enum class VFParamKind {
Vector, // No semantic information.		Vector, // No semantic information.
Show All 20 Lines	enum class VFISAKind {
SSE, // x86 SSE		SSE, // x86 SSE
AVX, // x86 AVX		AVX, // x86 AVX
AVX2, // x86 AVX2		AVX2, // x86 AVX2
AVX512, // x86 AVX512		AVX512, // x86 AVX512
Unknown // Unknown ISA		Unknown // Unknown ISA
};		};

/// Encapsulates information needed to describe a parameter.		/// Encapsulates information needed to describe a parameter.
///		///
		jdoerfertUnsubmitted Done Reply Inline Actions I generally dislike non-scoped macros. Is there a reason a `static constexpr char` doesn't work? I would also opt for a shorter name, "_llvm_" or "_llvm_tli_" but that is debatable. jdoerfert:* I generally dislike non-scoped macros. Is there a reason a `static constexpr char*` doesn't…
/// The description of the parameter is not linked directly to		/// The description of the parameter is not linked directly to
/// OpenMP or any other vector function description. This structure		/// OpenMP or any other vector function description. This structure
		fpetrogalliAuthorUnsubmitted Done Reply Inline Actions I have added this to make it work with the TLI. An alternative would be to add an extra field to the `VecDesc` instances stored in the TLI to hold also the ISA information. I am happy to do so (or to investigate any other alternative approach) is the solution proposed in this patch is not convincing. fpetrogalli: I have added this to make it work with the TLI. An alternative would be to add an extra field…
/// is extendible to handle other paradigms that describe vector		/// is extendible to handle other paradigms that describe vector
/// functions and their parameters.		/// functions and their parameters.
struct VFParameter {		struct VFParameter {
unsigned ParamPos; // Parameter Position in Scalar Function.		unsigned ParamPos; // Parameter Position in Scalar Function.
VFParamKind ParamKind; // Kind of Parameter.		VFParamKind ParamKind; // Kind of Parameter.
int LinearStepOrPos = 0; // Step or Position of the Parameter.		int LinearStepOrPos = 0; // Step or Position of the Parameter.
Align Alignment = Align(); // Optional aligment in bytes, defaulted to 1.		Align Alignment = Align(); // Optional aligment in bytes, defaulted to 1.

▲ Show 20 Lines • Show All 45 Lines • ▼ Show 20 Lines	struct VFShape {
VFISAKind ISA; // Instruction Set Architecture.		VFISAKind ISA; // Instruction Set Architecture.
SmallVector<VFParameter, 8> Parameters; // List of parameter informations.		SmallVector<VFParameter, 8> Parameters; // List of parameter informations.
// Comparison operator.		// Comparison operator.
bool operator==(const VFShape &Other) const {		bool operator==(const VFShape &Other) const {
return std::tie(VF, IsScalable, ISA, Parameters) ==		return std::tie(VF, IsScalable, ISA, Parameters) ==
std::tie(Other.VF, Other.IsScalable, Other.ISA, Other.Parameters);		std::tie(Other.VF, Other.IsScalable, Other.ISA, Other.Parameters);
}		}
};		};

		sdesmalenUnsubmitted Done Reply Inline Actions nit: retrive -> retrieve sdesmalen: nit: retrive -> retrieve
/// Holds the VFShape for a specific scalar to vector function mapping.		/// Holds the VFShape for a specific scalar to vector function mapping.
struct VFInfo {		struct VFInfo {
VFShape Shape; // Classification of the vector function.		VFShape Shape; // Classification of the vector function.
StringRef ScalarName; // Scalar Function Name.		StringRef ScalarName; // Scalar Function Name.
StringRef VectorName; // Vector Function Name associated to this VFInfo.		StringRef VectorName; // Vector Function Name associated to this VFInfo.
		sdesmalenUnsubmitted Done Reply Inline Actions I don't really understand what you mean by a "flat" vectorization shape. Is this function supposed to return a 'widened' (vector version of a) function? If so, can we please rename this to `getVectorShapeForCall` ? sdesmalen: I don't really understand what you mean by a "flat" vectorization shape. Is this function…
		fpetrogalliAuthorUnsubmitted Done Reply Inline Actions Yeah, the name is misleading. This is supposed to return a widenened version of the function... but I don't like the name `getVectorShapeForCall`, because a Shape for a Vector Call can have Linear modifiers for example, while this function is returning only the shape that uses `VFParamKind::Vector` for all `VFPArameter`s. How about `getAllVectorsShape`? fpetrogalli: Yeah, the name is misleading. This is supposed to return a widenened version of the function...
		fpetrogalliAuthorUnsubmitted Done Reply Inline Actions I explained the meaning of the function in the comment. fpetrogalli: I explained the meaning of the function in the comment.
		fpetrogalliAuthorUnsubmitted Done Reply Inline Actions I have renamed to `VFShape::getAllVectorsParams`. fpetrogalli: I have renamed to `VFShape::getAllVectorsParams`.
		fpetrogalliAuthorUnsubmitted Done Reply Inline Actions I have extracted the VFShape API here: https://reviews.llvm.org/D70513 (it is a work in progress as of now, I might finish up things later today). fpetrogalli: I have extracted the VFShape API here: https://reviews.llvm.org/D70513 (it is a work in…
		sdesmalenUnsubmitted Done Reply Inline Actions I think this can take `FunctionType` instead of `CallInst`. I have renamed to VFShape::getAllVectorsParams nit: I know this is bikeshedding, but what do you think of `VFShape::widenAllParams`? sdesmalen: I think this can take `FunctionType` instead of `CallInst`. > I have renamed to VFShape…
		fpetrogalliAuthorUnsubmitted Done Reply Inline Actions I think this can take FunctionType instead of CallInst Yes, but... what's the point? We will have to introduce one or two extra calls to get the FunctionType... All we need is the number of arguments of the function, so in fact the function could just take an unsigned integer. But I don't like it. The fact that I use CallInst is cleaner and not worse than having an optimized interface. But this is my personal preference, so if you ask nicely :P I will change the interface. nit: I know this is bike shedding, but what do you think of VFShape::widenAllParams? I know where you come from, the Vectorizer :). I'd rather not, I really want to have a static `get` method attached to the VFShape. It seems to be in the style of LLVM to use `get` for static public member functions that build objects (see http://llvm.org/doxygen/classllvm_1_1VectorType.html). Since this is the only `get` method of `VFShape`, I have renamed it to `get`. For the record, the changes have been applied to https://reviews.llvm.org/D70513 This function this disappear from this revision. fpetrogalli: > I think this can take FunctionType instead of CallInst Yes, but... what's the point? We will…
		fpetrogalliAuthorUnsubmitted Done Reply Inline Actions Please take any other further review of `getAllVectorParams` to https://reviews.llvm.org/D70513 fpetrogalli: Please take any other further review of `getAllVectorParams` to https://reviews.llvm.org/D70513

		sdesmalenUnsubmitted Done Reply Inline Actions these parameters shouldn't use `const`. sdesmalen: these parameters shouldn't use `const`.
		fpetrogalliAuthorUnsubmitted Done Reply Inline Actions these parameters shouldn't use const. I disagree, especially for the CI argument. The method is not supposed to change the reference. I have anyway removed the const from the other parameters. (again, in https://reviews.llvm.org/D70513, not here, but I will update this patch) fpetrogalli: > these parameters shouldn't use const. I disagree, especially for the CI argument. The method…
		sdesmalenUnsubmitted Done Reply Inline Actions Is there a reason to pass VF and IsScalable separately, instead of passing an `ElementCount` ? sdesmalen: Is there a reason to pass VF and IsScalable separately, instead of passing an `ElementCount` ?
		ABataevUnsubmitted Done Reply Inline Actions Use `\\\` style of comment here ABataev: Use `\\\` style of comment here
// Comparison operator.		// Comparison operator.
bool operator==(const VFInfo &Other) const {		bool operator==(const VFInfo &Other) const {
		sdesmalenUnsubmitted Done Reply Inline Actions nit: unnecessary curly braces sdesmalen: nit: unnecessary curly braces
return std::tie(Shape, ScalarName, VectorName) ==		return std::tie(Shape, ScalarName, VectorName) ==
std::tie(Shape, Other.ScalarName, Other.VectorName);		std::tie(Shape, Other.ScalarName, Other.VectorName);
}		}
		sdesmalenUnsubmitted Done Reply Inline Actions is `HasGlobalPred` the same as `isPredicated`? Also, is the predicate always known/expected to be the last parameter by the Vector ABI? sdesmalen: is `HasGlobalPred` the same as `isPredicated`? Also, is the predicate always known/expected to…
		fpetrogalliAuthorUnsubmitted Done Reply Inline Actions is HasGlobalPred the same as isPredicated? Th GlobalPredicate is listed in the `VFParamKind` enum class: GlobalPredicate, // Global logical predicate that acts on all lanes // of the input and output mask concurrently. For // example, it is implied by the `M` token in the // Vector Function ABI mangled name. It is a special predicate because it differ from the parameter masks that can be individually attached to the parameters. These kind of masks are not handled yet by the VFParamKind, but I understand that they were needed by @simoll when discussing the RFC. Also, is the predicate always known/expected to be the last parameter by the Vector ABI? For all the Vector Function ABIs supported at the moment in LLVM, yes. If vendor X decides to produce an ABI were the mask is the first parameter, we will have to change this code. But for now, we will assume the global predicate is the last parameter. fpetrogalli: > is HasGlobalPred the same as isPredicated? Th GlobalPredicate is listed in the `VFParamKind`…
		sdesmalenUnsubmitted Done Reply Inline Actions It is a special predicate because it differ from the parameter masks that can be individually attached to the parameters. I'd expect the operation to be predicated, not the individual operands. What would be the meaning of a predicated operand? For all the Vector Function ABIs supported at the moment in LLVM, yes. If vendor X decides to produce an ABI were the mask is the first parameter, we will have to change this code. Can we add an assert somewhere to enforce the assumption that the global predicate is passed as the last operand? sdesmalen: > It is a special predicate because it differ from the parameter masks that can be individually…
		fpetrogalliAuthorUnsubmitted Done Reply Inline Actions I'd expect the operation to be predicated, not the individual operands. What would be the meaning of a predicated operand? This was a requirement from @simoll, he needs to have masking associated to each of the operands. Can we add an assert somewhere to enforce the assumption that the global predicate is passed as the last operand? Is is done in https://reviews.llvm.org/D70513 fpetrogalli: > I'd expect the operation to be predicated, not the individual operands. What would be the…

/// Static method to contruct a VFInfo out of a mangled names in the		/// Static method to contruct a VFInfo out of a mangled names in the
/// following format:		/// following format:
///		///
/// <VFABI_name>{(<redirection>)}		/// <VFABI_name>{(<redirection>)}
///		///
/// where <VFABI_name> is the name of the vector function, mangled according		/// where <VFABI_name> is the name of the vector function, mangled according
/// to the rules described in the Vector Function ABI of the target vector		/// to the rules described in the Vector Function ABI of the target vector
/// extentsion (or <isa> from now on). The <VFABI_name> is in the following		/// extentsion (or <isa> from now on). The <VFABI_name> is in the following
/// format:		/// format:
///		///
/// _ZGV<isa><mask><vlen><parameters>_<scalarname>[(<redirection>)]		/// _ZGV<isa><mask><vlen><parameters>_<scalarname>[(<redirection>)]
///		///
/// This methods support demangling rules for the following <isa>:		/// This methods support demangling rules for the following <isa>:
///		///
/// * AArch64: https://developer.arm.com/docs/101129/latest		/// * AArch64: https://developer.arm.com/docs/101129/latest
///		///
/// * x86 (libmvec): https://sourceware.org/glibc/wiki/libmvec and		/// * x86 (libmvec): https://sourceware.org/glibc/wiki/libmvec and
/// https://sourceware.org/glibc/wiki/libmvec?action=AttachFile&do=view&target=VectorABI.txt		/// https://sourceware.org/glibc/wiki/libmvec?action=AttachFile&do=view&target=VectorABI.txt
static Optional<VFInfo> getFromVFABI(StringRef MangledName) {		static Optional<VFInfo> getFromVFABI(StringRef MangledName) {
const auto OptPD = VFABI::tryDemangleForVFABI(MangledName);		const auto OptPD = VFABI::tryDemangleForVFABI(MangledName);
if (OptPD.hasValue()) {		if (OptPD.hasValue()) {
const VFABI::ParsedData PD = OptPD.getValue();		const VFABI::ParsedData PD = OptPD.getValue();
const VFShape Shape({PD.VF, PD.IsScalable, PD.ISA, PD.Parameters});		const VFShape Shape({PD.VF, PD.IsScalable, PD.ISA, PD.Parameters});
return VFInfo({Shape, PD.ScalarName, PD.VectorName});		return VFInfo({Shape, PD.ScalarName, PD.VectorName});
}		}
		jdoerfertUnsubmitted Done Reply Inline Actions I doubt this change is necessary. Only if you need to own the underlying data you need to replace StringRef with std::string. jdoerfert: I doubt this change is necessary. Only if you need to own the underlying data you need to…

return None;		return None;
}		}
};		};

		class SearchVFSystem {
		ABataevUnsubmitted Done Reply Inline Actions No need for `\brief` ABataev: No need for `\brief`
		private:
		const TargetLibraryInfo *TLI;
		jdoerfertUnsubmitted Done Reply Inline Actions I'd prefer `void getVectorVariantNames(CallInst CI, SmallSet<std::string, 8> &VariantMappings);` but for sure you want a reference here: void setVectorVariantNames(CallInst CI, const SmallSet<std::string, 8> &VariantMappings); jdoerfert: I'd prefer `void getVectorVariantNames(CallInst *CI, SmallSet<std::string, 8>…
		SmallVector<StringRef, 8> extractMangledNames(CallInst *CI) const {
		const StringRef AttrString =
		CI->getCalledFunction()
		sdesmalenUnsubmitted Done Reply Inline Actions nit: in a class members are private by default. sdesmalen: nit: in a class members are private by default.
		->getFnAttribute("vector-function-abi-variant")
		sdesmalenUnsubmitted Done Reply Inline Actions nit: put comments above variable sdesmalen: nit: put comments above variable
		.getValueAsString();
		SmallVector<StringRef, 8> ListOfStrings;
		AttrString.split(ListOfStrings, ",");
		return ListOfStrings;
		}

		public:
		explicit SearchVFSystem(const TargetLibraryInfo *TLI) : TLI(TLI) {}
		SmallVector<VFInfo, 8> getVFMappings(CallInst *CI) const {
		sdesmalenUnsubmitted Done Reply Inline Actions StringRef ? sdesmalen: StringRef ?
		const StringRef ScalarName = CI->getCalledFunction()->getName();
		const SmallVector<StringRef, 8> ListOfStrings = extractMangledNames(CI);

		SmallVector<VFInfo, 8> Ret;
		for (auto MangledName : ListOfStrings) {
		auto Shape = VFInfo::getFromVFABI(MangledName);
		if (Shape.hasValue() && Shape.getValue().ScalarName == ScalarName)
		Ret.push_back(Shape.getValue());
		}
		sdesmalenUnsubmitted Done Reply Inline Actions nit: These two if-statements can be merged with a `&&` sdesmalen: nit: These two if-statements can be merged with a `&&`

		return Ret;
		sdesmalenUnsubmitted Done Reply Inline Actions should `CI.getModule()->getFunction(Shape.getValue().VectorName)` be an assert? When would it ever happen that the vector function is not declared * in the IR? note that I'm specifically saying declared* and not defined here. sdesmalen: should `CI.getModule()->getFunction(Shape.getValue().VectorName)` be an assert? When would it…
		fpetrogalliAuthorUnsubmitted Done Reply Inline Actions should CI.getModule()->getFunction(Shape.getValue().VectorName) be an assert? When would it ever happen that the vector function is not declared * in the IR? Yes, done. In fact, and IR where there is a VFABI attribute with a mapping to function X, but doesn't have X declared, shoudl be considered broken. note that I'm specifically saying declared and not defined here. You are reaching enlightenment! :) fpetrogalli:* > should CI.getModule()->getFunction(Shape.getValue().VectorName) be an assert? When would it…
		ABataevUnsubmitted Done Reply Inline Actions `const auto &` ABataev: `const auto &`
		}
		bool isFunctionVectorizable(CallInst *CI) const {
		if (!getVFMappings(CI).empty())
		return true;

		if (TLI)
		return TLI->isFunctionVectorizable(CI->getCalledFunction()->getName());
		sdesmalenUnsubmitted Done Reply Inline Actions Please add a message to the assert. sdesmalen: Please add a message to the assert.

		return false;
		}
		bool isFunctionVectorizable(CallInst *CI, unsigned VF) const {

		const auto Shapes = getVFMappings(CI);

		if (std::any_of(Shapes.begin(), Shapes.end(),
		sdesmalenUnsubmitted Done Reply Inline Actions this method doesn't do anything more than invoke `getVFABIMappings` so has no value (other than being public). sdesmalen: this method doesn't do anything more than invoke `getVFABIMappings` so has no value (other than…
		fpetrogalliAuthorUnsubmitted Done Reply Inline Actions this method doesn't do anything more than invoke getVFABIMappings so has no value (other than being public). Yeah, but the value is int he comment inside of it, stating that other VFShapes can be build outside of a VFABI context. I'd prefer to keep it. fpetrogalli: > this method doesn't do anything more than invoke getVFABIMappings so has no value (other than…
		[&VF](VFInfo S) { return S.Shape.VF == VF; }))
		return true;

		if (TLI)
		return TLI->isFunctionVectorizable(CI->getCalledFunction()->getName(),
		VF);

		return false;
		}
		StringRef getVectorizedFunction(CallInst *CI, unsigned VF) const {
		const auto Mappings = getVFMappings(CI);
		for (const auto &Info : Mappings)
		sdesmalenUnsubmitted Done Reply Inline Actions Is it worth implementing a `operator<` for VFShape and sorting the result for `getMappings()`? That way we can use binary search using llvm::lower_bound instead of looping through each shape in `ScalarToVectorMappings`. sdesmalen: Is it worth implementing a `operator<` for VFShape and sorting the result for `getMappings()`?
		fpetrogalliAuthorUnsubmitted Done Reply Inline Actions sort them by what field? VF? Also, I don't expect an attribute to hold more than 8 functions , which seems to be the worst case scenario when all X86 vector extensions are being used... are you sure you want to add such optimization. How about we leave it as it is (slow but simple) and leave optimizations for the future if it turns out we need to speed up the search? fpetrogalli: sort them by what field? VF? Also, I don't expect an attribute to hold more than 8 functions…
		sdesmalenUnsubmitted Done Reply Inline Actions Alright, if it only has a few elements and is constructed to do a single lookup it is probably not worth the overhead of sorting. sdesmalen: Alright, if it only has a few elements and is constructed to do a single lookup it is probably…
		if (Info.Shape.VF == VF)
		return Info.VectorName;

		jdoerfertUnsubmitted Done Reply Inline Actions Arguably, you could just call `getVFMappings` and cache the result. Whoever created a SearchVFSystem will need the values eventually and this way the `isFucntionVectorizable` call is free. `CI->getModule()` The explicit and delete seems a lot of overhead, make the member a `CallInst &CI` and you should not need any of it. jdoerfert: Arguably, you could just call `getVFMappings` and cache the result. Whoever created a…
		if (TLI)
		return TLI->getVectorizedFunction(CI->getCalledFunction()->getName(), VF);

		return "";
		}
		fpetrogalliAuthorUnsubmitted Done Reply Inline Actions This method `fixUpVFABINames` could be invoked once directly in the LoopVectorizer (runOnFunction), so that: it avoids calling it every time the SVFS is instantiated, by doing it once for all the calls in the module (or function). it allows the SVFS to be independent on the TLI. This seems a better solution to me - any opinions? fpetrogalli: This method `fixUpVFABINames` could be invoked once directly in the LoopVectorizer…
		bool isKnownVectorFunctionInLibrary(StringRef F) const {
		if (TLI)
		return TLI->isFunctionVectorizable(F);

		return false;
		}
		};

template <typename T> class ArrayRef;		template <typename T> class ArrayRef;
class DemandedBits;		class DemandedBits;
class GetElementPtrInst;		class GetElementPtrInst;
template <typename InstTy> class InterleaveGroup;		template <typename InstTy> class InterleaveGroup;
class Loop;		class Loop;
class ScalarEvolution;		class ScalarEvolution;
class TargetLibraryInfo;
class TargetTransformInfo;		class TargetTransformInfo;
class Type;		class Type;
class Value;		class Value;

namespace Intrinsic {		namespace Intrinsic {
enum ID : unsigned;		enum ID : unsigned;
}		}

/// Identify if the intrinsic is trivially vectorizable.		/// Identify if the intrinsic is trivially vectorizable.
		fpetrogalliAuthorUnsubmitted Done Reply Inline Actions The `VFISAKind::LLVM_INTERNAL_TLI` value, at the moment, is restricting the use of the SVFS to only those functions that are listed in the TLI. This makes me wonder whether it would be better to move the VFISAKind from the VFShape to the VFInfo. This could be justified under the assumption that a scalar function can be mapped to multiple vector functions with the same vectorization shape (same `FunctionType`), just with a different underlying ISA. Any thoughts? fpetrogalli: The `VFISAKind::LLVM_INTERNAL_TLI` value, at the moment, is restricting the use of the SVFS to…
/// This method returns true if the intrinsic's argument types are all scalars		/// This method returns true if the intrinsic's argument types are all scalars
/// for the scalar form of the intrinsic and all vectors (or scalars handled by		/// for the scalar form of the intrinsic and all vectors (or scalars handled by
/// hasVectorInstrinsicScalarOpd) for the vector form of the intrinsic.		/// hasVectorInstrinsicScalarOpd) for the vector form of the intrinsic.
bool isTriviallyVectorizable(Intrinsic::ID ID);		bool isTriviallyVectorizable(Intrinsic::ID ID);

/// Identifies if the vector form of the intrinsic has a scalar operand.		/// Identifies if the vector form of the intrinsic has a scalar operand.
bool hasVectorInstrinsicScalarOpd(Intrinsic::ID ID, unsigned ScalarOpdIdx);		bool hasVectorInstrinsicScalarOpd(Intrinsic::ID ID, unsigned ScalarOpdIdx);

/// Returns intrinsic ID for call.		/// Returns intrinsic ID for call.
/// For the input call instruction it finds mapping intrinsic and returns		/// For the input call instruction it finds mapping intrinsic and returns
/// its intrinsic ID, in case it does not found it return not_intrinsic.		/// its intrinsic ID, in case it does not found it return not_intrinsic.
Intrinsic::ID getVectorIntrinsicIDForCall(const CallInst *CI,		Intrinsic::ID getVectorIntrinsicIDForCall(const CallInst *CI,
const TargetLibraryInfo *TLI);		const TargetLibraryInfo *TLI);

		jdoerfertUnsubmitted Done Reply Inline Actions Typos in the above TODO. jdoerfert: Typos in the above TODO.
/// Find the operand of the GEP that should be checked for consecutive		/// Find the operand of the GEP that should be checked for consecutive
/// stores. This ignores trailing indices that have no effect on the final		/// stores. This ignores trailing indices that have no effect on the final
/// pointer.		/// pointer.
unsigned getGEPInductionOperand(const GetElementPtrInst *Gep);		unsigned getGEPInductionOperand(const GetElementPtrInst *Gep);

/// If the argument is a GEP, then returns the operand identified by		/// If the argument is a GEP, then returns the operand identified by
/// getGEPInductionOperand. However, if there is some other non-loop-invariant		/// getGEPInductionOperand. However, if there is some other non-loop-invariant
/// operand, it returns that instead.		/// operand, it returns that instead.
▲ Show 20 Lines • Show All 572 Lines • Show Last 20 Lines

llvm/lib/Analysis/LazyCallGraph.cpp

	Show All 9 Lines
	#include "llvm/ADT/ArrayRef.h"			#include "llvm/ADT/ArrayRef.h"
	#include "llvm/ADT/STLExtras.h"			#include "llvm/ADT/STLExtras.h"
	#include "llvm/ADT/ScopeExit.h"			#include "llvm/ADT/ScopeExit.h"
	#include "llvm/ADT/Sequence.h"			#include "llvm/ADT/Sequence.h"
	#include "llvm/ADT/SmallPtrSet.h"			#include "llvm/ADT/SmallPtrSet.h"
	#include "llvm/ADT/SmallVector.h"			#include "llvm/ADT/SmallVector.h"
	#include "llvm/ADT/iterator_range.h"			#include "llvm/ADT/iterator_range.h"
	#include "llvm/Analysis/TargetLibraryInfo.h"			#include "llvm/Analysis/TargetLibraryInfo.h"
				#include "llvm/Analysis/VectorUtils.h"
	#include "llvm/Config/llvm-config.h"			#include "llvm/Config/llvm-config.h"
	#include "llvm/IR/CallSite.h"			#include "llvm/IR/CallSite.h"
	#include "llvm/IR/Function.h"			#include "llvm/IR/Function.h"
	#include "llvm/IR/GlobalVariable.h"			#include "llvm/IR/GlobalVariable.h"
	#include "llvm/IR/Instruction.h"			#include "llvm/IR/Instruction.h"
	#include "llvm/IR/Module.h"			#include "llvm/IR/Module.h"
	#include "llvm/IR/PassManager.h"			#include "llvm/IR/PassManager.h"
	#include "llvm/Support/Casting.h"			#include "llvm/Support/Casting.h"
	▲ Show 20 Lines • Show All 115 Lines • ▼ Show 20 Lines
	LLVM_DUMP_METHOD void LazyCallGraph::Node::dump() const {			LLVM_DUMP_METHOD void LazyCallGraph::Node::dump() const {
	dbgs() << *this << '\n';			dbgs() << *this << '\n';
	}			}
	#endif			#endif

	static bool isKnownLibFunction(Function &F, TargetLibraryInfo &TLI) {			static bool isKnownLibFunction(Function &F, TargetLibraryInfo &TLI) {
	LibFunc LF;			LibFunc LF;

	// Either this is a normal library function or a "vectorizable" function.			// Either this is a normal library function or a "vectorizable"
	return TLI.getLibFunc(F, LF) \|\| TLI.isFunctionVectorizable(F.getName());			// function. Not using the SearchVFSystem here because this query
				// is related only to libraries handled via the TLI.
				return TLI.getLibFunc(F, LF) \|\|
				SearchVFSystem(&TLI).isKnownVectorFunctionInLibrary(F.getName());
	}			}

	LazyCallGraph::LazyCallGraph(			LazyCallGraph::LazyCallGraph(
	Module &M, function_ref<TargetLibraryInfo &(Function &)> GetTLI) {			Module &M, function_ref<TargetLibraryInfo &(Function &)> GetTLI) {
	LLVM_DEBUG(dbgs() << "Building CG for module: " << M.getModuleIdentifier()			LLVM_DEBUG(dbgs() << "Building CG for module: " << M.getModuleIdentifier()
	<< "\n");			<< "\n");
	for (Function &F : M) {			for (Function &F : M) {
	if (F.isDeclaration())			if (F.isDeclaration())
	▲ Show 20 Lines • Show All 1,658 Lines • Show Last 20 Lines

llvm/lib/Analysis/LoopAccessAnalysis.cpp

Show First 20 Lines • Show All 1,838 Lines • ▼ Show 20 Lines	for (Instruction &I : *BB) {
// the flag. Therefore, it is safe to ignore this read from memory.		// the flag. Therefore, it is safe to ignore this read from memory.
auto *Call = dyn_cast<CallInst>(&I);		auto *Call = dyn_cast<CallInst>(&I);
if (Call && getVectorIntrinsicIDForCall(Call, TLI))		if (Call && getVectorIntrinsicIDForCall(Call, TLI))
continue;		continue;

// If the function has an explicit vectorized counterpart, we can safely		// If the function has an explicit vectorized counterpart, we can safely
// assume that it can be vectorized.		// assume that it can be vectorized.
if (Call && !Call->isNoBuiltin() && Call->getCalledFunction() &&		if (Call && !Call->isNoBuiltin() && Call->getCalledFunction() &&
TLI->isFunctionVectorizable(Call->getCalledFunction()->getName()))		SearchVFSystem(TLI).isFunctionVectorizable(Call))
		sdesmalenUnsubmitted Done Reply Inline Actions `getMappings` is quite an expensive call, so you'll want to add a special function here that bails out earlier, and doesn't have to demangle string and populate (and possibly sort) an array. sdesmalen: `getMappings` is quite an expensive call, so you'll want to add a special function here that…
		fpetrogalliAuthorUnsubmitted Done Reply Inline Actions Done, check the implementation of `getVFABIMappings`. fpetrogalli: Done, check the implementation of `getVFABIMappings`.
		sdesmalenUnsubmitted Done Reply Inline Actions The early exit in getMappings doesn't stop a SmallVector from having to be created/destroyed. It would be better to create a new method such as `bool hasVectorVariants()` that answers the question directly. sdesmalen: The early exit in getMappings doesn't stop a SmallVector from having to be created/destroyed.
		fpetrogalliAuthorUnsubmitted Done Reply Inline Actions I understand your concerns in performance here, but I am not keen in doing this. The attribute might contain junk that don't demangle correctly to a VFInfo - in that case the function would return "yes, this call has vector variant", but that wouldn't make sense because there would be no variant. I'd rather play it safe. fpetrogalli: I understand your concerns in performance here, but I am not keen in doing this. The attribute…
continue;		continue;

auto *Ld = dyn_cast<LoadInst>(&I);		auto *Ld = dyn_cast<LoadInst>(&I);
if (!Ld) {		if (!Ld) {
recordAnalysis("CantVectorizeInstruction", Ld)		recordAnalysis("CantVectorizeInstruction", Ld)
<< "instruction cannot be vectorized";		<< "instruction cannot be vectorized";
HasComplexMemInst = true;		HasComplexMemInst = true;
continue;		continue;
▲ Show 20 Lines • Show All 564 Lines • ▼ Show 20 Lines
bool LoopAccessLegacyAnalysis::runOnFunction(Function &F) {		bool LoopAccessLegacyAnalysis::runOnFunction(Function &F) {
SE = &getAnalysis<ScalarEvolutionWrapperPass>().getSE();		SE = &getAnalysis<ScalarEvolutionWrapperPass>().getSE();
auto *TLIP = getAnalysisIfAvailable<TargetLibraryInfoWrapperPass>();		auto *TLIP = getAnalysisIfAvailable<TargetLibraryInfoWrapperPass>();
TLI = TLIP ? &TLIP->getTLI(F) : nullptr;		TLI = TLIP ? &TLIP->getTLI(F) : nullptr;
AA = &getAnalysis<AAResultsWrapperPass>().getAAResults();		AA = &getAnalysis<AAResultsWrapperPass>().getAAResults();
DT = &getAnalysis<DominatorTreeWrapperPass>().getDomTree();		DT = &getAnalysis<DominatorTreeWrapperPass>().getDomTree();
LI = &getAnalysis<LoopInfoWrapperPass>().getLoopInfo();		LI = &getAnalysis<LoopInfoWrapperPass>().getLoopInfo();

return false;		return false;
}		}
		jdoerfertUnsubmitted Done Reply Inline Actions Leftover? jdoerfert: Leftover?

void LoopAccessLegacyAnalysis::getAnalysisUsage(AnalysisUsage &AU) const {		void LoopAccessLegacyAnalysis::getAnalysisUsage(AnalysisUsage &AU) const {
AU.addRequired<ScalarEvolutionWrapperPass>();		AU.addRequired<ScalarEvolutionWrapperPass>();
AU.addRequired<AAResultsWrapperPass>();		AU.addRequired<AAResultsWrapperPass>();
AU.addRequired<DominatorTreeWrapperPass>();		AU.addRequired<DominatorTreeWrapperPass>();
AU.addRequired<LoopInfoWrapperPass>();		AU.addRequired<LoopInfoWrapperPass>();

AU.setPreservesAll();		AU.setPreservesAll();
Show All 27 Lines

llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp

Show First 20 Lines • Show All 663 Lines • ▼ Show 20 Lines	for (Instruction &I : *BB) {
// We handle calls that:		// We handle calls that:
// * Are debug info intrinsics.		// * Are debug info intrinsics.
// * Have a mapping to an IR intrinsic.		// * Have a mapping to an IR intrinsic.
// * Have a vector version available.		// * Have a vector version available.
auto *CI = dyn_cast<CallInst>(&I);		auto *CI = dyn_cast<CallInst>(&I);
if (CI && !getVectorIntrinsicIDForCall(CI, TLI) &&		if (CI && !getVectorIntrinsicIDForCall(CI, TLI) &&
!isa<DbgInfoIntrinsic>(CI) &&		!isa<DbgInfoIntrinsic>(CI) &&
!(CI->getCalledFunction() && TLI &&		!(CI->getCalledFunction() && TLI &&
TLI->isFunctionVectorizable(CI->getCalledFunction()->getName()))) {		SearchVFSystem(TLI).isFunctionVectorizable(CI))) {
// If the call is a recognized math libary call, it is likely that		// If the call is a recognized math libary call, it is likely that
// we can vectorize it given loosened floating-point constraints.		// we can vectorize it given loosened floating-point constraints.
LibFunc Func;		LibFunc Func;
bool IsMathLibCall =		bool IsMathLibCall =
TLI && CI->getCalledFunction() &&		TLI && CI->getCalledFunction() &&
CI->getType()->isFloatingPointTy() &&		CI->getType()->isFloatingPointTy() &&
TLI->getLibFunc(CI->getCalledFunction()->getName(), Func) &&		TLI->getLibFunc(CI->getCalledFunction()->getName(), Func) &&
TLI->hasOptimizedCodeGen(Func);		TLI->hasOptimizedCodeGen(Func);
▲ Show 20 Lines • Show All 560 Lines • Show Last 20 Lines

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp

This file is larger than 256 KB, so syntax highlighting is disabled by default.

Show First 20 Lines • Show All 3,177 Lines • ▼ Show 20 Lines	for (BasicBlock::iterator I = BB->begin(), E = BB->end(); I != E;) {
CSEMap[In] = In;		CSEMap[In] = In;
}		}
}		}

unsigned LoopVectorizationCostModel::getVectorCallCost(CallInst *CI,		unsigned LoopVectorizationCostModel::getVectorCallCost(CallInst *CI,
unsigned VF,		unsigned VF,
bool &NeedToScalarize) {		bool &NeedToScalarize) {
Function *F = CI->getCalledFunction();		Function *F = CI->getCalledFunction();
StringRef FnName = CI->getCalledFunction()->getName();
Type *ScalarRetTy = CI->getType();		Type *ScalarRetTy = CI->getType();
SmallVector<Type *, 4> Tys, ScalarTys;		SmallVector<Type *, 4> Tys, ScalarTys;
for (auto &ArgOp : CI->arg_operands())		for (auto &ArgOp : CI->arg_operands())
ScalarTys.push_back(ArgOp->getType());		ScalarTys.push_back(ArgOp->getType());

// Estimate cost of scalarized vector call. The source operands are assumed		// Estimate cost of scalarized vector call. The source operands are assumed
// to be vectors, so we need to extract individual elements from there,		// to be vectors, so we need to extract individual elements from there,
// execute VF scalar calls, and then gather the result into the vector return		// execute VF scalar calls, and then gather the result into the vector return
Show All 11 Lines	unsigned LoopVectorizationCostModel::getVectorCallCost(CallInst *CI,
// packing the return values to a vector.		// packing the return values to a vector.
unsigned ScalarizationCost = getScalarizationOverhead(CI, VF);		unsigned ScalarizationCost = getScalarizationOverhead(CI, VF);

unsigned Cost = ScalarCallCost * VF + ScalarizationCost;		unsigned Cost = ScalarCallCost * VF + ScalarizationCost;

// If we can't emit a vector call for this function, then the currently found		// If we can't emit a vector call for this function, then the currently found
// cost is the cost we need to return.		// cost is the cost we need to return.
NeedToScalarize = true;		NeedToScalarize = true;
if (!TLI \|\| !TLI->isFunctionVectorizable(FnName, VF) \|\| CI->isNoBuiltin())		if (!TLI \|\| !SearchVFSystem(TLI).isFunctionVectorizable(CI, VF) \|\|
		CI->isNoBuiltin())
		sdesmalenUnsubmitted Done Reply Inline Actions the call to `isFunctionVectorizable` is expensive, so please reorder this after `CI->isNoBuiltin()`, so that the function can bail out more cheaply. sdesmalen: the call to `isFunctionVectorizable` is expensive, so please reorder this after `CI…
return Cost;		return Cost;

// If the corresponding vector cost is cheaper, return its cost.		// If the corresponding vector cost is cheaper, return its cost.
unsigned VectorCallCost = TTI.getCallInstrCost(nullptr, RetTy, Tys);		unsigned VectorCallCost = TTI.getCallInstrCost(nullptr, RetTy, Tys);
if (VectorCallCost < Cost) {		if (VectorCallCost < Cost) {
NeedToScalarize = false;		NeedToScalarize = false;
return VectorCallCost;		return VectorCallCost;
}		}
▲ Show 20 Lines • Show All 997 Lines • ▼ Show 20 Lines	case Instruction::Call: {
// Ignore dbg intrinsics.		// Ignore dbg intrinsics.
if (isa<DbgInfoIntrinsic>(I))		if (isa<DbgInfoIntrinsic>(I))
break;		break;
setDebugLocFromInst(Builder, &I);		setDebugLocFromInst(Builder, &I);

Module *M = I.getParent()->getParent()->getParent();		Module *M = I.getParent()->getParent()->getParent();
auto *CI = cast<CallInst>(&I);		auto *CI = cast<CallInst>(&I);

StringRef FnName = CI->getCalledFunction()->getName();
Function *F = CI->getCalledFunction();		Function *F = CI->getCalledFunction();
Type *RetTy = ToVectorTy(CI->getType(), VF);		Type *RetTy = ToVectorTy(CI->getType(), VF);
SmallVector<Type *, 4> Tys;		SmallVector<Type *, 4> Tys;
for (Value *ArgOperand : CI->arg_operands())		for (Value *ArgOperand : CI->arg_operands())
Tys.push_back(ToVectorTy(ArgOperand->getType(), VF));		Tys.push_back(ToVectorTy(ArgOperand->getType(), VF));

Intrinsic::ID ID = getVectorIntrinsicIDForCall(CI, TLI);		Intrinsic::ID ID = getVectorIntrinsicIDForCall(CI, TLI);

Show All 22 Lines	for (unsigned Part = 0; Part < UF; ++Part) {
if (UseVectorIntrinsic) {		if (UseVectorIntrinsic) {
// Use vector version of the intrinsic.		// Use vector version of the intrinsic.
Type *TysForDecl[] = {CI->getType()};		Type *TysForDecl[] = {CI->getType()};
if (VF > 1)		if (VF > 1)
TysForDecl[0] = VectorType::get(CI->getType()->getScalarType(), VF);		TysForDecl[0] = VectorType::get(CI->getType()->getScalarType(), VF);
VectorF = Intrinsic::getDeclaration(M, ID, TysForDecl);		VectorF = Intrinsic::getDeclaration(M, ID, TysForDecl);
} else {		} else {
// Use vector version of the library call.		// Use vector version of the library call.
StringRef VFnName = TLI->getVectorizedFunction(FnName, VF);		StringRef VFnName = SearchVFSystem(TLI).getVectorizedFunction(CI, VF);
assert(!VFnName.empty() && "Vector function name is empty.");		assert(!VFnName.empty() && "Vector function name is empty.");
VectorF = M->getFunction(VFnName);		VectorF = M->getFunction(VFnName);
if (!VectorF) {		if (!VectorF) {
// Generate a declaration		// Generate a declaration
FunctionType *FTy = FunctionType::get(RetTy, Tys, false);		FunctionType *FTy = FunctionType::get(RetTy, Tys, false);
VectorF =		VectorF =
Function::Create(FTy, Function::ExternalLinkage, VFnName, M);		Function::Create(FTy, Function::ExternalLinkage, VFnName, M);
VectorF->copyAttributesFrom(F);		VectorF->copyAttributesFrom(F);
▲ Show 20 Lines • Show All 3,561 Lines • Show Last 20 Lines

This is an archive of the discontinued LLVM Phabricator instance.

[VectorUtils] Introduce the Vector Function Database (VFDatabase).ClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 220174

llvm/include/llvm/Analysis/TargetLibraryInfo.h

llvm/include/llvm/Analysis/VectorUtils.h

llvm/lib/Analysis/LazyCallGraph.cpp

llvm/lib/Analysis/LoopAccessAnalysis.cpp

llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp

[VectorUtils] Introduce the Vector Function Database (VFDatabase).
ClosedPublic