This is an archive of the discontinued LLVM Phabricator instance.

Differential D95373

Replace vector intrinsics with call to vector library
Closed, Public

Authored by LukasSommerTu on Jan 25 2021, 9:22 AM.

Details

Reviewers

venkataramanan.kumar.llvm
spatel
fhahn
lebedev.ri

Commits

rG6577cef9b03f: [CodeGen] New pass: Replace vector intrinsics with call to vector library
rG2303e93e666e: [Codegen][ReplaceWithVecLib] add pass to replace vector intrinsics with calls…

Summary

This patch adds a pass to replace calls to vector intrinsics (i.e., LLVM intrinsics operating on vector operands) with calls to a vector library.

Currently, calls to LLVM intrinsics are only replaced with calls to vector libraries when scalar calls to intrinsics are vectorized by the Loop- or SLP-Vectorizer.

With this pass, it is now possible to replace calls to LLVM intrinsics already operating on vector operands, e.g., if such code was generated by MLIR. For the replacement, information from the TargetLibraryInfo, e.g., as specified via -vector-library, is used.

Event Timeline

LukasSommerTu created this revision. Jan 25 2021, 9:22 AM

Herald added subscribers: rriddle, hiraditya, mgorny. Jan 25 2021, 9:22 AM

LukasSommerTu requested review of this revision. Jan 25 2021, 9:22 AM

Herald added subscribers: llvm-commits, stephenneuendorffer. Jan 25 2021, 9:22 AM

What's the intended use-case of this pass?

In D95373#2520302, @lebedev.ri wrote:

What's the intended use-case of this pass?

Some frontends, or in my case MLIR, generate calls to LLVM intrinsics operating on vector operands, e.g.:
%call = call <4 x double> @llvm.exp.v4f64(<4 x double> %in)

Those calls are currently not replaced with calls to vector libraries (e.g., libmvec, SVML), because this replacement only happens for scalar calls in the Loop-/SLP-vectorizer.

With the pass, it is now possible to replace the calls for improved performance, resulting in the following for SVML & the example above:
%call = call <4 x double> @__svml_exp4(<4 x double> %in)
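
The replacement described above can be pictured as a table lookup keyed on an intrinsic name and a vector width. The following is a minimal, self-contained C++ sketch of that idea and not the pass itself: `VecLibTable`, `makeExampleSvmlTable` and `lookupVectorCall` are hypothetical names, keying on the scalar intrinsic name is an assumption made for illustration, and the single table entry is the SVML example from this comment. In the patch, the mapping is described as coming from TargetLibraryInfo.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for the vector-library mappings that the real pass
// obtains from TargetLibraryInfo. Key: (scalar intrinsic name, vector width).
using VecLibTable = std::map<std::pair<std::string, unsigned>, std::string>;

// Illustrative table holding only the SVML example from this thread.
inline VecLibTable makeExampleSvmlTable() {
  VecLibTable Table;
  Table.emplace(std::make_pair(std::string("llvm.exp.f64"), 4u),
                "__svml_exp4");
  return Table;
}

// Returns the name of the library function to call instead of the vector
// intrinsic, or an empty string if no mapping exists (in which case the
// call would be left untouched).
inline std::string lookupVectorCall(const VecLibTable &Table,
                                    const std::string &ScalarName,
                                    unsigned VF) {
  assert(VF > 0 && "vector width must be non-zero");
  auto It = Table.find(std::make_pair(ScalarName, VF));
  if (It == Table.end())
    return std::string();
  return It->second;
}
```

With this toy table, a 4-wide `exp` resolves to `__svml_exp4`, while any other width or intrinsic yields an empty result, so nothing would be replaced.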

In D95373#2520314, @LukasSommerTu wrote:

In D95373#2520302, @lebedev.ri wrote:

What's the intended use-case of this pass?

Some frontends, or in my case MLIR, generate calls to LLVM intrinsics operating on vector operands, e.g.:
%call = call <4 x double> @llvm.exp.v4f64(<4 x double> %in)

Those calls are currently not replaced with calls to vector libraries (e.g., libmvec, SVML), because this replacement only happens for scalar calls in the Loop-/SLP-vectorizer.

With the pass, it is now possible to replace the calls for improved performance, resulting in the following for SVML & the example above:
%call = call <4 x double> @__svml_exp4(<4 x double> %in)

I understand what transformation this performs, that wasn't my question.

What I mean is that this is the opposite of the transformation
I'd expect to happen for middle-end IR optimizations,
and this lowering should be happening somewhere in the backend part
of the compilation pipeline. So I'm mainly asking where this is intended to be run.

@lebedev.ri: Thanks for your feedback!

My reasoning to implement this as an IR pass was that the replacement for the scalar version of the intrinsics is also happening as part of the middle-end, so it made sense to me to implement this in a similar location and use similar mechanisms.

My motivating usage scenario was to add this pass to the pipeline that is applied after converting from MLIR to LLVM IR, which introduces these calls. As the replacement requires some target-specific information (TargetLibraryInfo), implementing it on the LLVM side rather than in MLIR seemed reasonable to me. If other frontends produce similar IR, they can also benefit from this pass/transformation.

In D95373#2520409, @LukasSommerTu wrote:

@lebedev.ri: Thanks for your feedback!

My reasoning to implement this as an IR pass was that the replacement for the scalar version of the intrinsics is also happening as part of the middle-end, so it made sense to me to implement this in a similar location and use similar mechanisms.

Hm, it does? That seems quite surprising to me.

My motivating usage scenario was to add this pass to the pipeline that is applied after converting from MLIR to LLVM IR, which introduces these calls. As the replacement requires some target-specific information (TargetLibraryInfo), implementing it on the LLVM side rather than in MLIR seemed reasonable to me. If other frontends produce similar IR, they can also benefit from this pass/transformation.

The thing is, the passes that now know how to deal with whatever IR instruction/intrinsic,
will now need to also know about N flavors of those intrinsics shaped in the form of a libcall.
This doesn't quite seem like the right direction.
Also, tangentially related, is there cost modelling being performed for them?

So I would still expect this to happen somewhere in the codegen (llc) pipeline, not the optimization (opt) pipeline.

Harbormaster completed remote builds in B86585: Diff 319032. Jan 25 2021, 10:32 AM

In D95373#2520441, @lebedev.ri wrote:

In D95373#2520409, @LukasSommerTu wrote:

@lebedev.ri: Thanks for your feedback!

My reasoning to implement this as an IR pass was that the replacement for the scalar version of the intrinsics is also happening as part of the middle-end, so it made sense to me to implement this in a similar location and use similar mechanisms.

Hm, it does? That seems quite surprising to me.

Yes, inject-tli-mappings adds the VFABI attributes to the intrinsic calls and the loop-vectorizer performs the replacement as part of the vectorization.
For example, %call = tail call double @llvm.sin.f64(double %conv) will be replaced inside a loop that is vectorized by %5 = call <4 x double> @_ZGVdN4v_sin(<4 x double> %4) using opt -vector-library=LIBMVEC-X86 -S -inject-tli-mappings -loop-vectorize

More examples can be found in https://github.com/llvm/llvm-project/blob/main/llvm/test/Transforms/LoopVectorize/X86/libm-vector-calls.ll
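
The name `_ZGVdN4v_sin` in the example above appears to follow the x86 Vector Function ABI mangling scheme used by libmvec. A minimal sketch of how such a name is composed, covering only the simple case in which every parameter is a plain vector parameter; `mangleVectorVariant` is a hypothetical helper for illustration and is not an LLVM API:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Sketch of x86 Vector Function ABI name mangling for the simple case in
// which every parameter is a plain vector parameter ('v').
// Layout: _ZGV <isa> <mask> <vlen> <parameters> _ <scalar name>
inline std::string mangleVectorVariant(char Isa, bool Masked, unsigned VLen,
                                       std::size_t NumVectorParams,
                                       const std::string &ScalarName) {
  assert(VLen > 0 && "vector length must be non-zero");
  std::string Name = "_ZGV";
  Name += Isa;                       // ISA letter, e.g. 'd' for AVX2 on x86
  Name += Masked ? 'M' : 'N';        // 'N' marks the unmasked variant
  Name += std::to_string(VLen);      // number of lanes
  Name.append(NumVectorParams, 'v'); // one 'v' per vector parameter
  Name += '_';
  Name += ScalarName;
  return Name;
}
```

Under these assumptions, `mangleVectorVariant('d', false, 4, 1, "sin")` composes the name quoted in the comment above.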

I get your point about placing such a transformation as late as possible in the pipeline or in the codegen, but as the Loop- and SLP-vectorizer perform a similar transformation, I thought it might be a good idea to reuse similar mechanisms and infrastructure.

I haven't looked at the trade-offs, but if we can do this transform later ( SelectionDAGLegalize::ConvertNodeToLibcall() ? ) that would seem to be more flexible...and smaller patch?
For reference:
D8131 - added support for external vector library calls
D70107 - added inject-tli-mappings

In D95373#2522603, @spatel wrote:

I haven't looked at the trade-offs, but if we can do this transform later ( SelectionDAGLegalize::ConvertNodeToLibcall() ? ) that would seem to be more flexible...and smaller patch?
For reference:
D8131 - added support for external vector library calls
D70107 - added inject-tli-mappings

+1 on doing this as late as possible. Ideally we would then just get rid of the logic in LV/SLP that inserts the library calls.

fhahn edited reviewers, added: fhahn; removed: Florian. Jan 26 2021, 5:36 AM

lebedev.ri requested changes to this revision. Jan 26 2021, 5:38 AM

This revision now requires changes to proceed. Jan 26 2021, 5:38 AM

In D95373#2522607, @fhahn wrote:

+1 on doing this as late as possible. Ideally we would then just get rid of the logic in LV/SLP that inserts the library calls.

I think SelectionDAG would scalarize the operation in LegalizeVectorOps before they reach LegalizeDAG.

In D95373#2523207, @craig.topper wrote:

I think SelectionDAG would scalarize the operation in LegalizeVectorOps before they reach LegalizeDAG.

So an IR codegen pass that runs after the optimizer would be best? If the patch is small enough, we could glom this into CGP?

In D95373#2523383, @spatel wrote:

So an IR codegen pass that runs after the optimizer would be best? If the patch is small enough, we could glom this into CGP?

Does GlobalISel also use CGP? Do we want to emit vector libcalls for vector IR intrinsics at O0? If I recall, CGP is disabled at O0.

In D95373#2523383, @spatel wrote:

So an IR codegen pass that runs after the optimizer would be best? If the patch is small enough, we could glom this into CGP?

Can this be scheduled as a pre code generation pass ?

In D95373#2523413, @craig.topper wrote:

Does GlobalISel also use CGP? Do we want to emit vector libcalls for vector IR intrinsics at O0? If I recall, CGP is disabled at O0.

GlobalISel should not use CGP in the long-run since CGP was purely a hack to overcome SDAG's single-block limitation. I'm not sure what the status is for GlobalISel targets currently though.

In D95373#2524237, @venkataramanan.kumar.llvm wrote:

Can this be scheduled as a pre code generation pass ?

I haven't actually looked at the patch details, but if there's agreement that it's ok to leave it as an IR transform, then it should be a small adjustment to make this an IR codegen pass. It would be similar to the existing ExpandReductions pass.

In D95373#2525396, @spatel wrote:

In D95373#2523413, @craig.topper wrote:

Does GlobalISel also use CGP? Do we want to emit vector libcalls for vector IR intrinsics at O0? If I recall, CGP is disabled at O0.

GlobalISel should not use CGP in the long-run since CGP was purely a hack to overcome SDAG's single-block limitation. I'm not sure what the status is for GlobalISel targets currently though.

In D95373#2524237, @venkataramanan.kumar.llvm wrote:

Can this be scheduled as a pre code generation pass ?

I haven't actually looked at the patch details, but if there's agreement that it's ok to leave it as an IR transform, then it should be a small adjustment to make this an IR codegen pass. It would be similar to the existing ExpandReductions pass.

That is what I was suggesting to do FWIW.
Perhaps this makes sense as a middle-end optimization, but then we should have intrinsics instead of all these libcalls,
and somehow everything would need to be updated to be okay with that, but that will require *much* larger changes.

Thanks to everyone for your feedback so far!

I haven't actually looked at the patch details, but if there's agreement that it's ok to leave it as an IR transform, then it should be a small adjustment to make this an IR codegen pass. It would be similar to the existing ExpandReductions pass.

As this solution (i.e., keeping it as an IR pass, but move it to be an IR codegen pass) seems to be acceptable, I would change the implementation accordingly and update the patch.

@lebedev.ri, @fhahn, @spatel, @venkataramanan.kumar.llvm, @craig.topper: Any objections?

In D95373#2525539, @LukasSommerTu wrote:

Thanks to everyone for your feedback so far!

I haven't actually looked at the patch details, but if there's agreement that it's ok to leave it as an IR transform, then it should be a small adjustment to make this an IR codegen pass. It would be similar to the existing ExpandReductions pass.

As this solution (i.e., keeping it as an IR pass, but move it to be an IR codegen pass) seems to be acceptable, I would change the implementation accordingly and update the patch.

@lebedev.ri, @fhahn, @spatel, @venkataramanan.kumar.llvm, @craig.topper: Any objections?

No objections from me.

In D95373#2525539, @LukasSommerTu wrote:

Thanks to everyone for your feedback so far!

I haven't actually looked at the patch details, but if there's agreement that it's ok to leave it as an IR transform, then it should be a small adjustment to make this an IR codegen pass. It would be similar to the existing ExpandReductions pass.

As this solution (i.e., keeping it as an IR pass, but move it to be an IR codegen pass) seems to be acceptable, I would change the implementation accordingly and update the patch.

@lebedev.ri, @fhahn, @spatel, @venkataramanan.kumar.llvm, @craig.topper: Any objections?

I am ok with this.

I've updated the implementation to have this pass as an IR codegen pass.

I did not add an explicit flag to deactivate the pass (as is the case for ExpandReductions, for example), as the replacement can be deactivated by setting --vector-library=none (which is also the default). The pass is also not added to the pipeline for -O0.

Similar to ExpandReductions, the pass now explicitly deletes the call instructions to intrinsics that have been replaced and does not rely on DCE, given that it now runs very late in the pipeline.
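
The explicit deletion mentioned above can be illustrated with a toy model. This is a sketch under stated assumptions rather than the code from the patch: `ToyCall`, `ToyBlock` and `replaceAndErase` are hypothetical names, a `std::list` stands in for a basic block, and collecting the replaced calls first and erasing them after the traversal is one assumed way to avoid touching the sequence being iterated.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <vector>

// Toy model of a call instruction; the real pass works on LLVM IR calls.
struct ToyCall {
  std::string Callee;
};
using ToyBlock = std::list<ToyCall>;

// Inserts a call to `To` in front of every call to `From`, then erases the
// replaced calls itself instead of leaving them for a later dead-code
// elimination run. Returns the number of replaced calls.
inline unsigned replaceAndErase(ToyBlock &Block, const std::string &From,
                                const std::string &To) {
  assert(From != To && "replacement must differ from the original callee");
  std::vector<ToyBlock::iterator> Replaced;
  for (auto It = Block.begin(); It != Block.end(); ++It) {
    if (It->Callee != From)
      continue;
    Block.insert(It, ToyCall{To}); // new call goes right before the old one
    Replaced.push_back(It);        // remember the old call for later removal
  }
  // Erase only after the traversal is finished; std::list iterators to other
  // elements stay valid across insert and erase.
  for (ToyBlock::iterator It : Replaced)
    Block.erase(It);
  return static_cast<unsigned>(Replaced.size());
}
```

After the call, the block contains only the new library calls and the untouched instructions, so no cleanup pass is needed afterwards.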

Herald added subscribers: nikic, pengfei. Feb 1 2021, 8:04 AM

[EDIT] I realized there was an inconsistency between CodeGenPassBuilder and TargetPassConfig, which is now corrected.

spatel added inline comments. Feb 3 2021, 12:13 PM

llvm/lib/CodeGen/ReplaceWithVeclib.cpp
Line 79: Can we assert here that the FunctionType of `Replacement` is the same as the FunctionType of `CI`?
Line 165: typo: intrinsic
Line 186: typo: intrinsics
llvm/test/CodeGen/Generic/replace-intrinsics-with-veclib.ll
Line 14: FileCheck typo: "sAME" Can you use utils/update_test_checks.py to auto-generate the assertions?

Thanks @spatel for the detailed review!

I've addressed your comments and updated the patch. The test is now generated by utils/update_test_checks.py and I've added the assertion.

LukasSommerTu marked 4 inline comments as done. Feb 4 2021, 7:07 AM

LGTM - see inline for a couple of minor points.

llvm/lib/CodeGen/ReplaceWithVeclib.cpp
Line 81: Move this above the replaceAllUsesWith() to be more effective (not just for FP calls)?
llvm/test/CodeGen/Generic/replace-intrinsics-with-veclib.ll
Line 65: The test comments that were here in the previous revision were helpful. I'd put those back (but put them outside of the function body to avoid interfering with the script's CHECK lines).

@spatel: Thanks for your quick feedback, I've addressed your comments and updated the patch.

I do not have commit access, could someone please commit for me? Thanks!

LukasSommerTu marked 2 inline comments as done. Feb 4 2021, 9:25 AM

In D95373#2542587, @LukasSommerTu wrote:

@spatel: Thanks for your quick feedback, I've addressed your comments and updated the patch.

I do not have commit access, could someone please commit for me? Thanks!

I can commit it for you. Let's check if there are any more comments though - @lebedev.ri , was making this a codegen pass sufficient for your change request?

This revision was not accepted when it landed; it landed in state Needs Review. Feb 5 2021, 11:25 AM

Closed by commit rG2303e93e666e: [Codegen][ReplaceWithVecLib] add pass to replace vector intrinsics with calls… (authored by LukasSommerTu, committed by spatel).

This revision was automatically updated to reflect the committed changes.

spatel added a commit: rG2303e93e666e: [Codegen][ReplaceWithVecLib] add pass to replace vector intrinsics with calls….

In D95373#2543185, @spatel wrote:

In D95373#2542587, @LukasSommerTu wrote:

@spatel: Thanks for your quick feedback, I've addressed your comments and updated the patch.

I do not have commit access, could someone please commit for me? Thanks!

I can commit it for you. Let's check if there are any more comments though - @lebedev.ri , was making this a codegen pass sufficient for your change request?

Oops - I didn't notice that I had this patch lined up with other local changes and pushed it just now. Let me know if I should revert.

@spatel: I got notified about some build-bot failures, so maybe it's better to revert the commit and I will check for the cause of the build-bot failures.

spatel added a reverting change: rGc981f6f8e16e: Revert "[Codegen][ReplaceWithVecLib] add pass to replace vector intrinsics with…. Feb 5 2021, 12:10 PM

In D95373#2545856, @LukasSommerTu wrote:

@spatel: I got notified about some build-bot failures, so maybe it's better to revert the commit and I will check for the cause of the build-bot failures.

Ok - reverted:
c981f6f8e16

spatel reopened this revision. Feb 5 2021, 12:12 PM

In D95373#2545877, @spatel wrote:

In D95373#2545856, @LukasSommerTu wrote:

@spatel: I got notified about some build-bot failures, so maybe it's better to revert the commit and I will check for the cause of the build-bot failures.

Ok - reverted:
c981f6f8e16

My guess on the problem is that the test file was put in the "Generic" test directory but specified an x86 target in the text. BuildBots may not build the X86 target. We either need to make the test truly generic (target-independent) or put the file in the X86 dir.

The failure also seems to happen with X86 builds, so I suspect some connection to the new PassManager changes, as the pass is not recognized at all. I will check that in detail and also make sure that the test works correctly if the X86 target is not built.

@LukasSommerTu are you also looking into unifying/removing the code in SLPVectorizer/LV to create library calls?

llvm/include/llvm/CodeGen/ReplaceWithVeclib.h
Line 23: nit: I think the coding style recommends using `struct` if all members are public.
llvm/lib/CodeGen/ReplaceWithVeclib.cpp
Line 73: nit: Why `IRBuilder{&CI}` here but `Args(CI.arg_operands())` below? Can this just be `IRBuilder(&CI)` for consistency?

I've found the reason for the build-bot failure: I had overlooked that codegen passes are currently pinned to the legacy PM in opt. I fixed this and also addressed @fhahn's inline comments (thanks for the feedback!).

LukasSommerTu marked 2 inline comments as done. Feb 8 2021, 9:24 AM

@LukasSommerTu are you also looking into unifying/removing the code in SLPVectorizer/LV to create library calls?

@fhahn: No, I haven't looked into that so far. I assume it would make sense to have a separate patch for that?

Should I try pushing this again?

I reproduced the cause of the failure in the build-bot locally, fixed the bug and successfully ran the tests locally, so if you do not have any additional things that I should change, you could try again. Thanks!

In D95373#2557483, @LukasSommerTu wrote:

I reproduced the cause of the failure in the build-bot locally, fixed the bug and successfully ran the tests locally, so if you do not have any additional things that I should change, you could try again. Thanks!

D96011 changed the TLI.getVectorizedFunction() API, so we need a small adjustment to make this compile. Let me know if this looks right ( cc @david-arm ):

diff --git a/llvm/lib/CodeGen/ReplaceWithVeclib.cpp b/llvm/lib/CodeGen/ReplaceWithVeclib.cpp
index 943199933494..bec0fb772c3e 100644
--- a/llvm/lib/CodeGen/ReplaceWithVeclib.cpp
+++ b/llvm/lib/CodeGen/ReplaceWithVeclib.cpp
@@ -104,7 +104,7 @@ static bool replaceWithCallToVeclib(const TargetLibraryInfo &TLI,
 
   // Convert vector arguments to scalar type and check that
   // all vector operands have identical vector width.
-  unsigned VF = 0;
+  ElementCount VF;
   SmallVector<Type *> ScalarTypes;
   for (auto Arg : enumerate(CI.arg_operands())) {
     auto *ArgType = Arg.value()->getType();
@@ -121,17 +121,17 @@ static bool replaceWithCallToVeclib(const TargetLibraryInfo &TLI,
         // the replacement.
         return false;
       }
-      auto NumElements = VectorArgTy->getElementCount();
+      ElementCount NumElements = VectorArgTy->getElementCount();
       if (NumElements.isScalable()) {
         // The current implementation does not support
         // scalable vectors.
         return false;
       }
-      if (VF && VF != NumElements.getFixedValue()) {
+      if (VF && VF != NumElements) {
         // The different arguments differ in vector size.
         return false;
       } else {
-        VF = NumElements.getFixedValue();
+        VF = NumElements;
       }
       ScalarTypes.push_back(VectorArgTy->getElementType());
     }

Hi @spatel, I think that fix looks sensible, although I'd recommend initialising the ElementCount to something in the same way as VF = 0 rather than relying upon the default constructor behaviour. You can write this as:

ElementCount VF = ElementCount::getFixed(0);

and this ensures it's initialised to the value that you expect. Also you might want to explicitly check for non-zero element counts, i.e.:

if (VF.isNonZero() && VF != NumElements) {

Thanks!
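
The width check discussed above can be sketched without LLVM headers. This is a toy model, assuming a simplified stand-in for `ElementCount`: `ToyElementCount` and `commonVF` are hypothetical names, and only vector arguments are modeled. It mirrors the two suggestions in this comment, namely starting from an explicit fixed zero and testing `isNonZero()` before comparing.

```cpp
#include <cassert>
#include <vector>

// Toy stand-in for llvm::ElementCount: a minimum lane count plus a flag
// that marks scalable vectors.
struct ToyElementCount {
  unsigned MinLanes;
  bool Scalable;
  explicit ToyElementCount(unsigned Lanes = 0, bool IsScalable = false)
      : MinLanes(Lanes), Scalable(IsScalable) {}
  bool isNonZero() const { return MinLanes != 0; }
  bool operator==(const ToyElementCount &O) const {
    return MinLanes == O.MinLanes && Scalable == O.Scalable;
  }
  bool operator!=(const ToyElementCount &O) const { return !(*this == O); }
};

// Computes the common vectorization factor of all vector arguments.
// Returns false if an argument is scalable or if the widths disagree.
inline bool commonVF(const std::vector<ToyElementCount> &ArgCounts,
                     ToyElementCount &VF) {
  VF = ToyElementCount(0); // explicit fixed zero as the "not set yet" value
  for (const ToyElementCount &NumElements : ArgCounts) {
    assert(NumElements.isNonZero() && "vector arguments have at least one lane");
    if (NumElements.Scalable)
      return false; // scalable vectors are not handled
    if (VF.isNonZero() && VF != NumElements)
      return false; // the arguments differ in vector size
    VF = NumElements;
  }
  return true;
}
```

Comparing whole element counts rather than raw integers keeps the fixed/scalable distinction in the comparison, which is the point of switching the variable's type in the diff above.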

@spatel, @david-arm: Thanks for the update, I will integrate your proposed changes and update the patch after running some tests locally.

@spatel, @david-arm: I integrated your proposed changes/updates and updated the patch. Local tests run fine, so it should be ready for committing from my side.

This revision was not accepted when it landed; it landed in state Needs Review. Feb 12 2021, 9:53 AM

Closed by commit rG6577cef9b03f: [CodeGen] New pass: Replace vector intrinsics with call to vector library (authored by LukasSommerTu, committed by spatel).

This revision was automatically updated to reflect the committed changes.

spatel added a commit: rG6577cef9b03f: [CodeGen] New pass: Replace vector intrinsics with call to vector library.

georgemitenkov mentioned this in D53927: [AArch64] Enable libm vectorized functions via SLEEF. May 28 2021, 6:30 AM

Revision Contents

Path                                                               Size
llvm/include/llvm/CodeGen/CodeGenPassBuilder.h                     7 lines
llvm/include/llvm/CodeGen/MachinePassRegistry.def                  1 line
llvm/include/llvm/CodeGen/Passes.h                                 4 lines
llvm/include/llvm/CodeGen/ReplaceWithVeclib.h                      38 lines
llvm/include/llvm/InitializePasses.h                               1 line
llvm/lib/CodeGen/CMakeLists.txt                                    1 line
llvm/lib/CodeGen/ReplaceWithVeclib.cpp                             254 lines
llvm/lib/CodeGen/TargetPassConfig.cpp                              3 lines
llvm/test/CodeGen/AArch64/O3-pipeline.ll                           1 line
llvm/test/CodeGen/ARM/O3-pipeline.ll                               1 line
llvm/test/CodeGen/Generic/replace-intrinsics-with-veclib.ll        92 lines
llvm/test/CodeGen/X86/opt-pipeline.ll                              1 line
llvm/tools/llc/llc.cpp                                             1 line
llvm/tools/opt/opt.cpp                                             1 line
llvm/utils/gn/secondary/llvm/lib/CodeGen/BUILD.gn                  1 line

Diff 320489

llvm/include/llvm/CodeGen/CodeGenPassBuilder.h

[23 unchanged lines hidden]
 #include "llvm/Analysis/CFLSteensAliasAnalysis.h"
 #include "llvm/Analysis/ScopedNoAliasAA.h"
 #include "llvm/Analysis/TargetTransformInfo.h"
 #include "llvm/Analysis/TypeBasedAliasAnalysis.h"
 #include "llvm/CodeGen/ExpandReductions.h"
 #include "llvm/CodeGen/MachineModuleInfo.h"
 #include "llvm/CodeGen/MachinePassManager.h"
 #include "llvm/CodeGen/PreISelIntrinsicLowering.h"
+#include "llvm/CodeGen/ReplaceWithVeclib.h"
 #include "llvm/CodeGen/UnreachableBlockElim.h"
 #include "llvm/IR/IRPrintingPasses.h"
 #include "llvm/IR/PassManager.h"
 #include "llvm/IR/Verifier.h"
 #include "llvm/MC/MCAsmInfo.h"
 #include "llvm/MC/MCStreamer.h"
 #include "llvm/MC/MCTargetOptions.h"
 #include "llvm/Support/CodeGen.h"
[606 unchanged lines hidden, in: void CodeGenPassBuilder<Derived>::addIRPasses(AddIRPass &addPass) const {]

   // Make sure that no unreachable blocks are instruction selected.
   addPass(UnreachableBlockElimPass());

   // Prepare expensive constants for SelectionDAG.
   if (getOptLevel() != CodeGenOpt::None && !Opt.DisableConstantHoisting)
     addPass(ConstantHoistingPass());

+  if (getOptLevel() != CodeGenOpt::None) {
+    // Replace calls to LLVM intrinsics (e.g., exp, log) operating on vector
+    // operands with calls to the corresponding functions in a vector library.
+    addPass(ReplaceWithVeclib());
+  }
+
   if (getOptLevel() != CodeGenOpt::None && !Opt.DisablePartialLibcallInlining)
     addPass(PartiallyInlineLibCallsPass());

   // Instrument function entry and exit, e.g. with calls to mcount().
   addPass(EntryExitInstrumenterPass(/*PostInlining=*/true));

   // Add scalarization of target's unsupported masked memory intrinsics pass.
   // the unsupported intrinsic will be replaced with a chain of basic blocks,
[483 unchanged lines hidden]

llvm/include/llvm/CodeGen/MachinePassRegistry.def

[33 unchanged lines hidden]

 #ifndef FUNCTION_PASS
 #define FUNCTION_PASS(NAME, PASS_NAME, CONSTRUCTOR)
 #endif
 FUNCTION_PASS("mergeicmps", MergeICmpsPass, ())
 FUNCTION_PASS("lower-constant-intrinsics", LowerConstantIntrinsicsPass, ())
 FUNCTION_PASS("unreachableblockelim", UnreachableBlockElimPass, ())
 FUNCTION_PASS("consthoist", ConstantHoistingPass, ())
+FUNCTION_PASS("replace-with-veclib", ReplaceWithVeclib, ())
 FUNCTION_PASS("partially-inline-libcalls", PartiallyInlineLibCallsPass, ())
 FUNCTION_PASS("ee-instrument", EntryExitInstrumenterPass, (false))
 FUNCTION_PASS("post-inline-ee-instrument", EntryExitInstrumenterPass, (true))
 FUNCTION_PASS("expand-reductions", ExpandReductionsPass, ())
 FUNCTION_PASS("lowerinvoke", LowerInvokePass, ())
 FUNCTION_PASS("scalarize-masked-mem-intrin", ScalarizeMaskedMemIntrinPass, ())
 FUNCTION_PASS("verify", VerifierPass, ())
 #undef FUNCTION_PASS
[148 unchanged lines hidden]

llvm/include/llvm/CodeGen/Passes.h

[442 unchanged lines hidden, in: /// MachineDominanaceFrontier - This pass is a machine dominators analysis pass.]
   /// This pass performs outlining on machine instructions directly before
   /// printing assembly.
   ModulePass *createMachineOutlinerPass(bool RunOnAllFunctions = true);

   /// This pass expands the experimental reduction intrinsics into sequences of
   /// shuffles.
   FunctionPass *createExpandReductionsPass();

+  // This pass replaces intrinsics operating on vector operands with calls to
+  // the corresponding function in a vector library (e.g., SVML, libmvec).
+  FunctionPass *createReplaceWithVeclibLegacyPass();
+
   // This pass expands memcmp() to load/stores.
   FunctionPass *createExpandMemCmpPass();

   /// Creates Break False Dependencies pass. \see BreakFalseDeps.cpp
   FunctionPass *createBreakFalseDeps();

   // This pass expands indirectbr instructions.
   FunctionPass *createIndirectBrExpandPass();
[39 unchanged lines hidden]

llvm/include/llvm/CodeGen/ReplaceWithVeclib.h

This file was added.

				//===- ReplaceWithVeclib.h - Replace vector instrinsics with veclib calls -===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// Replaces calls to LLVM vector instrinsics (i.e., calls to LLVM intrinsics
				// with vector operands) with matching calls to functions from a vector
				// library (e.g., libmvec, SVML) according to TargetLibraryInfo.
				//
				//===----------------------------------------------------------------------===//
				#ifndef LLVM_TRANSFORMS_UTILS_REPLACEWITHVECLIB_H
				#define LLVM_TRANSFORMS_UTILS_REPLACEWITHVECLIB_H

				#include "llvm/IR/PassManager.h"
				#include "llvm/InitializePasses.h"

				namespace llvm {
				class ReplaceWithVeclib : public PassInfoMixin<ReplaceWithVeclib> {
				public:
				PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM);
				fhahn: nit: I think the coding style recommends using `struct` if all members are public.
				};

				// Legacy pass
				class ReplaceWithVeclibLegacy : public FunctionPass {
				public:
				static char ID;
				ReplaceWithVeclibLegacy() : FunctionPass(ID) {
				initializeReplaceWithVeclibLegacyPass(*PassRegistry::getPassRegistry());
				}
				void getAnalysisUsage(AnalysisUsage &AU) const override;
				bool runOnFunction(Function &F) override;
				};

				} // End namespace llvm
				#endif // LLVM_TRANSFORMS_UTILS_REPLACEWITHVECLIB_H
				No newline at end of file

llvm/include/llvm/InitializePasses.h

	void initializeRegUsageInfoPropagationPass(PassRegistry&);
	void initializeRegionInfoPassPass(PassRegistry&);
	void initializeRegionOnlyPrinterPass(PassRegistry&);
	void initializeRegionOnlyViewerPass(PassRegistry&);
	void initializeRegionPrinterPass(PassRegistry&);
	void initializeRegionViewerPass(PassRegistry&);
	void initializeRegisterCoalescerPass(PassRegistry&);
	void initializeRenameIndependentSubregsPass(PassRegistry&);
				void initializeReplaceWithVeclibLegacyPass(PassRegistry &);
	void initializeResetMachineFunctionPass(PassRegistry&);
	void initializeReversePostOrderFunctionAttrsLegacyPassPass(PassRegistry&);
	void initializeRewriteStatepointsForGCLegacyPassPass(PassRegistry &);
	void initializeRewriteSymbolsLegacyPassPass(PassRegistry&);
	void initializeSCCPLegacyPassPass(PassRegistry&);
	void initializeSCEVAAWrapperPassPass(PassRegistry&);
	void initializeSLPVectorizerPass(PassRegistry&);
	void initializeSROALegacyPassPass(PassRegistry&);

llvm/lib/CodeGen/CMakeLists.txt

add_llvm_component_library(LLVMCodeGen
RenameIndependentSubregs.cpp
MachineStableHash.cpp
MIRVRegNamerUtils.cpp
MIRNamerPass.cpp
MIRCanonicalizerPass.cpp
RegisterUsageInfo.cpp
RegUsageInfoCollector.cpp
RegUsageInfoPropagate.cpp
		ReplaceWithVeclib.cpp
ResetMachineFunctionPass.cpp
SafeStack.cpp
SafeStackLayout.cpp
ScheduleDAG.cpp
ScheduleDAGInstrs.cpp
ScheduleDAGPrinter.cpp
ScoreboardHazardRecognizer.cpp
ShadowStackGCLowering.cpp

llvm/lib/CodeGen/ReplaceWithVeclib.cpp

This file was added.

				//=== ReplaceWithVeclib.cpp - Replace vector instrinsics with veclib calls ===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// Replaces calls to LLVM vector instrinsics (i.e., calls to LLVM intrinsics
				// with vector operands) with matching calls to functions from a vector
				// library (e.g., libmvec, SVML) according to TargetLibraryInfo.
				//
				//===----------------------------------------------------------------------===//

				#include "llvm/CodeGen/ReplaceWithVeclib.h"
				#include "llvm/ADT/STLExtras.h"
				#include "llvm/ADT/Statistic.h"
				#include "llvm/Analysis/DemandedBits.h"
				#include "llvm/Analysis/GlobalsModRef.h"
				#include "llvm/Analysis/OptimizationRemarkEmitter.h"
				#include "llvm/Analysis/TargetLibraryInfo.h"
				#include "llvm/Analysis/VectorUtils.h"
				#include "llvm/CodeGen/Passes.h"
				#include "llvm/IR/IRBuilder.h"
				#include "llvm/IR/InstIterator.h"
				#include "llvm/IR/IntrinsicInst.h"
				#include "llvm/Transforms/Utils/ModuleUtils.h"

				using namespace llvm;

				#define DEBUG_TYPE "replace-with-veclib"

				STATISTIC(NumCallsReplaced,
				"Number of calls to intrinsics that have been replaced.");

				STATISTIC(NumTLIFuncDeclAdded,
				"Number of vector library function declarations added.");

				STATISTIC(NumFuncUsedAdded,
				"Number of functions added to `llvm.compiler.used`");

				static bool replaceWithTLIFunction(CallInst &CI, const StringRef TLIName) {
				Module *M = CI.getModule();

				Function *OldFunc = CI.getCalledFunction();

				// Check if the vector library function is already declared in this module,
				// otherwise insert it.
				Function *TLIFunc = M->getFunction(TLIName);
				if (!TLIFunc) {
				TLIFunc = Function::Create(OldFunc->getFunctionType(),
				Function::ExternalLinkage, TLIName, *M);
				TLIFunc->copyAttributesFrom(OldFunc);

				LLVM_DEBUG(dbgs() << DEBUG_TYPE << ": Added vector library function `"
				<< TLIName << "` of type `" << *(TLIFunc->getType())
				<< "` to module.\n");

				++NumTLIFuncDeclAdded;

				// Add the freshly created function to llvm.compiler.used,
				// similar to as it is done in InjectTLIMappings
				appendToCompilerUsed(*M, {TLIFunc});

				LLVM_DEBUG(dbgs() << DEBUG_TYPE << ": Adding `" << TLIName
				<< "` to `@llvm.compiler.used`.\n");
				++NumFuncUsedAdded;
				}

				// Replace the call to the vector intrinsic with a call
				// to the corresponding function from the vector library.
				IRBuilder<> IRBuilder{&CI};
				SmallVector<Value *> Args(CI.arg_operands());
				fhahn: nit: Why `IRBuilder{&CI}` here but `Args(CI.arg_operands())` below? Can this just be `IRBuilder(&CI)` for consistency?
				// Preserve the operand bundles.
				SmallVector<OperandBundleDef, 1> OpBundles;
				CI.getOperandBundlesAsDefs(OpBundles);
				CallInst *Replacement = IRBuilder.CreateCall(TLIFunc, Args, OpBundles);
				CI.replaceAllUsesWith(Replacement);
				if (isa<FPMathOperator>(Replacement)) {
				spatel: Can we assert here that the FunctionType of `Replacement` is the same as the FunctionType of `CI`?
				// Preserve fast math flags for FP math.
				Replacement->copyFastMathFlags(&CI);
				spatel: Move this above the replaceAllUsesWith() to be more effective (not just for FP calls)?
				}

				LLVM_DEBUG(dbgs() << DEBUG_TYPE << ": Replaced call to `"
				<< OldFunc->getName() << "` with call to `" << TLIName
				<< "`.\n");
				++NumCallsReplaced;
				return true;
				}

				static bool replaceWithCallToVeclib(const TargetLibraryInfo &TLI,
				CallInst &CI) {
				if (!CI.getCalledFunction()) {
				return false;
				}

				auto IntrinsicID = CI.getCalledFunction()->getIntrinsicID();
				if (IntrinsicID == Intrinsic::not_intrinsic) {
				// Replacement is only performed for intrinsic functions
				return false;
				}

				// Convert vector arguments to scalar type and check that
				// all vector operands have identical vector width.
				unsigned VF = 0;
				SmallVector<Type *> ScalarTypes;
				for (auto Arg : enumerate(CI.arg_operands())) {
				auto *ArgType = Arg.value()->getType();
				// Vector calls to intrinsics can still have
				// scalar operands for specific arguments.
				if (hasVectorInstrinsicScalarOpd(IntrinsicID, Arg.index())) {
				ScalarTypes.push_back(ArgType);
				} else {
				// The argument in this place should be a vector if
				// this is a call to a vector intrinsic.
				auto *VectorArgTy = dyn_cast<VectorType>(ArgType);
				if (!VectorArgTy) {
				// The argument is not a vector, do not perform
				// the replacement.
				return false;
				}
				auto NumElements = VectorArgTy->getElementCount();
				if (NumElements.isScalable()) {
				// The current implementation does not support
				// scalable vectors.
				return false;
				}
				if (VF && VF != NumElements.getFixedValue()) {
				// The different arguments differ in vector size.
				return false;
				} else {
				VF = NumElements.getFixedValue();
				}
				ScalarTypes.push_back(VectorArgTy->getElementType());
				}
				}

				// Try to reconstruct the name for the scalar version of this
				// intrinsic using the intrinsic ID and the argument types
				// converted to scalar above.
				std::string ScalarName;
				if (Intrinsic::isOverloaded(IntrinsicID)) {
				ScalarName = Intrinsic::getName(IntrinsicID, ScalarTypes);
				} else {
				ScalarName = Intrinsic::getName(IntrinsicID).str();
				}

				if (!TLI.isFunctionVectorizable(ScalarName)) {
				// The TargetLibraryInfo does not contain a vectorized version of
				// the scalar function.
				return false;
				}

				// Try to find the mapping for the scalar version of this intrinsic
				// and the exact vector width of the call operands in the
				// TargetLibraryInfo.
				const std::string TLIName =
				std::string(TLI.getVectorizedFunction(ScalarName, VF));

				LLVM_DEBUG(dbgs() << DEBUG_TYPE << ": Looking up TLI mapping for `"
				<< ScalarName << "` and vector width " << VF << ".\n");

				if (!TLIName.empty()) {
				// Found the correct mapping in the TargetLibraryInfo,
				// replace the call to the instrinsic with a call to
				spatel: typo: intrinsic
				// the vector library function.
				LLVM_DEBUG(dbgs() << DEBUG_TYPE << ": Found TLI function `" << TLIName
				<< "`.\n");
				return replaceWithTLIFunction(CI, TLIName);
				}

				return false;
				}

				static bool runImpl(const TargetLibraryInfo &TLI, Function &F) {
				bool Changed = false;
				SmallVector<CallInst *> ReplacedCalls;
				for (auto &I : instructions(F)) {
				if (auto *CI = dyn_cast<CallInst>(&I)) {
				if (replaceWithCallToVeclib(TLI, *CI)) {
				ReplacedCalls.push_back(CI);
				Changed = true;
				}
				}
				}
				// Erase the calls to the intrincis that have been replaced
				spatel: typo: intrinsics
				// with calls to the vector library.
				for (auto *CI : ReplacedCalls) {
				CI->eraseFromParent();
				}
				return Changed;
				}

				////////////////////////////////////////////////////////////////////////////////
				// New pass manager implementation.
				////////////////////////////////////////////////////////////////////////////////
				PreservedAnalyses ReplaceWithVeclib::run(Function &F,
				FunctionAnalysisManager &AM) {
				const TargetLibraryInfo &TLI = AM.getResult<TargetLibraryAnalysis>(F);
				auto Changed = runImpl(TLI, F);
				if (Changed) {
				PreservedAnalyses PA;
				PA.preserveSet<CFGAnalyses>();
				PA.preserve<TargetLibraryAnalysis>();
				PA.preserve<ScalarEvolutionAnalysis>();
				PA.preserve<AAManager>();
				PA.preserve<LoopAccessAnalysis>();
				PA.preserve<DemandedBitsAnalysis>();
				PA.preserve<OptimizationRemarkEmitterAnalysis>();
				PA.preserve<GlobalsAA>();
				return PA;
				} else {
				// The pass did not replace any calls, hence it preserves all analyses.
				return PreservedAnalyses::all();
				}
				}

				////////////////////////////////////////////////////////////////////////////////
				// Legacy PM Implementation.
				////////////////////////////////////////////////////////////////////////////////
				bool ReplaceWithVeclibLegacy::runOnFunction(Function &F) {
				const TargetLibraryInfo &TLI =
				getAnalysis<TargetLibraryInfoWrapperPass>().getTLI(F);
				return runImpl(TLI, F);
				}

				void ReplaceWithVeclibLegacy::getAnalysisUsage(AnalysisUsage &AU) const {
				AU.setPreservesCFG();
				AU.addRequired<TargetLibraryInfoWrapperPass>();
				AU.addPreserved<TargetLibraryInfoWrapperPass>();
				AU.addPreserved<ScalarEvolutionWrapperPass>();
				AU.addPreserved<AAResultsWrapperPass>();
				AU.addPreserved<LoopAccessLegacyAnalysis>();
				AU.addPreserved<DemandedBitsWrapperPass>();
				AU.addPreserved<OptimizationRemarkEmitterWrapperPass>();
				AU.addPreserved<GlobalsAAWrapperPass>();
				}

				////////////////////////////////////////////////////////////////////////////////
				// Legacy Pass manager initialization
				////////////////////////////////////////////////////////////////////////////////
				char ReplaceWithVeclibLegacy::ID = 0;

				INITIALIZE_PASS_BEGIN(ReplaceWithVeclibLegacy, DEBUG_TYPE,
				"Replace intrinsics with calls to vector library", false,
				false)
				INITIALIZE_PASS_DEPENDENCY(TargetLibraryInfoWrapperPass)
				INITIALIZE_PASS_END(ReplaceWithVeclibLegacy, DEBUG_TYPE,
				"Replace intrinsics with calls to vector library", false,
				false)

				FunctionPass *llvm::createReplaceWithVeclibLegacyPass() {
				return new ReplaceWithVeclibLegacy();
				}
				No newline at end of file
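
The replace-then-erase structure of `replaceWithTLIFunction` and `runImpl` above can be illustrated without LLVM. The following is only a sketch under stated assumptions: `InstList` and `replaceAndErase` are hypothetical names, and a `std::list` of strings stands in for a function's instruction list. It shows the new call being inserted in front of the old one (as an `IRBuilder` positioned at `&CI` would do) and the old entries being erased only after the walk has finished, so the iterator used for the walk is never invalidated.

```cpp
#include <list>
#include <string>
#include <vector>

// Hypothetical stand-in for the instruction list of a function.
using InstList = std::list<std::string>;

// Replaces every entry equal to `From` with `To`. Returns true if anything
// changed, mirroring the `Changed` flag of runImpl in the diff.
bool replaceAndErase(InstList &Insts, const std::string &From,
                     const std::string &To) {
  std::vector<InstList::iterator> Replaced;
  for (auto It = Insts.begin(); It != Insts.end(); ++It) {
    if (*It == From) {
      // The replacement goes directly in front of the old entry.
      Insts.insert(It, To);
      // Deletion is deferred: erasing here would invalidate `It`.
      Replaced.push_back(It);
    }
  }
  // Second phase: erase the old entries now that iteration is over.
  for (InstList::iterator Old : Replaced)
    Insts.erase(Old);
  return !Replaced.empty();
}
```

Calling `replaceAndErase` on a list holding `load`, `call llvm.exp.v4f64`, `ret` with the target `call __svml_exp4` leaves the list order intact with only the call swapped. A second invocation finds nothing to replace and returns false.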

llvm/lib/CodeGen/TargetPassConfig.cpp

void TargetPassConfig::addIRPasses() {

// Make sure that no unreachable blocks are instruction selected.
addPass(createUnreachableBlockEliminationPass());

// Prepare expensive constants for SelectionDAG.
if (getOptLevel() != CodeGenOpt::None && !DisableConstantHoisting)
addPass(createConstantHoistingPass());

		if (getOptLevel() != CodeGenOpt::None)
		addPass(createReplaceWithVeclibLegacyPass());

if (getOptLevel() != CodeGenOpt::None && !DisablePartialLibcallInlining)
addPass(createPartiallyInlineLibCallsPass());

// Instrument function entry and exit, e.g. with calls to mcount().
addPass(createPostInlineEntryExitInstrumenterPass());

// Add scalarization of target's unsupported masked memory intrinsics pass.
// the unsupported intrinsic will be replaced with a chain of basic blocks,

llvm/test/CodeGen/AArch64/O3-pipeline.ll

	; CHECK-NEXT: Lower constant intrinsics
	; CHECK-NEXT: Remove unreachable blocks from the CFG
	; CHECK-NEXT: Dominator Tree Construction
	; CHECK-NEXT: Natural Loop Information
	; CHECK-NEXT: Post-Dominator Tree Construction
	; CHECK-NEXT: Branch Probability Analysis
	; CHECK-NEXT: Block Frequency Analysis
	; CHECK-NEXT: Constant Hoisting
				; CHECK-NEXT: Replace intrinsics with calls to vector library
	; CHECK-NEXT: Partially inline calls to library functions
	; CHECK-NEXT: Instrument function entry/exit with calls to e.g. mcount() (post inlining)
	; CHECK-NEXT: Scalarize Masked Memory Intrinsics
	; CHECK-NEXT: Expand reduction intrinsics
	; CHECK-NEXT: Stack Safety Analysis
	; CHECK-NEXT: FunctionPass Manager
	; CHECK-NEXT: Dominator Tree Construction
	; CHECK-NEXT: Natural Loop Information

llvm/test/CodeGen/ARM/O3-pipeline.ll

	; CHECK-NEXT: Lower constant intrinsics
	; CHECK-NEXT: Remove unreachable blocks from the CFG
	; CHECK-NEXT: Dominator Tree Construction
	; CHECK-NEXT: Natural Loop Information
	; CHECK-NEXT: Post-Dominator Tree Construction
	; CHECK-NEXT: Branch Probability Analysis
	; CHECK-NEXT: Block Frequency Analysis
	; CHECK-NEXT: Constant Hoisting
				; CHECK-NEXT: Replace intrinsics with calls to vector library
	; CHECK-NEXT: Partially inline calls to library functions
	; CHECK-NEXT: Instrument function entry/exit with calls to e.g. mcount() (post inlining)
	; CHECK-NEXT: Scalarize Masked Memory Intrinsics
	; CHECK-NEXT: Expand reduction intrinsics
	; CHECK-NEXT: Dominator Tree Construction
	; CHECK-NEXT: Natural Loop Information
	; CHECK-NEXT: Scalar Evolution Analysis
	; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)

llvm/test/CodeGen/Generic/replace-intrinsics-with-veclib.ll

This file was added.

				; RUN: opt -vector-library=SVML -replace-with-veclib -S < %s | FileCheck %s --check-prefixes=COMMON,SVML,F64
				; RUN: opt -vector-library=LIBMVEC-X86 -replace-with-veclib -S < %s | FileCheck %s --check-prefixes=COMMON,LIBMVEC-X86,F64
				; RUN: opt -vector-library=MASSV -replace-with-veclib -S < %s | FileCheck %s --check-prefixes=COMMON,MASSV,F32
				; RUN: opt -vector-library=Accelerate -replace-with-veclib -S < %s | FileCheck %s --check-prefixes=COMMON,ACCELERATE,F32

				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
				target triple = "x86_64-unknown-linux-gnu"

				; COMMON-LABEL: @llvm.compiler.used = appending global
				; F64-SAME: [2 x i8*] [
				; SVML-SAME: i8* bitcast (<4 x double> (<4 x double>)* @__svml_exp4 to i8*),
				; SVML-SAME: i8* bitcast (<4 x float> (<4 x float>)* @__svml_expf4 to i8*
				; LIBMVEC-X86-SAME: i8* bitcast (<4 x double> (<4 x double>)* @_ZGVdN4v_exp to i8*),
				; LIBMVEC-X86-sAME: i8* bitcast (<4 x float> (<4 x float>)* @_ZGVbN4v_expf to i8*)
				spatel: FileCheck typo: "sAME". Can you use utils/update_test_checks.py to auto-generate the assertions?
				; F32-SAME: [1 x i8*] [
				; MASSV-SAME: i8* bitcast (<4 x float> (<4 x float>)* @__expf4_massv to i8*)
				; ACCELERATE-SAME: i8* bitcast (<4 x float> (<4 x float>)* @vexpf to i8*)
				; COMMON-SAME: ], section "llvm.metadata"

				define <4 x double> @exp_v4(<4 x double> %in) {
				; COMMON-LABEL: @exp_v4(
				; COMMON-SAME: <4 x double> %[[IN:[a-zA-Z0-9_]+]]
				; LIBMVEC-X86: %[[CALL:[a-zA-Z0-9_]+]] = call <4 x double> @_ZGVdN4v_exp(<4 x double> %[[IN]])
				; SVML: %[[CALL:[a-zA-Z0-9_]+]] = call <4 x double> @__svml_exp4(<4 x double> %[[IN]])
				; F32: %[[CALL:[a-zA-Z0-9_]+]] = call <4 x double> @llvm.exp.v4f64(<4 x double> %[[IN]])
				; F64-NOT: call @llvm.exp.v4f64
				; COMMON: ret <4 x double> %[[CALL]]
				%call = call <4 x double> @llvm.exp.v4f64(<4 x double> %in)
				ret <4 x double> %call
				}

				declare <4 x double> @llvm.exp.v4f64(<4 x double>) #0

				define <4 x float> @exp_f32(<4 x float> %in) {
				; COMMON-LABEL: @exp_f32(
				; COMMON-SAME: <4 x float> %[[IN1:[a-zA-Z0-9_]+]]
				; LIBMVEC-X86: %[[#CALL1:]] = call <4 x float> @_ZGVbN4v_expf(<4 x float> %[[IN1]])
				; SVML: %[[#CALL1:]] = call <4 x float> @__svml_expf4(<4 x float> %[[IN1]])
				; MASSV: %[[#CALL1:]] = call <4 x float> @__expf4_massv(<4 x float> %[[IN1]])
				; ACCELERATE: %[[#CALL1:]] = call <4 x float> @vexpf(<4 x float> %[[IN1]])
				; COMMON-NOT: call @llvm.exp.v4f32
				; COMMON: ret <4 x float> %[[#CALL1]]
				%call = call <4 x float> @llvm.exp.v4f32(<4 x float> %in)
				ret <4 x float> %call
				}

				declare <4 x float> @llvm.exp.v4f32(<4 x float>) #0

				define double @exp_f64(double %in) {
				; No replacement should take place for non-vector intrinsic
				; COMMON-LABEL: @exp_f64(
				; COMMON: %[[CALL2:[a-zA-Z0-9_]+]] = call double @llvm.exp.f64
				; COMMON: ret double %[[CALL2]]
				%call = call double @llvm.exp.f64(double %in)
				ret double %call
				}

				declare double @llvm.exp.f64(double) #0

				define <4 x double> @powi_v4(<4 x double> %in){
				; Check that the pass works with scalar operands on
				; vector intrinsics. No vector library has a substitute for powi
				; COMMON-LABEL: @powi_v4(
				; COMMON: %[[CALL3:[a-zA-Z0-9_]+]] = call <4 x double> @llvm.powi.v4f64
				; COMMON: ret <4 x double> %[[CALL3]]
				spatel: The test comments that were here in the previous revision were helpful. I'd put those back (but put them outside of the function body to avoid interfering with the script's CHECK lines).
				%call = call <4 x double> @llvm.powi.v4f64(<4 x double> %in, i32 3)
				ret <4 x double> %call
				}

				declare <4 x double> @llvm.powi.v4f64(<4 x double>, i32) #0

				define <3 x double> @exp_v3(<3 x double> %in) {
				; Replacement should not take place if the vector length
				; does not match exactly.
				; COMMON-LABEL: @exp_v3(
				; COMMON: %[[CALL4:[a-zA-Z0-9_]+]] = call <3 x double> @llvm.exp.v3f64
				; COMMON: ret <3 x double> %[[CALL4]]
				%call = call <3 x double> @llvm.exp.v3f64(<3 x double> %in)
				ret <3 x double> %call
				}

				declare <3 x double> @llvm.exp.v3f64(<3 x double>) #0

				; LIBMVEC-X86: declare <4 x double> @_ZGVdN4v_exp(<4 x double>) #0
				; LIBMVEC-X86: declare <4 x float> @_ZGVbN4v_expf(<4 x float>) #0
				; SVML: declare <4 x double> @__svml_exp4(<4 x double>) #0
				; SVML: declare <4 x float> @__svml_expf4(<4 x float>) #0
				; MASSV: declare <4 x float> @__expf4_massv(<4 x float>) #0
				; ACCELERATE: declare <4 x float> @vexpf(<4 x float>) #0

				attributes #0 = {nounwind readnone}
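
The decisions this test exercises (replace `<4 x double>` exp, leave the scalar call, the `powi` call and the `<3 x double>` call alone) can be modelled outside LLVM. What follows is a minimal sketch, not the pass itself: `ArgModel`, `VecLibTable`, `commonVectorWidth`, `replacementFor` and `sampleSvmlTable` are hypothetical names, and a `std::map` stands in for the TargetLibraryInfo lookup. Only the two SVML function names are taken from the CHECK lines above.

```cpp
#include <map>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of one call argument.
struct ArgModel {
  unsigned NumElements;   // 0 models a scalar argument
  bool ScalarOperandSlot; // true where the intrinsic expects a scalar
};

// (scalar intrinsic name, vector width) -> vector library function name.
using VecLibTable = std::map<std::pair<std::string, unsigned>, std::string>;

// Sample entries; the function names come from the SVML CHECK lines.
VecLibTable sampleSvmlTable() {
  return {{{"llvm.exp.f64", 4}, "__svml_exp4"},
          {{"llvm.exp.f32", 4}, "__svml_expf4"}};
}

// Returns the width shared by all vector arguments, or std::nullopt if a
// vector slot holds a scalar or two vector arguments differ in width.
std::optional<unsigned> commonVectorWidth(const std::vector<ArgModel> &Args) {
  unsigned VF = 0;
  for (const ArgModel &A : Args) {
    if (A.ScalarOperandSlot)
      continue; // scalar operands do not take part in the width check
    if (A.NumElements == 0)
      return std::nullopt; // not a vector call
    if (VF && VF != A.NumElements)
      return std::nullopt; // mismatched vector widths
    VF = A.NumElements;
  }
  return VF;
}

// Returns the replacement name, or an empty string if the call stays as is.
std::string replacementFor(const VecLibTable &Table,
                           const std::string &ScalarName,
                           const std::vector<ArgModel> &Args) {
  std::optional<unsigned> VF = commonVectorWidth(Args);
  if (!VF)
    return {};
  auto It = Table.find({ScalarName, *VF});
  return It == Table.end() ? std::string() : It->second;
}
```

With the sample table, a width-4 `llvm.exp.f64` call maps to `__svml_exp4`, while a width-3 call, a scalar call, or a name with no table entry yields an empty string. That matches the exact-width requirement stated in the `exp_v3` test comment.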

llvm/test/CodeGen/X86/opt-pipeline.ll

	; CHECK-NEXT: Lower constant intrinsics
	; CHECK-NEXT: Remove unreachable blocks from the CFG
	; CHECK-NEXT: Dominator Tree Construction
	; CHECK-NEXT: Natural Loop Information
	; CHECK-NEXT: Post-Dominator Tree Construction
	; CHECK-NEXT: Branch Probability Analysis
	; CHECK-NEXT: Block Frequency Analysis
	; CHECK-NEXT: Constant Hoisting
				; CHECK-NEXT: Replace intrinsics with calls to vector library
	; CHECK-NEXT: Partially inline calls to library functions
	; CHECK-NEXT: Instrument function entry/exit with calls to e.g. mcount() (post inlining)
	; CHECK-NEXT: Scalarize Masked Memory Intrinsics
	; CHECK-NEXT: Expand reduction intrinsics
	; CHECK-NEXT: Dominator Tree Construction
	; CHECK-NEXT: Interleaved Access Pass
	; CHECK-NEXT: X86 Partial Reduction
	; CHECK-NEXT: Expand indirectbr instructions

llvm/tools/llc/llc.cpp

int main(int argc, char **argv) {
initializeUnreachableBlockElimLegacyPassPass(*Registry);
initializeConstantHoistingLegacyPassPass(*Registry);
initializeScalarOpts(*Registry);
initializeVectorization(*Registry);
initializeScalarizeMaskedMemIntrinLegacyPassPass(*Registry);
initializeExpandReductionsPass(*Registry);
initializeHardwareLoopsPass(*Registry);
initializeTransformUtils(*Registry);
		initializeReplaceWithVeclibLegacyPass(*Registry);

// Initialize debugging passes.
initializeScavengerTestPass(*Registry);

// Register the target printer for --version.
cl::AddExtraVersionPrinter(TargetRegistry::printRegisteredTargetsForVersion);

cl::ParseCommandLineOptions(argc, argv, "llvm system compiler\n");

llvm/tools/opt/opt.cpp

int main(int argc, char **argv) {
initializeEntryExitInstrumenterPass(Registry);
initializePostInlineEntryExitInstrumenterPass(Registry);
initializeUnreachableBlockElimLegacyPassPass(Registry);
initializeExpandReductionsPass(Registry);
initializeWasmEHPreparePass(Registry);
initializeWriteBitcodePassPass(Registry);
initializeHardwareLoopsPass(Registry);
initializeTypePromotionPass(Registry);
		initializeReplaceWithVeclibLegacyPass(Registry);

#ifdef BUILD_EXAMPLES
initializeExampleIRTransforms(Registry);
#endif

cl::ParseCommandLineOptions(argc, argv,
"llvm .bc -> .bc modular optimizer and analysis printer\n");

llvm/utils/gn/secondary/llvm/lib/CodeGen/BUILD.gn

sources = [
"RegUsageInfoCollector.cpp",
"RegUsageInfoPropagate.cpp",
"RegisterClassInfo.cpp",
"RegisterCoalescer.cpp",
"RegisterPressure.cpp",
"RegisterScavenging.cpp",
"RegisterUsageInfo.cpp",
"RenameIndependentSubregs.cpp",
		"ReplaceWithVeclib.cpp",
"ResetMachineFunctionPass.cpp",
"SafeStack.cpp",
"SafeStackLayout.cpp",
"ScheduleDAG.cpp",
"ScheduleDAGInstrs.cpp",
"ScheduleDAGPrinter.cpp",
"ScoreboardHazardRecognizer.cpp",
"ShadowStackGCLowering.cpp",

Revision Contents

Diff 320489
