This is an archive of the discontinued LLVM Phabricator instance.

[RFC] Vector math function loop idiom recognition
Needs Review · Public

Authored by qianzhen on May 8 2023, 8:07 AM.

Details

Reviewers

bmahjour
Meinersbur
craig.topper
efriedma
eopXD
syzaara
alexgatea
masoud.ataei

Summary

This patch extends the LoopIdiomRecognize pass to recognize a scalar math function call in a loop and transform the loop into a single vector math function call.

For example, the pass transforms

for (int i = 0; i < 100; i++)
  y[i] = exp(x[i]); // scalar math

into

vexp(y, x, 100);    // vector math

A vector math function computes the same mathematical function for a vector of operands stored in contiguous memory. By design, its average computation time per element is lower than that of the equivalent scalar math function once the number of elements exceeds a threshold value, resulting in better overall performance. Several math libraries provide vector math functions, including the IBM MASS library and the Intel Math Kernel Library. As an example, the vector math functions from the IBM MASS library are approximately 30 times faster per element (geometric mean) than their scalar equivalents on IBM Power10, measured by computing 1000 elements with valid but random input values.
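To make the intended semantics concrete, here is a minimal sketch of the equivalence the transformation relies on. The name vexp_reference is hypothetical and is not the MASS entry point; it only pins down what "the same function over a contiguous array" means. A tuned library routine may round differently from the scalar libm call, which this reference version does not model.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a library routine such as vexp: it applies exp
// to n contiguous doubles. A real library would use a tuned implementation.
void vexp_reference(double *y, const double *x, std::size_t n) {
  assert((y != nullptr && x != nullptr) || n == 0);
  for (std::size_t i = 0; i < n; ++i)
    y[i] = std::exp(x[i]);
}

// The scalar idiom that the pass looks for, written as a plain loop.
std::vector<double> scalar_exp_loop(const std::vector<double> &x) {
  std::vector<double> y(x.size());
  for (std::size_t i = 0; i < x.size(); ++i)
    y[i] = std::exp(x[i]);
  return y;
}

// The same computation expressed as one call on the whole array.
std::vector<double> array_exp_call(const std::vector<double> &x) {
  std::vector<double> y(x.size());
  vexp_reference(y.data(), x.data(), x.size());
  return y;
}
```

Both helpers return the same vector for any input, which is the property the rewrite from loop to call has to preserve.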

The threshold value depends on the specific math function and library. The values may be generated from a heuristic and then provided to the pass in a table from TargetLibraryInfo.
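One way such a table could be consulted is sketched below. Everything here is an assumption for illustration: the type ThresholdTable, the helper names, and the threshold numbers are made up, and the patch itself does not define this interface. An unknown trip count is modelled as std::nullopt and treated as not profitable.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>

// Hypothetical per-function break-even element counts. The numbers are
// invented; real values would come from a heuristic per library and target.
using ThresholdTable = std::map<std::string, std::uint64_t>;

ThresholdTable makeExampleThresholds() {
  return {{"exp", 8}, {"sin", 16}, {"pow", 32}};
}

// Returns true only when the function has a table entry and the trip count
// is known and strictly greater than the threshold for that function.
bool isProfitable(const ThresholdTable &table, const std::string &scalarFn,
                  std::optional<std::uint64_t> tripCount) {
  assert(!scalarFn.empty());
  auto it = table.find(scalarFn);
  if (it == table.end() || !tripCount)
    return false;
  return *tripCount > it->second;
}
```

With these example numbers, isProfitable(table, "exp", 100) is true, while a trip count of exactly 8 or an unknown trip count is rejected.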

The motivation for this patch is to achieve this performance benefit for various math libraries on different targets. Hence, the design transforms the idioms into a new set of LLVM intrinsics for vector math functions, which are designed to be general across math libraries and targets. A new pass in the back end then lowers the intrinsics to the actual vector math functions for its target. To demonstrate, this patch adds a new pass to the PowerPC target that lowers the intrinsics to the MASS library vector math functions.

In a more complex loop, preparation passes may be required before this patch can recognize the idiom. For example, when there are data dependencies on the input or output of the scalar math function in the loop, loop distribution may be required to split the dependencies into separate loops.
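As an illustration of why distribution helps, consider a hypothetical loop in which the input of exp comes from a loop-carried recurrence. The sketch below writes the fused and the distributed forms by hand; the function names are made up, and the actual pass ordering is not shown in this patch excerpt.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Fused form: the recurrence on a[] and the exp call share one loop body,
// so the loop as a whole is not the simple y[i] = exp(a[i]) idiom.
std::vector<double> fusedLoop(std::vector<double> a,
                              const std::vector<double> &x) {
  assert(a.size() == x.size());
  std::vector<double> y(a.size());
  for (std::size_t i = 1; i < a.size(); ++i) {
    a[i] = a[i - 1] + x[i];
    y[i] = std::exp(a[i]);
  }
  return y;
}

// Distributed form: the recurrence runs first, and the second loop is a
// pure element-wise exp over a[], which matches the idiom.
std::vector<double> distributedLoops(std::vector<double> a,
                                     const std::vector<double> &x) {
  assert(a.size() == x.size());
  std::vector<double> y(a.size());
  for (std::size_t i = 1; i < a.size(); ++i)
    a[i] = a[i - 1] + x[i];
  for (std::size_t i = 1; i < a.size(); ++i)
    y[i] = std::exp(a[i]);
  return y;
}
```

Only the second loop of the distributed form is a candidate for replacement by an array call; the recurrence loop stays scalar.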

As a demonstration, this patch currently accepts the profitability threshold from a command line option and evaluates it only for loops whose trip count is known at compile time. For loops whose trip count is unknown at compile time, one solution is to version the loop and insert a run-time check that selects between the scalar loop and the vector math call.
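The versioning idea can be sketched at source level as follows. This is only an assumption about what the generated code would look like; the names Path, vexpStub and expVersioned are hypothetical, and the stub stands in for the library routine.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

enum class Path { ScalarLoop, VectorCall };

// Hypothetical stand-in for the library's array routine.
void vexpStub(double *y, const double *x, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    y[i] = std::exp(x[i]);
}

// Versioned loop: a single run-time comparison of the trip count against
// the threshold picks either the array call or the original scalar loop.
// The return value reports which version ran, purely for illustration.
Path expVersioned(double *y, const double *x, std::size_t n,
                  std::size_t threshold) {
  assert((y != nullptr && x != nullptr) || n == 0);
  if (n > threshold) {
    vexpStub(y, x, n);
    return Path::VectorCall;
  }
  for (std::size_t i = 0; i < n; ++i)
    y[i] = std::exp(x[i]);
  return Path::ScalarLoop;
}
```

The cost of this scheme is one extra compare and branch per loop entry plus the code size of keeping both versions.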

Regarding how to add the new vector math functions, another idea is to extend the existing VecFuncs.def to include them, instead of adding the new VectorMathFuncs.def shown in this patch. With that, the VecDesc structure in TLI may be expanded and changed as below.

struct VecDesc {
  StringRef ScalarFnName; // scalar math function (e.g. exp)
  StringRef SIMDFnName; // rename to SIMD math function (e.g. expd2)
  ElementCount VectorizationFactor; // vectorization factor for the SIMD math function
  StringRef VectorFnName; // new vector math function (e.g. vexp)
  Intrinsic::ID IntrinID; // new LLVM intrinsic for vector math function
};

The new name "SIMD" replaces the old name "vector" for the math functions that take a vector data type as the parameter and return type, such as expd2. The name "vector" then denotes the new kind of math function introduced in this patch, such as vexp. The names and definitions of query functions such as isFunctionVectorizable would also need to change accordingly. This approach may help to better distinguish these two types of math functions in LLVM.
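For discussion, a simplified version of the expanded descriptor and a lookup by scalar name might look like the sketch below. It is not the proposed LLVM code: std::string and unsigned stand in for StringRef, ElementCount and Intrinsic::ID so that it compiles without LLVM headers, the log entries and the numeric IDs are invented placeholders, and the helper names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Simplified stand-in for the proposed descriptor.
struct VecDescSketch {
  std::string ScalarFnName;     // scalar math function (e.g. exp)
  std::string SIMDFnName;       // SIMD math function (e.g. expd2)
  unsigned VectorizationFactor; // lanes handled by the SIMD function
  std::string VectorFnName;     // array-based math function (e.g. vexp)
  unsigned IntrinID;            // placeholder for an intrinsic ID
};

static bool byScalarName(const VecDescSketch &A, const VecDescSketch &B) {
  return A.ScalarFnName < B.ScalarFnName;
}

// Builds a small table and sorts it by scalar name so it can be searched
// with a binary search.
std::vector<VecDescSketch> makeSortedTable() {
  std::vector<VecDescSketch> T = {
      {"log", "logd2", 2, "vlog", 2},
      {"exp", "expd2", 2, "vexp", 1},
  };
  std::sort(T.begin(), T.end(), byScalarName);
  return T;
}

// Returns the descriptor for Name, or nullptr when there is no mapping.
const VecDescSketch *findByScalarName(const std::vector<VecDescSketch> &T,
                                      const std::string &Name) {
  assert(std::is_sorted(T.begin(), T.end(), byScalarName));
  auto It = std::lower_bound(
      T.begin(), T.end(), Name,
      [](const VecDescSketch &D, const std::string &N) {
        return D.ScalarFnName < N;
      });
  if (It == T.end() || It->ScalarFnName != Name)
    return nullptr;
  return &*It;
}
```

A single entry then answers both questions for a scalar name: which SIMD function to use at a given vectorization factor, and which array-based function to use for a whole loop.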

Any comments would be appreciated.

RFC at https://discourse.llvm.org/t/rfc-vector-math-function-loop-idiom-recognition/70465

Diff Detail

Repository: rG LLVM Github Monorepo

Unit Tests: Failed

Time	Test
50 ms	x64 debian > LLVM.CodeGen/PowerPC::O0-pipeline.ll
70 ms	x64 debian > LLVM.CodeGen/PowerPC::O3-pipeline.ll

Event Timeline

qianzhen created this revision. May 8 2023, 8:07 AM

Herald added a project: Restricted Project. May 8 2023, 8:07 AM

Herald added subscribers: steven.zhang, mgrang, kbarton and 2 others.

qianzhen requested review of this revision. May 8 2023, 8:07 AM

Herald added a project: Restricted Project. May 8 2023, 8:07 AM

Herald added subscribers: llvm-commits, pcwang-thead, jdoerfert.

qianzhen edited the summary of this revision. May 8 2023, 8:11 AM

Harbormaster completed remote builds in B230649: Diff 520381. May 8 2023, 9:50 AM

syzaara added a subscriber: syzaara. May 9 2023, 10:27 AM

qianzhen added a project: Restricted Project. May 9 2023, 10:43 AM

qianzhen added reviewers: bmahjour, Meinersbur, craig.topper, efriedma, eopXD, syzaara, alexgatea. May 9 2023, 11:54 AM

efriedma edited the summary of this revision. May 9 2023, 11:58 AM

bmahjour added a reviewer: masoud.ataei. May 15 2023, 6:32 AM

Revision Contents

Path	Size
llvm/include/llvm/Analysis/TargetLibraryInfo.h	58 lines
llvm/include/llvm/Analysis/VectorMathFuncs.def	135 lines
llvm/include/llvm/IR/Intrinsics.td	415 lines
llvm/include/llvm/Target/TargetOptions.h	8 lines
llvm/include/llvm/Transforms/Scalar/LoopIdiomRecognize.h	3 lines
llvm/lib/Analysis/TargetLibraryInfo.cpp	80 lines
llvm/lib/Target/PowerPC/CMakeLists.txt	1 line
llvm/lib/Target/PowerPC/PPC.h	4 lines
llvm/lib/Target/PowerPC/PPCGenVectorMASSEntries.cpp	163 lines
llvm/lib/Target/PowerPC/PPCTargetMachine.cpp	5 lines
llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp	361 lines
llvm/test/CodeGen/PowerPC/lower-intrinsics-vector-mass.ll	15 lines
llvm/test/Transforms/LoopIdiom/math.ll	44 lines

Diff 520381

llvm/include/llvm/Analysis/TargetLibraryInfo.h

(9 lines not shown)
#define LLVM_ANALYSIS_TARGETLIBRARYINFO_H

#include "llvm/ADT/BitVector.h"
#include "llvm/ADT/DenseMap.h"
#include "llvm/IR/InstrTypes.h"
#include "llvm/IR/PassManager.h"
#include "llvm/Pass.h"
#include "llvm/TargetParser/Triple.h"
		#include <map>
#include <optional>

namespace llvm {

template <typename T> class ArrayRef;
class Function;
class Module;
class Triple;

/// Describes a possible vectorization of a function.
/// Function 'VectorFnName' is equivalent to 'ScalarFnName' vectorized
/// by a factor 'VectorizationFactor'.
struct VecDesc {
StringRef ScalarFnName;
StringRef VectorFnName;
ElementCount VectorizationFactor;
bool Masked;
};

		/// Describes a vector math function VectorFnName of an equivalent scalar math
		/// function ScalarFnName, and its corresponding LLVM intrinsic ID.
		struct VectorMathDesc {
		StringRef ScalarFnName;
		StringRef VectorFnName;
		Intrinsic::ID ID;
		};

enum LibFunc : unsigned {
#define TLI_DEFINE_ENUM
#include "llvm/Analysis/TargetLibraryInfo.def"

NumLibFuncs,
NotLibFunc
};

(26 lines not shown; context: class TargetLibraryInfoImpl {)
}

/// Vectorization descriptors - sorted by ScalarFnName.
std::vector<VecDesc> VectorDescs;
/// Scalarization descriptors - same content as VectorDescs but sorted based
/// on VectorFnName rather than ScalarFnName.
std::vector<VecDesc> ScalarDescs;

		/// A set of scalar -> vector math function mappings.
		/// The vector math functions compute the same mathematical operation
		/// for an array of operands. They are different from the vector functions in
		/// VectorDescs/ScalarDescs, which compute for each element of the SIMD
		/// vector operands.
		/// Vector math descriptors - sorted by ScalarFnName.
		std::vector<VectorMathDesc> VectorMathFuncDescs;
		/// Vector math descriptors - sorted by intrinsic ID.
		std::vector<VectorMathDesc> VectorMathIntrinDescs;

/// Return true if the function type FTy is valid for the library function
/// F, regardless of whether the function is available.
bool isValidProtoForLibFunc(const FunctionType &FTy, LibFunc F,
const Module &M) const;

public:
/// List of known vector-functions libraries.
///
(83 lines not shown; context: public:)
/// vectorization factor.
bool isFunctionVectorizable(StringRef F) const;

/// Return the name of the equivalent of F, vectorized with factor VF. If no
/// such mapping exists, return the empty string.
StringRef getVectorizedFunction(StringRef F, const ElementCount &VF,
bool Masked) const;

		/// Add a set of scalar -> vector math function mappings for the given
		/// vector library, queryable via getVectorMathIntrinsic.
		void addVectorMathFunctions(ArrayRef<VectorMathDesc> Fns);

		/// Calls addVectorMathFunctions with a known preset of functions for the
		/// given vector library.
		void addVectorMathFunctionsFromVecLib(enum VectorLibrary VecLib);

		/// Return true if the scalar function F has an equivalent vector math
		/// function.
		bool isVectorMathFunctionAvailable(StringRef F) const;

		/// Return true if the intrinsic ID has an equivalent vector math function.
		bool isVectorMathFunctionAvailable(Intrinsic::ID ID) const;

		/// Return the corresponding vector math intrinsic of the scalar function F.
		/// If no such mapping exists, return not_intrinsic.
		Intrinsic::ID getVectorMathIntrinsic(StringRef F) const;

		/// Return the corresponding vector math function of the intrinsic.
		/// If no such mapping exists, return empty string.
		StringRef getVectorMathFunction(Intrinsic::ID ID) const;

		/// Return true if the VecLib has vector math functions.
		bool hasVectorMathFunctions() const { return !VectorMathFuncDescs.empty(); }

/// Set to true iff i32 parameters to library functions should have signext
/// or zeroext attributes if they correspond to C-level int or unsigned int,
/// respectively.
void setShouldExtI32Param(bool Val) {
ShouldExtI32Param = Val;
}

/// Set to true iff i32 results from library functions should have signext
(162 lines not shown; context: public:)
}
bool isFunctionVectorizable(StringRef F) const {
return Impl->isFunctionVectorizable(F);
}
StringRef getVectorizedFunction(StringRef F, const ElementCount &VF,
bool Masked = false) const {
return Impl->getVectorizedFunction(F, VF, Masked);
}
		bool isVectorMathFunctionAvailable(StringRef F) const {
		return Impl->isVectorMathFunctionAvailable(F);
		}
		bool isVectorMathFunctionAvailable(Intrinsic::ID ID) const {
		return Impl->isVectorMathFunctionAvailable(ID);
		}
		Intrinsic::ID getVectorMathIntrinsic(StringRef F) const {
		return Impl->getVectorMathIntrinsic(F);
		}
		StringRef getVectorMathFunction(Intrinsic::ID ID) const {
		return Impl->getVectorMathFunction(ID);
		}
		bool hasVectorMathFunctions() const { return Impl->hasVectorMathFunctions(); }

/// Tests if the function is both available and a candidate for optimized code
/// generation.
bool hasOptimizedCodeGen(LibFunc F) const {
if (getState(F) == TargetLibraryInfoImpl::Unavailable)
return false;
switch (F) {
default: break;
(227 lines not shown)

llvm/include/llvm/Analysis/VectorMathFuncs.def

This file was added.

				//===-- VectorMathFuncs.def - Library information ----------- C++ -------===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				// This .def file creates mapping from standard IEEE math functions to
				// their corresponding LLVM vector math intrinsics.
				// LLVM vector math intrinsics will be converted to the actual vector
				// math functions supported in the specified framework or library.

				#define TLI_DEFINE_VECTOR_FUNC(SCAL, VEC, ID) {SCAL, VEC, ID},

				#if defined(TLI_DEFINE_MASSV_VECTOR_MATH_FUNCS)
				// IBM MASS library's vector math functions

				TLI_DEFINE_VECTOR_FUNC("acosf", "__vsacos", Intrinsic::experimental_vector_acosf)
				TLI_DEFINE_VECTOR_FUNC("__acosf_finite", "__vsacos",
				Intrinsic::experimental_vector_acosf)
				TLI_DEFINE_VECTOR_FUNC("acos", "__vacos", Intrinsic::experimental_vector_acos)
				TLI_DEFINE_VECTOR_FUNC("__acos_finite", "__vacos", Intrinsic::experimental_vector_acos)

				TLI_DEFINE_VECTOR_FUNC("acoshf", "__vsacosh", Intrinsic::experimental_vector_acoshf)
				TLI_DEFINE_VECTOR_FUNC("__acoshf_finite", "__vsacosh", Intrinsic::experimental_vector_acoshf)
				TLI_DEFINE_VECTOR_FUNC("acosh", "__vacosh", Intrinsic::experimental_vector_acosh)
				TLI_DEFINE_VECTOR_FUNC("__acosh_finite", "__vacosh", Intrinsic::experimental_vector_acosh)

				TLI_DEFINE_VECTOR_FUNC("asinf", "__vsasin", Intrinsic::experimental_vector_asinf)
				TLI_DEFINE_VECTOR_FUNC("__asinf_finite", "__vsasin", Intrinsic::experimental_vector_asinf)
				TLI_DEFINE_VECTOR_FUNC("asin", "__vasin", Intrinsic::experimental_vector_asin)
				TLI_DEFINE_VECTOR_FUNC("__asin_finite", "__vasin", Intrinsic::experimental_vector_asin)

				TLI_DEFINE_VECTOR_FUNC("asinhf", "__vsasinh", Intrinsic::experimental_vector_asinhf)
				TLI_DEFINE_VECTOR_FUNC("asinh", "__vasinh", Intrinsic::experimental_vector_asinh)

				TLI_DEFINE_VECTOR_FUNC("atanf", "__vsatan", Intrinsic::experimental_vector_atanf)
				TLI_DEFINE_VECTOR_FUNC("atan", "__vatan", Intrinsic::experimental_vector_atan)

				TLI_DEFINE_VECTOR_FUNC("atan2", "__vatan2", Intrinsic::experimental_vector_atan2)
				TLI_DEFINE_VECTOR_FUNC("__atan2_finite", "__vatan2", Intrinsic::experimental_vector_atan2)
				TLI_DEFINE_VECTOR_FUNC("atan2f", "__vsatan2", Intrinsic::experimental_vector_atan2f)
				TLI_DEFINE_VECTOR_FUNC("__atan2f_finite", "__vsatan2", Intrinsic::experimental_vector_atan2f)

				TLI_DEFINE_VECTOR_FUNC("atanhf", "__vsatanh", Intrinsic::experimental_vector_atanhf)
				TLI_DEFINE_VECTOR_FUNC("__atanhf_finite", "__vsatanh", Intrinsic::experimental_vector_atanhf)
				TLI_DEFINE_VECTOR_FUNC("atanh", "__vatanh", Intrinsic::experimental_vector_atanh)
				TLI_DEFINE_VECTOR_FUNC("__atanh_finite", "__vatanh", Intrinsic::experimental_vector_atanh)

				TLI_DEFINE_VECTOR_FUNC("cbrtf", "__vscbrt", Intrinsic::experimental_vector_cbrtf)
				TLI_DEFINE_VECTOR_FUNC("cbrt", "__vcbrt", Intrinsic::experimental_vector_cbrt)

				TLI_DEFINE_VECTOR_FUNC("cosf", "__vscos", Intrinsic::experimental_vector_cosf)
				TLI_DEFINE_VECTOR_FUNC("llvm.cos.f32", "__vscos", Intrinsic::experimental_vector_cosf)
				TLI_DEFINE_VECTOR_FUNC("cos", "__vcos", Intrinsic::experimental_vector_cos)
				TLI_DEFINE_VECTOR_FUNC("llvm.cos.f64", "__vcos", Intrinsic::experimental_vector_cos)

				TLI_DEFINE_VECTOR_FUNC("coshf", "__vscosh", Intrinsic::experimental_vector_coshf)
				TLI_DEFINE_VECTOR_FUNC("__coshf_finite", "__vscosh", Intrinsic::experimental_vector_coshf)
				TLI_DEFINE_VECTOR_FUNC("cosh", "__vcosh", Intrinsic::experimental_vector_cosh)
				TLI_DEFINE_VECTOR_FUNC("__cosh_finite", "__vcosh", Intrinsic::experimental_vector_cosh)

				TLI_DEFINE_VECTOR_FUNC("erff", "__vserf", Intrinsic::experimental_vector_erff)
				TLI_DEFINE_VECTOR_FUNC("erf", "__verf", Intrinsic::experimental_vector_erf)

				TLI_DEFINE_VECTOR_FUNC("erfcf", "__vserfc", Intrinsic::experimental_vector_erfcf)
				TLI_DEFINE_VECTOR_FUNC("erfc", "__verfc", Intrinsic::experimental_vector_erfc)

				TLI_DEFINE_VECTOR_FUNC("expf", "__vsexp", Intrinsic::experimental_vector_expf)
				TLI_DEFINE_VECTOR_FUNC("__expf_finite", "__vsexp", Intrinsic::experimental_vector_expf)
				TLI_DEFINE_VECTOR_FUNC("llvm.exp.f32", "__vsexp", Intrinsic::experimental_vector_expf)
				TLI_DEFINE_VECTOR_FUNC("exp", "__vexp", Intrinsic::experimental_vector_exp)
				TLI_DEFINE_VECTOR_FUNC("__exp_finite", "__vexp", Intrinsic::experimental_vector_exp)
				TLI_DEFINE_VECTOR_FUNC("llvm.exp.f64", "__vexp", Intrinsic::experimental_vector_exp)

				TLI_DEFINE_VECTOR_FUNC("expm1f", "__vsexpm1", Intrinsic::experimental_vector_expm1f)
				TLI_DEFINE_VECTOR_FUNC("expm1", "__vexpm1", Intrinsic::experimental_vector_expm1)

				TLI_DEFINE_VECTOR_FUNC("hypotf", "__vshypot", Intrinsic::experimental_vector_hypotf)
				TLI_DEFINE_VECTOR_FUNC("hypot", "__vhypot", Intrinsic::experimental_vector_hypot)

				TLI_DEFINE_VECTOR_FUNC("lgammaf", "__vslgamma", Intrinsic::experimental_vector_lgammaf)
				TLI_DEFINE_VECTOR_FUNC("lgamma", "__vlgamma", Intrinsic::experimental_vector_lgamma)

				TLI_DEFINE_VECTOR_FUNC("logf", "__vslog", Intrinsic::experimental_vector_logf)
				TLI_DEFINE_VECTOR_FUNC("__logf_finite", "__vslog", Intrinsic::experimental_vector_logf)
				TLI_DEFINE_VECTOR_FUNC("llvm.log.f32", "__vslog", Intrinsic::experimental_vector_logf)
				TLI_DEFINE_VECTOR_FUNC("log", "__vlog", Intrinsic::experimental_vector_log)
				TLI_DEFINE_VECTOR_FUNC("__log_finite", "__vlog", Intrinsic::experimental_vector_log)
				TLI_DEFINE_VECTOR_FUNC("llvm.log.f64", "__vlog", Intrinsic::experimental_vector_log)

				TLI_DEFINE_VECTOR_FUNC("log10f", "__vslog10", Intrinsic::experimental_vector_log10f)
				TLI_DEFINE_VECTOR_FUNC("__log10f_finite", "__vslog10", Intrinsic::experimental_vector_log10f)
				TLI_DEFINE_VECTOR_FUNC("llvm.log10.f32", "__vslog10", Intrinsic::experimental_vector_log10f)
				TLI_DEFINE_VECTOR_FUNC("log10", "__vlog10", Intrinsic::experimental_vector_log10)
				TLI_DEFINE_VECTOR_FUNC("__log10_finite", "__vlog10", Intrinsic::experimental_vector_log10)
				TLI_DEFINE_VECTOR_FUNC("llvm.log10.f64", "__vlog10", Intrinsic::experimental_vector_log10)

				TLI_DEFINE_VECTOR_FUNC("log1pf", "__vslog1p", Intrinsic::experimental_vector_log1pf)
				TLI_DEFINE_VECTOR_FUNC("log1p", "__vlog1p", Intrinsic::experimental_vector_log1p)

				TLI_DEFINE_VECTOR_FUNC("powf", "__vspow", Intrinsic::experimental_vector_powf)
				TLI_DEFINE_VECTOR_FUNC("__powf_finite", "__vspow", Intrinsic::experimental_vector_powf)
				TLI_DEFINE_VECTOR_FUNC("llvm.pow.f32", "__vspow", Intrinsic::experimental_vector_powf)
				TLI_DEFINE_VECTOR_FUNC("pow", "__vpow", Intrinsic::experimental_vector_pow)
				TLI_DEFINE_VECTOR_FUNC("__pow_finite", "__vpow", Intrinsic::experimental_vector_pow)
				TLI_DEFINE_VECTOR_FUNC("llvm.pow.f64", "__vpow", Intrinsic::experimental_vector_pow)

				TLI_DEFINE_VECTOR_FUNC("rsqrt", "__vrsqrt", Intrinsic::experimental_vector_rsqrt)

				TLI_DEFINE_VECTOR_FUNC("sinf", "__vssin", Intrinsic::experimental_vector_sinf)
				TLI_DEFINE_VECTOR_FUNC("llvm.sin.f32", "__vssin", Intrinsic::experimental_vector_sinf)
				TLI_DEFINE_VECTOR_FUNC("sin", "__vsin", Intrinsic::experimental_vector_sin)
				TLI_DEFINE_VECTOR_FUNC("llvm.sin.f64", "__vsin", Intrinsic::experimental_vector_sin)

				TLI_DEFINE_VECTOR_FUNC("sinhf", "__vssinh", Intrinsic::experimental_vector_sinhf)
				TLI_DEFINE_VECTOR_FUNC("__sinhf_finite", "__vssinh", Intrinsic::experimental_vector_sinhf)
				TLI_DEFINE_VECTOR_FUNC("sinh", "__vsinh", Intrinsic::experimental_vector_sinh)
				TLI_DEFINE_VECTOR_FUNC("__sinh_finite", "__vsinh", Intrinsic::experimental_vector_sinh)

				TLI_DEFINE_VECTOR_FUNC("sqrt", "__vsqrt", Intrinsic::experimental_vector_sqrt)

				TLI_DEFINE_VECTOR_FUNC("tanf", "__vstan", Intrinsic::experimental_vector_tanf)
				TLI_DEFINE_VECTOR_FUNC("tan", "__vtan", Intrinsic::experimental_vector_tan)

				TLI_DEFINE_VECTOR_FUNC("tanhf", "__vstanh", Intrinsic::experimental_vector_tanhf)
				TLI_DEFINE_VECTOR_FUNC("tanh", "__vtanh", Intrinsic::experimental_vector_tanh)

				#else
				#error "Must choose which vector library functions are to be defined."
				#endif

				#undef TLI_DEFINE_MASSV_VECTOR_MATH_FUNCS
				#undef TLI_DEFINE_VECTOR_FUNC
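
For readers unfamiliar with .def files, the sketch below shows the general X-macro technique by which a list of TLI_DEFINE_VECTOR_FUNC entries can become a static table. It is an approximation: a real consumer would #include the .def file inside the array initializer, whereas this self-contained version uses a list macro (EXAMPLE_VECTOR_FUNCS, a made-up name) in its place. The struct name, the lookup helper, and the numeric IDs are placeholders; the three name pairs are taken from the file above.

```cpp
#include <cassert>
#include <cstring>

// Simplified stand-in for VectorMathDesc with plain C types.
struct VectorMathDescSketch {
  const char *ScalarFnName;
  const char *VectorFnName;
  unsigned ID; // placeholder for Intrinsic::ID
};

// Stand-in for the contents of the .def file: each entry is passed to
// whatever macro the includer supplies.
#define EXAMPLE_VECTOR_FUNCS(X)                                               \
  X("exp", "__vexp", 1)                                                       \
  X("expf", "__vsexp", 2)                                                     \
  X("sin", "__vsin", 3)

// Each entry expands to one brace-enclosed initializer followed by a comma.
#define TLI_DEFINE_VECTOR_FUNC(SCAL, VEC, ID) {SCAL, VEC, ID},
static const VectorMathDescSketch ExampleFuncs[] = {
    EXAMPLE_VECTOR_FUNCS(TLI_DEFINE_VECTOR_FUNC)};
#undef TLI_DEFINE_VECTOR_FUNC

// Linear lookup from scalar name to vector function name; returns nullptr
// when the scalar function has no entry.
const char *lookupVectorFn(const char *Scalar) {
  for (const VectorMathDescSketch &D : ExampleFuncs)
    if (std::strcmp(D.ScalarFnName, Scalar) == 0)
      return D.VectorFnName;
  return nullptr;
}
```

Because several scalar names (for example exp, __exp_finite and llvm.exp.f64) map to the same vector function in the file above, a table keyed by scalar name has more rows than there are distinct vector functions.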

llvm/include/llvm/IR/Intrinsics.td

(1,256 lines not shown; context: def int_experimental_constrained_fcmp)
llvm_metadata_ty, llvm_metadata_ty ]>;
def int_experimental_constrained_fcmps
: DefaultAttrsIntrinsic<[ LLVMScalarOrSameVectorWidth<0, llvm_i1_ty> ],
[ llvm_anyfloat_ty, LLVMMatchType<0>,
llvm_metadata_ty, llvm_metadata_ty ]>;
}
// FIXME: Consider maybe adding intrinsics for sitofp, uitofp.

		//===--------------- Floating Point Vector Math Intrinsics ----------------===//
		//

		def int_experimental_vector_acosf
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty], [
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>, WriteOnly<ArgIndex<1>>,
		ReadOnly<ArgIndex<2>>
		]>;
		def int_experimental_vector_acos
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty], [
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>, WriteOnly<ArgIndex<1>>,
		ReadOnly<ArgIndex<2>>
		]>;
		def int_experimental_vector_acoshf
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty], [
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>, WriteOnly<ArgIndex<1>>,
		ReadOnly<ArgIndex<2>>
		]>;
		def int_experimental_vector_acosh
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty], [
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>, WriteOnly<ArgIndex<1>>,
		ReadOnly<ArgIndex<2>>
		]>;
		def int_experimental_vector_asinf
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty], [
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>, WriteOnly<ArgIndex<1>>,
		ReadOnly<ArgIndex<2>>
		]>;
		def int_experimental_vector_asin
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty], [
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>, WriteOnly<ArgIndex<1>>,
		ReadOnly<ArgIndex<2>>
		]>;
		def int_experimental_vector_asinhf
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty], [
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>, WriteOnly<ArgIndex<1>>,
		ReadOnly<ArgIndex<2>>
		]>;
		def int_experimental_vector_asinh
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty], [
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>, WriteOnly<ArgIndex<1>>,
		ReadOnly<ArgIndex<2>>
		]>;
		def int_experimental_vector_atanf
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty], [
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>, WriteOnly<ArgIndex<1>>,
		ReadOnly<ArgIndex<2>>
		]>;
		def int_experimental_vector_atan
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty], [
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>, WriteOnly<ArgIndex<1>>,
		ReadOnly<ArgIndex<2>>
		]>;
		def int_experimental_vector_atan2f
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty, llvm_anyptr_ty],
		[
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoCapture<ArgIndex<3>>, NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>,
		NoAlias<ArgIndex<3>>, WriteOnly<ArgIndex<1>>, ReadOnly<ArgIndex<2>>,
		ReadOnly<ArgIndex<3>>
		]>;
		def int_experimental_vector_atan2
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty, llvm_anyptr_ty],
		[
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoCapture<ArgIndex<3>>, NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>,
		NoAlias<ArgIndex<3>>, WriteOnly<ArgIndex<1>>, ReadOnly<ArgIndex<2>>,
		ReadOnly<ArgIndex<3>>
		]>;
		def int_experimental_vector_atanhf
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty], [
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>, WriteOnly<ArgIndex<1>>,
		ReadOnly<ArgIndex<2>>
		]>;
		def int_experimental_vector_atanh
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty], [
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>, WriteOnly<ArgIndex<1>>,
		ReadOnly<ArgIndex<2>>
		]>;
		def int_experimental_vector_cbrtf
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty], [
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>, WriteOnly<ArgIndex<1>>,
		ReadOnly<ArgIndex<2>>
		]>;
		def int_experimental_vector_cbrt
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty], [
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>, WriteOnly<ArgIndex<1>>,
		ReadOnly<ArgIndex<2>>
		]>;
		def int_experimental_vector_cosf
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty], [
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>, WriteOnly<ArgIndex<1>>,
		ReadOnly<ArgIndex<2>>
		]>;
		def int_experimental_vector_cos
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty], [
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>, WriteOnly<ArgIndex<1>>,
		ReadOnly<ArgIndex<2>>
		]>;
		def int_experimental_vector_coshf
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty], [
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>, WriteOnly<ArgIndex<1>>,
		ReadOnly<ArgIndex<2>>
		]>;
		def int_experimental_vector_cosh
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty], [
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>, WriteOnly<ArgIndex<1>>,
		ReadOnly<ArgIndex<2>>
		]>;
		def int_experimental_vector_erff
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty], [
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>, WriteOnly<ArgIndex<1>>,
		ReadOnly<ArgIndex<2>>
		]>;
		def int_experimental_vector_erf
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty], [
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>, WriteOnly<ArgIndex<1>>,
		ReadOnly<ArgIndex<2>>
		]>;
		def int_experimental_vector_erfcf
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty], [
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>, WriteOnly<ArgIndex<1>>,
		ReadOnly<ArgIndex<2>>
		]>;
		def int_experimental_vector_erfc
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty], [
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>, WriteOnly<ArgIndex<1>>,
		ReadOnly<ArgIndex<2>>
		]>;
		def int_experimental_vector_expf
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty], [
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>, WriteOnly<ArgIndex<1>>,
		ReadOnly<ArgIndex<2>>
		]>;
		def int_experimental_vector_exp
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty], [
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>, WriteOnly<ArgIndex<1>>,
		ReadOnly<ArgIndex<2>>
		]>;
		def int_experimental_vector_expm1f
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty], [
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>, WriteOnly<ArgIndex<1>>,
		ReadOnly<ArgIndex<2>>
		]>;
		def int_experimental_vector_expm1
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty], [
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>, WriteOnly<ArgIndex<1>>,
		ReadOnly<ArgIndex<2>>
		]>;
		def int_experimental_vector_hypotf
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty, llvm_anyptr_ty],
		[
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoCapture<ArgIndex<3>>, NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>,
		NoAlias<ArgIndex<3>>, WriteOnly<ArgIndex<1>>, ReadOnly<ArgIndex<2>>,
		ReadOnly<ArgIndex<3>>
		]>;
		def int_experimental_vector_hypot
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty, llvm_anyptr_ty],
		[
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoCapture<ArgIndex<3>>, NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>,
		NoAlias<ArgIndex<3>>, WriteOnly<ArgIndex<1>>, ReadOnly<ArgIndex<2>>,
		ReadOnly<ArgIndex<3>>
		]>;
		def int_experimental_vector_lgammaf
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty], [
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>, WriteOnly<ArgIndex<1>>,
		ReadOnly<ArgIndex<2>>
		]>;
		def int_experimental_vector_lgamma
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty], [
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>, WriteOnly<ArgIndex<1>>,
		ReadOnly<ArgIndex<2>>
		]>;
		def int_experimental_vector_logf
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty], [
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>, WriteOnly<ArgIndex<1>>,
		ReadOnly<ArgIndex<2>>
		]>;
		def int_experimental_vector_log
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty], [
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>, WriteOnly<ArgIndex<1>>,
		ReadOnly<ArgIndex<2>>
		]>;
		def int_experimental_vector_log10f
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty], [
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>, WriteOnly<ArgIndex<1>>,
		ReadOnly<ArgIndex<2>>
		]>;
		def int_experimental_vector_log10
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty], [
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>, WriteOnly<ArgIndex<1>>,
		ReadOnly<ArgIndex<2>>
		]>;
		def int_experimental_vector_log1pf
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty], [
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>, WriteOnly<ArgIndex<1>>,
		ReadOnly<ArgIndex<2>>
		]>;
		def int_experimental_vector_log1p
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty], [
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>, WriteOnly<ArgIndex<1>>,
		ReadOnly<ArgIndex<2>>
		]>;
		def int_experimental_vector_powf
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty, llvm_anyptr_ty],
		[
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoCapture<ArgIndex<3>>, NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>,
		NoAlias<ArgIndex<3>>, WriteOnly<ArgIndex<1>>, ReadOnly<ArgIndex<2>>,
		ReadOnly<ArgIndex<3>>
		]>;
		def int_experimental_vector_pow
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty, llvm_anyptr_ty],
		[
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoCapture<ArgIndex<3>>, NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>,
		NoAlias<ArgIndex<3>>, WriteOnly<ArgIndex<1>>, ReadOnly<ArgIndex<2>>,
		ReadOnly<ArgIndex<3>>
		]>;
		def int_experimental_vector_rsqrt
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty], [
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>, WriteOnly<ArgIndex<1>>,
		ReadOnly<ArgIndex<2>>
		]>;
		def int_experimental_vector_sqrt
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty], [
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>, WriteOnly<ArgIndex<1>>,
		ReadOnly<ArgIndex<2>>
		]>;
		def int_experimental_vector_sinf
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty], [
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>, WriteOnly<ArgIndex<1>>,
		ReadOnly<ArgIndex<2>>
		]>;
		def int_experimental_vector_sin
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty], [
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>, WriteOnly<ArgIndex<1>>,
		ReadOnly<ArgIndex<2>>
		]>;
		def int_experimental_vector_sinhf
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty], [
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>, WriteOnly<ArgIndex<1>>,
		ReadOnly<ArgIndex<2>>
		]>;
		def int_experimental_vector_sinh
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty], [
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>, WriteOnly<ArgIndex<1>>,
		ReadOnly<ArgIndex<2>>
		]>;
		def int_experimental_vector_tanf
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty], [
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>, WriteOnly<ArgIndex<1>>,
		ReadOnly<ArgIndex<2>>
		]>;
		def int_experimental_vector_tan
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty], [
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>, WriteOnly<ArgIndex<1>>,
		ReadOnly<ArgIndex<2>>
		]>;
		def int_experimental_vector_tanhf
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty], [
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>, WriteOnly<ArgIndex<1>>,
		ReadOnly<ArgIndex<2>>
		]>;
		def int_experimental_vector_tanh
		: DefaultAttrsIntrinsic<
		[], [llvm_anyint_ty, llvm_anyptr_ty, llvm_anyptr_ty], [
		IntrArgMemOnly, IntrWillReturn, IntrNoFree, IntrNoCallback,
		NoCapture<ArgIndex<1>>, NoCapture<ArgIndex<2>>,
		NoAlias<ArgIndex<1>>, NoAlias<ArgIndex<2>>, WriteOnly<ArgIndex<1>>,
		ReadOnly<ArgIndex<2>>
		]>;

//===------------------------- Expect Intrinsics --------------------------===//		//===------------------------- Expect Intrinsics --------------------------===//
//		//
def int_expect : DefaultAttrsIntrinsic<[llvm_anyint_ty],		def int_expect : DefaultAttrsIntrinsic<[llvm_anyint_ty],
[LLVMMatchType<0>, LLVMMatchType<0>], [IntrNoMem, IntrWillReturn]>;		[LLVMMatchType<0>, LLVMMatchType<0>], [IntrNoMem, IntrWillReturn]>;

def int_expect_with_probability : DefaultAttrsIntrinsic<[llvm_anyint_ty],		def int_expect_with_probability : DefaultAttrsIntrinsic<[llvm_anyint_ty],
[LLVMMatchType<0>, LLVMMatchType<0>, llvm_double_ty],		[LLVMMatchType<0>, LLVMMatchType<0>, llvm_double_ty],

llvm/include/llvm/Target/TargetOptions.h

Show First 20 Lines • Show All 136 Lines • ▼ Show 20 Lines	TargetOptions()
TrapUnreachable(false), NoTrapAfterNoreturn(false), TLSSize(0),		TrapUnreachable(false), NoTrapAfterNoreturn(false), TLSSize(0),
EmulatedTLS(false), EnableIPRA(false), EmitStackSizeSection(false),		EmulatedTLS(false), EnableIPRA(false), EmitStackSizeSection(false),
EnableMachineOutliner(false), EnableMachineFunctionSplitter(false),		EnableMachineOutliner(false), EnableMachineFunctionSplitter(false),
SupportsDefaultOutlining(false), EmitAddrsig(false),		SupportsDefaultOutlining(false), EmitAddrsig(false),
EmitCallSiteInfo(false), SupportsDebugEntryValues(false),		EmitCallSiteInfo(false), SupportsDebugEntryValues(false),
EnableDebugEntryValues(false), ValueTrackingVariableLocations(false),		EnableDebugEntryValues(false), ValueTrackingVariableLocations(false),
ForceDwarfFrameSection(false), XRayOmitFunctionIndex(false),		ForceDwarfFrameSection(false), XRayOmitFunctionIndex(false),
DebugStrictDwarf(false), Hotpatch(false),		DebugStrictDwarf(false), Hotpatch(false),
PPCGenScalarMASSEntries(false), JMCInstrument(false),		PPCGenScalarMASSEntries(false), PPCGenVectorMASSEntries(false),
EnableCFIFixup(false), MisExpect(false), XCOFFReadOnlyPointers(false),		JMCInstrument(false), EnableCFIFixup(false), MisExpect(false),
		XCOFFReadOnlyPointers(false),
FPDenormalMode(DenormalMode::IEEE, DenormalMode::IEEE) {}		FPDenormalMode(DenormalMode::IEEE, DenormalMode::IEEE) {}

/// DisableFramePointerElim - This returns true if frame pointer elimination		/// DisableFramePointerElim - This returns true if frame pointer elimination
/// optimization should be disabled for the given machine function.		/// optimization should be disabled for the given machine function.
bool DisableFramePointerElim(const MachineFunction &MF) const;		bool DisableFramePointerElim(const MachineFunction &MF) const;

/// If greater than 0, override the default value of		/// If greater than 0, override the default value of
/// MCAsmInfo::BinutilsVersion.		/// MCAsmInfo::BinutilsVersion.
▲ Show 20 Lines • Show All 186 Lines • ▼ Show 20 Lines	public:
unsigned DebugStrictDwarf : 1;		unsigned DebugStrictDwarf : 1;

/// Emit the hotpatch flag in CodeView debug.		/// Emit the hotpatch flag in CodeView debug.
unsigned Hotpatch : 1;		unsigned Hotpatch : 1;

/// Enables scalar MASS conversions		/// Enables scalar MASS conversions
unsigned PPCGenScalarMASSEntries : 1;		unsigned PPCGenScalarMASSEntries : 1;

		/// Enables vector MASS conversions
		unsigned PPCGenVectorMASSEntries : 1;

/// Enable JustMyCode instrumentation.		/// Enable JustMyCode instrumentation.
unsigned JMCInstrument : 1;		unsigned JMCInstrument : 1;

/// Enable the CFIFixup pass.		/// Enable the CFIFixup pass.
unsigned EnableCFIFixup : 1;		unsigned EnableCFIFixup : 1;

/// When set to true, enable MisExpect Diagnostics		/// When set to true, enable MisExpect Diagnostics
/// By default, it is set to false		/// By default, it is set to false

llvm/include/llvm/Transforms/Scalar/LoopIdiomRecognize.h

Show All 28 Lines	struct DisableLIRP {
/// When true, the entire pass is disabled.		/// When true, the entire pass is disabled.
static bool All;		static bool All;

/// When true, Memset is disabled.		/// When true, Memset is disabled.
static bool Memset;		static bool Memset;

/// When true, Memcpy is disabled.		/// When true, Memcpy is disabled.
static bool Memcpy;		static bool Memcpy;

		/// When true, VectorMath is disabled.
		static bool VectorMath;
};		};

/// Performs Loop Idiom Recognize Pass.		/// Performs Loop Idiom Recognize Pass.
class LoopIdiomRecognizePass : public PassInfoMixin<LoopIdiomRecognizePass> {		class LoopIdiomRecognizePass : public PassInfoMixin<LoopIdiomRecognizePass> {
public:		public:
PreservedAnalyses run(Loop &L, LoopAnalysisManager &AM,		PreservedAnalyses run(Loop &L, LoopAnalysisManager &AM,
LoopStandardAnalysisResults &AR, LPMUpdater &U);		LoopStandardAnalysisResults &AR, LPMUpdater &U);
};		};

} // end namespace llvm		} // end namespace llvm

#endif // LLVM_TRANSFORMS_SCALAR_LOOPIDIOMRECOGNIZE_H		#endif // LLVM_TRANSFORMS_SCALAR_LOOPIDIOMRECOGNIZE_H

llvm/lib/Analysis/TargetLibraryInfo.cpp

//===-- TargetLibraryInfo.cpp - Runtime library information ----------------==//		//===-- TargetLibraryInfo.cpp - Runtime library information ----------------==//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
// This file implements the TargetLibraryInfo class.		// This file implements the TargetLibraryInfo class.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "llvm/Analysis/TargetLibraryInfo.h"		#include "llvm/Analysis/TargetLibraryInfo.h"
#include "llvm/IR/Constants.h"		#include "llvm/IR/Constants.h"
		#include "llvm/IR/Intrinsics.h"
#include "llvm/InitializePasses.h"		#include "llvm/InitializePasses.h"
#include "llvm/Support/CommandLine.h"		#include "llvm/Support/CommandLine.h"
#include "llvm/TargetParser/Triple.h"		#include "llvm/TargetParser/Triple.h"
using namespace llvm;		using namespace llvm;

static cl::opt<TargetLibraryInfoImpl::VectorLibrary> ClVectorLibrary(		static cl::opt<TargetLibraryInfoImpl::VectorLibrary> ClVectorLibrary(
"vector-library", cl::Hidden, cl::desc("Vector functions library"),		"vector-library", cl::Hidden, cl::desc("Vector functions library"),
cl::init(TargetLibraryInfoImpl::NoLibrary),		cl::init(TargetLibraryInfoImpl::NoLibrary),
▲ Show 20 Lines • Show All 835 Lines • ▼ Show 20 Lines	static void initialize(TargetLibraryInfoImpl &TLI, const Triple &T,
if (!T.isOSAIX()) {		if (!T.isOSAIX()) {
TLI.setUnavailable(LibFunc_vec_calloc);		TLI.setUnavailable(LibFunc_vec_calloc);
TLI.setUnavailable(LibFunc_vec_malloc);		TLI.setUnavailable(LibFunc_vec_malloc);
TLI.setUnavailable(LibFunc_vec_realloc);		TLI.setUnavailable(LibFunc_vec_realloc);
TLI.setUnavailable(LibFunc_vec_free);		TLI.setUnavailable(LibFunc_vec_free);
}		}

TLI.addVectorizableFunctionsFromVecLib(ClVectorLibrary, T);		TLI.addVectorizableFunctionsFromVecLib(ClVectorLibrary, T);
		TLI.addVectorMathFunctionsFromVecLib(ClVectorLibrary);
}		}

TargetLibraryInfoImpl::TargetLibraryInfoImpl() {		TargetLibraryInfoImpl::TargetLibraryInfoImpl() {
// Default to everything being available.		// Default to everything being available.
memset(AvailableArray, -1, sizeof(AvailableArray));		memset(AvailableArray, -1, sizeof(AvailableArray));

initialize(*this, Triple(), StandardNames);		initialize(*this, Triple(), StandardNames);
}		}
Show All 9 Lines	TargetLibraryInfoImpl::TargetLibraryInfoImpl(const TargetLibraryInfoImpl &TLI)
: CustomNames(TLI.CustomNames), ShouldExtI32Param(TLI.ShouldExtI32Param),		: CustomNames(TLI.CustomNames), ShouldExtI32Param(TLI.ShouldExtI32Param),
ShouldExtI32Return(TLI.ShouldExtI32Return),		ShouldExtI32Return(TLI.ShouldExtI32Return),
ShouldSignExtI32Param(TLI.ShouldSignExtI32Param),		ShouldSignExtI32Param(TLI.ShouldSignExtI32Param),
ShouldSignExtI32Return(TLI.ShouldSignExtI32Return),		ShouldSignExtI32Return(TLI.ShouldSignExtI32Return),
SizeOfInt(TLI.SizeOfInt) {		SizeOfInt(TLI.SizeOfInt) {
memcpy(AvailableArray, TLI.AvailableArray, sizeof(AvailableArray));		memcpy(AvailableArray, TLI.AvailableArray, sizeof(AvailableArray));
VectorDescs = TLI.VectorDescs;		VectorDescs = TLI.VectorDescs;
ScalarDescs = TLI.ScalarDescs;		ScalarDescs = TLI.ScalarDescs;
		VectorMathFuncDescs = TLI.VectorMathFuncDescs;
		VectorMathIntrinDescs = TLI.VectorMathIntrinDescs;
}		}

TargetLibraryInfoImpl::TargetLibraryInfoImpl(TargetLibraryInfoImpl &&TLI)		TargetLibraryInfoImpl::TargetLibraryInfoImpl(TargetLibraryInfoImpl &&TLI)
: CustomNames(std::move(TLI.CustomNames)),		: CustomNames(std::move(TLI.CustomNames)),
ShouldExtI32Param(TLI.ShouldExtI32Param),		ShouldExtI32Param(TLI.ShouldExtI32Param),
ShouldExtI32Return(TLI.ShouldExtI32Return),		ShouldExtI32Return(TLI.ShouldExtI32Return),
ShouldSignExtI32Param(TLI.ShouldSignExtI32Param),		ShouldSignExtI32Param(TLI.ShouldSignExtI32Param),
ShouldSignExtI32Return(TLI.ShouldSignExtI32Return),		ShouldSignExtI32Return(TLI.ShouldSignExtI32Return),
SizeOfInt(TLI.SizeOfInt) {		SizeOfInt(TLI.SizeOfInt) {
std::move(std::begin(TLI.AvailableArray), std::end(TLI.AvailableArray),		std::move(std::begin(TLI.AvailableArray), std::end(TLI.AvailableArray),
AvailableArray);		AvailableArray);
VectorDescs = TLI.VectorDescs;		VectorDescs = TLI.VectorDescs;
ScalarDescs = TLI.ScalarDescs;		ScalarDescs = TLI.ScalarDescs;
		VectorMathFuncDescs = TLI.VectorMathFuncDescs;
		VectorMathIntrinDescs = TLI.VectorMathIntrinDescs;
}		}

TargetLibraryInfoImpl &TargetLibraryInfoImpl::operator=(const TargetLibraryInfoImpl &TLI) {		TargetLibraryInfoImpl &TargetLibraryInfoImpl::operator=(const TargetLibraryInfoImpl &TLI) {
CustomNames = TLI.CustomNames;		CustomNames = TLI.CustomNames;
ShouldExtI32Param = TLI.ShouldExtI32Param;		ShouldExtI32Param = TLI.ShouldExtI32Param;
ShouldExtI32Return = TLI.ShouldExtI32Return;		ShouldExtI32Return = TLI.ShouldExtI32Return;
ShouldSignExtI32Param = TLI.ShouldSignExtI32Param;		ShouldSignExtI32Param = TLI.ShouldSignExtI32Param;
ShouldSignExtI32Return = TLI.ShouldSignExtI32Return;		ShouldSignExtI32Return = TLI.ShouldSignExtI32Return;
▲ Show 20 Lines • Show All 329 Lines • ▼ Show 20 Lines	StringRef TargetLibraryInfoImpl::getVectorizedFunction(StringRef F,
while (I != VectorDescs.end() && StringRef(I->ScalarFnName) == F) {		while (I != VectorDescs.end() && StringRef(I->ScalarFnName) == F) {
if ((I->VectorizationFactor == VF) && (I->Masked == Masked))		if ((I->VectorizationFactor == VF) && (I->Masked == Masked))
return I->VectorFnName;		return I->VectorFnName;
++I;		++I;
}		}
return StringRef();		return StringRef();
}		}

		void TargetLibraryInfoImpl::addVectorMathFunctions(
		ArrayRef<VectorMathDesc> Fns) {
		llvm::append_range(VectorMathFuncDescs, Fns);
		llvm::sort(VectorMathFuncDescs,
		[](const VectorMathDesc &LHS, const VectorMathDesc &RHS) {
		return LHS.ScalarFnName < RHS.ScalarFnName;
		});

		llvm::append_range(VectorMathIntrinDescs, Fns);
		llvm::sort(VectorMathIntrinDescs,
		[](const VectorMathDesc &LHS, const VectorMathDesc &RHS) {
		return LHS.ID < RHS.ID;
		});
		}

		void TargetLibraryInfoImpl::addVectorMathFunctionsFromVecLib(
		enum VectorLibrary VecLib) {
		if (VecLib == MASSV) {
		const VectorMathDesc Funcs[] = {
		#define TLI_DEFINE_MASSV_VECTOR_MATH_FUNCS
		#include "llvm/Analysis/VectorMathFuncs.def"
		};
		addVectorMathFunctions(Funcs);
		}
		}

		bool TargetLibraryInfoImpl::isVectorMathFunctionAvailable(
		StringRef FuncName) const {
		FuncName = sanitizeFunctionName(FuncName);
		if (FuncName.empty())
		return false;

		std::vector<VectorMathDesc>::const_iterator I =
		llvm::lower_bound(VectorMathFuncDescs, FuncName,
		[](const VectorMathDesc &LHS, StringRef S) {
		return LHS.ScalarFnName < S;
		});
		return I != VectorMathFuncDescs.end() &&
		StringRef(I->ScalarFnName) == FuncName;
		}

		bool TargetLibraryInfoImpl::isVectorMathFunctionAvailable(
		Intrinsic::ID ID) const {
		std::vector<VectorMathDesc>::const_iterator I = llvm::lower_bound(
		VectorMathIntrinDescs, ID,
		[](const VectorMathDesc &LHS, Intrinsic::ID ID) { return LHS.ID < ID; });
		return I != VectorMathIntrinDescs.end() && I->ID == ID;
		}

		Intrinsic::ID TargetLibraryInfoImpl::getVectorMathIntrinsic(StringRef F) const {
		F = sanitizeFunctionName(F);
		if (F.empty())
		return Intrinsic::not_intrinsic;

		std::vector<VectorMathDesc>::const_iterator I = llvm::lower_bound(
		VectorMathFuncDescs, F, [](const VectorMathDesc &LHS, StringRef S) {
		return LHS.ScalarFnName < S;
		});
		if (I != VectorMathFuncDescs.end() && StringRef(I->ScalarFnName) == F)
		return I->ID;

		return Intrinsic::not_intrinsic;
		}

		StringRef TargetLibraryInfoImpl::getVectorMathFunction(Intrinsic::ID ID) const {
		std::vector<VectorMathDesc>::const_iterator I = llvm::lower_bound(
		VectorMathIntrinDescs, ID,
		[](const VectorMathDesc &LHS, Intrinsic::ID ID) { return LHS.ID < ID; });
		if (I != VectorMathIntrinDescs.end() && I->ID == ID)
		return I->VectorFnName;

		return StringRef();
		}
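
The lookups above rely on two copies of the same descriptor table, one sorted by scalar function name and one sorted by intrinsic ID, each searched with a binary search. The following is a minimal standalone sketch of that pattern using only the standard library. `Desc`, `Table`, `idForScalar`, `vectorFnForID` and `makeDemoTable` are hypothetical names for illustration and not part of the patch. The IDs and every demo entry other than the `tanh` to `__vtanh` pair (taken from the file comment in this diff) are made up. Each iterator is compared against the `end()` of the container it was obtained from.

```cpp
#include <algorithm>
#include <cassert>
#include <string_view>
#include <vector>

// Hypothetical stand-in for the patch's VectorMathDesc.
struct Desc {
  std::string_view ScalarFnName;
  unsigned ID; // stands in for Intrinsic::ID; 0 plays the role of not_intrinsic
  std::string_view VectorFnName;
};

class Table {
  std::vector<Desc> ByName; // kept sorted by ScalarFnName
  std::vector<Desc> ByID;   // kept sorted by ID

public:
  // Append the new descriptors to both copies and re-sort each by its own key.
  void add(const std::vector<Desc> &Fns) {
    ByName.insert(ByName.end(), Fns.begin(), Fns.end());
    std::sort(ByName.begin(), ByName.end(), [](const Desc &L, const Desc &R) {
      return L.ScalarFnName < R.ScalarFnName;
    });
    ByID.insert(ByID.end(), Fns.begin(), Fns.end());
    std::sort(ByID.begin(), ByID.end(),
              [](const Desc &L, const Desc &R) { return L.ID < R.ID; });
  }

  // Scalar name -> ID, or 0 when the name is not in the table.
  unsigned idForScalar(std::string_view Name) const {
    auto I = std::lower_bound(
        ByName.begin(), ByName.end(), Name,
        [](const Desc &D, std::string_view S) { return D.ScalarFnName < S; });
    return (I != ByName.end() && I->ScalarFnName == Name) ? I->ID : 0;
  }

  // ID -> vector function name, or an empty view when the ID is unknown.
  std::string_view vectorFnForID(unsigned ID) const {
    auto I = std::lower_bound(
        ByID.begin(), ByID.end(), ID,
        [](const Desc &D, unsigned V) { return D.ID < V; });
    return (I != ByID.end() && I->ID == ID) ? I->VectorFnName
                                            : std::string_view();
  }
};

// Demo data: only "tanh" -> "__vtanh" comes from the diff; the rest is invented.
inline Table makeDemoTable() {
  Table T;
  T.add({{"tanh", 3, "__vtanh"},
         {"exp", 1, "demo_vexp"},
         {"sin", 2, "demo_vsin"}});
  return T;
}
```

Keeping two sorted copies trades memory for O(log n) lookups in both directions. A single table with a secondary index would be an alternative design.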

TargetLibraryInfo TargetLibraryAnalysis::run(const Function &F,		TargetLibraryInfo TargetLibraryAnalysis::run(const Function &F,
FunctionAnalysisManager &) {		FunctionAnalysisManager &) {
if (!BaselineInfoImpl)		if (!BaselineInfoImpl)
BaselineInfoImpl =		BaselineInfoImpl =
TargetLibraryInfoImpl(Triple(F.getParent()->getTargetTriple()));		TargetLibraryInfoImpl(Triple(F.getParent()->getTargetTriple()));
return TargetLibraryInfo(*BaselineInfoImpl, &F);		return TargetLibraryInfo(*BaselineInfoImpl, &F);
}		}


llvm/lib/Target/PowerPC/CMakeLists.txt

Show First 20 Lines • Show All 51 Lines • ▼ Show 20 Lines	add_llvm_target(PowerPCCodeGen
PPCVSXCopy.cpp		PPCVSXCopy.cpp
PPCReduceCRLogicals.cpp		PPCReduceCRLogicals.cpp
PPCVSXFMAMutate.cpp		PPCVSXFMAMutate.cpp
PPCVSXSwapRemoval.cpp		PPCVSXSwapRemoval.cpp
PPCExpandISEL.cpp		PPCExpandISEL.cpp
PPCPreEmitPeephole.cpp		PPCPreEmitPeephole.cpp
PPCLowerMASSVEntries.cpp		PPCLowerMASSVEntries.cpp
PPCGenScalarMASSEntries.cpp		PPCGenScalarMASSEntries.cpp
		PPCGenVectorMASSEntries.cpp
GISel/PPCCallLowering.cpp		GISel/PPCCallLowering.cpp
GISel/PPCRegisterBankInfo.cpp		GISel/PPCRegisterBankInfo.cpp
GISel/PPCLegalizerInfo.cpp		GISel/PPCLegalizerInfo.cpp

LINK_COMPONENTS		LINK_COMPONENTS
Analysis		Analysis
AsmPrinter		AsmPrinter
BinaryFormat		BinaryFormat

llvm/lib/Target/PowerPC/PPC.h

Show First 20 Lines • Show All 83 Lines • ▼ Show 20 Lines	#endif
ModulePass *createPPCLowerMASSVEntriesPass();		ModulePass *createPPCLowerMASSVEntriesPass();
void initializePPCLowerMASSVEntriesPass(PassRegistry &);		void initializePPCLowerMASSVEntriesPass(PassRegistry &);
extern char &PPCLowerMASSVEntriesID;		extern char &PPCLowerMASSVEntriesID;

ModulePass *createPPCGenScalarMASSEntriesPass();		ModulePass *createPPCGenScalarMASSEntriesPass();
void initializePPCGenScalarMASSEntriesPass(PassRegistry &);		void initializePPCGenScalarMASSEntriesPass(PassRegistry &);
extern char &PPCGenScalarMASSEntriesID;		extern char &PPCGenScalarMASSEntriesID;

		ModulePass *createPPCGenVectorMASSEntriesPass();
		void initializePPCGenVectorMASSEntriesPass(PassRegistry &);
		extern char &PPCGenVectorMASSEntriesID;

InstructionSelector *		InstructionSelector *
createPPCInstructionSelector(const PPCTargetMachine &, const PPCSubtarget &,		createPPCInstructionSelector(const PPCTargetMachine &, const PPCSubtarget &,
const PPCRegisterBankInfo &);		const PPCRegisterBankInfo &);
namespace PPCII {		namespace PPCII {

/// Target Operand Flag enum.		/// Target Operand Flag enum.
enum TOF {		enum TOF {
//===------------------------------------------------------------------===//		//===------------------------------------------------------------------===//

llvm/lib/Target/PowerPC/PPCGenVectorMASSEntries.cpp

This file was added.

				//===-- PPCGenVectorMASSEntries.cpp ---------------------------------------===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// This transformation converts vector math intrinsics into their
				// corresponding MASS (vector) entries for PowerPC targets.
				// Following are examples of such conversion:
				// llvm.experimental.vector.tanh ---> __vtanh
				// Such lowering is legal under the fast-math option.
				//
				//===----------------------------------------------------------------------===//

				#include "PPC.h"
				#include "PPCSubtarget.h"
				#include "PPCTargetMachine.h"
				#include "llvm/ADT/Statistic.h"
				#include "llvm/Analysis/TargetLibraryInfo.h"
				#include "llvm/CodeGen/TargetPassConfig.h"
				#include "llvm/IR/IRBuilder.h"
				#include "llvm/IR/Instructions.h"
				#include "llvm/IR/Module.h"
				#include "llvm/InitializePasses.h"

				#define DEBUG_TYPE "ppc-gen-vector-mass"

				using namespace llvm;

				STATISTIC(NumVectorMASS, "Number of vector MASS calls created");

				namespace {

				class PPCGenVectorMASSEntries : public ModulePass {
				public:
				static char ID;

				PPCGenVectorMASSEntries() : ModulePass(ID) {}

				bool runOnModule(Module &M) override;

				StringRef getPassName() const override {
				return "PPC Generate Vector MASS Entries";
				}

				void getAnalysisUsage(AnalysisUsage &AU) const override {
				AU.addRequired<TargetLibraryInfoWrapperPass>();
				}

				private:
				bool createVectorMASSCall(Intrinsic::ID ID, IntrinsicInst &II,
				TargetLibraryInfo &TLI) const;
				};

				} // namespace

				/// Create an alloca instruction in the entry block of the function.
				static AllocaInst *createEntryBlockAlloca(Function &F, Type *Ty) {
				IRBuilder<> Builder(&F.getEntryBlock(), F.getEntryBlock().begin());
				return Builder.CreateAlloca(Ty);
				}

				/// Lowers vector math intrinsics to vector MASS functions.
				/// e.g.: llvm.experimental.vector.tanh --> __vtanh
				/// Both the function prototype and its call site are updated during lowering.
				bool PPCGenVectorMASSEntries::createVectorMASSCall(
				Intrinsic::ID IntrinID, IntrinsicInst &II, TargetLibraryInfo &TLI) const {
				assert(II.getIntrinsicID() == IntrinID && "Intrinsic ID mismatched.");

				// Create alloca for the length parameter in vector MASS functions.
				IRBuilder<> Builder(&II);
				Function &F = *II.getParent()->getParent();
				Value *Len = II.getArgOperand(0);
				AllocaInst *AI = createEntryBlockAlloca(F, Len->getType());
				Builder.CreateStore(Len, AI);

				// Create the vector MASS call.
				StringRef VectorFuncName = TLI.getVectorMathFunction(IntrinID);

				SmallVector<Value *, 4> CallArgs;
				for (unsigned I = 1; I < II.arg_size(); ++I)
				CallArgs.push_back(II.getArgOperand(I));
				CallArgs.push_back(AI);

				SmallVector<Type *, 4> ArgsTy;
				for (Value *Arg : CallArgs)
				ArgsTy.push_back(Arg->getType());

				FunctionType *FTy =
				FunctionType::get(Builder.getVoidTy(), ArgsTy, /* isVarArg = */ false);
				Module *M = II.getModule();
				assert(M && "Expecting a valid Module.");

				Function *Intrin = II.getCalledFunction();
				const AttributeList &IntrinAttrs = Intrin->getAttributes();
				AttributeSet FnAttrs = IntrinAttrs.getFnAttrs();
				AttributeSet RetAttrs = IntrinAttrs.getRetAttrs();
				SmallVector<AttributeSet, 8> ArgAttrVec;
				for (unsigned I = 1; I < II.arg_size(); ++I)
				ArgAttrVec.push_back(IntrinAttrs.getParamAttrs(I));
				ArgAttrVec.push_back(IntrinAttrs.getParamAttrs(0));
				assert(ArgAttrVec.size() == CallArgs.size() &&
				"Each function parameter must have an attribute set.");
				AttributeList NewCallAttrs =
				AttributeList::get(F.getContext(), FnAttrs, RetAttrs, ArgAttrVec);

				FunctionCallee VectorFuncCallee =
				M->getOrInsertFunction(VectorFuncName, FTy, NewCallAttrs);
				CallInst *NewCall = Builder.CreateCall(VectorFuncCallee, CallArgs);
				NewCall->copyMetadata(II);
				II.eraseFromParent();
				++NumVectorMASS;

				return true;
				}

				bool PPCGenVectorMASSEntries::runOnModule(Module &M) {
				bool Changed = false;

				auto *TPC = getAnalysisIfAvailable<TargetPassConfig>();
				if (!TPC || skipModule(M))
				return false;

				for (Function &Func : M) {
				Intrinsic::ID IntrinID = Func.getIntrinsicID();
				if (IntrinID == Intrinsic::not_intrinsic)
				continue;

				TargetLibraryInfo &TLI =
				getAnalysis<TargetLibraryInfoWrapperPass>().getTLI(Func);
				if (!TLI.isVectorMathFunctionAvailable(IntrinID))
				continue;

				// The call to createVectorMASSCall() invalidates the iterator over users
				// upon replacing the users. Precomputing the current list of users allows
				// us to replace all the call sites.
				SmallVector<User *, 4> TheUsers;
				for (auto *User : Func.users())
				TheUsers.push_back(User);

				for (auto *User : TheUsers)
				if (auto *II = dyn_cast_or_null<IntrinsicInst>(User))
				Changed |= createVectorMASSCall(IntrinID, *II, TLI);
				}

				return Changed;
				}
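
The comment in runOnModule() notes that replacing users invalidates the iterator over them, so the users are copied into a separate list first. A minimal standalone sketch of that snapshot-then-mutate pattern follows, using plain integers as hypothetical stand-ins for call sites. `rewriteAll` is an illustrative name and not part of the patch.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical model: each int is a "call site" that uses some function.
// Rewriting a call site removes it from the live user set.
int rewriteAll(std::set<int> &Users) {
  // Take a snapshot first. Erasing from Users while iterating over Users
  // would invalidate the iterator being advanced.
  std::vector<int> Snapshot(Users.begin(), Users.end());

  int Rewritten = 0;
  for (int U : Snapshot) {
    Users.erase(U); // stands in for erasing the original intrinsic call
    ++Rewritten;
  }
  return Rewritten;
}
```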

				char PPCGenVectorMASSEntries::ID = 0;

				char &llvm::PPCGenVectorMASSEntriesID = PPCGenVectorMASSEntries::ID;

				INITIALIZE_PASS_BEGIN(PPCGenVectorMASSEntries, DEBUG_TYPE,
				"Generate Vector MASS entries", false, false)
				INITIALIZE_PASS_DEPENDENCY(TargetLibraryInfoWrapperPass)
				INITIALIZE_PASS_END(PPCGenVectorMASSEntries, DEBUG_TYPE,
				"Generate Vector MASS entries", false, false)

				ModulePass *llvm::createPPCGenVectorMASSEntriesPass() {
				return new PPCGenVectorMASSEntries();
				}
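
As a rough illustration of the call shape produced by createVectorMASSCall(): the intrinsic takes (length, output, input...), while the generated call passes the pointer operands first and a pointer to the stored length last. The sketch below models this in plain C++. `mock_vtanh`, `intrinsic_like_tanh` and `scalar_loop` are hypothetical stand-ins, `mock_vtanh` is not the real MASS routine, and the `int` length type is an assumption made for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a MASS-style vector routine (NOT the real library).
// Argument order follows the lowering in the diff: output, input, &length.
void mock_vtanh(double *Y, const double *X, const int *N) {
  for (int I = 0; I < *N; ++I)
    Y[I] = std::tanh(X[I]);
}

// Models the intrinsic's operand order: (length, output, input).
void intrinsic_like_tanh(int Len, double *Y, const double *X) {
  int LenSlot = Len;          // plays the role of the entry-block alloca + store
  mock_vtanh(Y, X, &LenSlot); // operands 1..N first, pointer to length last
}

// The scalar loop that the idiom recognizer would start from.
std::vector<double> scalar_loop(const std::vector<double> &X) {
  std::vector<double> Y(X.size());
  for (std::size_t I = 0; I < X.size(); ++I)
    Y[I] = std::tanh(X[I]);
  return Y;
}
```

In this mock the vector routine calls the same scalar `std::tanh`, so results match the loop exactly. A real vector library may differ in the last bits, which is consistent with the file comment that the lowering is legal under fast-math.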

llvm/lib/Target/PowerPC/PPCTargetMachine.cpp

Show First 20 Lines • Show All 127 Lines • ▼ Show 20 Lines	#endif
initializePPCBranchCoalescingPass(PR);		initializePPCBranchCoalescingPass(PR);
initializePPCBoolRetToIntPass(PR);		initializePPCBoolRetToIntPass(PR);
initializePPCExpandISELPass(PR);		initializePPCExpandISELPass(PR);
initializePPCPreEmitPeepholePass(PR);		initializePPCPreEmitPeepholePass(PR);
initializePPCTLSDynamicCallPass(PR);		initializePPCTLSDynamicCallPass(PR);
initializePPCMIPeepholePass(PR);		initializePPCMIPeepholePass(PR);
initializePPCLowerMASSVEntriesPass(PR);		initializePPCLowerMASSVEntriesPass(PR);
initializePPCGenScalarMASSEntriesPass(PR);		initializePPCGenScalarMASSEntriesPass(PR);
		initializePPCGenVectorMASSEntriesPass(PR);
initializePPCExpandAtomicPseudoPass(PR);		initializePPCExpandAtomicPseudoPass(PR);
initializeGlobalISel(PR);		initializeGlobalISel(PR);
initializePPCCTRLoopsPass(PR);		initializePPCCTRLoopsPass(PR);
initializePPCDAGToDAGISelPass(PR);		initializePPCDAGToDAGISelPass(PR);
}		}

static bool isLittleEndianTriple(const Triple &T) {		static bool isLittleEndianTriple(const Triple &T) {
return T.getArch() == Triple::ppc64le || T.getArch() == Triple::ppcle;		return T.getArch() == Triple::ppc64le || T.getArch() == Triple::ppcle;
▲ Show 20 Lines • Show All 314 Lines • ▼ Show 20 Lines	void PPCPassConfig::addIRPasses() {
// Generate PowerPC target-specific entries for scalar math functions		// Generate PowerPC target-specific entries for scalar math functions
// that are available in IBM MASS (scalar) library.		// that are available in IBM MASS (scalar) library.
if (TM->getOptLevel() == CodeGenOpt::Aggressive &&		if (TM->getOptLevel() == CodeGenOpt::Aggressive &&
EnablePPCGenScalarMASSEntries) {		EnablePPCGenScalarMASSEntries) {
TM->Options.PPCGenScalarMASSEntries = EnablePPCGenScalarMASSEntries;		TM->Options.PPCGenScalarMASSEntries = EnablePPCGenScalarMASSEntries;
addPass(createPPCGenScalarMASSEntriesPass());		addPass(createPPCGenScalarMASSEntriesPass());
}		}

		// Generate PowerPC target-specific entries for vector math intrinsics
		// that are available in IBM MASS (vector) library.
		addPass(createPPCGenVectorMASSEntriesPass());

// If explicitly requested, add explicit data prefetch intrinsics.		// If explicitly requested, add explicit data prefetch intrinsics.
if (EnablePrefetch.getNumOccurrences() > 0)		if (EnablePrefetch.getNumOccurrences() > 0)
addPass(createLoopDataPrefetchPass());		addPass(createLoopDataPrefetchPass());

if (TM->getOptLevel() >= CodeGenOpt::Default && EnableGEPOpt) {		if (TM->getOptLevel() >= CodeGenOpt::Default && EnableGEPOpt) {
// Call SeparateConstOffsetFromGEP pass to extract constants within indices		// Call SeparateConstOffsetFromGEP pass to extract constants within indices
// and lower a GEP with multiple indices to either arithmetic operations or		// and lower a GEP with multiple indices to either arithmetic operations or
// multiple GEPs with single index.		// multiple GEPs with single index.
[163 lines not shown]

llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp

[105 lines not shown]
STATISTIC(NumMemSet, "Number of memset's formed from loop stores");		STATISTIC(NumMemSet, "Number of memset's formed from loop stores");
STATISTIC(NumMemCpy, "Number of memcpy's formed from loop load+stores");		STATISTIC(NumMemCpy, "Number of memcpy's formed from loop load+stores");
STATISTIC(NumMemMove, "Number of memmove's formed from loop load+stores");		STATISTIC(NumMemMove, "Number of memmove's formed from loop load+stores");
STATISTIC(		STATISTIC(
NumShiftUntilBitTest,		NumShiftUntilBitTest,
"Number of uncountable loops recognized as 'shift until bitttest' idiom");		"Number of uncountable loops recognized as 'shift until bitttest' idiom");
STATISTIC(NumShiftUntilZero,		STATISTIC(NumShiftUntilZero,
"Number of uncountable loops recognized as 'shift until zero' idiom");		"Number of uncountable loops recognized as 'shift until zero' idiom");
		STATISTIC(NumVectorMath,
		"Number of vector math functions formed from scalar math functions");

bool DisableLIRP::All;		bool DisableLIRP::All;
static cl::opt<bool, true>		static cl::opt<bool, true>
DisableLIRPAll("disable-" DEBUG_TYPE "-all",		DisableLIRPAll("disable-" DEBUG_TYPE "-all",
cl::desc("Options to disable Loop Idiom Recognize Pass."),		cl::desc("Options to disable Loop Idiom Recognize Pass."),
cl::location(DisableLIRP::All), cl::init(false),		cl::location(DisableLIRP::All), cl::init(false),
cl::ReallyHidden);		cl::ReallyHidden);

bool DisableLIRP::Memset;		bool DisableLIRP::Memset;
static cl::opt<bool, true>		static cl::opt<bool, true>
DisableLIRPMemset("disable-" DEBUG_TYPE "-memset",		DisableLIRPMemset("disable-" DEBUG_TYPE "-memset",
cl::desc("Proceed with loop idiom recognize pass, but do "		cl::desc("Proceed with loop idiom recognize pass, but do "
"not convert loop(s) to memset."),		"not convert loop(s) to memset."),
cl::location(DisableLIRP::Memset), cl::init(false),		cl::location(DisableLIRP::Memset), cl::init(false),
cl::ReallyHidden);		cl::ReallyHidden);

bool DisableLIRP::Memcpy;		bool DisableLIRP::Memcpy;
static cl::opt<bool, true>		static cl::opt<bool, true>
DisableLIRPMemcpy("disable-" DEBUG_TYPE "-memcpy",		DisableLIRPMemcpy("disable-" DEBUG_TYPE "-memcpy",
cl::desc("Proceed with loop idiom recognize pass, but do "		cl::desc("Proceed with loop idiom recognize pass, but do "
"not convert loop(s) to memcpy."),		"not convert loop(s) to memcpy."),
cl::location(DisableLIRP::Memcpy), cl::init(false),		cl::location(DisableLIRP::Memcpy), cl::init(false),
cl::ReallyHidden);		cl::ReallyHidden);

		bool DisableLIRP::VectorMath;
		static cl::opt<bool, true> DisableLIRPVectorMath(
		"disable-" DEBUG_TYPE "-vector-math",
		cl::desc("Proceed with loop idiom recognize pass, but do "
		"not convert loop(s) to vector math functions."),
		cl::location(DisableLIRP::VectorMath), cl::init(true), cl::ReallyHidden);

static cl::opt<bool> UseLIRCodeSizeHeurs(		static cl::opt<bool> UseLIRCodeSizeHeurs(
"use-lir-code-size-heurs",		"use-lir-code-size-heurs",
cl::desc("Use loop idiom recognition code size heuristics when compiling"		cl::desc("Use loop idiom recognition code size heuristics when compiling"
"with -Os/-Oz"),		"with -Os/-Oz"),
cl::init(true), cl::Hidden);		cl::init(true), cl::Hidden);

		static cl::opt<unsigned> MathLoopTripCountProfitThreshold(
		"math-loop-tripcount-profit", cl::init(8), cl::Hidden,
		cl::desc("The loop trip count beyond which the math vector array function "
		"transformation is profitable"));

namespace {		namespace {

class LoopIdiomRecognize {		class LoopIdiomRecognize {
Loop *CurLoop = nullptr;		Loop *CurLoop = nullptr;
AliasAnalysis *AA;		AliasAnalysis *AA;
DominatorTree *DT;		DominatorTree *DT;
LoopInfo *LI;		LoopInfo *LI;
ScalarEvolution *SE;		ScalarEvolution *SE;
[20 lines not shown]

private:		private:
using StoreList = SmallVector<StoreInst *, 8>;		using StoreList = SmallVector<StoreInst *, 8>;
using StoreListMap = MapVector<Value *, StoreList>;		using StoreListMap = MapVector<Value *, StoreList>;

StoreListMap StoreRefsForMemset;		StoreListMap StoreRefsForMemset;
StoreListMap StoreRefsForMemsetPattern;		StoreListMap StoreRefsForMemsetPattern;
StoreList StoreRefsForMemcpy;		StoreList StoreRefsForMemcpy;
		StoreList StoreRefsForVectorMath;
bool HasMemset;		bool HasMemset;
bool HasMemsetPattern;		bool HasMemsetPattern;
bool HasMemcpy;		bool HasMemcpy;
		bool HasVectorMath;

/// Return code for isLegalStore()		/// Return code for isLegalStore()
enum LegalStoreKind {		enum LegalStoreKind {
None = 0,		None = 0,
Memset,		Memset,
MemsetPattern,		MemsetPattern,
Memcpy,		Memcpy,
UnorderedAtomicMemcpy,		UnorderedAtomicMemcpy,
		VectorMath,
DontUse // Dummy retval never to be used. Allows catching errors in retval		DontUse // Dummy retval never to be used. Allows catching errors in retval
// handling.		// handling.
};		};

/// \name Countable Loop Idiom Handling		/// \name Countable Loop Idiom Handling
/// @{		/// @{

bool runOnCountableLoop();		bool runOnCountableLoop();
bool runOnLoopBlock(BasicBlock *BB, const SCEV *BECount,		bool runOnLoopBlock(BasicBlock *BB, const SCEV *BECount,
SmallVectorImpl<BasicBlock *> &ExitBlocks);		SmallVectorImpl<BasicBlock *> &ExitBlocks);

void collectStores(BasicBlock *BB);		void collectStores(BasicBlock *BB);
LegalStoreKind isLegalStore(StoreInst *SI);		LegalStoreKind isLegalStore(StoreInst *SI);
		bool isLegalLoad(LoadInst *LI, const SCEVAddRecExpr *StoreEv);
		bool isLegalMathCall(CallInst *CI, const SCEVAddRecExpr *StoreEv);
enum class ForMemset { No, Yes };		enum class ForMemset { No, Yes };
bool processLoopStores(SmallVectorImpl<StoreInst *> &SL, const SCEV *BECount,		bool processLoopStores(SmallVectorImpl<StoreInst *> &SL, const SCEV *BECount,
ForMemset For);		ForMemset For);

template <typename MemInst>		template <typename MemInst>
bool processLoopMemIntrinsic(		bool processLoopMemIntrinsic(
BasicBlock *BB,		BasicBlock *BB,
bool (LoopIdiomRecognize::*Processor)(MemInst *, const SCEV *),		bool (LoopIdiomRecognize::*Processor)(MemInst *, const SCEV *),
[10 lines not shown]	private:
bool processLoopStoreOfLoopLoad(StoreInst *SI, const SCEV *BECount);		bool processLoopStoreOfLoopLoad(StoreInst *SI, const SCEV *BECount);
bool processLoopStoreOfLoopLoad(Value *DestPtr, Value *SourcePtr,		bool processLoopStoreOfLoopLoad(Value *DestPtr, Value *SourcePtr,
const SCEV *StoreSize, MaybeAlign StoreAlign,		const SCEV *StoreSize, MaybeAlign StoreAlign,
MaybeAlign LoadAlign, Instruction *TheStore,		MaybeAlign LoadAlign, Instruction *TheStore,
Instruction *TheLoad,		Instruction *TheLoad,
const SCEVAddRecExpr *StoreEv,		const SCEVAddRecExpr *StoreEv,
const SCEVAddRecExpr *LoadEv,		const SCEVAddRecExpr *LoadEv,
const SCEV *BECount);		const SCEV *BECount);
		bool processLoopStoreForVectorMath(StoreInst *SI, const SCEV *BECount);
bool avoidLIRForMultiBlockLoop(bool IsMemset = false,		bool avoidLIRForMultiBlockLoop(bool IsMemset = false,
bool IsLoopMemset = false);		bool IsLoopMemset = false);

/// @}		/// @}
/// \name Noncountable Loop Idiom Handling		/// \name Noncountable Loop Idiom Handling
/// @{		/// @{

bool runOnNoncountableLoop();		bool runOnNoncountableLoop();
[54 lines not shown]	bool LoopIdiomRecognize::runOnLoop(Loop *L) {
CurLoop = L;		CurLoop = L;
// If the loop could not be converted to canonical form, it must have an		// If the loop could not be converted to canonical form, it must have an
// indirectbr in it, just give up.		// indirectbr in it, just give up.
if (!L->getLoopPreheader())		if (!L->getLoopPreheader())
return false;		return false;

// Disable loop idiom recognition if the function's name is a common idiom.		// Disable loop idiom recognition if the function's name is a common idiom.
StringRef Name = L->getHeader()->getParent()->getName();		StringRef Name = L->getHeader()->getParent()->getName();
if (Name == "memset" || Name == "memcpy")		if (Name == "memset" || Name == "memcpy" ||
		TLI->isVectorMathFunctionAvailable(Name))
return false;		return false;

// Determine if code size heuristics need to be applied.		// Determine if code size heuristics need to be applied.
ApplyCodeSizeHeuristics =		ApplyCodeSizeHeuristics =
L->getHeader()->getParent()->hasOptSize() && UseLIRCodeSizeHeurs;		L->getHeader()->getParent()->hasOptSize() && UseLIRCodeSizeHeurs;

HasMemset = TLI->has(LibFunc_memset);		HasMemset = TLI->has(LibFunc_memset);
HasMemsetPattern = TLI->has(LibFunc_memset_pattern16);		HasMemsetPattern = TLI->has(LibFunc_memset_pattern16);
HasMemcpy = TLI->has(LibFunc_memcpy);		HasMemcpy = TLI->has(LibFunc_memcpy);
		HasVectorMath = TLI->hasVectorMathFunctions();

if (HasMemset || HasMemsetPattern || HasMemcpy)		if (HasMemset || HasMemsetPattern || HasMemcpy || HasVectorMath)
if (SE->hasLoopInvariantBackedgeTakenCount(L))		if (SE->hasLoopInvariantBackedgeTakenCount(L))
return runOnCountableLoop();		return runOnCountableLoop();

return runOnNoncountableLoop();		return runOnNoncountableLoop();
}		}

bool LoopIdiomRecognize::runOnCountableLoop() {		bool LoopIdiomRecognize::runOnCountableLoop() {
const SCEV *BECount = SE->getBackedgeTakenCount(CurLoop);		const SCEV *BECount = SE->getBackedgeTakenCount(CurLoop);
[79 lines not shown]	if (Size == 16)
return C;		return C;

// Otherwise, we'll use an array of the constants.		// Otherwise, we'll use an array of the constants.
unsigned ArraySize = 16 / Size;		unsigned ArraySize = 16 / Size;
ArrayType *AT = ArrayType::get(V->getType(), ArraySize);		ArrayType *AT = ArrayType::get(V->getType(), ArraySize);
return ConstantArray::get(AT, std::vector<Constant *>(ArraySize, C));		return ConstantArray::get(AT, std::vector<Constant *>(ArraySize, C));
}		}

		bool LoopIdiomRecognize::isLegalLoad(LoadInst *LI,
		const SCEVAddRecExpr *StoreEv) {
		// Only allow non-volatile loads
		if (!LI || LI->isVolatile())
		return false;
		// Only allow simple or unordered-atomic loads
		if (!LI->isUnordered())
		return false;

		// See if the pointer expression is an AddRec like {base,+,1} on the current
		// loop, which indicates a strided load. If we have something else, it's a
		// random load we can't handle.
		const SCEVAddRecExpr *LoadEv =
		dyn_cast<SCEVAddRecExpr>(SE->getSCEV(LI->getPointerOperand()));
		if (!LoadEv || LoadEv->getLoop() != CurLoop || !LoadEv->isAffine())
		return false;

		// The store and load must share the same stride.
		assert(StoreEv && "Expected valid store SCEVAddRecExpr.");
		if (StoreEv->getOperand(1) != LoadEv->getOperand(1))
		return false;

		return true;
		}

		bool LoopIdiomRecognize::isLegalMathCall(CallInst *CI,
		const SCEVAddRecExpr *StoreEv) {
		if (!CI)
		return false;

		// The call must be in the same loop as the store.
		if (LI->getLoopFor(CI->getParent()) != CurLoop)
		return false;

		// The call must be a direct call to a scalar math function.
		Function *Func = CI->getCalledFunction();
		if (!Func)
		return false;

		StringRef FuncName = Func->getName();
		if (!TLI->isVectorMathFunctionAvailable(FuncName))
		return false;

		// The call instruction must only have one user, which is the store.
		// Otherwise, the call to scalar math function has to be kept for the other
		// users after it is hoisted to form a vector math function, resulting
		// in duplicate math computations.
		if (!CI->hasOneUser())
		return false;

		// The call to a scalar math function must be fed with non-volatile
		// loads.
		for (unsigned I = 0; I < CI->arg_size(); ++I) {
		LoadInst *Load = dyn_cast<LoadInst>(CI->getArgOperand(I));

		if (!isLegalLoad(Load, StoreEv))
		return false;
		}

		return true;
		}
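
As a source-level illustration of the checks in isLegalMathCall and isLegalLoad, the sketch below contrasts one loop shape that would plausibly qualify with three that would not. The function names are hypothetical, and the real checks run on LLVM IR, so the comments are an approximation; aliasing and profitability are checked elsewhere in the patch.

```cpp
#include <cmath>
#include <cstddef>

// Plausible candidate: the call's only argument is a unit-stride load, and the
// call's only user is a store with the same stride.
void candidate(double *y, const double *x, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    y[i] = std::exp(x[i]);
}

// Likely rejected: the call argument is a multiply, not a load.
void argIsNotALoad(double *y, const double *x, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    y[i] = std::exp(2.0 * x[i]);
}

// Likely rejected: the call result feeds both the store and the running sum,
// so the call has more than one user.
double resultHasTwoUsers(double *y, const double *x, std::size_t n) {
  double sum = 0.0;
  for (std::size_t i = 0; i < n; ++i) {
    double t = std::exp(x[i]);
    y[i] = t;
    sum += t;
  }
  return sum;
}

// Likely rejected: the load advances 16 bytes per iteration while the store
// advances 8, so the two strides differ.
void strideMismatch(double *y, const double *x, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    y[i] = std::exp(x[2 * i]);
}
```

All four functions compute correct results as scalar code; the difference is only in whether the pass, as posted, would consider the store a VectorMath candidate.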

LoopIdiomRecognize::LegalStoreKind		LoopIdiomRecognize::LegalStoreKind
LoopIdiomRecognize::isLegalStore(StoreInst *SI) {		LoopIdiomRecognize::isLegalStore(StoreInst *SI) {
// Don't touch volatile stores.		// Don't touch volatile stores.
if (SI->isVolatile())		if (SI->isVolatile())
return LegalStoreKind::None;		return LegalStoreKind::None;
// We only want simple or unordered-atomic stores.		// We only want simple or unordered-atomic stores.
if (!SI->isUnordered())		if (!SI->isUnordered())
return LegalStoreKind::None;		return LegalStoreKind::None;
[53 lines not shown]	LoopIdiomRecognize::isLegalStore(StoreInst *SI) {
if (!UnorderedAtomic && HasMemsetPattern && !DisableLIRP::Memset &&		if (!UnorderedAtomic && HasMemsetPattern && !DisableLIRP::Memset &&
// Don't create memset_pattern16s with address spaces.		// Don't create memset_pattern16s with address spaces.
StorePtr->getType()->getPointerAddressSpace() == 0 &&		StorePtr->getType()->getPointerAddressSpace() == 0 &&
getMemSetPatternValue(StoredVal, DL)) {		getMemSetPatternValue(StoredVal, DL)) {
// It looks like we can use PatternValue!		// It looks like we can use PatternValue!
return LegalStoreKind::MemsetPattern;		return LegalStoreKind::MemsetPattern;
}		}

// Otherwise, see if the store can be turned into a memcpy.		// Otherwise, see if the store can be turned into a memcpy or vector math
if (HasMemcpy && !DisableLIRP::Memcpy) {		// call.
		if ((HasMemcpy && !DisableLIRP::Memcpy) ||
		(HasVectorMath && !DisableLIRP::VectorMath)) {
// Check to see if the stride matches the size of the store. If so, then we		// Check to see if the stride matches the size of the store. If so, then we
// know that every byte is touched in the loop.		// know that every byte is touched in the loop.
APInt Stride = getStoreStride(StoreEv);		APInt Stride = getStoreStride(StoreEv);
unsigned StoreSize = DL->getTypeStoreSize(SI->getValueOperand()->getType());		unsigned StoreSize = DL->getTypeStoreSize(SI->getValueOperand()->getType());
if (StoreSize != Stride && StoreSize != -Stride)		if (StoreSize != Stride && StoreSize != -Stride)
return LegalStoreKind::None;		return LegalStoreKind::None;

// The store must be feeding a non-volatile load.		// See if the store can be turned into a memcpy.
LoadInst *LI = dyn_cast<LoadInst>(SI->getValueOperand());

// Only allow non-volatile loads
if (!LI || LI->isVolatile())
return LegalStoreKind::None;
// Only allow simple or unordered-atomic loads
if (!LI->isUnordered())
return LegalStoreKind::None;

// See if the pointer expression is an AddRec like {base,+,1} on the current
// loop, which indicates a strided load. If we have something else, it's a
// random load we can't handle.
const SCEVAddRecExpr *LoadEv =
dyn_cast<SCEVAddRecExpr>(SE->getSCEV(LI->getPointerOperand()));
if (!LoadEv || LoadEv->getLoop() != CurLoop || !LoadEv->isAffine())
return LegalStoreKind::None;

// The store and load must share the same stride.
if (StoreEv->getOperand(1) != LoadEv->getOperand(1))
return LegalStoreKind::None;

		// For memcpy, the store must be fed with a non-volatile load.
		LoadInst *Load = dyn_cast<LoadInst>(SI->getValueOperand());
		if (isLegalLoad(Load, StoreEv)) {
// Success. This store can be converted into a memcpy.		// Success. This store can be converted into a memcpy.
UnorderedAtomic = UnorderedAtomic || LI->isAtomic();		UnorderedAtomic = UnorderedAtomic || Load->isAtomic();

		if (HasMemcpy && !DisableLIRP::Memcpy)
return UnorderedAtomic ? LegalStoreKind::UnorderedAtomicMemcpy		return UnorderedAtomic ? LegalStoreKind::UnorderedAtomicMemcpy
: LegalStoreKind::Memcpy;		: LegalStoreKind::Memcpy;
}		}
// This store can't be transformed into a memset/memcpy.
		// See if the store can be turned into a vector math call.

		// The store must be fed with a call to a scalar math function.
		CallInst *CI = dyn_cast<CallInst>(StoredVal);
		if (isLegalMathCall(CI, StoreEv)) {
		// Success. This store can be converted into a vector math call.
		if (HasVectorMath && !DisableLIRP::VectorMath)
		return LegalStoreKind::VectorMath;
		}
		}

		// This store can't be transformed into a memset/memcpy or vector math call.
return LegalStoreKind::None;		return LegalStoreKind::None;
}		}
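
The stride test shared by the memcpy and vector math paths in isLegalStore can be restated in isolation. This is a minimal sketch with a hypothetical helper name and plain integers in place of APInt; it only mirrors the condition `StoreSize != Stride && StoreSize != -Stride`.

```cpp
#include <cstdint>

// Hypothetical helper: true when the per-iteration stride equals the store
// size in either direction, i.e. consecutive iterations write adjacent
// elements and every byte of the destination range is touched.
bool strideCoversEveryByte(std::uint64_t storeSizeBytes,
                           std::int64_t strideBytes) {
  const std::int64_t size = static_cast<std::int64_t>(storeSizeBytes);
  return strideBytes == size || strideBytes == -size;
}
```

For an 8-byte double store, strides of 8 and -8 pass, while a stride of 16 (every other element) does not.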

void LoopIdiomRecognize::collectStores(BasicBlock *BB) {		void LoopIdiomRecognize::collectStores(BasicBlock *BB) {
StoreRefsForMemset.clear();		StoreRefsForMemset.clear();
StoreRefsForMemsetPattern.clear();		StoreRefsForMemsetPattern.clear();
StoreRefsForMemcpy.clear();		StoreRefsForMemcpy.clear();
		StoreRefsForVectorMath.clear();
for (Instruction &I : *BB) {		for (Instruction &I : *BB) {
StoreInst *SI = dyn_cast<StoreInst>(&I);		StoreInst *SI = dyn_cast<StoreInst>(&I);
if (!SI)		if (!SI)
continue;		continue;

// Make sure this is a strided store with a constant stride.		// Make sure this is a strided store with a constant stride.
switch (isLegalStore(SI)) {		switch (isLegalStore(SI)) {
case LegalStoreKind::None:		case LegalStoreKind::None:
// Nothing to do		// Nothing to do
break;		break;
case LegalStoreKind::Memset: {		case LegalStoreKind::Memset: {
// Find the base pointer.		// Find the base pointer.
Value *Ptr = getUnderlyingObject(SI->getPointerOperand());		Value *Ptr = getUnderlyingObject(SI->getPointerOperand());
StoreRefsForMemset[Ptr].push_back(SI);		StoreRefsForMemset[Ptr].push_back(SI);
} break;		} break;
case LegalStoreKind::MemsetPattern: {		case LegalStoreKind::MemsetPattern: {
// Find the base pointer.		// Find the base pointer.
Value *Ptr = getUnderlyingObject(SI->getPointerOperand());		Value *Ptr = getUnderlyingObject(SI->getPointerOperand());
StoreRefsForMemsetPattern[Ptr].push_back(SI);		StoreRefsForMemsetPattern[Ptr].push_back(SI);
} break;		} break;
case LegalStoreKind::Memcpy:		case LegalStoreKind::Memcpy:
case LegalStoreKind::UnorderedAtomicMemcpy:		case LegalStoreKind::UnorderedAtomicMemcpy:
StoreRefsForMemcpy.push_back(SI);		StoreRefsForMemcpy.push_back(SI);
break;		break;
		case LegalStoreKind::VectorMath:
		StoreRefsForVectorMath.push_back(SI);
		break;
default:		default:
assert(false && "unhandled return value");		assert(false && "unhandled return value");
break;		break;
}		}
}		}
}		}

/// runOnLoopBlock - Process the specified block, which lives in a counted loop		/// runOnLoopBlock - Process the specified block, which lives in a counted loop
[21 lines not shown]	bool LoopIdiomRecognize::runOnLoopBlock(

for (auto &SL : StoreRefsForMemsetPattern)		for (auto &SL : StoreRefsForMemsetPattern)
MadeChange |= processLoopStores(SL.second, BECount, ForMemset::No);		MadeChange |= processLoopStores(SL.second, BECount, ForMemset::No);

// Optimize the store into a memcpy, if it feeds an similarly strided load.		// Optimize the store into a memcpy, if it feeds an similarly strided load.
for (auto &SI : StoreRefsForMemcpy)		for (auto &SI : StoreRefsForMemcpy)
MadeChange |= processLoopStoreOfLoopLoad(SI, BECount);		MadeChange |= processLoopStoreOfLoopLoad(SI, BECount);

		// Optimize the scalar math call to the equivalent vector math call.
		for (auto &SI : StoreRefsForVectorMath)
		MadeChange |= processLoopStoreForVectorMath(SI, BECount);

MadeChange |= processLoopMemIntrinsic<MemCpyInst>(		MadeChange |= processLoopMemIntrinsic<MemCpyInst>(
BB, &LoopIdiomRecognize::processLoopMemCpy, BECount);		BB, &LoopIdiomRecognize::processLoopMemCpy, BECount);
MadeChange |= processLoopMemIntrinsic<MemSetInst>(		MadeChange |= processLoopMemIntrinsic<MemSetInst>(
BB, &LoopIdiomRecognize::processLoopMemSet, BECount);		BB, &LoopIdiomRecognize::processLoopMemSet, BECount);

return MadeChange;		return MadeChange;
}		}

[403 lines not shown]	static const SCEV *getNumBytes(const SCEV *BECount, Type *IntPtr,
const DataLayout *DL, ScalarEvolution *SE) {		const DataLayout *DL, ScalarEvolution *SE) {
const SCEV *TripCountSCEV =		const SCEV *TripCountSCEV =
SE->getTripCountFromExitCount(BECount, IntPtr, CurLoop);		SE->getTripCountFromExitCount(BECount, IntPtr, CurLoop);
return SE->getMulExpr(TripCountSCEV,		return SE->getMulExpr(TripCountSCEV,
SE->getTruncateOrZeroExtend(StoreSizeSCEV, IntPtr),		SE->getTruncateOrZeroExtend(StoreSizeSCEV, IntPtr),
SCEV::FlagNUW);		SCEV::FlagNUW);
}		}

		/// Create and insert a call instruction to the vector math intrinsic.
		/// For example:
		/// void llvm.experimental.vector.add(<type> Len, <type> Res, <type> In1,
		/// <type> In2)
		static CallInst *
		insertVectorMathIntrinsic(Intrinsic::ID IntrinID, Value *TripCount,
		Value *StoreBasePtr,
		SmallVectorImpl<LoadInst *> &LoadInstList,
		SmallVectorImpl<Value *> &LoadBasePtrList,
		StoreInst *SI, CallInst *CI, IRBuilder<> &Builder) {
		SmallVector<Value *, 4> CallArgs;
		CallArgs.push_back(TripCount);
		CallArgs.push_back(StoreBasePtr);
		for (Value *LoadBasePtr : LoadBasePtrList)
		CallArgs.push_back(LoadBasePtr);

		SmallVector<Type *, 4> ArgsTy;
		for (Value *Arg : CallArgs)
		ArgsTy.push_back(Arg->getType());

		Module *M = SI->getModule();
		Function *Func = Intrinsic::getDeclaration(M, IntrinID, ArgsTy);
		CallInst *NewCall = Builder.CreateCall(Func, CallArgs);
		NewCall->setDebugLoc(SI->getDebugLoc());

		// Set the metadata to the new call instruction.
		AAMDNodes AATags = SI->getAAMetadata();
		for (LoadInst *LI : LoadInstList) {
		AAMDNodes LoadAATags = LI->getAAMetadata();
		AATags = AATags.merge(LoadAATags);
		}
		if (auto ConstInt = dyn_cast<ConstantInt>(TripCount))
		AATags = AATags.extendTo(ConstInt->getZExtValue());
		else
		AATags = AATags.extendTo(-1);

		if (AATags.TBAA)
		NewCall->setMetadata(LLVMContext::MD_tbaa, AATags.TBAA);

		if (AATags.TBAAStruct)
		NewCall->setMetadata(LLVMContext::MD_tbaa_struct, AATags.TBAAStruct);

		if (AATags.Scope)
		NewCall->setMetadata(LLVMContext::MD_alias_scope, AATags.Scope);

		if (AATags.NoAlias)
		NewCall->setMetadata(LLVMContext::MD_noalias, AATags.NoAlias);

		NewCall->copyMetadata(*CI);

		return NewCall;
		}

/// processLoopStridedStore - We see a strided store of some value. If we can		/// processLoopStridedStore - We see a strided store of some value. If we can
/// transform this into a memset or memset_pattern in the loop preheader, do so.		/// transform this into a memset or memset_pattern in the loop preheader, do so.
bool LoopIdiomRecognize::processLoopStridedStore(		bool LoopIdiomRecognize::processLoopStridedStore(
Value *DestPtr, const SCEV *StoreSizeSCEV, MaybeAlign StoreAlignment,		Value *DestPtr, const SCEV *StoreSizeSCEV, MaybeAlign StoreAlignment,
Value *StoredVal, Instruction *TheStore,		Value *StoredVal, Instruction *TheStore,
SmallPtrSetImpl<Instruction *> &Stores, const SCEVAddRecExpr *Ev,		SmallPtrSetImpl<Instruction *> &Stores, const SCEVAddRecExpr *Ev,
const SCEV *BECount, bool IsNegStride, bool IsLoopMemset) {		const SCEV *BECount, bool IsNegStride, bool IsLoopMemset) {
Module *M = TheStore->getModule();		Module *M = TheStore->getModule();
[165 lines not shown]	bool LoopIdiomRecognize::processLoopStoreOfLoopLoad(StoreInst *SI,
const SCEVAddRecExpr *LoadEv = cast<SCEVAddRecExpr>(SE->getSCEV(LoadPtr));		const SCEVAddRecExpr *LoadEv = cast<SCEVAddRecExpr>(SE->getSCEV(LoadPtr));

const SCEV *StoreSizeSCEV = SE->getConstant(StorePtr->getType(), StoreSize);		const SCEV *StoreSizeSCEV = SE->getConstant(StorePtr->getType(), StoreSize);
return processLoopStoreOfLoopLoad(StorePtr, LoadPtr, StoreSizeSCEV,		return processLoopStoreOfLoopLoad(StorePtr, LoadPtr, StoreSizeSCEV,
SI->getAlign(), LI->getAlign(), SI, LI,		SI->getAlign(), LI->getAlign(), SI, LI,
StoreEv, LoadEv, BECount);		StoreEv, LoadEv, BECount);
}		}

		// See if this scalar math call can be promoted to the equivalent vector math
		// call.
		bool LoopIdiomRecognize::processLoopStoreForVectorMath(StoreInst *SI,
		const SCEV *BECount) {
		// Do not transform this candidate if it is known to be not profitable. That
		// is, the loop trip count is less than or equal to the threshold. The loop
		// trip count is BECount plus one.
		if (const SCEVConstant *BECst = dyn_cast<SCEVConstant>(BECount))
		if (BECst->getAPInt().getZExtValue() < MathLoopTripCountProfitThreshold)
		return false;

		Value *StoredVal = SI->getValueOperand();
		CallInst *CI = dyn_cast<CallInst>(StoredVal);
		assert(CI && "call instruction is expected");

		Value *StorePtr = SI->getPointerOperand();
		const SCEVAddRecExpr *StoreEv = cast<SCEVAddRecExpr>(SE->getSCEV(StorePtr));
		unsigned StoreSize = DL->getTypeStoreSize(StoredVal->getType());

		BasicBlock *Preheader = CurLoop->getLoopPreheader();
		IRBuilder<> Builder(Preheader->getTerminator());
		SCEVExpander Expander(SE, DL, "loop-idiom");
		SCEVExpanderCleaner ExpCleaner(Expander);

		bool Changed = false;
		const SCEV *StrStart = StoreEv->getStart();
		unsigned StrAS = StorePtr->getType()->getPointerAddressSpace();
		Type *IntIdxTy = Builder.getIntNTy(DL->getIndexSizeInBits(StrAS));

		APInt Stride = getStoreStride(StoreEv);
		bool IsNegStride = StoreSize == -Stride;
		const SCEV *StoreSizeS = SE->getConstant(IntIdxTy, StoreSize);

		// Handle negative strided loops.
		if (IsNegStride)
		StrStart =
		getStartForNegStride(StrStart, BECount, IntIdxTy, StoreSizeS, SE);

		// Okay, we have a strided store "p[i]" of a return value from a scalar math
		// function, whose input is a loaded value. We can turn this into a call to
		// vector math function in the loop preheader now. However, this would be
		// unsafe to do if the loop contains any other reads/writes to the memory
		// region we're storing to. This includes the load that feeds the stores.
		// Check for an alias by generating the base address and checking everything.
		Value *StoreBasePtr = Expander.expandCodeFor(
		StrStart, Builder.getInt8PtrTy(StrAS), Preheader->getTerminator());

		// From here on out, conservatively report to the pass manager that we've
		// changed the IR, even if we later clean up these added instructions. There
		// may be structural differences e.g. in the order of use lists not accounted
		// for in just a textual dump of the IR. This is written as a variable, even
		// though statically all the places this dominates could be replaced with
		// 'true', with the hope that anyone trying to be clever / "more precise" with
		// the return value will read this comment, and leave them alone.
		Changed = true;

		SmallPtrSet<Instruction *, 2> IgnoredInsts{SI, CI};

		if (mayLoopAccessLocation(StoreBasePtr, ModRefInfo::ModRef, CurLoop, BECount,
		StoreSizeS, *AA, IgnoredInsts)) {
		ORE.emit([&]() {
		return OptimizationRemarkMissed(DEBUG_TYPE, "LoopMayAccessStore", SI)
		<< ore::NV("Inst", "load and store") << " in "
		<< ore::NV("Function", SI->getFunction())
		<< " function will not be hoisted: "
		<< ore::NV("Reason", "The loop may access store location");
		});
		return Changed;
		}

		SmallVector<LoadInst *, 2> LoadInstList;
		SmallVector<Value *, 2> LoadBasePtrList;
		for (unsigned I = 0; I < CI->arg_size(); ++I) {
		LoadInst *Load = dyn_cast<LoadInst>(CI->getArgOperand(I));
		assert(Load && "load instruction is expected.");
		Value *LoadPtr = Load->getPointerOperand();
		const SCEVAddRecExpr *LoadEv = cast<SCEVAddRecExpr>(SE->getSCEV(LoadPtr));

		const SCEV *LdStart = LoadEv->getStart();
		unsigned LdAS = LoadPtr->getType()->getPointerAddressSpace();

		// Handle negative strided loops.
		if (IsNegStride)
		LdStart =
		getStartForNegStride(LdStart, BECount, IntIdxTy, StoreSizeS, SE);

		// We have to make sure that the input array is not being mutated by the
		// loop.
		Value *LoadBasePtr = Expander.expandCodeFor(
		LdStart, Builder.getInt8PtrTy(LdAS), Preheader->getTerminator());

		// Only ignore the call instruction.
		IgnoredInsts.clear();
		IgnoredInsts.insert(CI);
		if (mayLoopAccessLocation(LoadBasePtr, ModRefInfo::Mod, CurLoop, BECount,
		StoreSizeS, *AA, IgnoredInsts)) {
		ORE.emit([&]() {
		return OptimizationRemarkMissed(DEBUG_TYPE, "LoopMayAccessLoad", Load)
		<< ore::NV("Inst", "load and store") << " in "
		<< ore::NV("Function", SI->getFunction())
		<< " function will not be hoisted: "
		<< ore::NV("Reason", "The loop may access load location");
		});
		return Changed;
		}

		LoadInstList.push_back(Load);
		LoadBasePtrList.push_back(LoadBasePtr);
		}

		if (avoidLIRForMultiBlockLoop())
		return Changed;

		// Okay, everything is safe, we can transform this!

		const SCEV *TripCountS =
		SE->getTripCountFromExitCount(BECount, IntIdxTy, CurLoop);

		Value *TripCount =
		Expander.expandCodeFor(TripCountS, IntIdxTy, Preheader->getTerminator());

		StringRef ScalarFuncName = CI->getCalledFunction()->getName();
		assert(TLI->isVectorMathFunctionAvailable(ScalarFuncName) &&
		"The equivalent vector math function must be available.");
		Intrinsic::ID IntrinID = TLI->getVectorMathIntrinsic(ScalarFuncName);
		CallInst *NewCall =
		insertVectorMathIntrinsic(IntrinID, TripCount, StoreBasePtr, LoadInstList,
		LoadBasePtrList, SI, CI, Builder);

		if (MSSAU) {
		MemoryAccess *NewMemAcc = MSSAU->createMemoryAccessInBB(
		NewCall, nullptr, NewCall->getParent(), MemorySSA::BeforeTerminator);
		MSSAU->insertDef(cast<MemoryDef>(NewMemAcc), true);
		}

		ORE.emit([&]() {
		return OptimizationRemark(DEBUG_TYPE, "processLoopStoreForVectorMath",
		NewCall->getDebugLoc(), Preheader)
		<< "Formed a call to "
		<< ore::NV("NewFunction", NewCall->getCalledFunction()) << "() from "
		<< ore::NV("Inst", "load and store") << " instruction in "
		<< ore::NV("Function", SI->getFunction()) << " function"
		<< ore::setExtraArgs()
		<< ore::NV("FromBlock", SI->getParent()->getName())
		<< ore::NV("ToBlock", Preheader->getName());
		});

		// Okay, a new call to vector math function has been formed.
		// Zap the original store and anything that feeds into it.
		if (MSSAU) {
		MSSAU->removeMemoryAccess(SI, true);
		MSSAU->removeMemoryAccess(CI, true);
		}
		deleteDeadInstruction(SI);
		deleteDeadInstruction(CI);
		if (MSSAU && VerifyMemorySSA)
		MSSAU->getMemorySSA()->verifyMemorySSA();
		++NumVectorMath;
		ExpCleaner.markResultUsed();

		return true;
		}
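
The profitability gate at the top of processLoopStoreForVectorMath compares the backedge-taken count, not the trip count, against the threshold. Since the trip count is BECount plus one, `BECount < threshold` is the same as "trip count <= threshold". The sketch below restates that arithmetic with hypothetical names; a null pointer stands in for a backedge count that is not a compile-time constant, in which case the gate as written does not fire.

```cpp
#include <cstdint>

// Mirrors the cl::init(8) default of -math-loop-tripcount-profit.
constexpr std::uint64_t kTripCountProfitThreshold = 8;

// Hypothetical restatement of the gate: returns true when the candidate is
// allowed to proceed to the later safety checks.
bool passesTripCountGate(
    const std::uint64_t *constantBECount,
    std::uint64_t threshold = kTripCountProfitThreshold) {
  if (!constantBECount)
    return true; // Trip count unknown at compile time: not filtered here.
  // Reject when BECount < threshold, i.e. trip count <= threshold.
  return *constantBECount >= threshold;
}
```

With the default threshold, a loop of exactly 8 iterations (BECount 7) is skipped and a loop of 9 iterations (BECount 8) is allowed through.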

namespace {		namespace {
class MemmoveVerifier {		class MemmoveVerifier {
public:		public:
explicit MemmoveVerifier(const Value &LoadBasePtr, const Value &StoreBasePtr,		explicit MemmoveVerifier(const Value &LoadBasePtr, const Value &StoreBasePtr,
const DataLayout &DL)		const DataLayout &DL)
: DL(DL), BP1(llvm::GetPointerBaseWithConstantOffset(		: DL(DL), BP1(llvm::GetPointerBaseWithConstantOffset(
LoadBasePtr.stripPointerCasts(), LoadOff, DL)),		LoadBasePtr.stripPointerCasts(), LoadOff, DL)),
BP2(llvm::GetPointerBaseWithConstantOffset(		BP2(llvm::GetPointerBaseWithConstantOffset(
[1,651 lines not shown]

llvm/test/CodeGen/PowerPC/lower-intrinsics-vector-mass.ll

This file was added.

				; RUN: llc -vector-library=MASSV -O3 -mtriple=powerpc64le-unknown-linux-gnu < %s | FileCheck %s
				; RUN: llc -vector-library=MASSV -O3 -mtriple=powerpc-ibm-aix-xcoff < %s | FileCheck %s

				declare void @llvm.experimental.vector.exp.i64.p0.p0(i64, ptr noalias nocapture writeonly, ptr noalias nocapture readonly) #0

				define dso_local void @exp_f64(ptr noalias nocapture noundef readonly %x, ptr noalias nocapture noundef writeonly %y) {
				; CHECK-LABEL: exp_f64
				; CHECK: __vexp
				; CHECK: blr
				entry:
				tail call void @llvm.experimental.vector.exp.i64.p0.p0(i64 100, ptr %y, ptr %x)
				ret void
				}

				attributes #0 = { nocallback nofree nosync nounwind willreturn memory(argmem: readwrite) }

llvm/test/Transforms/LoopIdiom/math.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
				; RUN: opt -passes=loop-idiom -vector-library=MASSV -disable-loop-idiom-vector-math=false < %s -S | FileCheck %s

				target datalayout = "e-m:e-i64:64-n32:64-S128-v256:256:256-v512:512:512"

				declare double @exp(double) #0

				define void @exp_f64(ptr noalias noundef %x, ptr noalias noundef %y) {
				; CHECK-LABEL: @exp_f64(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: call void @llvm.experimental.vector.exp.i64.p0.p0(i64 1000, ptr [[Y:%.*]], ptr [[X:%.*]])
				; CHECK-NEXT: br label [[FOR_BODY:%.*]]
				; CHECK: for.body:
				; CHECK-NEXT: [[I:%.*]] = phi i32 [ [[INC:%.*]], [[FOR_BODY]] ], [ 0, [[ENTRY:%.*]] ]
				; CHECK-NEXT: [[IDXPROM:%.*]] = zext i32 [[I]] to i64
				; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds double, ptr [[X]], i64 [[IDXPROM]]
				; CHECK-NEXT: [[TMP0:%.*]] = load double, ptr [[ARRAYIDX]], align 8
				; CHECK-NEXT: [[ARRAYIDX2:%.*]] = getelementptr inbounds double, ptr [[Y]], i64 [[IDXPROM]]
				; CHECK-NEXT: [[INC]] = add nuw nsw i32 [[I]], 1
				; CHECK-NEXT: [[CMP:%.*]] = icmp slt i32 [[INC]], 1000
				; CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY]], label [[FOR_EXIT:%.*]]
				; CHECK: for.exit:
				; CHECK-NEXT: ret void
				;
				entry:
				br label %for.body

				for.body:
				%i = phi i32 [ %inc, %for.body ], [ 0, %entry ]
				%idxprom = zext i32 %i to i64
				%arrayidx = getelementptr inbounds double, ptr %x, i64 %idxprom
				%0 = load double, ptr %arrayidx, align 8
				%call = tail call double @exp(double noundef %0)
				%arrayidx2 = getelementptr inbounds double, ptr %y, i64 %idxprom
				store double %call, ptr %arrayidx2, align 8
				%inc = add nuw nsw i32 %i, 1
				%cmp = icmp slt i32 %inc, 1000
				br i1 %cmp, label %for.body, label %for.exit

				for.exit:
				ret void
				}

				attributes #0 = { nounwind willreturn memory(write) }
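
For readers who prefer source-level code to IR, the following is a minimal C++ sketch of what the test above exercises. It is not part of the patch. The names `exp_scalar_loop` and `vector_exp_reference` are hypothetical, and the (count, destination, source) parameter order is only assumed from the operand order of the intrinsic call in the CHECK lines. The sketch also assumes `x` and `y` do not overlap, matching the `noalias` arguments in the test.

```cpp
#include <cmath>
#include <cstddef>

// Scalar form of the idiom: one exp() call per element, as in the
// for.body loop of the test input.
void exp_scalar_loop(const double *x, double *y, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    y[i] = std::exp(x[i]);
}

// Hypothetical reference model of the vector math call that replaces the
// loop. A real library routine would compute the same results over the
// contiguous arrays, typically with a faster implementation.
// Assumption: x and y do not alias.
void vector_exp_reference(std::size_t n, double *y, const double *x) {
  for (std::size_t i = 0; i < n; ++i)
    y[i] = std::exp(x[i]);
}
```

Under these assumptions, replacing the loop with a single call is only valid when both produce the same contents in `y`, which is the equivalence the transformed IR relies on.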

Diff Detail

Unit Tests: Failed

Revision Contents

Diff 520381

llvm/include/llvm/Analysis/TargetLibraryInfo.h

llvm/include/llvm/Analysis/VectorMathFuncs.def

llvm/include/llvm/IR/Intrinsics.td

llvm/include/llvm/Target/TargetOptions.h

llvm/include/llvm/Transforms/Scalar/LoopIdiomRecognize.h

llvm/lib/Analysis/TargetLibraryInfo.cpp

llvm/lib/Target/PowerPC/CMakeLists.txt

llvm/lib/Target/PowerPC/PPC.h

llvm/lib/Target/PowerPC/PPCGenVectorMASSEntries.cpp

llvm/lib/Target/PowerPC/PPCTargetMachine.cpp

llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp

llvm/test/CodeGen/PowerPC/lower-intrinsics-vector-mass.ll

llvm/test/Transforms/LoopIdiom/math.ll
